SAMPL submissions and evaluation

Goals of SAMPL

SAMPL focuses on advancing computational methods, so we work to ensure that participants and the broader community can learn as much as possible from the challenges we pose. We seek to keep the emphasis on advancing science rather than on “winning”. However, participants who excel at SAMPL do attract considerable attention, so we are making some changes to ensure fairness while still maximizing opportunities to learn from participation.

Multiple submissions

Previously (for SAMPL1-6), we allowed participants to submit multiple sets of predictions. This was partly to maximize opportunities for learning from participation: participants could run several methods and compare and contrast their results.

However, for SAMPL7 onwards, we are modestly changing our policy on multiple submissions to distinguish two categories of blind submissions:

  1. Ranked submissions: Formal, judged entries in SAMPL (only one per research group or research organization)
  2. Verified submissions: Blind predictions (unlimited per research group or research organization) which are not formally judged.

Ranked submissions are intended to be the single entry each participant expects to perform best, and only these submissions will receive formal rankings within the relevant SAMPL challenge. This helps to alleviate any concern that a participant might gain an unfair advantage from having “multiple shots on goal”, such as submitting small variations in method or parameters with relatively little variation in science.

Verified submissions, in contrast, allow us to serve essentially as a custodian of predictions, verifying that they were made blindly, in advance of data release. Verified submissions will not be judged or ranked as formal entries, but we will still attempt to provide performance metrics for them.

We believe that having these two categories will allow us to continue to play a valuable role in helping participants maximize what they learn by trying multiple methods when warranted, while also ensuring that participants applying multiple methods or method variations do not receive an unfair advantage in judging.

External evaluation

For SAMPL7 onwards, we seek to shift to using (or being aided by) external evaluators in judging SAMPL performance. If you are willing to assist with this, please contact David Mobley.

Anonymous submissions

Some historical SAMPL challenges allowed anonymous submissions. However, for SAMPL6 onwards, anonymous submissions are no longer allowed. Participants' submissions and method descriptions will be publicly disclosed via the relevant SAMPL GitHub repository. This helps to ensure the community can learn as much as possible from these challenges, and also assists with fairness (e.g. no participant can choose to stay anonymous until their performance becomes clear).

Focused virtual workshops and follow-ups

To help ensure participants learn as much as possible, we now plan virtual workshops focused on each challenge component, held shortly after the release of that challenge's results. These workshops help participants exchange early ideas and make connections for potential follow-up work well in advance of publishing their results. For example, participants who used similar methods can learn of this at a virtual workshop, then compare and contrast their approaches and perhaps plan follow-up calculations to resolve any discrepancies. In the past, such work has often been where SAMPL yielded some of its most important lessons.