stsievert · stsievert · Oct 1, 2021 · Jul 15, 2021 · Jul 15, 2021 · Jul 15, 2021
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
@@ -1,9 +1,9 @@
 name: Documentation build
 
-on: push
-# on:
-  # release:
-    # types: [published]
+# on: push
+on:
+  release:
+    types: [published]
 
 # Only run when release published (not created or edited, etc)
 # https://docs.github.com/en/actions/reference/events-that-trigger-workflows#release

diff --git a/Dockerfile b/Dockerfile
@@ -1,4 +1,4 @@
-FROM continuumio/miniconda3:4.9.2
+FROM continuumio/miniconda3:4.10.3
 
 RUN apt-get update
 RUN apt-get install -y gcc cmake g++

diff --git a/README.md b/README.md
@@ -5,4 +5,5 @@
   <img src="https://github.com/stsievert/salmon/actions/workflows/docs.yml/badge.svg" alt="Documentation status badge" />
 </a>
 
+
 See the documentation for more detail: https://docs.stsievert.com/salmon/
diff --git a/docs/source/_static/alieneggs.html b/docs/source/_static/alieneggs.html
@@ -1,10 +1,41 @@
-<!DOCTYPE html>
+ <!DOCTYPE html>
+
 <html>
   <body>
-    <script>
-    var urls = ["http://18.237.56.89:8421", "http://52.38.73.113:8421"];
-    window.location.href = urls[Math.floor(Math.random() * urls.length)]
-    </script>
-    <p>Please enable Javascript for to be randomly redirected.</p>
+
+  <div id="msg">
+    <p>Please enable JavaScript to be redirected to a randomly chosen server.</p>
+  </div>
+
+  <script>
+    var ips = ["54.218.25.50",
+               "34.211.149.10",
+               "54.190.3.114",
+               "52.37.51.10",
+               "18.236.202.200",
+               "18.237.150.81",
+               "54.186.127.78",
+               "54.185.86.120",
+               "52.13.66.158",
+               "54.202.49.132",
+               "34.212.172.73",
+               "18.237.134.246",
+               "52.41.75.11",
+               "52.37.147.72",
+               "34.213.44.119",
+               "34.209.246.231",
+               "54.71.64.190",
+               "54.190.102.80",
+               "18.237.54.48",
+               "54.68.42.104"];
+
+    var idx = Math.floor(Math.random() * ips.length);  // uniform integer in {0, 1, ..., N-1}
+    var ip = ips[idx];
+    var url = "http://" + ip + ":8421/"
+
+    var link = "<a href='" + url + "'>"+url+"<\/a>";
+    document.getElementById("msg").innerHTML = "<p>Redirecting to " + link + " <\/p>";
+    window.open(url, '_self');
+  </script>
   </body>
 </html>
diff --git a/docs/source/algorithms.rst b/docs/source/algorithms.rst
@@ -21,10 +21,10 @@ sampling.
 
    targets: ["obj1", "obj2", "foo", "bar", "foobar!"]
    samplers:
-     RandomSampling: {}
+     Random: {}
      Validation: {"n_queries": 10}
 
-By default, ``samplers`` defaults to ``RandomSampling: {}``. We have to customize the ``samplers`` key use adaptive sampling algorithms:
+``samplers`` defaults to ``Random: {}``. To use adaptive sampling algorithms, customize the ``samplers`` key:
 
 .. code-block:: yaml
 
@@ -42,7 +42,7 @@ configuration:
 
    targets: ["obj1", "obj2", "foo", "bar", "foobar!"]
    samplers:
-     RandomSampling: {}
+     Random: {}
      ARR:
        module: "TSTE"
 
@@ -56,7 +56,7 @@ would compare two different instances of
    targets: ["obj1", "obj2", "foo", "bar", "foobar!"]
 
    samplers:
-     RandomSampling: {}
+     Random: {}
      arr_tste:
        class: ARR
        module: "TSTE"
@@ -66,7 +66,7 @@ would compare two different instances of
        module__mu: 0.02
 
    sampling:
-     probs: {"RandomSampling": 20, "arr_ckl": 40, "arr_tste": 40}
+     probs: {"Random": 20, "arr_ckl": 40, "arr_tste": 40}
 
 In this configuration, custom names are provided for two instances of
 :class:`~salmon.triplets.samplers.ARR`. Both instances are sampled 40% of the

diff --git a/docs/source/api.rst b/docs/source/api.rst
@@ -47,7 +47,7 @@ Passive Algorithms
    :toctree: generated/
    :template: only-init.rst
 
-   salmon.triplets.samplers.RandomSampling
+   salmon.triplets.samplers.Random
    salmon.triplets.samplers.RoundRobin
    salmon.triplets.samplers.Validation
 

diff --git a/docs/source/benchmarks/active.rst b/docs/source/benchmarks/active.rst
@@ -76,13 +76,12 @@ sampling with these ``init.yaml`` configurations:
    d: 2
    samplers:
      ARR: {random_state: 42}  # active or adaptive sampling
-     RandomSampling: {}  # random sampling
-
-The "ARR" stands for "active round robin" and creates an instance of
-:class:`~salmon.triplets.samplers.ARR`. For this class, head rotates through
-available choices ("round robin") and for each head, the best comparisons are
-chosen (by some measure with information gain).
+     Random: {}  # random sampling
 
+The "ARR" stands for "asynchronous round robin" and creates an instance of
+:class:`~salmon.triplets.samplers.ARR`. For this class, the query head is
+chosen at random; for that head, the candidate comparison items are ranked
+by some measure (information gain by default).
 
 .. note::
 

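Reviewer note: the revised ARR description (random head, then ranked comparison items) can be illustrated with a minimal sketch. This is not Salmon's implementation. `score_pair` is a hypothetical toy placeholder and does not compute information gain, and `rank_queries_for_random_head` is a name invented for this example.

```python
import random
from itertools import combinations


def score_pair(head, a, b):
    # Hypothetical toy score, NOT information gain: prefer pairs whose
    # index distances to the head are nearly equal (a stand-in for
    # "hard to tell apart").
    return -abs(abs(head - a) - abs(head - b))


def rank_queries_for_random_head(n_targets, rng, top_k=3):
    # Step 1: choose the query head at random.
    head = rng.randrange(n_targets)
    # Step 2: score every pair of remaining items for that head.
    others = [i for i in range(n_targets) if i != head]
    scored = [
        (score_pair(head, a, b), (head, a, b))
        for a, b in combinations(others, 2)
    ]
    # Step 3: rank by score, highest first, and keep the top few.
    scored.sort(key=lambda item: item[0], reverse=True)
    return head, scored[:top_k]


rng = random.Random(42)
head, best = rank_queries_for_random_head(5, rng)
```

Swapping `score_pair` for a real acquisition function would leave the control flow unchanged.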
diff --git a/docs/source/getting-started.rst b/docs/source/getting-started.rst
@@ -89,9 +89,9 @@ Here's an example ``init.yaml`` YAML file for initialization:
    max_queries: 100
    samplers:
      ARR: {}
-     RandomSampling: {}
+     Random: {}
    sampling:
-     probs: {"ARR": 80, "RandomSampling": 20}
+     probs: {"ARR": 80, "Random": 20}
 
 The top-level elements like ``max_queries`` and ``targets`` are called "keys"
 in YAML jargon. Here's documentation for each key:
@@ -103,8 +103,15 @@ in YAML jargon. Here's documentation for each key:
 * ``max_queries``: int. The number of queries a participant should answer. Set
   ``max_queries: -1`` for unlimited queries.
 * ``samplers``. See :ref:`adaptive-config` for more detail.
-* ``sampling``. A dictionary with the key ``probs`` and percentage
-  probabilities for each algorithm.
+* ``sampling``. A dictionary with the following keys:
+
+    * ``probs``, a map from sampler names to the percentage of the
+      time that each sampler is selected.
+
+    * ``samplers_per_user``: (optional int, default=0). Controls the
+      number of samplers each user sees. If ``samplers_per_user=0``, show
+      users a random sampler.
+
 * ``targets``, optional list. Choices:
 
     * YAML list. This ``targets: ["vonn", "miller", "ligety", "shiffrin"]`` is

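Reviewer note: one way to read the ``probs`` map is as weights for a random draw over sampler names. The sketch below shows that reading with Python's standard library only. `choose_sampler` is a hypothetical helper written for this note, not a Salmon function, and it does not model ``samplers_per_user``.

```python
import random


def choose_sampler(probs, rng):
    # Treat the percentages as relative weights for a single draw.
    names = list(probs)
    weights = [probs[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


probs = {"ARR": 80, "Random": 20}  # same shape as the init.yaml example
rng = random.Random(0)

# Draw many times to see the empirical split between samplers.
counts = {name: 0 for name in probs}
for _ in range(10_000):
    counts[choose_sampler(probs, rng)] += 1
```

With these weights, roughly four in five draws should land on ``ARR``.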
diff --git a/docs/source/offline.rst b/docs/source/offline.rst
@@ -45,14 +45,18 @@ This code will generate an embedding:
 
    from salmon.triplets.offline import OfflineEmbedding
 
-   df = pd.read_csv("responses.csv")
+   df = pd.read_csv("responses.csv")  # from dashboard
    X = df[["head", "winner", "loser"]].to_numpy()
 
+   em = pd.read_csv("embedding.csv")  # from dashboard
+
    n = int(X.max() + 1)  # number of targets
    d = 2  # embed into 2 dimensions
 
    X_train, X_test = train_test_split(X, random_state=42, test_size=0.2)
    model = OfflineEmbedding(n=n, d=d)
+   model.initialize(X_train, embedding=em.to_numpy())
+
    model.fit(X_train, X_test)
 
    model.embedding_  # embedding

diff --git a/examples/alien-eggs/random.yaml b/examples/alien-eggs/random.yaml
@@ -3,4 +3,4 @@ debrief: Thanks! Please enter this participant ID in the assignment.
 max_queries: 100  # took about 2.05 minutes for 100 queries
 d: 2
 samplers:
-  RandomSampling: {}
+  Random: {}
diff --git a/examples/basic.yaml b/examples/basic.yaml
@@ -3,6 +3,8 @@ d: 2
 max_queries: 100
 samplers:
   ARR: {}
-  RandomSampling: {}
+  Random: {}
+  Validation:
+    queries: [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
 sampling:
-  probs: {"ARR": 80, "RandomSampling": 20}
+  probs: {"ARR": 40, "Validation": 40, "Random": 20}
diff --git a/examples/colors/colors.yaml b/examples/colors/colors.yaml
@@ -5,9 +5,9 @@ d: 2
 samplers:
   ARR:
     random_state: 42
-  RandomSampling: {}
+  Random: {}
 sampling:
-  probs: {"RandomSampling": 38, "ARR": 62}
+  probs: {"Random": 38, "ARR": 62}
 css: >
   body {
     background-color: #414141;

diff --git a/examples/faces/random.yaml b/examples/faces/random.yaml
@@ -3,4 +3,4 @@ debrief: Thanks! Please enter this participant ID in the assignment.
 max_queries: 100  # took about 2.05 minutes for 100 queries
 d: 2
 samplers:
-  RandomSampling: {}
+  Random: {}
diff --git a/examples/zappos/active.yaml b/examples/zappos/active.yaml
@@ -3,7 +3,7 @@ debrief: Thanks! Please enter this participant ID in the assignment.
 max_queries: 100
 d: 2
 samplers:
-  RandomSampling: {}
+  Random: {}
   ARR: {random_state: 42}
 sampling:
-  probs: {"ARR": 80, "RandomSampling": 20}
+  probs: {"ARR": 80, "Random": 20}
diff --git a/salmon.yml b/salmon.yml
@@ -23,6 +23,7 @@ dependencies:
   - pyyaml
   - ipykernel  # to run this environ in jupyter (required for viz)
   - nb_conda_kernels
+  - altair
   - pip:
     - seaborn
     - fastapi[all]

diff --git a/salmon/backend/__init__.py b/salmon/backend/__init__.py
@@ -1,2 +1,2 @@
 from .core import app
-from .sampler import Runner
+from .sampler import Sampler
diff --git a/salmon/backend/core.py b/salmon/backend/core.py
@@ -84,9 +84,9 @@ async def init(ident: str, background_tasks: BackgroundTasks) -> bool:
          - 3
          - 4
        samplers:
-         - RandomSampling
+         - Random
          - random2
-           - class: RandomSampling
+           - class: Random
            - foo: bar
 
     """

diff --git a/salmon/backend/sampler.py b/salmon/backend/sampler.py
@@ -21,7 +21,7 @@
 root = Path.rootPath()
 
 
-class Runner:
+class Sampler:
     """
     Run a sampling algorithm. Provides hooks to connect with the database and
     the Dask cluster.
@@ -117,12 +117,9 @@ def submit(fn: str, *args, allow_other_workers=True, **kwargs):
                     "process_answers", self_future, answers, workers=workers[1],
                 )
 
-                if hasattr(self, "get_queries"):
-                    f_search = submit(
-                        "get_queries", self_future, stop=done, workers=workers[2],
-                    )
-                else:
-                    f_search = client.submit(lambda x: ([], [], {}), 0)
+                f_search = submit(
+                    "get_queries", self_future, stop=done, workers=workers[2],
+                )
 
                 time_model = 0.0
                 time_post = 0.0
@@ -218,7 +215,7 @@ def _search_done(_):
 
     def save(self) -> bool:
         """
-        Save the runner's state and current embedding to the database.
+        Save the sampler's state and current embedding to the database.
         """
         rj2 = self.redis_client(decode_responses=False)
 
@@ -309,25 +306,27 @@ def process_answers(self, answers: List[Answer]):
         """
         raise NotImplementedError
 
-        #  def get_queries(self) -> Tuple[List[Query], List[float]]:
-        """
-        Get queries.
+    def get_queries(self) -> Tuple[List[Query], List[float], Dict[str, Any]]:
+        """
+        Get queries.
 
-        Returns
-        -------
-        queries : List[Query]
-            The list of queries
-        scores : List[float]
-            The scores for each query. Higher scores are sampled more
-            often.
+        Returns
+        -------
+        queries : List[Query]
+            The list of queries
+        scores : List[float]
+            The scores for each query. Higher scores are sampled more
+            often.
+        meta : Dict[str, Any]
+            Information about the search.
 
-        Notes
-        -----
-        The scores have to be unique. The underlying implementation does
-        not sample queries of the same score unbiased.
+        Notes
+        -----
+        The scores have to be unique. The underlying implementation is
+        biased when sampling queries that share a score.
 
-        """
-        raise NotImplementedError
+        """
+        return [], [], {}
 
     def get_model(self) -> Dict[str, Any]:
         """
@@ -343,7 +342,7 @@ def get_model(self) -> Dict[str, Any]:
 
     def clear_queries(self, rj: RedisClient) -> bool:
         """
-        Clear all queries that this runner has posted from the database.
+        Clear all queries that this sampler has posted from the database.
         """
         rj.delete(f"alg-{self.ident}-queries")
         return True
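
Reviewer note: this change replaces the `hasattr(self, "get_queries")` check with a default `get_queries` that returns empty results, so passive samplers no longer need special handling in the run loop. The sketch below shows that pattern in isolation. `BaseSampler`, `ScoredSampler`, `search`, and the `Query` alias are stand-ins invented for this note, not Salmon's classes.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical stand-in for Salmon's Query type: (head, left, right).
Query = Tuple[int, int, int]


class BaseSampler:
    def get_queries(self) -> Tuple[List[Query], List[float], Dict[str, Any]]:
        # Default hook: no queries, no scores, empty metadata.
        return [], [], {}


class ScoredSampler(BaseSampler):
    def get_queries(self) -> Tuple[List[Query], List[float], Dict[str, Any]]:
        # An adaptive sampler overrides the hook and returns scored queries.
        queries = [(0, 1, 2), (0, 2, 3)]
        scores = [0.7, 0.3]
        return queries, scores, {"n_queries": len(queries)}


def search(sampler: BaseSampler):
    # The caller can invoke the hook unconditionally; no hasattr check.
    queries, scores, meta = sampler.get_queries()
    return queries, scores, meta
```

Because every sampler now answers `get_queries`, the caller needs only one code path.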