DOC: show adaptive algorithm performance (#92)
* BUG: Reverse posterior calculation
* ENH: add query plot on dashboard
* MAINT: add embedding algorithm SOE
* MAINT: Remove sample_weight from active offline
* DOC: Rework adaptive benchmark
stsievert authored Feb 2, 2021
1 parent 88bb858 commit f9eae7e
Showing 44 changed files with 381 additions and 561 deletions.
42 changes: 21 additions & 21 deletions docs/source/benchmarks/adaptive.rst
@@ -6,31 +6,31 @@ about a random question like random sampling. This can mean that higher
accuracies are reached sooner, or that fewer human responses are required to
reach a particular accuracy.

- Illustrative result
- -------------------
-
- Let's run a quick benchmark with Salmon to see how well adaptive sampling
- performs in the crowdsourcing context. This benchmark will accurately simulate
- a crowdsourcing setting:
-
- * Answers will be received by Salmon at a rate of 4 responses/second.
- * The answers will come from the Zappos shoe dataset, an exhaustively sampled
-   triplet dataset with 4 human responses to every possible question.
- * This dataset has :math:`n = 85` shoes, and I mirror Heim et al. and embed
-   into :math:`d = 2` dimensions [1]_.
- * The random and adaptive algorithms will be the same in every respect except
-   how they select queries.
-
- With that setup, how much of a difference does query selection make? Here's
- a result that illustrates the benefit of adaptive algorithms:
-
- .. image:: imgs/adaptive.png
-    :width: 400px
+ Synthetic simulation
+ --------------------
+
+ Let's compare adaptive sampling and random sampling. Specifically, let's use
+ Salmon like an experimentalist would:
+
+ 1. Launch Salmon with the "alien eggs" dataset (with :math:`n=50` objects,
+    embedded into :math:`d=2` dimensions).
+ 2. Simulate human users (6 users with a mean response time of 1 second; a
+    sketch of this simulation is below).
+ 3. Download the human responses from Salmon.
+ 4. Generate the embedding offline.

+ Let's run this process for adaptive and random sampling. When we do that, this
+ is the graph that's produced:
+
+ .. image:: imgs/synth-eg-acc.png
+    :width: 600px
+    :align: center
+
+ These are synthetic results, though they use a human noise model. These
+ experiments provide evidence that Salmon works well with adaptive sampling.

These measurements provide evidence to support the hypothesis that Salmon has
better performance than NEXT for adaptive triplet embeddings. For reference, in NEXT's
- introduction paper, the authors provided "no evidence for gains from adaptive
+ introduction paper, the authors found "no evidence for gains from adaptive
sampling" for the triplet embedding problem [2]_.

.. [1] "Active Perceptual Similarity Modeling with Auxiliary Information" by E.
Binary file removed docs/source/benchmarks/imgs/adaptive.png
Binary file added docs/source/benchmarks/imgs/synth-eg-acc.png
Binary file removed docs/source/imgs/logo.graffle/data.plist
Binary file removed docs/source/imgs/logo.graffle/image1.tiff
Binary file removed docs/source/imgs/logo.graffle/image2.tiff
Binary file removed docs/source/imgs/logo.graffle/image3.tiff
Binary file modified docs/source/imgs/logo.pages
Binary file modified docs/source/imgs/query.graffle/data.plist
Binary file modified docs/source/imgs/query.png
39 changes: 3 additions & 36 deletions docs/source/offline.rst
@@ -34,12 +34,6 @@ Install Salmon
Generate embeddings
-------------------

- First, let's cover random sampling. Adaptive algorithms require some special
- attention.
-
- Random embeddings
- """""""""""""""""

This code will generate an embedding:

.. code-block:: python
@@ -52,42 +46,16 @@ This code will generate an embedding:
n = int(X.max() + 1) # number of targets
d = 2 # embed into 2 dimensions
- X_train, X_test = train_test_split(X, random_state=0, test_size=0.2)
+ X_train, X_test = train_test_split(X, random_state=42, test_size=0.2)
model = OfflineEmbedding(n=n, d=d)
model.fit(X_train, X_test)
model.embedding_ # embedding
model.history_ # to view information on how well train/test performed
Some customization can be done with ``model.history_``; it may not be necessary
- to train for 200 epochs, for example. ``model.history_`` will include
- validation and training scores, which might help limit the number of epochs.

- Adaptive embeddings
- """""""""""""""""""
-
- Adaptive embeddings are mostly the same, but require the following:
-
- 1. Re-weighting the adaptively selected samples.
- 2. Splitting train/test properly.
-
- Re-weighting the samples is required because we don't want to overfit the
- adaptively selected samples.
-
- .. code-block:: python
-
- import pandas as pd  # import needed for read_csv below
-
- df = pd.read_csv("responses.csv") # downloaded from dashboard
- test = df.alg_ident == "RandomSampling" # random responses form the test set
- train = df.alg_ident == "TSTE" # an adaptive algorithm
- cols = ["head", "winner", "loser"]
- X_test = df.loc[test, cols].to_numpy()
- X_train = df.loc[train, cols].to_numpy()
- model = OfflineEmbedding(n=int(df["head"].max() + 1), d=2, weight=True)
- model.fit(X_train, X_test, scores=df.loc[train, "score"])
+ to train for 1,000,000 epochs. ``model.history_`` will include validation and
+ training scores, which might help limit the number of epochs.
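
One way to act on that advice is to plot the history and look for where the
validation score flattens out. A minimal sketch, assuming each entry of
``model.history_`` records per-epoch training and validation scores (the
column names below are hypothetical):

.. code-block:: python

   import pandas as pd
   import matplotlib.pyplot as plt

   # Each row is one epoch's record; the score column names are assumptions.
   history = pd.DataFrame(model.history_)
   ax = history.plot(y=["train_score", "validation_score"])
   ax.set_xlabel("epoch")
   ax.set_ylabel("score")
   plt.show()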

Embedding visualization
-----------------------
@@ -107,4 +75,3 @@ the embedding, which might be `Matplotlib`_, the `Pandas visualization API`_,
.. _Bokeh: https://bokeh.org/
.. _Matplotlib: https://matplotlib.org/
.. _Altair: https://altair-viz.github.io/
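
For a quick look at the embedding with `Matplotlib`_, a minimal sketch
(assuming ``model.embedding_`` is an ``(n, 2)`` NumPy array, one row per
target, as with ``d = 2`` above):

.. code-block:: python

   import matplotlib.pyplot as plt

   em = model.embedding_  # shape (n, 2) when d=2
   fig, ax = plt.subplots()
   ax.scatter(em[:, 0], em[:, 1])
   for i, (x, y) in enumerate(em):
       ax.annotate(str(i), (x, y))  # label each point with its target index
   plt.show()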

2 changes: 1 addition & 1 deletion examples/datasets.py
@@ -5,7 +5,7 @@
from sklearn.utils import check_random_state


- def strange_fruit(head, left, right, random_state=None):
+ def alien_egg(head, left, right, random_state=None):
"""
Parameters
----------
Binary file added examples/datasets/alien-eggs-triplets/images.zip
@@ -623,7 +623,7 @@
"from sklearn.utils import check_random_state\n",
"\n",
"\n",
"def strange_fruit(head, left, right, random_state=None):\n",
"def alien_egg(head, left, right, random_state=None):\n",
" \"\"\"\n",
" Parameters\n",
" ----------\n",
@@ -672,7 +672,7 @@
}
],
"source": [
"strange_fruit(0, 1, 3)"
"alien_egg(0, 1, 3)"
]
},
{
@@ -686,7 +686,7 @@
"n_targets = 600\n",
"num_ans = 100_000\n",
"X = rng.choice(n_targets, size=(num_ans, 3))\n",
"y = [strange_fruit(h, l, r, random_state=rng) for h, l, r in X]"
"y = [alien_egg(h, l, r, random_state=rng) for h, l, r in X]"
]
},
{
Binary file added examples/datasets/faces.zip
