API: implement SRR and various maintenance/improvement tasks. #118

Merged
merged 51 commits into from
Oct 1, 2021
51 commits
5ea1f33
API: allow specification of triplets in validation
stsievert Jul 15, 2021
f9ecaa0
add example and debug
stsievert Jul 15, 2021
d44721f
only run doc build on release
stsievert Jul 15, 2021
97719c7
add samplers_per_user
stsievert Jul 19, 2021
d9a94b3
redirect
stsievert Jul 19, 2021
e38472a
another redirect
stsievert Jul 22, 2021
72416e4
redirect
stsievert Jul 23, 2021
3ccb004
API: Implement ARRProxy (and clean)
stsievert Jul 26, 2021
15dea81
test change
stsievert Jul 26, 2021
7745aec
DOC: better naming
stsievert Jul 26, 2021
1e2da68
Always run get_queries
stsievert Jul 26, 2021
3a516af
fmt
stsievert Jul 28, 2021
a7bbcaa
remove keyboard shortcut
stsievert Jul 28, 2021
dce1226
redirect
stsievert Jul 28, 2021
4727667
slightly edits probs
stsievert Jul 28, 2021
9c0bed6
properly redirect
stsievert Jul 28, 2021
dfabb2e
woah, *properly* redirect
stsievert Jul 28, 2021
df674c2
*actually* properly redirect
stsievert Jul 28, 2021
a97c488
typo
stsievert Jul 28, 2021
d7a4148
ENH: allow initializing embedding
stsievert Jul 30, 2021
12ed61e
BUG: allow some samplers not to have embeddings
stsievert Jul 30, 2021
bfa29c4
DOC: show usage of initialize
stsievert Jul 30, 2021
3970c20
add embedding init
stsievert Aug 2, 2021
9c083e5
bump
stsievert Aug 2, 2021
7556c1a
Update continuumio/miniconda3 docker version
stsievert Aug 17, 2021
c120a16
alpha preload
stsievert Aug 17, 2021
4f5d7d0
better preloading images
stsievert Aug 18, 2021
8cd2828
redirect
stsievert Aug 18, 2021
d8d2e75
new redirect...
stsievert Aug 18, 2021
6183b1b
doc typo
stsievert Aug 24, 2021
589d7af
Add function to see stats
stsievert Aug 26, 2021
7fb6cd7
test redirect
stsievert Aug 28, 2021
0c06c9e
final redirect?
stsievert Aug 30, 2021
ee0cba9
Update OGD
stsievert Sep 2, 2021
d0d31bc
Mirror GeoDamp
stsievert Sep 2, 2021
e544313
no_grad
stsievert Sep 2, 2021
ce38b65
mbs
stsievert Sep 2, 2021
c7d0faf
% 0
stsievert Sep 2, 2021
a9b6cc8
small change
stsievert Sep 3, 2021
f2997ad
small change
stsievert Sep 3, 2021
9b4abe3
gd
stsievert Sep 3, 2021
4c475a6
param change
stsievert Sep 5, 2021
c8d8d5a
param change
stsievert Sep 5, 2021
0d12ca9
another param change
stsievert Sep 8, 2021
5562f22
no dwell by default
stsievert Sep 9, 2021
36f071d
SARR → SRR
stsievert Sep 14, 2021
bab6dce
active RR → async RR
stsievert Oct 1, 2021
ce6ee93
revert docs
stsievert Oct 1, 2021
4bf9fa9
link to exps
stsievert Oct 1, 2021
2caf5e1
Delete unnecessary key
stsievert Oct 1, 2021
d8bf8f0
better reset
stsievert Oct 1, 2021
8 changes: 4 additions & 4 deletions .github/workflows/docs.yml
@@ -1,9 +1,9 @@
name: Documentation build

on: push
# on:
# release:
# types: [published]
# on: push
on:
release:
types: [published]

# Only run when release published (not created or edited, etc)
# https://docs.github.com/en/actions/reference/events-that-trigger-workflows#release
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
FROM continuumio/miniconda3:4.9.2
FROM continuumio/miniconda3:4.10.3

RUN apt-get update
RUN apt-get install -y gcc cmake g++
1 change: 1 addition & 0 deletions README.md
@@ -5,4 +5,5 @@
<img src="https://github.com/stsievert/salmon/actions/workflows/docs.yml/badge.svg" alt="Documentation status badge" />
</a>


See the documentation for more detail: https://docs.stsievert.com/salmon/
43 changes: 37 additions & 6 deletions docs/source/_static/alieneggs.html
@@ -1,10 +1,41 @@
<!DOCTYPE html>
<!DOCTYPE html>

<html>
<body>
<script>
var urls = ["http://18.237.56.89:8421", "http://52.38.73.113:8421"];
window.location.href = urls[Math.floor(Math.random() * urls.length)]
</script>
<p>Please enable Javascript for to be randomly redirected.</p>

<div id="msg">
<p>Please enable Javascript for a (random) redirection</p>
</div>

<script>
var ips = ["54.218.25.50",
"34.211.149.10",
"54.190.3.114",
"52.37.51.10",
"18.236.202.200",
"18.237.150.81",
"54.186.127.78",
"54.185.86.120",
"52.13.66.158",
"54.202.49.132",
"34.212.172.73",
"18.237.134.246",
"52.41.75.11",
"52.37.147.72",
"34.213.44.119",
"34.209.246.231",
"54.71.64.190",
"54.190.102.80",
"18.237.54.48",
"54.68.42.104"];

var idx = Math.floor(Math.random() * ips.length); // between [0, N). floor(r) == [0, 1, ..., N-1]
var ip = ips[Math.floor(idx)]
var url = "http://" + ip + ":8421/"

var link = "<a href='" + url + "'>"+url+"<\/a>";
document.getElementById("msg").innerHTML = "<p>Redirecting to " + link + " <\/p>";
window.open(url, '_self');
</script>
</body>
</html>
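
The new redirect script picks one server uniformly at random: `Math.floor(Math.random() * ips.length)` yields an integer index in [0, N). A rough Python equivalent of that selection is sketched below. The `pick_server` helper and the example addresses are hypothetical placeholders, not part of this PR; only the port 8421 is taken from the diff.

```python
import math
import random


def pick_server(ips, port=8421, rng=random):
    """Return a redirect URL for one uniformly chosen IP (hypothetical helper)."""
    # rng.random() is in [0, 1), so the floor lands in {0, 1, ..., len(ips) - 1}.
    idx = math.floor(rng.random() * len(ips))
    return f"http://{ips[idx]}:{port}/"


# Placeholder addresses from the documentation range, not the PR's servers.
example_ips = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
url = pick_server(example_ips, rng=random.Random(0))
```

Because the index is already an integer after the first floor, a second `Math.floor` on it (as in `ips[Math.floor(idx)]`) has no effect.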
10 changes: 5 additions & 5 deletions docs/source/algorithms.rst
@@ -21,10 +21,10 @@ sampling.

targets: ["obj1", "obj2", "foo", "bar", "foobar!"]
samplers:
RandomSampling: {}
Random: {}
Validation: {"n_queries": 10}

By default, ``samplers`` defaults to ``RandomSampling: {}``. We have to customize the ``samplers`` key use adaptive sampling algorithms:
By default, ``samplers`` is ``Random: {}``. We have to customize the ``samplers`` key to use adaptive sampling algorithms:

.. code-block:: yaml

@@ -42,7 +42,7 @@ configuration:

targets: ["obj1", "obj2", "foo", "bar", "foobar!"]
samplers:
RandomSampling: {}
Random: {}
ARR:
module: "TSTE"

@@ -56,7 +56,7 @@ would compare two different instances of
targets: ["obj1", "obj2", "foo", "bar", "foobar!"]

samplers:
RandomSampling: {}
Random: {}
arr_tste:
class: ARR
module: "TSTE"
@@ -66,7 +66,7 @@ would compare two different instances of
module__mu: 0.02

sampling:
probs: {"RandomSampling": 20, "arr_ckl": 40, "arr_tste": 40}
probs: {"Random": 20, "arr_ckl": 40, "arr_tste": 40}

In this configuration, custom names are provided for two instances of
:class:`~salmon.triplets.samplers.ARR`. Both instances are sampled 40% of the
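
The `probs` entry assigns each sampler a percentage. One way to picture the selection is a weighted random choice. The sketch below assumes that interpretation and uses a hypothetical `choose_sampler` helper; Salmon's actual selection code is not shown in this diff.

```python
import random


def choose_sampler(probs, rng=random):
    """Pick a sampler name with probability proportional to its percentage."""
    names = list(probs)
    weights = [probs[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Percentages copied from the configuration in the diff.
probs = {"Random": 20, "arr_ckl": 40, "arr_tste": 40}

rng = random.Random(42)
counts = {name: 0 for name in probs}
for _ in range(10_000):
    counts[choose_sampler(probs, rng)] += 1
```

Over many draws, `counts` should be roughly proportional to the configured percentages.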
2 changes: 1 addition & 1 deletion docs/source/api.rst
@@ -47,7 +47,7 @@ Passive Algorithms
:toctree: generated/
:template: only-init.rst

salmon.triplets.samplers.RandomSampling
salmon.triplets.samplers.Random
salmon.triplets.samplers.RoundRobin
salmon.triplets.samplers.Validation

11 changes: 5 additions & 6 deletions docs/source/benchmarks/active.rst
@@ -76,13 +76,12 @@ sampling with these ``init.yaml`` configurations:
d: 2
samplers:
ARR: {random_state: 42} # active or adaptive sampling
RandomSampling: {} # random sampling

The "ARR" stands for "active round robin" and creates an instance of
:class:`~salmon.triplets.samplers.ARR`. For this class, head rotates through
available choices ("round robin") and for each head, the best comparisons are
chosen (by some measure with information gain).
Random: {} # random sampling

The "ARR" stands for "asynchronous round robin" and creates an instance of
:class:`~salmon.triplets.samplers.ARR`. For this class, the query head is
randomly chosen, and then for each head, the best comparison items are ranked
by some measure (information gain by default).

.. note::

15 changes: 11 additions & 4 deletions docs/source/getting-started.rst
@@ -89,9 +89,9 @@ Here's an example ``init.yaml`` YAML file for initialization:
max_queries: 100
samplers:
ARR: {}
RandomSampling: {}
Random: {}
sampling:
probs: {"ARR": 80, "RandomSampling": 20}
probs: {"ARR": 80, "Random": 20}

The top-level elements like ``max_queries`` and ``targets`` are called "keys"
in YAML jargon. Here's documentation for each key:
@@ -103,8 +103,15 @@ in YAML jargon. Here's documentation for each key:
* ``max_queries``: int. The number of queries a participant should answer. Set
``max_queries: -1`` for unlimited queries.
* ``samplers``. See :ref:`adaptive-config` for more detail.
* ``sampling``. A dictionary with the key ``probs`` and percentage
probabilities for each algorithm.
* ``sampling``. A dictionary with the following keys:

* ``probs``, a map between sampler names and the percentage that
each sampler is selected.

* ``samplers_per_user``: (optional int, default=0). Controls the
number of samplers each user sees. If ``samplers_per_user=0``, show
users a random sampler.

* ``targets``, optional list. Choices:

* YAML list. This ``targets: ["vonn", "miller", "ligety", "shiffrin"]`` is
6 changes: 5 additions & 1 deletion docs/source/offline.rst
@@ -45,14 +45,18 @@ This code will generate an embedding:

from salmon.triplets.offline import OfflineEmbedding

df = pd.read_csv("responses.csv")
df = pd.read_csv("responses.csv") # from dashboard
X = df[["head", "winner", "loser"]].to_numpy()

em = pd.read_csv("embedding.csv") # from dashboard

n = int(X.max() + 1) # number of targets
d = 2 # embed into 2 dimensions

X_train, X_test = train_test_split(X, random_state=42, test_size=0.2)
model = OfflineEmbedding(n=n, d=d)
model.initialize(X_train, embedding=em.to_numpy())

model.fit(X_train, X_test)

model.embedding_ # embedding
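
The data preparation in the snippet above can be sketched without any dependencies: the number of targets is the largest index plus one, and 20% of the triplets are held out. The `split_triplets` helper is hypothetical and only stands in for scikit-learn's `train_test_split`; the responses are made-up rows, and the (n, d) shape of the initial embedding is an assumption.

```python
import math
import random


def split_triplets(triplets, test_size=0.2, seed=42):
    """Shuffle and split triplets (hypothetical stand-in for train_test_split)."""
    rows = list(triplets)
    random.Random(seed).shuffle(rows)
    n_test = math.ceil(test_size * len(rows))
    return rows[n_test:], rows[:n_test]


# Each made-up row is (head, winner, loser), indexing into the list of targets.
responses = [(0, 1, 2), (1, 0, 3), (2, 3, 0), (3, 1, 2), (0, 2, 3),
             (1, 2, 3), (2, 0, 1), (3, 0, 1), (0, 1, 3), (1, 3, 2)]

n = max(max(row) for row in responses) + 1  # number of targets
d = 2  # embed into 2 dimensions

train, test = split_triplets(responses)

# Assumed shape for an initial embedding: one d-dimensional point per target.
initial_embedding = [[0.0] * d for _ in range(n)]
```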
2 changes: 1 addition & 1 deletion examples/alien-eggs/random.yaml
@@ -3,4 +3,4 @@ debrief: Thanks! Please enter this participant ID in the assignment.
max_queries: 100 # took about 2.05 minutes for 100 queries
d: 2
samplers:
RandomSampling: {}
Random: {}
6 changes: 4 additions & 2 deletions examples/basic.yaml
@@ -3,6 +3,8 @@ d: 2
max_queries: 100
samplers:
ARR: {}
RandomSampling: {}
Random: {}
Validation:
queries: [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
sampling:
probs: {"ARR": 80, "RandomSampling": 20}
probs: {"ARR": 40, "Validation": 40, "Random": 20}
4 changes: 2 additions & 2 deletions examples/colors/colors.yaml
@@ -5,9 +5,9 @@ d: 2
samplers:
ARR:
random_state: 42
RandomSampling: {}
Random: {}
sampling:
probs: {"RandomSampling": 38, "ARR": 62}
probs: {"Random": 38, "ARR": 62}
css: >
body {
background-color: #414141;
2 changes: 1 addition & 1 deletion examples/faces/random.yaml
@@ -3,4 +3,4 @@ debrief: Thanks! Please enter this participant ID in the assignment.
max_queries: 100 # took about 2.05 minutes for 100 queries
d: 2
samplers:
RandomSampling: {}
Random: {}
4 changes: 2 additions & 2 deletions examples/zappos/active.yaml
@@ -3,7 +3,7 @@ debrief: Thanks! Please enter this participant ID in the assignment.
max_queries: 100
d: 2
samplers:
RandomSampling: {}
Random: {}
ARR: {random_state: 42}
sampling:
probs: {"ARR": 80, "RandomSampling": 20}
probs: {"ARR": 80, "Random": 20}
1 change: 1 addition & 0 deletions salmon.yml
@@ -23,6 +23,7 @@ dependencies:
- pyyaml
- ipykernel # to run this environ in jupyter (required for viz)
- nb_conda_kernels
- altair
- pip:
- seaborn
- fastapi[all]
2 changes: 1 addition & 1 deletion salmon/backend/__init__.py
@@ -1,2 +1,2 @@
from .core import app
from .sampler import Runner
from .sampler import Sampler
4 changes: 2 additions & 2 deletions salmon/backend/core.py
@@ -84,9 +84,9 @@ async def init(ident: str, background_tasks: BackgroundTasks) -> bool:
- 3
- 4
samplers:
- RandomSampling
- Random
- random2
- class: RandomSampling
- class: Random
- foo: bar

"""
49 changes: 24 additions & 25 deletions salmon/backend/sampler.py
@@ -21,7 +21,7 @@
root = Path.rootPath()


class Runner:
class Sampler:
"""
Run a sampling algorithm. Provides hooks to connect with the database and
the Dask cluster.
@@ -117,12 +117,9 @@ def submit(fn: str, *args, allow_other_workers=True, **kwargs):
"process_answers", self_future, answers, workers=workers[1],
)

if hasattr(self, "get_queries"):
f_search = submit(
"get_queries", self_future, stop=done, workers=workers[2],
)
else:
f_search = client.submit(lambda x: ([], [], {}), 0)
f_search = submit(
"get_queries", self_future, stop=done, workers=workers[2],
)

time_model = 0.0
time_post = 0.0
@@ -218,7 +215,7 @@ def _search_done(_):

def save(self) -> bool:
"""
Save the runner's state and current embedding to the database.
Save the sampler's state and current embedding to the database.
"""
rj2 = self.redis_client(decode_responses=False)

@@ -309,25 +306,27 @@ def process_answers(self, answers: List[Answer]):
"""
raise NotImplementedError

# def get_queries(self) -> Tuple[List[Query], List[float]]:
"""
Get queries.
def get_queries(self) -> Tuple[List[Query], List[float]]:
"""
Get queries.

Returns
-------
queries : List[Query]
The list of queries
scores : List[float]
The scores for each query. Higher scores are sampled more
often.
Returns
-------
queries : List[Query]
The list of queries
scores : List[float]
The scores for each query. Higher scores are sampled more
often.
meta : Dict[str, Any]
Information about the search.

Notes
-----
The scores have to be unique. The underlying implementation does
not sample queries of the same score unbiased.
Notes
-----
The scores have to be unique. The underlying implementation does
not sample queries of the same score unbiased.

"""
raise NotImplementedError
"""
return [], [], {}

def get_model(self) -> Dict[str, Any]:
"""
@@ -343,7 +342,7 @@ def get_model(self) -> Dict[str, Any]:

def clear_queries(self, rj: RedisClient) -> bool:
"""
Clear all queries that this runner has posted from the database.
Clear all queries that this sampler has posted from the database.
"""
rj.delete(f"alg-{self.ident}-queries")
return True
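
The change to `get_queries` in this file replaces a `hasattr` check with a default implementation that returns empty results, so the caller can always submit the search. A minimal sketch of that pattern follows. The class names (`BaseSampler`, `ScoredSampler`), the `run_search` function, and the simplified `Query` type are hypothetical and are not Salmon's own.

```python
from typing import Any, Dict, List, Tuple

Query = Tuple[int, int, int]  # simplified stand-in for Salmon's query type


class BaseSampler:
    """Hypothetical stand-in for the base class changed in this diff."""

    def get_queries(self) -> Tuple[List[Query], List[float], Dict[str, Any]]:
        # Default: no queries, no scores, empty metadata.
        return [], [], {}


class ScoredSampler(BaseSampler):
    """Hypothetical adaptive sampler that overrides the default."""

    def get_queries(self) -> Tuple[List[Query], List[float], Dict[str, Any]]:
        queries = [(0, 1, 2), (0, 2, 3), (1, 2, 3)]
        scores = [0.9, 0.5, 0.1]  # unique scores, one per query
        return queries, scores, {"n_queries": len(queries)}


def run_search(sampler: BaseSampler):
    # No hasattr check is needed: every sampler defines get_queries.
    queries, scores, meta = sampler.get_queries()
    return queries, scores, meta
```

With this layout, samplers that never search (for example, purely random ones) inherit the empty default instead of needing a special case in the caller.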