Squashed commit of the following:

commit adcaff2
Author: Marek Wydmuch <marek@wydmuch.poznan.pl>
Date:   Mon Mar 20 15:30:30 2023 +0100

    feat: add a training loss calculation to the predict method of PLT reduction (#4534)

    * add a training loss calculation to the predict method of PLT reduction
    * update PLT demo
    * update the tests for PLT reduction
    * disable the calculation of additional evaluation measures in PLT reduction when true labels are not available
    * apply black formating to plt_demo.py
    * remove unnecessary reset of weighted_holdout_examples variable in PLT reduction
    * revert the change of the path to the exe in plt_demo.py
    * apply black formating again to plt_demo.py

    ---------

    Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com>

commit f7a197e
Author: Griffin Bassman <griffinbassman@gmail.com>
Date:   Fri Mar 17 11:01:17 2023 -0400

    refactor: separate cb_to_cs_adf_mtr and cb_to_cs_adf_dr (#4532)

    * refactor: separate cb_to_cs_adf_mtr and cb_to_cs_adf_dr
    * clang
    * unused
    * remove mtr

commit e5597ae
Author: swaptr <83858160+swaptr@users.noreply.github.com>
Date:   Fri Mar 17 02:21:50 2023 +0530

    fix: fix multiline typo (#4533)

commit 301800a
Author: Eduardo Salinas <edus@microsoft.com>
Date:   Wed Mar 15 12:25:55 2023 -0400

    test: [automl] improve runtest and test changes (#4531)

commit 258731c
Author: Griffin Bassman <griffinbassman@gmail.com>
Date:   Tue Mar 14 13:28:03 2023 -0400

    chore: Update Version to 9.5.0 (#4529)

commit 49131be
Author: Eduardo Salinas <edus@microsoft.com>
Date:   Tue Mar 14 11:03:24 2023 -0400

    fix: [automl] avoid ccb pulling in generate_interactions (#4524)

    * fix: [automl] avoid ccb pulling in generate_interactions
    * same features in one line minimal repro
    * add assert of reserve size
    * update test file
    * remove include and add comment
    * temp print
    * sorting interactions matters
    * update temp print
    * fix by accounting for slot ns
    * remove prints
    * change comment and remove commented code
    * add sort to test
    * update runtests
    * Squashed commit of the following:

        commit 322a2b1
        Author: Eduardo Salinas <edus@microsoft.com>
        Date:   Mon Mar 13 21:51:49 2023 +0000

            possibly overwrite vw brought in by vw-executor

        commit 0a6baa0
        Author: Eduardo Salinas <edus@microsoft.com>
        Date:   Mon Mar 13 21:25:46 2023 +0000

            add check for metrics

        commit 469cebe
        Author: Eduardo Salinas <edus@microsoft.com>
        Date:   Mon Mar 13 21:22:38 2023 +0000

            update test

        commit 7c0b212
        Author: Eduardo Salinas <edus@microsoft.com>
        Date:   Mon Mar 13 21:11:45 2023 +0000

            format and add handler none

        commit 533e067
        Author: Eduardo Salinas <edus@microsoft.com>
        Date:   Mon Mar 13 20:56:07 2023 +0000

            test: [automl] add ccb test that checks for ft names

    * update python test
    * Update automl_oracle.cc

commit 37f4b19
Author: Griffin Bassman <griffinbassman@gmail.com>
Date:   Fri Mar 10 17:38:02 2023 -0500

    refactor: remove resize in gd setup (#4526)

    * refactor: remove resize in gd setup
    * rm resize

commit 009831b
Author: Griffin Bassman <griffinbassman@gmail.com>
Date:   Fri Mar 10 16:57:53 2023 -0500

    fix: multi-model state for cb_adf (#4513)

    * switch to vector
    * fix aml and ep_dec
    * clang
    * reorder
    * clang
    * reorder

commit a31ef14
Author: Griffin Bassman <griffinbassman@gmail.com>
Date:   Fri Mar 10 14:52:50 2023 -0500

    refactor: rename wpp, ppw, ws, params_per_problem, problem_multiplier, num_learners, increment -> feature_width (#4521)

    * refactor: rename wpp, ppw, ws, params_per_problem, problem_multiplier, num_learners, increment -> interleaves
    * clang
    * clang
    * settings
    * make bottom interleaves the same
    * remove bottom_interleaves
    * fix test
    * feature width
    * clang

commit 8390f48
Author: Griffin Bassman <griffinbassman@gmail.com>
Date:   Fri Mar 10 12:25:12 2023 -0500

    refactor: dedup dict const (#4525)

    * refactor: dedup dict const
    * clang

commit 2238d70
Author: Jack Gerrits <jackgerrits@users.noreply.github.com>
Date:   Thu Mar 9 13:51:35 2023 -0500

    refactor: add api to set data object associated with learner (#4523)

    * refactor: add api to set data object associated with learner
    * add shared ptr func

commit b622540
Author: Griffin Bassman <griffinbassman@gmail.com>
Date:   Tue Mar 7 12:16:39 2023 -0500

    fix: cbzo ppw fix (#4519)

commit f83cb7f
Author: Jack Gerrits <jackgerrits@users.noreply.github.com>
Date:   Tue Mar 7 11:21:51 2023 -0500

    refactor: automatically set label parser after stack created (#4471)

    * refactor: automatically set label parser after stack created
    * a couple of fixes
    * Put in hack to keep search working
    * formatting

commit 64e5920
Author: olgavrou <olgavrou@gmail.com>
Date:   Fri Mar 3 16:20:05 2023 -0500

    feat: [LAS] with CCB (#4520)

commit 69bf346
Author: Jack Gerrits <jackgerrits@users.noreply.github.com>
Date:   Fri Mar 3 15:17:29 2023 -0500

    refactor: make flat_example an implementation detail of ksvm (#4505)

    * refactor!: make flat_example an implementation detail of ksvm
    * Update memory_tree.cc
    * Absorb flat_example into svm_example
    * revert "Absorb flat_example into svm_example"

      This reverts commit b063feb.

commit f08f1ec
Author: Jack Gerrits <jackgerrits@users.noreply.github.com>
Date:   Fri Mar 3 14:04:48 2023 -0500

    test: fix pytype issue in test runner and utl (#4517)

    * test: fix pytype issue in test runner
    * fix version_number.py type checker issues

commit a8b1d91
Author: Eduardo Salinas <edus@microsoft.com>
Date:   Fri Mar 3 12:59:26 2023 -0500

    fix: [epsdecay] return champ prediction always (#4518)

commit b2276c1
Author: olgavrou <olgavrou@gmail.com>
Date:   Thu Mar 2 20:18:23 2023 -0500

    chore: [LAS] don't force mtr with LAS (#4516)

commit c0ba180
Author: olgavrou <olgavrou@gmail.com>
Date:   Tue Feb 28 11:27:36 2023 -0500

    feat: [LAS] add example ft hash and cache and re-use rows of matrix if actions do not change (#4509)

commit e1a9363
Author: Eduardo Salinas <edus@microsoft.com>
Date:   Mon Feb 27 16:35:09 2023 -0500

    feat: [gd] persist ppw extra state (#4023)

    * feat: [gd] persist ppm state
    * introduce resize_ppw_state
    * wip: move logic down to gd, respect incoming ft_offset
    * replace assert with status quo behaviour
    * implement writing/reading to modelfile
    * remove from predict
    * update test 351 and 411
    * update sensitivity and update
    * remove debug prints
    * update all tests
    * apply fix of other pr
    * use .at() for bounds checking
    * add max_ft_offset and add asserts
    * comment extra assert that is failing
    * remove files
    * fix automl tests
    * more tests
    * tests
    * tests
    * clang
    * fix for predict_only_model automl
    * comment
    * fix ppm printing
    * temporarily remove tests 50 and 68
    * address comments
    * expand width for search
    * fix tests
    * merge
    * revert cb_adf
    * merge
    * fix learner
    * clang
    * search fix
    * clang
    * fix unit tests
    * bump 9.7.1 for version CIs
    * revert to 9.7.0
    * stop search from learning out of bounds
    * expand search num_learners
    * fix search cs test
    * comment
    * revert ext_libs
    * clang
    * comment out saveresume tests
    * pylint
    * comment
    * fix with search
    * fix search
    * clang
    * unused
    * unused
    * commnets
    * fix scope_exit
    * fix cs test
    * revert automl test update
    * remove resize
    * clang

    ---------

    Co-authored-by: Griffin Bassman <griffinbassman@gmail.com>
VowpalWabbit · Mar 20, 2023 · b7e1c4b
1 parent a191552
commit b7e1c4b
Showing 216 changed files with 5,508 additions and 4,939 deletions.
diff --git a/.github/workflows/python_wheels.yml b/.github/workflows/python_wheels.yml
@@ -70,8 +70,8 @@ jobs:
         shell: bash
         run: |
           pip install -r requirements.txt
-          pip install pytest twine
-          pip install built_wheel/*.whl
+          pip install pytest vw-executor twine
+          pip install --force-reinstall built_wheel/*.whl
           twine check built_wheel/*.whl
           python -m pytest ./python/tests/
           python ./python/tests/run_doctest.py
@@ -133,7 +133,7 @@ jobs:
         shell: bash
         run: |
           pip install -r requirements.txt
-          pip install pytest
+          pip install pytest vw-executor
       - name: Run unit tests
         shell: bash
         run: |
@@ -212,8 +212,8 @@ jobs:
             source .env/bin/activate && \
             pip install --upgrade pip && \
             pip install -r requirements.txt && \
-            pip install pytest twine && \
-            pip install built_wheel/*.whl && \
+            pip install pytest vw-executor twine && \
+            pip install --force-reinstall built_wheel/*.whl && \
             twine check built_wheel/*.whl && \
             python --version && \
             python -m pytest ./python/tests/ && \
@@ -272,8 +272,8 @@ jobs:
         shell: bash
         run: |
           pip install -r requirements.txt
-          pip install pytest twine
-          pip install built_wheel/*.whl
+          pip install pytest vw-executor twine
+          pip install --force-reinstall built_wheel/*.whl
           twine check built_wheel/*.whl
           python -m pytest ./python/tests/
           python ./python/tests/run_doctest.py
@@ -361,8 +361,8 @@ jobs:
           export wheel_file="${wheel_files[0]}"
           echo Installing ${wheel_file}...
           pip install -r requirements.txt
-          pip install pytest twine
-          pip install ${wheel_file}
+          pip install pytest vw-executor twine
+          pip install --force-reinstall ${wheel_file}
           twine check ${wheel_file}
           python -m pytest .\\python\\tests\\
           python .\\python\\tests\\run_doctest.py
diff --git a/cs/unittest/TestSearch.cs b/cs/unittest/TestSearch.cs
@@ -146,7 +146,7 @@ void RunSearchPredict(Multiline multiline)
 
                     rawvw.EndOfPass();
                     driver.Reset();
-                } while (remainingPasses-- > 0);
+                } while (--remainingPasses > 0);
 
                 driver.Reset();
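
Note on the loop condition changed in this hunk: in a `do ... while` loop, `remainingPasses-- > 0` compares the value before decrementing it, so the body runs one more time than it does with `--remainingPasses > 0`. The sketch below emulates both conditions in Python, which has no `--` operator. The function names are hypothetical and only count iterations; they are not part of the test suite.

```python
def runs_with_post_decrement(remaining_passes: int) -> int:
    """Emulates: do { body } while (remainingPasses-- > 0);"""
    runs = 0
    while True:
        runs += 1                      # loop body executes
        old_value = remaining_passes   # post-decrement yields the old value
        remaining_passes -= 1
        if not (old_value > 0):
            break
    return runs


def runs_with_pre_decrement(remaining_passes: int) -> int:
    """Emulates: do { body } while (--remainingPasses > 0);"""
    runs = 0
    while True:
        runs += 1                      # loop body executes
        remaining_passes -= 1          # pre-decrement yields the new value
        if not (remaining_passes > 0):
            break
    return runs


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, runs_with_post_decrement(n), runs_with_pre_decrement(n))
```

For a starting value of `n >= 1`, the post-decrement form runs the body `n + 1` times and the pre-decrement form runs it `n` times.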
 

diff --git a/demo/plt/README.md b/demo/plt/README.md
@@ -1,5 +1,5 @@
 Probabilistic Label Tree demo
--------------------------------
+-----------------------------
 
 This demo presents PLT for applications of logarithmic time multilabel classification.
 It uses Mediamill dataset from the [LIBLINEAR datasets repository](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multilabel.html)
@@ -12,19 +12,22 @@ The datasets and paremeters can be easliy edited in the script. The script requi
 ## PLT options
 ```
 --plt                       Probabilistic Label Tree with <k> labels
---kary_tree                 use <k>-ary tree. By default = 2 (binary tree)
---threshold                 predict labels with conditional marginal probability greater than <thr> threshold"     
+--kary_tree                 use <k>-ary tree. By default = 2 (binary tree), 
+                            higher values usually give better results, but increase training time
+--threshold                 predict labels with conditional marginal probability greater than <thr> threshold     
 --top_k                     predict top-<k> labels instead of labels above threshold                 
 ```
 
+
 ## Tips for using PLT
 PLT accelerates training and prediction for a large number of classes, 
 if you have less than 10000 classes, you should probably use OAA.
 If you have a huge number of labels and features at the same time, 
 you will need as many bits (`-b`) as can afford computationally for the best performance.
 You may also consider using `--sgd` instead of default adaptive, normalized, and invariant updates to 
 gain more memory for feature weights what may lead to better performance.
-You may also consider using `--holdout_off` if you have many rare labels in your data. 
+If you have many rare labels in your data, you should train with `--holdout_off`, that disables usage of holdout (validation) dataset for early stopping.
+
 
 ## References
 

diff --git a/demo/plt/plt_demo.py b/demo/plt/plt_demo.py
@@ -6,7 +6,7 @@
 # This is demo example that demonstrates usage of PLT reduction on few popular multilabel datasets.
 
 # Select dataset
-dataset = "mediamill_exp1"  # should be in ["mediamill_exp1", "eurlex", "rcv1x", "wiki10", "amazonCat"]
+dataset = "eurlex"  # should be in ["mediamill_exp1", "eurlex", "rcv1x", "wiki10", "amazonCat"]
 
 # Select reduction
 reduction = "plt"  # should be in ["plt", "multilabel_oaa"]
@@ -18,10 +18,16 @@
 output_model = f"{dataset}_{reduction}_model"
 
 # Parameters
-kary_tree = 16
-l = 0.5
-passes = 3
-other_training_params = "--holdout_off"
+kary_tree = 16  # arity of the tree,
+# higher values usually give better results, but increase training time
+
+passes = 5  # number of passes over the dataset,
+# for some datasets you might want to change number of passes
+
+l = 0.5  # learning rate
+
+other_training_params = "--holdout_off"  # because these datasets have many rare labels,
+# disabling the holdout set improves the final performance
 
 # dict with params for different datasets (k and b)
 params_dict = {
@@ -34,14 +40,20 @@
 if dataset in params_dict:
     k, b = params_dict[dataset]
 else:
-    print(f"Dataset {dataset} is not supported for this demo.")
+    print(f"Dataset {dataset} is not supported by this demo.")
 
 # Download dataset (source: http://manikvarma.org/downloads/XC/XMLRepository.html)
+# Datasets were transformed to VW's format,
+# and features values were normalized (this helps with performance).
 if not os.path.exists(train_data):
+    print("Downloading train dataset:")
     os.system("wget http://www.cs.put.poznan.pl/mwydmuch/data/{}".format(train_data))
+
 if not os.path.exists(test_data):
+    print("Downloading test dataset:")
     os.system("wget http://www.cs.put.poznan.pl/mwydmuch/data/{}".format(test_data))
 
+
 print(f"\nTraining Vowpal Wabbit {reduction} on {dataset} dataset:\n")
 start = time.time()
 train_cmd = f"vw {train_data} -c --{reduction} {k} --loss_function logistic -l {l} --passes {passes} -b {b} -f {output_model} {other_training_params}"
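
The `--threshold` and `--top_k` options listed in the PLT README describe two ways of turning per-label probabilities into a predicted label set. The sketch below illustrates only that output rule on a plain dictionary of probabilities. It is not how the PLT reduction computes predictions internally; the function name, the default threshold of 0.5, and the sample values are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def select_labels(
    probs: Dict[int, float],
    threshold: float = 0.5,       # assumed default, for illustration only
    top_k: Optional[int] = None,
) -> List[int]:
    """Pick labels either above a probability threshold or as the top-k."""
    if top_k is not None:
        # top-k mode: rank labels by probability, keep the k most probable
        ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
        return [label for label, _ in ranked[:top_k]]
    # threshold mode: keep every label whose probability exceeds the threshold
    return [label for label, p in probs.items() if p > threshold]


if __name__ == "__main__":
    example = {1: 0.9, 2: 0.2, 3: 0.6}  # hypothetical marginal probabilities
    print(select_labels(example, threshold=0.5))
    print(select_labels(example, top_k=1))
```

Threshold mode can return any number of labels, including none, while top-k mode returns at most `k` labels regardless of how low their probabilities are.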

diff --git a/java/src/test/java/vowpalWabbit/learner/VWScalarsLearnerTest.java b/java/src/test/java/vowpalWabbit/learner/VWScalarsLearnerTest.java
@@ -39,9 +39,9 @@ public void probs() throws IOException {
         float[][] expected = new float[][]{
                 new float[]{0.333333f, 0.333333f, 0.333333f},
                 new float[]{0.475999f, 0.262000f, 0.262000f},
-                new float[]{0.373369f, 0.345915f, 0.280716f},
-                new float[]{0.360023f, 0.415352f, 0.224625f},
-                new float[]{0.340208f, 0.355738f, 0.304054f}
+                new float[]{0.374730f, 0.344860f, 0.280410f},
+                new float[]{0.365531f, 0.411037f, 0.223432f},
+                new float[]{0.341800f, 0.354783f, 0.303417f}
         };
         assertEquals(expected.length, pred.length);
         for (int i=0; i<expected.length; ++i)

diff --git a/python/pylibvw.cc b/python/pylibvw.cc
@@ -841,7 +841,7 @@ void unsetup_example(vw_ptr vwP, example_ptr ae)
     }
   }
 
-  uint32_t multiplier = all.reduction_state.wpp << all.weights.stride_shift();
+  uint32_t multiplier = all.reduction_state.total_feature_width << all.weights.stride_shift();
   if (multiplier != 1)  // make room for per-feature information.
     for (auto ns : ae->indices)
       for (auto& idx : ae->feature_space[ns].indices) idx /= multiplier;
@@ -1657,7 +1657,8 @@ BOOST_PYTHON_MODULE(pylibvw)
 
   py::class_<Search::search, search_ptr>("search")
       .def("set_options", &Search::search::set_options, "Set global search options (auto conditioning, etc.)")
-      //.def("set_num_learners", &Search::search::set_num_learners, "Set the total number of learners you want to
+      //.def("set_total_feature_width", &Search::search::set_total_feature_width, "Set the total number of learners you
+      // want to
       // train")
       .def("get_history_length", &Search::search::get_history_length,
           "Get the value specified by --search_history_length")
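
Note on the renamed field in `unsetup_example`: the multiplier is the total feature width shifted left by the weight stride shift, and each stored feature index is divided by it when it differs from 1. The Python sketch below mirrors that arithmetic on plain integers. The helper names and the sample values are hypothetical and are not taken from VW's defaults.

```python
from typing import Iterable, List


def index_multiplier(total_feature_width: int, stride_shift: int) -> int:
    """Mirrors: total_feature_width << stride_shift."""
    return total_feature_width << stride_shift


def unscale_indices(indices: Iterable[int], multiplier: int) -> List[int]:
    """Mirrors the `idx /= multiplier` loop (unsigned integer division)."""
    if multiplier == 1:
        return list(indices)  # nothing to undo
    return [idx // multiplier for idx in indices]


if __name__ == "__main__":
    m = index_multiplier(total_feature_width=4, stride_shift=2)  # 4 << 2
    print(m, unscale_indices([16, 32, 48], m))
```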

diff --git a/python/tests/test_ccb.py b/python/tests/test_ccb.py
@@ -122,3 +122,138 @@ def test_ccb_non_slot_none_outcome():
     # CCB label is set to UNSET by default.
     assert label.type == vowpalwabbit.CCBLabelType.UNSET
     assert label.outcome is None
+
+
+def test_ccb_and_automl():
+    import random, json, os, shutil
+    import numpy as np
+    from vw_executor.vw import Vw
+
+    people_ccb = ["Tom", "Anna"]
+    topics_ccb = ["sports", "politics", "music"]
+
+    def my_ccb_simulation(n=10000, swap_after=5000, variance=0, bad_features=0, seed=0):
+        random.seed(seed)
+        np.random.seed(seed)
+
+        envs = [[[0.8, 0.4], [0.2, 0.4]]]
+        offset = 0
+
+        animals = [
+            "cat",
+            "dog",
+            "bird",
+            "fish",
+            "horse",
+            "cow",
+            "pig",
+            "sheep",
+            "goat",
+            "chicken",
+        ]
+        colors = [
+            "red",
+            "green",
+            "blue",
+            "yellow",
+            "orange",
+            "purple",
+            "black",
+            "white",
+            "brown",
+            "gray",
+        ]
+
+        for i in range(1, n):
+            person = random.randint(0, 1)
+            chosen = [int(i) for i in np.random.permutation(2)]
+            rewards = [envs[offset][person][chosen[0]], envs[offset][person][chosen[1]]]
+
+            for i in range(len(rewards)):
+                rewards[i] += np.random.normal(0.5, variance)
+
+            current = {
+                "c": {
+                    "shared": {"name": people_ccb[person]},
+                    "_multi": [{"a": {"topic": topics_ccb[i]}} for i in range(2)],
+                    "_slots": [{"_id": i} for i in range(2)],
+                },
+                "_outcomes": [
+                    {
+                        "_label_cost": -min(rewards[i], 1),
+                        "_a": chosen[i:],
+                        "_p": [1.0 / (2 - i)] * (2 - i),
+                    }
+                    for i in range(2)
+                ],
+            }
+
+            current["c"]["shared"][random.choice(animals)] = random.random()
+            current["c"]["shared"][random.choice(animals)] = random.random()
+
+            current["c"]["_multi"][random.choice(range(len(current["c"]["_multi"])))][
+                random.choice(colors)
+            ] = random.random()
+            current["c"]["_multi"][random.choice(range(len(current["c"]["_multi"])))][
+                random.choice(colors)
+            ] = random.random()
+
+            if random.random() < 0.50:
+                current["c"]["_multi"].append(
+                    {random.choice(colors): {random.choice(animals): 0.6666}}
+                )
+
+            current["c"]["_slots"][random.choice(range(len(current["c"]["_slots"])))][
+                random.choice(colors)
+            ] = random.random()
+            current["c"]["_slots"][random.choice(range(len(current["c"]["_slots"])))][
+                random.choice(colors)
+            ] = random.random()
+
+            yield current
+
+    def save_examples(examples, path):
+        with open(path, "w") as f:
+            for ex in examples:
+                f.write(f'{json.dumps(ex, separators=(",", ":"))}\n')
+
+    input_file = "ccb.json"
+    cache_dir = ".cache"
+    save_examples(
+        my_ccb_simulation(n=10, variance=0.1, bad_features=1, seed=0), input_file
+    )
+
+    assert os.path.exists(input_file)
+
+    vw = Vw(cache_dir, handler=None)
+    q = vw.train(
+        input_file, "-b 18 -q :: --ccb_explore_adf --dsjson", ["--invert_hash"]
+    )
+    automl = vw.train(
+        input_file,
+        "-b 20 --ccb_explore_adf --log_output stderr --dsjson --automl 4 --oracle_type one_diff --verbose_metrics",
+        ["--invert_hash", "--extra_metrics"],
+    )
+
+    automl_metrics = json.load(open(automl.outputs["--extra_metrics"][0]))
+
+    # we need champ switches to be zero for this to work
+    assert "total_champ_switches" in automl_metrics
+    assert automl_metrics["total_champ_switches"] == 0
+
+    q_weights = q[0].model9("--invert_hash").weights.sort_index()
+    automl_weights = automl[0].model9("--invert_hash").weights.sort_index()
+    automl_champ_weights = automl_weights[~automl_weights.index.str.contains("\[")]
+
+    fts_names_q = set([n for n in q_weights.index])
+    fts_names_automl = set([n for n in automl_weights.index if "[" not in n])
+
+    assert len(fts_names_q) == len(fts_names_automl)
+    assert fts_names_q == fts_names_automl
+    # since there is no champ switch automl_champ should be q:: and be equal to the non-automl q:: instance
+    assert q_weights.equals(automl_champ_weights)
+
+    os.remove(input_file)
+    shutil.rmtree(cache_dir)
+    assert not os.path.exists(input_file)
+    assert not os.path.exists(cache_dir)
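
Note on the `_outcomes` entries built in `my_ccb_simulation`: for slot `i`, `_a` lists the actions still available (`chosen[i:]`) and `_p` assigns each of them the same probability, so each slot's probabilities sum to 1. The sketch below isolates that construction. The function name is hypothetical, and it generalizes the hard-coded two slots of the test to `len(chosen)` slots as an assumption.

```python
from typing import Dict, List


def build_outcomes(chosen: List[int], rewards: List[float]) -> List[Dict]:
    """Build one outcome per slot with uniform probabilities over remaining actions."""
    n = len(chosen)
    outcomes = []
    for i in range(n):
        remaining = n - i  # actions not yet placed in an earlier slot
        outcomes.append(
            {
                "_label_cost": -min(rewards[i], 1),  # reward capped at 1, negated to a cost
                "_a": chosen[i:],
                "_p": [1.0 / remaining] * remaining,
            }
        )
    return outcomes


if __name__ == "__main__":
    for outcome in build_outcomes(chosen=[1, 0], rewards=[0.4, 0.2]):
        print(outcome)
```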
diff --git a/python/vowpalwabbit/pyvw.py b/python/vowpalwabbit/pyvw.py
@@ -640,7 +640,7 @@ def learn(self, ec: Union["Example", List["Example"], str, List[str]]) -> None:
 
         elif isinstance(ec, list):
             if not self._is_multiline():
-                raise TypeError("Expecting a mutiline Learner.")
+                raise TypeError("Expecting a multiline learner.")
             if len(ec) == 0:
                 raise ValueError("An empty list is invalid")
             if isinstance(ec[0], str):

diff --git a/python/vowpalwabbit/sklearn.py b/python/vowpalwabbit/sklearn.py
@@ -738,12 +738,12 @@ class would be predicted.
             >>> model = VWMultiClassifier(oaa=3, loss_function='logistic')
             >>> _ = model.fit(X, y)
             >>> model.predict_proba(X)
-            array([[0.38928846, 0.30534211, 0.30536944],
-                   [0.40664235, 0.29666999, 0.29668769],
-                   [0.52324486, 0.23841164, 0.23834346],
-                   [0.5268591 , 0.23660533, 0.23653553],
-                   [0.65397811, 0.17312808, 0.17289382],
-                   [0.61190444, 0.19416356, 0.19393198]])
+            array([[0.38924146, 0.30537927, 0.30537927],
+                   [0.40661219, 0.29669389, 0.29669389],
+                   [0.52335149, 0.23832427, 0.23832427],
+                   [0.52696788, 0.23651604, 0.23651604],
+                   [0.65430814, 0.17284594, 0.17284594],
+                   [0.61224216, 0.19387889, 0.19387889]])
         """
         return VW.predict(self, X=X)
 

diff --git a/test/core.vwtest.json b/test/core.vwtest.json
@@ -5082,12 +5082,13 @@
   {
     "id": 394,
     "desc": "Test using automl with ccb",
-    "vw_command": "--ccb_explore_adf --example_queue_limit 7 -d train-sets/ccb_reuse_small.data --automl 3 --verbose_metrics --extra_metrics aml_ccb_metrics.json --oracle_type one_diff",
+    "vw_command": "--ccb_explore_adf --example_queue_limit 7 -d train-sets/ccb_automl.dsjson --automl 4 --dsjson --verbose_metrics --extra_metrics aml_ccb_metrics.json --oracle_type one_diff",
     "diff_files": {
+      "stderr": "train-sets/ref/automl_ccb.stderr",
       "aml_ccb_metrics.json": "test-sets/ref/aml_ccb_metrics.json"
     },
     "input_files": [
-      "train-sets/ccb_reuse_small.data"
+      "train-sets/ccb_automl.dsjson"
     ]
   },
   {
@@ -5884,5 +5885,16 @@
     "input_files": [
       "train-sets/automl_spin_off.txt"
     ]
+  },
+  {
+    "id": 455,
+    "desc": "large action spaces with cb_explore_adf epsilon greedy and ips",
+    "vw_command": "--cb_explore_adf -d train-sets/las_100_actions.txt --noconstant --large_action_space --cb_type ips",
+    "diff_files": {
+      "stderr": "train-sets/ref/las_egreedy_ips.stderr"    
+    },
+    "input_files": [
+      "train-sets/las_100_actions.txt"
+    ]
   }
 ]