From 80f92560a48f143e0b5c8c96e30671044fa2034d Mon Sep 17 00:00:00 2001 From: olgavrou Date: Tue, 30 May 2023 18:41:34 -0400 Subject: [PATCH 1/2] feat: WASM for both node and browser and 0.0.5 version bump --- wasm/README.md | 28 +- wasm/documentation.md | 786 +++++++++++++++++++++++++++++++++++++----- wasm/package.json | 26 +- wasm/src/vw.ts | 179 ++-------- wasm/src/vwbrowser.ts | 65 ++++ wasm/src/vwnode.ts | 216 ++++++++++++ 6 files changed, 1046 insertions(+), 254 deletions(-) create mode 100644 wasm/src/vwbrowser.ts create mode 100644 wasm/src/vwnode.ts diff --git a/wasm/README.md b/wasm/README.md index 12d84fbb03a..fe1b7aae46e 100644 --- a/wasm/README.md +++ b/wasm/README.md @@ -6,19 +6,26 @@ Javascript bindings for [VowpalWabbit](https://vowpalwabbit.org/) |----------|----------|----------| | 0.0.3 | 9.8.0 | wasm_v0.0.3 | | 0.0.4 | 9.8.0 | wasm_v0.0.4 | +| 0.0.5 | 9.8.0 | wasm_v0.0.5 | ## Documentation -[API documentation](https://github.com/VowpalWabbit/vowpal_wabbit/blob/wasm_v0.0.4/wasm/documentation.md) +[API documentation](https://github.com/VowpalWabbit/vowpal_wabbit/blob/wasm_v0.0.5/wasm/documentation.md) ## Examples and How-To +`@vowpalwabbit/vowpalwabbit` can be used both in nodejs and in ES6 environments. + ### How-To include the dependency and initialize a Contextual Bandit ADF model -Full API reference [here](https://github.com/VowpalWabbit/vowpal_wabbit/blob/wasm_v0.0.4/wasm/documentation.md#CbWorkspace) +Full API reference [here](https://github.com/VowpalWabbit/vowpal_wabbit/blob/wasm_v0.0.5/wasm/documentation.md#CbWorkspace) Require returns a promise because we need to wait for the WASM module to be initialized before including and using the VowpalWabbit JS code +A VW model needs to be deleted after we are done with its usage to return the aquired memory back to the WASM runtime. 
+ +#### NodeJS environments + ```(js) const vwPromise = require('@vowpalwabbit/vowpalwabbit'); @@ -28,7 +35,18 @@ vwPromise.then((vw) => { }); ``` -A VW model needs to be deleted after we are done with its usage to return the aquired memory back to the WASM runtime. +#### ES6 environments + +```(js) +import { vwPromise } from '@vowpalwabbit/vowpalwabbit'; + +let vwModule = await vwPromise; + +let model = new vwModule.CbWorkspace({ args_str: "--cb_explore_adf" }); +model.delete() +``` + +The rest of the examples are shown with the nodejs `require` but the rest of the API usage is identical for both environments. ### How-To call learn and predict on a Contextual Bandit model @@ -182,7 +200,7 @@ There is also the option of stringifying an example for user-handled logging: let cbAsString = CBExampleToString(example); ``` -Synchronous logging options are also available [see API documentation](https://github.com/VowpalWabbit/vowpal_wabbit/blob/wasm_v0.0.4/wasm/documentation.md#VWExampleLogger) +Synchronous logging options are also available [see API documentation](https://github.com/VowpalWabbit/vowpal_wabbit/blob/wasm_v0.0.5/wasm/documentation.md#VWExampleLogger) ### How-To train a model with data from a file @@ -231,7 +249,7 @@ catch (e) ### How-To use a generic VW model (non Contextual Bandit specific functionality) -Full API reference [here](https://github.com/VowpalWabbit/vowpal_wabbit/blob/wasm_v0.0.4/wasm/documentation.md#Workspace) +Full API reference [here](https://github.com/VowpalWabbit/vowpal_wabbit/blob/wasm_v0.0.5/wasm/documentation.md#Workspace) #### Simple regression example diff --git a/wasm/documentation.md b/wasm/documentation.md index 6a6d22f83f1..fa854ceb187 100644 --- a/wasm/documentation.md +++ b/wasm/documentation.md @@ -1,143 +1,326 @@ ## Classes
+
CbWorkspace ⇐ vw.CbWorkspace
+

ES6 wrapper around the Vowpal Wabbit C++ library.

+
+
Workspace ⇐ vw.Workspace
+

ES6 wrapper around the Vowpal Wabbit C++ library.

+
VWExampleLogger
-

A class that helps facilitate the stringification of Vowpal Wabbit examples, and the logging of Vowpal Wabbit examples to a file.

+

A class that helps facilitate the stringification of Vowpal Wabbit examples, and the logging of Vowpal Wabbit examples to a file. +Currently available for use in nodejs environments only.

-
Workspace ⇐ WorkspaceBase
-

A Wrapper around the Wowpal Wabbit C++ library.

+
CbWorkspace ⇐ vw.CbWorkspace
+

Nodejs wrapper around the Vowpal Wabbit C++ library.

-
CbWorkspace ⇐ WorkspaceBase
-

A Wrapper around the Wowpal Wabbit C++ library for Contextual Bandit exploration algorithms.

+
Workspace ⇐ vw.Workspace
+

Nodejs wrapper around the Vowpal Wabbit C++ library.

- + -## VWExampleLogger -A class that helps facilitate the stringification of Vowpal Wabbit examples, and the logging of Vowpal Wabbit examples to a file. +## CbWorkspace ⇐ vw.CbWorkspace +ES6 wrapper around the Vowpal Wabbit C++ library. **Kind**: global class +**Extends**: vw.CbWorkspace -* [VWExampleLogger](#VWExampleLogger) - * [.startLogStream(log_file)](#VWExampleLogger+startLogStream) - * [.logLineToStream(line)](#VWExampleLogger+logLineToStream) - * [.endLogStream()](#VWExampleLogger+endLogStream) - * [.logLineSync(log_file, line)](#VWExampleLogger+logLineSync) - * [.CBExampleToString(example)](#VWExampleLogger+CBExampleToString) ⇒ string - * [.logCBExampleToStream(example)](#VWExampleLogger+logCBExampleToStream) - * [.logCBExampleSync(log_file, example)](#VWExampleLogger+logCBExampleSync) +* [CbWorkspace](#CbWorkspace) ⇐ vw.CbWorkspace + * [new CbWorkspace([args_str], [model_array])](#new_CbWorkspace_new) + * [new CbWorkspace([args_str], [model_file], [model_array])](#new_CbWorkspace_new) + * [.predict(example)](#CbWorkspace+predict) ⇒ array + * [.learn(example)](#CbWorkspace+learn) + * [.addLine(line)](#CbWorkspace+addLine) + * [.learnFromString(example)](#CbWorkspace+learnFromString) + * [.samplePmf(pmf)](#CbWorkspace+samplePmf) ⇒ object + * [.samplePmfWithUUID(pmf, uuid)](#CbWorkspace+samplePmfWithUUID) ⇒ object + * [.predictAndSample(example)](#CbWorkspace+predictAndSample) ⇒ object + * [.predictAndSampleWithUUID(example)](#CbWorkspace+predictAndSampleWithUUID) ⇒ object + * [.predictionType()](#WorkspaceBase+predictionType) ⇒ + * [.sumLoss()](#WorkspaceBase+sumLoss) ⇒ number + * [.saveModelToFile(model_file)](#WorkspaceBase+saveModelToFile) + * [.getModelAsArray()](#WorkspaceBase+getModelAsArray) ⇒ Uint8Array + * [.loadModelFromFile(model_file)](#WorkspaceBase+loadModelFromFile) + * [.loadModelFromArray(model_array_ptr, model_array_len)](#WorkspaceBase+loadModelFromArray) + * [.delete()](#WorkspaceBase+delete) - + -### 
vwExampleLogger.startLogStream(log_file) -Starts a log stream to the specified file. Any new logs will be appended to the file. +### new CbWorkspace([args_str], [model_array]) +Creates a new Vowpal Wabbit workspace for Contextual Bandit exploration algorithms. +Can accept either or both string arguments and a model array. -**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) **Throws**: -- Error Throws an error if another logging stream has already been started +- Error Throws an error if: +- no argument is provided +- both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash | Param | Type | Description | | --- | --- | --- | -| log_file | string | the path to the file where the log will be appended to | +| [args_str] | string | The arguments that are used to initialize Vowpal Wabbit (optional) | +| [model_array] | tuple | The pre-loaded model's array pointer and length (optional). The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | - + -### vwExampleLogger.logLineToStream(line) -Takes a string and appends it to the log file. Line is logged in an asynchronous manner. +### new CbWorkspace([args_str], [model_file], [model_array]) +Creates a new Vowpal Wabbit workspace for Contextual Bandit exploration algorithms. +Can accept either or both string arguments and a model file. 
-**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) **Throws**: -- Error Throws an error if no logging stream has been started +- Error Throws an error if: +- no argument is provided +- both string arguments and a model file are provided, and the string arguments and arguments defined in the model clash +- both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash +- both a model file and a model array are provided | Param | Type | Description | | --- | --- | --- | -| line | string | the line to be appended to the log file | +| [args_str] | string | The arguments that are used to initialize Vowpal Wabbit (optional) | +| [model_file] | string | The path to the file where the model will be loaded from (optional) | +| [model_array] | tuple | The pre-loaded model's array pointer and length (optional). The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | - + -### vwExampleLogger.endLogStream() -Closes the logging stream. Logs a warning to the console if there is no logging stream active, but does not throw +### cbWorkspace.predict(example) ⇒ array +Takes a CB example and returns an array of (action, score) pairs, representing the probability mass function over the available actions +The returned pmf can be used with samplePmf to sample an action -**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) - +Example must have the following properties: +- text_context: a string representing the context -### vwExampleLogger.logLineSync(log_file, line) -Takes a string and appends it to the log file. Line is logged in a synchronous manner. -Every call to this function will open a new file handle, append the line and close the file handle. 
+**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Returns**: array - probability mass function, an array of action,score pairs that was returned by predict +**Throws**: -**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) +- VWError Throws an error if the example text_context is missing from the example + + +| Param | Type | Description | +| --- | --- | --- | +| example | object | the example object that will be used for prediction | + + + +### cbWorkspace.learn(example) +Takes a CB example and uses it to update the model + +Example must have the following properties: +- text_context: a string representing the context +- labels: an array of label objects (usually one), each label object must have the following properties: + - action: the action index + - cost: the cost of the action + - probability: the probability of the action + +A label object should have more than one labels only if a reduction that accepts multiple labels was used (e.g. graph_feedback) + +**Kind**: instance method of [CbWorkspace](#CbWorkspace) **Throws**: -- Error Throws an error if another logging stream has already been started +- VWError Throws an error if the example does not have the required properties to learn | Param | Type | Description | | --- | --- | --- | -| log_file | string | the path to the file where the log will be appended to | -| line | string | the line to be appended to the log file | +| example | object | the example object that will be used for prediction | - + -### vwExampleLogger.CBExampleToString(example) ⇒ string -Takes a CB example and returns the string representation of it +### cbWorkspace.addLine(line) +Accepts a CB example (in text format) line by line. Once a full CB example is passed in it will call learnFromString. +This is intended to be used with files that have CB examples, that were logged using logCBExampleToStream and are being read line by line. 
-**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) -**Returns**: string - the string representation of the CB example +**Kind**: instance method of [CbWorkspace](#CbWorkspace) + +| Param | Type | Description | +| --- | --- | --- | +| line | string | a string representing a line from a CB example in text Vowpal Wabbit format | + + + +### cbWorkspace.learnFromString(example) +Takes a full multiline CB example in text format and uses it to update the model. This is intended to be used with examples that are logged to a file using logCBExampleToStream. + +**Kind**: instance method of [CbWorkspace](#CbWorkspace) **Throws**: -- Error Throws an error if the example is malformed +- Error Throws an error if the example is an object with a label and/or a text_context | Param | Type | Description | | --- | --- | --- | -| example | object | a CB example that will be stringified | +| example | string | a string representing the CB example in text Vowpal Wabbit format | - + -### vwExampleLogger.logCBExampleToStream(example) -Takes a CB example, stringifies it by calling CBExampleToString, and appends it to the log file. Line is logged in an asynchronous manner. +### cbWorkspace.samplePmf(pmf) ⇒ object +Takes an exploration prediction (array of action, score pairs) and returns a single action and score, +along with a unique id that was used to seed the sampling and that can be used to track and reproduce the sampling. 
-**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Returns**: object - an object with the following properties: +- action: the action index that was sampled +- score: the score of the action that was sampled +- uuid: the uuid that was passed to the predict function **Throws**: -- Error Throws an error if no logging stream has been started +- VWError Throws an error if the input is not an array of action,score pairs | Param | Type | Description | | --- | --- | --- | -| example | object | a CB example that will be stringified and appended to the log file | +| pmf | array | probability mass function, an array of action,score pairs that was returned by predict | - + -### vwExampleLogger.logCBExampleSync(log_file, example) -Takes a CB example, stringifies it by calling CBExampleToString, and appends it to the log file. Example is logged in a synchronous manner. -Every call to this function will open a new file handle, append the line and close the file handle. +### cbWorkspace.samplePmfWithUUID(pmf, uuid) ⇒ object +Takes an exploration prediction (array of action, score pairs) and a unique id that is used to seed the sampling, +and returns a single action index and the corresponding score. 
-**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Returns**: object - an object with the following properties: +- action: the action index that was sampled +- score: the score of the action that was sampled +- uuid: the uuid that was passed to the predict function **Throws**: -- Error Throws an error if another logging stream has already been started +- VWError Throws an error if the input is not an array of action,score pairs | Param | Type | Description | | --- | --- | --- | -| log_file | string | the path to the file where the log will be appended to | -| example | object | a CB example that will be stringified and appended to the log file | +| pmf | array | probability mass function, an array of action,score pairs that was returned by predict | +| uuid | string | a unique id that can be used to seed the prediction | + + + +### cbWorkspace.predictAndSample(example) ⇒ object +Takes an example with a text_context field and calls predict. The prediction (a probability mass function over the available actions) +will then be sampled from, and only the chosen action index and the corresponding score will be returned, +along with a unique id that was used to seed the sampling and that can be used to track and reproduce the sampling. 
+ +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Returns**: object - an object with the following properties: +- action: the action index that was sampled +- score: the score of the action that was sampled +- uuid: the uuid that was passed to the predict function +**Throws**: + +- VWError if there is no text_context field in the example + + +| Param | Type | Description | +| --- | --- | --- | +| example | object | an example object containing the context to be used during prediction | + + + +### cbWorkspace.predictAndSampleWithUUID(example) ⇒ object +Takes an example with a text_context field and calls predict, and a unique id that is used to seed the sampling. +The prediction (a probability mass function over the available actions) will then be sampled from, and only the chosen action index +and the corresponding score will be returned, along with a unique id that was used to seed the sampling and that can be used to track and reproduce the sampling. + +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Returns**: object - an object with the following properties: +- action: the action index that was sampled +- score: the score of the action that was sampled +- uuid: the uuid that was passed to the predict function +**Throws**: + +- VWError if there is no text_context field in the example + + +| Param | Type | Description | +| --- | --- | --- | +| example | object | an example object containing the context to be used during prediction | + + + +### cbWorkspace.predictionType() ⇒ +Returns the enum value of the prediction type corresponding to the problem type of the model + +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Overrides**: [predictionType](#WorkspaceBase+predictionType) +**Returns**: enum value of prediction type + + +### cbWorkspace.sumLoss() ⇒ number +The current total sum of the progressive validation loss + +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Overrides**: [sumLoss](#WorkspaceBase+sumLoss) 
+**Returns**: number - the sum of all losses accumulated by the model + + +### cbWorkspace.saveModelToFile(model_file) +Takes a file location and stores the VW model in binary format in the file. +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Overrides**: [saveModelToFile](#WorkspaceBase+saveModelToFile) + +| Param | Type | Description | +| --- | --- | --- | +| model_file | string | the path to the file where the model will be saved | + + + +### cbWorkspace.getModelAsArray() ⇒ Uint8Array +Gets the VW model in binary format as a Uint8Array that can be saved to a file. +There is no need to delete or free the array returned by this function. +If the same array is however used to re-load the model into VW, then the array needs to be stored in wasm memory (see loadModelFromArray) + +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Overrides**: [getModelAsArray](#WorkspaceBase+getModelAsArray) +**Returns**: Uint8Array - the VW model in binary format + + +### cbWorkspace.loadModelFromFile(model_file) +Takes a file location and loads the VW model from the file. + +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Overrides**: [loadModelFromFile](#WorkspaceBase+loadModelFromFile) + +| Param | Type | Description | +| --- | --- | --- | +| model_file | string | the path to the file where the model will be loaded from | + + + +### cbWorkspace.loadModelFromArray(model_array_ptr, model_array_len) +Takes a model in an array binary format and loads it into the VW instance. +The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. 
+ +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Overrides**: [loadModelFromArray](#WorkspaceBase+loadModelFromArray) + +| Param | Type | Description | +| --- | --- | --- | +| model_array_ptr | number | the pre-loaded model's array pointer The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | +| model_array_len | number | the pre-loaded model's array length | + + + +### cbWorkspace.delete() +Deletes the underlying VW instance. This function should be called when the instance is no longer needed. + +**Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Overrides**: [delete](#WorkspaceBase+delete) -## Workspace ⇐ WorkspaceBase -A Wrapper around the Wowpal Wabbit C++ library. +## Workspace ⇐ vw.Workspace +ES6 wrapper around the Vowpal Wabbit C++ library. **Kind**: global class -**Extends**: WorkspaceBase +**Extends**: vw.Workspace -* [Workspace](#Workspace) ⇐ WorkspaceBase +* [Workspace](#Workspace) ⇐ vw.Workspace + * [new Workspace(readSync, writeSync, [args_str], [model_file], [model_array])](#new_Workspace_new) + * [new Workspace([args_str], [model_array])](#new_Workspace_new) * [new Workspace([args_str], [model_file], [model_array])](#new_Workspace_new) * [.parse(line)](#Workspace+parse) ⇒ * [.predict(example)](#Workspace+predict) ⇒ @@ -153,6 +336,47 @@ A Wrapper around the Wowpal Wabbit C++ library. +### new Workspace(readSync, writeSync, [args_str], [model_file], [model_array]) +Creates a new Vowpal Wabbit workspace. +Can accept either or both string arguments and a model file. 
+ +**Throws**: + +- Error Throws an error if: +- no argument is provided +- both string arguments and a model file are provided, and the string arguments and arguments defined in the model clash +- both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash +- both a model file and a model array are provided + + +| Param | Type | Description | +| --- | --- | --- | +| readSync | function | A function that reads a file synchronously and returns a buffer | +| writeSync | function | A function that writes a buffer to a file synchronously | +| [args_str] | string | The arguments that are used to initialize Vowpal Wabbit (optional) | +| [model_file] | string | The path to the file where the model will be loaded from (optional) | +| [model_array] | tuple | The pre-loaded model's array pointer and length (optional). The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | + + + +### new Workspace([args_str], [model_array]) +Creates a new Vowpal Wabbit workspace. +Can accept either or both string arguments and a model array. + +**Throws**: + +- Error Throws an error if: +- no argument is provided +- both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash + + +| Param | Type | Description | +| --- | --- | --- | +| [args_str] | string | The arguments that are used to initialize Vowpal Wabbit (optional) | +| [model_array] | tuple | The pre-loaded model's array pointer and length (optional). The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | + + + ### new Workspace([args_str], [model_file], [model_array]) Creates a new Vowpal Wabbit workspace. Can accept either or both string arguments and a model file. @@ -193,6 +417,10 @@ Calls vw predict on the example and returns the prediction. 
**Kind**: instance method of [Workspace](#Workspace) **Returns**: the prediction with a type corresponding to the reduction that was used +**Throws**: + +- VWError Throws an error if the example is not well defined + | Param | Type | Description | | --- | --- | --- | @@ -204,6 +432,10 @@ Calls vw predict on the example and returns the prediction. Calls vw learn on the example and updates the model **Kind**: instance method of [Workspace](#Workspace) +**Throws**: + +- VWError Throws an error if the example is not well defined + | Param | Type | Description | | --- | --- | --- | @@ -272,34 +504,154 @@ Takes a file location and loads the VW model from the file. -### workspace.loadModelFromArray(model_array_ptr, model_array_len) -Takes a model in an array binary format and loads it into the VW instance. -The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. +### workspace.loadModelFromArray(model_array_ptr, model_array_len) +Takes a model in an array binary format and loads it into the VW instance. +The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. + +**Kind**: instance method of [Workspace](#Workspace) +**Overrides**: [loadModelFromArray](#WorkspaceBase+loadModelFromArray) + +| Param | Type | Description | +| --- | --- | --- | +| model_array_ptr | number | the pre-loaded model's array pointer The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | +| model_array_len | number | the pre-loaded model's array length | + + + +### workspace.delete() +Deletes the underlying VW instance. This function should be called when the instance is no longer needed. 
+ +**Kind**: instance method of [Workspace](#Workspace) +**Overrides**: [delete](#WorkspaceBase+delete) + + +## VWExampleLogger +A class that helps facilitate the stringification of Vowpal Wabbit examples, and the logging of Vowpal Wabbit examples to a file. +Currently available for use in nodejs environments only. + +**Kind**: global class + +* [VWExampleLogger](#VWExampleLogger) + * [.startLogStream(log_file)](#VWExampleLogger+startLogStream) + * [.logLineToStream(line)](#VWExampleLogger+logLineToStream) + * [.endLogStream()](#VWExampleLogger+endLogStream) + * [.logLineSync(log_file, line)](#VWExampleLogger+logLineSync) + * [.CBExampleToString(example)](#VWExampleLogger+CBExampleToString) ⇒ string + * [.logCBExampleToStream(example)](#VWExampleLogger+logCBExampleToStream) + * [.logCBExampleSync(log_file, example)](#VWExampleLogger+logCBExampleSync) + + + +### vwExampleLogger.startLogStream(log_file) +Starts a log stream to the specified file. Any new logs will be appended to the file. + +**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) +**Throws**: + +- Error Throws an error if another logging stream has already been started + + +| Param | Type | Description | +| --- | --- | --- | +| log_file | string | the path to the file where the log will be appended to | + + + +### vwExampleLogger.logLineToStream(line) +Takes a string and appends it to the log file. Line is logged in an asynchronous manner. + +**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) +**Throws**: + +- Error Throws an error if no logging stream has been started + + +| Param | Type | Description | +| --- | --- | --- | +| line | string | the line to be appended to the log file | + + + +### vwExampleLogger.endLogStream() +Closes the logging stream. 
Logs a warning to the console if there is no logging stream active, but does not throw + +**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) + + +### vwExampleLogger.logLineSync(log_file, line) +Takes a string and appends it to the log file. Line is logged in a synchronous manner. +Every call to this function will open a new file handle, append the line and close the file handle. + +**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) +**Throws**: + +- Error Throws an error if another logging stream has already been started + + +| Param | Type | Description | +| --- | --- | --- | +| log_file | string | the path to the file where the log will be appended to | +| line | string | the line to be appended to the log file | + + + +### vwExampleLogger.CBExampleToString(example) ⇒ string +Takes a CB example and returns the string representation of it + +**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) +**Returns**: string - the string representation of the CB example +**Throws**: + +- Error Throws an error if the example is malformed + + +| Param | Type | Description | +| --- | --- | --- | +| example | object | a CB example that will be stringified | + + + +### vwExampleLogger.logCBExampleToStream(example) +Takes a CB example, stringifies it by calling CBExampleToString, and appends it to the log file. Line is logged in an asynchronous manner. + +**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) +**Throws**: + +- Error Throws an error if no logging stream has been started + + +| Param | Type | Description | +| --- | --- | --- | +| example | object | a CB example that will be stringified and appended to the log file | + + + +### vwExampleLogger.logCBExampleSync(log_file, example) +Takes a CB example, stringifies it by calling CBExampleToString, and appends it to the log file. Example is logged in a synchronous manner. +Every call to this function will open a new file handle, append the line and close the file handle. 
+ +**Kind**: instance method of [VWExampleLogger](#VWExampleLogger) +**Throws**: + +- Error Throws an error if another logging stream has already been started -**Kind**: instance method of [Workspace](#Workspace) -**Overrides**: [loadModelFromArray](#WorkspaceBase+loadModelFromArray) | Param | Type | Description | | --- | --- | --- | -| model_array_ptr | number | the pre-loaded model's array pointer The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | -| model_array_len | number | the pre-loaded model's array length | - - - -### workspace.delete() -Deletes the underlying VW instance. This function should be called when the instance is no longer needed. +| log_file | string | the path to the file where the log will be appended to | +| example | object | a CB example that will be stringified and appended to the log file | -**Kind**: instance method of [Workspace](#Workspace) -**Overrides**: [delete](#WorkspaceBase+delete) -## CbWorkspace ⇐ WorkspaceBase -A Wrapper around the Wowpal Wabbit C++ library for Contextual Bandit exploration algorithms. +## CbWorkspace ⇐ vw.CbWorkspace +Nodejs wrapper around the Vowpal Wabbit C++ library. 
**Kind**: global class -**Extends**: WorkspaceBase +**Extends**: vw.CbWorkspace -* [CbWorkspace](#CbWorkspace) ⇐ WorkspaceBase +* [CbWorkspace](#CbWorkspace) ⇐ vw.CbWorkspace + * [new CbWorkspace([args_str], [model_array])](#new_CbWorkspace_new) + * [new CbWorkspace([args_str], [model_file], [model_array])](#new_CbWorkspace_new) * [.predict(example)](#CbWorkspace+predict) ⇒ array * [.learn(example)](#CbWorkspace+learn) * [.addLine(line)](#CbWorkspace+addLine) @@ -316,6 +668,45 @@ A Wrapper around the Wowpal Wabbit C++ library for Contextual Bandit exploration * [.loadModelFromArray(model_array_ptr, model_array_len)](#WorkspaceBase+loadModelFromArray) * [.delete()](#WorkspaceBase+delete) + + +### new CbWorkspace([args_str], [model_array]) +Creates a new Vowpal Wabbit workspace for Contextual Bandit exploration algorithms. +Can accept either or both string arguments and a model array. + +**Throws**: + +- Error Throws an error if: +- no argument is provided +- both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash + + +| Param | Type | Description | +| --- | --- | --- | +| [args_str] | string | The arguments that are used to initialize Vowpal Wabbit (optional) | +| [model_array] | tuple | The pre-loaded model's array pointer and length (optional). The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | + + + +### new CbWorkspace([args_str], [model_file], [model_array]) +Creates a new Vowpal Wabbit workspace for Contextual Bandit exploration algorithms. +Can accept either or both string arguments and a model file. 
+ +**Throws**: + +- Error Throws an error if: +- no argument is provided +- both string arguments and a model file are provided, and the string arguments and arguments defined in the model clash +- both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash +- both a model file and a model array are provided + + +| Param | Type | Description | +| --- | --- | --- | +| [args_str] | string | The arguments that are used to initialize Vowpal Wabbit (optional) | +| [model_file] | string | The path to the file where the model will be loaded from (optional) | +| [model_array] | tuple | The pre-loaded model's array pointer and length (optional). The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | + ### cbWorkspace.predict(example) ⇒ array @@ -327,6 +718,10 @@ Example must have the following properties: **Kind**: instance method of [CbWorkspace](#CbWorkspace) **Returns**: array - probability mass function, an array of action,score pairs that was returned by predict +**Throws**: + +- VWError Throws an error if the example text_context is missing from the example + | Param | Type | Description | | --- | --- | --- | @@ -347,6 +742,10 @@ Example must have the following properties: A label object should have more than one labels only if a reduction that accepts multiple labels was used (e.g. 
graph_feedback) **Kind**: instance method of [CbWorkspace](#CbWorkspace) +**Throws**: + +- VWError Throws an error if the example does not have the required properties to learn + | Param | Type | Description | | --- | --- | --- | @@ -392,7 +791,7 @@ along with a unique id that was used to seed the sampling and that can be used t - uuid: the uuid that was passed to the predict function **Throws**: -- Error Throws an error if the input is not an array of action,score pairs +- VWError Throws an error if the input is not an array of action,score pairs | Param | Type | Description | @@ -412,7 +811,7 @@ and returns a single action index and the corresponding score. - uuid: the uuid that was passed to the predict function **Throws**: -- Error Throws an error if the input is not an array of action,score pairs +- VWError Throws an error if the input is not an array of action,score pairs | Param | Type | Description | @@ -434,7 +833,7 @@ along with a unique id that was used to seed the sampling and that can be used t - uuid: the uuid that was passed to the predict function **Throws**: -- Error if there is no text_context field in the example +- VWError if there is no text_context field in the example | Param | Type | Description | @@ -455,7 +854,7 @@ and the corresponding score will be returned, along with a unique id that was us - uuid: the uuid that was passed to the predict function **Throws**: -- Error if there is no text_context field in the example +- VWError if there is no text_context field in the example | Param | Type | Description | @@ -533,3 +932,216 @@ Deletes the underlying VW instance. This function should be called when the inst **Kind**: instance method of [CbWorkspace](#CbWorkspace) **Overrides**: [delete](#WorkspaceBase+delete) + + +## Workspace ⇐ vw.Workspace +Nodejs wrapper around the Vowpal Wabbit C++ library. 
+ +**Kind**: global class +**Extends**: vw.Workspace + +* [Workspace](#Workspace) ⇐ vw.Workspace + * [new Workspace(readSync, writeSync, [args_str], [model_file], [model_array])](#new_Workspace_new) + * [new Workspace([args_str], [model_array])](#new_Workspace_new) + * [new Workspace([args_str], [model_file], [model_array])](#new_Workspace_new) + * [.parse(line)](#Workspace+parse) ⇒ + * [.predict(example)](#Workspace+predict) ⇒ + * [.learn(example)](#Workspace+learn) + * [.finishExample(example)](#Workspace+finishExample) + * [.predictionType()](#WorkspaceBase+predictionType) ⇒ + * [.sumLoss()](#WorkspaceBase+sumLoss) ⇒ number + * [.saveModelToFile(model_file)](#WorkspaceBase+saveModelToFile) + * [.getModelAsArray()](#WorkspaceBase+getModelAsArray) ⇒ Uint8Array + * [.loadModelFromFile(model_file)](#WorkspaceBase+loadModelFromFile) + * [.loadModelFromArray(model_array_ptr, model_array_len)](#WorkspaceBase+loadModelFromArray) + * [.delete()](#WorkspaceBase+delete) + + + +### new Workspace(readSync, writeSync, [args_str], [model_file], [model_array]) +Creates a new Vowpal Wabbit workspace. +Can accept either or both string arguments and a model file. 
+ +**Throws**: + +- Error Throws an error if: +- no argument is provided +- both string arguments and a model file are provided, and the string arguments and arguments defined in the model clash +- both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash +- both a model file and a model array are provided + + +| Param | Type | Description | +| --- | --- | --- | +| readSync | function | A function that reads a file synchronously and returns a buffer | +| writeSync | function | A function that writes a buffer to a file synchronously | +| [args_str] | string | The arguments that are used to initialize Vowpal Wabbit (optional) | +| [model_file] | string | The path to the file where the model will be loaded from (optional) | +| [model_array] | tuple | The pre-loaded model's array pointer and length (optional). The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | + + + +### new Workspace([args_str], [model_array]) +Creates a new Vowpal Wabbit workspace. +Can accept either or both string arguments and a model array. + +**Throws**: + +- Error Throws an error if: +- no argument is provided +- both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash + + +| Param | Type | Description | +| --- | --- | --- | +| [args_str] | string | The arguments that are used to initialize Vowpal Wabbit (optional) | +| [model_array] | tuple | The pre-loaded model's array pointer and length (optional). The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | + + + +### new Workspace([args_str], [model_file], [model_array]) +Creates a new Vowpal Wabbit workspace. +Can accept either or both string arguments and a model file. 
+ +**Throws**: + +- Error Throws an error if: +- no argument is provided +- both string arguments and a model file are provided, and the string arguments and arguments defined in the model clash +- both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash +- both a model file and a model array are provided + + +| Param | Type | Description | +| --- | --- | --- | +| [args_str] | string | The arguments that are used to initialize Vowpal Wabbit (optional) | +| [model_file] | string | The path to the file where the model will be loaded from (optional) | +| [model_array] | tuple | The pre-loaded model's array pointer and length (optional). The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | + + + +### workspace.parse(line) ⇒ +Parse a line of text into a VW example. +The example can then be used for prediction or learning. +finishExample() must be called and then delete() on the example, when it is no longer needed. + +**Kind**: instance method of [Workspace](#Workspace) +**Returns**: a parsed vw example that can be used for prediction or learning + +| Param | Type | +| --- | --- | +| line | string | + + + +### workspace.predict(example) ⇒ +Calls vw predict on the example and returns the prediction. 
+ +**Kind**: instance method of [Workspace](#Workspace) +**Returns**: the prediction with a type corresponding to the reduction that was used +**Throws**: + +- VWError Throws an error if the example is not well defined + + +| Param | Type | Description | +| --- | --- | --- | +| example | object | returned from parse() | + + + +### workspace.learn(example) +Calls vw learn on the example and updates the model + +**Kind**: instance method of [Workspace](#Workspace) +**Throws**: + +- VWError Throws an error if the example is not well defined + + +| Param | Type | Description | +| --- | --- | --- | +| example | object | returned from parse() | + + + +### workspace.finishExample(example) +Cleans the example and returns it to the pool of available examples. delete() must also be called on the example object + +**Kind**: instance method of [Workspace](#Workspace) + +| Param | Type | Description | +| --- | --- | --- | +| example | object | returned from parse() | + + + +### workspace.predictionType() ⇒ +Returns the enum value of the prediction type corresponding to the problem type of the model + +**Kind**: instance method of [Workspace](#Workspace) +**Overrides**: [predictionType](#WorkspaceBase+predictionType) +**Returns**: enum value of prediction type + + +### workspace.sumLoss() ⇒ number +The current total sum of the progressive validation loss + +**Kind**: instance method of [Workspace](#Workspace) +**Overrides**: [sumLoss](#WorkspaceBase+sumLoss) +**Returns**: number - the sum of all losses accumulated by the model + + +### workspace.saveModelToFile(model_file) +Takes a file location and stores the VW model in binary format in the file. 
+ +**Kind**: instance method of [Workspace](#Workspace) +**Overrides**: [saveModelToFile](#WorkspaceBase+saveModelToFile) + +| Param | Type | Description | +| --- | --- | --- | +| model_file | string | the path to the file where the model will be saved | + + + +### workspace.getModelAsArray() ⇒ Uint8Array +Gets the VW model in binary format as a Uint8Array that can be saved to a file. +There is no need to delete or free the array returned by this function. +If the same array is however used to re-load the model into VW, then the array needs to be stored in wasm memory (see loadModelFromArray) + +**Kind**: instance method of [Workspace](#Workspace) +**Overrides**: [getModelAsArray](#WorkspaceBase+getModelAsArray) +**Returns**: Uint8Array - the VW model in binary format + + +### workspace.loadModelFromFile(model_file) +Takes a file location and loads the VW model from the file. + +**Kind**: instance method of [Workspace](#Workspace) +**Overrides**: [loadModelFromFile](#WorkspaceBase+loadModelFromFile) + +| Param | Type | Description | +| --- | --- | --- | +| model_file | string | the path to the file where the model will be loaded from | + + + +### workspace.loadModelFromArray(model_array_ptr, model_array_len) +Takes a model in an array binary format and loads it into the VW instance. +The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. + +**Kind**: instance method of [Workspace](#Workspace) +**Overrides**: [loadModelFromArray](#WorkspaceBase+loadModelFromArray) + +| Param | Type | Description | +| --- | --- | --- | +| model_array_ptr | number | the pre-loaded model's array pointer The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. | +| model_array_len | number | the pre-loaded model's array length | + + + +### workspace.delete() +Deletes the underlying VW instance. 
This function should be called when the instance is no longer needed. + +**Kind**: instance method of [Workspace](#Workspace) +**Overrides**: [delete](#WorkspaceBase+delete) diff --git a/wasm/package.json b/wasm/package.json index cc3620ae730..de3b497ba79 100644 --- a/wasm/package.json +++ b/wasm/package.json @@ -1,11 +1,24 @@ { "name": "@vowpalwabbit/vowpalwabbit", - "version": "0.0.4", + "version": "0.0.5", "description": "wasm bindings for vowpal wabbit", "exports": { - "require": "./dist/vw.js" + ".": { + "node": { + "require": "./dist/vwnode.js" + }, + "browser": { + "import": "./dist/vwbrowser.js" + }, + "default": "./dist/vwnode.js" + }, + "./package.json": "./package.json" + }, + "main": "dist/vwnode.js", + "module": "./dist/vwnode.js", + "browser": { + "./dist/vwbrowser.js": "./dist/vwbrowser.js" }, - "main": "dist/vw.js", "files": [ "dist/", "src/**/*.ts", @@ -21,10 +34,11 @@ "prepublish": "npm run build", "build": "tsc", "test": "node --experimental-wasm-threads ./node_modules/mocha/bin/mocha --delay", - "docs": "jsdoc2md ./dist/vw.js > documentation.md" + "docs": "jsdoc2md ./dist/vw*.js > documentation.md" }, "dependencies": { - "out": "^1.1.0" + "out": "^1.1.0", + "uuid": "^9.0.0" }, "repository": { "type": "git", @@ -42,4 +56,4 @@ ], "author": "olgavrou", "license": "BSD-3-Clause" -} +} \ No newline at end of file diff --git a/wasm/src/vw.ts b/wasm/src/vw.ts index 0bb8fa62e39..0ba399d516d 100644 --- a/wasm/src/vw.ts +++ b/wasm/src/vw.ts @@ -1,6 +1,4 @@ -import fs from 'fs'; -import crypto from 'crypto'; - +const { v4: uuidv4 } = require('uuid'); const VWWasmModule = require('./vw-wasm.js'); // internals @@ -21,154 +19,15 @@ class VWError extends Error { } } -/** - * A class that helps facilitate the stringification of Vowpal Wabbit examples, and the logging of Vowpal Wabbit examples to a file. 
- * @class - */ -class VWExampleLogger { - _outputLogStream: fs.WriteStream | null; - _log_file: string | null; - - constructor() { - this._outputLogStream = null; - this._log_file = null; - } - - /** - * - * Starts a log stream to the specified file. Any new logs will be appended to the file. - * - * @param {string} log_file the path to the file where the log will be appended to - * @throws {Error} Throws an error if another logging stream has already been started - */ - startLogStream(log_file: string) { - if (this._outputLogStream !== null) { - throw new Error("Can not start log stream, another log stream is currently active. Call endLogStream first if you want to change the log file. Current log file: " + this._log_file); - } - else { - this._log_file = log_file; - this._outputLogStream = fs.createWriteStream(log_file, { flags: 'a' }); - } - } - - /** - * Takes a string and appends it to the log file. Line is logged in an asynchronous manner. - * - * @param {string} line the line to be appended to the log file - * @throws {Error} Throws an error if no logging stream has been started - */ - logLineToStream(line: string) { - if (this._outputLogStream !== null) { - this._outputLogStream.write(line); - } - else { - throw new Error("Can not log line, log file is not specified. Call startLogStream first."); - } - } - - /** - * Closes the logging stream. Logs a warning to the console if there is no logging stream active, but does not throw - */ - endLogStream() { - if (this._outputLogStream !== null) { - this._outputLogStream.end(); - this._outputLogStream = null; - this._log_file = null; - } - else { - console.warn("Can not close log, log file is not specified"); - } - } - - /** - * - * Takes a string and appends it to the log file. Line is logged in a synchronous manner. - * Every call to this function will open a new file handle, append the line and close the file handle. 
- * - * @param {string} log_file the path to the file where the log will be appended to - * @param {string} line the line to be appended to the log file - * @throws {Error} Throws an error if another logging stream has already been started - */ - logLineSync(log_file: string, line: string) { - if (this._outputLogStream !== null && this._log_file === log_file) { - throw new Error("Can not call logLineSync on log file while the same file has an async log writer active. Call endLogStream first. Log file: " + log_file); - } - fs.appendFileSync(log_file, line); - } - - /** - * - * Takes a CB example and returns the string representation of it - * - * @param {object} example a CB example that will be stringified - * @returns {string} the string representation of the CB example - * @throws {Error} Throws an error if the example is malformed - */ - CBExampleToString(example: { text_context: string, labels: Array<{ action: number, cost: number, probability: number }> }): string { - let context = "" - if (example.hasOwnProperty('text_context')) { - context = example.text_context; - } - else { - throw new Error("Can not log example, there is no context available"); - } - - const lines = context.trim().split("\n").map((substr) => substr.trim()); - lines.push(""); - lines.push(""); - - if (example.hasOwnProperty("labels") && example["labels"].length > 0) { - let indexOffset = 0; - if (context.includes("shared")) { - indexOffset = 1; - } - - for (let i = 0; i < example["labels"].length; i++) { - let label = example["labels"][i]; - if (label.action + indexOffset >= lines.length) { - throw new Error("action index out of bounds: " + label.action); - } - - lines[label.action + indexOffset] = label.action + ":" + label.cost + ":" + label.probability + " " + lines[label.action + indexOffset] - } - } - return lines.join("\n"); - } - - /** - * - * Takes a CB example, stringifies it by calling CBExampleToString, and appends it to the log file. Line is logged in an asynchronous manner. 
- * - * @param {object} example a CB example that will be stringified and appended to the log file - * @throws {Error} Throws an error if no logging stream has been started - */ - logCBExampleToStream(example: { text_context: string, labels: Array<{ action: number, cost: number, probability: number }> }) { - let ex_str = this.CBExampleToString(example); - this.logLineToStream(ex_str); - } - - /** - * - * Takes a CB example, stringifies it by calling CBExampleToString, and appends it to the log file. Example is logged in a synchronous manner. - * Every call to this function will open a new file handle, append the line and close the file handle. - * - * @param {string} log_file the path to the file where the log will be appended to - * @param {object} example a CB example that will be stringified and appended to the log file - * @throws {Error} Throws an error if another logging stream has already been started - */ - logCBExampleSync(log_file: string, example: { text_context: string, labels: Array<{ action: number, cost: number, probability: number }> }) { - let ex_str = this.CBExampleToString(example); - this.logLineSync(log_file, ex_str); - } -}; - -module.exports = new Promise((resolve) => { +export default new Promise((resolve) => { VWWasmModule().then((moduleInstance: any) => { class WorkspaceBase { _args_str: string | undefined; _instance: any; + _readSync: Function; + _writeSync: Function; - constructor(type: string, { args_str, model_file, model_array }: + constructor(type: string, readSync: Function, writeSync: Function, { args_str, model_file, model_array }: { args_str?: string, model_file?: string, model_array?: [number | undefined, number | undefined] } = {}) { let vwModelConstructor = null; @@ -181,6 +40,9 @@ module.exports = new Promise((resolve) => { throw new Error("Unknown model type"); } + this._readSync = readSync; + this._writeSync = writeSync; + let model_array_ptr: number | undefined = undefined; let model_array_len: number | undefined = 
undefined; if (model_array !== undefined) { @@ -203,7 +65,7 @@ module.exports = new Promise((resolve) => { } if (model_file !== undefined) { - let modelBuffer = fs.readFileSync(model_file); + let modelBuffer = readSync(model_file); let ptr = moduleInstance._malloc(modelBuffer.byteLength); let heapBytes = new Uint8Array(moduleInstance.HEAPU8.buffer, ptr, modelBuffer.byteLength); heapBytes.set(new Uint8Array(modelBuffer)); @@ -254,7 +116,7 @@ module.exports = new Promise((resolve) => { uint8Array[i] = char_vector.get(i); } - fs.writeFileSync(model_file, Buffer.from(uint8Array)); + this._writeSync(model_file, Buffer.from(uint8Array)); char_vector.delete(); } @@ -286,7 +148,7 @@ module.exports = new Promise((resolve) => { * @param {string} model_file the path to the file where the model will be loaded from */ loadModelFromFile(model_file: string) { - let modelBuffer = fs.readFileSync(model_file); + let modelBuffer = this._readSync(model_file); let ptr = moduleInstance._malloc(modelBuffer.byteLength); let heapBytes = new Uint8Array(moduleInstance.HEAPU8.buffer, ptr, modelBuffer.byteLength); heapBytes.set(new Uint8Array(modelBuffer)); @@ -317,6 +179,7 @@ module.exports = new Promise((resolve) => { /** * A Wrapper around the Wowpal Wabbit C++ library. * @class + * @private * @extends WorkspaceBase */ class Workspace extends WorkspaceBase { @@ -325,6 +188,8 @@ module.exports = new Promise((resolve) => { * Can accept either or both string arguments and a model file. * * @constructor + * @param {Function} readSync - A function that reads a file synchronously and returns a buffer + * @param {Function} writeSync - A function that writes a buffer to a file synchronously * @param {string} [args_str] - The arguments that are used to initialize Vowpal Wabbit (optional) * @param {string} [model_file] - The path to the file where the model will be loaded from (optional) * @param {tuple} [model_array] - The pre-loaded model's array pointer and length (optional). 
@@ -335,9 +200,9 @@ module.exports = new Promise((resolve) => { * - both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash * - both a model file and a model array are provided */ - constructor({ args_str, model_file, model_array }: + constructor(readSync: Function, writeSync: Function, { args_str, model_file, model_array }: { args_str?: string, model_file?: string, model_array?: [number | undefined, number | undefined] } = {}) { - super(ProblemType.All, { args_str, model_file, model_array }); + super(ProblemType.All, readSync, writeSync, { args_str, model_file, model_array }); } /** @@ -394,6 +259,7 @@ module.exports = new Promise((resolve) => { /** * A Wrapper around the Wowpal Wabbit C++ library for Contextual Bandit exploration algorithms. * @class + * @private * @extends WorkspaceBase */ class CbWorkspace extends WorkspaceBase { @@ -402,6 +268,8 @@ module.exports = new Promise((resolve) => { * Can accept either or both string arguments and a model file. * * @constructor + * @param {Function} readSync - A function that reads a file synchronously and returns a buffer + * @param {Function} writeSync - A function that writes a buffer to a file synchronously * @param {string} [args_str] - The arguments that are used to initialize Vowpal Wabbit (optional) * @param {string} [model_file] - The path to the file where the model will be loaded from (optional) * @param {tuple} [model_array] - The pre-loaded model's array pointer and length (optional). 
@@ -414,9 +282,9 @@ module.exports = new Promise((resolve) => { */ _ex: string; - constructor({ args_str, model_file, model_array }: + constructor(readSync: Function, writeSync: Function, { args_str, model_file, model_array }: { args_str?: string, model_file?: string, model_array?: [number | undefined, number | undefined] } = {}) { - super(ProblemType.CB, { args_str, model_file, model_array }); + super(ProblemType.CB, readSync, writeSync, { args_str, model_file, model_array }); this._ex = ""; } @@ -512,7 +380,7 @@ module.exports = new Promise((resolve) => { * @throws {VWError} Throws an error if the input is not an array of action,score pairs */ samplePmf(pmf: Array): object { - let uuid = crypto.randomUUID(); + let uuid = uuidv4(); try { let ret = this._instance._samplePmf(pmf, uuid); ret["uuid"] = uuid; @@ -561,7 +429,7 @@ module.exports = new Promise((resolve) => { */ predictAndSample(example: object): object { try { - let uuid = crypto.randomUUID(); + let uuid = uuidv4(); let ret = this._instance._predictAndSample(example, uuid); ret["uuid"] = uuid; return ret; @@ -620,7 +488,6 @@ module.exports = new Promise((resolve) => { Workspace: Workspace, CbWorkspace: CbWorkspace, Prediction: Prediction, - VWExampleLogger: VWExampleLogger, getExceptionMessage: getExceptionMessage, wasmModule: moduleInstance } diff --git a/wasm/src/vwbrowser.ts b/wasm/src/vwbrowser.ts new file mode 100644 index 00000000000..05b0e29c31c --- /dev/null +++ b/wasm/src/vwbrowser.ts @@ -0,0 +1,65 @@ +import vwModulePromise from './vw.js'; + +function readSync(model_name: string) { + console.error("readSync was called in the browser, this behaviour is not supported, do not construct a model with a model file in the browser"); + return []; +} + +function writeSync(model_name: string) { + console.error("writeSync was called in the browser, this behaviour is not supported, do not construct a model with a model file in the browser"); +} + +export const vwPromise = new Promise((resolve) => { + 
vwModulePromise.then((vw: any) => { + /** + * ES6 wrapper around the Vowpal Wabbit C++ library. + * @class + * @extends vw.CbWorkspace + */ + class CbWorkspace extends vw.CbWorkspace { + /** + * Creates a new Vowpal Wabbit workspace for Contextual Bandit exploration algorithms. + * Can accept either or both string arguments and a model array. + * + * @constructor + * @param {string} [args_str] - The arguments that are used to initialize Vowpal Wabbit (optional) + * @param {tuple} [model_array] - The pre-loaded model's array pointer and length (optional). + * The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. + * @throws {Error} Throws an error if: + * - no argument is provided + * - both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash + */ + + constructor({ args_str, model_array }: + { args_str?: string, model_array?: [number | undefined, number | undefined] } = {}) { + super(readSync, writeSync, { args_str, model_array }); + } + }; + + /** + * ES6 wrapper around the Vowpal Wabbit C++ library. + * @class + * @extends vw.Workspace + */ + class Workspace extends vw.Workspace { + /** + * Creates a new Vowpal Wabbit workspace. + * Can accept either or both string arguments and a model array. + * + * @constructor + * @param {string} [args_str] - The arguments that are used to initialize Vowpal Wabbit (optional) + * @param {tuple} [model_array] - The pre-loaded model's array pointer and length (optional). + * The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. 
+ * @throws {Error} Throws an error if: + * - no argument is provided + * - both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash + */ + constructor({ args_str, model_file, model_array }: + { args_str?: string, model_file?: string, model_array?: [number | undefined, number | undefined] } = {}) { + super(readSync, writeSync, { args_str, model_file, model_array }); + } + }; + + resolve({ CbWorkspace: CbWorkspace, Workspace: Workspace, Prediction: vw.Prediction, getExceptionMessage: vw.getExceptionMessage, wasmModule: vw.wasmModule }); + }) +}); \ No newline at end of file diff --git a/wasm/src/vwnode.ts b/wasm/src/vwnode.ts new file mode 100644 index 00000000000..8f36cb065b6 --- /dev/null +++ b/wasm/src/vwnode.ts @@ -0,0 +1,216 @@ +import fs from 'fs'; + +const vwModule = require('./vw.js').default; + +/** + * A class that helps facilitate the stringification of Vowpal Wabbit examples, and the logging of Vowpal Wabbit examples to a file. + * Currently available for use in nodejs environments only. + * @class + */ +class VWExampleLogger { + _outputLogStream: fs.WriteStream | null; + _log_file: string | null; + + constructor() { + this._outputLogStream = null; + this._log_file = null; + } + + /** + * + * Starts a log stream to the specified file. Any new logs will be appended to the file. + * + * @param {string} log_file the path to the file where the log will be appended to + * @throws {Error} Throws an error if another logging stream has already been started + */ + startLogStream(log_file: string) { + if (this._outputLogStream !== null) { + throw new Error("Can not start log stream, another log stream is currently active. Call endLogStream first if you want to change the log file. Current log file: " + this._log_file); + } + else { + this._log_file = log_file; + this._outputLogStream = fs.createWriteStream(log_file, { flags: 'a' }); + } + } + + /** + * Takes a string and appends it to the log file. 
Line is logged in an asynchronous manner. + * + * @param {string} line the line to be appended to the log file + * @throws {Error} Throws an error if no logging stream has been started + */ + logLineToStream(line: string) { + if (this._outputLogStream !== null) { + this._outputLogStream.write(line); + } + else { + throw new Error("Can not log line, log file is not specified. Call startLogStream first."); + } + } + + /** + * Closes the logging stream. Logs a warning to the console if there is no logging stream active, but does not throw + */ + endLogStream() { + if (this._outputLogStream !== null) { + this._outputLogStream.end(); + this._outputLogStream = null; + this._log_file = null; + } + else { + console.warn("Can not close log, log file is not specified"); + } + } + + /** + * + * Takes a string and appends it to the log file. Line is logged in a synchronous manner. + * Every call to this function will open a new file handle, append the line and close the file handle. + * + * @param {string} log_file the path to the file where the log will be appended to + * @param {string} line the line to be appended to the log file + * @throws {Error} Throws an error if another logging stream has already been started + */ + logLineSync(log_file: string, line: string) { + if (this._outputLogStream !== null && this._log_file === log_file) { + throw new Error("Can not call logLineSync on log file while the same file has an async log writer active. Call endLogStream first. 
Log file: " + log_file); + } + fs.appendFileSync(log_file, line); + } + + /** + * + * Takes a CB example and returns the string representation of it + * + * @param {object} example a CB example that will be stringified + * @returns {string} the string representation of the CB example + * @throws {Error} Throws an error if the example is malformed + */ + CBExampleToString(example: { text_context: string, labels: Array<{ action: number, cost: number, probability: number }> }): string { + let context = "" + if (example.hasOwnProperty('text_context')) { + context = example.text_context; + } + else { + throw new Error("Can not log example, there is no context available"); + } + + const lines = context.trim().split("\n").map((substr) => substr.trim()); + lines.push(""); + lines.push(""); + + if (example.hasOwnProperty("labels") && example["labels"].length > 0) { + let indexOffset = 0; + if (context.includes("shared")) { + indexOffset = 1; + } + + for (let i = 0; i < example["labels"].length; i++) { + let label = example["labels"][i]; + if (label.action + indexOffset >= lines.length) { + throw new Error("action index out of bounds: " + label.action); + } + + lines[label.action + indexOffset] = label.action + ":" + label.cost + ":" + label.probability + " " + lines[label.action + indexOffset] + } + } + return lines.join("\n"); + } + + /** + * + * Takes a CB example, stringifies it by calling CBExampleToString, and appends it to the log file. Line is logged in an asynchronous manner. 
+ * + * @param {object} example a CB example that will be stringified and appended to the log file + * @throws {Error} Throws an error if no logging stream has been started + */ + logCBExampleToStream(example: { text_context: string, labels: Array<{ action: number, cost: number, probability: number }> }) { + let ex_str = this.CBExampleToString(example); + this.logLineToStream(ex_str); + } + + /** + * + * Takes a CB example, stringifies it by calling CBExampleToString, and appends it to the log file. Example is logged in a synchronous manner. + * Every call to this function will open a new file handle, append the line and close the file handle. + * + * @param {string} log_file the path to the file where the log will be appended to + * @param {object} example a CB example that will be stringified and appended to the log file + * @throws {Error} Throws an error if another logging stream has already been started + */ + logCBExampleSync(log_file: string, example: { text_context: string, labels: Array<{ action: number, cost: number, probability: number }> }) { + let ex_str = this.CBExampleToString(example); + this.logLineSync(log_file, ex_str); + } +}; + +module.exports = new Promise((resolve) => { + vwModule.then((vw: any) => { + + /** + * Nodejs wrapper around the Vowpal Wabbit C++ library. + * @class + * @extends vw.CbWorkspace + */ + class CbWorkspace extends vw.CbWorkspace { + /** + * Creates a new Vowpal Wabbit workspace for Contextual Bandit exploration algorithms. + * Can accept either or both string arguments and a model file. + * + * @constructor + * @param {string} [args_str] - The arguments that are used to initialize Vowpal Wabbit (optional) + * @param {string} [model_file] - The path to the file where the model will be loaded from (optional) + * @param {tuple} [model_array] - The pre-loaded model's array pointer and length (optional). 
+ * The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. + * @throws {Error} Throws an error if: + * - no argument is provided + * - both string arguments and a model file are provided, and the string arguments and arguments defined in the model clash + * - both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash + * - both a model file and a model array are provided + */ + + constructor({ args_str, model_file, model_array }: + { args_str?: string, model_file?: string, model_array?: [number | undefined, number | undefined] } = {}) { + super(fs.readFileSync, fs.writeFileSync, { args_str, model_file, model_array }); + } + }; + + /** + * Nodejs wrapper around the Vowpal Wabbit C++ library. + * @class + * @extends vw.Workspace + */ + class Workspace extends vw.Workspace { + /** + * Creates a new Vowpal Wabbit workspace. + * Can accept either or both string arguments and a model file. + * + * @constructor + * @param {string} [args_str] - The arguments that are used to initialize Vowpal Wabbit (optional) + * @param {string} [model_file] - The path to the file where the model will be loaded from (optional) + * @param {tuple} [model_array] - The pre-loaded model's array pointer and length (optional). + * The memory must be allocated via the WebAssembly module's _malloc function and should later be freed via the _free function. 
+             * @throws {Error} Throws an error if:
+             * - no argument is provided
+             * - both string arguments and a model file are provided, and the string arguments and arguments defined in the model clash
+             * - both string arguments and a model array are provided, and the string arguments and arguments defined in the model clash
+             * - both a model file and a model array are provided
+             */
+            constructor({ args_str, model_file, model_array }:
+                { args_str?: string, model_file?: string, model_array?: [number | undefined, number | undefined] } = {}) {
+                super(fs.readFileSync, fs.writeFileSync, { args_str, model_file, model_array });
+            }
+        };
+
+        resolve(
+            {
+                Workspace: Workspace,
+                CbWorkspace: CbWorkspace,
+                Prediction: vw.Prediction,
+                VWExampleLogger: VWExampleLogger,
+                getExceptionMessage: vw.getExceptionMessage,
+                wasmModule: vw.wasmModule,
+            }
+        )
+    })
+})
\ No newline at end of file

From b65753e264ff055c6f01ce0524cde7392f22be02 Mon Sep 17 00:00:00 2001
From: olgavrou
Date: Tue, 30 May 2023 18:43:44 -0400
Subject: [PATCH 2/2] update docs

---
 wasm/README.md           |  2 +-
 wasm/developer_readme.md | 11 ++++++-----
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/wasm/README.md b/wasm/README.md
index fe1b7aae46e..180949254fa 100644
--- a/wasm/README.md
+++ b/wasm/README.md
@@ -168,7 +168,7 @@ A model can be loaded from a file either during model construction (shown above)
 }
 ```
 
-### How-To log examples into a file or stringify examples for user-handled logging
+### How-To log examples into a file or stringify examples for user-handled logging (currently available for nodejs environments only)
 
 A log stream can be started which will create and use a `fs` write stream:
 
diff --git a/wasm/developer_readme.md b/wasm/developer_readme.md
index bb48d2ea880..f81ab1b7fbb 100644
--- a/wasm/developer_readme.md
+++ b/wasm/developer_readme.md
@@ -53,8 +53,9 @@ npm run docs
 ### Release on npm
 
 1. Update the version in package.json
-2. Change all version references in README.md (relative links will be broken until merged to master and the tag is cut)
-3. Update the table in README.md to point to latest VW version and tag
-4. Commit changes to master
-5. Tag the release as `wasm_v.major.minor.patch`
-6. Publish to npm `npm publish --access public` (you need to sign into your npm account first and have access to the vowpalwabbit organisation)
\ No newline at end of file
+2. Run `npm run docs` and check in the new `documentation.md` if it has changed
+3. Change all version references in README.md (relative links will be broken until merged to master and the tag is cut)
+4. Update the table in README.md to point to latest VW version and tag
+5. Commit changes to master
+6. Tag the release as `wasm_v.major.minor.patch`
+7. Publish to npm `npm publish --access public` (you need to sign into your npm account first and have access to the vowpalwabbit organisation)
\ No newline at end of file
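
The tag naming used in the release steps above can be sketched as a small helper. This is only an illustration: `releaseTag` is a hypothetical function that does not exist in the repository, and it assumes the tag is the package.json version prefixed with `wasm_v`, which matches the `0.0.5` / `wasm_v0.0.5` row in the README table.

```typescript
// Hypothetical helper, not part of the VW repository: derives the git tag
// for a release from the version string found in package.json.
function releaseTag(version: string): string {
  // Accept only plain major.minor.patch versions such as "0.0.5".
  if (!/^\d+\.\d+\.\d+$/.test(version)) {
    throw new Error(`expected major.minor.patch, got "${version}"`);
  }
  return `wasm_v${version}`;
}

console.log(releaseTag("0.0.5")); // prints "wasm_v0.0.5"
```

Rejecting anything that is not a plain three-part version keeps a typo in package.json from silently producing a tag that the README links would not resolve to.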