Syncing #1

cTxplorer · 2018-10-06T09:33:00Z

We are excited to review your PR.

So we can do the best job, please check:

There's a descriptive title that will make sense to other developers some time from now.
There are associated issues. All PRs should have issue(s) associated, unless the change is trivial and self-evident, such as fixing a typo. You can use the format Fixes #nnnn in your description to cause GitHub to automatically close the issue(s) when your PR is merged.
Your change description explains what the change does, why you chose your approach, and anything else that reviewers should know.
You have included any necessary tests in the same PR.

Fixes #982

* different Config files for train and test
* solves problem of long running time
* train benchmarks contain only one iteration (with no warmup iteration), as that better reflects how users will run training
* predict config is the original version

* Add analyzer to nuget
* Generalize target file to enable easier analyzer inclusions in future

* Transform wrappers and a reference implementation for tokenizers
* Added pigsty extensions
* Added pigsty test
* Fixed most important PR comments
* PR comments
* Converted all text transforms into transformers/estimators.
* Addressed reviewers' comments.
* Addressed reviewers' comments.
* Converted LdaTransform into Transformer/Estimator.
* Fixed LdaNative issue and addressed reviewers' comments.
* Fixed issue with test.
* Disabled end-to-end LdaTransform test due to inconsistency of outputs.

* Microsoft.ML.Data.StaticPipe to Microsoft.ML.StaticPipe.
* Columns now in StaticPipe as opposed to StaticPipe.Runtime.

…ssion) to estimator (#1002)

Fixes #997

* Stop loading assemblies in ComponentCatalog. Write the AssemblyName into the model, and use it to register the assembly during model load.
* Move ComponentCatalog from a static class to a member of IHostEnvironment.
* Update tests for ComponentCatalog refactoring.
* minor cleanup
* Add AssemblyName to all model VersionInfo instances. Also fix a couple more tests.
* Load and register all assemblies in the Maml directory. Ensure all loaded assemblies are registered in Experiment to maintain compatibility. Fix tests to not use ComponentCatalog but direct instantiation instead.
* Sync up with latest code.
* Fix newly added test
* Clean up some test changes.
* Fix up for latest code
* Add path filtering back to LoadAssembliesInDir
* Update TestAutoInference to use the correct Environment.
* Respond to PR feedback.
* Make all AutoInference tests use LocalEnvironment.

…ot frozen (#853)

* building transform from ground up
* dummy transform works after fixing the getters
* SavedModel format works for Train, but fails for Save&Predict
* remove dummy transform
* remove dummy unit test
* Works with non-frozen models
* building transform from ground up
* dummy transform works after fixing the getters
* SavedModel format works for Train, but fails for Save&Predict
* remove dummy transform
* remove dummy unit test
* fix compilation issues; verify existing tests work fine
* works locally; need to refactor code
* refactored code; keeping only 1 version of the convenience API
* added class for directory structure
* using latest nuget package (0.0.3) for Microsoft.ML.TensorFlow.TestModels
* delete temporary files used when loading/saving models
* delete local models; the updated nuget version (0.0.3) for Microsoft.ML.TensorFlow.TestModels contains these models
* modified logic for load/restore of models
* modified logic for load&restore of unfrozen models
* model version update to support non-frozen models
* based on the code review comments, we now infer if the provided model is frozen or not
* simplify the logic in Save() related to loading of SavedModel.
* trying Eric's suggestion
* revert back to previous changes
* attempt to use stream copy approach instead of in-memory
* deleting some commented out code
* Ensure we only copy the file segment & cleanup path logic
* added finalizer that closes the session (if it isn't closed) and deletes the temporary directory
* cleanup + misc review comments
* trying to create temp dir with proper ACLs for high-privilege users
* create temp dir with proper ACLs for high-privilege processes
* fix build after merge with latest master
* taking care of review comments related to model versioning of TFTransform
* remove IDisposable from the TensorFlowTransform; renaming some methods
* refactor code so we have only 1 constructor for the TensorFlowTransform (as suggested in review comment)
* fix issues with nuget packaging; refactored the code + added comments
* add checks in code to make sure that the input is not a variable length vector
* fix typo in name of package
* (1) added SavedModel test for MNIST model (2) added try/finally for deleting temp folder (3) deleted test using Legacy Learning API
* remove and sort usings in file TrainSaveModelAndPredict.cs
* using spaces in nupkgproj
* error checking for passed in IHostEnvironment
* fix TargetFramework version (netcore 2.0) of DnnAnalyzer to match that of Microsoft.ML.TensorFlow

…o estimators (#957)

* derived from trainerestimatorbase
* cleaned up
* sorting namespaces
* fixed review comments, still some more features to add
* pr comments on tests
* updated the tests
* fixed review comments
* fixing review comments
* fixed bugs
* refactored code
* fixed review comments and cleaned up
* cleaned up code and fixed documentation text
* fixed review comments
* fixed review comments
* fixed review comments
* fixed review comment
* fixed more review comments

adding PcaTrainer as estimator and tidying the PredictionTransformer class.

…tor (#1012)

…on (#1009)

* FastTree classification and regression extensions

…#991)

…ML.Scoring/Sonoma Library (#942)

Adds a benchmark test to measure performance of doing many single predictions with PredictionEngine. Closes #1013

)

* Remove the error tracing when assembly loading fails for Maml. Also adding our native assemblies to the list to skip, so they aren't attempted to be loaded. Fix #1034

Helps the user to relate to the macOS version faster.

…1074)

* Multiclass logistic Regression tests enabled
* threshold providing in tests
* defining tolerance as a constant in baseTestBaseline Class
* upper case camel for constant and _ for large decimal numbers

* Add a workaround for the tests hanging while loading MKL.

The workaround is to ensure the MKL library is loaded very early in the test process, so it doesn't cause the deadlock.

Workaround #1073

Another deadlock also occurs when running TestAutoInference and TestPipelineSweeper in parallel. Marking these tests to not run in parallel anymore.

Workaround #1095

Moving back to the Azure Hosted VS2017 pool to run the tests now that we've narrowed the deadlocks down.

…owUtils.GetModelNodes (#1093)

)

…nchmarks (#1114)

…1032)

* Add instructions for building for .NET Core 3.0, and make them work. Fix #1011
* Add config specific properties for the Intrinsics configs.
* Allow tests to be run against .NET Core 3.0

* Port of time series.

)

* Static pipelines now handle types with PipelineColumn properties.
* Update the internal infrastructure to accommodate these types.
* Update the Roslyn analyzer to accommodate these types.
* Update the tests so that they exercise this capability.
* Opportunistically fix some problems with the Roslyn analyzer brought up in this work.

* turned string separators into char array separators
* fixed review comments
* allowed the old api to still work through the arguments object
* added command line test
* fixed test, and added visibility field to arguments
* fixing review comments

…tructors (#1135)

* Remove ComponentCatalog from EntryPointGraph's and GraphRunner's constructors
* Remove catalog temp variable, use directly in call to ValidateNodes
* Remove catalog temp variable from GraphRunner constructor

* add .NET Core 3.0 support for the benchmarks
* code review fixes: keep it simple

* Adding the Samples.StaticPipe project.
* Adding a sample for SDCA Regression

* Remove explicit ComponentCatalog

ValidateNodes and EntryPointNode now use the ComponentCatalog property of IHostEnvironment.

) (#1141)

* Updating the CopyColumnsEstimator and Transform to use common code (#706)

This builds on the Estimator conversion for the CopyColumnsTransform. This change is mainly refactoring as common code has moved to base level classes. This change is the following:

- CopyColumnTransform now derives from OneToOneTransformerBase
- CopyColumnEstimator now derives from TrivialEstimator
- CopyColumnTransform::Mapper now derives from MapperBase
- Removed code that was no longer needed due to these changes

- Moved CopyColumnsTransform into Microsoft.ML.Transforms namespace, updated namespace usage and entrypoints due to this change.
- Save now uses the SaveColumns from the base class
- Other various changes based upon feedback.

* TrainUtils.Train does not have consistent API usage for the calibrator argument (#1023)

Updates the API signature for TrainUtils.Train to take in an IComponentFactory<ICalibratorTrainer>.

Fixes #1023

* conversion of multiclass naive bayes classifier to estimator
* added pigstension and related test
* added public methods to access label and feature histograms in the predictor
* fixed review comments on new access functions
* moved test to main file

See #1013 for the benchmark results

…#1145)

* Fix MatchNumberWithTolerance to better compare floating-point values
* Updating CheckEqualityFromPathsCore to allow a tolerance match on Windows
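For readers unfamiliar with tolerance-based comparison, the idea behind the MatchNumberWithTolerance change can be sketched as follows. This is an illustration only, written in Python rather than the repository's C#; the function name `numbers_match`, the `digits` parameter, and the combined absolute/relative check are assumptions for the sketch and are not taken from the actual test code.

```python
import math


def numbers_match(expected: float, actual: float, digits: int = 6) -> bool:
    """Hypothetical helper: compare two floats up to roughly `digits` digits.

    Not the ML.NET implementation; a generic sketch of a tolerance match.
    """
    # NaN never equals anything, so treat two NaNs as a match explicitly.
    if math.isnan(expected) or math.isnan(actual):
        return math.isnan(expected) and math.isnan(actual)
    # Infinities must agree exactly (including sign).
    if math.isinf(expected) or math.isinf(actual):
        return expected == actual

    tolerance = 10.0 ** -digits
    diff = abs(expected - actual)
    # The absolute check handles values near zero; the relative check
    # handles large magnitudes where a fixed absolute tolerance is too strict.
    return diff <= tolerance or diff <= tolerance * max(abs(expected), abs(actual))
```

A baseline comparison built on such a helper accepts small platform-dependent differences in printed numbers while still rejecting real regressions.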

sfilipi and others added 30 commits September 21, 2018 22:01

undoing test changes (#974)

1f5f696

Extended contexts to regression and multiclass, added FFM pigstension

eb26489

Add analyzer to nuget (#999)

fe07907

* Add analyzer to nuget
* Generalize target file to enable easier analyzer inclusions in future

Rename the static pipeline namespace. (#1007)

b790195

* Microsoft.ML.Data.StaticPipe to Microsoft.ML.StaticPipe.
* Columns now in StaticPipe as opposed to StaticPipe.Runtime.

Conversion of ordinary least square linear regression (OlsLinearRegre…

a18d296

…ssion) to estimator (#1002)

ColumnNameAttribute is respected by TextLoader

655c2e2

Fixes #997

using TextLoader.Create instead of env.CreateLoader (#1025)

330aa41

PcaTrainer as estimator (#996)

b4a95aa

adding PcaTrainer as estimator and tidying the PredictionTransformer class.

Conversion of Parallel Stochastic Gradient Descent (SymSGD) to estima…

ab32439

…tor (#1012)

Adding the extension methods for FastTree classification and regressi…

b270b4d

…on (#1009)

* FastTree classification and regression extensions

Merge ModuleCatalog into ComponentCatalog. (#1022)

59a90e7

Converted Feature selection transforms in to transformers/estimators. (…

b831516

…#991)

LightGbm pigstensions (#1020)

a01c80c

TrainTestSplit function (#1005)

dd4320d

Update our Windows CI leg to use the non-Hosted Windows queue (#1030)

d7b062d

Add OnnxTransform for scoring Onnx 1.2 models - integrates Microsoft.…

f6d850f

…ML.Scoring/Sonoma Library (#942)

Adding benchmark test for PredictionEngine (#1014)

36c75d9

Adds a benchmark test to measure performance of doing many single predictions with PredictionEngine. Closes #1013

Remove DnnAnalyzer from the Microsoft.ML.TensorFlow nuget (#1029)

d42963c

Converted PcaTransform into Transformer using TransformerWrapper. (#1017

b87ae02

)

Bump master to 0.7 (#1037)

759ac33

Remove the error tracing when assembly loading fails for Maml. (#1058)

a80e3d6

* Remove the error tracing when assembly loading fails for Maml. Also adding our native assemblies to the list to skip, so they aren't attempted to be loaded. Fix #1034

Use full test name (#1035)

437c1ba

Provided the name for macOS 10.12 version. (#1070)

769b1eb

Helps the user to relate to the macOS version faster.

Finish the sentence in TextLoader static pipeline extension method (#…

7fde5a3

…1074)

Enabled Multiclass Logistic Regression Tests (#939)

b871c86

* Multiclass logistic Regression tests enabled
* threshold providing in tests
* defining tolerance as a constant in baseTestBaseline Class
* upper case camel for constant and _ for large decimal numbers

GalOshri and others added 29 commits October 1, 2018 11:04

Add release notes for ML.NET 0.6 (#1102)

70b3c3b

Add xml documentation for TensorFlowUtils.GetModelSchema and TensorFl…

87ffaa1

…owUtils.GetModelNodes (#1093)

Updated the building instructions to specify supported VS version (#1024

76dd923

)

Adding ONNX scoring example link and prediction engine improvement be…

00a10ad

…nchmarks (#1114)

Fixed a grammatical error in windows-instructions (#1117)

eb1c141

Allow the creation of ONNX initializers (#965)

ff85a5c

Add instructions for building for .NET Core 3.0, and make them work. (#…

fcea146

…1032)

* Add instructions for building for .NET Core 3.0, and make them work. Fix #1011
* Add config specific properties for the Intrinsics configs.
* Allow tests to be run against .NET Core 3.0

Port Time Series (#977)

cde7038

* Port of time series.

Update Readme (#1123)

25b4a2c

Convert categorical hash to estimator (#1033)

c6c0d22

Update build yaml to use official container functionality (#1118)

3a1fa0f

General grammar and punctuation fixes (#1140)

2bafe94

Fixed docs for API overview: added AsDynamic call (#1139)

800d245

add .NET Core 3.0 support for the benchmarks (#1142)

eba2751

* add .NET Core 3.0 support for the benchmarks
* code review fixes: keep it simple

Create links to detail sections (#1149)

20761a3

XML documentation references cs code for examples (#1105)

c87b869

* Adding the Samples.StaticPipe project.
* Adding a sample for SDCA Regression

Remove explicit ComponentCatalog parameter (#1147)

22b0845

* Remove explicit ComponentCatalog

ValidateNodes and EntryPointNode now use the ComponentCatalog property of IHostEnvironment.

TrainUtils.Train does not have consistent API usage (#1155)

b770281

* TrainUtils.Train does not have consistent API usage for the calibrator argument (#1023)

Updates the API signature for TrainUtils.Train to take in an IComponentFactory<ICalibratorTrainer>.

Fixes #1023

Convert RFF transform to estimators (#1122)

3e3bd50

Adding prediction benchmarks using legacy LearningPipeline API (#1126)

3170ab0

See #1013 for the benchmark results

Renamed variables with more ML.NET specific terminology (#799)

c45089f

Conversion of Hogwild SGD to estimator (#1134)

d517589

Fix MatchNumberWithTolerance to better compare floating-point values (…

02e85cc

…#1145)

* Fix MatchNumberWithTolerance to better compare floating-point values
* Updating CheckEqualityFromPathsCore to allow a tolerance match on Windows

cTxplorer merged commit 511503b into cTxplorer:master Oct 6, 2018
