Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PcaTrainer as estimator #996

Merged
merged 12 commits into from
Sep 25, 2018
Merged

PcaTrainer as estimator #996

merged 12 commits into from
Sep 25, 2018

Conversation

sfilipi
Copy link
Member

@sfilipi sfilipi commented Sep 22, 2018

Addresses part of #754

@sfilipi sfilipi self-assigned this Sep 22, 2018
@sfilipi sfilipi added the API Issues pertaining the friendly API label Sep 22, 2018
@Zruty0 Zruty0 mentioned this pull request Sep 22, 2018
@@ -108,6 +137,18 @@ public override PcaPredictor Train(TrainContext context)
}
}

private static SchemaShape.Column MakeWeightColumn(string weightColumn)
Copy link
Member Author

@sfilipi sfilipi Sep 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use trainerUtils #Resolved

}

protected override BinaryPredictionTransformer<PcaPredictor> MakeTransformer(PcaPredictor model, ISchema trainSchema)
=> new BinaryPredictionTransformer<PcaPredictor>(Host, model, trainSchema, _featureColumn);
Copy link
Member Author

@sfilipi sfilipi Sep 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a, _featu [](start = 84, length = 9)

@zruty do we have a threeshold, to pass? #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use default


In reply to: 219663367 [](ancestors = 219663367)

@sfilipi
Copy link
Member Author

sfilipi commented Sep 23, 2018

        Model = model;

compact all constructors here. #Resolved


Refers to: src/Microsoft.ML.Data/Scorers/PredictionTransformer.cs:71 in c1d54f7. [](commit_id = c1d54f7, deletion_comment = False)

public IDataView Transform(IDataView input)
{
Host.CheckValue(input, nameof(input));
return Scorer.ApplyToData(Host, input);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Sep 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scorer [](start = 19, length = 6)

SDCA, FastTree, LightGbm, GAM don't have Scorer setup, so tests are failing. #Resolved

@@ -41,7 +42,7 @@ namespace Microsoft.ML.Runtime.PCA
/// <remarks>
/// This PCA can be made into Kernel PCA by using Random Fourier Features transform
/// </remarks>
public sealed class RandomizedPcaTrainer : TrainerBase<PcaPredictor>
public sealed class RandomizedPcaTrainer : TrainerEstimatorBase<BinaryPredictionTransformer<PcaPredictor>, PcaPredictor>
Copy link
Contributor

@TomFinley TomFinley Sep 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BinaryPredictionTransformer [](start = 68, length = 27)

So, why is this a binary prediction transformer? Prior to your change it was not a binary predictor, it was part of the anomaly detection task. #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks.


In reply to: 220048006 [](ancestors = 220048006)

{
return scoreType == NumberType.Float;
}
=> scoreType == NumberType.Float;
}
Copy link
Contributor

@TomFinley TomFinley Sep 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit of a nit, but while you're at it, maybe indent properly. #Closed

Copy link
Contributor

@TomFinley TomFinley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sfilipi !

@sfilipi
Copy link
Member Author

sfilipi commented Sep 25, 2018

Closing and re-opening to trigger rebuild. The local build is fine..

@sfilipi sfilipi closed this Sep 25, 2018
@sfilipi sfilipi reopened this Sep 25, 2018
/// <param name="rank">The number of components in the PCA.</param>
/// <param name="oversampling">Oversampling parameter for randomized PCA training.</param>
/// <param name="center">If enabled, data is centered to be zero mean.</param>
/// <param name="seed">The seed for random number generation.</param>
Copy link
Contributor

@Zruty0 Zruty0 Sep 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seed [](start = 25, length = 4)

why do we use a separate seed here? #Pending

Copy link
Member Author

@sfilipi sfilipi Sep 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Zruty0 it is part of the args, therefore i moved it here; rather than doing the delegate on advancedArgs.


In reply to: 220353522 [](ancestors = 220353522)

var pipeline = new RandomizedPcaTrainer(Env, featureColumn, rank:10);

TestEstimatorCore(pipeline, data);
}
Copy link
Contributor

@Zruty0 Zruty0 Sep 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

} [](start = 7, length = 2)

Done() #Resolved

Copy link
Contributor

@Zruty0 Zruty0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@sfilipi sfilipi merged commit b4a95aa into dotnet:master Sep 25, 2018
@sfilipi sfilipi deleted the PCAEstimator branch September 25, 2018 22:34
@ghost ghost locked as resolved and limited conversation to collaborators Mar 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
API Issues pertaining the friendly API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants