-
Notifications
You must be signed in to change notification settings - Fork 319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support More Aggregate Methods #265
Conversation
src/csharp/Microsoft.Spark.E2ETest/IpcTests/Sql/DataFrameTests.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
src/csharp/Microsoft.Spark.E2ETest/IpcTests/Sql/DataFrameTests.cs
Outdated
Show resolved
Hide resolved
@@ -404,6 +404,22 @@ public void TestSignaturesV2_3_X() | |||
Assert.IsType<RelationalGroupedDataset>(_df.GroupBy(_df["age"])); | |||
Assert.IsType<RelationalGroupedDataset>(_df.GroupBy(_df["age"], _df["name"])); | |||
|
|||
Assert.IsType<DataFrame>(_df.GroupBy("name").Mean("age")); | |||
Assert.IsType<DataFrame>( | |||
_df.WithColumn("tempAge", _df["age"]).GroupBy("name").Mean("age", "tempAge")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for these GroupBy.fn tests, let's pull out _df.WithColumn("tempAge", _df["age"]).GroupBy("name") into a var within it's own scope, { ... } and reuse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with _df.GroupBy("name") as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made the changes. Please let me know if the var name makes sense. Thanks!
520ba68
Assert.IsType<DataFrame>(_df.GroupBy("name").Sum("age")); | ||
Assert.IsType<DataFrame>( | ||
_df.WithColumn("tempAge", _df["age"]).GroupBy("name").Sum("age", "tempAge")); | ||
var relationalGroupedDataset = _df.GroupBy("name"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's enclose this whole subsection with a { }. We can shorten the variable names. Just using df1
and df2
should suffice. Also use RelationalGroupedDataset
instead of var
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha! Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR exposes more aggregate methods for
RelationalGroupedDataset
likeMean()
,Max()
,Avg()
,Min()
. Follow up on issue #260