The Summarize function in sharded databases doesn't "summarize" the individual shard results. It also attempts to re-summarize partial results on WS FULL, but I believe it is doing so incorrectly by using the same summary function as originally used on the raw data.
I believe the WS FULL implementation is currently correct, but only because the only summary functions supported are count, sum, max and min. If you needed to add avg or similar functions, you'd need to do more work. I will look at the sharding issue.
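To illustrate why avg would need more work: averaging the per-chunk averages weights every chunk equally, regardless of how many rows it holds. A minimal sketch in Python (not the project's language; `partial_avg` and `merge_avg` are hypothetical names, not functions from this codebase), assuming each partial result carries a (sum, count) pair:

```python
from typing import List, Tuple

def partial_avg(chunk: List[float]) -> Tuple[float, int]:
    """Summarize one chunk as (sum, count) rather than as a finished average."""
    return (sum(chunk), len(chunk))

def merge_avg(partials: List[Tuple[float, int]]) -> float:
    """Combine partial (sum, count) pairs into the overall average."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count

# Two unevenly sized chunks, e.g. two shards or two WS FULL batches.
chunks = [[1.0, 2.0, 3.0], [10.0]]

# Naive approach: average the per-chunk averages (2.0 and 10.0).
naive = sum(sum(c) / len(c) for c in chunks) / len(chunks)

# Merging (sum, count) pairs gives the average over all four rows.
correct = merge_avg([partial_avg(c) for c in chunks])

print(naive, correct)
```

The naive value differs from the true average whenever the chunks have different sizes, which is why the partial state has to be richer than the final result.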
Well, count would be incorrect as well, since it should sum up the individual counts when re-summarizing. But it is buggy anyway because the groupfn takes vectors of columns as its argument. I've fixed this in my fork and added new functions to re-summarize.
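The count problem can be shown with a small sketch, again in Python rather than the project's own language. The tables and function names here are hypothetical and only model the idea of a separate re-summarize function per summary function:

```python
# Function applied to raw data in each chunk.
SUMMARIZE = {"count": len, "sum": sum, "max": max, "min": min}

# Function applied to the partial results; only count differs:
# partial counts have to be summed, not counted again.
RESUMMARIZE = {"count": sum, "sum": sum, "max": max, "min": min}

def summarize(fn, values):
    return SUMMARIZE[fn](values)

def resummarize(fn, partials):
    return RESUMMARIZE[fn](partials)

chunks = [[4, 7, 1], [9, 2]]

partial_counts = [summarize("count", c) for c in chunks]

# Reusing the original function counts the partials themselves.
wrong = summarize("count", partial_counts)

# Summing the partial counts recovers the total number of rows.
right = resummarize("count", partial_counts)

print(partial_counts, wrong, right)
```

For sum, max and min, the original function already works on partial results, which is why the problem only shows up for count.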