Add API to get Index Metrics #74
Seems good! Whenever you can, @Jiaweihu08, please resolve the conflicts and merge it!
Shouldn't we also add |
Do you mean the Cube and Weight map? Because I don't know if including the files makes sense. But yeah, sure! Also, let's include a document in |
Yes, I meant Cube Statuses. I'll do the documentation. Thanks! |
Add an API to `QbeastTable` to retrieve OTree index metrics more easily! This is how you would use it:

    val qbeastTable = QbeastTable.forPath(spark, tmpDir)
    val metrics = qbeastTable.getIndexMetrics()

The metrics included so far are:
General index metadata: `desiredCubeSize`
There are also some more specific details, such as `depthOverLogNumNodes = depth / log(cubeCounts)` and `depthOnBalance = depth / log(rowCount / desiredCubeSize)`; both logs use base `dimensionCount`.
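
To make the two ratios concrete, here is a minimal sketch of how they could be computed from the formulas above. The object name `DepthMetricsSketch`, the helper `logBase`, and the parameter types are assumptions made for illustration; this is not the code from the PR.

```scala
// Hypothetical sketch, not the PR's implementation.
object DepthMetricsSketch {

  // Logarithm of x in an arbitrary base, via the change-of-base rule.
  private def logBase(x: Double, base: Double): Double =
    math.log(x) / math.log(base)

  // depth / log(cubeCounts), with the log taken in base dimensionCount.
  def depthOverLogNumNodes(depth: Int, cubeCounts: Int, dimensionCount: Int): Double = {
    // Assumption: a base or argument of 1 makes the log zero, so both are rejected.
    require(dimensionCount >= 2, "dimensionCount must be at least 2")
    require(cubeCounts >= 2, "cubeCounts must be at least 2")
    depth / logBase(cubeCounts.toDouble, dimensionCount.toDouble)
  }

  // depth / log(rowCount / desiredCubeSize), with the log taken in base dimensionCount.
  def depthOnBalance(
      depth: Int,
      rowCount: Long,
      desiredCubeSize: Int,
      dimensionCount: Int): Double = {
    require(dimensionCount >= 2, "dimensionCount must be at least 2")
    require(desiredCubeSize > 0, "desiredCubeSize must be positive")
    require(rowCount > desiredCubeSize, "rowCount must exceed desiredCubeSize")
    depth / logBase(rowCount.toDouble / desiredCubeSize, dimensionCount.toDouble)
  }
}
```

With these definitions, a two-dimensional index of depth 3 with 8 cubes gives a `depthOverLogNumNodes` of about 1, since log base 2 of 8 is 3.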
We also take a closer look at the non-leaf cube sizes. `NonLeafCubeSizeDetails` contains their `min`, `max`, `quantiles`, and how far each cube size is from the `desiredCubeSize`.
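
As an illustration of the kind of summary described here, the sketch below derives the minimum, maximum, quartiles, and a per-cube relative deviation from a list of cube sizes. The names `NonLeafSizeSketch` and `SizeSummary`, the nearest-rank quantile rule, and the relative-deviation formula are all assumptions; the actual `NonLeafCubeSizeDetails` may define these differently.

```scala
// Hypothetical sketch; field names and formulas are illustrative assumptions.
object NonLeafSizeSketch {

  final case class SizeSummary(
      min: Long,
      max: Long,
      quartiles: Seq[Long],             // 25th, 50th and 75th percentiles
      relativeDeviations: Seq[Double])  // (size - desired) / desired, in ascending size order

  def summarize(nonLeafSizes: Seq[Long], desiredCubeSize: Int): Option[SizeSummary] = {
    require(desiredCubeSize > 0, "desiredCubeSize must be positive")
    if (nonLeafSizes.isEmpty) None
    else {
      val sorted = nonLeafSizes.sorted.toVector

      // Nearest-rank quantile: the element at rank ceil(q * n), 1-based.
      def quantile(q: Double): Long = {
        val rank = math.ceil(q * sorted.size).toInt - 1
        sorted(math.min(math.max(rank, 0), sorted.size - 1))
      }

      Some(
        SizeSummary(
          min = sorted.head,
          max = sorted.last,
          quartiles = Seq(0.25, 0.5, 0.75).map(quantile),
          relativeDeviations = sorted.map(s => (s - desiredCubeSize).toDouble / desiredCubeSize)))
    }
  }
}
```

A relative deviation of 0 means the cube holds exactly `desiredCubeSize` elements; positive values indicate cubes that are larger than desired.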
The `Map[CubeId, CubeStatus]` of the index is also returned, since some of the information stored in `CubeStatus` is interesting to analyze, such as the distribution of cube weights for different indexes. You can access this information through `metrics.cubeStatuses`.
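
One way to look at the weight distribution is to bucket the weights into a histogram. The sketch below uses stand-in types so that it runs without Spark: `CubeStatusStub`, its `normalizedWeight` field, and the `String` keys are hypothetical placeholders for the real `CubeStatus` and `CubeId`, and the weights are assumed to lie in [0, 1].

```scala
// Hypothetical sketch with stand-in types; not the real CubeStatus API.
object CubeWeightSketch {

  // Placeholder for CubeStatus, keeping only an assumed weight in [0, 1].
  final case class CubeStatusStub(normalizedWeight: Double)

  // Counts how many cubes fall into each of numBuckets equal-width weight buckets.
  def weightHistogram(statuses: Map[String, CubeStatusStub], numBuckets: Int): Vector[Int] = {
    require(numBuckets > 0, "numBuckets must be positive")
    val counts = Array.fill(numBuckets)(0)
    statuses.values.foreach { status =>
      val raw = (status.normalizedWeight * numBuckets).toInt
      // Clamp so that out-of-range weights land in the first or last bucket.
      val bucket = math.min(math.max(raw, 0), numBuckets - 1)
      counts(bucket) += 1
    }
    counts.toVector
  }
}
```

Comparing histograms built this way for two indexes over the same data gives a quick view of how their cube weights differ.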

Example output: