Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Isolate dataset for qualx plugin invocations #1361

Merged
merged 3 commits into from
Sep 30, 2024

Conversation

leewyang
Copy link
Collaborator

This PR isolates the rows for the targeted dataset when invoking a qualx dataset-specific plugin.

Previously, all plugins were invoked with the entire input dataframe, which relied on the plugin implementation to correctly target only the rows of its intended dataset. While this design was simple and allowed for the full flexibility to modify the data in any way, it also can lead to unexpected behavior if a plugin accidentally targets the rows of other datasets.

I have confirmed that the full workflow of preprocessing, training, and evaluate produces the same models and evaluation results.

Changes

  1. send only the slice of the input dataframe for the specific dataset targeted by the plugin.
  2. raise an error if the plugin code modifies the dataframe indices (implying invalid/unsupported modification of the rows).
  3. add support for a 'default' split_function (by name vs. by presence in a list).
  4. updated qualx.md doc with more details for plugin implementations.

Test

Following CMDs have been tested:

Internal Usage:

python qualx_main.py preprocess
python qualx_main.py train
python qualx_main.py evaluate

Signed-off-by: Lee Yang <leewyang@gmail.com>
@leewyang leewyang added the user_tools Scope the wrapper module running CSP, QualX, and reports (python) label Sep 26, 2024
@leewyang leewyang self-assigned this Sep 26, 2024
Signed-off-by: Lee Yang <leewyang@gmail.com>
Signed-off-by: Lee Yang <leewyang@gmail.com>
Copy link
Collaborator

@parthosa parthosa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTME. @eordentlich Could you please review it as well?

Copy link
Collaborator

@eordentlich eordentlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@parthosa parthosa merged commit de7c3a0 into NVIDIA:dev Sep 30, 2024
14 checks passed
@leewyang leewyang deleted the qualx_plugin_df branch September 30, 2024 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
user_tools Scope the wrapper module running CSP, QualX, and reports (python)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants