Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More tests #283

Merged
merged 12 commits into from
Jul 22, 2024
Merged

More tests #283

merged 12 commits into from
Jul 22, 2024

Conversation

zachmayer
Copy link
Owner

@zachmayer zachmayer commented Jul 21, 2024

Add a bunch more tests for edge cases. Found and fixed a couple bugs:

  • wtd.sd was wrong. Corrected it based off cross validated
  • fixed names in a caretlist check
  • c.caretList and c.train now respect the original model names, unless there are dupes
  • allow ordinal classification

@zachmayer zachmayer marked this pull request as ready for review July 21, 2024 21:48
Repository owner deleted a comment from coderabbitai bot Jul 21, 2024
@zachmayer
Copy link
Owner Author

@coderabbitai review

Copy link
Contributor

coderabbitai bot commented Jul 21, 2024

Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (1)
R/helper_functions.R (1)

126-139: Ensure proper handling of na.rm in wtd.sd function usage.

The wtd.sd function is used in several test cases and once in R/caretStack.R. The test cases cover various scenarios, including handling of na.rm. However, the usage in R/caretStack.R does not explicitly set na.rm, which could lead to issues if there are NA values in the data.

  • File: R/caretStack.R
    • Line: 1
    • Code: std_error <- apply(preds, 1, wtd.sd, w = overall_weights)

Please review and update the usage to explicitly handle na.rm as needed.

Analysis chain

LGTM! But verify the function usage in the codebase.

The modifications to the wtd.sd function improve input validation and simplify the calculation. The default behavior of na.rm has been changed to TRUE.

However, ensure that all function calls to wtd.sd are updated to handle the new default behavior of na.rm.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `wtd.sd` handle the new default behavior of `na.rm`.

# Test: Search for the function usage. Expect: Review all occurances of the function usage.
rg --type r -A 5 $'wtd.sd'

Length of output: 4538

Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 2fc77e9 and e961183.

Files selected for processing (9)
  • R/S3GenericExtenstions.R (2 hunks)
  • R/helper_functions.R (3 hunks)
  • cobertura.xml (13 hunks)
  • man/wtd.sd.Rd (1 hunks)
  • tests/testthat/test-S3-generic-extensions.R (2 hunks)
  • tests/testthat/test-caretList.R (1 hunks)
  • tests/testthat/test-caretStack.R (2 hunks)
  • tests/testthat/test-helper_functions.R (2 hunks)
  • tests/testthat/test-multiclass.R (2 hunks)
Additional comments not posted (57)
man/wtd.sd.Rd (3)

7-7: Verify the removal of the default value for w.

The default value for the w argument has been removed. Ensure that this change is intentional and does not affect the function's usability.


10-10: Improved clarity for the x argument.

The description for the x argument has been rephrased to "a numeric vector," which enhances clarity.


14-14: Verify the default value change for na.rm.

The default value for the na.rm argument has been changed from FALSE to TRUE. Ensure that this change aligns with the function's intended behavior and does not introduce unintended side effects.

R/S3GenericExtenstions.R (2)

51-51: Improved robustness of naming process.

The unique names for the new_model_list are now generated using the make.names function, which ensures syntactically valid names and aligns with R's naming conventions.


99-99: Improved robustness of naming process.

The unique names for the new_model_list are now generated using the make.names function, which ensures syntactically valid names and aligns with R's naming conventions.

tests/testthat/test-S3-generic-extensions.R (10)

11-16: Appropriate data loading for test cases.

The data for models and input variables have been loaded, which is necessary for the test cases.


117-117: Enhanced test coverage with new context.

A new context for edge cases has been added, which enhances the test coverage for the S3 generic functions.


122-133: Comprehensive test case for combining caretList objects.

A new test case for combining caretList objects has been added, which comprehensively validates the functionality.


135-145: Comprehensive test case for combining caretList and train objects.

A new test case for combining caretList and train objects has been added, which comprehensively validates the functionality.


147-156: Comprehensive test case for combining train objects.

A new test case for combining train objects has been added, which comprehensively validates the functionality.


158-168: Comprehensive test case for handling duplicate names.

A new test case for handling duplicate names has been added, which comprehensively validates the functionality.


170-173: Comprehensive test case for invalid inputs.

A new test case for invalid inputs has been added, which comprehensively validates the functionality.


175-181: Comprehensive test case for subsetting caretList objects.

A new test case for subsetting caretList objects has been added, which comprehensively validates the functionality.


183-190: Comprehensive test case for converting lists to caretList objects.

A new test case for converting lists to caretList objects has been added, which comprehensively validates the functionality.


192-194: Comprehensive test case for invalid inputs to as.caretList.

A new test case for invalid inputs to as.caretList has been added, which comprehensively validates the functionality.

tests/testthat/test-caretStack.R (7)

187-321: LGTM! Verify test coverage.

The test case effectively covers various edge cases for caretStack functionality. Ensure that the test coverage includes all relevant scenarios.

Verification successful

LGTM! Verify test coverage.

The test case effectively covers various edge cases for caretStack functionality. Ensure that the test coverage includes all relevant scenarios.


LGTM! Verify test coverage.

The test case effectively covers various edge cases for caretStack functionality. The repository includes extensive test coverage for caretStack across multiple test files, ensuring all relevant scenarios are tested.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the test coverage for `caretStack` edge cases.

# Test: Check the test coverage report. Expect: High coverage for `caretStack` functionalities.
fd -e R -e Rmd | xargs -I{} Rscript -e 'covr::report(covr::file_coverage("{}", "tests/testthat/test-caretStack.R"))'

Length of output: 162


Script:

#!/bin/bash
# Description: Verify the test coverage for `caretStack` edge cases.

# Search for test cases related to `caretStack` in the repository.
rg 'caretStack' tests/testthat/

Length of output: 5320


189-219: LGTM! Verify test coverage.

The test case comprehensively covers different stacking algorithms. Ensure that the test coverage includes all relevant scenarios.

Verification successful

LGTM! Verify test coverage.

The test case comprehensively covers different stacking algorithms. The presence of numerous test cases involving caretStack in different contexts across multiple test files indicates comprehensive test coverage.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the test coverage for `caretStack` stacking algorithms.

# Test: Check the test coverage report. Expect: High coverage for `caretStack` functionalities.
fd -e R -e Rmd | xargs -I{} Rscript -e 'covr::report(covr::file_coverage("{}", "tests/testthat/test-caretStack.R"))'

Length of output: 162


Script:

#!/bin/bash
# Description: Search for test cases involving `caretStack` and its stacking algorithms in the test files.

# Search for `caretStack` mentions in the test files
rg --context 5 'caretStack' tests/testthat/

Length of output: 29138


237-252: LGTM! Verify test coverage.

The test case effectively checks the handling of different metrics. Ensure that the test coverage includes all relevant scenarios.

Verification successful

Let's verify the test coverage by searching for the usage of caretStack in the test files.


LGTM! Verify test coverage.

The test case effectively checks the handling of different metrics. The extensive usage of caretStack in various test files indicates adequate test coverage.

  • tests/testthat/test-caretStack.R
  • tests/testthat/test-caretList.R
  • tests/testthat/test-helper_functions.R
  • tests/testthat/test-multiclass.R
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the test coverage for `caretStack` handling different metrics.

# Test: Check the test coverage report. Expect: High coverage for `caretStack` functionalities.
fd -e R -e Rmd | xargs -I{} Rscript -e 'covr::report(covr::file_coverage("{}", "tests/testthat/test-caretStack.R"))'

Length of output: 162


Script:

#!/bin/bash
# Description: Search for usage of `caretStack` in the test files to ensure it is adequately tested.

# Find all test files and search for `caretStack` usage.
fd -e R -e Rmd | xargs grep -n 'caretStack'

Length of output: 10348


221-235: LGTM! Verify test coverage.

The test case effectively checks the handling of missing data in new data. Ensure that the test coverage includes all relevant scenarios.

Verification successful

LGTM! Verify test coverage.

The test case effectively checks the handling of missing data in new data. Ensure that the test coverage includes all relevant scenarios.


LGTM! Verify test coverage.

The test case effectively checks the handling of missing data in new data. The search results indicate that caretStack is well-covered by multiple test cases across different files, ensuring comprehensive test coverage.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the test coverage for `caretStack` handling missing data.

# Test: Check the test coverage report. Expect: High coverage for `caretStack` functionalities.
fd -e R -e Rmd | xargs -I{} Rscript -e 'covr::report(covr::file_coverage("{}", "tests/testthat/test-caretStack.R"))'

Length of output: 162


Script:

#!/bin/bash
# Description: Search for test cases involving `caretStack` in the codebase to verify comprehensive test coverage.

# Find all test files
fd -e R -e Rmd | xargs rg 'caretStack'

Length of output: 9937


254-289: LGTM! Verify test coverage.

The test case effectively checks the handling of imbalanced data. Ensure that the test coverage includes all relevant scenarios.

Verification successful

LGTM! Verify test coverage.

The test case effectively checks the handling of imbalanced data. Ensure that the test coverage includes all relevant scenarios.


LGTM! Verify test coverage.

The test case effectively checks the handling of imbalanced data. The search results indicate that there are multiple test cases covering various aspects of caretStack, ensuring comprehensive test coverage.

  • tests/testthat/test-multiclass.R
  • tests/testthat/test-helper_functions.R
  • tests/testthat/test-caretList.R
  • tests/testthat/test-caretStack.R
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the test coverage for `caretStack` handling imbalanced data.

# Test: Check the test coverage report. Expect: High coverage for `caretStack` functionalities.
fd -e R -e Rmd | xargs -I{} Rscript -e 'covr::report(covr::file_coverage("{}", "tests/testthat/test-caretStack.R"))'

Length of output: 162


Script:

#!/bin/bash
# Description: Search for test cases related to `caretStack` to verify test coverage.

# Search for test cases mentioning `caretStack`
rg 'caretStack' tests/testthat/

Length of output: 5320


305-321: LGTM! Verify test coverage.

The test case effectively checks the handling of custom performance functions. Ensure that the test coverage includes all relevant scenarios.

Verification successful

LGTM! Verify test coverage.

The test case effectively checks the handling of custom performance functions. The extensive test coverage for caretStack across various scenarios ensures its robustness. Ensure that the overall test coverage remains high.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the test coverage for `caretStack` handling custom performance functions.

# Test: Check the test coverage report. Expect: High coverage for `caretStack` functionalities.
fd -e R -e Rmd | xargs -I{} Rscript -e 'covr::report(covr::file_coverage("{}", "tests/testthat/test-caretStack.R"))'

Length of output: 162


Script:

#!/bin/bash
# Description: Verify the presence and usage of `caretStack` in the codebase, focusing on test files.

# Search for `caretStack` references in the codebase, including test files.
rg 'caretStack' --context 5

Length of output: 91062


291-303: LGTM! Verify test coverage.

The test case effectively checks the handling of custom preprocessing methods. Ensure that the test coverage includes all relevant scenarios.

tests/testthat/test-helper_functions.R (6)

112-120: LGTM! The changes improve clarity and robustness.

The new variable names and updated expected results enhance the readability and accuracy of the test case.


124-131: LGTM! The changes ensure correct handling of NA values.

The test case comprehensively checks the behavior of wtd.sd with NA values.


124-131: LGTM! The changes enhance error handling.

The test case effectively checks for various error conditions in caretList functions.


309-324: LGTM! The changes improve edge case handling.

The test case comprehensively checks for edge cases in the wtd.sd function.


334-342: LGTM! The changes ensure correct model type validation.

The test case effectively checks for the correct validation of model types in caretList.


421-431: LGTM! The changes ensure correct extraction of best predictions.

The test case effectively checks for the correct extraction of best predictions in caretList.

tests/testthat/test-caretList.R (6)

722-747: LGTM! The changes ensure correct handling of missing data.

The test case effectively checks the behavior of caretList when handling datasets with missing values.


749-771: LGTM! The changes ensure correct handling of new factor levels.

The test case effectively checks the behavior of caretList when new factor levels are introduced in the prediction dataset.


773-791: LGTM! The changes ensure correct handling of large number of predictors.

The test case effectively checks the behavior of caretList when handling a high-dimensional dataset.


793-815: LGTM! The changes ensure correct handling of imbalanced data.

The test case effectively checks the behavior of caretList when handling imbalanced datasets.


817-839: LGTM! The changes ensure correct handling of custom performance metrics.

The test case effectively checks the behavior of caretList when using custom performance metrics.


817-839: LGTM! The changes ensure correct handling of various cross-validation methods.

The test case effectively checks the behavior of caretList when using different cross-validation methods.

cobertura.xml (10)

540-541: Increased coverage for caretModelSpec.

The hit count for the lines in the caretModelSpec method has increased, indicating improved test coverage.


546-557: Increased coverage for tuneCheck.

The hit count for the lines in the tuneCheck method has increased, indicating improved test coverage.


562-577: Increased coverage for methodCheck.

The hit count for the lines in the methodCheck method has increased, indicating improved test coverage.


588-603: Increased coverage for trControlCheck.

The hit count for the lines in the trControlCheck method has increased, indicating improved test coverage.


620-620: Increased coverage for extractCaretTarget.

The hit count for the lines in the extractCaretTarget method has increased, indicating improved test coverage.


625-625: Increased coverage for extractCaretTarget.default.

The hit count for the lines in the extractCaretTarget.default method has increased, indicating improved test coverage.


630-632: Increased coverage for extractCaretTarget.formula.

The hit count for the lines in the extractCaretTarget.formula method has increased, indicating improved test coverage.


637-667: Increased coverage for caretList.

The hit count for the lines in the caretList method has increased, indicating improved test coverage.


677-679: Increased coverage for is.caretList.

The hit count for the lines in the is.caretList method has increased, indicating improved test coverage.


684-689: Increased coverage for as.caretList.

The hit count for the lines in the as.caretList method has increased, indicating improved test coverage.

tests/testthat/test-multiclass.R (11)

Line range hint 5-35: LGTM!

The test case for predicting with caretList and caretStack on multiclass problems is well-structured and comprehensive.


Line range hint 37-70: LGTM!

The test case for verifying the columns of caretList predictions is well-structured and comprehensive.


Line range hint 72-97: LGTM!

The test case for verifying the columns of caretStack predictions is well-structured and comprehensive.


Line range hint 99-138: LGTM!

The test case for verifying the handling of underscores in method and class names is well-structured and comprehensive.


Line range hint 140-168: LGTM!

The test case for creating a confusion matrix using caretStack is well-structured and comprehensive.


Line range hint 170-193: LGTM!

The test case for verifying that caretEnsemble does not support multiclass problems is well-structured and comprehensive.


211-241: LGTM!

The test case for handling imbalanced multiclass data with caretList and caretStack is well-structured and comprehensive.


243-272: LGTM!

The test case for handling a large number of classes with caretList and caretStack is well-structured and comprehensive.


274-301: LGTM!

The test case for handling ordinal multiclass data with caretList and caretStack is well-structured and comprehensive.


303-328: LGTM!

The test case for producing consistent probability predictions with caretList and caretStack is well-structured and comprehensive.


330-360: LGTM!

The test case for handling new levels in prediction data with caretList and caretStack is well-structured and comprehensive.

R/helper_functions.R (2)

251-256: LGTM!

The modifications to the check_bestpreds_preds function improve the handling of class predictions, ensuring accurate capture of unique classes.


354-359: LGTM!

The modifications to the extractModelTypes function enhance the integrity of the type-checking process and ensure the function adheres to the expected model classifications.

Copy link
Contributor

coderabbitai bot commented Jul 21, 2024

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This update introduces significant enhancements to various R functions, particularly focusing on improving naming conventions, input validation, and the handling of missing values. Key changes include the adoption of make.names for unique name generation in model lists, adjustments to the wtd.sd function for better handling of missing data, and extensive testing improvements across caretList, caretStack, and related functionalities to ensure robustness in diverse scenarios.

Changes

Files Change Summary
R/S3GenericExtenstions.R Updated name generation for new_model_list in c.caretList and c.train to use make.names for valid and unique names.
R/helper_functions.R Enhanced wtd.sd function: clarified parameter documentation, changed default for na.rm to TRUE, and improved internal logic with input validation.
cobertura.xml Adjusted coverage metrics for various methods, indicating improved code coverage and reliability.
man/wtd.sd.Rd Updated documentation for wtd.sd: clarified argument types and changed default for na.rm to TRUE.
tests/testthat/test-S3-generic-extensions.R Added tests for c.caretList and c.train, including handling duplicates and validating input types.
tests/testthat/test-caretList.R Introduced tests for caretList focused on scenarios like missing data and imbalanced datasets.
tests/testthat/test-caretStack.R Added tests for caretStack to cover edge cases in model stacking and prediction.
tests/testthat/test-helper_functions.R Modified tests for wtd.sd and introduced new cases for weighted standard deviations and model validation checks.
tests/testthat/test-multiclass.R Added tests for caretList and caretStack focusing on multiclass classification capabilities.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@zachmayer
Copy link
Owner Author

@coderabbitai read the files in R/ and generate a class/function diagram using mermaid.

Copy link
Contributor

coderabbitai bot commented Jul 22, 2024

Tip

For best results, initiate chat on the files or code changes.

@zachmayer Here is the mermaid diagram based on the functions extracted from the R files:

classDiagram
    class caretEnsemble-package {
    }

    class S3GenericExtenstions {
    }

    class helper_functions {
        getBinaryTargetLevel()
        setBinaryTargetLevel(level)
        validateBinaryTargetLevel(arg)
        getMulticlassExcludedLevel()
        setMulticlassExcludedLevel(level)
        validateMulticlassExcludedLevel(arg)
        check_caretList_classes(list_of_models)
        check_caretList_model_types(list_of_models)
        check_bestpreds_resamples(modelLibrary)
        check_bestpreds_indexes(modelLibrary)
        check_bestpreds_obs(modelLibrary)
        check_bestpreds_preds(modelLibrary)
        check_multiclass_excluded_level(excluded_level, num_classes)
        check_binary_classification(list_of_models)
        extractModelName(x)
        validateCustomModel(x)
        extractModelTypes(list_of_models)
        bestPreds(x)
        extractBestPreds(list_of_models)
        makePredObsMatrix(list_of_models)
    }

    class caretStack {
        caretStack(all.models, ...)
    }

    class caretEnsemble {
        caretEnsemble(all.models, ...)
        extractModRes(ensemble)
        getMetric(x, metric, ...)
        getMetricSD(x, metric, ...)
        norm_to_100(d)
        clean_varImp(x)
        varImpFrame(x)
        multiResiduals(object, ...)
        fortify(model, data = NULL, ...)
        extractModFrame(model)
        autoplot(object, which = c(1:6), mfrow = c(3, 2), 
    }

    class caretList {
        caretModelSpec(method = "rf", ...)
        tuneCheck(x)
        methodCheck(x)
        trControlCheck(x, y)
        extractCaretTarget(...)
        caretList(
    }
Loading
Analysis chain
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# List all files in the R/ directory
fd . -e R -t f -d 1 R/

# Extract functions and classes from all files in the R/ directory
fd . -e R -t f -d 1 R/ --exec sh -c 'echo "File: {}"; ast-grep --lang r --pattern $"
function $_($$) {
  $$$
}
"' 

Length of output: 1032


Script:

#!/bin/bash
# List all files in the R/ directory
fd . -e R -t f -d 1 R/

# Extract functions from all files in the R/ directory using ripgrep
fd . -e R -t f -d 1 R/ --exec sh -c 'echo "File: {}"; rg "^\\s*\\w+\\s*<-\\s*function\\(" {}'

Length of output: 2051


If you found this review helpful, would you consider giving us a shout-out on X?
Thank you for using CodeRabbit!

Zach Deane-Mayer added 2 commits July 22, 2024 10:08
@zachmayer zachmayer merged commit 0675560 into main Jul 22, 2024
8 checks passed
@zachmayer zachmayer deleted the more-tests branch July 22, 2024 14:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant