Automating the validation process for machine learning models is crucial for efficient deployment and monitoring. Depending on the phase (pre-deployment or monitoring) and the model type (such as Classifiers, Regressors, Clusters), a tailored approach is necessary for selecting appropriate validation types and methods.
During the pre-deployment phase, the seamlessvalidation package intelligently selects validation types like Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE) for regression models, while for classification models, it opts for Receiver Operating Characteristic (ROC) curves, Confusion Matrices, Precision-Recall Scores, or F1 scores based on the model's nature. Moreover, it selects validation methods such as k-fold cross-validation, stratified cross-validation, leave-one-out cross-validation, bootstrapping, or holdout validation to ensure robustness in assessing model performance.
In the monitoring phase, the system continues to adapt its validation strategy based on the model type and current deployment context. It automatically selects appropriate validation types and methods to evaluate the model's performance over time. Additionally, it provides continuous monitoring results on data drift, enabling users to assess the model's effectiveness and population shift in real-world scenarios.
By automating the selection of validation types and methods based on the model type and deployment phase, the package ensures thorough and accurate validation, leading to more reliable model deployment and monitoring processes.
You can install your package using pip: pip install seamlessvalidation
Here's how you can use your package:
import seamlessvalidation
# Example usage code here
seamlessvalidation(your_model,X=X_class,y=y_class,n_splits=number_of_splits
,cv_strategy='stratified_k_fold',avg_strategy='weighted',phase='validation')
seamlessvalidation(your_model,X=X_class,y=y_class
,p_value=0.5
,data_compare=old_data_to_compare, data_new=new_data
,phase='post_deployment_monitoring')
Auto selection on validation metrics based on model type and development stage. Easily selection of validation strategy. Flexible intergration comparison method and data drift monitoring
'scikit-learn', 'pandas', 'numpy', 'matplotlib', 'pickle', 'scipy'
Contributions are welcome! Here's how you can contribute:
- Fork the repository
- Make your changes
- Submit a pull request
Copyright (c) 2023, The Hugging Face Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
ZhuZheng(Iverson) ZHOU