- replace all instances of deprecated function
purrr::invoke()
withrlang::exec()
- only show unclassified predictions in confusion matrix output if they exist
- add variable importance methods
- add param
vi
tosplendid()
andsplendid_model()
to calculate variable importance
- add elastic net penalty for multinomial logistic regression
- fix calculation of multi-class log loss metric
- increased minimum R version to 4.1.0
- fix warnings pertaining to deprecated functions
- fix reordering of SVM probability matrix
-
use reference cell parameterization to create dummy variables for factors. The reference level (default first level) does not have a dummy variable. A factor with k levels creates k-1 dummy variables.
-
add SVM to the list of algorithms that need all numeric variables, otherwise create dummy variables
-
add GitHub Action workflows for R CMD check, test coverage, and pkgdown
-
calculate AUC from
yardstick
-
package
DMwR
has been archived, so use SMOTE implementation from packageperformanceEstimation
-
print error message if suggested package not available
-
fix
sequential_eval()
for n = 1 bootstrap case
-
add Kappa to evaluation metrics (overall and class-specific)
-
add class-specific accuracy and G-mean metric
-
use a "one-vs-all" SMOTE subsampling technique: over-sample each class vs. the other classes and combine the oversampled datasets to create the "balanced" dataset
-
new multiclass metric G-mean
-
use
yardstick
package for most evaluation metrics -
pass
seeds
tocaret::trainControl()
for reproducible tuning (#48) -
add
roc_plot()
for plotting multi-class ROC curves -
add custom print method for objects returned from
prediction()
. The output was previously not informative and too long -
add parameter
seed_samp
tosplendid()
to allow setting random seed before subsampling -
splendid_convert()
is now defunct. Usesplendid_process()
for a more comprehensive data pre-processing step. The new function canconvert
categorical variables to dummy variables as before. Added the ability tostandardize
continuous variables and applysampling
techniques to deal with class imbalance. Subsampling can only occur on the training set. -
add parameter
stratify
to allow stratified bootstrap sampling on training set -
use standard convention for confusion matrices: predicted in rows, reference in columns
-
replace
MLmetrics::MultiLogLoss()
withModelMetrics::mlogLoss()
inlogloss()
since it handles the case when the truth has a category with 0 counts but is represented in the probability matrix -
add NPV and specificity to
evaluation()
-
increase minimum R version to 3.6.0
-
move packages used for classification to
Suggests
to reduce the number of dependencies and are used conditionally -
remove deprecated
context()
from tests -
update roxygen and docs
-
internal functions deprecated and imported from new packages as needed
-
update vignette parameter descriptions
-
put the macro and micro averaged ROC curves at the end of legend in
roc_plot()
-
suppress warnings in call to
multiROC::multi_roc()
after updates tostats:::regularize.values()
in R-3.6.0 passeswarn.collapsing = TRUE
if there is no value forties
instats::approx()
-
in sequential method, remove bootstrap iterations with an undefined F1-measure from average calculation
-
increase
perc.over
andperc.under
in SMOTE subsampling to ensure the second to smallest class has > 0 cases -
decrease
minsplit
inadaboost
so fewer observations are needed to split a node in the rpart classifier -
fix
num_class
inxgboost
: number of classes should be taken from factor levels (some might be dropped from training set) -
fix factor order in
class_threshold()
to take from column order of associated probability matrix
-
Default seed parameter value
NULL
does not invokeset.seed()
-
Extend random seed parameter to more algorithms
-
Extended categorical variable conversion to
classification()
-
Reinstate tidy evaluation semantics after package dependencies updated
-
Improved RFE interface
-
Added AdaBoost.M1 algorithm
-
Added a
NEWS.md
file to track changes to the package.