Releases: dariasor/TreeExtra
2.6
- All training and evaluation commands can now use weights on individual data points. The weight column should be marked as "(weight)" in the attribute file.
- ag_addbag command deprecated
- New command vis_correlations produces a list of Spearman's correlation scores between all pairs of active features. It works only in the absence of missing values.
- bt_train outputs either a correlation table or information on missing values for active features
- ag_fs outputs core_features.txt: a list of selected features sorted by importance
- The -o parameter of the vis_* commands now takes a suffix instead of the whole file name. The full file name is built from the feature names combined with the provided suffix.
- Removed type check for non-active attributes
- Bagging recommendations changed to increase exponentially
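
To illustrate what a table of pairwise Spearman's correlation scores contains, here is a minimal Python sketch. It is not the vis_correlations implementation: the function names (`average_ranks`, `pearson`, `spearman_table`) and the in-memory column layout are assumptions made for this example. Like the command, it assumes there are no missing values.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences (nan if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def spearman_table(columns):
    """Spearman's correlation = Pearson correlation of the ranks, for every feature pair."""
    ranked = {name: average_ranks(vals) for name, vals in columns.items()}
    return [(a, b, pearson(ranked[a], ranked[b])) for a, b in combinations(ranked, 2)]


# Hypothetical feature columns, used only for illustration.
columns = {"x": [1, 2, 3, 4], "y": [10, 20, 30, 40], "z": [4, 3, 2, 1]}
table = spearman_table(columns)
```

Each entry of `table` is a `(feature_a, feature_b, score)` tuple. Because the score depends only on ranks, any monotonic rescaling of a feature leaves it unchanged.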
2.5
- Old logs are no longer deleted during a run of ag_train or bt_train. Instead, they are appended to logs.archive.txt
- bt_train now generates a new output file: prediction scores on the validation data set. Default file name is preds.txt
- For ag_train, a warning appears when there is no signal (even the simplest model has performance worse than baseline).
- Changed the output for the layered mode of training for Additive Groves. The best model parameters and the expanding recommendation now correspond to the best model for interaction detection, not to the model providing the best performance on the validation set.
- File performance.txt (Additive Groves output) now contains binary information about learning curve convergence: 1 means the learning curve has converged for the given (a, n) parameters, 0 means it has not.
2.4
- The feature evaluation algorithm in bagged trees has changed. The score in each node is now normalized by the entropy of the split feature in that node. This makes the scores of binary features comparable with the scores of continuous features with multiple values.
- Effect and interaction plots now consider all data, including data points with missing values. For features with a substantial number of missing values, the effect of the missing value is also plotted.
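
The entropy normalization described for 2.4 can be sketched as follows. This is only an illustration under stated assumptions, not the TreeExtra code: the names `feature_entropy` and `normalized_score` are hypothetical, the raw node score is treated as a given number, and the entropy is assumed to be the Shannon entropy (in bits) of the split feature's values among the data points in the node.

```python
from collections import Counter
from math import log2


def feature_entropy(values):
    """Shannon entropy (bits) of the empirical distribution of feature values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def normalized_score(raw_score, values):
    """Divide a node's raw score by the entropy of the split feature in that node.

    Assumption: a feature that is constant in the node (zero entropy)
    cannot produce a split, so its normalized score is taken to be 0.
    """
    h = feature_entropy(values)
    if h == 0:
        return 0.0
    return raw_score / h


# Hypothetical node contents: a binary feature and a four-valued feature.
binary_values = [0, 1, 0, 1]
multi_values = [0, 1, 2, 3]
binary_norm = normalized_score(0.5, binary_values)
multi_norm = normalized_score(0.5, multi_values)
```

With the same raw score, the four-valued feature ends up with a lower normalized score than the binary one, because a feature with more distinct values has higher entropy. This is the sense in which normalization puts binary and multi-valued features on a comparable scale.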