[tests] Model tests expansion (partially undone in #1464) #1455

ourownstory · 2023-10-17T23:29:52Z

To date, the test horizon of our model benchmarks has been rather long (20%), so accuracy has been measured over a long generalization horizon. This skews results towards more regularized models.

This PR introduces dual test horizons - a short one (5-10%) and a long one (20-30%).
Those will cover regular and long-horizon model fit accuracy, respectively.
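
As an illustration of the dual horizons, here is a minimal sketch of a chronological split that holds out the tail of a series as the test set. It is not the project's benchmark code: the helper name `split_by_horizon`, the placeholder series, and the specific fractions (0.10 and 0.30, picked from the ranges above) are assumptions for the example.

```python
from typing import List, Sequence, Tuple


def split_by_horizon(
    values: Sequence[float], test_fraction: float
) -> Tuple[List[float], List[float]]:
    """Hold out the last `test_fraction` of a time-ordered series as the test set.

    Hypothetical helper for illustration only, not part of the library's API.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    # Number of trailing observations reserved for testing (at least one).
    n_test = max(1, round(len(values) * test_fraction))
    split = len(values) - n_test
    return list(values[:split]), list(values[split:])


# Placeholder series of 144 time-ordered points (assumed length for the example).
series = [float(i) for i in range(144)]

# Short horizon: regular model fit accuracy.
short_train, short_test = split_by_horizon(series, 0.10)
# Long horizon: long-horizon generalization accuracy.
long_train, long_test = split_by_horizon(series, 0.30)

print(len(short_train), len(short_test))
print(len(long_train), len(long_test))
```

Because the split is chronological, the short test window is always contained in the tail of the long one, so the two benchmarks differ only in how far ahead the model has to generalize.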

Further, the 3 tutorial datasets (3 variations of the Kaggle energy data) are introduced for use in future tests.

Note: The accuracy of the tests is currently lower because early stopping has been disabled in this version. This is necessary, however, as early stopping produces more irregular results and hard-to-interpret training graphs, since part of the training is missing. This is particularly evident for AirPassengers.

codecov · 2023-10-17T23:37:42Z

Codecov Report

Merging #1455 (d4d8460) into main (a22a579) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1455   +/-   ##
=======================================
  Coverage   88.32%   88.32%           
=======================================
  Files          38       38           
  Lines        5097     5097           
=======================================
  Hits         4502     4502           
  Misses        595      595

github-actions · 2023-10-17T23:40:24Z

Model Benchmark

Benchmark	Metric	main	current	diff
AirPassengers_test30	MAE_val	-	25.3608	-
AirPassengers_test30	RMSE_val	-	30.624	-
AirPassengers_test30	Loss_val	-	0.0074	-
AirPassengers_test30	MAE	-	5.3982	-
AirPassengers_test30	RMSE	-	6.8828	-
AirPassengers_test30	Loss	-	0.0002	-
AirPassengers_test30	time	-	19.57	-
YosemiteTemps	MAE_val	1.34899	0.44287	-67.17%	🎉
YosemiteTemps	RMSE_val	2.00817	0.62874	-68.69%	🎉
YosemiteTemps	Loss_val	0.00078	7e-05	-91.32%	🎉
YosemiteTemps	MAE	1.32133	0.80138	-39.35%	🎉
YosemiteTemps	RMSE	2.13713	1.44081	-32.58%	🎉
YosemiteTemps	Loss	0.00064	0.00027	-57.67%	🎉
YosemiteTemps	time	59.72889	138.28	131.51%	❌
PeytonManning	MAE_val	0.58162	0.34778	-40.21%	🎉
PeytonManning	RMSE_val	0.72218	0.49928	-30.86%	🎉
PeytonManning	Loss_val	0.01239	0.00595	-51.96%	🎉
PeytonManning	MAE	0.41671	0.35123	-15.71%	🎉
PeytonManning	RMSE	0.55961	0.48066	-14.11%	🎉
PeytonManning	Loss	0.00612	0.0046	-24.8%	🎉
PeytonManning	time	12.73383	49.71	290.38%	❌
AirPassengers	MAE_val	13.0627	30.1272	130.64%	❌
AirPassengers	RMSE_val	15.94532	31.014	94.5%	❌
AirPassengers	Loss_val	0.00131	0.00371	182.63%	❌
AirPassengers	MAE	9.88153	6.27639	-36.48%	🎉
AirPassengers	RMSE	11.73543	7.88805	-32.78%	🎉
AirPassengers	Loss	0.00052	0.00019	-62.92%	🎉
AirPassengers	time	5.30237	20.39	284.54%	❌
YosemiteTemps_test30	MAE_val	-	1.8	-
YosemiteTemps_test30	RMSE_val	-	2.2687	-
YosemiteTemps_test30	Loss_val	-	0.001	-
YosemiteTemps_test30	MAE	-	0.8432	-
YosemiteTemps_test30	RMSE	-	1.4617	-
YosemiteTemps_test30	Loss	-	0.0003	-
YosemiteTemps_test30	time	-	119.61	-
PeytonManning_test30	MAE_val	-	0.9731	-
PeytonManning_test30	RMSE_val	-	1.128	-
PeytonManning_test30	Loss_val	-	0.0317	-
PeytonManning_test30	MAE	-	0.33	-
PeytonManning_test30	RMSE	-	0.4574	-
PeytonManning_test30	Loss	-	0.0044	-
PeytonManning_test30	time	-	43	-
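
The `diff` column above appears to be the relative change of `current` against `main`, in percent. The sketch below reproduces that calculation under this assumption; the function name `relative_change_pct` is hypothetical, and rows with no `main` value (shown as `-`) are treated as having no diff.

```python
from typing import Optional


def relative_change_pct(main: Optional[float], current: float) -> Optional[float]:
    """Relative change of `current` versus `main`, in percent.

    Returns None when there is no baseline (new benchmarks shown as "-").
    Hypothetical helper; assumes diff = (current - main) / main * 100.
    """
    if main is None or main == 0:
        return None
    return (current - main) / main * 100.0


# Values taken from the YosemiteTemps rows of the table above.
mae_val_diff = relative_change_pct(1.34899, 0.44287)
time_diff = relative_change_pct(59.72889, 138.28)
# New benchmark without a baseline on main.
new_benchmark_diff = relative_change_pct(None, 25.3608)

print(round(mae_val_diff, 2), round(time_diff, 2), new_benchmark_diff)
```

For error metrics and runtime alike, a negative value means the current branch is lower than main, which is why reductions are marked 🎉 and increases ❌.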

Model training plots

[Model Training plots: PeytonManning, YosemiteTemps, AirPassengers]

ourownstory added 5 commits October 17, 2023 16:02

add tabulat

e00d6d4

add tabulate with comment

b288c6d

update lock

627170a

Merge branch 'main' into model-tests-expansion

ba601f3

update tests with dual test splits: short and long

d4d8460

ourownstory merged commit f392aec into main Oct 17, 2023

ourownstory deleted the model-tests-expansion branch October 17, 2023 23:49

ourownstory mentioned this pull request Oct 18, 2023

[Devops] update metrics ci with new tests #1457

Merged

ourownstory changed the title ~~Model tests expansion~~ [Devops] Model tests expansion Oct 18, 2023

ourownstory changed the title ~~[Devops] Model tests expansion~~ [tests] Model tests expansion Dec 8, 2023

ourownstory changed the title ~~[tests] Model tests expansion~~ [tests] Model tests expansion (partially undone in #1464) Dec 8, 2023
