This repository has been archived by the owner on Apr 24, 2024. It is now read-only.

How-to guide for finding the optimal alpha in a Ridge regression #27

Open
PicoCentauri opened this issue Mar 2, 2023 · 0 comments
Labels
documentation (Improvements or additions to documentation), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@PicoCentauri
Collaborator

Since we have no RidgeCV, we should provide a short how-to that loops over a range of alpha values and picks the best one. It should also show a plot of the root-mean-square error (RMSE) versus ln(alpha).

Below is a small code snippet that can serve as a starting point.

import warnings

import numpy as np
from numpy.linalg import LinAlgError

# Ridge, X_train, y_train, X_test and y_test are assumed to be defined already.
alpha_values = np.logspace(-10, 8, 10)

# Force (f) and energy (e) errors; NaN marks alphas for which the fit failed.
l_rmse_f_train = np.full(len(alpha_values), np.nan)
l_rmse_f_test = np.full(len(alpha_values), np.nan)

l_rmse_e_train = np.full(len(alpha_values), np.nan)
l_rmse_e_test = np.full(len(alpha_values), np.nan)

clf = Ridge(parameter_keys=["values", "force"])

for i_alpha, alpha_value in enumerate(alpha_values):
    error_prefix = f"{alpha_value}"
    try:
        with warnings.catch_warnings(record=True) as w:
            # Record every warning, also repeated ones from later iterations.
            warnings.simplefilter("always")
            clf.fit(X_train, y_train, alpha=alpha_value, solver="solve")
            if len(w) > 0:
                print(f"warning: {error_prefix}: {w[0].message}")
    except LinAlgError as e:
        print(f"error: {error_prefix}: {e}")
        continue

    # Take the force error (gradient with respect to positions) as scorer.
    l_rmse_f_train[i_alpha] = clf.score(X_train, y_train, parameter_key='positions')[0]
    l_rmse_f_test[i_alpha] = clf.score(X_test, y_test, parameter_key='positions')[0]

    l_rmse_e_train[i_alpha] = clf.score(X_train, y_train, parameter_key='values')[0]
    l_rmse_e_test[i_alpha] = clf.score(X_test, y_test, parameter_key='values')[0]

# Pick the alpha with the lowest test force error, ignoring failed fits (NaN).
best_idx = np.nanargmin(l_rmse_f_test)

best_alpha = alpha_values[best_idx]
best_rmse_train = l_rmse_f_train[best_idx]
best_rmse_test = l_rmse_f_test[best_idx]
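
For the plot of the RMSE versus ln(alpha), something along the following lines might work. This is only a sketch: `plot_rmse_vs_ln_alpha` is a hypothetical helper, and the data comes from a plain NumPy closed-form ridge fit on synthetic random numbers, used here as a stand-in because the snippet above needs this package's Ridge and real training data. In the how-to, the arrays from the loop above (for example `l_rmse_f_train` and `l_rmse_f_test`) would be passed in instead.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np


def plot_rmse_vs_ln_alpha(alpha_values, rmse_train, rmse_test):
    """Plot train and test RMSE against ln(alpha) and mark the best alpha."""
    ln_alpha = np.log(np.asarray(alpha_values, dtype=float))
    rmse_train = np.asarray(rmse_train, dtype=float)
    rmse_test = np.asarray(rmse_test, dtype=float)

    fig, ax = plt.subplots()
    # NaN entries (failed fits) leave gaps in the curves instead of raising.
    ax.plot(ln_alpha, rmse_train, "o-", label="train")
    ax.plot(ln_alpha, rmse_test, "s-", label="test")

    # Mark the alpha with the lowest test RMSE, ignoring NaN entries.
    best_idx = int(np.nanargmin(rmse_test))
    ax.axvline(ln_alpha[best_idx], color="gray", linestyle="--", label="best alpha")

    ax.set_xlabel("ln(alpha)")
    ax.set_ylabel("RMSE")
    ax.set_yscale("log")
    ax.legend()
    return fig, ax, best_idx


# Synthetic stand-in data (assumption): a noisy linear model, NOT real results.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
y = X @ rng.normal(size=20) + 0.5 * rng.normal(size=60)
X_tr, X_te, y_tr, y_te = X[:40], X[40:], y[:40], y[40:]

alphas = np.logspace(-10, 8, 10)
rmse_tr = np.full(len(alphas), np.nan)
rmse_te = np.full(len(alphas), np.nan)

for i, alpha in enumerate(alphas):
    # Closed-form ridge solution: w = (X^T X + alpha * I)^-1 X^T y
    gram = X_tr.T @ X_tr + alpha * np.eye(X_tr.shape[1])
    w = np.linalg.solve(gram, X_tr.T @ y_tr)
    rmse_tr[i] = np.sqrt(np.mean((X_tr @ w - y_tr) ** 2))
    rmse_te[i] = np.sqrt(np.mean((X_te @ w - y_te) ** 2))

fig, ax, best_idx = plot_rmse_vs_ln_alpha(alphas, rmse_tr, rmse_te)
```

Calling `fig.savefig("rmse_vs_ln_alpha.png")` afterwards would write the figure to disk. Plotting both the train and the test curve makes it easier to see where the model starts to overfit (small alpha) or underfit (large alpha).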

To split the data into X_train and X_test, one could use equistore.split...
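
Until the split helper is settled, the sample indices for the train and test sets could be drawn with plain NumPy, roughly as sketched below. `train_test_indices` is a hypothetical helper and not part of equistore; how the resulting indices are applied to the actual data objects is left open here, since the issue does not show that API.

```python
import numpy as np


def train_test_indices(n_samples, test_fraction=0.2, seed=0):
    """Return disjoint, shuffled (train, test) index arrays."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")

    rng = np.random.default_rng(seed)
    permutation = rng.permutation(n_samples)

    # Keep at least one test sample, even for very small datasets.
    n_test = max(1, int(round(test_fraction * n_samples)))
    return permutation[n_test:], permutation[:n_test]


# Example: 10 samples, 30 % of them held out for testing.
train_idx, test_idx = train_test_indices(10, test_fraction=0.3, seed=42)
```

Fixing the seed keeps the split reproducible, so that the RMSE curves for different alpha values are compared on the same test set.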

PicoCentauri added the documentation, good first issue, and help wanted labels on Mar 2, 2023