-
This is always the case when I use lmfit for batch fitting. It is clear that the shapes of the curves are similar, but the fitting results differ greatly. Is there any way to improve this? Here is my code:

```python
from lmfit import Minimizer, Model, Parameters, create_params, report_fit
import numpy as np
import pandas as pd
from scipy import interpolate
from sklearn.metrics import r2_score, mean_squared_error


def lm_fit(fit_df, model):
    fit_df_scale = fit_df / (fit_df.max() - fit_df.min())
    x_k = fit_df['Separation(nm)'].max() - fit_df['Separation(nm)'].min()
    y_k = fit_df['Cor_Force(nN)'].max() - fit_df['Cor_Force(nN)'].min()
    fit_model = Model(model, independent_vars=['x'], nan_policy='omit')
    params = fit_model.make_params(
        Z=dict(value=0.08 / y_k, min=0, max=0.1 / y_k),
        A_H=dict(value=0.05 / (x_k * y_k), min=1e-5 / (x_k * y_k), max=1 / (x_k * y_k)),
        κ=dict(value=0.1 * x_k, min=0.001 * x_k, max=10000 * x_k),
    )
    params.add('R', value=30 / x_k, vary=False)
    if model is func4:
        params.add('x0', value=0, min=-0.1, max=0.1)
    if model is func5:
        params.add('x0', value=0, min=-0.1, max=0.1)
        params.add('y0', value=0, min=-0.1, max=0.1)
    if model is func6:
        params.add('x0', value=0, min=-0.1, max=0.1)
        params.add('y0', value=0, min=-0.1, max=0.1)
        params.add('Z_H', value=2 * x_k / y_k, min=0, max=100 * x_k / y_k)
        params.add('D_H', value=5 / x_k, min=1e-10 / x_k, max=100 / x_k)
    x = np.array(fit_df_scale.iloc[:, 0])
    data = np.array(fit_df_scale.iloc[:, 1])
    # tol = 1e-13
    result = fit_model.fit(data, params=params, x=x, max_nfev=20000, method='nelder-mead')
    final = result.best_fit
    # params_dict = result.params.valuesdict()
    r2 = r2_score(data, final)
    return result, r2, final


def func6(x, Z, A_H, κ, Z_H, D_H, x0, y0, R=30):
    epsilon = 1e-10
    # Make sure no denominator can be exactly zero
    valid_x = np.where((x - x0) != 0, x - x0, epsilon)
    valid_x_R = np.where((x - x0 + 2 * R) != 0, x - x0 + 2 * R, epsilon)
    # Evaluate the model formula
    result = (
        Z * R * κ * np.exp(-valid_x * κ)
        - (2 * A_H * R ** 3) / (3 * valid_x ** 2 * valid_x_R ** 2)
        - Z_H * R * np.exp(-valid_x / D_H)
        + y0
    )
    return result


result, r2, y_pred = lm_fit(fit_df, func6)
# fit_df is my data, which contains two columns, each with approximately 100 rows
```
-
@xmuworker It's hard for us to give specific advice about any particular "bad fit", especially without seeing a fit report. I would make a few suggestions of what to look at:

a) The fit report is the main result of a fit. Importantly for your fits, it will tell you if a fit got stuck at a bound, if some parameter was not moved from its initial value or went to some crazy value, or if you hit the limit on the number of function evaluations. It will also give you the uncertainties in the parameters. For example, in your "bad fit", why did […]

b) Seeing bounds on parameter values set programmatically always worries me. I admit that I sometimes do this myself, but only when I feel like I understand the "physical/meaningful" values. The way you are setting bounds seems "mostly not too scary to me" (assuming that the dataframes are not causing […])

c) Your […]

If those don't guide you to better results, I suggest posting a more complete example of one of the "not very good" fits.
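To make point a) concrete, here is a minimal, library-free sketch of the stuck-at-a-bound check that the fit report lets you do by eye. The helper name `params_at_bounds`, the `(value, min, max)` dict format, and the `rel_tol` threshold are all illustrative choices, not lmfit's API:

```python
def params_at_bounds(params, rel_tol=1e-3):
    """Return names of parameters whose best-fit value sits within
    rel_tol of the bound span from either bound -- a common sign
    that the fit got stuck.

    params maps name -> (value, min, max); this format is
    illustrative, not lmfit's own.
    """
    stuck = []
    for name, (value, lo, hi) in params.items():
        span = hi - lo
        if span <= 0:
            continue  # fixed or degenerate bounds, nothing to check
        if (value - lo) < rel_tol * span or (hi - value) < rel_tol * span:
            stuck.append(name)
    return stuck


# Example: Z ended up essentially at its upper bound of 0.1/y_k
print(params_at_bounds({'Z': (0.09999, 0.0, 0.1)}))   # → ['Z']
print(params_at_bounds({'kappa': (5.0, 0.001, 1000.0)}))  # → []
```

With lmfit itself, the same information can be read directly from `result.fit_report()` or `report_fit(result)`, which also show parameter uncertainties and correlations.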
Yeah, if either `-valid_x * κ > 700` or `-valid_x / D_H > 700`, you'll have problems with `np.inf`. You do have a check that helps guard so that `valid_x**2 * valid_x_R**2` cannot be tiny. You might also check that `κ` and `D_H` are not so far off that the exponentials give you `np.inf`.

Again, I am generally very suspicious when I see bounds on variable Parameters being generated from the data. OTOH, clipping the arguments to exponentials, either at run time or by setting bounds appropriately, seems like a good thing to do.
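A sketch of the run-time clipping suggested above, applicable to the two exponentials in `func6`. The helper name `safe_exp` and the cap of 700 (just under the ~709 overflow threshold of `np.exp` for float64) are assumptions of this sketch:

```python
import numpy as np


def safe_exp(arg, cap=700.0):
    # np.exp overflows to inf for float64 arguments above roughly 709,
    # so clip the exponent into a safe range before evaluating.
    return np.exp(np.clip(arg, -cap, cap))


# Inside func6, the raw exponentials could then be replaced with:
#   Z * R * κ * safe_exp(-valid_x * κ)
#   Z_H * R * safe_exp(-valid_x / D_H)
```

Values clipped at `-cap` evaluate to a tiny but finite number rather than raising warnings, and values clipped at `+cap` stay finite, so the minimizer never sees `inf` or `nan` from these terms.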