pandas-dedupe installs the latest version of dedupe, which is 3.0.3 as of now. However, when field_properties is passed in df_final = pandas_dedupe.dedupe_dataframe(df=df, field_properties=[...]), dedupe raises the following error:
  File "/.../lib/python3.11/site-packages/dedupe/api.py", line 1141, in __init__
    self.data_model = datamodel.DataModel(variable_definition)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../lib/python3.11/site-packages/dedupe/datamodel.py", line 32, in __init__
    raise ValueError(
ValueError: It looks like you are trying to use a variable definition composed of dictionaries. dedupe 3.0 uses variable objects directly. So instead of [{"field": "name", "type": "String"}] we now do [dedupe.variables.String("name")].
A quick and dirty fix that let me use dedupe>=3.0.3 (just to unblock myself) was to update the utility function pandas_dedupe.utility_functions.select_fields(fields, field_properties) with:
    if isinstance(i, String):
        fields.append(i)
Where i is of type dedupe.variables.String, instead of:
    if type(i) == str:
        fields.append({'field': i, 'type': 'String'})
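A slightly less invasive variant of this workaround might keep accepting plain column names and wrap them in variable objects, so existing calls with a list of strings would keep working. The sketch below is only an illustration: the name select_fields_v3 and the string_variable parameter are hypothetical and not part of pandas-dedupe, and only the plain-string case shown above is handled. The constructor is injected so the snippet runs without dedupe installed.

```python
from typing import Any, Callable, List


def select_fields_v3(
    fields: List[Any],
    field_properties: List[Any],
    string_variable: Callable[[str], Any],
) -> List[Any]:
    """Hypothetical replacement for the string branch of select_fields.

    Plain column names are wrapped with `string_variable`; anything else is
    assumed to already be a dedupe 3.x variable object and is passed through.
    """
    for prop in field_properties:
        if isinstance(prop, str):
            # Old behaviour built {'field': prop, 'type': 'String'} here.
            fields.append(string_variable(prop))
        else:
            # Assumption: the caller already built a variable object.
            fields.append(prop)
    return fields
```

With dedupe>=3.0 installed, dedupe.variables.String would be passed as string_variable, since the error message above shows that it takes the field name as its first argument.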
The last commit in this project dates from 4 years ago. Are there any plans to upgrade the package to be compatible with dedupe>=3.0 and drop compatibility with older versions? Is any help needed?