The function dedupe_dataframe changes the dtypes of the input dataframe (e.g. float to string). One would expect this function not to change its input arguments. One could prevent this by modifying a copy of the original dataframe.
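As a caller-side workaround, the copy idea can be sketched without touching the library. This is only an illustration: `call_on_copy` and `stringify_in_place` are hypothetical names, and `stringify_in_place` is a stand-in that mimics the reported float-to-string conversion, not the real dedupe_dataframe.

```python
import pandas as pd


def call_on_copy(func, df, *args, **kwargs):
    """Call func on a deep copy of df so the caller's dataframe is untouched."""
    return func(df.copy(), *args, **kwargs)


def stringify_in_place(df):
    """Hypothetical stand-in for a function that rewrites dtypes of its input."""
    for col in df.columns:
        df[col] = df[col].astype(str)
    return df


original = pd.DataFrame({"name": ["a", "b"], "price": [1.0, 2.5]})
result = call_on_copy(stringify_in_place, original)
# `original` keeps its float column; only `result` holds the string version.
```

The cost is one extra copy of the dataframe in memory, which may matter for large inputs.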
Thanks for the comment. Implementing your suggestion would provide more abstraction than is my intention. In fact, I actually dislike that the program currently abstracts the pre-processing step (which converts float to string). In the future, I may separate out the pre-processing step to make that conversion process explicit. What do you think?
IIWY, I'd generate a hash ID for each record prior to running pandas-dedupe, then join accordingly afterwards - it should only be a few lines of code. Let me know if you need more guidance on that.
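A minimal sketch of that hash-ID approach, under some assumptions: `add_record_id`, `restore_dtypes`, and the `record_id` column are hypothetical names, and the deduplication step is expected to keep the ID column and append its own result columns to the dataframe. The row index is included in the hash so that rows with identical values still get distinct IDs, which requires a unique index.

```python
import hashlib

import pandas as pd


def add_record_id(df, id_col="record_id"):
    """Return a copy of df with a hash ID per row (index + values)."""
    out = df.copy()
    out[id_col] = [
        hashlib.sha1(repr(row).encode("utf-8")).hexdigest()
        for row in df.itertuples(index=True, name=None)
    ]
    return out


def restore_dtypes(original, processed, id_col="record_id"):
    """Join the columns added by processing back onto the original dataframe.

    `original` must already carry id_col. Columns that exist in both frames
    are taken from `original`, so their dtypes are preserved.
    """
    new_cols = [c for c in processed.columns if c not in original.columns]
    return original.merge(processed[[id_col] + new_cols], on=id_col, how="left")
```

Usage would be roughly: call `add_record_id` on the raw data, pass a copy of the tagged dataframe to the deduplication step, then call `restore_dtypes(tagged, deduped)` to get the original float columns alongside the new result columns.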