Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for comparison issue in columns_equal when ignore_case=True (Generated by Ana - AI SDE) #329

Closed
wants to merge 1 commit into from

Conversation

ana-ai-sde
Copy link

Description

This PR addresses an issue in the columns_equal function where comparing datetime.date columns incorrectly returns false differences when ignore_case=True. The patch ensures that string operations are only performed on appropriate types, resolving this issue.

This patch was generated by Ana - AI SDE, an AI-powered software development assistant.

This is a fix for Issue 327

Copy link
Member

@fdosani fdosani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ana-ai-sde There seem to be some issues with the code. Also there are no new unit tests included in the PR.

@@ -749,6 +749,9 @@ def render(filename: str, *fields: Union[int, float, str]) -> str:
return file_open.read().format(*fields)


import pandas as pd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be placed at the top of the file as it doesn't conform to our code quality guidelines.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @fdosani
Thanks for highlighting this. We will look into this as to why this happened.

BR,
Team Ana

@@ -816,9 +819,9 @@ def columns_equal(
col_2 = col_2.str.strip()

if ignore_case:
if col_1.dtype.kind == "O":
if col_1.dtype == 'object' and col_1.apply(lambda x: isinstance(x, str)).all():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems incorrect. The dtype would return an O not object AFAIK.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @fdosani

The comment is partially correct. dtype.kind returns 'O' for object types, while dtype itself returns 'object'. Both col_1.dtype.kind == "O" and col_1.dtype == 'object' are valid ways to check for object dtypes. The proposed change adds an extra check to ensure all elements are strings, which may be useful depending on the specific requirements of the comparison.

image

BR,
Team Ana

@ana-ai-sde
Copy link
Author

@ana-ai-sde There seem to be some issues with the code. Also there are no new unit tests included in the PR.

Hi @fdosani
For the test cases, we will incorporate the ones that Ana utilized internally within the same pull request after ensuring compatibility with the repository guidelines.

BR,
Team Ana

@fdosani
Copy link
Member

fdosani commented Sep 3, 2024

Thanks for the PR, at this time the SDE contributions don't meet our contributor guidelines. Closing this for as of now.

@fdosani fdosani closed this Sep 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants