BUG: REGR: read_csv with memory_map=True raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 262143: unexpected end of data #43647
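
The error in the title is what Python reports when a multi-byte UTF-8 character is cut in two by a fixed-size read and each piece is decoded on its own. The sketch below reproduces that class of failure with the standard library only. It is an illustration, not the pandas code path or the fix in this PR: the 262144-byte chunk size is an assumption inferred from the reported position 262143, and the helper names `decode_naive` and `decode_incremental` are hypothetical.

```python
import codecs

# Assumed read size (2**18), inferred from "position 262143" in the error message.
CHUNK_SIZE = 262144

# "ą" (U+0105) encodes to b"\xc4\x85" in UTF-8. Padding with ASCII puts its
# lead byte 0xc4 at index 262143, the last byte of the first chunk.
text = "a" * (CHUNK_SIZE - 1) + "ą\n"
data = text.encode("utf-8")
chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def decode_naive(chunks):
    """Decode every chunk independently; fails if a character spans two chunks."""
    return "".join(chunk.decode("utf-8") for chunk in chunks)


def decode_incremental(chunks):
    """Decode with an incremental decoder that buffers incomplete byte sequences."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = [decoder.decode(chunk) for chunk in chunks]
    # final=True flushes the decoder and raises if bytes are still pending.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


try:
    decode_naive(chunks)
    naive_error = None
except UnicodeDecodeError as exc:
    naive_error = exc

print(naive_error)
print(decode_incremental(chunks) == text)
```

Decoding chunk by chunk fails on the first chunk because it ends with a lone lead byte, while the incremental decoder carries that byte over to the next chunk and reconstructs the character.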

Merged: 6 commits merged into pandas-dev:master on Sep 21, 2021

Conversation

michal-gh

pandas/io/common.py (resolved)
pandas/tests/io/test_common.py (outdated, resolved)
@twoertwein added the "IO CSV" (read_csv, to_csv) and "Regression" (functionality that used to work in a prior pandas version) labels on Sep 18, 2021
@jreback changed the title from "Gh43540" to "BUG: REGR: read_csv with memory_map=True raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 262143: unexpected end of data" on Sep 20, 2021
@jreback added this to the 1.3.4 milestone on Sep 20, 2021
@@ -19,6 +19,7 @@ Fixed regressions
- Fixed performance regression in :meth:`MultiIndex.equals` (:issue:`43549`)
- Fixed regression in :meth:`Series.cat.reorder_categories` failing to update the categories on the ``Series`` (:issue:`43232`)
- Fixed regression in :meth:`Series.cat.categories` setter failing to update the categories on the ``Series`` (:issue:`43334`)
- Fixed regression in :meth:`pandas.read_csv` raising UnicodeDecodeError exception when memory_map=True (:issue:`43540`)

Suggested change
- Fixed regression in :meth:`pandas.read_csv` raising UnicodeDecodeError exception when memory_map=True (:issue:`43540`)
- Fixed regression in :meth:`pandas.read_csv` raising `UnicodeDecodeError` exception when `memory_map=True` (:issue:`43540`)

Fixed in the last commit.

@jreback left a comment

doc string comment

This reverts commit c2e91f8.
@michal-gh

I updated the whatsnew docstring according to @jreback's request.

@jreback merged commit 5a369d6 into pandas-dev:master on Sep 21, 2021

jreback commented Sep 21, 2021

thanks @michal-gh very nice. always happy to take further updates to master.

jreback commented Sep 21, 2021

@meeseeksdev backport 1.3.x

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Sep 21, 2021
…e raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 262143: unexpected end of data

lumberbot-app bot commented Sep 21, 2021

Something went wrong ... Please have a look at my logs.

simonjayhawkins pushed a commit that referenced this pull request Sep 22, 2021
…ap=True raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 262143: unexpected end of data ) (#43691)

Co-authored-by: michal-gh <michaltus@gmail.com>
Co-authored-by: Thomas Li <47963215+lithomas1@users.noreply.github.com>
@michal-gh

Thank you @twoertwein and @jreback for your help with this PR.

@michal-gh deleted the gh43540 branch on September 22, 2021 17:14