DataFrame.to_string truncates long strings #9784

michaelaye · 2015-04-02T02:12:17Z

I am calling to_string() without any parameters and it beautifully fixed-formatted my dataframe apart from my very wide filename column, that is being truncated with "...". How can I avoid that?

                                            FILENAME  OBS_ID  XUV  
0  'mvn_iuv_l1a_IPH3-cycle00007-mode040-muvdark_2...      40  MUV  
1  'mvn_iuv_l1a_IPH2-cycle00047-mode050-muvdark_2...      50  MUV  
2  'mvn_iuv_l1a_apoapse-orbit00127-mode2001-muvda...    2001  MUV  
3  'mvn_iuv_l1a_APP1-orbit00087-mode1031-fuvdark_...    1031  FUV  
4  'mvn_iuv_l1a_IPH2-cycle00005-mode060-fuvdark_2...      60  FUV

I tried calling it like this, but to no avail (same output):

with open('test_summary_out.txt','w') as f:
    f.write(summarydf.head().to_string(formatters={'filename':lambda x: "{:100}".format(x)}))

Version: 0.16 with Python 3.4


dsm054 · 2015-04-02T02:28:01Z

I think it picks that option up from display.max_colwidth. Does pd.set_option("display.max_colwidth", 10000) have an effect?
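A minimal sketch of this workaround, assuming a pandas version in which to_string still honors display.max_colwidth. The frame and its values below are made up for illustration. pd.option_context is used instead of pd.set_option so that the previous setting is restored afterwards:

```python
import pandas as pd

# Hypothetical data: one very long file name, similar in shape to the report above.
long_name = "mvn_iuv_l1a_" + "x" * 80 + ".fits"
df = pd.DataFrame({"FILENAME": [long_name], "OBS_ID": [40]})

# Raise the column-width limit only inside this block; the previous
# value of display.max_colwidth is restored when the block exits.
with pd.option_context("display.max_colwidth", 10000):
    text = df.to_string()

print(text)
```

The context manager keeps the change local, which may matter if the rest of a session relies on the default display width.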

michaelaye · 2015-04-02T16:35:20Z

yes, that solved it, thx. I guess it should not pick that up for a to_string() operation, as that is not display? Or, well it is, but maybe one needs a different to_textfile() method that avoids this to be picked up.

jreback · 2015-04-02T20:10:17Z

@dsm054 I think it might be worthwhile to point this out in here, maybe in a note box?

JaysonSunshine · 2017-08-09T22:39:37Z

Is this issue resolved? I am getting this same issue on 0.20.3

gfyoung · 2017-08-09T22:40:59Z

There were no changes, just further documentation.

JaysonSunshine · 2017-08-09T22:44:35Z

Is the solution to modify the display settings? That seems pretty unsatisfactory.

gfyoung · 2017-08-09T22:45:32Z

@jreback : Thoughts?

michaelaye · 2017-08-09T22:49:22Z

I would argue that the "to_string" method should be independent from a display setting for real-time analysis. A string object is not necessarily being used for display purposes.

JaysonSunshine · 2017-08-09T22:50:46Z

In my present case, I am accessing a Redshift admin table to get a table DDL. The data frame has just that column/DDL, but I want to modify it in memory using a string operation -- split(';'). I think the to_string operation should definitely not carry over any display settings.
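For this kind of in-memory string processing, one option that sidesteps to_string entirely is to work on the cell values themselves. A rough sketch, assuming the DDL arrives as one text fragment per row; the column name ddl and the statements are invented for the example:

```python
import pandas as pd

# Hypothetical one-column frame standing in for the DDL query result.
df = pd.DataFrame(
    {"ddl": ["CREATE TABLE t (id INT);", "ALTER TABLE t ADD COLUMN name VARCHAR(256);"]}
)

# Join the raw cell values; no display option is consulted here.
ddl_text = "\n".join(df["ddl"])

# Split into individual statements and drop empty fragments.
statements = [s.strip() for s in ddl_text.split(";") if s.strip()]
print(statements)
```

Because the strings never pass through the formatting code, neither truncation nor padding can occur.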

JaysonSunshine · 2017-08-09T22:51:44Z

Or, at least, to have a parameter we can toggle. That could work.

jorisvandenbossche · 2017-08-10T09:06:25Z

Yes, I don't think the documentation addition really solved this issue.

The max_colwidth option is used by the DataFrameFormatter.to_string without being able to change it. At least we could add a keyword to be able to override it without needing to change the display settings.
But if you look at other options like display.max_rows or max_columns, those are ignored by to_string. So I think it even makes sense to ignore max_colwidth as well (in any case, to be able to ignore the option, it will have to be added as a keyword, so the output formatting code can pass the correct setting).
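For readers on later releases: to my knowledge, pandas 1.0 and newer accept a max_colwidth keyword on DataFrame.to_string and no longer truncate by default. The sketch below assumes such a version; on older releases the keyword does not exist and the call would raise a TypeError. The data is invented:

```python
import pandas as pd

long_value = "a" * 120
df = pd.DataFrame({"text": [long_value]})

# Assumes pandas >= 1.0: the default output is not truncated.
full = df.to_string()

# Truncation becomes an explicit, per-call choice.
short = df.to_string(max_colwidth=20)

print(short)
```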

jorisvandenbossche · 2017-08-10T09:08:20Z

So for me, PR welcome for this!

jorisvandenbossche · 2017-08-10T09:58:22Z

#1852 is probably a duplicate of this

matanox · 2018-01-19T18:29:57Z

I can hardly see how the coupling of the display limit with any other processing helps in solving this benign scenario. Nor do I see how padding the strings, which I notice takes place as well, helps outside the display scenario. If there's too much history behind it, would you recommend using Python's plain csv package for reading strings without modifying them?

Here's a naive code sample, if it helps anyone ―

import csv

import pandas as pd

# Read the 'message' column with the standard csv module, leaving the strings untouched
messages = []
with open("csv-file") as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        messages.append(row['message'])
messages

messages_df = pd.DataFrame(messages, columns=['message'])

# then concat to your main DataFrame...

This seems to avoid the truncation (but not the padding, which hurts a little with right-to-left text, as it pads as if the text is left-to-right, which kind of skews the semantics of the text more in the case of right-to-left text)

Momut1 · 2019-03-25T15:34:21Z

two days debugging to find out this was the issue. i'm sad...

simonjayhawkins · 2019-04-01T10:24:17Z

see also #24841 for fix in to_html

addahlin · 2019-04-11T18:16:27Z

This was incredibly frustrating to debug. I was executing the code below and getting "..." in my output. I assumed it was just printing the "..." to the console, not in the dataframe! While I'm sure this isn't a "good" approach to wrapping text in a tag, it's the most obvious way when starting out. I'm sure many people will do this and no one would expect this behavior.

df["ValueType"] = "<strong>" + df["ValueType"] == "Portfolio"] + "</strong>"

If this were my first experience with Pandas, I'd promptly throw it in the trash. Note: I love Pandas and thank you everyone for amazing work you do! I just wanted to share my experience.

rswgnu · 2019-07-29T21:33:37Z

Quoting michaelaye: "yes, that solved it, thx. I guess it should not pick that up for a to_string() operation, as that is not display? Or, well it is, but maybe one needs a different to_textfile() method that avoids this to be picked up."

With Pandas 0.25.0, setting display.max_colwidth to a large number stops the truncation but when trying to left justify columns with df.to_string(justify='left'), that same display setting somehow pads columns on the left so they are not left aligned. Is there any present way to prevent truncation and get left justified string columns when output to a terminal? I know a pull request is in process but I would like to do this now. Thanks.
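One workaround that may help here is to pad the values yourself through the formatters argument, so that every cell in the column already has the same width before pandas aligns it. This is only a sketch with invented data, not a confirmed fix for the behavior described above:

```python
import pandas as pd

df = pd.DataFrame({"name": ["ab", "abcdefgh", "abcd"], "n": [1, 2, 3]})

# Pad every value in the column to the width of its longest entry.
width = int(df["name"].str.len().max())

text = df.to_string(
    formatters={"name": lambda s: s.ljust(width)},
    justify="left",  # justify controls the alignment of the column headers
)
print(text)
```

Since all formatted values end up equally wide, pandas has no extra padding left to add in front of them.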

yamen321 · 2019-08-15T18:06:35Z

May I ask what the use case of having the to_string method dependent on the display.max_colwidth option? I can't seem to understand why one would ever ask for a DataFrame row as a string with truncated column values

TomAugspurger · 2019-08-15T20:34:52Z

@yamen321 I think it's agreed that to_string shouldn't truncate. Are you interested in working on it?

lshepard · 2019-08-21T03:11:18Z

Hi! I jumped in on the "good first issue" label and put up a PR to solve this. Feedback very welcome.

yamen321 · 2019-08-21T20:55:49Z

Thanks a lot for taking the initiative on this @lshepard!

santhoshnumberone · 2019-10-21T08:13:08Z

Hey

I have a dataframe column with url file name like

0 http://address/filename1.jpg
1 http://address/filename2.jpg
Name: fileUrl, dtype: object

I want to extract the filenames from the url

so

from pathlib import Path
filenamelist = df.apply(lambda x: Path(x.to_string()).name if x.name == 'fileUrl' else x)

I just want the file

0 filename1.jpg
1 filename2.jpg

If the filename is long string my output looks like

filena...
filena...

Setting df.fileUrl.max_colwidth = 100 does not solve the issue.

I thought using the dataframe directly would be much faster than doing these steps by hand:

select the column
iterate through the column elements
extract the name

Is there any workaround other than this?

filenames_list = [str(Path(x).name) for x in list(df['fileUrl'])]
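The truncation in the apply version comes from x.to_string(), which renders the whole column as display text. A sketch of an alternative that works on the values directly, using the pandas .str accessor; the URLs are invented, and the second one is deliberately long:

```python
import pandas as pd

long_file = "a_very_long_file_name_" + "x" * 60 + ".jpg"
df = pd.DataFrame(
    {"fileUrl": ["http://address/filename1.jpg", "http://address/" + long_file]}
)

# Split each URL once at its last "/" and keep the final piece.
# This operates on the stored strings, not on their printed form.
filenames = df["fileUrl"].str.rsplit("/", n=1).str[-1]
print(filenames.tolist())
```

The list comprehension above gives the same result; the .str version just keeps the output as a Series with the original index.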

jreback added the Usage Question label Apr 2, 2015

jreback added Docs Good as first PR labels Apr 2, 2015

jreback added this to the Next Major Release milestone Apr 2, 2015

prabhjotsumman mentioned this issue Mar 21, 2016

Fix : 9784 Added the note for display.max_colwidth in dsintro.rst #12682

Closed

4 tasks

jreback modified the milestones: 0.18.1, Next Major Release Apr 4, 2016

jreback closed this as completed in 610d3d5 Apr 4, 2016

jorisvandenbossche removed Docs Usage Question labels Aug 10, 2017

jorisvandenbossche modified the milestones: Next Major Release, 0.18.1 Aug 10, 2017

jorisvandenbossche reopened this Aug 10, 2017

TomAugspurger added the good first issue label Oct 11, 2017

jreback added good first issue and removed good first issue Difficulty Novice labels Dec 15, 2017

simonjayhawkins mentioned this issue Apr 3, 2019

REF: handling of max_colwidth parameter #25977

Closed

4 tasks

lshepard mentioned this issue Aug 21, 2019

Make DataFrame.to_string output full content by default #28052

Merged

5 tasks

simonjayhawkins added the Output-Formatting __repr__ of pandas objects, to_string label Aug 25, 2019

TomAugspurger modified the milestones: Contributions Welcome, 1.0 Aug 30, 2019

WillAyd closed this as completed in #28052 Sep 16, 2019
