df.to_dict(orient=records) not forming exact columns as in the dataframe #25166

kebab-mai-haddi · 2019-02-05T19:42:12Z

While reading a CSV file, I try to get a list of column names via:

>>> df_attr = pd.read_csv("3BtTWMUQAMawXcCheUAOMlXU.csv", nrows=1)
>>> cols = list(df_attr)
>>> print(cols)

The output, i.e. the list of column names, is:

['Response ID', 'Group Context', 'name', 'roh', 'sex', 'age', 'agemonth', 'ms', 'married_in_the_last_1', 'aadhar_yn', 'aadhar_number', 'aadhar_picture', 'post_acc', 'life_ins', 'health_ins', 'asso_memb_Youth.club', 'asso_memb_cultural.group', 'asso_memb_JFM', 'asso_memb_Association.of.the.users.of.drinking.water', 'asso_memb_SHG', 'asso_memb_Semi.governmental.organizations', 'asso_memb_Farmers.association', 'asso_memb_Youth.association', 'asso_memb_Women.s.association', 'asso_memb_nothing', 'asso_memb_Semi.governmental.organizations.1', 'asso_memb_Farmers.association.1', 'asso_memb_Women.s.association.1', 'asso_memb_Youth.association.1', 'asso_memb_SHG_EDTXJEKHESFdTtQwMcIT', 'asso_memb_Association.of.the.users.of.drinking.water.1', 'asso_memb_JFM_EDTXJEKHESFdTtQwMcIT', 'asso_memb_cultural.group.1', 'asso_memb_Youth.club.1', 'asso_memb_nothing.1', 'बँकिंग', 'bank_acc', 'ac_linked_aadhar', 'jdy_yn', 'शिक्षण', 'edu_high', 'edu_informal', 'edu_inst', 'comp_lit', 'कौशल्य.विकासासंबंधी.प्रश्न.फक्त.१८.३५.वयोगटातील.लोक्काना.विचारावे', 'any_skills_development_training', 'receive_skill_training_from', 'member_get_training_from', 'व्यवसाय', 'occ', 'other', 'require_assistance_in_receiving_employment', 'mgnrega_yn', 'mgnregaapp_yn', 'mgnregawork_yn', 'mgnrega_days', 'scheme_memb_Old Pension Scheme', 'scheme_memb_Janani Suraksha Yojana', 'scheme_memb_Disability Benefits', 'scheme_memb_Scholarship', 'scheme_memb_Widow Pension', 'vill_name', 'census_village_sd_2011', 'census_district_2011', 'vill_name_taluka_code', 'census_subdistrict_2011', 'subdistrict_code', 'vill_name_gp_code', 'vill_name_taluka_name', 'district_code', 'vill_name_gp_name', 'vill_name_village_code', 'location_latitude', 'location_longitude', 'location_accuracy', 'hoh', 'contact', 'informant', 'rh', 'rhoth', 'hh_occu', 'religion', 'socialgrp', 'sc_health_id', 'census_country', 'state_name', 'state_code', 'age_married', 'village_code_census2011_raw', 'phase']

and rows via:

>>> import sys
>>> df = pd.read_csv("3BtTWMUQAMawXcCheUAOMlXU.csv", chunksize=2)
>>> for d in df:
...     d = d.to_dict(orient='records')  # one dict per row in this chunk
...     for r in d:
...         print(r)
...     sys.exit()  # stop after the first chunk

The output is:

{'_0': '74a7c6f8-94f3-4882-8ad7-9199313a1a51', '_1': 1, 'name': 'Suresh Kautik Patil', 'roh': 'Self', 'sex': 'Male', 'age': 62, 'agemonth': 2, 'ms': 'Married', 'married_in_the_last_1': 'No', 'aadhar_yn': 1, 'aadhar_number': 467172934356, 'aadhar_picture': 'Https://Collect-V2-Production.s3.Ap-South-1.Amazonaws.com/Nzrzdt5akq7uutju90bf%2Fhxbrh6bz57ihvawa3e4n%2Fze2gzezsov7polixljkg%2Fece8f806-3633-445D-8183-336460Cc6207', 'post_acc': 0, 'life_ins': 0, 'health_ins': 0, '_15': 0, '_16': 0, 'asso_memb_JFM': 0, '_18': 0, 'asso_memb_SHG': 0, '_20': 0, '_21': 0, '_22': 0, '_23': 0, 'asso_memb_nothing': 0, '_25': 0, '_26': 0, '_27': 0, '_28': 0, 'asso_memb_SHG_EDTXJEKHESFdTtQwMcIT': 0, '_30': 0, 'asso_memb_JFM_EDTXJEKHESFdTtQwMcIT': 0, '_32': 0, '_33': 0, '_34': 1, 'बँकिंग': nan, 'bank_acc': 1, 'ac_linked_aadhar': 1, 'jdy_yn': 1, 'शिक्षण': nan, 'edu_high': 'Secondary', 'edu_informal': 'Other', 'edu_inst': 'No Information', 'comp_lit': 'No', '_44': nan, 'any_skills_development_training': nan, 'receive_skill_training_from': nan, 'member_get_training_from': nan, 'व्यवसाय': nan, 'occ': 'Labourers', 'other': nan, 'require_assistance_in_receiving_employment': 'No', 'mgnrega_yn': 0, 'mgnregaapp_yn': nan, 'mgnregawork_yn': nan, 'mgnrega_days': nan, '_56': 'No', '_57': nan, '_58': nan, 'scheme_memb_Scholarship': nan, '_60': nan, 'vill_name': 'Cc2bf74e-5Baa-4A23-Ad9d-21Fef4517f41', 'census_village_sd_2011': 'Shindgavhan', 'census_district_2011': 'Nandurbar', 'vill_name_taluka_code': 3954, 'census_subdistrict_2011': 'Nandurbar', 'subdistrict_code': 3954, 'vill_name_gp_code': 182276, 'vill_name_taluka_name': 'Nandurbar', 'district_code': 497, 'vill_name_gp_name': 'Shidgavahan', 'vill_name_village_code': 525705, 'location_latitude': 21.4179219, 'location_longitude': 74.354738, 'location_accuracy': 3, 'hoh': 'Suresh Kautik Patil', 'contact': 'In(+91)-9374060682', 'informant': 'Suresh Kautik Patil', 'rh': 'Self', 'rhoth': nan, 'hh_occu': 'Labourers', 'religion': 'Hindu', 'socialgrp': 
'OBC', 'sc_health_id': 1, 'census_country': 'India', 'state_name': 'Maharashtra', 'state_code': 27, 'age_married': '<21 (For Boys)', 'village_code_census2011_raw': 525705, 'phase': 'Phase 3'}

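As you can see, the column name 'Response ID' is missing from the row dict: it appears as '_0', 'Group Context' appears as '_1', and other names containing spaces or dots (for example 'asso_memb_Youth.club', which shows up as '_15') are likewise replaced by positional keys. Note that df.iterrows() gave me all the correct column names.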

The first few lines (including the header) of my CSV file are here, so that this question can be treated as a minimal, complete, and verifiable example.
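The '_0', '_1' and '_15' keys in the output above have the same shape as the names that Python's collections.namedtuple generates with rename=True when a field name is not a valid identifier. The following is a minimal sketch of that renaming using only the standard library. It assumes, which this thread does not confirm, that the affected to_dict code path builds each row as a named tuple. The field list is a shortened, illustrative subset of the real header, so the dotted name lands at position 3 here rather than 15.

```python
from collections import namedtuple

# Illustrative subset of the CSV header from the report (not the full list).
fields = ["Response ID", "Group Context", "name", "asso_memb_Youth.club"]

# rename=True replaces any field name that is not a valid Python identifier
# with an underscore followed by the field's position.
Row = namedtuple("Row", fields, rename=True)

# Map each original column name to the name the named tuple actually uses.
renamed = dict(zip(fields, Row._fields))
for original, new in renamed.items():
    print(f"{original!r} -> {new!r}")
```

Names that are already valid identifiers, such as 'name', pass through unchanged, which matches the mix of intact and replaced keys in the reported output.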


chris-b1 · 2019-02-05T20:31:17Z

Closed by #24965 - this is fixed in the 0.24.1 bugfix release
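For anyone who cannot upgrade off an affected release, one possible workaround is to build the row dicts directly from the column labels instead of calling to_dict(orient='records'). This is a sketch under stated assumptions, not the library's own implementation: the helper name records_with_exact_columns and the inline CSV text are hypothetical stand-ins for the reporter's file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the reporter's CSV: a few headers that are not
# valid Python identifiers, plus one row of placeholder values.
csv_text = (
    "Response ID,Group Context,name,asso_memb_Youth.club\n"
    "a1,1,x,0\n"
)
df = pd.read_csv(io.StringIO(csv_text))


def records_with_exact_columns(frame):
    """Return one dict per row, keyed by the frame's exact column labels."""
    columns = frame.columns.tolist()
    # name=None makes itertuples yield plain tuples, so no field renaming
    # can take place; the keys come straight from `columns`.
    return [
        dict(zip(columns, row))
        for row in frame.itertuples(index=False, name=None)
    ]


records = records_with_exact_columns(df)
print(list(records[0]) == list(df))
```

Because the keys are taken from df.columns rather than from tuple field names, spaces and dots in the header are preserved.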

kebab-mai-haddi · 2019-02-05T20:47:15Z

Thanks @chris-b1

chris-b1 closed this as completed Feb 5, 2019

chris-b1 added the Duplicate Report label Feb 5, 2019

chris-b1 added this to the No action milestone Feb 5, 2019
