03.10 openrecipes database has no values or object #179

Open
ghost opened this issue Apr 7, 2019 · 6 comments

ghost commented Apr 7, 2019

I am using Colab; here is the shared notebook URL:
https://colab.research.google.com/drive/15uoHS3S48Q3GQTBrzEpneWiD6MHVnylr

SRSteinkamp commented Jul 22, 2019

Hi,
I just ran into the same problem and found a workaround.
The issue appears to be on the openrecipes side (see fictivekin/openrecipes#218).
I don't know whether there are newer versions of the DB dump (the one referenced here is from 2017), but the 2017 dump seems to be the one used in the notebook (the outputs appear to be the same).
Changing the download link and the file names accordingly did the trick for me.

So I used:
!curl -O https://s3.amazonaws.com/openrecipes/20170107-061401-recipeitems.json.gz

I then changed recipeitems-latest.json to 20170107-061401-recipeitems.json in the following steps.
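
Note that the download ends in .json.gz while the later steps open the plain .json name, so the archive has to be unpacked in between. A minimal sketch of that step in pure Python, assuming the archive sits in the current working directory; the helper name gunzip_file is hypothetical and not part of the notebook:

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src, dst=None):
    """Decompress a .gz file and return the path of the decompressed copy.

    gunzip_file is a hypothetical helper name used only for this sketch.
    """
    src = Path(src)
    # By default, drop the trailing ".gz" (foo.json.gz -> foo.json).
    dst = Path(dst) if dst is not None else src.with_suffix("")
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        # Stream the data so the whole file is never held in memory at once.
        shutil.copyfileobj(f_in, f_out)
    return dst


# Only unpack if the archive from the curl step is actually present.
archive = Path("20170107-061401-recipeitems.json.gz")
if archive.exists():
    print(gunzip_file(archive))
```

If the archive is missing, nothing happens here, and a later open() on the .json name would fail with FileNotFoundError.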

Finally, to read the data correctly, you might need to set the file's encoding when opening it, that is, add encoding="utf-8" to the open call:

# read the entire file into a Python array
with open('20170107-061401-recipeitems.json', 'r', encoding="utf-8") as f:
    # Extract each line
    data = (line.strip() for line in f)
    # Reformat so each line is the element of a list
    data_json = "[{0}]".format(','.join(data))
# read the result as a JSON
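
To see why the join is needed: the dump stores one JSON object per line, which is not a valid JSON document as a whole, so the lines are joined with commas and wrapped in brackets to form a single array. A small sketch on made-up sample records (the data is invented for illustration), using only the standard library:

```python
import json

# Two made-up records in the one-object-per-line layout, plus a blank line.
lines = [
    '{"name": "Soup", "cookTime": "PT30M"}\n',
    '{"name": "Bread", "cookTime": "PT1H"}\n',
    '\n',
]

# Strip whitespace and skip empty lines, which would otherwise leave a
# dangling comma in the joined string.
stripped = (line.strip() for line in lines if line.strip())

# Join the objects with commas and wrap them in brackets: a JSON array.
data_json = "[{0}]".format(",".join(stripped))

records = json.loads(data_json)
print(len(records), records[0]["name"])
```

The blank-line filter is an addition for this sketch; the snippet above joins every line as-is.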

Maybe this helps!

@poroc300

@SRSteinkamp thank you for your encoding tip. I was having a problem downloading the latest stable version of the open recipes .json file and your tip corrected the problem.

@Ashwin-Bhoolia


FileNotFoundError                         Traceback (most recent call last)
in
----> 1 with open('20170107-061401-recipeitems.json') as f:
      2     line = f.readline()
      3     pd.read_json(line).shape

FileNotFoundError: [Errno 2] No such file or directory: '20170107-061401-recipeitems.json'

Help

JamesCHub commented May 4, 2021

As of Pandas 1.1, you may need to further adjust the code fragment, as follows:

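import pandas as pd
from io import StringIO
with open('20170107-061401-recipeitems.json', 'r', encoding="utf-8") as f:
    # Strip the trailing newline from each line
    data = (line.strip() for line in f)
    # Join the lines into a single JSON array string
    data_json = "[{0}]".format(','.join(data))
# Wrap the string in a file-like object before passing it to read_json
recipesDF = pd.read_json(StringIO(data_json))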

kauefs commented Nov 22, 2023

After almost giving up, it seems the simplest solution works (thanks to the information about the 2017 file and the related issue fictivekin/openrecipes#218):

recipes = pd.read_json('/content/20170107-061401-recipeitems.json.gz', lines=True)
recipes
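
This works because lines=True tells pandas to parse each line as its own JSON object, and the .gz suffix lets it infer the compression, so neither the gunzip step nor the manual join is needed. For anyone curious what that amounts to, here is a rough standard-library sketch; the function name read_json_lines_gz and the sample data are hypothetical, and this is not how pandas implements it internally:

```python
import gzip
import json
import os
import tempfile


def read_json_lines_gz(path):
    """Read a gzip-compressed file with one JSON object per line.

    read_json_lines_gz is a hypothetical helper name used for this sketch.
    """
    records = []
    # "rt" opens the archive in text mode, decoding as UTF-8 on the fly.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # ignore blank lines
                records.append(json.loads(line))
    return records


# Demonstrate on a tiny made-up archive so the sketch runs anywhere.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample-recipeitems.json.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as f:
    f.write('{"name": "Soup"}\n{"name": "Bread"}\n')

records = read_json_lines_gz(sample_path)
print(len(records))
```

The resulting list of dicts can be passed to pd.DataFrame(records) if a DataFrame is wanted.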

Owner

jakevdp commented Nov 22, 2023

The instructions in the updated second edition notebook should work: https://github.com/jakevdp/PythonDataScienceHandbook/blob/d66231454ef753818dc9213c9b5942e067266966/notebooks/03.10-Working-With-Strings.ipynb
