Read Egg Question #1489

andersonfrailey · 2017-07-21T21:23:13Z

Could someone explain how the read_egg_csv and read_egg_json functions in utils.py work? I've been reading over the documentation for pkg_resources but I still haven't been able to figure out how it all works. Thanks in advance.


martinholmer · 2017-07-26T13:35:48Z

@andersonfrailey asked:

Could someone explain how the read_egg_csv and read_egg_json functions in utils.py work? I've been reading over the documentation for pkg_resources but I still haven't been able to figure out how it all works. Thanks in advance.

As you've probably figured out, Tax-Calculator reads in several CSV and JSON files both when executing from source code and when executing from a conda package. Those files are in the source code tree and are placed in the package (as an "egg", or at least this is my possibly incorrect understanding of how a Python/conda expert would put it) using information in the tax-calculator/MANIFEST.in file.

So, when reading one of those named files, tax-calculator code first checks for the file in the source code tree and, if it is there, reads it. If it can't find it in that directory location, then it assumes the code is running from the conda taxcalc package and calls either the read_egg_csv or read_egg_json function.
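A minimal sketch of that lookup order, for illustration only. This is not the actual utils.py code: the names load_json and read_json_from_egg, the source_dir parameter, and the JSON-only scope are assumptions made for the example. The only pkg_resources calls used are the two named in this thread, Requirement.parse and resource_stream.

```python
import json
import os


def read_json_from_egg(fname, package="taxcalc"):
    """Fallback: read a JSON resource from the installed package (hypothetical helper)."""
    # Imported lazily so the source-tree path does not depend on setuptools.
    import pkg_resources
    requirement = pkg_resources.Requirement.parse(package)
    # With a Requirement, the resource name is interpreted relative to the
    # root of the distribution, so it includes the package directory.
    stream = pkg_resources.resource_stream(requirement, package + "/" + fname)
    try:
        return json.loads(stream.read().decode("utf-8"))
    finally:
        stream.close()


def load_json(fname, source_dir):
    """Read fname from source_dir if it exists there, else from the package."""
    path = os.path.join(source_dir, fname)
    if os.path.isfile(path):
        # Running from the source code tree: read the file directly.
        with open(path) as handle:
            return json.load(handle)
    # Not found on disk: assume we are running from the installed package.
    return read_json_from_egg(fname)
```

The CSV case would presumably follow the same pattern, with the stream handed to a CSV reader instead of json.loads.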

If you want to see how the two pkg_resources methods work together to read the file from inside the conda taxcalc package, open the setuptools documentation for pkg_resources and use your browser's search to find the entries for the two methods used in the read_egg_* functions.

I have no idea why the read_egg_* code uses the Requirement.parse method to get a Requirement object to pass to the resource_stream method, rather than passing in the package name. That kind of issue is above my pay grade. There may be an important reason for doing it that way, or it may be the personal preference of the original developer of these functions.

The documentation says this:

In the following methods [including resource_stream], the package_or_requirement argument may be either a Python package/module name (e.g. foo.bar) or a Requirement instance.

Reading this might lead one to think the read_egg_* code could be simplified a bit.
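To make the two calling styles concrete, here is a small sketch. The helper name read_resource_bytes and the file name in the comments are hypothetical. One difference to keep in mind when experimenting, per the pkg_resources documentation: with a package name the resource name is interpreted relative to that package, while with a Requirement it is interpreted relative to the root of the distribution.

```python
def read_resource_bytes(package_or_requirement, resource_name):
    """Return the raw bytes of a resource via pkg_resources.resource_stream."""
    # Imported lazily; pkg_resources ships with setuptools.
    import pkg_resources
    stream = pkg_resources.resource_stream(package_or_requirement, resource_name)
    try:
        return stream.read()
    finally:
        stream.close()


# Requirement form: the path starts at the distribution root.
#   req = pkg_resources.Requirement.parse("taxcalc")
#   data = read_resource_bytes(req, "taxcalc/some_file.json")
#
# Package-name form: the path is relative to the package itself.
#   data = read_resource_bytes("taxcalc", "some_file.json")
```

Whether the package-name form behaves identically inside the conda package is exactly what the experiment described in this thread would confirm.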

@andersonfrailey, if you're interested in this topic, it would be a good learning experience for us all if you tried the alternative coding approach: implement the simpler code on a development branch, make a conda taxcalc package on your local computer (using the tax-calculator/conda.recipe/install_local_taxcalc_package.sh script), and test the simpler code by using the local taxcalc package outside of the tax-calculator source code tree. An example of what you could do with the taxcalc package is this:

$ tc cps.csv 2020 --tables

If you get the same table output with the current and the simpler read_egg_* code, then it shows that the code can be simplified a bit. And, even more importantly, conducting this experiment will leave you in a more knowledgeable position to improve the docstring for each of these two read_egg_* functions. That would be appreciated by future developers, who will not need to ask questions like the one you asked in issue #1489.

andersonfrailey · 2017-07-27T14:53:40Z

@martinholmer, thanks for the detailed response. I will dig into the pkg_resources methods a bit and see if I can get the simpler code to run.

martinholmer · 2017-07-28T12:23:52Z

@andersonfrailey said:

I will dig into the pkg_resources methods a bit and see if I can get the simpler code to run.

Thanks. Your experimentation will help all of us understand better the read_egg_* logic.

talumbau · 2017-07-28T14:31:55Z

Hi @andersonfrailey, here's some good background info for your research:

ospc-org/ospc.org#501

In particular, I recall that this Stack Overflow post was a helpful pointer in determining how to solve the problem of reading data files from an installed package.

http://stackoverflow.com/questions/6028000/python-how-to-read-a-static-file-from-inside-a-package

Good luck!

martinholmer · 2017-08-18T00:59:14Z

@andersonfrailey, What's your view of the status of issue #1489?

Perhaps your question was not answered in an authoritative manner. I already admitted your question was beyond my Python knowledge base.

But given that there have been no comments added to #1489 for three weeks, do you think there are reasons to keep this issue open?

andersonfrailey · 2017-08-18T14:24:11Z

@martinholmer I've gotten enough information from this to look further into my question on my own. Closing.

martinholmer added the question label Aug 18, 2017

andersonfrailey closed this as completed Aug 18, 2017
