Package not Self-Contained #27

Open
superctj opened this issue Jul 1, 2024 · 9 comments
Labels
bug Something isn't working

Comments

@superctj

superctj commented Jul 1, 2024

Thank you for open-sourcing this handy tool! I tried installing the package from pip and from source, but neither works out of the box. On my end (Ubuntu with Python 3.10), running the command metacrafter scan-file --format short <file name> gives me this error:
Traceback (most recent call last):
  File "~/miniconda3/envs/metacrafter/bin/metacrafter", line 33, in <module>
    sys.exit(load_entry_point('metacrafter==0.0.4', 'console_scripts', 'metacrafter')())
  File "~/miniconda3/envs/metacrafter/lib/python3.10/site-packages/metacrafter-0.0.4-py3.10.egg/metacrafter/__main__.py", line 10, in main
    from .core import cli
  File "~/miniconda3/envs/metacrafter/lib/python3.10/site-packages/metacrafter-0.0.4-py3.10.egg/metacrafter/core.py", line 18, in <module>
    from iterable.helpers.detect import open_iterable
ModuleNotFoundError: No module named 'iterable'

Even after I installed iterabledata 1.0.5 from pip, I ran into another error: AttributeError: module 'snappy' has no attribute 'decompress'. Could you please look into the issue? Thanks in advance.

@ivbeg
Collaborator

ivbeg commented Jul 1, 2024

@superctj Hi! It looks like I declared some of the dependencies incorrectly in the package. I will fix it ASAP, thanks!

I think you need to install python-snappy with pip install python-snappy.
More info here: https://stackoverflow.com/questions/48535799/module-snappy-has-no-attribute-decompress
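
For anyone hitting the same AttributeError, a quick way to confirm whether the snappy module that Python imports actually exposes decompress is sketched below. The helper name module_has_callable is made up for this illustration and is not part of metacrafter or python-snappy; it only uses the standard library.

```python
import importlib


def module_has_callable(module_name: str, attr: str) -> bool:
    """Return True if the named module imports and exposes a callable `attr`.

    Hypothetical helper for illustration; not part of metacrafter.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed (or fails to import) in this environment.
        return False
    return callable(getattr(module, attr, None))


if __name__ == "__main__":
    if module_has_callable("snappy", "decompress"):
        print("snappy.decompress is available")
    else:
        print("snappy.decompress is missing; try: pip install python-snappy")
```

If the check fails even though a package called snappy is installed, the imported module is probably not the one provided by python-snappy.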

@ivbeg
Collaborator

ivbeg commented Jul 1, 2024

Fixed in the main branch; the fix will be included in the next package release.

@superctj
Author

superctj commented Jul 1, 2024

Thank you @ivbeg for the quick action! I appreciate it.

@superctj
Author

superctj commented Jul 1, 2024

Hi @ivbeg again, FYI, when I installed the package from the main branch, I ran into ModuleNotFoundError: No module named 'Cython'. After I installed Cython, the installation completed but when running the file scan command, the AttributeError: module 'snappy' has no attribute 'decompress' popped up again. I did pip install python-snappy and it fixed the error. However, I got a parquet.ParquetFormatException: Unsupported encoding: RLE_DICTIONARY when scanning a parquet file. Do you have any idea?
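
The sequence of errors above (a missing Cython, then iterable, then snappy) can be caught in one pass before running the scan. The following is only a rough sketch, assuming the import names seen in the tracebacks in this thread; missing_modules is a hypothetical helper, not something metacrafter provides.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import.

    Hypothetical helper for illustration. It only checks that a module with
    that name exists, not which distribution provides it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names taken from the errors reported in this thread.
    required = ["Cython", "iterable", "snappy"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found")
```

Note that this would not detect the snappy case where a module with the right name is present but lacks decompress.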

@ivbeg
Collaborator

ivbeg commented Jul 1, 2024

@superctj not yet, it's ok with almost all parquet files that I tested. Could you share this file please?

@superctj
Author

superctj commented Jul 1, 2024

Thank you for your quick response! GitHub does not support attaching parquet files so I put the sample file in Google Drive. Let me know if you cannot access the file.

@ivbeg
Collaborator

ivbeg commented Jul 1, 2024

@superctj Thanks. I use the pure Python parquet lib https://pypi.org/project/parquet/ to read parquet files since it provides simple iteration functions, but it looks like it doesn't support this type of encoding. I will take a look a bit later to see whether I can easily replace it with the pyarrow parquet reader.
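
As an illustration of what row-by-row iteration with pyarrow can look like, here is a minimal sketch. It is not the code used in iterabledata; the function names rows_from_batches and iter_parquet_rows and the default batch size are assumptions made for this example. It relies on pyarrow.parquet.ParquetFile.iter_batches and RecordBatch.to_pylist.

```python
from typing import Any, Dict, Iterable, Iterator


def rows_from_batches(batches: Iterable[Any]) -> Iterator[Dict[str, Any]]:
    """Flatten record batches into one dict per row.

    Each batch is expected to offer to_pylist(), as pyarrow.RecordBatch does,
    returning a list of {column_name: value} dicts.
    """
    for batch in batches:
        yield from batch.to_pylist()


def iter_parquet_rows(path: str, batch_size: int = 1000) -> Iterator[Dict[str, Any]]:
    """Yield rows from a parquet file without loading it all at once.

    Hypothetical helper; pyarrow is imported lazily so the module can be
    imported even where pyarrow is not installed.
    """
    import pyarrow.parquet as pq

    parquet_file = pq.ParquetFile(path)
    yield from rows_from_batches(parquet_file.iter_batches(batch_size=batch_size))


# Example usage (the file name is a placeholder):
# for row in iter_parquet_rows("sample.parquet"):
#     print(row)
```

Reading in batches keeps memory use bounded while still giving a simple per-row iterator, similar in spirit to what the pure Python reader offered.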

@ivbeg
Collaborator

ivbeg commented Jul 4, 2024

@superctj Finally fixed: I replaced the parquet lib with pyarrow. The changes are in the iterabledata library, so you need to reinstall it from the main branch of the source code repository: https://github.com/apicrafter/pyiterable

@superctj
Author

superctj commented Jul 5, 2024

Thank you @ivbeg for the quick action! I will probably give it a shot later.


2 participants