multipart.ParserError: Unexpected end of multipart stream (parser closed) #432

Open
rgaudin opened this issue Jan 20, 2025 · 0 comments
rgaudin commented Jan 20, 2025

In this zimit run, there seems to have been a failure parsing the WARC:

Traceback (most recent call last):
  File "/usr/bin/zimit", line 8, in <module>
    sys.exit(zimit.zimit())
             ^^^^^^^^^^^^^
  File "/app/zimit/lib/python3.12/site-packages/zimit/zimit.py", line 688, in zimit
    sys.exit(run(sys.argv[1:]))
             ^^^^^^^^^^^^^^^^^
  File "/app/zimit/lib/python3.12/site-packages/zimit/zimit.py", line 609, in run
    return warc2zim(warc2zim_args)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/zimit/lib/python3.12/site-packages/warc2zim/main.py", line 168, in main
    return converter.run()
           ^^^^^^^^^^^^^^^
  File "/app/zimit/lib/python3.12/site-packages/warc2zim/converter.py", line 290, in run
    self.gather_information_from_warc()
  File "/app/zimit/lib/python3.12/site-packages/warc2zim/converter.py", line 469, in gather_information_from_warc
    for record in iter_warc_records(self.warc_files):
  File "/app/zimit/lib/python3.12/site-packages/warc2zim/converter.py", line 1014, in iter_warc_records
    for record in buffering_record_iter(ArchiveIterator(fh), post_append=True):
  File "/app/zimit/lib/python3.12/site-packages/cdxj_indexer/bufferiter.py", line 50, in buffering_record_iter
    join_req_resp(req, resp, post_append, url_key_func)
  File "/app/zimit/lib/python3.12/site-packages/cdxj_indexer/bufferiter.py", line 110, in join_req_resp
    query, append_str = append_method_query_from_req_resp(req, resp)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/zimit/lib/python3.12/site-packages/cdxj_indexer/postquery.py", line 26, in append_method_query_from_req_resp
    return append_method_query(method, content_type, len_, stream, url)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/zimit/lib/python3.12/site-packages/cdxj_indexer/postquery.py", line 35, in append_method_query
    query = query_extract(content_type, len_, stream, url)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/zimit/lib/python3.12/site-packages/cdxj_indexer/postquery.py", line 106, in query_extract
    for part in parser:
  File "/app/zimit/lib/python3.12/site-packages/multipart.py", line 722, in __iter__
    for part in self._part_iter:
  File "/app/zimit/lib/python3.12/site-packages/multipart.py", line 770, in _iterparse
    for event in parser.parse(chunk):
  File "/app/zimit/lib/python3.12/site-packages/multipart.py", line 370, in parse
    self.close()
  File "/app/zimit/lib/python3.12/site-packages/multipart.py", line 519, in close
    raise err
multipart.ParserError: Unexpected end of multipart stream (parser closed)
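
The traceback shows the error escaping from the `for part in parser:` loop in `query_extract`, so a single truncated multipart request body aborts the whole conversion. As an illustration only (not a proposed patch, and not how cdxj_indexer or warc2zim currently behave), one conceivable mitigation is to stop iterating at the first parse error and keep the parts read so far. The helper `iter_until_error` and the stand-ins `FakeParserError` and `truncated_parts` below are hypothetical names; with the real library the exception type would be `multipart.ParserError`.

```python
from typing import Callable, Iterable, Iterator, Optional, Type, TypeVar

T = TypeVar("T")


def iter_until_error(
    items: Iterable[T],
    exc_type: Type[Exception],
    on_error: Optional[Callable[[Exception], None]] = None,
) -> Iterator[T]:
    """Yield items until exc_type is raised, then stop instead of propagating."""
    iterator = iter(items)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return
        except exc_type as exc:
            # Report the failure (for example, log it) and end the iteration.
            if on_error is not None:
                on_error(exc)
            return
        yield item


class FakeParserError(Exception):
    """Hypothetical stand-in for multipart.ParserError."""


def truncated_parts() -> Iterator[str]:
    """Simulate a parser that yields one part, then hits a truncated body."""
    yield "field1=value1"
    raise FakeParserError("Unexpected end of multipart stream (parser closed)")


errors = []
parts = list(iter_until_error(truncated_parts(), FakeParserError, errors.append))
```

Whether silently keeping a partial query is acceptable for indexing POST requests is a separate question; the sketch only shows that the failure can be contained to the affected record.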

@rgaudin rgaudin added the bug Something isn't working label Jan 20, 2025
@benoit74 benoit74 added this to the 2.3.0 milestone Mar 10, 2025