indonesia_opentender yields invalid JSON #964

Closed
sentry-io bot opened this issue Sep 26, 2022 · 5 comments · Fixed by #1066
Labels
blocked We can't do this yet existing spider

Comments

@sentry-io

sentry-io bot commented Sep 26, 2022

For export-ocds-batch-year-2021-lpse-127-ocds-data-tender-24092022143442.json

From https://opentender.net/api/tender/export-ocds-batch?year=2021&lpse=127

Sentry Issue: DATA-REGISTRY-KINGFISHER-PROCESS-6Y

IncompleteJSONError: parse error: trailing garbage
          rindustrian", "id": "K27"}}]}tity"]}, {"name": "PT. HUDA TAT
                     (right here) ------^

  File "process/management/commands/file_worker.py", line 57, in callback
    upgraded_collection_file_id = process_file(collection_file)
  File "process/management/commands/file_worker.py", line 100, in process_file
    package, releases_or_records = _read_data_from_file(collection_file.filename, data_type)
  File "process/management/commands/file_worker.py", line 174, in _read_data_from_file
    for prefix, event, value in ijson.parse(ControlCodesFilter(f)):

Spider indonesia_opentender yields invalid JSON, skipping
@jpmckinney jpmckinney added existing spider blocked We can't do this yet labels Sep 26, 2022
@jpmckinney
Member

The error may be intermittent: when I try now, the JSON inside the ZIP file is valid.
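For anyone re-checking a download by hand, here is a minimal sketch of that validity check using only the standard library. The function and helper names are hypothetical, and this is not how Kingfisher Process reads files (the traceback above shows it streaming with `ijson`); it only reports which `.json` members of a ZIP fail to parse.

```python
import io
import json
import zipfile


def invalid_json_members(zip_bytes):
    """Return {member name: error message} for each .json member that fails to parse."""
    errors = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".json"):
                continue
            try:
                # json.loads accepts bytes and detects the UTF encoding itself.
                json.loads(archive.read(name))
            except ValueError as e:  # includes JSONDecodeError and UnicodeDecodeError
                errors[name] = str(e)
    return errors


def make_zip(members):
    """Hypothetical helper for the demo: build an in-memory ZIP from {name: text}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()
```

Calling `invalid_json_members(make_zip({...}))` on a ZIP built from a valid and a corrupted document should flag only the corrupted member. Because the error looks intermittent, a single passing check does not rule the problem out.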

@jpmckinney
Member

jpmckinney commented Sep 26, 2022

Also saw this for export-ocds-batch-year-2021-lpse-412-ocds-data-tender-24092022141324.json, which I figure is from https://opentender.net/api/tender/export-ocds-batch?year=2021&lpse=412

lexical error: invalid char in json text.
          "2021-08-18T23:59:00.000000Z"DR"}, "project": "Pembangunan J
                     (right here) ------^

However, when I download the ZIP now, it contains ocds-data-tender-27092022020842.json instead, and is valid. The file on the registry server is just from 3 days ago.

There are over 100 of these errors.
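Both excerpts show a complete-looking JSON value followed by extra characters. A rough way to inspect such a file is to parse the first top-level value and look at whatever is left over. The sketch below uses the standard library's `json.JSONDecoder.raw_decode` rather than `ijson`, and the function name is hypothetical. It only shows where the valid document ends; it does not establish why the extra bytes are there.

```python
import json


def split_trailing_garbage(text):
    """Parse the first JSON value in text and return (value, leftover text)."""
    # raw_decode does not skip leading whitespace, so strip it first.
    stripped = text.lstrip()
    value, end = json.JSONDecoder().raw_decode(stripped)
    # Anything after the first complete value is what the parser rejects.
    return value, stripped[end:].strip()


# Made-up sample shaped like the excerpt in the Sentry report.
sample = '{"parties": [{"id": "K27"}]}tity"]}, {"name": "PT. HUDA TAT'
value, leftover = split_trailing_garbage(sample)
```

An empty `leftover` means the text holds exactly one JSON value. A non-empty one that looks like the tail of another document, as here, would be consistent with two outputs ending up in the same file, though that is only a guess from the excerpts.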

@yolile
Member

yolile commented Oct 11, 2022

This issue was first reported in #794

@neelima-j

We have a reply from Kes at ICW in the feedback report, in which he explains that they are changing the backend to pre-generate the ZIP files.

@yolile
Member

yolile commented Dec 15, 2022

Since f8d47ca, this problem has not happened again. However, as the publisher is working on a better approach to generating the files, I will keep this issue open so that we can remove the download limitations from the spider once the new solution is implemented.

3 participants