This repository has been archived by the owner on May 27, 2024. It is now read-only.
When using the CSV uploader, we encountered an enormous performance issue when submitting very large files (~80k). Uploading these files took roughly 30 to 45 minutes before they were fully processed. This of course causes HTTP connections to time out.
Delving into this issue, it quickly became apparent that it relates solely to the CSV uploader (IntelMQ itself can handle the data efficiently) and that it occurs in the /submit function; /preview and /upload are not affected as much by the file size.
Profiling the /submit function for 100 records, the results of which can be seen below, showed that the issue lies in the intelmq/lib/utils.py:201(load_configuration) function. Of the overall running time of 34.816s, around 34.543s is spent in that function. This function is triggered by the intelmq/lib/message.py:95(__init__) function. Basically, for every event in the upload, a new Event(Message) object is created, which in turn loads and parses the harmonization.conf file. Loading that file again for every single event results in the massive performance issue.
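For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of how such a profile could be collected with the standard library's `cProfile`. It is not the original profiling run: `Event`, `load_configuration`, `submit` and `CONFIG_TEXT` below are hypothetical stand-ins that only imitate the "parse the config on every construction" pattern described above, and the timings will not match the numbers in this issue.

```python
import cProfile
import io
import json
import pstats

# Hypothetical stand-in for the contents of a harmonization config file.
CONFIG_TEXT = json.dumps({"event": {"source.ip": {"type": "IPAddress"}}})


def load_configuration():
    """Stand-in for an expensive config load: parses the config on every call."""
    return json.loads(CONFIG_TEXT)


class Event:
    """Stand-in event that reloads the config each time it is constructed."""

    def __init__(self, row):
        self.harmonization = load_configuration()
        self.row = row


def submit(rows):
    """Stand-in for the submit handler: one Event per uploaded row."""
    return [Event(row) for row in rows]


def profile_submit(rows):
    """Run submit() under cProfile and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    submit(rows)
    profiler.disable()
    stream = io.StringIO()
    # Sort by cumulative time so the dominating call chain is listed first.
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats()
    return stream.getvalue()


report = profile_submit([{"source.ip": "192.0.2.1"}] * 100)
print(report)
```

In the printed report, each entry has the `file:line(function)` form quoted above, and the call count for `load_configuration` equals the number of rows, which is the symptom to look for.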
The main effort should go into finding a way to, e.g., reuse a single Message object, or otherwise ensure that the harmonization config file is loaded only once.
When an `Event` is initialized, it automatically loads the harmonization config
file. This means that for every event uploaded, the entire harmonization
config file is loaded and parsed. This dramatically extends the time
needed to parse a single event.
This PR ensures that the harmonization file is only loaded once and then
used for generating all `Event` objects.
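
The idea can be illustrated with a small, self-contained sketch. This is not the code from this PR: `Event`, `load_harmonization`, `submit` and the `harmonization` keyword below are hypothetical stand-ins, and the `load_calls` counter exists only to make the effect visible.

```python
import json

# Hypothetical stand-in for the contents of the harmonization config file.
HARMONIZATION_TEXT = json.dumps({"event": {"source.ip": {"type": "IPAddress"}}})

load_calls = 0  # counts how often the config is parsed (illustration only)


def load_harmonization():
    """Stand-in loader: parses the config and records that it was called."""
    global load_calls
    load_calls += 1
    return json.loads(HARMONIZATION_TEXT)


class Event:
    """Stand-in event that accepts an already loaded harmonization config."""

    def __init__(self, row, harmonization=None):
        # Fall back to loading only when the caller did not supply a config.
        if harmonization is None:
            harmonization = load_harmonization()
        self.harmonization = harmonization
        self.row = dict(row)


def submit(rows):
    """Load the config once, then share it across all events."""
    harmonization = load_harmonization()
    return [Event(row, harmonization=harmonization) for row in rows]


events = submit([{"source.ip": "192.0.2.1"}] * 100)
print(load_calls)
```

Because the parsed config is passed in rather than reloaded, the loader runs once per upload instead of once per row. The shared dictionary must be treated as read-only by the events for this to be safe.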
Fixes: certat#79