indexing - static initialisation and waste of ressources #5174

Open
henning-gerhardt opened this issue May 24, 2022 · 3 comments
Labels: bug, search

Comments

@henning-gerhardt
Collaborator

preface:
The following issue may only occur if you are indexing tens of thousands to millions of entries and/or if the Elasticsearch settings in kitodo_config.properties are incorrect or do not fit your amount of data.

description:
On first initialisation of the class IndexingService, a static list of index workers and a dynamic list of counted database objects per indexing type are created. This happens even if no indexing (full or partial, i.e. indexing of not yet indexed objects) is required. It happens inside the class methods prepareIndexWorker() and countDatabaseObjects(). The number of workers created per type depends on the configuration value elasticsearch.indexLimit. If the database contains a lot of entries, a large list is created that holds threads waiting to be executed. As this worker list is only created on application start-up, it is not updated when objects are added or deleted. This is a source of errors when the application has been running for a long time and an indexing is then started.
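
To illustrate how the worker list can grow, here is a minimal sketch. It assumes one worker per batch of `indexLimit` objects, which is a guess at the relationship and not taken from the Kitodo code. `EagerWorkerSketch`, `workersNeeded` and `prepareWorkers` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual IndexingService.
class EagerWorkerSketch {

    // Assumption: one worker per batch of indexLimit objects (rounded up).
    static int workersNeeded(long objectCount, int indexLimit) {
        return (int) ((objectCount + indexLimit - 1) / indexLimit);
    }

    // Builds all workers up front; each one is bound to a fixed offset
    // derived from the object count known at creation time.
    static List<Runnable> prepareWorkers(long objectCount, int indexLimit) {
        List<Runnable> workers = new ArrayList<>();
        int needed = workersNeeded(objectCount, indexLimit);
        for (int i = 0; i < needed; i++) {
            final long offset = (long) i * indexLimit;
            workers.add(() -> System.out.println("would index batch at offset " + offset));
        }
        return workers;
    }
}
```

Under this assumption, one million objects with a limit of 1000 produce 1000 worker objects per type, all created before any indexing has been requested and all based on a count that may be outdated later.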

goal:
It would be better to create the worker instances only when an "index all" or "index not indexed" request is made.
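
A rough sketch of what this could look like, assuming the object count can be queried at request time. `LazyIndexingSketch`, `planBatches` and the `LongSupplier` standing in for the database count are hypothetical and not part of the existing code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical sketch: batches are planned per request, not at class initialisation.
class LazyIndexingSketch {

    private final LongSupplier currentCount; // stand-in for a database count query
    private final int batchSize;             // stand-in for the configured limit

    LazyIndexingSketch(LongSupplier currentCount, int batchSize) {
        this.currentCount = currentCount;
        this.batchSize = batchSize;
    }

    // Called only when indexing is requested. Returns {offset, size} pairs
    // computed from the count at the moment of the call.
    List<long[]> planBatches() {
        long total = currentCount.getAsLong();
        List<long[]> batches = new ArrayList<>();
        for (long offset = 0; offset < total; offset += batchSize) {
            batches.add(new long[] {offset, Math.min(batchSize, total - offset)});
        }
        return batches;
    }
}
```

Because nothing is cached between calls, objects added or deleted since application start-up are reflected in the next indexing run.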

@thomaslow
Collaborator

This was mostly fixed in #5367. The number of threads no longer depends on the elasticsearch.indexLimit parameter. Instead, the number of threads can be configured via a separate parameter. And threads are only initialized when the indexing is triggered by the user.
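
As a generic illustration of that pattern (not the code from #5367), a thread pool with a separately configured size can be created when indexing is triggered and shut down afterwards. `OnDemandPoolSketch`, `runIndexing` and `threadCount` are hypothetical names; the actual configuration parameter is not named here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: the pool exists only for the duration of one indexing run.
class OnDemandPoolSketch {

    // Runs all batches on a pool of threadCount threads and returns
    // the number of batches that completed.
    static int runIndexing(List<Runnable> batches, int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Runnable batch : batches) {
                futures.add(pool.submit(batch));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for this batch to finish
            }
            return futures.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("indexing interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("indexing batch failed", e);
        } finally {
            pool.shutdown(); // no idle threads remain after the run
        }
    }
}
```

With this shape, the number of threads is independent of the number of batches, and no threads are held while no indexing is running.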

> As this worker list is only created on application start up there will no updates through new or deleted objects which is a source for errors which will happen if the application is running a long time and an indexing is started.

If any objects (processes, tasks, etc.) are added or edited while the indexing is still ongoing, they should be added or updated through the regular save-logic of the respective object. However, I never checked this scenario specifically.

@andre-hohmann
Collaborator

@thomaslow : Thanks a lot for your hint. That helps us to clean up the issues.

@henning-gerhardt : Can this issue be closed?

@henning-gerhardt
Collaborator Author

No, as the main issue still exists: the lists are initialised on first access of the mentioned class and are not modified in any way at runtime.
