Minerva seems to have trouble processing save requests when there are many files #378

Closed
kltm opened this issue Apr 9, 2021 · 5 comments

Comments

@kltm
Member

kltm commented Apr 9, 2021

When minerva runs the 'export-all' operation in a directory like the current noctua-models (with 6116 models currently defined), it returns an error. Repeating the operation gives slightly different errors (each involving a different model), until the error no longer mentions models at all. For example:

https://gist.github.com/kltm/1929656545bf6668a07e96d9d8ac07f6

Tagging @balhoff

@kltm
Member Author

kltm commented Apr 16, 2021

@vanaukenk, talking with @balhoff, this may be part of why some files seemed not to save in some cases. I've merged this with master, but wanted your input on when to get it out. As it stands now, I might just wait until the next power outage, but we could also schedule something sooner.

@kltm
Member Author

kltm commented Apr 21, 2021

@balhoff After talking to @vanaukenk, we'll be trying to get this into production over the upcoming outage.
Would it be possible to get this patch on minerva dev as well just so I can give the tires a quick kick before then?

@balhoff
Member

balhoff commented Apr 21, 2021

@kltm see #381 (cherry-pick).

@kltm
Member Author

kltm commented Apr 22, 2021

@balhoff
Okay, I've done a little testing on noctua-dev. First the (very) good news: I believe that the change fixes the problem we were running into. minerva flushed the models to disk and there was no direct error reported.
The "bad" news, or rather a known and surmountable issue, is that the process to flush out thirty-thousand-odd files takes some time, and the client gives up before minerva completes. This is a client-side issue though, and doesn't actually cause any problem that we didn't already have, so I think we're probably good once this is in production.
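
To illustrate the client-side timeout described above, here is a minimal Python sketch. It is not minerva or Noctua code: `slow_export` and `request_with_timeout` are hypothetical stand-ins, and a worker thread plays the role of the server. The point is only that a client giving up early does not stop the server-side flush from finishing.

```python
import concurrent.futures
import time


def slow_export(n_files, per_file_seconds):
    """Hypothetical stand-in for a long-running server-side flush.

    Returns the number of "files" written.
    """
    written = 0
    for _ in range(n_files):
        time.sleep(per_file_seconds)  # pretend to write one file
        written += 1
    return written


def request_with_timeout(timeout_seconds, n_files=50, per_file_seconds=0.01):
    """Submit the export, wait up to timeout_seconds, then report both views.

    Returns (what the client saw, how many files the server finished).
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(slow_export, n_files, per_file_seconds)
    try:
        future.result(timeout=timeout_seconds)
        client_saw = "success"
    except concurrent.futures.TimeoutError:
        # The client gives up, but the worker thread keeps running.
        client_saw = "timeout"
    # Let the "server" finish so we can inspect its final result.
    executor.shutdown(wait=True)
    return client_saw, future.result()


if __name__ == "__main__":
    print(request_with_timeout(0.05))  # client gives up early
    print(request_with_timeout(5.0))   # client waits long enough
```

In both calls the export itself completes; only the client's view differs, which matches the observation that the timeout is a reporting problem rather than a save failure.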

@kltm
Member Author

kltm commented Apr 23, 2021

@balhoff

Interesting side note. When doing the shutdown today, I issued the usual "save" commands and it returned with:

    'message-type': 'success',
    message: 'Dumped all models to folder' },
    _okay: true,

This is what I'd normally expect.

I guess that, last time, our issue may have been due to a very large number of open file handles accumulated over months (now tested on dev and on production)?
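
As a rough illustration of the file-handle hypothesis, the Python sketch below writes many files while making sure each handle is closed before the next one is opened. It is an assumption-laden example, not minerva's actual code or patch: `dump_models`, the `.ttl` extension, and the model names are all hypothetical.

```python
import os
import tempfile


def dump_models(models, folder):
    """Write each model to its own file, closing every handle before moving on.

    `models` maps a (hypothetical) model id to its serialized content.
    Returns a list of booleans recording whether each handle was closed.
    """
    closed_flags = []
    for model_id, content in models.items():
        path = os.path.join(folder, f"{model_id}.ttl")
        # The context manager closes the handle even if the write fails,
        # so open handles cannot pile up across thousands of files.
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(content)
        closed_flags.append(handle.closed)
    return closed_flags


if __name__ == "__main__":
    models = {f"model{i}": "placeholder content" for i in range(100)}
    with tempfile.TemporaryDirectory() as folder:
        flags = dump_models(models, folder)
        print(all(flags), len(os.listdir(folder)))
```

If handles were instead kept open across files, a long-running process could eventually hit the operating system's per-process limit on open files, which is one way the kind of failure described here could arise.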

Since patches are in place for the likely issue and we can no longer reproduce the problem, I think we're good to close.
