[ML] Add functionality to upgrade ML model state to 7.x format before upgrade to 8.0 #64154
Pinging @elastic/ml-core (:ml)
Currently they all will, due to elastic/ml-cpp#1545. So elastic/ml-cpp#1545 should be fixed before coding starts for this issue.
benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Nov 5, 2020
This new API provides a way for users to upgrade their own anomaly job model snapshots. To upgrade a snapshot the following is done:
- Open a native process given the job id and the desired snapshot id
- load the snapshot to the process
- write the snapshot again from the native task (now updated via the native process)
closes elastic#64154
edsavage added a commit to edsavage/ml-cpp that referenced this issue Nov 12, 2020
Ensure that the static counter tracking the peak memory usage for a job is included in the limited set to be persisted/restored from model state snapshots. Relates to elastic/elasticsearch#64154
edsavage added a commit to elastic/ml-cpp that referenced this issue Nov 12, 2020
Ensure that the static counter tracking the peak memory usage for a job is included in the limited set to be persisted/restored from model state snapshots. Relates to elastic/elasticsearch#64154
benwtrent added a commit that referenced this issue Nov 12, 2020
) This new API provides a way for users to upgrade their own anomaly job model snapshots. To upgrade a snapshot the following is done:
- Open a native process given the job id and the desired snapshot id
- load the snapshot to the process
- write the snapshot again from the native task (now updated via the native process)
relates #64154
benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Nov 12, 2020
…stic#64665) This new API provides a way for users to upgrade their own anomaly job model snapshots. To upgrade a snapshot the following is done:
- Open a native process given the job id and the desired snapshot id
- load the snapshot to the process
- write the snapshot again from the native task (now updated via the native process)
relates elastic#64154
edsavage added a commit to elastic/ml-cpp that referenced this issue Nov 12, 2020
Ensure that the static counter tracking the peak memory usage for a job is included in the limited set to be persisted/restored from model state snapshots. Relates to elastic/elasticsearch#64154 Backports #1572
benwtrent added a commit that referenced this issue Nov 17, 2020
#64665) (#65010)
* [ML] add new snapshot upgrader API for upgrading older snapshots (#64665)
This new API provides a way for users to upgrade their own anomaly job model snapshots. To upgrade a snapshot the following is done:
- Open a native process given the job id and the desired snapshot id
- load the snapshot to the process
- write the snapshot again from the native task (now updated via the native process)
relates #64154
benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Dec 16, 2020
… need upgraded (elastic#66062) This adds checks that verify that machine learning anomaly job model snapshots support the required minimal version. If any are not the required version, directions are given to either delete the model snapshot, or utilize the _upgrade API. relates: elastic#64154
benwtrent added a commit that referenced this issue Dec 17, 2020
…s that need upgraded (#66062) (#66475) * [ML] [Deprecation] add deprecation check for job model snapshots that need upgraded (#66062) This adds checks that verify that machine learning anomaly job model snapshots support the required minimal version. If any are not the required version, directions are given to either delete the model snapshot, or utilize the _upgrade API. relates: #64154
This issue has been closed with two PRs:
Currently the `autodetect` process has to know how to load model state formats going right back to version 5.5. In 8.x we would like to drop support for loading model states that predate 7.0. This means that during 7.x we must offer a way for users to easily upgrade model snapshots in 5.x or 6.x formats into the latest 7.x format.

Work was done for elastic/ml-cpp#1460 to add the necessary building block on the C++ side. What is required now is Java code to make use of this.

The Java work consists of:
- […] `min_version` earlier than 7.0 (or doesn't have a `min_version` field at all) then:
  - An `autodetect` process should be started for the job, and passed the supplied model snapshot (in preference to the one that would normally be restored).
  - A `w` control message should be sent to the `autodetect` process, supplying the same snapshot ID, snapshot timestamp and snapshot description that were on the original snapshot - this will cause it to overwrite the original snapshot documents with replacement documents in the latest format.
  - The `autodetect` process should be gracefully stopped by closing its input stream - it will not persist state again as no data was sent to it.
- […] `min_version` is older than 7.0 (or `min_version` not present).

There are many tricky details to work through with the seemingly simple "start and stop `autodetect`" portion of the work. Will we reuse the same persistent task that we use to open the job for normal operation? If so, how will we prevent data being sent to the process? And would that mean the job would have to be closed during the upgrade (not ideal as it could be inconvenient)? But if we don't reuse the same persistent task then how will we account for memory requirement and enforce that it runs on an ML node? And how will we avoid named pipe name clashes? Additionally, when the model snapshot gets persisted for one of these special upgrade invocations of `autodetect`, we need to persist the upgraded model snapshot but not set it as the active one for the job, so the results handling code will need a tweak too.
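
To make the proposed flow concrete, here is a rough Java sketch of the version check and the three process steps listed above. It is illustrative only: `SnapshotUpgradeSketch`, `AutodetectHandle` and all of its methods are hypothetical names invented for this example and do not correspond to classes in the Elasticsearch code base. It also assumes that `min_version` is a dotted version string such as `6.4.0`.

```java
import java.util.ArrayList;
import java.util.List;

public class SnapshotUpgradeSketch {

    /** Hypothetical stand-in for a running autodetect process; not a real Elasticsearch interface. */
    public interface AutodetectHandle {
        void restoreSnapshot(String snapshotId);
        // Models the "w" control message: persist state under the original snapshot's identity.
        void persistSnapshot(String snapshotId, long timestampMs, String description);
        // Models closing the input stream so the process exits gracefully.
        void closeInput();
    }

    /** A snapshot needs upgrading if min_version is absent or its major version is below 7. */
    public static boolean needsUpgrade(String minVersion) {
        if (minVersion == null || minVersion.isEmpty()) {
            return true; // no min_version field at all
        }
        // Assumption: the major version is the text before the first dot.
        int major = Integer.parseInt(minVersion.split("\\.")[0]);
        return major < 7;
    }

    /** Runs restore, persist, close in that order; returns false if no upgrade was needed. */
    public static boolean upgradeIfNeeded(AutodetectHandle process, String minVersion,
                                          String snapshotId, long timestampMs, String description) {
        if (!needsUpgrade(minVersion)) {
            return false;
        }
        process.restoreSnapshot(snapshotId);
        process.persistSnapshot(snapshotId, timestampMs, description);
        process.closeInput();
        return true;
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        // A recording fake in place of a real native process.
        AutodetectHandle recorder = new AutodetectHandle() {
            @Override public void restoreSnapshot(String id) { calls.add("restore " + id); }
            @Override public void persistSnapshot(String id, long ts, String desc) { calls.add("persist " + id); }
            @Override public void closeInput() { calls.add("close"); }
        };
        boolean upgraded = upgradeIfNeeded(recorder, "6.4.0", "example-snapshot", 0L, "example description");
        System.out.println(upgraded + " " + calls);
    }
}
```

The sketch deliberately leaves out the open questions raised above (which persistent task runs the process, memory accounting, named pipe naming, and not marking the upgraded snapshot as the active one); those would sit around, not inside, this sequence.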