[Fix] Model uploader's Jenkins trigger parameter fix #403

Closed
wants to merge 3 commits
2 changes: 1 addition & 1 deletion .github/workflows/model_uploader.yml
@@ -77,12 +77,12 @@ jobs:
       id: init_folders
       run: |
         model_id=${{ github.event.inputs.model_id }}
-        echo "model_folder=ml-models/${{github.event.inputs.model_source}}/${model_id}" >> $GITHUB_OUTPUT
         if [[ -n "${{ github.event.inputs.upload_prefix }}" ]]; then
           model_prefix="ml-models/${{ github.event.inputs.model_source }}/${{ github.event.inputs.upload_prefix }}"
         else
           model_prefix="ml-models/${{ github.event.inputs.model_source }}/${model_id%%/*}"
         fi
+        echo "model_folder=$model_prefix/${model_id##*/}" >> $GITHUB_OUTPUT
         echo "model_prefix_folder=$model_prefix" >> $GITHUB_OUTPUT
- name: Initiate workflow_info
id: init_workflow_info
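The workflow step above derives the S3-style destination folders from `model_id` with shell parameter expansion (`${model_id%%/*}` keeps everything before the first `/`, `${model_id##*/}` keeps everything after the last `/`). A minimal Python sketch of the same logic, using a hypothetical `derive_folders` helper and an example model ID (neither appears in the workflow itself):

```python
def derive_folders(model_id: str, model_source: str, upload_prefix: str = "") -> dict:
    """Mirror the workflow's folder derivation for an ID like
    'sentence-transformers/all-MiniLM-L6-v2' (illustrative example)."""
    if upload_prefix:
        # An explicit upload_prefix overrides the namespace taken from the model ID
        model_prefix = f"ml-models/{model_source}/{upload_prefix}"
    else:
        # ${model_id%%/*}: the segment before the first '/'
        model_prefix = f"ml-models/{model_source}/{model_id.split('/', 1)[0]}"
    # ${model_id##*/}: the segment after the last '/'
    model_name = model_id.rsplit("/", 1)[-1]
    return {
        "model_folder": f"{model_prefix}/{model_name}",
        "model_prefix_folder": model_prefix,
    }
```

The fix in this hunk moves the `model_folder` output so it is built from `model_prefix` (which honors `upload_prefix`) rather than directly from the raw `model_id`.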
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -43,6 +43,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- Update model upload history - opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill (v.1.0.0)(TORCH_SCRIPT) by @dhrubo-os ([#400](https://github.com/opensearch-project/opensearch-py-ml/pull/400))

### Fixed
+- Fix the wrong input parameter for model_uploader's base_download_path in Jenkins trigger ([#403](https://github.com/opensearch-project/opensearch-py-ml/pull/403))
- Enable make_model_config_json to add model description to model config file by @thanawan-atc in ([#203](https://github.com/opensearch-project/opensearch-py-ml/pull/203))
- Correct demo_ml_commons_integration.ipynb by @thanawan-atc in ([#208](https://github.com/opensearch-project/opensearch-py-ml/pull/208))
- Handle the case when the model max length is undefined in tokenizer by @thanawan-atc in ([#219](https://github.com/opensearch-project/opensearch-py-ml/pull/219))
29 changes: 22 additions & 7 deletions utils/model_uploader/update_models_upload_history_md.py
@@ -85,25 +85,40 @@ def create_model_json_obj(
return model_obj


-def sort_models(models: List[Dict]) -> List[Dict]:
+def sort_and_deduplicate_models(models: List[Dict]) -> List[Dict]:
     """
-    Sort models
+    Sort and deduplicate models

     :param models: List of model dictionary objects to be sorted
     :type models: list[dict]
     :return: Sorted list of model dictionary objects
     :rtype: list[dict]
     """
-    models = sorted(
-        models,
+
+    # Remove duplicates
+    unique_models = {}
+    for model in models:
+        key = (model["Model Version"], model["Model ID"], model["Model Format"])
+        if (
+            key not in unique_models
+            or model["Upload Time"] > unique_models[key]["Upload Time"]
+        ):
+            unique_models[key] = model
+
+    # Convert the unique_models dictionary back to a list
+    unique_models_list = list(unique_models.values())
+
+    # Sort the deduplicated list
+    sorted_models = sorted(
+        unique_models_list,
         key=lambda d: (
             d["Upload Time"],
             d["Model Version"],
             d["Model ID"],
             d["Model Format"],
         ),
     )
-    return models
+    return sorted_models
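A quick usage sketch of the new deduplication behavior, with a self-contained copy of the function and made-up model entries (sample data, not from the repository; string comparison of the timestamps works because they are in a lexicographically sortable format):

```python
from typing import Dict, List


def sort_and_deduplicate_models(models: List[Dict]) -> List[Dict]:
    # Keep only the most recently uploaded entry per (version, id, format) key
    unique_models = {}
    for model in models:
        key = (model["Model Version"], model["Model ID"], model["Model Format"])
        if (
            key not in unique_models
            or model["Upload Time"] > unique_models[key]["Upload Time"]
        ):
            unique_models[key] = model
    # Sort the surviving entries by upload time, then version, id, and format
    return sorted(
        unique_models.values(),
        key=lambda d: (
            d["Upload Time"], d["Model Version"], d["Model ID"], d["Model Format"],
        ),
    )


models = [
    {"Model Version": "1.0.1", "Model ID": "m1", "Model Format": "TORCH_SCRIPT",
     "Upload Time": "2023-08-01 10:00:00"},
    {"Model Version": "1.0.1", "Model ID": "m1", "Model Format": "TORCH_SCRIPT",
     "Upload Time": "2023-08-02 10:00:00"},  # same key, newer upload wins
    {"Model Version": "1.0.0", "Model ID": "m2", "Model Format": "ONNX",
     "Upload Time": "2023-07-15 09:30:00"},
]
deduped = sort_and_deduplicate_models(models)
# Two entries remain; the m1 row keeps the later upload time
```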


def update_model_json_file(
@@ -172,7 +187,7 @@ def update_model_json_file(
models.append(model_obj)

models = [dict(t) for t in {tuple(m.items()) for m in models}]
-    models = sort_models(models)
+    models = sort_and_deduplicate_models(models)
with open(MODEL_JSON_FILEPATH, "w") as f:
json.dump(models, f, indent=4)
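The set-of-tuples comprehension at this call site is an exact-duplicate filter: it only collapses rows whose fields are all identical, while `sort_and_deduplicate_models` then handles near-duplicates that differ only in upload time. A small standalone sketch of the idiom (sample data for illustration):

```python
models = [
    {"Model ID": "m1", "Upload Time": "2023-08-01 10:00:00"},
    {"Model ID": "m1", "Upload Time": "2023-08-01 10:00:00"},  # exact duplicate
    {"Model ID": "m1", "Upload Time": "2023-08-02 10:00:00"},  # differs in one field
]
# Dicts are unhashable, so convert each to a tuple of items, dedupe in a set,
# then rebuild the dicts (set order is arbitrary, so sort afterwards if needed)
deduped = [dict(t) for t in {tuple(m.items()) for m in models}]
# The exact duplicate is dropped; both distinct upload times survive
```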

@@ -188,7 +203,7 @@ def update_md_file():
if os.path.exists(MODEL_JSON_FILEPATH):
with open(MODEL_JSON_FILEPATH, "r") as f:
models = json.load(f)
-        models = sort_models(models)
+        models = sort_and_deduplicate_models(models)
table_data = KEYS[:]
for m in models:
for k in KEYS: