
Update README for new CLI #178

Merged — 3 commits merged into main from hanq_sampler on Sep 10, 2024
Conversation

qihqi (Collaborator) commented Aug 30, 2024

Update README to refer to new CLI instead of old scripts.

This README is not complete yet; more details on flags will be added before the next release.

wang2yn84 (Collaborator) left a comment


Looks good in general; a few nits and small issues. Thank you!

README.md (two outdated review threads, resolved)

README.md (outdated):
```
python run_interactive.py --model_name=$model_name --batch_size=128 --max_cache_length=2048 --quantize_weights=$quantize_weights --quantize_type=$quantize_type --quantize_kv_cache=$quantize_weights --checkpoint_path=$output_ckpt_dir --tokenizer_path=$tokenizer_path --sharding_config=default_shardings/$model_name.yaml
```

To pass a HuggingFace token, add the `--hf_token` flag:

```
jpt serve --model_id meta-llama/Meta-Llama-3-8B-Instruct --hf_token=...
```
Collaborator:

Ditto


* `--sharding_config=<path>` Uses an alternative sharding config instead of
  the ones in the `default_shardings` directory.

Weights downloaded from HuggingFace will be stored by default in the `checkpoints` folder.
Collaborator:

Do we have an option to store the weights somewhere else? We have trouble storing the weights directly on the GCP VM as it is.

qihqi (Collaborator, author):

For a GCS bucket, the weights need to be copied locally or mounted using FUSE.

The working dir can be edited. Added a paragraph to describe that.
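The FUSE mount mentioned above is typically done with `gcsfuse`. A minimal sketch, assuming a bucket and mount point of your choosing (the bucket name `my-weights-bucket` and path `/mnt/weights` here are hypothetical, not from this PR):

```shell
# Create a mount point and mount the GCS bucket with gcsfuse
# (bucket name and mount path are placeholders)
mkdir -p /mnt/weights
gcsfuse my-weights-bucket /mnt/weights

# The mounted directory can then be used as the checkpoint location,
# e.g. via --checkpoint_path as shown earlier in the README.
```

Note that a FUSE mount reads objects over the network on access, so copying the weights to a local disk first is usually faster for repeated loads.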

Collaborator:

It would be great if you could add how to change the working dir, because we also need to point it at an external SSD. I will approve the PR to unblock you for now.

jetstream_pt/fetch_models.py (outdated review thread, resolved)
qihqi changed the title from "Update Jetstream, add optional sampler args." to "Update README for new CLI" on Sep 10, 2024
qihqi merged commit ec4ac8f into main on Sep 10, 2024
4 checks passed
qihqi deleted the hanq_sampler branch on September 10, 2024 at 02:57
4 participants