
update inference model to make more sense #232

Closed

wants to merge 2 commits

Conversation

bokelley
Contributor

Based on excellent feedback, this clarifies the inference methodology.

bokelley and others added 2 commits October 3, 2024 15:04
* wip on AI model

* split ai to new section

* wip

* wip

* first draft of training

* split out cluster and nvidia

* fine tuning updated

* inference service

* fix tests

* add model for token to energy

* update inference model to include LoRA; update overview to include data from BDavy

* break out datacenter more clearly in cluster definition

* update water use estimates for a100

* update cluster link & fix typos

* break out memory usage for more granular calculation

* Split out foundation components, fix various typos

* Update overview.mdx

LR updating phase description. Testing for update process going forward

* Update overview.mdx

Remove extra word

---------

Co-authored-by: lratliff3 <138066965+lratliff3@users.noreply.github.com>

Calculate the energy use for a request:
```
usage_energy_per_request = baseline_energy_per_request_second x predicted_request_duration +
```
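The formula is cut off in the source after the `+`, so only its first term is recoverable. A minimal sketch of that stated term, with hypothetical example values (the variable names come from the formula; the numbers are illustrative, not from the PR):

```python
def usage_energy_per_request(baseline_energy_per_request_second: float,
                             predicted_request_duration: float) -> float:
    """First term of the usage-energy formula: baseline energy draw per
    second of a request multiplied by its predicted duration.

    The source formula continues with additional terms after the '+',
    which are truncated, so this covers only the baseline component.
    """
    return baseline_energy_per_request_second * predicted_request_duration

# Hypothetical values: 0.3 J/s baseline, 2.5 s predicted duration
energy_joules = usage_energy_per_request(0.3, 2.5)
print(energy_joules)  # 0.75
```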

I like this! Main task would be getting the predicted_inferences right (predicted_request_duration is likely directly correlated)

@bokelley closed this Oct 15, 2024
2 participants