
update inference model to make more sense #232

Closed

wants to merge 2 commits

Conversation

bokelley
Contributor

Based on excellent feedback, this clarifies the inference methodology.

bokelley and others added 2 commits October 3, 2024 15:04
* wip on AI model

* split ai to new section

* wip

* wip

* first draft of training

* split out cluster and nvidia

* fine tuning updated

* inference service

* fix tests

* add model for token to energy

* update inference model to include LoRA; update overview to include data from BDavy

* break out datacenter more clearly in cluster definition

* update water use estimates for a100

* update cluster link & fix typos

* break out memory usage for more granular calculation

* Split out foundation components, fix various typos

* Update overview.mdx

LR updating phase description. Testing for update process going forward

* Update overview.mdx

Remove extra word

---------

Co-authored-by: lratliff3 <138066965+lratliff3@users.noreply.github.com>

Calculate the energy use for a request:
```
usage_energy_per_request = baseline_energy_per_request_second x predicted_request_duration +
```
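The formula is cut off in the source after the `+`, so only its first term is recoverable. A minimal sketch of that stated term, with hypothetical example values (the variable names come from the formula; the numbers are illustrative, not from the PR):

```python
def usage_energy_per_request(baseline_energy_per_request_second: float,
                             predicted_request_duration: float) -> float:
    """First term of the usage-energy formula: baseline energy draw per
    second of a request multiplied by its predicted duration.

    The source formula continues with additional terms after the '+',
    which are truncated, so this covers only the baseline component.
    """
    return baseline_energy_per_request_second * predicted_request_duration

# Hypothetical values: 0.3 J/s baseline, 2.5 s predicted duration
energy_joules = usage_energy_per_request(0.3, 2.5)
print(energy_joules)  # 0.75
```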

I like this! Main task would be getting the predicted_inferences right (predicted_request_duration is likely directly correlated)

@bokelley closed this Oct 15, 2024
2 participants