Extensions for LLM #442
Labels
question
Further information is requested
Comments
I have no experience running an LLM on a Raspberry Pi device. What does the performance look like, for example when running a phi3 model?
@hannespreishuber Your app idea sounds interesting! Can you elaborate on the flow of the application a little more? Have you already tried running phi-3 on a Raspberry Pi? FYI, we are in the process of adding support for Whisper.
Closing the issue. Please re-open if this is still relevant.
kunal-vaishnavi added a commit that referenced this issue Dec 11, 2024
### Description

This PR adds support for outputting the last hidden state in addition to the logits in ONNX models. Users can run their models with ONNX Runtime GenAI and use the generator's `GetOutput` API to obtain the hidden states.

C/C++:
```c
std::unique_ptr<OgaTensor> embeddings = generator->GetOutput("hidden_states");
```

C#:
```csharp
using var embeddings = generator.GetOutput("hidden_states");
```

Java:
```java
Tensor embeddings = generator.getOutput("hidden_states");
```

Python:
```python
embeddings = generator.get_output("hidden_states")
```

### Motivation and Context

In SLMs and LLMs, the last hidden state represents a model's embeddings for a particular input before the language modeling head is applied. Generating embeddings for a model is a popular task. These embeddings can be used for many scenarios such as text classification, sequence labeling, information retrieval using [retrieval-augmented generation (RAG)](https://en.wikipedia.org/wiki/Retrieval-augmented_generation), and more.

This PR helps the following issues:
- microsoft/onnxruntime#20969
- #442
- #474
- #713
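As a rough illustration of how hidden states like these could serve the light-switch idea from this issue, here is a minimal sketch that matches a spoken or typed command to the closest known command by cosine similarity. It does not call ONNX Runtime GenAI: the vectors are made-up toy values, the helper names (`mean_pool`, `cosine_similarity`, `closest_command`) are hypothetical, and it assumes the hidden states have already been converted to a nested list of shape `seq_len x hidden_size`. Mean pooling is one common way to collapse per-token vectors into a single embedding and is an assumption here, not something the PR prescribes.

```python
import math


def mean_pool(hidden_states):
    """Average per-token vectors (seq_len x hidden_size) into one vector."""
    seq_len = len(hidden_states)
    hidden_size = len(hidden_states[0])
    return [
        sum(token[i] for token in hidden_states) / seq_len
        for i in range(hidden_size)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_command(query_vec, command_vecs):
    """Return the name of the stored command most similar to the query."""
    return max(
        command_vecs,
        key=lambda name: cosine_similarity(query_vec, command_vecs[name]),
    )


# Toy data (hypothetical): in a real app these would come from the model's
# "hidden_states" output for each reference phrase and for the user's input.
commands = {
    "lights_on": [0.9, 0.1, 0.0],
    "lights_off": [0.0, 0.2, 0.9],
}
query_hidden_states = [[0.8, 0.0, 0.1], [1.0, 0.2, 0.1]]  # 2 tokens x 3 dims

query_vec = mean_pool(query_hidden_states)
print(closest_command(query_vec, commands))
```

In practice the reference embeddings for each command would be computed once and cached, so that only the user's input needs a model pass on the device.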
The idea: write a little app and put it on a Raspberry Pi to switch lights on or off.
I would start with phi-3 (which works as a .NET app) and extend it with Whisper.
It would need a feature like embeddings or functions.
I had a look at Semantic Kernel, but it needs an HTTP REST endpoint, which makes no sense.
Any advice?