
Add function calling support using bolt-fc-1b #35

Merged
merged 18 commits into main on Sep 10, 2024

Conversation

@adilhafeez (Contributor) commented Aug 8, 2024

This change adds bolt-fc-1b to the bolt gateway. Several changes went in to support bolt-fc-1b; here is a summary of the key ones.

  • add a Gradio-based chatbot-ui to interact with the bolt gateway
  • add a model file to host bolt-fc in ollama
  • there is a README that explains how to run the e2e demo built around the weather_forecast use case; please review it and make sure you can follow it
  • in the demo, ollama is marked manual because ollama runs extremely slowly when virtualized in Docker
    • the primary reason is that Docker doesn't support mapping the Mac GPU directly into a Docker container, so running ollama inside Docker means you are essentially using the CPU
    • if you have an NVIDIA GPU, however, Docker can map it directly into the container
  • in the demo, please ignore all the files related to the dashboard and Prometheus
  • note the new container called config-generator; its task is to take envoy.template.yaml and katanemo-config and generate the final envoy.yaml that Envoy uses to bootstrap
  • the code needs some improvements that I will pick up in the next PR
    • add unit tests and integration tests
    • break big files like stream_context.rs into smaller components so they are easier to read and manage
    • remove the dependency on qdrant completely and load embeddings into memory

@adilhafeez adilhafeez changed the title fix test Add function calling support using kfc-1b Aug 8, 2024
@adilhafeez adilhafeez changed the title Add function calling support using kfc-1b Add function calling support using bolt-fc-1b Aug 26, 2024
@cotran2 cotran2 self-requested a review August 26, 2024 19:49
function_resolver/app/handler.py (outdated; resolved)
function_resolver/app/template.py (outdated; resolved)
elements += self.handler._format_system(tools)

if message.role == "user":
elements += self.handler._format_user(content=message.content)
Contributor:

use dict or a better way to handle this

function_resolver/.vscode/launch.json (outdated; resolved)
Contributor:

add specific version

Contributor:

we should organize using makefile in the future

Converts elements to token ids.
"""
token_ids = []
for elem in elements:
Contributor:

this loop would be slow, any chance we can vectorize this?

@cotran2 cotran2 requested a review from nehcgs August 26, 2024 21:40
@nehcgs (Contributor) commented Aug 26, 2024

@adilhafeez Thanks for making this PR!

Here are some changes I think you need to make:

  1. Remove all redundant files related to KFC-1B, including function_resolver/app/handler.py and function_resolver/app/template.py.
  2. You may also want to remove everything related to NER in the demo.
  3. Replace KFC-1B with Bolt-FC-1B; many files need to be updated.
  4. In function_resolver/app/main.py, replace Bolt-Function-Calling-1B:Q3_K_L with Bolt-Function-Calling-1B:Q4_K_M as a tentative option.
  5. Please check the function calling system design doc to add engineering support for function calling.

@adilhafeez adilhafeez marked this pull request as ready for review September 6, 2024 18:00
@junr03 (Collaborator) left a comment:

Some comments -- I did not read the non-Rust code. Let me know if you'd like me to.

Looks like lints and tests are failing

pub score: f64,
pub struct ToolParameters {
#[serde(rename = "type")]
pub parameters_type: String,
Collaborator:

why not enum? also why not simply name it type?

Author:

it should be enum, will update

pub parameter_type: Option<String>,
pub description: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub required: Option<bool>,
Collaborator:

Why option?

Author:

This is to say that the required field itself is optional, to enable the following use case: in this spec, when required is missing, the parameter is treated as optional. For example, city is required, while days and units are optional parameters.

    parameters:
      - name: city
        required: true
        description: The city for which the weather forecast is requested.
      - name: days
        description: The number of days for which the weather forecast is requested.
      - name: units
        description: The units in which the weather forecast is requested.

Collaborator:

Interesting API choice, instead of using required: false

pub model: String,
pub created_at: String,
Collaborator:

Why not chrono::DateTime?

Author:

updated to use chrono::DateTime


#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(untagged)]
pub enum IntOrString {
Collaborator:

IMO it is more descriptive to name the int and the string by names that tell us why someone would use a string or an int in this field.

}

pub mod open_ai {
use serde::{Deserialize, Serialize};

use super::ToolsDefinition;

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ChatCompletions {
#[serde(default)]
pub model: String,
Collaborator:

What is the difference between the model here and in the message?

debug!("function_resolver response str: {:?}", body_str);

let mut resp = serde_json::from_str::<FunctionCallingModelResponse>(&body_str).unwrap();
resp.resolver_name = Some(callout_context.prompt_target.as_ref().unwrap().name.clone());
Collaborator:

A lot of unwrap re-introduction all over the place in this PR. Is this the best error handling we can do?

info!("possibly some required parameters are missing, send back the response");
let resp_str = serde_json::to_string(&resp).unwrap();
self.send_http_response(
StatusCode::OK.as_u16().into(),
Collaborator:

So why would this be 200, if the response was invalid?

Author:

This part needs to be rewritten - will update this code block.

Author:

Rewrote this part; it's better now.

return;
}
};
let tool_call_details = _tool_call_details.unwrap();
Collaborator:

All of this block of code seems to be like this due to all the print line debugging, did you mean to clean this up?

Author:

cleaned it up

// add original user prompt
messages.push({
Message {
role: USER_ROLE.to_string(),
Collaborator:

should role and model be enums?

Author:

We don't want to restrict role and model to a fixed set of values. New models keep coming out, and supporting a new model would then require a change in the spec.

For Role I am still thinking about whether we should use an enum. I am leaning toward using an enum and listing in the spec all possible roles used by the API-based LLMs we support in open-message-format.

Collaborator:

With all the inline requests to local servers, I have been thinking that this code would benefit from some sort of strongly typed state machine, inspired by a common Rust pattern; examples:

  1. https://willcrichton.net/rust-api-type-patterns/state_machines.html
  2. https://hoverbear.org/blog/rust-state-machine-pattern/

Author:

@junr03 I've been thinking about it too. The orchestration inside the filter can be viewed and programmed as a state machine, and if done right, it becomes much easier to visualize and reason about.
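A minimal sketch of that typestate idea applied to a function-calling flow (state and method names invented for illustration): each state only exposes the transition that is valid next, so an out-of-order step is a compile error rather than a runtime bug.

```rust
// Hypothetical states for the gateway's function-calling flow.
struct AwaitingPrompt;

struct ResolvingFunction {
    prompt: String,
}

struct CallingUpstream {
    prompt: String,
    resolved_fn: String,
}

impl AwaitingPrompt {
    // Consuming `self` makes re-entering a past state impossible.
    fn receive_prompt(self, prompt: String) -> ResolvingFunction {
        ResolvingFunction { prompt }
    }
}

impl ResolvingFunction {
    fn resolve(self, resolved_fn: String) -> CallingUpstream {
        CallingUpstream { prompt: self.prompt, resolved_fn }
    }
}

impl CallingUpstream {
    fn dispatch(self) -> String {
        format!("calling {} for: {}", self.resolved_fn, self.prompt)
    }
}
```

Calling `dispatch()` before `resolve()` simply does not type-check, which replaces a whole class of runtime state assertions.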

@junr03 (Collaborator) left a comment:

Lots of the comments are outstanding. Nothing is truly a blocker, so I'll approve because I know you want to land this. But I believe addressing the comments will improve quality and readability.

@junr03 (Collaborator) commented Sep 10, 2024

Also -- lint and test are still not passing.

@adilhafeez adilhafeez merged commit 7b5203a into main Sep 10, 2024
3 checks passed
@junr03 junr03 deleted the adil/add_function_calling branch October 8, 2024 17:20
4 participants