Skip to content

One .NET library to consume OpenAI, Anthropic, Cohere, Google, Azure, Groq, and self-hosed APIs.

License

Notifications You must be signed in to change notification settings

lofcz/LlmTornado

Repository files navigation

LlmTornado LlmTornado.Contrib

🌪️ LLM Tornado - one .NET library to consume OpenAI, Anthropic, Cohere, Google, Azure, Groq, and self-hosted APIs.

At least one new large language model is released each month. Wouldn't it be awesome if using the new, shiny model was as easy as switching one argument? LLM Tornado acts as an aggregator allowing you to do just that. Think SearX but for LLMs!

OpenAI, Anthropic, Cohere, Google, Azure, and Groq (LLama 3, Mixtral, Gemma 2..) are currently supported along with KoboldCpp and Ollama.

The following video captures one conversation, running across OpenAI, Cohere, and Anthropic, with parallel tools calling & streaming:

2024-05-21.00-28-35.mp4

⭐ Try it, the code of the demo is here! Try asking for previously fetched information and verify the context is correctly constructed even when switching Providers mid-conversation.

⚡Getting Started

Install LLM Tornado via NuGet:

dotnet add package LlmTornado

Optional: extra features and quality of life extension methods are distributed in Contrib addon:

dotnet add package LlmTornado LlmTornado.Contrib

🪄 Quick Inference

Inferencing across multiple providers is as easy as changing the ChatModel argument. Tornado instance can be constructed with multiple API keys, the correct key is then used based on the model automatically:

TornadoApi api = new TornadoApi(new List<ProviderAuthentication>
{
    new ProviderAuthentication(LLmProviders.OpenAi, "OPEN_AI_KEY"),
    new ProviderAuthentication(LLmProviders.Anthropic, "ANTHROPIC_KEY"),
    new ProviderAuthentication(LLmProviders.Cohere, "COHERE_KEY"),
    new ProviderAuthentication(LLmProviders.Google, "GOOGLE_KEY"),
    new ProviderAuthentication(LLmProviders.Groq, "GROQ_KEY")
});

List<ChatModel> models = [
    ChatModel.OpenAi.Gpt4.Turbo, ChatModel.Anthropic.Claude3.Sonnet, 
    ChatModel.Cohere.Command.RPlus, ChatModel.Google.Gemini.Gemini15Flash, 
    ChatModel.Groq.Meta.Llama370B 
];

foreach (ChatModel model in models)
{
    string? response = await api.Chat.CreateConversation(model)
        .AppendSystemMessage("You are a fortune teller.")
        .AppendUserInput("What will my future bring?")
        .GetResponse();

    Console.WriteLine(response);
}

🔮 Custom Providers

Instead of consuming commercial APIs, one can roll their own inference servers easily with a myriad of tools available. Here is a simple demo for streaming response with Ollama, but the same approach can be used for any custom provider:

public static async Task OllamaStreaming()
{
    TornadoApi api = new TornadoApi(new Uri("http://localhost:11434")); // default Ollama port
    
    await api.Chat.CreateConversation(new ChatModel("falcon3:1b")) // <-- replace with your model
        .AppendUserInput("Why is the sky blue?")
        .StreamResponse(Console.Write);
}
clip.mp4

🔎 Advanced Inference

Streaming

Tornado offers several levels of abstraction, trading more details for more complexity. The simple use cases where only plaintext is needed can be represented in a terse format.

await api.Chat.CreateConversation(ChatModel.Anthropic.Claude3.Sonnet)
    .AppendSystemMessage("You are a fortune teller.")
    .AppendUserInput("What will my future bring?")
    .StreamResponse(Console.Write);

Tools with deferred resolve

When plaintext is insufficient, switch to GetResponseRich() or StreamResponseRich() APIs. Tools requested by the model can be resolved later and never returned to the model. This is useful in scenarios where we use the tools without intending to continue the conversation.

Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.OpenAi.Gpt4.Turbo,
    Tools = new List<Tool>
    {
        new Tool
        {
            Function = new ToolFunction("get_weather", "gets the current weather")
        }
    },
    ToolChoice = new OutboundToolChoice(OutboundToolChoiceModes.Required)
});

chat.AppendUserInput("Who are you?"); // user asks something unrelated, but we force the model to use the tool
ChatRichResponse response = await chat.GetResponseRich(); // the response contains one block of type Function

GetResponseRichSafe() API is also available, which is guaranteed not to throw on the network level. The response is wrapped in a network-level wrapper, containing additional information. For production use cases, either use try {} catch {} on all the HTTP request producing Tornado APIs, or use the safe APIs.

Tools with immediate resolve

Tools requested by the model can also be resolved and the results returned immediately. This has the benefit of automatically continuing the conversation.

Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.OpenAi.Gpt4.O,
    Tools =
    [
        new Tool(new ToolFunction("get_weather", "gets the current weather", new
        {
            type = "object",
            properties = new
            {
                location = new
                {
                    type = "string",
                    description = "The location for which the weather information is required."
                }
            },
            required = new List<string> { "location" }
        }))
    ]
})
.AppendSystemMessage("You are a helpful assistant")
.AppendUserInput("What is the weather like today in Prague?");

ChatStreamEventHandler handler = new ChatStreamEventHandler
{
  MessageTokenHandler = (x) =>
  {
      Console.Write(x);
      return Task.CompletedTask;
  },
  FunctionCallHandler = (calls) =>
  {
      calls.ForEach(x => x.Result = new FunctionResult(x, "A mild rain is expected around noon.", null));
      return Task.CompletedTask;
  },
  AfterFunctionCallsResolvedHandler = async (results, handler) => { await chat.StreamResponseRich(handler); }
};

await chat.StreamResponseRich(handler);

REPL

This interactive demo can be expanded into an end-user-facing interface in the style of ChatGPT. Shows how to use strongly typed tools together with streaming and resolve parallel tool calls. ChatStreamEventHandler is a convenient class allowing subscription to only the events your use case needs.

public static async Task OpenAiFunctionsStreamingInteractive()
{
    // 1. set up a sample tool using a strongly typed model
    ChatPluginCompiler compiler = new ChatPluginCompiler();
    compiler.SetFunctions([
        new ChatPluginFunction("get_weather", "gets the current weather in a given city", [
            new ChatFunctionParam("city_name", "name of the city", ChatPluginFunctionAtomicParamTypes.String)
        ])
    ]);
    
    // 2. in this scenario, the conversation starts with the user asking for the current weather in two of the supported cities.
    // we can try asking for the weather in the third supported city (Paris) later.
    Conversation chat = api.Chat.CreateConversation(new ChatRequest
    {
        Model = ChatModel.OpenAi.Gpt4.Turbo,
        Tools = compiler.GetFunctions()
    }).AppendUserInput("Please call functions get_weather for Prague and Bratislava (two function calls).");

    // 3. repl
    while (true)
    {
        // 3.1 stream the response from llm
        await StreamResponse();

        // 3.2 read input
        while (true)
        {
            Console.WriteLine();
            Console.Write("> ");
            string? input = Console.ReadLine();

            if (input?.ToLowerInvariant() is "q" or "quit")
            {
                return;
            }
            
            if (!string.IsNullOrWhiteSpace(input))
            {
                chat.AppendUserInput(input);
                break;
            }
        }
    }

    async Task StreamResponse()
    {
        await chat.StreamResponseRich(new ChatStreamEventHandler
        {
            MessageTokenHandler = async (token) =>
            {
                Console.Write(token);
            },
            FunctionCallHandler = async (fnCalls) =>
            {
                foreach (FunctionCall x in fnCalls)
                {
                    if (!x.TryGetArgument("city_name", out string? cityName))
                    {
                        x.Result = new FunctionResult(x, new
                        {
                            result = "error",
                            message = "expected city_name argument"
                        }, null, true);
                        continue;
                    }

                    x.Result = new FunctionResult(x, new
                    {
                        result = "ok",
                        weather = cityName.ToLowerInvariant() is "prague" ? "A mild rain" : cityName.ToLowerInvariant() is "paris" ? "Foggy, cloudy" : "A sunny day"
                    }, null, true);
                }
            },
            AfterFunctionCallsResolvedHandler = async (fnResults, handler) =>
            {
                await chat.StreamResponseRich(handler);
            }
        });
    }
}

Other endpoints such as Images, Embedding, Speech, Assistants, Threads and Vision are also supported!
Check the links for simple to-understand examples!

Why Tornado?

  • 25,000+ installs on NuGet under previous names Lofcz.Forks.OpenAI, OpenAiNg.
  • Used in commercial projects incurring charges of thousands of dollars monthly.
  • The license will never change. Looking at you HashiCorp and Tiny.
  • Supports streaming, functions/tools, modalities (images, audio), and strongly typed LLM plugins/connectors.
  • Great performance, nullability annotations.
  • Extensive tests suite.
  • Maintained actively for over a year.

Documentation

Every public class, method, and property has extensive XML documentation, using LLM Tornado should be intuitive if you've used any other LLM library previously. Feel free to open an issue here if you have any questions.

PRs are welcome!

License

💜 This library is licensed under the MIT license.