🌪️ LLM Tornado - one .NET library to consume OpenAI, Anthropic, Cohere, Google, Azure, Groq, and self-hosted APIs.
At least one new large language model is released each month. Wouldn't it be awesome if using the new, shiny model was as easy as switching one argument? LLM Tornado acts as an aggregator allowing you to do just that. Think SearX but for LLMs!
OpenAI, Anthropic, Cohere, Google, Azure, and Groq (Llama 3, Mixtral, Gemma 2, ...) are currently supported, along with KoboldCpp and Ollama.
The following video captures one conversation running across OpenAI, Cohere, and Anthropic, with parallel tool calling and streaming:
2024-05-21.00-28-35.mp4
⭐ Try it: the code of the demo is here! Ask for previously fetched information and verify that the context is constructed correctly even when switching providers mid-conversation.
Install LLM Tornado via NuGet:
dotnet add package LlmTornado
Optional: extra features and quality-of-life extension methods are distributed in the Contrib addon:
dotnet add package LlmTornado.Contrib
Inferencing across multiple providers is as easy as changing the ChatModel argument. A Tornado instance can be constructed with multiple API keys; the correct key is then selected automatically based on the model:
TornadoApi api = new TornadoApi(new List<ProviderAuthentication>
{
    new ProviderAuthentication(LLmProviders.OpenAi, "OPEN_AI_KEY"),
    new ProviderAuthentication(LLmProviders.Anthropic, "ANTHROPIC_KEY"),
    new ProviderAuthentication(LLmProviders.Cohere, "COHERE_KEY"),
    new ProviderAuthentication(LLmProviders.Google, "GOOGLE_KEY"),
    new ProviderAuthentication(LLmProviders.Groq, "GROQ_KEY")
});

List<ChatModel> models = [
    ChatModel.OpenAi.Gpt4.Turbo, ChatModel.Anthropic.Claude3.Sonnet,
    ChatModel.Cohere.Command.RPlus, ChatModel.Google.Gemini.Gemini15Flash,
    ChatModel.Groq.Meta.Llama370B
];

foreach (ChatModel model in models)
{
    string? response = await api.Chat.CreateConversation(model)
        .AppendSystemMessage("You are a fortune teller.")
        .AppendUserInput("What will my future bring?")
        .GetResponse();

    Console.WriteLine(response);
}
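The string literals above are placeholders; in practice the keys usually come from configuration rather than source code. Below is a minimal sketch, assuming the keys live in environment variables. The `KeyLoader` helper and the variable names are hypothetical and not part of LLM Tornado.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of LLM Tornado): collects API keys from
// environment variables so they do not have to be hardcoded.
public static class KeyLoader
{
    // Returns variable name -> value for every listed variable that is set
    // and non-blank; unset or blank variables are skipped.
    public static Dictionary<string, string> LoadKeys(IEnumerable<string> variableNames)
    {
        var keys = new Dictionary<string, string>();

        foreach (string name in variableNames)
        {
            var value = Environment.GetEnvironmentVariable(name);

            if (!string.IsNullOrWhiteSpace(value))
            {
                keys[name] = value;
            }
        }

        return keys;
    }
}
```

Each value found this way can be passed as the second argument of `new ProviderAuthentication(...)`, so a provider is only registered when its key is actually present.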
Instead of consuming commercial APIs, you can easily run your own inference server with one of the many tools available. Here is a simple demo of streaming a response with Ollama; the same approach works for any custom provider:
public static async Task OllamaStreaming()
{
    TornadoApi api = new TornadoApi(new Uri("http://localhost:11434")); // default Ollama port

    await api.Chat.CreateConversation(new ChatModel("falcon3:1b")) // <-- replace with your model
        .AppendUserInput("Why is the sky blue?")
        .StreamResponse(Console.Write);
}
Tornado offers several levels of abstraction, trading more detail for more complexity. Simple use cases where only plaintext is needed can be written tersely:
await api.Chat.CreateConversation(ChatModel.Anthropic.Claude3.Sonnet)
    .AppendSystemMessage("You are a fortune teller.")
    .AppendUserInput("What will my future bring?")
    .StreamResponse(Console.Write);
When plaintext is insufficient, switch to the GetResponseRich() or StreamResponseRich() APIs. Tools requested by the model can be resolved later, and their results never have to be returned to the model. This is useful when we use the tools without intending to continue the conversation:
Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.OpenAi.Gpt4.Turbo,
    Tools = new List<Tool>
    {
        new Tool
        {
            Function = new ToolFunction("get_weather", "gets the current weather")
        }
    },
    ToolChoice = new OutboundToolChoice(OutboundToolChoiceModes.Required)
});

chat.AppendUserInput("Who are you?"); // user asks something unrelated, but we force the model to use the tool
ChatRichResponse response = await chat.GetResponseRich(); // the response contains one block of type Function
The GetResponseRichSafe() API is also available and is guaranteed not to throw on the network level. The response is wrapped in a network-level wrapper containing additional information. For production use cases, either wrap every Tornado API that produces an HTTP request in try {} catch {}, or use the safe APIs.
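The try {} catch {} route can be factored into one small helper so that call sites stay short. The sketch below is a generic wrapper, not how GetResponseRichSafe() is implemented; `SafeCall` and `TryAsync` are hypothetical names introduced here for illustration.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;

// Hypothetical helper (not part of LLM Tornado): runs an async call and
// captures any exception instead of letting it propagate to the caller.
public static class SafeCall
{
    // On success, Value holds the result and Error is null.
    // On failure, Value is default(T) and Error holds the caught exception.
    public static async Task<(T? Value, Exception? Error)> TryAsync<T>(Func<Task<T>> call)
    {
        try
        {
            T value = await call();
            return (value, null);
        }
        catch (Exception ex)
        {
            return (default, ex);
        }
    }
}
```

With the conversation from the previous example, a call could then look like `var (response, error) = await SafeCall.TryAsync(async () => await chat.GetResponseRich());`, followed by a check of `error` before `response` is used.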
Tools requested by the model can also be resolved and the results returned immediately. This has the benefit of automatically continuing the conversation.
Conversation chat = api.Chat.CreateConversation(new ChatRequest
    {
        Model = ChatModel.OpenAi.Gpt4.O,
        Tools =
        [
            new Tool(new ToolFunction("get_weather", "gets the current weather", new
            {
                type = "object",
                properties = new
                {
                    location = new
                    {
                        type = "string",
                        description = "The location for which the weather information is required."
                    }
                },
                required = new List<string> { "location" }
            }))
        ]
    })
    .AppendSystemMessage("You are a helpful assistant")
    .AppendUserInput("What is the weather like today in Prague?");

ChatStreamEventHandler handler = new ChatStreamEventHandler
{
    MessageTokenHandler = (x) =>
    {
        Console.Write(x);
        return Task.CompletedTask;
    },
    FunctionCallHandler = (calls) =>
    {
        calls.ForEach(x => x.Result = new FunctionResult(x, "A mild rain is expected around noon.", null));
        return Task.CompletedTask;
    },
    AfterFunctionCallsResolvedHandler = async (results, handler) => { await chat.StreamResponseRich(handler); }
};

await chat.StreamResponseRich(handler);
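The handler above answers every tool call with the same canned string. With more than one tool, results are typically looked up by tool name. A minimal, library-independent sketch of such a lookup follows; `ToolRegistry` is a hypothetical class, and it assumes the tool name and raw arguments are available as strings.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of LLM Tornado): maps tool names to handlers
// that take the raw arguments and return the tool result as text.
public sealed class ToolRegistry
{
    private readonly Dictionary<string, Func<string, string>> handlers = new(StringComparer.Ordinal);

    // Registers a handler, replacing any existing handler with the same name.
    public void Register(string toolName, Func<string, string> handler) => handlers[toolName] = handler;

    // Runs the handler registered for the tool; returns an error message that
    // can be sent back to the model when the tool is unknown.
    public string Resolve(string toolName, string argumentsJson)
    {
        return handlers.TryGetValue(toolName, out Func<string, string>? handler)
            ? handler(argumentsJson)
            : $"error: unknown tool '{toolName}'";
    }
}
```

Inside FunctionCallHandler, the string returned by `Resolve` could be wrapped in a FunctionResult as shown above. How a FunctionCall exposes its name and raw arguments is not covered in this README, so check the XML documentation before wiring this in.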
This interactive demo can be expanded into an end-user-facing interface in the style of ChatGPT. It shows how to use strongly typed tools together with streaming and how to resolve parallel tool calls. ChatStreamEventHandler is a convenient class that lets you subscribe to only the events your use case needs.
public static async Task OpenAiFunctionsStreamingInteractive()
{
    // note: `api` is assumed to be a TornadoApi instance available in the enclosing scope

    // 1. set up a sample tool using a strongly typed model
    ChatPluginCompiler compiler = new ChatPluginCompiler();
    compiler.SetFunctions([
        new ChatPluginFunction("get_weather", "gets the current weather in a given city", [
            new ChatFunctionParam("city_name", "name of the city", ChatPluginFunctionAtomicParamTypes.String)
        ])
    ]);

    // 2. in this scenario, the conversation starts with the user asking for the current weather in two of the supported cities.
    // we can try asking for the weather in the third supported city (Paris) later.
    Conversation chat = api.Chat.CreateConversation(new ChatRequest
    {
        Model = ChatModel.OpenAi.Gpt4.Turbo,
        Tools = compiler.GetFunctions()
    }).AppendUserInput("Please call functions get_weather for Prague and Bratislava (two function calls).");

    // 3. repl
    while (true)
    {
        // 3.1 stream the response from llm
        await StreamResponse();

        // 3.2 read input; "q" or "quit" exits, blank input is ignored
        while (true)
        {
            Console.WriteLine();
            Console.Write("> ");
            string? input = Console.ReadLine();

            if (input?.ToLowerInvariant() is "q" or "quit")
            {
                return;
            }

            if (!string.IsNullOrWhiteSpace(input))
            {
                chat.AppendUserInput(input);
                break;
            }
        }
    }

    async Task StreamResponse()
    {
        await chat.StreamResponseRich(new ChatStreamEventHandler
        {
            MessageTokenHandler = async (token) =>
            {
                Console.Write(token);
            },
            FunctionCallHandler = async (fnCalls) =>
            {
                foreach (FunctionCall x in fnCalls)
                {
                    // report a missing argument back to the model instead of failing
                    if (!x.TryGetArgument("city_name", out string? cityName))
                    {
                        x.Result = new FunctionResult(x, new
                        {
                            result = "error",
                            message = "expected city_name argument"
                        }, null, true);
                        continue;
                    }

                    x.Result = new FunctionResult(x, new
                    {
                        result = "ok",
                        weather = cityName.ToLowerInvariant() switch
                        {
                            "prague" => "A mild rain",
                            "paris" => "Foggy, cloudy",
                            _ => "A sunny day"
                        }
                    }, null, true);
                }
            },
            // once all tool calls are resolved, continue the conversation with the same handler
            AfterFunctionCallsResolvedHandler = async (fnResults, handler) =>
            {
                await chat.StreamResponseRich(handler);
            }
        });
    }
}
Other endpoints such as Images, Embedding, Speech, Assistants, Threads and Vision are also supported!
Check the links for simple-to-understand examples!
- 25,000+ installs on NuGet under previous names Lofcz.Forks.OpenAI, OpenAiNg.
- Used in commercial projects incurring charges of thousands of dollars monthly.
- The license will never change. Looking at you, HashiCorp and Tiny.
- Supports streaming, functions/tools, modalities (images, audio), and strongly typed LLM plugins/connectors.
- Great performance, nullability annotations.
- Extensive test suite.
- Maintained actively for over a year.
Every public class, method, and property has extensive XML documentation. Using LLM Tornado should be intuitive if you've used any other LLM library previously. Feel free to open an issue here if you have any questions.
PRs are welcome!
💜 This library is licensed under the MIT license.