[WIP] refactor: init some experimental refactoring. #362
Conversation
These two will probably be handled by my ongoing work to redesign the executors. My plan is a new executor, built on a single batch, which can have multiple ongoing "conversations" that can all be updated in one single batch.
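For illustration, here is a rough sketch of how such a batched executor might be driven. Every name in it is hypothetical (including `BatchedExecutor`, `CreateConversation`, and the `weights`/`contextParams` variables), since the redesign was still in progress at this point:

```csharp
// Hypothetical usage of the planned batched executor (all names are
// guesses, not a final API). Several conversations share one batch,
// so a single decode call advances all of them together.
using var executor = new BatchedExecutor(weights, contextParams);

var first = executor.CreateConversation();
var second = executor.CreateConversation();

first.Prompt("Write a haiku about spring.");
second.Prompt("Write a haiku about autumn.");

// One batched decode updates every ongoing conversation at once.
await executor.Infer();
```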
Personally I would be against this, I think we should keep LLamaSharp as close to just being a wrapper around llama.cpp as possible. However, we should definitely make sure the library is flexible enough that something like SK can be integrated with it!
```csharp
/// <summary>
/// Set a custom generation control to use. <b>If this is set antiprompt will be ignored!</b>
/// </summary>
IGenerationControl GenerationControl { get; set; }
```
Suggested change:
```diff
- IGenerationControl GenerationControl { get; set; }
+ IGenerationControl? GenerationControl { get; set; }
```
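As an illustration of what this interface enables, here is a minimal custom control. It assumes the `ShouldStopGeneration` signature shown further down in this diff and that `IGenerationControl` lives in `LLama.Control`; the class name and its token-budget policy are invented for the example:

```csharp
using System.Collections.Generic;
using System.Linq;
using LLama;
using LLama.Abstractions;
using LLama.Control;

// Hypothetical custom control: stop after a fixed token budget instead
// of on antiprompts. A sketch against the interface proposed in this PR.
public sealed class MaxTokensControl : IGenerationControl
{
    private readonly int _maxTokens;
    private int _seen;

    public MaxTokensControl(int maxTokens) => _maxTokens = maxTokens;

    public bool ShouldStopGeneration(LLamaContext context, IInferenceParams inferenceParams, IEnumerable<int> lastOutputIds)
    {
        // Count tokens as they stream past and stop once the budget is hit.
        _seen += lastOutputIds.Count();
        return _seen >= _maxTokens;
    }
}
```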
```csharp
/// <summary>
/// Set a custom tokenizer to use.
/// </summary>
ITokenizer Tokenizer { get; set; }
```
Suggested change:
```diff
- ITokenizer Tokenizer { get; set; }
+ ITokenizer? Tokenizer { get; set; }
```
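Similarly, a custom tokenizer under this proposal would implement `ITokenizer`. A minimal sketch, assuming the `Tokenize` signature shown later in this diff and a symmetric `Detokenize` counterpart (the exact interface is still in flux in this PR):

```csharp
using System.Collections.Generic;
using LLama;
using LLama.Transform;

// Hypothetical tokenizer that delegates to the context's own tokenization;
// a real implementation could pre-process the text (e.g. normalization)
// before tokenizing. All delegated calls below are assumptions.
public sealed class PassthroughTokenizer : ITokenizer
{
    public IEnumerable<int> Tokenize(LLamaContext context, string text, bool addBos = true, bool special = false)
        => context.Tokenize(text, addBos, special); // LLamaContext.Tokenize signature assumed

    public string Detokenize(LLamaContext context, IEnumerable<int> tokens)
        => context.DeTokenize(new List<int>(tokens)); // name and signature assumed
}
```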
```csharp
namespace LLama.Control
{
    /// <summary>
    /// The default generation control in LLamaSharp, using antiprompts. This class should not be inherited.
```
Suggested change:
```diff
- /// The default generation control in LLamaSharp, using antiprompts. This class should not be inherited.
+ /// The default generation control in LLamaSharp, using antiprompts.
```
It's sealed, so it's not possible to extend this class.
```csharp
{
    /// <summary>
    /// The default generation control in LLamaSharp, using antiprompts. This class should not be inherited.
    /// <b>Note that this class has state. The previous outputs feeded to it will affect its control.</b>
```
Suggested change:
```diff
- /// <b>Note that this class has state. The previous outputs feeded to it will affect its control.</b>
+ /// <b>Note that this class has state. The previous outputs fed to it will affect its output.</b>
```
```csharp
    /// </summary>
    public bool ShouldStopGeneration(LLamaContext context, IInferenceParams inferenceParams, IEnumerable<int> lastOutputIds)
    {
        return false;
```
should this be returning false?
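For contrast, an antiprompt-based check would presumably look something like the fragment below. It is written here purely to illustrate the question, not taken from this PR; the `DeTokenize` helper on `LLamaContext` is assumed:

```csharp
// Illustrative fragment (not from this PR): stop generation when the
// latest output ends with any configured antiprompt. Assumes a DeTokenize
// helper on LLamaContext and the AntiPrompts list on IInferenceParams.
public bool ShouldStopGeneration(LLamaContext context, IInferenceParams inferenceParams, IEnumerable<int> lastOutputIds)
{
    var tail = context.DeTokenize(lastOutputIds.ToList()); // assumed helper
    foreach (var antiprompt in inferenceParams.AntiPrompts)
    {
        if (tail.EndsWith(antiprompt, StringComparison.Ordinal))
            return true;
    }
    return false;
}
```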
```csharp
namespace LLama.Transform
{
    /// <summary>
    /// The default tokenizer of LLamaSharp. This class should not be inherited.
```
Suggested change:
```diff
- /// The default tokenizer of LLamaSharp. This class should not be inherited.
+ /// The default tokenizer of LLamaSharp.
```
```csharp
{
    /// <summary>
    /// The default tokenizer of LLamaSharp. This class should not be inherited.
    /// <b>Note that this class has state. The previous outputs feeded to it will affect its control.</b>
```
Suggested change:
```diff
- /// <b>Note that this class has state. The previous outputs feeded to it will affect its control.</b>
+ /// <b>Note that this class has state. The previous outputs fed to it will affect its output.</b>
```
```csharp
    /// <summary>
    /// <inheritdoc/>
    /// </summary>
    public IEnumerable<int> Tokenize(LLamaContext context, string text, bool addBos = true, bool special = false)
```
I think it would be better to accept a LLamaWeights in the constructor, instead of a LLamaContext in the Tokenize/Detokenize methods. That simplifies usage and allows you to use the tokenizer without creating an entire context (which is quite memory hungry).
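A minimal sketch of that shape, assuming the interface drops the context parameter; the class name and the `NativeHandle.Tokenize` signature below are assumptions for illustration:

```csharp
using System.Collections.Generic;
using System.Text;
using LLama;

// Hypothetical tokenizer that holds only the weights, so callers can
// tokenize without paying for a full LLamaContext. Tokenization needs
// just the model's vocabulary, which lives in the weights.
public sealed class WeightsTokenizer
{
    private readonly LLamaWeights _weights;

    public WeightsTokenizer(LLamaWeights weights) => _weights = weights;

    public IEnumerable<int> Tokenize(string text, bool addBos = true, bool special = false)
        => _weights.NativeHandle.Tokenize(text, addBos, special, Encoding.UTF8); // signature assumed
}
```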
I'll change it, thank you!
Is there already a way to tokenize without a context? I didn't find such a method.
```csharp
LLamaWeights w = your_weights;
w.NativeHandle.Tokenize("a string");
```
Should do it. There should really be higher level wrappers for tokenization in LLamaWeights, so that you don't have to access the NativeHandle, but I haven't built them yet.
I see, I mistook the one in NativeApi that takes a context parameter for the lowest-level API. I'll add a wrapper for it with LLamaWeights. :)
```diff
 {
     /// <summary>
     /// Decodes a stream of tokens into a stream of characters
     /// </summary>
     public sealed class StreamingTokenDecoder
     {
-        private readonly SafeLlamaModelHandle _weights;
+        private readonly SafeLlamaModelHandle? _weights;
```
I don't think any of the changes in this class are necessary if we go with the suggested change to the DefaultTokenizer
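For reference, this is roughly how the decoder is driven: push tokens in as they are sampled and read out whatever complete characters are available. The `Add`/`Read` names are taken from the existing class, though exact signatures may differ by version, and `sampledTokens` is an assumed variable:

```csharp
// Rough usage sketch for StreamingTokenDecoder: streaming the decode
// avoids emitting broken text when one character spans several tokens.
var decoder = new StreamingTokenDecoder(context);
foreach (var token in sampledTokens)
{
    decoder.Add(token);
    Console.Write(decoder.Read()); // returns decoded text and clears the buffer
}
```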
Overall I really like the changes 1, 2 and 3. Can we split them into 3 separate PRs so they can proceed separately?
Sure, I'll separate them into 3 PRs. Besides, there are some proposals I forgot to mention above. Also, there are some features that give me a headache; I don't mean to complete them in v1.0.0, but just to include them in the discussion.
This PR refactors the implementations of LLamaSharp, and is targeted for v1.0.0.
I pushed some changes but have not completed them yet. This PR is currently more of a proposal; there is no promise that it compiles or passes tests.
Currently I've made the following changes:

1. Added `IGenerationControl` to allow customized control of when generation stops, instead of just anti-prompts.
2. Added `ITokenizer` to allow a customized tokenizer.
3. Added `TextCompletion`, which corresponds to `ChatSession`, to provide some further wrapping for text completion tasks (a rough sketch follows this list). Related issues: Create HTTP API server and provide API like OAI #269, Garbled output from model in Unity #178.
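To make the third change concrete, here is a minimal sketch of what a `TextCompletion` wrapper might look like. Its shape is an assumption for illustration, since this PR doesn't show the final design; it builds on the existing `ILLamaExecutor.InferAsync`:

```csharp
using System.Collections.Generic;
using LLama.Abstractions;

// Hypothetical TextCompletion wrapper: the text-completion counterpart
// of ChatSession, delegating to any executor. Shape assumed, not final.
public class TextCompletion
{
    private readonly ILLamaExecutor _executor;

    public TextCompletion(ILLamaExecutor executor) => _executor = executor;

    // Streams completion text for a prompt, one chunk at a time.
    public IAsyncEnumerable<string> CompleteAsync(string prompt, IInferenceParams? inferenceParams = null)
        => _executor.InferAsync(prompt, inferenceParams);
}
```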
Some other proposals:

- Make `InstructExecutor` obsolete because it's only one of the modes of text completion.
- Refactor `StatelessExecutor` again.

For discussion:
I'm wondering if we should introduce `Microsoft.SemanticKernel.Abstractions` to LLamaSharp, instead of just using it in the SK integration. There are reasons in favour of it; however, I also have some worries about it.
@martindevans @SignalRT @saddam213 May I ask for some ideas from you? This is not a formal PR so please feel free to talk about anything.