
Context Shifting #4588

Closed
tmsingson opened this issue Nov 14, 2023 · 6 comments
Labels
enhancement New feature or request stale

Comments

@tmsingson

Description

About 10 days ago, KoboldCpp added a feature called Context Shifting which is supposed to greatly reduce reprocessing. Here is their official description of the feature:

NEW FEATURE: Context Shifting (A.K.A. EvenSmarterContext) - This feature utilizes KV cache shifting to automatically remove old tokens from context and add new ones without requiring any reprocessing. So long as you use no memory/fixed memory and don't use world info, you should be able to avoid almost all reprocessing between consecutive generations even at max context. This does not consume any additional context space, making it superior to SmartContext.

Any chance this gets added to Ooba as well?
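The idea behind KV cache shifting can be illustrated with a toy model (this is not KoboldCpp's or llama.cpp's actual implementation; the class and attribute names below are made up for illustration): when the context window fills up, the oldest tokens are evicted and the cached entries for the remaining tokens are shifted to new positions, so only genuinely new tokens incur processing cost.

```python
from collections import deque


class ShiftingContextCache:
    """Toy model of context shifting via KV cache shifting.

    When the context window is full, evict the oldest token and shift the
    cached positions of the survivors instead of reprocessing them. The
    counter `process_calls` stands in for expensive prompt processing:
    it only grows by one per *new* token, never for retained ones.
    """

    def __init__(self, max_ctx: int):
        self.max_ctx = max_ctx
        self.cache = deque()    # cached (token, position) entries
        self.process_calls = 0  # proxy for expensive per-token processing

    def append(self, tokens):
        for tok in tokens:
            if len(self.cache) == self.max_ctx:
                self.cache.popleft()            # drop the oldest token
                for i, (t, _) in enumerate(self.cache):
                    self.cache[i] = (t, i)      # shift positions; no reprocessing
            self.process_calls += 1             # only the new token is processed
            self.cache.append((tok, len(self.cache)))

    def tokens(self):
        return [t for t, _ in self.cache]


cache = ShiftingContextCache(max_ctx=4)
cache.append(list("abcdef"))
print(cache.tokens())         # ['c', 'd', 'e', 'f']
print(cache.process_calls)    # 6: one per new token, despite two evictions
```

Without shifting, every eviction would invalidate the cache and force reprocessing of the whole remaining window; here the retained tokens stay cached across consecutive generations, which is the reprocessing savings the feature description refers to.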

Additional Context

Reddit thread: https://www.reddit.com/r/LocalLLaMA/comments/17ni4hm/koboldcpp_v148_context_shifting_massively_reduced/
llama.cpp pull: ggerganov/llama.cpp#3228
kobold.cpp 1.48.1 release: https://github.com/LostRuins/koboldcpp/releases/tag/v1.48.1

@tmsingson tmsingson added the enhancement New feature or request label Nov 14, 2023
@github-actions github-actions bot added the stale label Dec 26, 2023

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

@kisenera

kisenera commented Jan 2, 2024

Is there any way to get this for Exl2?

@aarongerber

This was closed as stale. Did it ever get implemented @oobabooga? This is literally driving me to use KoboldCpp. As soon as you hit the context limit, Oobabooga becomes obnoxious in comparison. :/

@RichardFevrier

> Is there any way to get this for Exl2?

Wish to know too

@aarongerber

Thanks @oobabooga you rock

@kelheor

kelheor commented Nov 17, 2024

I remember when I started using this project (when it was just created) with pygmalion 6b a long time ago, it worked exactly like that: the AI's context was shifted and old info was forgotten without any slowdown. Then at some point it stopped working like that (maybe when exllama appeared). I wonder why that happened...
