Huggingface GGUF Editor #9268
-
I have just added support for easily editing individual tokenizer.ggml.scores/token_type entries, with quick lookup of every token, making this the most advanced GGUF Editor currently available. :) This comes in really handy when models are released with incorrect metadata, such as the recent Yi-Coder models, which had an incorrect token_type/score for the <|im_start|> token (as well as an incorrect EOS token). There are still GGUFs with the wrong metadata available...
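For anyone unfamiliar with what these entries look like: tokenizer.ggml.tokens, tokenizer.ggml.scores and tokenizer.ggml.token_type are parallel arrays, one entry per token. Below is a rough sketch of what fixing a single token_type entry amounts to. The numeric values follow llama.cpp's token type constants as I understand them; the helper name `set_token_type`, the toy vocabulary, and the assumption that `<|im_start|>` should be a CONTROL token are purely illustrative and not taken from the editor's code.

```python
from enum import IntEnum


class TokenType(IntEnum):
    # Values as used for tokenizer.ggml.token_type (assumed from llama.cpp's constants).
    NORMAL = 1
    UNKNOWN = 2
    CONTROL = 3
    USER_DEFINED = 4
    UNUSED = 5
    BYTE = 6


def set_token_type(tokens, token_types, token, new_type):
    """Return a copy of token_types with the entry for `token` replaced.

    Hypothetical helper: `tokens` and `token_types` are parallel lists.
    """
    idx = tokens.index(token)  # raises ValueError if the token is absent
    updated = list(token_types)
    updated[idx] = int(new_type)
    return updated


# Toy vocabulary, not real model data.
tokens = ["<unk>", "hello", "<|im_start|>"]
token_types = [TokenType.UNKNOWN, TokenType.NORMAL, TokenType.NORMAL]

# Assumption: the special chat token should be marked CONTROL, not NORMAL.
fixed = set_token_type(tokens, token_types, "<|im_start|>", TokenType.CONTROL)
print(fixed)
```

The original list is left untouched, so the old and new values can be compared before anything is written back.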
-
Unfortunately there was an oversight that could cause unaligned tensors; this should now be fixed. If you experienced nonsensical results from an edited GGUF, please grab a new copy. Sorry for the inconvenience! :(
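For context on why alignment matters: the tensor data section of a GGUF has to start on a multiple of the alignment value (general.alignment, 32 bytes when the key is absent), so whenever an edit changes the size of the metadata, the padding in front of the tensor data has to be recomputed. A minimal sketch of that calculation follows; the function names are hypothetical and this is not the editor's actual code.

```python
DEFAULT_ALIGNMENT = 32  # GGUF default when general.alignment is not set


def align_offset(offset: int, alignment: int = DEFAULT_ALIGNMENT) -> int:
    """Round `offset` up to the next multiple of `alignment`."""
    return offset + (alignment - offset % alignment) % alignment


def padding_needed(offset: int, alignment: int = DEFAULT_ALIGNMENT) -> int:
    """Number of padding bytes to write before aligned data can start."""
    return align_offset(offset, alignment) - offset


# Example: metadata that ends at byte 1000 needs padding up to byte 1024.
print(align_offset(1000), padding_needed(1000))
```

If the padding is left at its old length after the metadata grows or shrinks, every tensor is read from a shifted position, which would be consistent with the nonsensical output described above.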
-
Thanks for the FOSS! It looks nice and useful.
GGUFs are usually monolithic up to some size; beyond that they are usually (but not necessarily) split into roughly equal chunks of up to several gigabytes each. I haven't verified the file format, but I assume the metadata sits right at the beginning of the first file, or at least in a small section somewhere in the set of files. So any metadata change by definition modifies only a tiny fraction of the GGUF file or file set.
That makes it easy to inspect and modify that tiny piece online in the HF Space UI, but as far as I can tell from what little I know so far, you then have to download at least one large chunk of a split GGUF, or the entire single file, to obtain a copy of the tiny metadata change. If that's true, it might be nice to have some way(s) to apply the change without that download, in any combination of the following:
A: Generate / download instructions or a script that uses the standard llama.cpp local-copy GGUF metadata editor CLI to edit / replace the metadata according to the selected change(s). That way one only has to download a few lines of code or instructions and can apply the desired change without also downloading gigabytes of unnecessary GGUF data.
B: Generate / download some kind of "binary patch" file or script that encodes just the binary differences between the original GGUF and the modified one, so that it can be applied with any standard patching tool (there are plenty of common ones, for unix but also cross-platform). A custom patch script would also work if there's a reasonable use case for it, though I'd avoid that in favor of an existing purpose-built patch tool. Then one downloads a tiny patch file and has a straightforward way to apply the differences to a GGUF that is already downloaded.
C: Maybe this is already possible, but if so it is unknown to me and maybe to others, so discussing the right techniques might help many users. Say there is a workflow where, instead of downloading the modified GGUF, it is modified inside or saved to one's own (or a writable) HF repo / space. That's where you put a copy of the original GGUF, and you basically just make a "small delta" to it to create a new git version that is 99.999% the same except for the metadata block. To download that efficiently from your repo, it would be ideal to use a file transfer / synchronization protocol like rsync that copies only the changed portion (approximately) and quickly realizes it does not have to download the remaining unchanged N GB. That would involve some kind of transfer protocol or software that is aware of binary deltas (or git LFS commit content deltas...?). IDK whether "git lfs pull" does this by default or can optionally be made to pull small updates to large binary files efficiently, but if there is a git LFS based way it probably deserves promotion / discussion, since I may not be the only nescient one. And if the HF CLI / UI supports "scp", "rsync" or some other means of efficiently copying small updates to large binary files, that'd be great to know about too.
D: This is possibly relevant to the editing workflow, but also much more widely to the way GGUFs are made and published. If the metadata section is tiny and the rest of the data is huge, it's nice to be able to edit / send / update the metadata independently. GGUF already gained the ability to be split, with a variable amount of data in each piece. So why not generate a GGUF that is split so that, for example, somefile-00001-of-00099.gguf contains essentially only the metadata (and maybe other small, highly relevant header content) and is tiny? The other split files would then hold the pieces of the model data, which are large and far less likely to need fine-grained edits or updates.
E: Of course, if a HF Space's software is open source, it's possible to fork / reuse it. So maybe your GGUF editor GUI can already be derived from, but the thought crosses my mind that if it is a simple-ish gradio UI on top of a Python (or whatever) script, then maybe 70% or so of it could be reused to make a very similar metadata editor UI that people can download and use locally, to augment or replace the llama.cpp metadata editing / inspection CLI. I assume that would turn into a GitHub project or maybe a llama.cpp pull request to extend the metadata / GGUF editing UIs. Still, it may be something to think about if the user-friendly UX could be extended to local use cases as well. I'm not personally proposing a fork or project, just sharing a concept FWIW.
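On the assumption about where the metadata lives: in the GGUF layout as I understand it, the file starts with a small fixed header (magic, version, tensor count, metadata key/value count), followed by the metadata key/value pairs, then the tensor descriptions, and only then the aligned tensor data. Here is a minimal sketch that decodes just that fixed header from the first 24 bytes. It assumes a little-endian file of version 2 or later (where both counts are 64-bit); `read_gguf_header` is a hypothetical helper and the sample bytes are made up.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4 (magic) + 4 (version) + 8 (tensor count) + 8 (KV count)


def read_gguf_header(buf: bytes) -> dict:
    """Decode the fixed-size GGUF header from the start of `buf`.

    Assumes little-endian encoding and GGUF version >= 2.
    """
    if len(buf) < HEADER_SIZE:
        raise ValueError("need at least 24 bytes")
    if buf[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file (bad magic)")
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", buf, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Made-up header bytes standing in for the first 24 bytes of a real file.
sample = struct.pack("<4sIQQ", GGUF_MAGIC, 3, 291, 24)
header = read_gguf_header(sample)
print(header)
```

With a real file, `open(path, "rb").read(24)` would supply the bytes. Since everything before the tensor data sits at the front of the file, only that leading region changes when metadata is edited, which is what makes the patch-style ideas above plausible.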
-
The Huggingface GGUF Editor
🎉 Check out my latest project 🌍✨
A powerful editor designed specifically for editing GGUF metadata and downloading the result directly from any Huggingface repository you have access to (you must sign in for access to gated or private ones).
With its user-friendly design, you can effortlessly edit any GGUF metadata through the GGUF Editor hosted on Huggingface Spaces! 🌍✨🎉
Here are some basic usage examples:
Missing Pre-Tokenizer: If you notice missing or incorrect metadata, like the crucial Pre-Tokenizer name, Fill-in-Middle tokens, or any other metadata-based feature, edit or create the GGUF metadata in a flash! 📝✨🔗
Chat Template Editing: Want to fix an issue with the chat template(s) by updating them directly in the GGUF before downloading it? Go ahead! Edit the existing template(s) or even add new ones for RAG, Tools, etc... 🗣️👥
Downloading Edited Files: Once you've made your edits using the GGUF Editor, download your edited file directly from the Huggingface repository. No need to learn how to install and use complicated scripts! 🚀✨