gguf_dump.py: fix markddown kv array print #8588
Co-authored-by: compilade <git@compilade.net>
@compilade thanks. This is how it will look now.
gguf-py/scripts/gguf_dump.py
Outdated
@@ -249,21 +249,29 @@ def dump_markdown_metadata(reader: GGUFReader, args: argparse.Namespace) -> None
if len(field.types) == 1:
curr_type = field.types[0]
if curr_type == GGUFValueType.STRING:
- value = repr(str(bytes(field.parts[-1]), encoding='utf-8')[:60])
+ value = "\"`{strval}`\"".format(strval=str(bytes(field.parts[-1]), encoding='utf-8')[:60])
The quotes render a bit weird, and what if the string contains `? I suggest removing the quotes or including them inside the inline code blocks, and... Hmm, I'm not sure how to escape ` except by adding more surrounding ` than the longest inner occurrence, and separating the delimiters by spaces if the string happens to start or finish with `.
I don't know if there's a limit, let's see: ```````````````````` (20 inner, 21 outer `) seems to work, so there might be no limit.
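The escaping rule described in this comment can be sketched roughly as follows. This is a minimal illustration, not necessarily the code that was merged; the function name `wrap_inline_code` is hypothetical and chosen only for this example.

```python
import re


def wrap_inline_code(text: str) -> str:
    """Wrap text in a Markdown inline code span (hypothetical helper)."""
    # Find the longest run of consecutive backticks inside the text.
    longest_run = max((len(m.group(0)) for m in re.finditer(r"`+", text)), default=0)
    # The delimiter must be one backtick longer than that run.
    delimiter = "`" * (longest_run + 1)
    # Pad with spaces when the text starts or ends with a backtick,
    # so the inner backticks do not merge with the delimiter.
    if text.startswith("`") or text.endswith("`"):
        text = f" {text} "
    return f"{delimiter}{text}{delimiter}"
```

A string without backticks gets a single-backtick delimiter, while a string containing one backtick gets a double-backtick delimiter.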
Added the inline code blocks because of <unk> rendering weirdly... I'm inclined to just remove "
gguf-py/scripts/gguf_dump.py
Outdated
else:
array_elements.append(value_string)
value_array_inner = ["\"`{strval}`\"".format(strval=strval) for strval in array_elements]
value = f'[ {", ".join(value_array_inner).strip()}{", ..." if total_elements > len(array_elements) else ""} ]'
This is good, but conditionally appending "..." to value_array_inner might be better than inserting the string ", ..." after.
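As a rough sketch of that suggestion, the ellipsis can be appended as one more list element before joining, so the separator logic lives in one place. The function name `format_string_array` and its parameters are assumptions made for this example, not names from the script.

```python
def format_string_array(elements: list[str], total_elements: int) -> str:
    """Render a possibly truncated list of strings (hypothetical helper)."""
    # Wrap each shown element in an inline code span.
    parts = [f"`{item}`" for item in elements]
    # Append the ellipsis as a regular element when the array was truncated,
    # instead of splicing ", ..." into the already joined string.
    if total_elements > len(elements):
        parts.append("...")
    return f"[ {', '.join(parts)} ]"
```

With this shape, the join supplies every separator, including the one before the ellipsis.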
The changes look reasonable, but you might want to fix escaping and/or change the truncation of inner strings in lists of strings.
How about this?
FYI, I'm pretty happy with this now. If you are happy with the adjustments, you can press merge whenever.
>>> escape_markdown_inline_code("hello world")
'`hello world`'
>>> escape_markdown_inline_code("hello ` world")
'``hello ` world``'
On a side note, I added the dump to https://huggingface.co/mofosyne/TinyLLama-v0-5M-F16-llamafile/blob/main/TinyLLama-4.6M-v0.0-F16.dump.md so you can see how it appears on Hugging Face as well.
* gguf_dump.py: fix markddown kv array print
* Update gguf-py/scripts/gguf_dump.py
  Co-authored-by: compilade <git@compilade.net>
* gguf_dump.py: refactor kv array string handling
* gguf_dump.py: escape backticks inside of strings
* gguf_dump.py: inline code markdown escape handler added
  >>> escape_markdown_inline_code("hello world")
  '`hello world`'
  >>> escape_markdown_inline_code("hello ` world")
  '``hello ` world``'
* gguf_dump.py: handle edge case about backticks on start or end of a string
---------
Co-authored-by: compilade <git@compilade.net>
The initial GGUF dump didn't match the output of main.cpp, so I must have read it wrong. I adjusted the Python script until it matched.
', '', '<0x00>', '<0x01>', ... ]
From main.cpp: