gguf_dump.py: fix markdown kv array print #8588

Merged
Changes from 3 commits
18 changes: 13 additions & 5 deletions gguf-py/scripts/gguf_dump.py
@@ -249,21 +249,29 @@ def dump_markdown_metadata(reader: GGUFReader, args: argparse.Namespace) -> None
         if len(field.types) == 1:
             curr_type = field.types[0]
             if curr_type == GGUFValueType.STRING:
-                value = repr(str(bytes(field.parts[-1]), encoding='utf-8')[:60])
+                value = "\"`{strval}`\"".format(strval=str(bytes(field.parts[-1]), encoding='utf-8')[:60])
Collaborator
The quotes render a bit weird, and what if the string contains `? I suggest removing the quotes or including them inside the inline code blocks, and... Hmm, not sure how to escape ` except by using more surrounding ` than the longest inner occurrence, and separating the delimiters with spaces if the string happens to start or finish with `.

I don't know if there's a limit, let's see: ```````````````````` (20 inner, 21 outer `) seems to work, so there might be no limit.
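
The escaping rule described in this comment could be sketched roughly as follows. This is not code from the PR: `md_inline_code` is a hypothetical helper name, and the sketch assumes a CommonMark-style renderer, where a code span delimited by N backticks may contain any backtick run that is not exactly N long, and one space of padding is stripped from each side.

```python
import re


def md_inline_code(text: str) -> str:
    """Wrap text in a Markdown inline code span (hypothetical helper)."""
    # Longest run of consecutive backticks inside the text (0 if there are none).
    longest = max((len(run) for run in re.findall(r"`+", text)), default=0)
    # The delimiter must be longer than any inner run.
    fence = "`" * (longest + 1)
    # Pad both sides with a space when the text starts or ends with a backtick,
    # so the delimiter does not merge with the content. Padding both sides lets
    # the renderer strip one space from each end again.
    pad = " " if text.startswith("`") or text.endswith("`") else ""
    return f"{fence}{pad}{text}{pad}{fence}"
```

Strings that are empty or that themselves start and end with a space are edge cases this sketch does not handle.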

Collaborator Author

@mofosyne mofosyne Jul 20, 2024
Added the inline code blocks because of <unk> rendering weirdly... I'm inclined to just remove the "

             elif curr_type in reader.gguf_scalar_to_np:
                 value = str(field.parts[-1][0])
         else:
             if field.types[0] == GGUFValueType.ARRAY:
                 curr_type = field.types[1]
+                array_elements = []
                 if curr_type == GGUFValueType.STRING:
                     render_element = min(5, total_elements)
                     for element_pos in range(render_element):
-                        value += repr(str(bytes(field.parts[-1 - element_pos]), encoding='utf-8')[:5]) + (", " if total_elements > 1 else "")
+                        truncate_length = 30
+                        value_string = str(bytes(field.parts[-1 - (total_elements - element_pos - 1) * 2]), encoding='utf-8')
+                        if len(value_string) > truncate_length:
+                            array_elements.append(value_string[:truncate_length // 2] + "`...`" + value_string[-truncate_length // 2:])
+                        else:
+                            array_elements.append(value_string)
+                    value_array_inner = ["\"`{strval}`\"".format(strval=strval) for strval in array_elements]
+                    value = f'[ {", ".join(value_array_inner).strip()}{", ..." if total_elements > len(array_elements) else ""} ]'
Collaborator
This is good, but conditionally appending "..." to value_array_inner might be better than inserting the string ", ..." after.
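
One way to read this suggestion, as a minimal sketch rather than the code that was merged: append the `"..."` marker to the list of rendered elements before joining, so the separator logic lives in one place. The function name `render_array_preview` and its parameters are hypothetical and are not part of `gguf_dump.py`.

```python
def render_array_preview(elements: list[str], total_elements: int) -> str:
    """Render a bracketed preview of an array (hypothetical helper)."""
    # Copy so the caller's list is not modified.
    parts = list(elements)
    # Mark truncation as one more list item instead of splicing ", ..." in later.
    if total_elements > len(parts):
        parts.append("...")
    return f'[ {", ".join(parts)} ]'
```

With this shape, `render_array_preview(["1", "2"], 5)` produces `[ 1, 2, ... ]`, and the join call is the only place a separator is written.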

                 elif curr_type in reader.gguf_scalar_to_np:
                     render_element = min(7, total_elements)
                     for element_pos in range(render_element):
-                        value += str(field.parts[-1 - element_pos][0]) + (", " if total_elements > 1 else "")
-                    value = f'[ {value}{" ..." if total_elements > 1 else ""} ]'
+                        array_elements.append(str(field.parts[-1 - (total_elements - element_pos - 1)][0]))
+                    value = f'[ {", ".join(array_elements).strip()}{", ..." if total_elements > len(array_elements) else ""} ]'
         kv_dump_table.append({"n":n, "pretty_type":pretty_type, "total_elements":total_elements, "field_name":field.name, "value":value})

     kv_dump_table_header_map = [
@@ -382,7 +390,7 @@ def dump_markdown_metadata(reader: GGUFReader, args: argparse.Namespace) -> None
             markdown_content += f"- Percentage of total elements: {group_percentage:.2f}%\n"
             markdown_content += "\n\n"

-    print(markdown_content) # noqa: NP100
+    print(markdown_content) # noqa: NP100


 def main() -> None: