-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CLI utf8 issue #1080
Fix CLI utf8 issue #1080
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cursor is not in the correct position after i type some utf-8 characters.
You can try copy this utf8 string to the shell, and the cursor doesn't appear in the correct position
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got seg fault for this utf-8 string:
z473chen@ubuntu:/data0/kuzu/build/release/tools/shell$ ./kuzu_shell -i dsad
Opened the database at path: dsad
Enter ":help" for usage hints.
kuzu> 大苏打Segmentation fault (core dumped)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still a bug when the user types a very long utf-8 string in the shell:
z473chen@ubuntu:/data0/kuzu/build/release/tools/shell$ ./kuzu_shell -i 12451
Opened the database at path: 12451
Enter ":help" for usage hints.
kuzu> 顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶
kuzu> 顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶
kuzu> 顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶
The text appears in multiple lines rather than a single line
This PR fixes issue #1042, and refactors linenoise.cpp for better utf8 support in the future.
In detail:
chars
tracker tolinenoisestate
.len
now keeps track of the number of single-byte characters in the buffer, whilechars
keeps track of the actual number of utf8 characters.getBufCharPos
, which converts from the cursor's position to the data position in the buffer.linenoiseEditInsert
still takes in one byte at a time, but it doesn't write it until a full utf-8 character is typed, to prevent it from typing invalid utf-8 characters.