Good catch. For others who find this thread later: `string.sub` works on byte positions, not characters. To include every byte of the (potentially multi-byte) i-th UTF-8 character, we have to get the byte position where the (i+1)-th character starts and subtract 1.
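The byte arithmetic described above can be sketched as follows. This is a minimal illustration, assuming Lua 5.3 or later (for the `utf8` library) and a source file saved as UTF-8; `utf8_char` is a hypothetical helper name and is not taken from the repository.

```lua
-- Hypothetical helper: return the i-th character of s with all of its bytes.
-- Assumes 1 <= i <= utf8.len(s); utf8.offset returns nil for positions
-- further out of range.
function utf8_char(s, i)
  local first = utf8.offset(s, i)          -- byte where character i starts
  local last  = utf8.offset(s, i + 1) - 1  -- byte just before character i+1
  return s:sub(first, last)
end

local s = "日本語"        -- 3 characters, 3 bytes each in UTF-8
print(utf8_char(s, 2))    --> 本

-- Stopping at utf8.offset(s, 2) itself keeps only the first byte of "本",
-- so the prefix is no longer valid UTF-8 and utf8.len reports a failure.
local truncated = s:sub(1, utf8.offset(s, 2))
print(#truncated, utf8.len(truncated))
```

Note that `utf8.offset(s, n)` also accepts `n` equal to one past the last character, in which case it returns `#s + 1`, so the helper works for the final character as well.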
jxlin123 added a commit to jxlin123/pil-4th that referenced this issue on Dec 9, 2024
As discussed on <oitofelix#2>, the
original solutions for exercises 4.4 and 4.6 may accidentally "cut off"
bytes, given the nature of Unicode codepoints possibly being encoded
using multiple bytes. I've now gone ahead and applied the fix as
described in that link.
pil-4th/ex-4.6.lua
Line 11 in 417e460
The call to `utf8.offset` on this line (and the one in ex 4.4) should have the `-1` outside the call, not inside it. Try these examples and you will see:
-- Japanese example.
jp = '私は日本語が分かります。'
print(jp)
print(insert(jp, 3, "少し"))
-- Chinese example.
zh = "我學了四年了,可是很長的時間沒有練習。"
print(zh)
print(insert(zh, 4, "中文"))
-- Japanese example.
jp = '私は少し日本語が分かります。'
print(jp)
print(remove(jp, 3, 2))
-- Chinese example.
zh = "我學了中文四年了,可是很長的時間沒有練習。"
print(zh)
print(remove(zh, 4, 2))
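For readers who want to run the examples above without the repository at hand, here is a minimal sketch of `insert` and `remove` with the `-1` applied outside `utf8.offset`. It is not the repository's actual code: the signatures `insert(s, i, t)` and `remove(s, i, n)` are inferred from the calls above, and the sketch assumes Lua 5.3 or later, valid UTF-8 input, and in-range positions.

```lua
-- Insert string t into s so that t starts at character position i.
function insert(s, i, t)
  local pos = utf8.offset(s, i)   -- byte where character i starts
  -- The -1 is applied to the byte offset, not to the character index.
  return s:sub(1, pos - 1) .. t .. s:sub(pos)
end

-- Remove n characters from s, starting at character position i.
function remove(s, i, n)
  local from = utf8.offset(s, i)      -- first byte to drop
  local to   = utf8.offset(s, i + n)  -- first byte to keep again
  return s:sub(1, from - 1) .. s:sub(to)
end

print(insert("私は日本語が分かります。", 3, "少し"))
--> 私は少し日本語が分かります。
print(remove("私は少し日本語が分かります。", 3, 2))
--> 私は日本語が分かります。
```

Writing `utf8.offset(s, i - 1)` instead of `utf8.offset(s, i) - 1` would end the prefix on the first byte of character `i - 1`, which cuts that character off whenever it takes more than one byte.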