
🚀 Feature improvement - more accurate tokenizer-based cutoff detection #162

Closed
Leizhenpeng opened this issue Mar 31, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@Leizhenpeng
Member

Feature improvement suggestion 🚀

Feel free to share your suggestions for feature improvements here; we look forward to hearing your ideas.

Currently, string length is used to decide whether the maximum token count has been exceeded.
According to OpenAI's official tokenizer, https://platform.openai.com/tokenizer,
this approach is clearly inaccurate, and the check needs to be improved.

What is your suggestion? 🤔

Use goja to run OpenAI's official tokenizer.
For reference:
https://github.com/pandodao/tokenizer-go

Thank you for your sharing and support! 🙏

@Leizhenpeng Leizhenpeng added the enhancement New feature or request label Mar 31, 2023
@LufeiCheng
Contributor

Is this feature currently in progress? If no committer is working on it yet, I could try introducing the tokenizer-go package to implement it.

@Leizhenpeng
Member Author

Leizhenpeng commented Apr 10, 2023

PRs welcome!!!! I've been too busy lately... sigh...

> Is this feature currently in progress? If no committer is working on it yet, I could try introducing the tokenizer-go package to implement it.

Development

No branches or pull requests

2 participants