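Is the maximum input sequence length supported by the Chinese models here 512 tokens? Are inputs longer than 512 tokens truncated? Can the number of position embeddings be increased during fine-tuning? #130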

Open
chengzi-big opened this issue Jul 21, 2024 · 0 comments

Comments

@chengzi-big

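1. Is the maximum input sequence length supported by the Chinese models here 512 tokens?
2. Is any input longer than 512 tokens truncated?
3. Is it possible to increase the number of position embeddings during fine-tuning?
4. The inputs in my dataset are very long. If everything beyond 512 tokens is truncated, information may be lost. Is there a better way to handle this?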
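
To make the concern in question 4 concrete, here is a minimal sketch of one commonly used workaround: splitting a long token sequence into overlapping windows so that no tokens are dropped. It is not confirmed as the approach this repository supports. The function name `split_into_windows` and the `overlap` parameter are hypothetical, and the default of 512 simply mirrors the limit asked about above. In practice the window length would likely need to leave room for any special tokens the tokenizer adds.

```python
from typing import List


def split_into_windows(
    token_ids: List[int], max_len: int = 512, overlap: int = 64
) -> List[List[int]]:
    """Split token_ids into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens so that context near a
    window boundary appears in both neighbours. Names and defaults are
    illustrative assumptions, not part of any library API.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    if not token_ids:
        return []

    step = max_len - overlap  # how far the window start advances each time
    windows: List[List[int]] = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        # Stop once the current window reaches the end of the sequence.
        if start + max_len >= len(token_ids):
            break
        start += step
    return windows


if __name__ == "__main__":
    ids = list(range(1000))  # stand-in for real token ids
    for w in split_into_windows(ids):
        print(len(w), w[0], w[-1])
```

Each window can then be passed to the model separately and the per-window outputs combined (for example by averaging or voting), which is a design choice that depends on the task.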
