
rwkv-7 code and model are inconsistent #285

Open
qxde01 opened this issue Jan 26, 2025 · 1 comment
Comments


qxde01 commented Jan 26, 2025

rwkv_v7_demo.py sets args.vocab_size = 50304,
but the 0.1B model actually uses 65536:

RuntimeError: Error(s) in loading state_dict for RWKV:
    Missing key(s) in state_dict: "blocks.0.att.v0", "blocks.0.att.v1", "blocks.0.att.v2".
    size mismatch for emb.weight: copying a param with shape torch.Size([65536, 768]) from checkpoint, the shape in current model is torch.Size([50304, 768]).
    size mismatch for head.weight: copying a param with shape torch.Size([65536, 768]) from checkpoint, the shape in current model is torch.Size([50304, 768]).

BlinkDL (Owner) commented Jan 26, 2025

The pile-series models and the world-series models use different tokenizers, so their vocab_size values differ.
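One way to avoid the mismatch is to set args.vocab_size to match the checkpoint before constructing the model, rather than hardcoding 50304. A minimal sketch (`infer_vocab_size` is a hypothetical helper, not part of the repo) reads the vocab size off the shape of the checkpoint's embedding matrix:

```python
def infer_vocab_size(shapes):
    """Return the vocab size implied by a checkpoint.

    `shapes` maps parameter names to tensor shapes, e.g. built with
    {k: tuple(v.shape) for k, v in torch.load(ckpt_path, map_location="cpu").items()}.
    The embedding matrix emb.weight is [vocab_size, n_embd], so its first
    dimension is the vocab size the model was trained with.
    """
    return shapes["emb.weight"][0]

# World-series checkpoints (RWKV World tokenizer):
print(infer_vocab_size({"emb.weight": (65536, 768)}))  # 65536
# Pile-series checkpoints (GPT-NeoX tokenizer):
print(infer_vocab_size({"emb.weight": (50304, 768)}))  # 50304
```

This only fixes the emb.weight / head.weight size mismatch; the missing blocks.0.att.v0/v1/v2 keys indicate the demo's model definition also differs from the checkpoint's architecture revision.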
