Describe the bug

File "/home/xcsong/workspace/wenet/wenet/transformer/embedding.py", line 100, in position_encoding
    # pytorch/pytorch#69434
    if isinstance(offset, int):
        assert offset + size <= self.max_len
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        pos_emb = self.pe[:, offset:offset + size]
    elif isinstance(offset, torch.Tensor) and offset.dim() == 0: # scalar
RuntimeError: AssertionError:

To Reproduce

wenet --device cuda --language chinese ./20minutes.wav

Expected behavior

I expect to get a transcription result.

A 60-second audio file works, but a 20-minute file hits the assertion. Does cli.transcribe have a streaming mode?
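The assertion in the traceback requires `offset + size <= self.max_len`, so it can only pass while the requested positions fit in the precomputed positional-encoding table. Here is a rough sketch of why a 60-second file might pass while a 20-minute one does not. The 10 ms frame shift, 4x subsampling, and `max_len` of 5000 are assumptions for illustration and are not stated anywhere in this issue; check the model's own configuration for the real values.

```python
def encoder_positions(duration_s, frame_shift_ms=10, subsampling=4):
    """Approximate number of encoder positions for an utterance.

    frame_shift_ms and subsampling are assumed values, not taken from the issue.
    """
    duration_ms = round(duration_s * 1000)
    frames = duration_ms // frame_shift_ms
    return frames // subsampling


def fits(duration_s, max_len=5000, offset=0):
    """Mirror the failing check: offset + size <= max_len (max_len is assumed)."""
    return offset + encoder_positions(duration_s) <= max_len


print(encoder_positions(60), fits(60))      # 60-second file
print(encoder_positions(1200), fits(1200))  # 20-minute file
```

Under these assumed values the table covers about 200 seconds of audio in a single pass, so anything much longer would have to be split before decoding.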
Isn't it already streaming recognition? As long as there is enough memory, it should be able to handle audio of any length.
You need the help of a tool such as a VAD (voice activity detector).
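One way to act on the VAD suggestion is to cut the recording into short segments and transcribe each one separately. The sketch below uses fixed-length chunks as a simple stand-in for a real VAD, which would cut at silences instead; fixed cuts can split a word in the middle. `transcribe_fn` is a hypothetical callback supplied by the caller and is not a wenet API, and the 30-second chunk length is an arbitrary assumption.

```python
from typing import Callable, List, Sequence


def split_samples(samples: Sequence, sample_rate: int,
                  max_chunk_s: float = 30.0) -> List[Sequence]:
    """Cut a sample buffer into consecutive chunks of at most max_chunk_s seconds."""
    chunk = int(sample_rate * max_chunk_s)
    if chunk <= 0:
        raise ValueError("sample_rate * max_chunk_s must be at least one sample")
    return [samples[i:i + chunk] for i in range(0, len(samples), chunk)]


def transcribe_long(samples: Sequence, sample_rate: int,
                    transcribe_fn: Callable[[Sequence, int], str],
                    max_chunk_s: float = 30.0, sep: str = "") -> str:
    """Transcribe each chunk with the caller's (hypothetical) transcribe_fn and join the text."""
    texts = [transcribe_fn(chunk, sample_rate)
             for chunk in split_samples(samples, sample_rate, max_chunk_s)]
    return sep.join(t for t in texts if t)
```

With a real VAD, only `split_samples` would change: it would return segments bounded by detected silences, while the per-segment transcription loop stays the same.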