Inference stuck on "processing" indefinitely #398

Open
gaoxuxu110 opened this issue Jul 19, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@gaoxuxu110

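[Screenshot: Snipaste_2024-07-19_12-33-04]
I am generating speech using the reference-audio mode, but the page stays stuck on "processing..".
The console window prints the following messages and then shows nothing further: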
To create a public link, set share=True in launch().
You are using the latest version of funasr-1.1.2
2024-07-19 12:28:59,203 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-19 12:28:59,203 - modelscope - INFO - Use user-specified model revision: master
Also: I am using the latest code (i.e. "fix modelscope version problem (#396)").

@gaoxuxu110 gaoxuxu110 added the bug Something isn't working label Jul 19, 2024
@gaoxuxu110
Author

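It did finish eventually, but it took 6 to 7 minutes... That is clearly unreasonable: the text is only a dozen or so characters, I am using compiled inference, and the GPU is a 3090.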

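@gaoxuxu110 gaoxuxu110 changed the title from "Wireless processing during inference" (a typo in the original Chinese title) to "Inference stuck on "processing" indefinitely" Jul 19, 2024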
@PoTaTo-Mika
Contributor

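This is because the auto_rerank feature needs to download the funasr weights. If it takes a long time, please check whether your network connection is too slow.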
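
One way to act on the "check your network speed" advice is to time a bounded read from the host the weights come from. The following is a minimal sketch using only the Python standard library; `measure_throughput` and `format_rate` are hypothetical helper names introduced here for illustration, not part of fish-speech, funasr, or modelscope, and the in-memory stream stands in for a real download.

```python
import io
import time
from typing import BinaryIO, Tuple


def measure_throughput(
    stream: BinaryIO,
    max_bytes: int = 8 * 1024 * 1024,
    chunk_size: int = 64 * 1024,
) -> Tuple[int, float]:
    """Read up to max_bytes from stream; return (bytes_read, seconds_elapsed)."""
    start = time.perf_counter()
    total = 0
    while total < max_bytes:
        chunk = stream.read(min(chunk_size, max_bytes - total))
        if not chunk:  # end of stream reached before max_bytes
            break
        total += len(chunk)
    return total, time.perf_counter() - start


def format_rate(num_bytes: int, seconds: float) -> str:
    """Format a transfer rate in MiB/s, guarding against a zero-length interval."""
    if seconds <= 0:
        return "n/a"
    return f"{num_bytes / (1024 * 1024) / seconds:.2f} MiB/s"


# Demo with an in-memory stream so the sketch runs offline.
fake_download = io.BytesIO(b"\0" * (2 * 1024 * 1024))
n, secs = measure_throughput(fake_download)
print(f"read {n} bytes; rate: {format_rate(n, secs)}")

# Against a real server, pass an HTTP response instead (URL is a placeholder):
#   import urllib.request
#   with urllib.request.urlopen("https://<model-host>/<large-file>", timeout=30) as resp:
#       n, secs = measure_throughput(resp)
```

A rate far below what the connection normally delivers would be consistent with the download explanation; a normal rate would suggest the time is being spent elsewhere.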

@chaoqunxie

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2024-07-18 15:45:44,275][fish_speech.models.text2semantic.lit_module][INFO] - [rank: 0] Set weight decay: 0 for 459 parameters
[2024-07-18 15:45:44,275][fish_speech.models.text2semantic.lit_module][INFO] - [rank: 0] Set weight decay: 0.0 for 65 parameters

  | Name                  | Type              | Params | Mode
--------------------------------------------------------------
0 | model                 | DualARTransformer | 394 M  | train
1 | model.embeddings      | Embedding         | 2.4 M  | train
2 | model.layers          | ModuleList        | 311 M  | train
3 | model.norm            | RMSNorm           | 1.0 K  | train
4 | model.output          | Linear            | 280 K  | train
5 | model.fast_embeddings | Embedding         | 1.1 M  | train
6 | model.fast_layers     | ModuleList        | 77.9 M | train
7 | model.fast_norm       | RMSNorm           | 1.0 K  | train
8 | model.fast_output     | Linear            | 1.1 M  | train

4.3 M Trainable params
390 M Non-trainable params
394 M Total params
1,577.969 Total estimated model params size (MB)
Sanity Checking: | | 0/? [00:00<?, ?it/s][2024-07-18 15:45:44,446][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:44,446][fish_speech.datasets.text][INFO] - [rank: 0] Reading 2 / 1 files
[2024-07-18 15:45:44,447][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:44,447][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:44,447][fish_speech.datasets.text][INFO] - [rank: 0] Read total 2 groups of data
[2024-07-18 15:45:44,447][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:44,448][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:44,448][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:44,448][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:44,448][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:44,450][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:44,451][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:44,452][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:44,452][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:44,452][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:44,452][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:47,450][fish_speech.datasets.text][INFO] - [rank: 0] Reading 2 / 1 files
[2024-07-18 15:45:47,451][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:47,451][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:47,452][fish_speech.datasets.text][INFO] - [rank: 0] Read total 2 groups of data
[2024-07-18 15:45:47,458][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:47,458][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:47,458][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:47,459][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:47,461][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:47,461][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:47,461][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:47,461][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:47,464][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:47,464][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[2024-07-18 15:45:47,508][fish_speech.datasets.text][INFO] - [rank: 0] Reading 1 / 1 files
[2024-07-18 15:45:47,508][fish_speech.datasets.text][INFO] - [rank: 0] Read total 1 groups of data
[rank0]:[E ProcessGroupNCCL.cpp:1414] [PG 0 Rank 0] Process group watchdog thread terminated with exception: CUDA error: unspecified launch failure
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:43 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f4941b7a897 in /usr/local/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7f4941b2ab25 in /usr/local/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7f4941f06718 in /usr/local/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10d::ProcessGroupNCCL::WorkNCCL::finishedGPUExecutionInternal() const + 0x56 (0x7f48f58598e6 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: c10d::ProcessGroupNCCL::WorkNCCL::isCompleted() + 0x58 (0x7f48f585d9e8 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #5: c10d::ProcessGroupNCCL::watchdogHandler() + 0x77c (0x7f48f586305c in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #6: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x10c (0x7f48f5863dcc in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #7: + 0xd44a3 (0x7f49412ba4a3 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #8: + 0x89134 (0x7f4943751134 in /usr/lib/x86_64-linux-gnu/libc.so.6)
frame #9: __clone + 0x40 (0x7f49437d0a40 in /usr/lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
what(): [PG 0 Rank 0] Process group watchdog thread terminated with exception: CUDA error: unspecified launch failure
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:43 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f4941b7a897 in /usr/local/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7f4941b2ab25 in /usr/local/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7f4941f06718 in /usr/local/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10d::ProcessGroupNCCL::WorkNCCL::finishedGPUExecutionInternal() const + 0x56 (0x7f48f58598e6 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: c10d::ProcessGroupNCCL::WorkNCCL::isCompleted() + 0x58 (0x7f48f585d9e8 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #5: c10d::ProcessGroupNCCL::watchdogHandler() + 0x77c (0x7f48f586305c in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #6: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x10c (0x7f48f5863dcc in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #7: + 0xd44a3 (0x7f49412ba4a3 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #8: + 0x89134 (0x7f4943751134 in /usr/lib/x86_64-linux-gnu/libc.so.6)
frame #9: __clone + 0x40 (0x7f49437d0a40 in /usr/lib/x86_64-linux-gnu/libc.so.6)

Exception raised from ncclCommWatchdog at ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1418 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f4941b7a897 in /usr/local/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: + 0xe32119 (0x7f48f54e7119 in /usr/local/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: + 0xd44a3 (0x7f49412ba4a3 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #3: + 0x89134 (0x7f4943751134 in /usr/lib/x86_64-linux-gnu/libc.so.6)
frame #4: __clone + 0x40 (0x7f49437d0a40 in /usr/lib/x86_64-linux-gnu/libc.so.6)

What is causing this?
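
The log above itself suggests rerunning with `CUDA_LAUNCH_BLOCKING=1`, which makes kernel launches synchronous so that the reported stack trace is more likely to point at the failing call. The variable has to be in the environment before CUDA is initialized, so one simple approach is to launch the job as a child process with it set. This is a minimal sketch; `blocking_cuda_env` and `rerun_with_blocking` are hypothetical helper names, and the training command is a placeholder because the thread does not show how the job was started.

```python
import os
import subprocess
from typing import Dict, List, Optional


def blocking_cuda_env(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 set."""
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_with_blocking(cmd: List[str]) -> int:
    """Run cmd in a child process with synchronous CUDA launches; return its exit code."""
    return subprocess.run(cmd, env=blocking_cuda_env()).returncode


# Placeholder usage: substitute the command that produced the crash.
#   exit_code = rerun_with_blocking(["python", "<your_training_script>.py"])
```

Synchronous launches slow the run down considerably, so this is a debugging setting only. "unspecified launch failure" is a generic error, and the rerun may narrow down where it occurs without explaining why.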

@AnyaCoder
Collaborator

rerank relies on audio transcription, which currently has some issues; a fix is in progress in a PR.
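
To illustrate why a rerank step depends on transcription, here is a minimal sketch of one common way such a step can work: transcribe each generated candidate and keep the one whose transcript is closest to the requested text. This is an assumption about the general technique, not the project's actual implementation; `rerank`, `char_error_rate`, and the `transcribe` callable are hypothetical names, and plain strings stand in for audio in the demo.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the reference length."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)


def rerank(
    target_text: str,
    candidates: List[T],
    transcribe: Callable[[T], str],
) -> Tuple[int, float]:
    """Return (index, error rate) of the candidate whose transcript best matches target_text."""
    scored = [
        (char_error_rate(target_text, transcribe(c)), i)
        for i, c in enumerate(candidates)
    ]
    best_cer, best_idx = min(scored)  # ties resolve to the lowest index
    return best_idx, best_cer


# Demo: strings stand in for audio, and the "ASR" is the identity function.
fake_audio = ["hxllo wurld", "hello world", "help"]
print(rerank("hello world", fake_audio, transcribe=lambda s: s))  # (1, 0.0)
```

In a design like this, every candidate needs a pass through the transcription model, so a slow or missing ASR model delays the whole request even when the text is short.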
