
[BUG] Audio streaming only available if source is 'microphone' on Ubuntu 22.04 #403

Open
j2l opened this issue Jul 19, 2024 · 3 comments
Labels
bug Something isn't working

Comments

j2l commented Jul 19, 2024

Describe the bug
The web UI fails to launch: build_app() raises a ValueError during startup.

To Reproduce
python tools/webui...

Expected behavior
Run

Screenshots / log

python tools/webui.py \
    --llama-checkpoint-path checkpoints/fish-speech-1.2-sft \
    --decoder-checkpoint-path checkpoints/fish-speech-1.2-sft/firefly-gan-vq-fsq-4x1024-42hz-generator.pth

/usr/lib/python3/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.4)
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
2024-07-19 18:59:57.695 | INFO     | __main__:<module>:523 - Loading Llama model...
2024-07-19 19:00:03.030 | INFO     | tools.llama.generate:load_model:347 - Restored model from checkpoint
2024-07-19 19:00:03.030 | INFO     | tools.llama.generate:load_model:351 - Using DualARTransformer
2024-07-19 19:00:03.031 | INFO     | __main__:<module>:530 - Llama model loaded, loading VQ-GAN model...
2024-07-19 19:00:04.087 | INFO     | tools.vqgan.inference:load_model:44 - Loaded model: <All keys matched successfully>
2024-07-19 19:00:04.087 | INFO     | __main__:<module>:538 - Decoder model loaded, warming up...
2024-07-19 19:00:04.088 | INFO     | tools.api:encode_reference:117 - No reference audio provided
2024-07-19 19:00:04.120 | INFO     | tools.llama.generate:generate_long:432 - Encoded text: Hello, world!
2024-07-19 19:00:04.120 | INFO     | tools.llama.generate:generate_long:450 - Generating sentence 1/1 of sample 1/1
  0%|                                                                         | 0/4080 [00:00<?, ?it/s]/home/pm/.local/lib/python3.10/site-packages/torch/backends/cuda/__init__.py:342: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see, torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.
  warnings.warn(
  1%|▌                                                               | 39/4080 [00:01<02:46, 24.21it/s]
2024-07-19 19:00:06.246 | INFO     | tools.llama.generate:generate_long:505 - Generated 41 tokens in 2.13 seconds, 19.28 tokens/sec
2024-07-19 19:00:06.247 | INFO     | tools.llama.generate:generate_long:508 - Bandwidth achieved: 9.45 GB/s
2024-07-19 19:00:06.247 | INFO     | tools.llama.generate:generate_long:513 - GPU Memory used: 1.42 GB
2024-07-19 19:00:06.266 | INFO     | tools.api:decode_vq_tokens:128 - VQ features: torch.Size([4, 40])
/home/pm/.local/lib/python3.10/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
  return F.conv1d(input, weight, bias, self.stride,
2024-07-19 19:00:07.330 | INFO     | __main__:<module>:555 - Warming up done, launching the web UI...
/mnt/phil/fish-speech-main/tools/webui.py:343: UserWarning: You have unused kwarg parameters in Checkbox, please remove them: {'scale': 0, 'min_width': 150}
  if_refine_text = gr.Checkbox(
Traceback (most recent call last):
  File "/mnt/phil/fish-speech-main/tools/webui.py", line 557, in <module>
    app = build_app()
  File "/mnt/phil/fish-speech-main/tools/webui.py", line 439, in build_app
    stream_audio = gr.Audio(
  File "/home/pm/.local/lib/python3.10/site-packages/gradio/components.py", line 2387, in __init__
    raise ValueError(
ValueError: Audio streaming only available if source is 'microphone'.
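For context, this ValueError comes from a constraint in older Gradio 3.x releases: a gr.Audio component with streaming=True is only accepted when its source is the microphone, so using it as a streaming output fails. Newer Gradio releases relax this, which is likely why reinstalling with the project's pinned dependencies resolves it. A hypothetical re-statement of the check (not Gradio's actual code):

```python
def audio_streaming_allowed(source: str, streaming: bool) -> bool:
    """Hypothetical sketch of the Gradio 3.x guard that raises the
    ValueError above: streaming audio is rejected unless the
    component's source is the microphone."""
    return not streaming or source == "microphone"


# webui.py's stream_audio = gr.Audio(streaming=True, ...) is an output
# component, so its source is not "microphone" and the guard trips:
print(audio_streaming_allowed("microphone", True))  # accepted
print(audio_streaming_allowed("upload", True))      # rejected -> ValueError
```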

Additional context
RTX 3060 (12GB)
Driver Version: 550.67 CUDA Version: 12.4

j2l added the bug label Jul 19, 2024
AnyaCoder (Collaborator) commented

Maybe you need to install a brand-new Python 3.10 environment, then run pip install -e .
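A minimal sketch of that suggestion, assuming python3.10 and the venv module are available on PATH (a conda environment works equally well):

```shell
# Create a fresh Python 3.10 virtual environment and activate it
python3.10 -m venv .venv
source .venv/bin/activate

# Reinstall fish-speech from the repo root so pip resolves the
# project's pinned dependency versions, including the Gradio
# release the web UI expects
pip install -e .
```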

Brian-AI-strategist commented

Maybe you need to install a brand new python=3.10 environment, then pip install -e .

j2l (Author) commented Jul 20, 2024

Maybe I need to install a brand new python=3.10 environment, then pip install -e . 😄
