Version 0.8.5 loads the model but does not run #256

Open
11123kbbs opened this issue Dec 19, 2024 · 12 comments

Comments

@11123kbbs

11123kbbs commented Dec 19, 2024

The model loaded, but then it hangs at this point and never starts running. How can I fix this? Older versions run fine.

@11123kbbs
Author

[Screenshot: QQ20241219-082935]

@CheshireCC
Owner

Check the software log. It is probably running out of VRAM.

@11123kbbs
Author

Why would a 4090 run out of VRAM? Version 3.1.1 works fine for me.

@CheshireCC
Owner

> Why would a 4090 run out of VRAM? Version 3.1.1 works fine for me.

Check the log.

@11123kbbs
Author

2024-12-27_01:22:58 - faster_whisper - INFO - Processing audio with duration 30:00.000
2024-12-27_01:23:01 - faster_whisper - INFO - VAD filter removed 00:54.192 of audio
2024-12-27_01:23:01 - faster_whisper - DEBUG - VAD filter kept the following audio segments: [00:00.000 -> 00:01.520], [00:05.616 -> 05:51.120], [05:52.752 -> 06:55.920], [06:58.608 -> 17:49.616], [17:50.960 -> 17:56.560], [17:58.224 -> 20:00.784], [20:05.904 -> 20:09.584], [20:12.720 -> 20:15.952], [20:20.112 -> 20:22.032], [20:24.624 -> 21:45.616], [21:47.056 -> 23:05.712], [23:11.216 -> 23:13.296], [23:16.304 -> 23:18.224], [23:21.008 -> 25:16.560], [25:17.872 -> 27:48.656], [27:50.576 -> 29:22.544], [29:26.736 -> 29:44.336], [29:46.992 -> 29:55.056]
2024-12-27_01:23:01 - faster_whisper - DEBUG - Processing segment at 00:00.000

That is everything the log shows.

@CheshireCC
Owner

I mean the other log.

@11123kbbs
Author

The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
torchvision is not available - cannot save figures
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.

faster_whisper_GUI: 0.8.5
==========2024-12-28_16:46:15==========
==========Start==========

language: zh

==========2024-12-28_16:46:27==========
==========LoadModel==========

-model_size_or_path: F:/whisper-large-v3-float32
-device: cuda
-device_index: 0
-compute_type: float32
-cpu_threads: 4
-num_workers: 1
-download_root: C:/Users/ab695/.cache/huggingface/hub
-local_files_only: False
-use_v3_model: False

Load over
F:/whisper-large-v3-float32
max_length: 448
num_samples_per_token: 320
time_precision: 0.02
tokens_per_second: 50
input_stride: 2

==========2024-12-28_16:46:45==========
==========Process==========

redirect std output
vad_filter : True
-onset : 0.2
-min_speech_duration_ms : 0
-max_speech_duration_s : inf
-min_silence_duration_ms : 2000
-speech_pad_ms : 400
Transcribes options:
-audio : ['C:/Users/ab695/Downloads/新建文件夹/RAW VIDEO/TUTO_001_Car_Preparation.mp4']
-language : en
-task : False
-log_progress : False
-beam_size : 1
-best_of : 5
-patience : 1.0
-length_penalty : 1.0
-temperature : [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
-compression_ratio_threshold : 1.4
-log_prob_threshold : -10.0
-no_speech_threshold : 0.9
-condition_on_previous_text : False
-initial_prompt : None
-prefix : None
-suppress_blank : True
-suppress_tokens : [-1]
-without_timestamps : False
-max_initial_timestamp : 1.0
-word_timestamps : False
-prepend_punctuations : "'“¿([{-
-append_punctuations : "'.。,,!!??::”)]}、
-multilingual : False
-repetition_penalty : 1.0
-no_repeat_ngram_size : 0
-prompt_reset_on_temperature : 0.5
-max_new_tokens : None
-chunk_length : 30
-clip_mode : 0
-clip_timestamps : 0
-hallucination_silence_threshold : 0.5
-hotwords :
-language_detection_threshold : 0.5
-language_detection_segments : 1
create transcribe process with 1 workers
start transcribe process
Traceback (most recent call last):
  File "C:\Users\ab695\Desktop\新建文1\faster_whisper_GUI\transcribe.py", line 371, in run
  File "C:\Users\ab695\Desktop\新建文1\concurrent\futures\_base.py", line 621, in result_iterator
  File "C:\Users\ab695\Desktop\新建文1\concurrent\futures\_base.py", line 319, in _result_or_cancel
  File "C:\Users\ab695\Desktop\新建文1\concurrent\futures\_base.py", line 458, in result
  File "C:\Users\ab695\Desktop\新建文1\concurrent\futures\_base.py", line 403, in __get_result
  File "C:\Users\ab695\Desktop\新建文1\concurrent\futures\thread.py", line 58, in run
  File "C:\Users\ab695\Desktop\新建文1\faster_whisper_GUI\transcribe.py", line 281, in transcribe_file
  File "C:\Users\ab695\Desktop\新建文1\faster_whisper\transcribe.py", line 1800, in restore_speech_timestamps
  File "C:\Users\ab695\Desktop\新建文1\faster_whisper\transcribe.py", line 1138, in generate_segments
  File "C:\Users\ab695\Desktop\新建文1\faster_whisper\transcribe.py", line 1348, in encode
ValueError: Invalid input features shape: expected an input with shape (1, 128, 3000), but got an input with shape (1, 80, 3000) instead

@CheshireCC
Owner

This is a problem with the V3 model option.
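For context, the ValueError in the traceback points to a mismatch in the number of mel-frequency bins: Whisper large-v3 expects 128-bin log-mel input, while earlier Whisper models expect 80. The log shows `-use_v3_model: False` together with a large-v3 model path, which would be consistent with 80-bin features being sent to a 128-bin encoder. A minimal sketch of that check in plain Python follows. The helper names `expected_n_mels` and `check_features_shape` are hypothetical and are not part of faster_whisper or the GUI, and the mapping from the V3 option to 128 bins is an assumption about how the GUI behaves.

```python
def expected_n_mels(use_v3_model: bool) -> int:
    """Mel bins the feature extractor would produce.

    Assumed mapping for illustration: V3 option on -> 128, off -> 80.
    """
    return 128 if use_v3_model else 80


def check_features_shape(features_shape, model_n_mels, n_frames=3000):
    """Raise ValueError if the features do not match what the encoder expects."""
    got = tuple(features_shape)
    expected = (got[0], model_n_mels, n_frames)
    if got != expected:
        raise ValueError(
            "Invalid input features shape: expected an input with shape "
            f"{expected}, but got an input with shape {got} instead"
        )


# Settings from the log: a large-v3 model (128 mel bins) with use_v3_model=False.
features = (1, expected_n_mels(use_v3_model=False), 3000)
try:
    check_features_shape(features, model_n_mels=128)
except ValueError as err:
    print(err)

# With the V3 option enabled the shapes agree, so no error is raised.
check_features_shape((1, expected_n_mels(use_v3_model=True), 3000), model_n_mels=128)
```

If this reading is right, the fix is a settings change rather than a hardware limit, which would also explain why a 4090 is affected.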

@11123kbbs
Author

Hello, how do I change this model option? Could you tell me? Thanks.

@weituo2002

Same question here. I am stuck too and don't know how to change the parameter.

@11123kbbs
Author

Author, how do I change or select this? Thanks.

@CheshireCC
Owner

You can set it on the model loading page.
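One possible way to tell whether a local model needs the V3 option is to look at `preprocessor_config.json` in the model folder, if that file is present. The sketch below assumes the file follows the Hugging Face Whisper convention of a `feature_size` field (128 for large-v3, 80 for earlier models). `read_feature_size` and `needs_v3_option` are hypothetical helpers, not something the GUI or faster_whisper provides.

```python
import json
from pathlib import Path
from typing import Optional


def read_feature_size(model_dir: str) -> Optional[int]:
    """Return feature_size from preprocessor_config.json, or None if unavailable."""
    config_path = Path(model_dir) / "preprocessor_config.json"
    if not config_path.is_file():
        return None
    with config_path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    value = config.get("feature_size")
    return int(value) if value is not None else None


def needs_v3_option(model_dir: str) -> Optional[bool]:
    """True if the model appears to expect 128 mel bins, None if it cannot be determined."""
    size = read_feature_size(model_dir)
    if size is None:
        return None
    return size == 128


if __name__ == "__main__":
    # Path taken from the log in this thread; adjust it to your own model folder.
    print(needs_v3_option("F:/whisper-large-v3-float32"))
```

A result of `None` means the folder has no usable config file, in which case the model name (for example a path containing `large-v3`) is the only hint left.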
