Has anyone managed to train successfully? #19

Open
1 of 2 tasks
LYCnight opened this issue Aug 28, 2024 · 15 comments

@LYCnight

System Info

I tried Transformers 4.43, 4.44, and 4.33, and also replaced modeling_chatglm.py, but running the final .sh script still fails with an error similar to the ones other users have reported.
I suggest the maintainers document the training procedure in more detail.

Who can help?

Information

  • The official example scripts
  • My own modified scripts

Reproduction

Loading extension module cpu_adam...
Time to load cpu_adam op: 2.735379934310913 seconds
Traceback (most recent call last):
File "/root/AI4E/ljc/LongWriter/train/main.py", line 130, in <module>
train()
File "/root/AI4E/ljc/LongWriter/train/main.py", line 126, in train
trainer.train(resume_from_checkpoint=False)
File "/root/anaconda3/envs/glm-4-copy/lib/python3.10/site-packages/transformers/trainer.py", line 1938, in train
return inner_training_loop(
File "/root/anaconda3/envs/glm-4-copy/lib/python3.10/site-packages/transformers/trainer.py", line 2095, in _inner_training_loop
model, self.optimizer = self.accelerator.prepare(self.model, self.optimizer)
File "/root/anaconda3/envs/glm-4-copy/lib/python3.10/site-packages/accelerate/accelerator.py", line 1303, in prepare
result = self._prepare_deepspeed(*args)
File "/root/anaconda3/envs/glm-4-copy/lib/python3.10/site-packages/accelerate/accelerator.py", line 1779, in _prepare_deepspeed
engine, optimizer, _, lr_scheduler = deepspeed.initialize(**kwargs)
File "/root/anaconda3/envs/glm-4-copy/lib/python3.10/site-packages/deepspeed/__init__.py", line 179, in initialize
config_class = DeepSpeedConfig(config, mpu, mesh_device=mesh_device)
File "/root/anaconda3/envs/glm-4-copy/lib/python3.10/site-packages/deepspeed/runtime/config.py", line 797, in __init__
self._initialize_params(copy.copy(self._param_dict))
File "/root/anaconda3/envs/glm-4-copy/lib/python3.10/site-packages/deepspeed/runtime/config.py", line 817, in _initialize_params
self.zero_config = get_zero_config(param_dict)
File "/root/anaconda3/envs/glm-4-copy/lib/python3.10/site-packages/deepspeed/runtime/zero/config.py", line 71, in get_zero_config
return DeepSpeedZeroConfig(**zero_config_dict)
File "/root/anaconda3/envs/glm-4-copy/lib/python3.10/site-packages/deepspeed/runtime/config_utils.py", line 57, in __init__
super().__init__(**data)
File "/root/anaconda3/envs/glm-4-copy/lib/python3.10/site-packages/pydantic/main.py", line 193, in __init__
self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for DeepSpeedZeroConfig
stage3_prefetch_bucket_size
Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=15099494.4, input_type=float]
For further information visit https://errors.pydantic.dev/2.8/v/int_from_float
[2024-08-28 12:38:44,068] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 282936
[2024-08-28 12:38:44,901] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 282937
[2024-08-28 12:38:46,425] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 282938
[2024-08-28 12:38:46,443] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 282939
[2024-08-28 12:38:46,452] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 282940
[2024-08-28 12:38:46,460] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 282941
[2024-08-28 12:38:46,460] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 282942
[2024-08-28 12:38:46,469] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 282943
[2024-08-28 12:38:46,478] [ERROR] [launch.py:325:sigkill_handler] ['/root/anaconda3/envs/glm-4-copy/bin/python', '-u', 'main.py', '--local_rank=7', '--model_name_or_path', '/root/AI4E/share/glm-4-9b', '--train_file', './data/glm4/longwriter', '--output_dir', './output/glm4/longwriter', '--num_train_epochs', '4', '--per_device_train_batch_size', '1', '--per_device_eval_batch_size', '1', '--gradient_accumulation_steps', '1', '--save_strategy', 'steps', '--save_steps', '400', '--save_total_limit', '10', '--preprocessing_num_workers', '64', '--learning_rate', '1e-5', '--weight_decay', '0.1', '--warmup_ratio', '0.03', '--lr_scheduler_type', 'cosine', '--logging_dir', './logs/', '--deepspeed', 'ds_config/stage3.json', '--bf16', '--gradient_checkpointing', '1', '--adam_beta1', '0.9', '--adam_beta2', '0.95', '--report_to', 'wandb', '--run_name', 'glm4_longwriter', '--logging_steps', '1', '--batch_method', 'pack', '--pack_loss'] exits with return code = 1

Expected behavior

@bys0318
Member

bys0318 commented Aug 28, 2024

Could you try setting stage3_prefetch_bucket_size to 15099494 in the DeepSpeed config?
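
One way to apply this suggestion is to patch the JSON file programmatically. The sketch below assumes the setting lives under the `zero_optimization` section of the DeepSpeed config, which is where ZeRO options normally go. The helper names `set_prefetch_bucket_size` and `patch_config_file` are hypothetical; the path `ds_config/stage3.json` comes from the launch command in the log above.

```python
import json
from pathlib import Path


def set_prefetch_bucket_size(config: dict, value: float) -> dict:
    """Return a copy of a DeepSpeed config with an integer prefetch bucket size."""
    patched = json.loads(json.dumps(config))  # cheap deep copy for JSON-like data
    zero = patched.setdefault("zero_optimization", {})
    # The validator in the traceback expects an integer, so drop any fractional part.
    zero["stage3_prefetch_bucket_size"] = int(value)
    return patched


def patch_config_file(path: str, value: float) -> None:
    """Rewrite the config file in place (hypothetical helper)."""
    config_path = Path(path)
    config = json.loads(config_path.read_text())
    patched = set_prefetch_bucket_size(config, value)
    config_path.write_text(json.dumps(patched, indent=2))


if __name__ == "__main__":
    example = {"zero_optimization": {"stage": 3, "stage3_prefetch_bucket_size": "auto"}}
    print(set_prefetch_bucket_size(example, 15099494.4))
```

Calling `patch_config_file("ds_config/stage3.json", 15099494.4)` from the training directory would write the truncated integer back to the file; keep a backup of the original config first.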

@LYCnight
Author

LYCnight commented Aug 29, 2024

Quoting @bys0318: Could you try setting stage3_prefetch_bucket_size to 15099494 in the DeepSpeed config?

That works, but then a new error appears:
RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288

@LYCnight LYCnight reopened this Aug 29, 2024
@LYCnight
Author

LYCnight commented Aug 29, 2024

Could the maintainers please check the tokenizer?

I have tried every approach the maintainers suggested. My current setup is:

  • transformers==4.33.0
  • pytorch==2.2.0
  • /patch/modeling_chatglm.py has replaced /root/AI4E/share/glm-4-9b/modeling_chatglm.py

However, loading the tokenizer raises KeyError: '<|endoftext|>', so I believe the problem is in the tokenizer. Minimal reproduction:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
path = "/root/AI4E/share/glm-4-9b"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)

---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[6], line 2
1 path = "/root/AI4E/share/glm-4-9b"
----> 2 tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)

File ~/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py:723, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
721 if os.path.isdir(pretrained_model_name_or_path):
722 tokenizer_class.register_for_auto_class()
--> 723 return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
724 elif config_tokenizer_class is not None:
725 tokenizer_class = None

File ~/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:1854, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, *init_inputs, **kwargs)
1851 else:
1852 logger.info(f"loading file {file_path} from cache at {resolved_vocab_files[file_id]}")
-> 1854 return cls._from_pretrained(
1855 resolved_vocab_files,
1856 pretrained_model_name_or_path,
1857 init_configuration,
1858 *init_inputs,
1859 token=token,
1860 cache_dir=cache_dir,
1861 local_files_only=local_files_only,
1862 _commit_hash=commit_hash,
1863 _is_local=is_local,
1864 **kwargs,
1865 )

File ~/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:2090, in PreTrainedTokenizerBase._from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, token, cache_dir, local_files_only, _commit_hash, _is_local, *init_inputs, **kwargs)
2087 tokenizer.add_tokens(tokens, special_tokens=is_last_special)
2089 # Check all our special tokens are registered as "no split" token (we don't cut them) and are in the vocab
-> 2090 added_tokens = tokenizer.sanitize_special_tokens()
2091 if added_tokens:
2092 logger.warning_advice(
2093 "Special tokens have been added in the vocabulary, make sure the associated word embeddings are"
2094 " fine-tuned or trained."
2095 )

File ~/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:861, in SpecialTokensMixin.sanitize_special_tokens(self)
851 def sanitize_special_tokens(self) -> int:
852 """
853 Make sure that all the special tokens attributes of the tokenizer (tokenizer.mask_token,
854 tokenizer.cls_token, etc.) are in the vocabulary.
(...)
859 int: The number of tokens added in the vocabulary during the operation.
860 """
--> 861 return self.add_tokens(self.all_special_tokens_extended, special_tokens=True)

File ~/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:1004, in SpecialTokensMixin.add_tokens(self, new_tokens, special_tokens)
1001 if not isinstance(new_tokens, (list, tuple)):
1002 new_tokens = [new_tokens]
-> 1004 return self._add_tokens(new_tokens, special_tokens=special_tokens)

File ~/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils.py:421, in PreTrainedTokenizer._add_tokens(self, new_tokens, special_tokens)
417 if not special_tokens and hasattr(self, "do_lower_case") and self.do_lower_case:
418 token = token.lower()
419 if (
420 token != self.unk_token
--> 421 and self.convert_tokens_to_ids(token) == self.convert_tokens_to_ids(self.unk_token)
422 and token not in tokens_to_add
423 ):
424 tokens_to_add.append(token)
425 if self.verbose:

File ~/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils.py:582, in PreTrainedTokenizer.convert_tokens_to_ids(self, tokens)
579 return None
581 if isinstance(tokens, str):
--> 582 return self._convert_token_to_id_with_added_voc(tokens)
584 ids = []
585 for token in tokens:

File ~/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils.py:595, in PreTrainedTokenizer._convert_token_to_id_with_added_voc(self, token)
593 if token in self.added_tokens_encoder:
594 return self.added_tokens_encoder[token]
--> 595 return self._convert_token_to_id(token)

File ~/.cache/huggingface/modules/transformers_modules/glm-4-9b/tokenization_chatglm.py:96, in ChatGLM4Tokenizer._convert_token_to_id(self, token)
94 def _convert_token_to_id(self, token):
95 """ Converts a token (str) in an id using the vocab. """
---> 96 return self.mergeable_ranks[token]

KeyError: '<|endoftext|>'
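
The last frame shows the lookup failing inside `self.mergeable_ranks[token]`. The toy class below is not the real ChatGLM4Tokenizer. It only illustrates, under the assumption that special tokens are stored separately from the rank table, why a plain dictionary lookup raises KeyError for '<|endoftext|>' and how a lookup that consults a special-token table first would avoid it. All names, tokens, and IDs are hypothetical.

```python
class ToyTokenizer:
    """Toy stand-in for a tokenizer with a rank table and separate special tokens."""

    def __init__(self):
        # Rank table for ordinary tokens only (toy values).
        self.mergeable_ranks = {"hello": 0, "world": 1}
        # Special tokens kept in their own table (toy ID).
        self.special_tokens = {"<|endoftext|>": 2}

    def convert_plain(self, token: str) -> int:
        # Same pattern as the failing frame: a direct lookup in the rank table.
        return self.mergeable_ranks[token]

    def convert_with_specials(self, token: str) -> int:
        # Check the special-token table before falling back to the rank table.
        if token in self.special_tokens:
            return self.special_tokens[token]
        return self.mergeable_ranks[token]


if __name__ == "__main__":
    tok = ToyTokenizer()
    print(tok.convert_with_specials("<|endoftext|>"))
    try:
        tok.convert_plain("<|endoftext|>")
    except KeyError as err:
        print("KeyError:", err)
```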

@LYCnight
Author

Attached is the error output from running './scripts/glm4_longwriter.sh':

KeyError: '<|endoftext|>'
Using unk_token, but it is not set yet.
Traceback (most recent call last):
File "/root/AI4E/ljc/LongWriter/train/main.py", line 139, in <module>
train()
File "/root/AI4E/ljc/LongWriter/train/main.py", line 121, in train
tokenizer = AutoTokenizer.from_pretrained(model_args.model_name_or_path,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 723, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
return cls._from_pretrained(
^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2090, in _from_pretrained
added_tokens = tokenizer.sanitize_special_tokens()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 861, in sanitize_special_tokens
return self.add_tokens(self.all_special_tokens_extended, special_tokens=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1004, in add_tokens
return self._add_tokens(new_tokens, special_tokens=special_tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils.py", line 421, in _add_tokens
and self.convert_tokens_to_ids(token) == self.convert_tokens_to_ids(self.unk_token)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils.py", line 582, in convert_tokens_to_ids
return self._convert_token_to_id_with_added_voc(tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils.py", line 595, in _convert_token_to_id_with_added_voc
return self._convert_token_to_id(token)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.cache/huggingface/modules/transformers_modules/glm-4-9b/tokenization_chatglm.py", line 96, in _convert_token_to_id
return self.mergeable_ranks[token]
~~~~~~~~~~~~~~~~~~~~^^^^^^^
KeyError: '<|endoftext|>'
[2024-08-29 07:53:56,997] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528556
[2024-08-29 07:53:56,997] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528557
[2024-08-29 07:53:57,347] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528558
[2024-08-29 07:53:58,671] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528559
[2024-08-29 07:53:58,689] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528560
[2024-08-29 07:53:58,698] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528561
[2024-08-29 07:53:58,706] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528562
[2024-08-29 07:53:58,720] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528563
[2024-08-29 07:53:58,732] [ERROR] [launch.py:325:sigkill_handler] ['/root/anaconda3/envs/glm-4-copy/bin/python', '-u', 'main.py', '--local_rank=7', '--model_name_or_path', '/root/AI4E/share/glm-4-9b', '--train_file', './data/glm4/longwriter', '--output_dir', './output/glm4/longwriter', '--num_train_epochs', '4', '--per_device_train_batch_size', '1', '--per_device_eval_batch_size', '1', '--gradient_accumulation_steps', '1', '--save_strategy', 'steps', '--save_steps', '400', '--save_total_limit', '10', '--preprocessing_num_workers', '64', '--learning_rate', '1e-5', '--weight_decay', '0.1', '--warmup_ratio', '0.03', '--lr_scheduler_type', 'cosine', '--logging_dir', './logs/', '--deepspeed', 'ds_config/stage3.json', '--bf16', '--gradient_checkpointing', '1', '--adam_beta1', '0.9', '--adam_beta2', '0.95', '--report_to', 'wandb', '--run_name', 'glm4_longwriter', '--logging_steps', '1', '--batch_method', 'pack', '--pack_loss'] exits with return code = 1

@badarrrr
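
Quoting @bys0318: Could you try setting stage3_prefetch_bucket_size to 15099494 in the DeepSpeed config?

Quoting @LYCnight: That works, but then a new error appears: RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288

I ran into exactly the same error: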
Traceback of TorchScript (most recent call last):
File "/root/.cache/huggingface/modules/transformers_modules/glm-4-9b/modeling_chatglm.py", line 145, in apply_rotary_pos_emb
rope_cache = rope_cache[:sq]
xshaped = x.reshape(sq, -1, np, rot_dim // 2, 2)
rope_cache = rope_cache.view(sq, -1, 1, xshaped.size(3), 2)

x_out2 = torch.stack(
[
RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288

@bys0318
Member

bys0318 commented Aug 29, 2024

Quoting @LYCnight: Attached is the error output from running './scripts/glm4_longwriter.sh' (the traceback posted above, ending in KeyError: '<|endoftext|>').

Hello, please use the tokenizer code from LongWriter-glm4-9b. The current training code does not yet support the tokenizer of the latest GLM-4-9b release.
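
A minimal sketch of that suggestion, assuming a local copy of the LongWriter-glm4-9b checkpoint is available and that the tokenizer code lives in `tokenization_chatglm.py` (the file name seen in the traceback). `copy_tokenizer_code` is a hypothetical helper, not part of the LongWriter repository, and the source path in the usage note is a placeholder.

```python
import shutil
from pathlib import Path


def copy_tokenizer_code(src_dir, dst_dir, filenames=("tokenization_chatglm.py",)):
    """Copy tokenizer source files from src_dir into dst_dir, backing up existing ones."""
    copied = []
    for name in filenames:
        src = Path(src_dir) / name
        dst = Path(dst_dir) / name
        if not src.is_file():
            raise FileNotFoundError(f"missing tokenizer file: {src}")
        if dst.exists():
            # Keep the previous file next to the new one as <name>.bak.
            shutil.copy2(dst, dst.with_name(dst.name + ".bak"))
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

For example, `copy_tokenizer_code("/path/to/LongWriter-glm4-9b", "/root/AI4E/share/glm-4-9b")` would overwrite the tokenizer code in the base model directory used in this thread. Whether other tokenizer files also need to be replaced is not stated here.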

@badarrrr

Quoting @LYCnight: Attached is the error output from running './scripts/glm4_longwriter.sh' (the traceback posted above, ending in KeyError: '<|endoftext|>').

Quoting @bys0318: Hello, please use the tokenizer code from LongWriter-glm4-9b. The current training code does not yet support the tokenizer of the latest GLM-4-9b release.

RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288
Is this the same problem as the one you described?

@bys0318 bys0318 self-assigned this Aug 29, 2024
@bys0318 bys0318 added the bug Something isn't working label Aug 29, 2024
@LYCnight
Author

Quoting @bys0318: Could you try setting stage3_prefetch_bucket_size to 15099494 in the DeepSpeed config?

Quoting @LYCnight: That works, but then a new error appears: RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288

Quoting @badarrrr: I ran into exactly the same error (the TorchScript traceback in apply_rotary_pos_emb posted above, ending in RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288).

I am now hitting this error as well.

@LYCnight
Author

I am now seeing two kinds of errors.

  • Environment:
    • python==3.11.9
    • transformers==4.33.0
    • pytorch==2.2.0
    • modeling_chatglm.py and tokenization_chatglm.py under /glm-4-9b have both been replaced

Error 1: RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288

Set "stage3_prefetch_bucket_size": 15099494 in /ds_config/stage3.json.

With this setting the run proceeds until the wandb screen appears, but it fails as soon as training starts:

  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/trainer.py", line 2679, in training_step
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/root/.cache/huggingface/modules/transformers_modules/glm-4-9b/modeling_chatglm.py", line 146, in apply_rotary_pos_emb
    rope_cache = rope_cache[:sq]
    xshaped = x.reshape(sq, -1, np, rot_dim // 2, 2)
    rope_cache = rope_cache.view(sq, -1, 1, xshaped.size(3), 2)
                 ~~~~~~~~~~~~~~~ <--- HERE
    x_out2 = torch.stack(
        [
RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288
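
The numbers in this message can be checked by hand. The sketch below uses only values copied from the error text; `resolve_inferred_dim` is a hypothetical helper that mimics how a `-1` dimension is inferred in a view. The fixed dimensions alone already need 32768 * 1 * 32 * 2 = 2097152 elements, more than the 524288 available, so no integer fits. At 32 * 2 values per position, 524288 elements cover 8192 positions. That would be consistent with a rotary cache built for 8192 positions being applied to a 32768-token sequence, but the thread does not confirm this as the cause.

```python
from math import prod


def resolve_inferred_dim(total_elements, shape):
    """Return the size that -1 would take in a view, or None if no integer fits."""
    known = prod(d for d in shape if d != -1)
    if known == 0 or total_elements % known != 0:
        return None
    return total_elements // known


if __name__ == "__main__":
    # Target shape and element count copied from the error message.
    print(resolve_inferred_dim(524288, (32768, -1, 1, 32, 2)))
    # Positions the same tensor would cover at 32 * 2 values per position.
    print(524288 // (32 * 2))
```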

Error 2: Input should be a valid integer, got a number with a fractional part

Set "stage3_prefetch_bucket_size": "auto" in /ds_config/stage3.json.

With this setting the run fails before the wandb screen appears:

  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/deepspeed/runtime/config.py", line 817, in _initialize_params
    self.zero_config = get_zero_config(param_dict)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/deepspeed/runtime/zero/config.py", line 71, in get_zero_config
    return DeepSpeedZeroConfig(**zero_config_dict)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/deepspeed/runtime/config_utils.py", line 57, in __init__
    super().__init__(**data)
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/pydantic/main.py", line 193, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for DeepSpeedZeroConfig
stage3_prefetch_bucket_size
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=15099494.4, input_type=float]
    For further information visit https://errors.pydantic.dev/2.8/v/int_from_floa
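
The fractional value is consistent with "auto" being resolved as 0.9 * hidden_size * hidden_size and left as a float, which is how the Hugging Face Trainer's DeepSpeed integration appears to have filled this field in some releases. The sketch below reproduces that arithmetic. The formula and hidden_size = 4096 for GLM-4-9b are both assumptions that the thread does not state, and the function name is hypothetical.

```python
def auto_prefetch_bucket_size(hidden_size: int) -> float:
    # Assumed formula for the "auto" value; not confirmed anywhere in this thread.
    return 0.9 * hidden_size * hidden_size


if __name__ == "__main__":
    value = auto_prefetch_bucket_size(4096)
    print(value)       # a float with a fractional part, which an integer field rejects
    print(int(value))  # the truncated integer
```

Under these assumptions the truncated result equals the 15099494 used in Error 1, which would explain why hard-coding that integer gets past config validation.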

@LYCnight
Author

LYCnight commented Sep 2, 2024

Quoting @LYCnight: Attached is the error output from running './scripts/glm4_longwriter.sh' (the traceback posted above, ending in KeyError: '<|endoftext|>').

Quoting @bys0318: Hello, please use the tokenizer code from LongWriter-glm4-9b. The current training code does not yet support the tokenizer of the latest GLM-4-9b release.

Hello, is there a solution yet? I would still like to run the training.

@bys0318
Copy link
Member

bys0318 commented Sep 3, 2024

附上运行 ' ./scripts/glm4_longwriter.sh' 时的报错信息:

KeyError: '<|endoftext|>'
Using unk_token, but it is not set yet.
Traceback (most recent call last):
  File "/root/AI4E/ljc/LongWriter/train/main.py", line 139, in <module>
    train()
  File "/root/AI4E/ljc/LongWriter/train/main.py", line 121, in train
    tokenizer = AutoTokenizer.from_pretrained(model_args.model_name_or_path,
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 723, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2090, in _from_pretrained
    added_tokens = tokenizer.sanitize_special_tokens()
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 861, in sanitize_special_tokens
    return self.add_tokens(self.all_special_tokens_extended, special_tokens=True)
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1004, in add_tokens
    return self._add_tokens(new_tokens, special_tokens=special_tokens)
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils.py", line 421, in _add_tokens
    and self.convert_tokens_to_ids(token) == self.convert_tokens_to_ids(self.unk_token)
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils.py", line 582, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/tokenization_utils.py", line 595, in _convert_token_to_id_with_added_voc
    return self._convert_token_to_id(token)
  File "/root/.cache/huggingface/modules/transformers_modules/glm-4-9b/tokenization_chatglm.py", line 96, in _convert_token_to_id
    return self.mergeable_ranks[token]
KeyError: '<|endoftext|>'
[2024-08-29 07:53:56,997] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528556
[2024-08-29 07:53:56,997] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528557
[2024-08-29 07:53:57,347] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528558
[2024-08-29 07:53:58,671] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528559
[2024-08-29 07:53:58,689] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528560
[2024-08-29 07:53:58,698] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528561
[2024-08-29 07:53:58,706] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528562
[2024-08-29 07:53:58,720] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 528563
[2024-08-29 07:53:58,732] [ERROR] [launch.py:325:sigkill_handler] ['/root/anaconda3/envs/glm-4-copy/bin/python', '-u', 'main.py', '--local_rank=7', '--model_name_or_path', '/root/AI4E/share/glm-4-9b', '--train_file', './data/glm4/longwriter', '--output_dir', './output/glm4/longwriter', '--num_train_epochs', '4', '--per_device_train_batch_size', '1', '--per_device_eval_batch_size', '1', '--gradient_accumulation_steps', '1', '--save_strategy', 'steps', '--save_steps', '400', '--save_total_limit', '10', '--preprocessing_num_workers', '64', '--learning_rate', '1e-5', '--weight_decay', '0.1', '--warmup_ratio', '0.03', '--lr_scheduler_type', 'cosine', '--logging_dir', './logs/', '--deepspeed', 'ds_config/stage3.json', '--bf16', '--gradient_checkpointing', '1', '--adam_beta1', '0.9', '--adam_beta2', '0.95', '--report_to', 'wandb', '--run_name', 'glm4_longwriter', '--logging_steps', '1', '--batch_method', 'pack', '--pack_loss'] exits with return code = 1

Hello, please use the tokenizer code from LongWriter-glm4-9b. The current training code does not yet support the tokenizer of the latest GLM-4-9b release.

Hello, is there a solution yet? I would still like to run the training.

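Hello, judging from the error message, the run is still using glm-4-9b's original tokenization_chatglm.py rather than the tokenization_chatglm.py from LongWriter-glm4-9b. Please confirm that trust_remote_code=True is passed when the model and tokenizer are loaded in main.py.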
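One way to check this before launching a full run is sketched below. It compares the tokenization_chatglm.py in the local model directory against the copy from LongWriter-glm4-9b by hash, and shows a loader that passes trust_remote_code=True. The function names and the directory arguments are hypothetical helpers for illustration, not part of the LongWriter training code.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def tokenizer_code_matches(model_dir, reference_dir, name="tokenization_chatglm.py"):
    """True if `name` is byte-identical in both directories.

    model_dir:     directory passed as --model_name_or_path (assumed layout)
    reference_dir: a local copy of the LongWriter-glm4-9b files (assumed layout)
    """
    return file_digest(Path(model_dir) / name) == file_digest(Path(reference_dir) / name)


def load_tokenizer(model_dir):
    """Load the tokenizer so that the custom code shipped in model_dir is used."""
    # Imported lazily so the hash check above works without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
```

The traceback above shows the tokenizer code being executed from /root/.cache/huggingface/modules/transformers_modules/glm-4-9b/, so the same comparison can be pointed at that cached copy if a stale file is suspected there.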

@bys0318
Member

bys0318 commented Sep 3, 2024

Two kinds of errors now occur.

  • System environment:

    • python==3.11.9
    • transformers==4.33.0
    • pytorch==2.2.0
    • Both modeling_chatglm.py and tokenization_chatglm.py under the /glm-4-9b directory have been replaced

Error 1: RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288

In /ds_config/stage3.json, set "stage3_prefetch_bucket_size": 15099494,

With this setting the run gets as far as the wandb screen, but fails as soon as training starts:

  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/transformers/trainer.py", line 2679, in training_step
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/root/.cache/huggingface/modules/transformers_modules/glm-4-9b/modeling_chatglm.py", line 146, in apply_rotary_pos_emb
    rope_cache = rope_cache[:sq]
    xshaped = x.reshape(sq, -1, np, rot_dim // 2, 2)
    rope_cache = rope_cache.view(sq, -1, 1, xshaped.size(3), 2)
                 ~~~~~~~~~~~~~~~ <--- HERE
    x_out2 = torch.stack(
        [
RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288
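The numbers in this message can be checked by hand. The sketch below tests whether a tensor of 524288 elements can be viewed with the requested shape and estimates how many positions the rope cache actually covers. It is plain arithmetic on the values in the traceback; the helper name is hypothetical, and the position count assumes the cache holds 32 × 2 values per position with a batch dimension of 1.

```python
def view_is_valid(numel, shape):
    """Check whether `numel` elements can be viewed as `shape` (at most one -1)."""
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    # The inferred -1 dimension must be a whole number.
    return numel % known == 0


numel = 524288                      # "input of size 524288" from the error
shape = (32768, -1, 1, 32, 2)       # requested view from the error

# Assumption: each position stores 32 * 2 values and batch size is 1.
positions_in_cache = numel // (32 * 2)

print(view_is_valid(numel, shape))  # the requested view does not fit
print(positions_in_cache)           # positions the cache can cover
```

Under that assumption the cache covers 8192 positions while the packed training sequence has 32768, which would be consistent with a modeling_chatglm.py that builds its rope cache for a shorter sequence length than the training data uses.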

Error 2: Input should be a valid integer, got a number with a fractional part

In /ds_config/stage3.json, set "stage3_prefetch_bucket_size": "auto",

With this setting the run fails before wandb appears:

  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/deepspeed/runtime/config.py", line 817, in _initialize_params
    self.zero_config = get_zero_config(param_dict)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/deepspeed/runtime/zero/config.py", line 71, in get_zero_config
    return DeepSpeedZeroConfig(**zero_config_dict)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/deepspeed/runtime/config_utils.py", line 57, in __init__
    super().__init__(**data)
  File "/root/anaconda3/envs/glm-4-copy/lib/python3.11/site-packages/pydantic/main.py", line 193, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for DeepSpeedZeroConfig
stage3_prefetch_bucket_size
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=15099494.4, input_type=float]
    For further information visit https://errors.pydantic.dev/2.8/v/int_from_floa

For Error 2, please change "stage3_prefetch_bucket_size": "auto" to 15099494.
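The rejected value 15099494.4 equals 0.9 × 4096², which suggests that "auto" is being resolved from a hidden size of 4096 and left as a float. That reading is an inference from the error message, not something stated in this thread. A minimal sketch of computing the integer and writing it into the config follows. The helper names are hypothetical, and the key is assumed to sit under "zero_optimization", as in a standard DeepSpeed ZeRO config.

```python
import json
from pathlib import Path


def prefetch_bucket_size(hidden_size):
    """Integer form of 0.9 * hidden_size**2, truncated toward zero."""
    return int(0.9 * hidden_size * hidden_size)


def pin_prefetch_bucket_size(config_path, hidden_size):
    """Replace "auto" with an explicit integer in a DeepSpeed JSON config."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    # Assumption: the key lives under "zero_optimization".
    zero = config.setdefault("zero_optimization", {})
    zero["stage3_prefetch_bucket_size"] = prefetch_bucket_size(hidden_size)
    path.write_text(json.dumps(config, indent=2))
    return config


print(prefetch_bucket_size(4096))  # 15099494
```

Calling pin_prefetch_bucket_size("ds_config/stage3.json", 4096) would then make the same change as editing the file by hand.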

@bys0318
Member

bys0318 commented Sep 3, 2024

Judging from the issue hiyouga/LLaMA-Factory#5252, the error with "stage3_prefetch_bucket_size": "auto" can be resolved by downgrading deepspeed. Try pip install deepspeed==0.14.4

@bys0318
Member

bys0318 commented Sep 3, 2024

@LYCnight @badarrrr Please check whether the FAQ in our README resolves the problems you ran into. Sorry to have kept you waiting.

@LYCnight
Author

LYCnight commented Sep 4, 2024

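@LYCnight @badarrrr Please check whether the FAQ in our README resolves the problems you ran into. Sorry to have kept you waiting.

Thank you very much! Training now works for me. I have shared some notes from the process here: #25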
