System Info
transformers version: 4.43.3
Platform: Linux-5.15.0-105-generic-x86_64-with-glibc2.17
Python version: 3.8.19
Huggingface_hub version: 0.24.5
Safetensors version: 0.4.4
Accelerate version: not installed
Accelerate config: not found
PyTorch version (GPU?): not installed (NA)
Tensorflow version (GPU?): 2.7.0 (True)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using distributed or parallel set-up in script?: Yes, using TensorFlow MirroredStrategy for distributed training.
Relevant Message
Training gen
Style embeddings shape: (4, 768)
Input embeddings shape: (4, 72, 768)
Extended embeddings shape: (4, 72, 768)
Logits shape: (4, 72, 21128)
Logits dtype: <dtype: 'float16'>
labels shape: (4, 72)
Mask shape: (4, 72)
Mask dtype: <dtype: 'float16'>
Reconstruction loss: Tensor("Cast_3:0", shape=(), dtype=float32)
Input shapes: input_ids: (4, 72) input_ids dtype: <dtype: 'int32'> attention_mask: (4, 72) labels: (4, 72) styles: (4,) max_len_value: tf.Tensor(125, shape=(), dtype=int32)
New shape: Tensor("Shape_1:0", shape=(2,), dtype=int32)
Seq len: 72
Max length: tf.Tensor(125, shape=(), dtype=int32)
Max new tokens: tf.Tensor(43, shape=(), dtype=int32)
Max new tokens: 43
Padding shape: (4, 43)
Extended input_ids shape: (4, 115)
Extended attention_mask shape: (4, 115)
/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py:377: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use and modify the model generation configuration (see https://huggingface.co/docs/transformers/generation_strategies#default-text-generation-configuration )
return py_builtins.overload_of(f)(*args)
Error during generation: max_new_tokens must be greater than 0, but is 43.
input_ids shape: (4, 72)
attention_mask shape: (4, 72)
max_len_value: 125
Traceback (most recent call last):
File "train.py", line 530, in
train_model.train(train_tf_dataset_X, train_tf_dataset_Y, valid_tf_dataset_X, valid_tf_dataset_Y, trainconfig.epochs)
File "train.py", line 306, in train
rec_loss, lm_loss, adv_loss, kl_loss, current_lr, accuracy, total_gen_loss = self.distributed_train_generator_step(
File "train.py", line 138, in distributed_train_generator_step
loss, rec_loss, lm_loss, adv_loss, kl_loss, current_lr, accuracy = self.strategy.run(
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py", line 1316, in run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py", line 2892, in call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/mirrored_strategy.py", line 677, in _call_for_each_replica
return mirrored_run.call_for_each_replica(
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/mirrored_run.py", line 104, in call_for_each_replica
return _call_for_each_replica(strategy, fn, args, kwargs)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/mirrored_run.py", line 246, in _call_for_each_replica
coord.join(threads)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/six.py", line 719, in reraise
raise value
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/training/coordinator.py", line 297, in stop_on_exception
yield
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/mirrored_run.py", line 346, in run
self.main_result = self.main_fn(*self.main_args, **self.main_kwargs)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py", line 601, in wrapper
return func(*args, **kwargs)
File "train.py", line 133, in generator_step
loss, rec_loss, lm_loss, adv_loss, kl_loss, current_lr, accuracy, gradients = self.model.train_generator_step(*args, **kwargs)
File "/root/autodl-tmp/model/model.py", line 302, in train_generator_step
step_total_loss, step_rec_loss, step_lm_loss, step_adv_loss, step_kl_loss, step_gradients, step_accuracy = step_fn(
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 1129, in autograph_handler
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
File "/root/autodl-tmp/model/model.py", line 220, in step_fn *
generated_ids = self.gen.generate(
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/transformers/generation/tf_utils.py", line 738, in generate *
model_kwargs = generation_config.update(**kwargs) # All unused kwargs must be model kwargs
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/transformers/generation/configuration_utils.py", line 1207, in update *
self.validate()
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/transformers/generation/configuration_utils.py", line 544, in validate *
raise ValueError(f"max_new_tokens must be greater than 0, but is {self.max_new_tokens}.")
ValueError: max_new_tokens must be greater than 0, but is 43.
"""
Definetely, l occured this mistake within @tf.function, and there is no logical mistake when l dubug my code under eager-excution model, similarly, when i use max_length and min_length paramters, it would be occured to "ValueError: max_length must be greater than min_length, 1 is larger than 128.", like this. But, when l set the paramter"max_new_tokens" as a constant value like 50, it would be fine, l donno what leads this, and debug this for at least 20 times.
"""
Expected behavior
Of course, the value of my variable is dynamic, but I have already defined it outside the graph and passed it in as a parameter. The expected behavior is generation with max_new_tokens = 43, but it raised an error instead.
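For what it's worth, one plausible mechanism here (an assumption on my part, not confirmed in this thread) is that AutoGraph rewrites a Python `if` over a symbolic tensor into a graph conditional, and a graph conditional must trace both branches to build the graph, so a `raise` in the failing branch fires at trace time even when the runtime condition is false. A minimal stdlib-only sketch of that behavior (`trace_cond` and `validate` are illustrative stand-ins, not the real TensorFlow or transformers code):

```python
# Sketch of why a data-dependent `if` can raise during graph tracing even
# when the condition is False at runtime. This loosely mimics how a traced
# conditional must visit both branches to record their ops.

def trace_cond(pred_is_symbolic, pred_value, true_fn, false_fn):
    """Emulate trace-time conditionals: when the predicate is symbolic,
    both branches are executed once to build the graph."""
    if pred_is_symbolic:
        true_fn()    # traced unconditionally -> any `raise` inside fires here
        false_fn()
        return None  # a real graph would defer branch selection to runtime
    return true_fn() if pred_value else false_fn()

def validate(max_new_tokens, symbolic):
    # Stand-in for the config check `if max_new_tokens <= 0: raise ...`
    def bad():
        raise ValueError(
            f"max_new_tokens must be greater than 0, but is {max_new_tokens}.")
    def ok():
        return max_new_tokens
    return trace_cond(symbolic, max_new_tokens <= 0, bad, ok)

validate(43, symbolic=False)      # eager path: condition is False, no error
try:
    validate(43, symbolic=True)   # "graph" path: raise fires while tracing
except ValueError as e:
    print(e)                      # max_new_tokens must be greater than 0, but is 43.
```

This would explain why a constant like 50 works (the predicate is a plain Python bool, so no branch tracing happens) while a tf.Tensor value of 43 fails with an error message that even prints the valid value.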
Hey! Thanks for posting! Which model are you using? Also one weird thing is: train_generator_step should not use generate, in general using generate is for inference!
Yeah, I know. I use the uer/gpt2-chinese-cluecorpussmall model as the generator, and I need generate to train my GAN: the generated ids feed the training of my model. You can find the complete code in my repository.
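If generate has to stay inside the training step, one possible workaround (a sketch under the assumption that the target length is known in eager context before tracing; `make_generator_step` and its numbers are illustrative, not from the actual repo) is to resolve max_new_tokens to a plain Python int outside the @tf.function and let the traced function close over it as a constant, so the generation-config validation never sees a symbolic tensor:

```python
# Sketch: compute the generation budget eagerly, then bake it into the
# traced step as a Python constant (illustrative helper, not the repo's code).

def make_generator_step(extended_len, seq_len):
    # Plain Python arithmetic in eager context -> a concrete int,
    # not a symbolic tf.Tensor.
    max_new_tokens = int(extended_len) - int(seq_len)
    assert max_new_tokens > 0, "target length must exceed the prompt length"

    # In the real code this inner function would be decorated with
    # @tf.function; max_new_tokens is captured as a trace-time constant.
    def generator_step(input_ids):
        # ... self.gen.generate(input_ids, max_new_tokens=max_new_tokens, ...)
        return max_new_tokens  # placeholder for the generate() call

    return generator_step

step = make_generator_step(115, 72)  # lengths taken from the log above
print(step(None))                    # 43
```

The trade-off is that a change in the eagerly computed length triggers a retrace, which is usually acceptable when sequences are padded to a small set of bucket lengths.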
Who can help?
@ArthurZucker the original issue is #33329, thanks a lot
Reproduction
def train_generator_step(self, input_ids, attention_mask, labels, styles, max_len, step, accumulation_steps=4,
                         lambda_rec=1.0, lambda_lm=1.0, lambda_adv=1.0, lambda_kl=1.0, gamma=1.0):
    ...
"""
Relevant Message
"""
Training gen
Style embeddings shape: (4, 768)
Input embeddings shape: (4, 72, 768)
Extended embeddings shape: (4, 72, 768)
Logits shape: (4, 72, 21128)
Logits dtype: <dtype: 'float16'>
labels shape: (4, 72)
Mask shape: (4, 72)
Mask dtype: <dtype: 'float16'>
Reconstruction loss: Tensor("Cast_3:0", shape=(), dtype=float32)
Input shapes: input_ids: (4, 72) input_ids dtype: <dtype: 'int32'> attention_mask: (4, 72) labels: (4, 72) styles: (4,) max_len_value: tf.Tensor(125, shape=(), dtype=int32)
New shape: Tensor("Shape_1:0", shape=(2,), dtype=int32)
Seq len: 72
Max length: tf.Tensor(125, shape=(), dtype=int32)
Max new tokens: tf.Tensor(43, shape=(), dtype=int32)
Max new tokens: 43
Padding shape: (4, 43)
Extended input_ids shape: (4, 115)
Extended attention_mask shape: (4, 115)
/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py:377: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use and modify the model generation configuration (see https://huggingface.co/docs/transformers/generation_strategies#default-text-generation-configuration )
return py_builtins.overload_of(f)(*args)
Error during generation: max_new_tokens must be greater than 0, but is 43.
input_ids shape: (4, 72)
attention_mask shape: (4, 72)
max_len_value: 125
Traceback (most recent call last):
File "train.py", line 530, in
train_model.train(train_tf_dataset_X, train_tf_dataset_Y, valid_tf_dataset_X, valid_tf_dataset_Y, trainconfig.epochs)
File "train.py", line 306, in train
rec_loss, lm_loss, adv_loss, kl_loss, current_lr, accuracy, total_gen_loss = self.distributed_train_generator_step(
File "train.py", line 138, in distributed_train_generator_step
loss, rec_loss, lm_loss, adv_loss, kl_loss, current_lr, accuracy = self.strategy.run(
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py", line 1316, in run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py", line 2892, in call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/mirrored_strategy.py", line 677, in _call_for_each_replica
return mirrored_run.call_for_each_replica(
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/mirrored_run.py", line 104, in call_for_each_replica
return _call_for_each_replica(strategy, fn, args, kwargs)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/mirrored_run.py", line 246, in _call_for_each_replica
coord.join(threads)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/six.py", line 719, in reraise
raise value
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/training/coordinator.py", line 297, in stop_on_exception
yield
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/distribute/mirrored_run.py", line 346, in run
self.main_result = self.main_fn(*self.main_args, **self.main_kwargs)
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py", line 601, in wrapper
return func(*args, **kwargs)
File "train.py", line 133, in generator_step
loss, rec_loss, lm_loss, adv_loss, kl_loss, current_lr, accuracy, gradients = self.model.train_generator_step(*args, **kwargs)
File "/root/autodl-tmp/model/model.py", line 302, in train_generator_step
step_total_loss, step_rec_loss, step_lm_loss, step_adv_loss, step_kl_loss, step_gradients, step_accuracy = step_fn(
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 1129, in autograph_handler
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
File "/root/autodl-tmp/model/model.py", line 220, in step_fn *
generated_ids = self.gen.generate(
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/transformers/generation/tf_utils.py", line 738, in generate *
model_kwargs = generation_config.update(**kwargs) # All unused kwargs must be model kwargs
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/transformers/generation/configuration_utils.py", line 1207, in update *
self.validate()
File "/root/miniconda3/envs/gpt2-env/lib/python3.8/site-packages/transformers/generation/configuration_utils.py", line 544, in validate *
raise ValueError(f"
max_new_tokens
must be greater than 0, but is {self.max_new_tokens}.")ValueError:
max_new_tokens
must be greater than 0, but is 43."""
Definetely, l occured this mistake within @tf.function, and there is no logical mistake when l dubug my code under eager-excution model, similarly, when i use max_length and min_length paramters, it would be occured to "ValueError: max_length must be greater than min_length, 1 is larger than 128.", like this. But, when l set the paramter"max_new_tokens" as a constant value like 50, it would be fine, l donno what leads this, and debug this for at least 20 times.
"""
Expected behavior
Of course, the value of my variable is dynamic, but I have already defined it outside the graph and used it as a parameter. My expected behavior should be 43 as max_new_token, but it reported an error.
The text was updated successfully, but these errors were encountered: