
OpenVLA does not work on Jetson AGX orin #634

Closed
nirty-fai opened this issue Sep 17, 2024 · 3 comments

@nirty-fai

I am trying to follow the tutorial here - https://www.jetson-ai-lab.com/openvla.html - and I get the error shown below.

I am able to run the nano_llm demo here - https://www.jetson-ai-lab.com/tutorial_nano-llm.html - and I have access to the Llama models (2 and 3) on Hugging Face.

jetson-containers run $(autotag nano_llm)   python3 -m nano_llm.vision.vla --api mlc     --model openvla/openvla-7b     --quantization q4f16_ft     --dataset dusty-nv/bridge_orig_ep100     --dataset-type rlds     --max-episodes 10     --save-stats /data/benchmarks/openvla_bridge_int4.json
Namespace(packages=['nano_llm'], prefer=['local', 'registry', 'build'], disable=[''], user='dustynv', output='/tmp/autotag', quiet=False, verbose=False)
-- L4T_VERSION=36.3.0  JETPACK_VERSION=6.0  CUDA_VERSION=12.2
-- Finding compatible container image for ['nano_llm']
[sudo] password for <user>: 
dustynv/nano_llm:r36.3.0
localuser:root being added to access control list
+ sudo docker run --runtime nvidia -it --rm --network host --shm-size=8g --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/<user>/jetson-containers/data:/data --device /dev/snd --device /dev/bus/usb -e DISPLAY=:0 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --device /dev/i2c-0 --device /dev/i2c-1 --device /dev/i2c-2 --device /dev/i2c-3 --device /dev/i2c-4 --device /dev/i2c-5 --device /dev/i2c-6 --device /dev/i2c-7 --device /dev/i2c-8 --device /dev/i2c-9 dustynv/nano_llm:r36.3.0 python3 -m nano_llm.vision.vla --api mlc --model openvla/openvla-7b --quantization q4f16_ft --dataset dusty-nv/bridge_orig_ep100 --dataset-type rlds --max-episodes 10 --save-stats /data/benchmarks/openvla_bridge_int4.json
/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:127: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
Fetching 6 files:   0%|                                                                                                                                                                                                                                                                                                                                       | 0/6 [00:00<?, ?it/s]/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1150: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Fetching 6 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 6772.29it/s]
23:07:21 | INFO | Load dataset info from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0
23:07:21 | INFO | Creating a tf.data.Dataset reading 1 files located in folders: /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0.
23:07:21 | INFO | Constructing tf.data.Dataset bridge_orig_ep100 for split train[:11], from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0
23:07:21 | SUCCESS | TFDSDataset | loaded bridge_orig_ep100 from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04 (records=11)
2024-09-17 23:07:21.335497: W tensorflow/core/kernels/data/cache_dataset_ops.cc:913] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
2024-09-17 23:07:21.472187: W tensorflow/core/kernels/data/cache_dataset_ops.cc:913] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
23:07:21 | SUCCESS | RLDSDataset | loaded bridge_orig_ep100 - episode format:
{ 'action': [7],
  'cameras': ['image'],
  'image_size': (224, 224, 3),
  'observation': { 'image': ((224, 224, 3), dtype('uint8')),
                   'state': ((7,), dtype('float32'))},
  'step': [ 'action',
            'is_first',
            'is_last',
            'language_instruction',
            'observation']}
Fetching 15 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 8298.98it/s]
Fetching 18 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 20333.28it/s]
23:07:21 | INFO | loading /data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0 with MLC
Traceback (most recent call last):
  File "/opt/NanoLLM/nano_llm/nano_llm.py", line 337, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(self.model_path, use_fast=True, trust_remote_code=True)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 854, in from_pretrained
    config = AutoConfig.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 976, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 632, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
    resolved_config_file = cached_file(
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 373, in cached_file
    raise EnvironmentError(
OSError: /data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0/llm does not appear to have a file named config.json. Checkout 'https://huggingface.co//data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0/llm/tree/None' for available files.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/vision/vla.py", line 446, in <module>
    vla_process_dataset(**{**vars(args), 'dataset': dataset})
  File "/opt/NanoLLM/nano_llm/vision/vla.py", line 296, in vla_process_dataset
    model = NanoLLM.from_pretrained(model, **kwargs)
  File "/opt/NanoLLM/nano_llm/nano_llm.py", line 91, in from_pretrained
    model = MLCModel(model_path, **kwargs)
  File "/opt/NanoLLM/nano_llm/models/mlc.py", line 50, in __init__
    super(MLCModel, self).__init__(model_path, **kwargs)
  File "/opt/NanoLLM/nano_llm/nano_llm.py", line 339, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(self.model_path, use_fast=False, trust_remote_code=True)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 854, in from_pretrained
    config = AutoConfig.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 976, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 632, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
    resolved_config_file = cached_file(
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 373, in cached_file
    raise EnvironmentError(
OSError: /data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0/llm does not appear to have a file named config.json. Checkout 'https://huggingface.co//data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0/llm/tree/None' for available files.




@dusty-nv
Owner

Hi @nirty-fai ! Can you try deleting your /data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0/llm folder? And can you check that the model was fully downloaded? You can try running it first with --api=hf (unquantized).
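A minimal sketch of that cleanup step. The commands below simulate the cache layout in a temp dir so they are safe to dry-run; on the device you would target the real snapshot path from the log instead.

```shell
# Simulate the HF cache layout from the log in a temp dir (safe dry-run).
# On the Jetson, the real target is:
#   /data/models/huggingface/models--openvla--openvla-7b/snapshots/<hash>/llm
ROOT=$(mktemp -d)
SNAP="$ROOT/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0"
mkdir -p "$SNAP/llm"
touch "$SNAP/llm/stale.bin"   # stands in for a partial/stale extraction

rm -rf "$SNAP/llm"            # the actual fix: delete the llm folder
[ ! -d "$SNAP/llm" ] && echo "llm folder removed; re-run nano_llm to regenerate it"
```

On the next run, NanoLLM should re-create the llm/ subfolder from the downloaded snapshot instead of tripping over the stale one.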

@Gandinator123

> Hi @nirty-fai ! Can you try deleting your /data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0/llm folder? And can you check that the model got downloaded. You can try running it first with --api=hf (unquantized)

Deleting the folder worked for me! Thanks both for pointing this out.

@BigJohnn

BigJohnn commented Nov 27, 2024

Jetson Orin NX 16GB here - a similar problem, but with some differences ...

python3 -m nano_llm.vision.vla --api mlc     --model openvla/openvla-7b     --quantization q8f16_ft     --dataset dusty-nv/bridge_orig_ep100     --dataset-type rlds     --max-episodes 10     --save-stats /data/benchmarks/openvla_bridge_fp8.json
/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:127: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
19:32:15 | INFO | Load dataset info from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0
19:32:15 | INFO | Creating a tf.data.Dataset reading 1 files located in folders: /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0.
19:32:15 | INFO | Constructing tf.data.Dataset bridge_orig_ep100 for split train[:11], from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0
19:32:15 | SUCCESS | TFDSDataset | loaded bridge_orig_ep100 from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04 (records=11)
2024-11-27 19:32:15.624473: W tensorflow/core/kernels/data/cache_dataset_ops.cc:913] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
2024-11-27 19:32:15.780591: W tensorflow/core/kernels/data/cache_dataset_ops.cc:913] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
19:32:15 | SUCCESS | RLDSDataset | loaded bridge_orig_ep100 - episode format:
{ 'action': [7],
  'cameras': ['image'],
  'image_size': (224, 224, 3),
  'observation': { 'image': ((224, 224, 3), dtype('uint8')),
                   'state': ((7,), dtype('float32'))},
  'step': [ 'action',
            'is_first',
            'is_last',
            'language_instruction',
            'observation']}


19:36:37 | INFO | loading /data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0 with MLC
['/data/models/mlc/dist/openvla-7b/ctx4096/openvla-7b-q8f16_ft/mlc-chat-config.json', '/data/models/mlc/dist/openvla-7b/ctx4096/openvla-7b-q8f16_ft/params/mlc-chat-config.json']
19:36:38 | INFO | running MLC quantization:

python3 -m mlc_llm.build --model /data/models/mlc/dist/models/openvla-7b --quantization q8f16_ft --target cuda --use-cuda-graph --use-flash-attn-mqa --sep-embed --max-seq-len 4096 --artifact-path /data/models/mlc/dist/openvla-7b/ctx4096 --use-safetensors 


Using path "/data/models/mlc/dist/models/openvla-7b" for model "openvla-7b"
Target configured: cuda -keys=cuda,gpu -arch=sm_87 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
bin_idx_path==/data/models/mlc/dist/models/openvla-7b/model.safetensors.index.json
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/build.py", line 47, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/build.py", line 43, in main
    core.build_model_from_args(parsed_args)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/core.py", line 876, in build_model_from_args
    param_manager.init_torch_pname_to_bin_name(args.use_safetensors)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/relax_model/param_manager.py", line 292, in init_torch_pname_to_bin_name
    mapping = load_torch_pname2binname_map(
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/relax_model/param_manager.py", line 984, in load_torch_pname2binname_map
    raise ValueError("Multiple weight shard files without json map is not supported")
ValueError: Multiple weight shard files without json map is not supported
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/vision/vla.py", line 446, in <module>
    vla_process_dataset(**{**vars(args), 'dataset': dataset})
  File "/opt/NanoLLM/nano_llm/vision/vla.py", line 296, in vla_process_dataset
    model = NanoLLM.from_pretrained(model, **kwargs)
  File "/opt/NanoLLM/nano_llm/nano_llm.py", line 91, in from_pretrained
    model = MLCModel(model_path, **kwargs)
  File "/opt/NanoLLM/nano_llm/models/mlc.py", line 60, in __init__
    quant = MLCModel.quantize(self.model_path, self.config, method=quantization, max_context_len=max_context_len, **kwargs)
  File "/opt/NanoLLM/nano_llm/models/mlc.py", line 276, in quantize
    subprocess.run(cmd, executable='/bin/bash', shell=True, check=True)  
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python3 -m mlc_llm.build --model /data/models/mlc/dist/models/openvla-7b --quantization q8f16_ft --target cuda --use-cuda-graph --use-flash-attn-mqa --sep-embed --max-seq-len 4096 --artifact-path /data/models/mlc/dist/openvla-7b/ctx4096 --use-safetensors ' returned non-zero exit status 1

As the error says, there is no matching .json file in /data/models/mlc/dist/models/openvla-7b, so I copied some over from the Hugging Face snapshot, like this:

cp -L /data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0/*.json /data/models/mlc/dist/models/openvla-7b

But then a new error occurred:

python3 -m nano_llm.vision.vla --api mlc     --model openvla/openvla-7b     --quantization q8f16_ft     --dataset dusty-nv/bridge_orig_ep100     --dataset-type rlds     --max-episodes 10     --save-stats /data/benchmarks/openvla_bridge_fp8.json
/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:127: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
20:04:49 | INFO | Load dataset info from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0
20:04:49 | INFO | Creating a tf.data.Dataset reading 1 files located in folders: /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0.
20:04:49 | INFO | Constructing tf.data.Dataset bridge_orig_ep100 for split train[:11], from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0
20:04:49 | SUCCESS | TFDSDataset | loaded bridge_orig_ep100 from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04 (records=11)
2024-11-27 20:04:49.533554: W tensorflow/core/kernels/data/cache_dataset_ops.cc:913] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
2024-11-27 20:04:49.737170: W tensorflow/core/kernels/data/cache_dataset_ops.cc:913] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
20:04:49 | SUCCESS | RLDSDataset | loaded bridge_orig_ep100 - episode format:
{ 'action': [7],
  'cameras': ['image'],
  'image_size': (224, 224, 3),
  'observation': { 'image': ((224, 224, 3), dtype('uint8')),
                   'state': ((7,), dtype('float32'))},
  'step': [ 'action',
            'is_first',
            'is_last',
            'language_instruction',
            'observation']}

20:09:10 | INFO | loading /data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0 with MLC
['/data/models/mlc/dist/openvla-7b/ctx4096/openvla-7b-q8f16_ft/mlc-chat-config.json', '/data/models/mlc/dist/openvla-7b/ctx4096/openvla-7b-q8f16_ft/params/mlc-chat-config.json']
20:09:12 | INFO | running MLC quantization:

python3 -m mlc_llm.build --model /data/models/mlc/dist/models/openvla-7b --quantization q8f16_ft --target cuda --use-cuda-graph --use-flash-attn-mqa --sep-embed --max-seq-len 4096 --artifact-path /data/models/mlc/dist/openvla-7b/ctx4096 --use-safetensors 


Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/build.py", line 47, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/build.py", line 41, in main
    parsed_args = core._parse_args(parsed_args)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/core.py", line 444, in _parse_args
    parsed = _setup_model_path(parsed)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/core.py", line 494, in _setup_model_path
    validate_config(args.model_path)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/core.py", line 538, in validate_config
    config["model_type"] in utils.supported_model_types
AssertionError: Model type openvla not supported.
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/vision/vla.py", line 446, in <module>
    vla_process_dataset(**{**vars(args), 'dataset': dataset})
  File "/opt/NanoLLM/nano_llm/vision/vla.py", line 296, in vla_process_dataset
    model = NanoLLM.from_pretrained(model, **kwargs)
  File "/opt/NanoLLM/nano_llm/nano_llm.py", line 91, in from_pretrained
    model = MLCModel(model_path, **kwargs)
  File "/opt/NanoLLM/nano_llm/models/mlc.py", line 60, in __init__
    quant = MLCModel.quantize(self.model_path, self.config, method=quantization, max_context_len=max_context_len, **kwargs)
  File "/opt/NanoLLM/nano_llm/models/mlc.py", line 276, in quantize
    subprocess.run(cmd, executable='/bin/bash', shell=True, check=True)  
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python3 -m mlc_llm.build --model /data/models/mlc/dist/models/openvla-7b --quantization q8f16_ft --target cuda --use-cuda-graph --use-flash-attn-mqa --sep-embed --max-seq-len 4096 --artifact-path /data/models/mlc/dist/openvla-7b/ctx4096 --use-safetensors ' returned non-zero exit status 1.

Maybe I shouldn't have copied those files from /data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0?
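The AssertionError above is mechanical: the config.json copied from the HF snapshot declares `"model_type": "openvla"`, and mlc_llm's builder only accepts model types on its supported list, so copying the top-level config can never pass validation. A hypothetical pre-check sketch (the `supported` tuple below is an illustrative subset, not mlc_llm's real list):

```python
import json
import os
import tempfile

def mlc_would_accept(config_path, supported=("llama", "mistral", "gpt_neox")):
    """Return True if config.json names a model_type the builder knows.

    'supported' is an illustrative subset for this sketch, not the
    actual list inside mlc_llm.
    """
    with open(config_path) as f:
        return json.load(f).get("model_type") in supported

# Demo: a config.json copied from the OpenVLA snapshot fails this check.
tmp = tempfile.mkdtemp()
path = os.path.join(tmp, "config.json")
with open(path, "w") as f:
    json.dump({"model_type": "openvla"}, f)
print(mlc_would_accept(path))  # → False
```

Running such a check before invoking `mlc_llm.build` would surface the incompatibility immediately instead of after the (slow) dataset-loading phase.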


When I tried running without MLC (--api hf instead), another error occurred:

python3 -m nano_llm.vision.vla --api hf \
    --model openvla/openvla-7b \
    --dataset dusty-nv/bridge_orig_ep100 \
    --dataset-type rlds \
    --max-episodes 10 \
    --save-stats /data/benchmarks/openvla_bridge_fp16.json
/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:127: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
21:06:19 | INFO | Load dataset info from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0
21:06:19 | INFO | Creating a tf.data.Dataset reading 1 files located in folders: /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0.
21:06:19 | INFO | Constructing tf.data.Dataset bridge_orig_ep100 for split train[:11], from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04/1.0.0
21:06:19 | SUCCESS | TFDSDataset | loaded bridge_orig_ep100 from /data/datasets/huggingface/datasets--dusty-nv--bridge_orig_ep100/snapshots/f2b661dd5d67de43b7368a4018e549a7d8893d04 (records=11)
2024-11-27 21:06:19.986484: W tensorflow/core/kernels/data/cache_dataset_ops.cc:913] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
2024-11-27 21:06:20.236710: W tensorflow/core/kernels/data/cache_dataset_ops.cc:913] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
21:06:20 | SUCCESS | RLDSDataset | loaded bridge_orig_ep100 - episode format:
{ 'action': [7],
  'cameras': ['image'],
  'image_size': (224, 224, 3),
  'observation': { 'image': ((224, 224, 3), dtype('uint8')),
                   'state': ((7,), dtype('float32'))},
  'step': [ 'action',
            'is_first',
            'is_last',
            'language_instruction',
            'observation']}
21:06:20 | INFO | loading /data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0 with HF
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/vision/vla.py", line 446, in <module>
    vla_process_dataset(**{**vars(args), 'dataset': dataset})
  File "/opt/NanoLLM/nano_llm/vision/vla.py", line 296, in vla_process_dataset
    model = NanoLLM.from_pretrained(model, **kwargs)
  File "/opt/NanoLLM/nano_llm/nano_llm.py", line 94, in from_pretrained
    model = HFModel(model_path, **kwargs)
  File "/opt/NanoLLM/nano_llm/models/hf.py", line 47, in __init__
    self.model = AutoModelForCausalLM.from_pretrained(self.model_path,
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 567, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.llm.configuration_prismatic.OpenVLAConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, ElectraConfig, ErnieConfig, FalconConfig, FuyuConfig, GemmaConfig, Gemma2Config, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, JambaConfig, JetMoeConfig, LlamaConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig
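This last failure is that AutoModelForCausalLM has no mapping for OpenVLAConfig. OpenVLA's model card loads the checkpoint through AutoModelForVision2Seq with trust_remote_code=True instead; a hedged sketch of that direct Transformers route, outside NanoLLM (the helper function is hypothetical, and its imports are deferred so defining it does not itself require transformers):

```python
def load_openvla(model_id="openvla/openvla-7b"):
    """Sketch: load OpenVLA the way its model card suggests.

    AutoModelForVision2Seq (not AutoModelForCausalLM) dispatches on
    OpenVLAConfig when trust_remote_code=True. bfloat16 and
    low_cpu_mem_usage are the usual memory-saving knobs on Jetson-class
    hardware. Requires network access to huggingface.co on first run.
    """
    import torch
    from transformers import AutoModelForVision2Seq, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForVision2Seq.from_pretrained(
        model_id,
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
    )
    return processor, model
```

Whether NanoLLM's `--api hf` path can be pointed at this class is a separate question, but the error above is specifically about the AutoModel class chosen, not about the download.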
