fix env var extraction #2043
@@ -2,7 +2,7 @@

 # Export specific ENV variables to /etc/rp_environment
 echo "Exporting environment variables..."
-printenv | grep -E '^RUNPOD_|^PATH=|^_=' | sed 's/^\(.*\)=\(.*\)$/export \1="\2"/' >> /etc/rp_environment
+printenv | grep -E '^HF_|^BNB_|^CUDA_|^NCCL_|^NV|^RUNPOD_|^PATH=|^_=' | sed 's/^\([^=]*\)=\(.*\)$/export \1="\2"/' | grep -v 'printenv' >> /etc/rp_environment
What is NV trying to catch here? Could it be a bit more specific (longer)? I'm concerned it may catch something we don't want.
NV_LIBCUBLAS_VERSION=12.4.5.8-1
NVIDIA_VISIBLE_DEVICES=all
NV_NVML_DEV_VERSION=12.4.127-1
NV_CUDNN_PACKAGE_NAME=libcudnn9-cuda-12
NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.21.5-1+cuda12.4
NV_LIBNCCL_DEV_PACKAGE_VERSION=2.21.5-1
NVIDIA_REQUIRE_CUDA=cuda>=12.4 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 brand=titanrtx,driver>=470,driver<471 brand=tesla,driver>=525,driver<526 brand=unknown,driver>=525,driver<526 brand=nvidia,driver>=525,driver<526 brand=nvidiartx,driver>=525,driver<526 brand=geforce,driver>=525,driver<526 brand=geforcertx,driver>=525,driver<526 brand=quadro,driver>=525,driver<526 brand=quadrortx,driver>=525,driver<526 brand=titan,driver>=525,driver<526 brand=titanrtx,driver>=525,driver<526 brand=tesla,driver>=535,driver<536 brand=unknown,driver>=535,driver<536 brand=nvidia,driver>=535,driver<536 brand=nvidiartx,driver>=535,driver<536 brand=geforce,driver>=535,driver<536 brand=geforcertx,driver>=535,driver<536 brand=quadro,driver>=535,driver<536 brand=quadrortx,driver>=535,driver<536 brand=titan,driver>=535,driver<536 brand=titanrtx,driver>=535,driver<536
NV_LIBCUBLAS_DEV_PACKAGE=libcublas-dev-12-4=12.4.5.8-1
NV_NVTX_VERSION=12.4.127-1
NV_CUDA_CUDART_DEV_VERSION=12.4.127-1
NV_LIBCUSPARSE_VERSION=12.3.1.170-1
NV_LIBNPP_VERSION=12.2.5.30-1
NV_CUDNN_PACKAGE=libcudnn9-cuda-12=9.1.0.70-1
NVIDIA_DRIVER_CAPABILITIES=compute,utility
NV_NVPROF_DEV_PACKAGE=cuda-nvprof-12-4=12.4.127-1
NV_LIBNPP_PACKAGE=libnpp-12-4=12.2.5.30-1
NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev
NV_LIBCUBLAS_DEV_VERSION=12.4.5.8-1
NVIDIA_PRODUCT_NAME=CUDA
NV_LIBCUBLAS_DEV_PACKAGE_NAME=libcublas-dev-12-4
NV_CUDA_CUDART_VERSION=12.4.127-1
NV_LIBCUBLAS_PACKAGE=libcublas-12-4=12.4.5.8-1
NV_CUDA_NSIGHT_COMPUTE_DEV_PACKAGE=cuda-nsight-compute-12-4=12.4.1-1
NV_LIBNPP_DEV_PACKAGE=libnpp-dev-12-4=12.2.5.30-1
NV_LIBCUBLAS_PACKAGE_NAME=libcublas-12-4
NV_LIBNPP_DEV_VERSION=12.2.5.30-1
NV_LIBCUSPARSE_DEV_VERSION=12.3.1.170-1
NV_CUDNN_VERSION=9.1.0.70-1
NV_CUDA_LIB_VERSION=12.4.1-1
NVARCH=x86_64
NV_CUDNN_PACKAGE_DEV=libcudnn9-dev-cuda-12=9.1.0.70-1
NV_CUDA_COMPAT_PACKAGE=cuda-compat-12-4
NV_LIBNCCL_PACKAGE=libnccl2=2.21.5-1+cuda12.4
NV_CUDA_NSIGHT_COMPUTE_VERSION=12.4.1-1
NV_NVPROF_VERSION=12.4.127-1
NV_LIBNCCL_PACKAGE_NAME=libnccl2
NV_LIBNCCL_PACKAGE_VERSION=2.21.5-1
We could alternatively be more explicit with ^NV_|^NVIDIA_
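To make the trade-off in the suggestion above concrete, here is a small sketch comparing the broad ^NV prefix with the narrower alternation. The first three sample lines come from the listing above; NVM_DIR is a hypothetical unrelated variable added only for illustration. The pattern is written without spaces around the |, since grep -E would otherwise match those spaces literally.

```shell
# Sample environment lines. NVM_DIR is hypothetical, not taken from the PR.
sample='NV_CUDNN_VERSION=9.1.0.70-1
NVIDIA_VISIBLE_DEVICES=all
NVARCH=x86_64
NVM_DIR=/root/.nvm'

# Broad prefix used in the PR: matches anything starting with NV.
broad=$(printf '%s\n' "$sample" | grep -E '^NV')

# Narrower alternative from the review: only NV_* and NVIDIA_*.
narrow=$(printf '%s\n' "$sample" | grep -E '^NV_|^NVIDIA_')

printf 'broad:\n%s\n\nnarrow:\n%s\n' "$broad" "$narrow"
```

With this input the narrow pattern drops the hypothetical NVM_DIR but also drops NVARCH, so if NVARCH is needed it would have to be listed separately (for example as ^NVARCH=).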
Do we set any of these? I don't recall seeing them.
They don't seem to be set when logging in via direct SSH, but they are needed.
Description
In RunPod, using the direct SSH connection often leaves the environment in an incorrect state. This fixes which env vars we grab and correctly sets them in the environment on login.
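As a rough illustration of why the sed pattern changed, the sketch below runs both substitutions on a single line. The sample is a shortened, illustrative form of the NVIDIA_REQUIRE_CUDA value quoted in the review thread, not the full value, and the variable names old and new are only for this demo.

```shell
# Shortened sample modeled on NVIDIA_REQUIRE_CUDA; the value contains '=' and a space.
line='NVIDIA_REQUIRE_CUDA=cuda>=12.4 brand=tesla,driver>=470,driver<471'

# Old pattern: greedy \(.*\) runs to the LAST '=', so most of the value
# ends up outside the quotes.
old=$(printf '%s\n' "$line" | sed 's/^\(.*\)=\(.*\)$/export \1="\2"/')

# New pattern: \([^=]*\) stops at the FIRST '=', so the whole value is quoted.
new=$(printf '%s\n' "$line" | sed 's/^\([^=]*\)=\(.*\)$/export \1="\2"/')

printf 'old: %s\nnew: %s\n' "$old" "$new"
```

The old form leaves the space and the > and < characters unquoted, so the shell would treat them as word breaks and redirections if the line were sourced from /etc/rp_environment. The new form keeps the entire value inside the double quotes.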