"downgrade nccl to 1.3" #8244

Closed
wants to merge 2 commits into from
10 changes: 9 additions & 1 deletion Dockerfile
@@ -22,7 +22,7 @@ COPY ./paddle/scripts/docker/root/ /root/

RUN apt-get update && \
apt-get install -y \
- git python-pip python-dev openssh-server bison libnccl-dev \
+ git python-pip python-dev openssh-server bison \
wget unzip unrar tar xz-utils bzip2 gzip coreutils ntp \
curl sed grep graphviz libjpeg-dev zlib1g-dev \
python-matplotlib gcc-4.8 g++-4.8 \
@@ -77,6 +77,14 @@ RUN git clone https://github.com/woboq/woboq_codebrowser /woboq && \
-DCMAKE_BUILD_TYPE=Release . \
make)

+ # https://github.com/PaddlePaddle/Paddle/issues/8195
+ # NCCL 2.1.4 seems to work well with CUDA 9 but is not compatible with CUDA 8.
+ # TODO(dzhwinter): this temporarily disables the NCCL DSO and should be removed.
+ # Download NCCL 1.3 and build it locally.
+ RUN git clone https://github.com/NVIDIA/nccl /nccl && \
Contributor

Just to make sure: https://github.com/PaddlePaddle/Paddle/blob/develop/tools/manylinux1/Dockerfile.x64#L53 has the same result. The Dockerfile under the root folder is used for CI and development; the manylinux1 Dockerfile is used for releases.

Contributor Author

This PR is only for CI testing.

+ (cd /nccl \
+ make -j `nproc` && export PREFIX=/usr; make install)

# Configure OpenSSH server. c.f. https://docs.docker.com/engine/examples/running_ssh_service
RUN mkdir /var/run/sshd
RUN echo 'root:root' | chpasswd
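
A note on the NCCL build step in the hunk above: as rendered here, the line continuation joins `cd /nccl` and `make -j` into a single command, so `make` would be passed to `cd` as extra arguments instead of running in `/nccl`. The `;` before `make install` also lets the install run even if the build failed. The following is a minimal sketch of how the step could be chained, not the change that was proposed in this PR. It assumes the NCCL Makefile honors a `PREFIX` variable, as the original `export PREFIX=/usr` implies.

```dockerfile
# Sketch only: assumes the NCCL Makefile reads PREFIX for the install location,
# as the original step's `export PREFIX=/usr` suggests.
# Every step is joined with && so the layer fails as soon as any command fails.
RUN git clone https://github.com/NVIDIA/nccl /nccl && \
    cd /nccl && \
    make -j "$(nproc)" && \
    make install PREFIX=/usr
```

The clone is unpinned in both the original and this sketch, so it builds whatever the repository's default branch currently holds. Checking out a specific 1.3 tag or commit would make the step match its "NCCL 1.3" comment.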