At Red Hen, we have started to use OpenPose for gesture recognition. To ensure portability across the various HPC facilities offered by the universities of the researchers involved, I chose to create a Singularity container. The container also helps with the various GPU generations available within FAU’s TinyGPU cluster. The image whose creation I describe below:
- contains the latest OpenPose codebase
- is based on the latest NVCaffe image (for newer NVidia cards)
- is compatible with all generations of NVidia cards available to us
- includes a CPU version and will automatically use that if no GPU is available
In my tests, the major advantage of the NVCaffe image was the smaller memory footprint (less than 3 GB vs. roughly 5 GB with the custom Caffe that comes with OpenPose by default), which means that the GPUs with 4 GB of RAM become usable for OpenPose. In addition, it was about 10% faster than the Caffe that comes with OpenPose on GTX 1080 cards.
Note that the instructions below assume interactive installation due to some glitches that occurred in my setup. Feel free to create a Singularity recipe out of this and post the link in the comments! The same steps should also work in Docker with minimal changes.
- Pull the NVCaffe image. We use a sandbox (i.e. directory) format because of the tzdata glitch described further below.
    singularity build --sandbox openpose_multi_container_oct_2019/ docker://nvcr.io/nvidia/caffe:19.09-py2
- Open writable shell as root:
sudo singularity shell -w openpose_multi_container_oct_2019
- Let us upgrade OS in the container:
    export LC_ALL=C
    apt-get -y --no-install-recommends update && \
    apt-get -y --no-install-recommends upgrade
- I was greeted with the following error (skip this step if everything runs fine):
dpkg: error processing archive /tmp/apt-dpkg-install-XOAywY/02-tzdata_2019c-0ubuntu0.18.04_all.deb (--unpack): unable to make backup link of './usr/share/zoneinfo/UCT' before installing new version: Invalid cross-device link
I did not find an actual solution to this problem (again, let me know in the comments if you have one), so we are going to do a simple workaround. We exit Singularity, rename the file in the container directory, go back into the container and try again.
    exit
    mv openpose_multi_container_oct_2019/usr/share/zoneinfo/UCT openpose_multi_container_oct_2019/usr/share/zoneinfo/UCT_original
    sudo singularity shell -w openpose_multi_container_oct_2019
    export LC_ALL=C
    apt-get -y --no-install-recommends upgrade
You will probably be prompted for timezone information.
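If the same error shows up for more than one file, the rename can be wrapped in a small helper. The following is only a sketch: `move_aside` is a hypothetical name, not part of Singularity, and it assumes it is run on the host (outside the container) with write permission on the sandbox directory.

```shell
# Hypothetical helper: move a file inside the sandbox out of dpkg's way.
# $1 = sandbox directory, $2 = path of the file relative to the sandbox root.
move_aside() {
    target="$1/$2"
    # Only rename if the file (or symlink) is still there, so the call can be repeated safely.
    if [ -e "$target" ] || [ -L "$target" ]; then
        mv -- "$target" "${target}_original"
    fi
}

# Same effect as the mv command above; does nothing if the file is absent.
move_aside openpose_multi_container_oct_2019 usr/share/zoneinfo/UCT
```

As for the timezone prompt, prefixing the apt-get call with DEBIAN_FRONTEND=noninteractive is a common way to suppress it, at the price of accepting the package default.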
- Let us install a range of dependencies and tools:
    apt-get install -y --no-install-recommends \
        build-essential \
        cmake \
        git \
        wget \
        nano \
        dialog \
        software-properties-common \
        libatlas-base-dev \
        libleveldb-dev \
        libsnappy-dev \
        libhdf5-serial-dev \
        libboost-all-dev \
        libgflags-dev \
        libgoogle-glog-dev \
        liblmdb-dev \
        pciutils \
        python3-setuptools \
        python3-dev \
        python3-pip \
        opencl-headers \
        ocl-icd-opencl-dev \
        libviennacl-dev \
        libavcodec-dev \
        libavformat-dev \
        libswscale-dev \
        libv4l-dev \
        libxvidcore-dev \
        libx264-dev \
        libgtk-3-dev \
        gfortran \
        pkg-config \
        libcanberra-gtk-module && \
    python3 -m pip install \
        numpy \
        opencv-python
    pip3 install protobuf==3.6.0
    add-apt-repository -y ppa:jonathonf/ffmpeg-4
    apt-get -y --no-install-recommends update
    apt-get -y install ffmpeg
- The NVCaffe image is strange in that it does not contain some pieces of software against which NVCaffe appears to be built. Protobuf and OpenCV are not installed, but the build fails without them. However, NVCaffe seems to expect them to be in a version that is NOT part of the OS’s package management. Ubuntu 18.04 currently contains Protobuf 3.0 and OpenCV 3.2, but the build fails if those are installed via apt, complaining about version mismatches. We thus need to install OpenCV 3.4 and Protobuf 3.6.0 specifically, although for the latter the pip3 line in the previous step should be enough.
- OpenCV with CUDA support and Fast Math (not that the latter seems to change much in our case…):
    cd /opt
    wget -O opencv3.4.8.zip https://github.com/opencv/opencv/archive/3.4.8.zip
    wget -O opencv-contrib3.4.8.zip https://github.com/opencv/opencv_contrib/archive/3.4.8.zip
    unzip opencv3.4.8.zip
    unzip opencv-contrib3.4.8.zip
    cd opencv-3.4.8/
    mkdir build && cd build
    cmake -D CMAKE_BUILD_TYPE=RELEASE \
        -D CMAKE_INSTALL_PREFIX=/usr/local \
        -D WITH_CUDA=ON \
        -D ENABLE_FAST_MATH=1 \
        -D CUDA_FAST_MATH=1 \
        -D WITH_CUBLAS=1 \
        -D WITH_FFMPEG=ON \
        -D INSTALL_PYTHON_EXAMPLES=ON \
        -D OPENCV_EXTRA_MODULES_PATH=/opt/opencv_contrib-3.4.8/modules \
        -D OPENCV_ENABLE_NONFREE=ON \
        -D BUILD_EXAMPLES=ON ..
    make -j`nproc`
    make install
    cd /opt
    rm opencv3.4.8.zip
    rm opencv-contrib3.4.8.zip
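Since the OpenPose build is picky about these versions, it may be worth checking what actually ended up in the container before going on. A minimal sketch: `check_version` is a hypothetical helper, and the demo calls use the version numbers mentioned above rather than querying the system.

```shell
# Hypothetical helper: compare a found version against a wanted version prefix.
# $1 = label, $2 = wanted version (prefix), $3 = version actually found.
check_version() {
    case "$3" in
        "$2"|"$2".*) echo "$1 $3: OK" ;;
        *)           echo "$1 $3: expected $2"; return 1 ;;
    esac
}

# Demo with the numbers from the text: the self-built OpenCV vs. the apt Protobuf.
check_version OpenCV 3.4 3.4.8
check_version Protobuf 3.6.0 3.0.0 || true
```

Inside the container, the third argument could come from something like `pkg-config --modversion opencv` (assuming the build installed an opencv.pc file) or `python3 -c 'import google.protobuf; print(google.protobuf.__version__)'`.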
- Clone OpenPose and make a copy for the CPU version
    cd /opt
    git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose.git
    cp -R openpose openpose_cpu
- In our environment, I have root access only to a machine that does not have a GPU, but the compilation of the NVCaffe version appears to require the presence of a GPU. I therefore need to move the image for compilation to a machine in the HPC cluster, which does not have direct Internet access. Skip this step if you have root access to a GPU machine (but make sure you called singularity with --nv and loaded CUDA before).
    chmod -R a+rwX /opt/openpose
    chmod -R a+rwX /opt/openpose_cpu
    chmod a+rwX /opt
    exit
    sudo tar cvzf openpose_multi_container_oct_2019.tar.gz openpose_multi_container_oct_2019/
    # copy to remote machine, SSH there and execute
    tar xvzf openpose_multi_container_oct_2019.tar.gz
    # You may need to load the correct NVidia drivers here. In our case:
    module load cuda/10.1
    singularity shell --nv -w openpose_multi_container_oct_2019
    export LC_ALL=C
    # optional: set http proxy
    export HTTP_PROXY=http://proxy.rrze.uni-erlangen.de:80
    export HTTPS_PROXY=https://proxy.rrze.uni-erlangen.de:443
    export http_proxy=http://proxy.rrze.uni-erlangen.de:80
    export https_proxy=https://proxy.rrze.uni-erlangen.de:443
- Modify /opt/openpose/CMakeLists.txt; the added line removes a build error that occurred for me.
    # Add this line (I put it in line 226):
    find_package(Boost COMPONENTS system filesystem REQUIRED)
- Build OpenPose for GPU, enabling all GPU architectures (but see the last step for old cards):
    mkdir -p /opt/openpose/build && \
    cd /opt/openpose/build && \
    cmake -DDL_FRAMEWORK=NV_CAFFE -DCaffe_INCLUDE_DIRS=/usr/local/lib/include/caffe \
        -DCaffe_LIBS=/usr/local/lib/libcaffe-nv.so -DBUILD_CAFFE=OFF -DCUDA_ARCH=All .. && \
    make -j`nproc`
- There is a problem with the CPU version of Caffe used by OpenPose. We need to make some minor changes before it will compile in this container. If needed, you can download the COCO model that is faster with the CPU version.
    # Edit /opt/openpose_cpu/3rdparty/caffe/src/caffe/layers/mkldnn_inner_product_layer.cpp
    # Add 3 spaces to the beginning of lines 354 and 357
    # Add 4 spaces to the beginning of lines 355 and 358
    mkdir -p /opt/openpose_cpu/build && \
    cd /opt/openpose_cpu/build && \
    cmake -DGPU_MODE=CPU_ONLY .. && \
    make -j`nproc`
    # The following step is optional, in case you want to use the COCO model (faster on CPUs but less accurate)
    cd /opt/openpose/models
    wget -c http://posefs1.perception.cs.cmu.edu/OpenPose/models/pose/coco/pose_iter_440000.caffemodel -P pose/coco/
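The manual indentation edit can also be scripted with sed. This is only a sketch: `indent_fix` is a hypothetical name, and the line numbers are the ones given above, so they only apply if the file has not changed since this was written. Check the lines in an editor first.

```shell
# Hypothetical helper: prepend spaces to the four lines named in the comments above.
# $1 = path to mkldnn_inner_product_layer.cpp
indent_fix() {
    # 3 spaces for lines 354 and 357, 4 spaces for lines 355 and 358.
    sed -e '354s/^/   /' -e '357s/^/   /' \
        -e '355s/^/    /' -e '358s/^/    /' "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# Usage inside the container (run once, before cmake):
# indent_fix /opt/openpose_cpu/3rdparty/caffe/src/caffe/layers/mkldnn_inner_product_layer.cpp
```

Writing to a temporary file and moving it into place avoids relying on the in-place flag of a particular sed implementation. Running the helper twice would indent the lines twice, so it should be applied only once.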
- Make this the content of the file /.singularity.d/runscript:
    #!/bin/bash
    if nvidia-smi; then
        cd /opt/openpose
        echo "#### USING GPU ####"
    else
        cd /opt/openpose_cpu
        echo "#### USING CPU ####"
    fi
    ./build/examples/openpose/openpose.bin "$@"
- Make sure the runscript is executable and exit:
    chmod a+rx /.singularity.d/runscript
    exit
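The runscript decides purely on the exit status of nvidia-smi: if the command is missing or finds no usable device, it fails and the CPU build is used. The sketch below isolates that decision so it can be tried on any machine. `select_openpose_dir` is a hypothetical name, and the optional probe command argument exists only for this demonstration.

```shell
# Hypothetical helper: print the OpenPose directory the runscript would pick.
# Arguments (optional): a probe command to use instead of nvidia-smi.
select_openpose_dir() {
    if [ "$#" -eq 0 ]; then
        set -- nvidia-smi
    fi
    # A missing command (exit status 127) counts as "no GPU", just like a failing one.
    if "$@" >/dev/null 2>&1; then
        echo /opt/openpose
    else
        echo /opt/openpose_cpu
    fi
}

select_openpose_dir true     # simulates a working nvidia-smi
select_openpose_dir false    # simulates a machine without a GPU
```

Unlike the runscript above, this variant discards the output of nvidia-smi. Whether that table is welcome in the job log is a matter of taste.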
- Convert the sandbox directory to a compressed read-only Singularity Image File (SIF) for production use:
    singularity build openpose_multi_container_oct_2019.sif openpose_multi_container_oct_2019/
You can now delete the sandbox directory if you like.
- To run OpenPose, you can now simply run the Singularity container with the OpenPose options. In the example below, we run on a GPU (hence --nv) and just want JSON output for body pose, face and hand.
singularity run --nv ~/openpose/openpose_multi_container_oct_2019.sif --video ~/tmp/whatever.mp4 --write_json ~/wherever --display 0 --render_pose 0 --face --hand
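For more than one video, the same call can be wrapped in a loop. A minimal sketch, assuming one JSON output directory per video is wanted: `run_openpose_batch` is a hypothetical name, and the OpenPose options are exactly the ones from the command above.

```shell
# Hypothetical helper: run the container once per video.
# $1 = path to the .sif image, $2 = root directory for JSON output, rest = video files.
run_openpose_batch() {
    sif=$1
    out_root=$2
    shift 2
    for video in "$@"; do
        name=$(basename "$video")
        out_dir="$out_root/${name%.*}"   # e.g. whatever.mp4 -> <out_root>/whatever
        mkdir -p "$out_dir"
        singularity run --nv "$sif" --video "$video" --write_json "$out_dir" \
            --display 0 --render_pose 0 --face --hand
    done
}

# Usage:
# run_openpose_batch ~/openpose/openpose_multi_container_oct_2019.sif ~/wherever ~/tmp/*.mp4
```

On a machine without a GPU, the --nv option would have to be dropped.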
- If you have legacy cards, NVCaffe is not for you. Officially, only the Pascal, Volta and Turing architectures are supported. Details about card generations can be found here. However, in my tests, the GTX 980 from the Maxwell generation also worked with the NVCaffe build of OpenPose described above. The Kepler cards (K20m, K40m) did not. For those, we will install OpenPose with the custom Caffe that comes with it by default. Unfortunately, we need a different version of cmake than the one that ships with Ubuntu 18.04 due to some incompatibility.
    cd /opt
    wget https://github.com/Kitware/CMake/releases/download/v3.15.4/cmake-3.15.4-Linux-x86_64.sh
    /bin/sh cmake-3.15.4-Linux-x86_64.sh
    # [say yes to everything]
    git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose.git openpose_legacy_gpu
    cd /opt/openpose_legacy_gpu
    rm -rf /opt/openpose_legacy_gpu/build
    mkdir -p /opt/openpose_legacy_gpu/build && \
    cd /opt/openpose_legacy_gpu/build && \
    /opt/cmake-3.15.4-Linux-x86_64/bin/cmake -DCUDA_ARCH=All .. && \
    make -j`nproc`
Note: It looks like custom Caffe does not produce code for all architectures needed. OpenPose claims to be preparing code for sm_30 to sm_75 when -DCUDA_ARCH=All is set, but the “Caffe Configuration Summary” shown during the build does not include sm_75. We do not care at the moment, because anything from sm_50 onwards should run with NVCaffe.
We also have to adapt the runscript to select the appropriate version of OpenPose:
    #!/bin/bash
    if nvidia-smi; then
        if (($(deviceQuery | grep "CUDA Capability" | grep -oP "(?<= )[0-9]" | head -n 1) >= 5 )); then
            cd /opt/openpose
            echo "#### USING GPU with NVCaffe ####"
        else
            cd /opt/openpose_legacy_gpu
            echo "#### USING Legacy GPU with Custom Caffe ####"
        fi
    else
        cd /opt/openpose_cpu
        echo "#### USING CPU ####"
    fi
    ./build/examples/openpose/openpose.bin "$@"
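The version check in this runscript extracts the major compute capability of the first GPU from the deviceQuery output. The sketch below shows the same idea on a sample line, using sed instead of grep -P. The function names are hypothetical, and the sample line is an assumption about what deviceQuery prints, so compare it with the real output on your system.

```shell
# Hypothetical helper: read deviceQuery output on stdin, print the major
# compute capability of the first device (e.g. "3" for 3.5).
cc_major() {
    sed -n 's/.*CUDA Capability.*:[[:space:]]*\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Hypothetical helper: map the major version to the build to use.
# An empty argument (deviceQuery missing or no match) falls back to the legacy build.
pick_gpu_build() {
    if [ "${1:-0}" -ge 5 ]; then
        echo /opt/openpose
    else
        echo /opt/openpose_legacy_gpu
    fi
}

# Assumed sample line for a Kepler card such as the K20m.
sample='  CUDA Capability Major/Minor version number:    3.5'
pick_gpu_build "$(printf '%s\n' "$sample" | cc_major)"
```

Like the runscript, this only looks at the first device, so nodes with mixed GPU generations would need extra care.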
Please leave any feedback in the comments.