How to install Nvidia Gstreamer plugins (nvenc, nvdec) on Ubuntu?
Gstreamer's avdec_h264 (H264 video decoding) and x264enc (H264 video encoding) plugins run on the CPU. The Nvidia Gstreamer plugins (nvenc, nvdec) move that work to the GPU. In the tests below, the GPU-based plugins were up to about 4 times faster for encoding and about 3 times faster for display.
Requirements
- Ubuntu
- Gstreamer
- Nvidia GPU
- Nvidia Video Codec SDK
Scripts
Learn how to:
- build Nvidia gstreamer plugins nvenc (nvh264enc) and nvdec
- use OpenGL gstreamer plugins gldownload, glimagesink
Guide
Environment setup
- CUDA Version 10.0.130
- Ubuntu 18.04.4 LTS
- Video_Codec_SDK_9.0.20
- Gstreamer 1.14.5
- NVIDIA Driver 435.21
Inspect
First, check whether the plugins are already installed:
gst-inspect-1.0 nvdec
On the test machine the plugin isn't installed yet, so the command's output looks like this:
No such element or plugin 'nvdec'
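To check several elements at once, the inspection can be wrapped in a small helper. This is a minimal sketch: it relies on the `--exists` option of gst-inspect-1.0, and the `GST_INSPECT` variable is a hypothetical override added here only so the function can be tried without Gstreamer installed.

```shell
# Report which of the given Gstreamer elements are available.
# GST_INSPECT is a hypothetical override (not a Gstreamer variable);
# it defaults to the real gst-inspect-1.0 binary.
check_gst_elements() {
    missing=0
    for element in "$@"; do
        if "${GST_INSPECT:-gst-inspect-1.0}" --exists "$element" >/dev/null 2>&1; then
            echo "$element: installed"
        else
            echo "$element: missing"
            missing=1
        fi
    done
    # Non-zero exit status when at least one element is missing.
    return "$missing"
}
```

Usage: `check_gst_elements nvdec nvh264enc` prints one line per element and returns a non-zero status if any of them is missing.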
Requirements
First, check that your GPU supports NVENC and NVDEC using the Video Encode and Decode GPU Support Matrix.
The initial requirements for the nvenc and nvdec plugins are specified in the official Gstreamer Nvidia plugins README.
Install CUDA
Check CUDA version:
cat /usr/local/cuda/version.txt
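If only the version number is needed (for example to compare it in a script), it can be pulled out of that file. A small sketch, assuming the file contains a line of the form `CUDA Version 10.0.130` as on the test machine; on toolkits that do not ship version.txt, `nvcc --version` is an alternative.

```shell
# Print the CUDA toolkit version recorded in version.txt, e.g. "10.0.130".
# The file path is an optional argument so other install prefixes can be checked.
cuda_version() {
    version_file="${1:-/usr/local/cuda/version.txt}"
    if [ ! -r "$version_file" ]; then
        echo "not found: $version_file" >&2
        return 1
    fi
    # The file is assumed to hold a line such as "CUDA Version 10.0.130".
    awk '/CUDA Version/ { print $3; exit }' "$version_file"
}
```

Usage: `cuda_version` for the default location, or `cuda_version /opt/cuda/version.txt` for a custom prefix (that path is only an illustration).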
If CUDA is missing, install the CUDA Toolkit using the official NVIDIA CUDA Installation Guide.
Install NVIDIA Video Codec SDK
Download
First, go to the NVIDIA Video Codec SDK page and download a version that is compatible with your Nvidia driver. The page lists the driver requirements for the latest SDK version.
Use nvidia-smi to check your Nvidia driver version:
nvidia-smi
Note: In my case the local driver version is 435.21, and Nvidia Video Codec SDK 9.1 requires 435.21 or newer. However, I got everything working with the previous release, Nvidia Video Codec SDK 9.0.
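The driver comparison can also be scripted. The sketch below is one possible way to do it: `version_ge` and `check_driver` are helper names made up for this example, the comparison relies on `sort -V` from GNU coreutils, and the minimum version has to be copied by hand from the SDK page.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare a driver version against a required minimum.
#   $1: minimum version from the SDK page
#   $2: installed version (optional; queried from nvidia-smi when omitted)
check_driver() {
    required="$1"
    installed="${2:-$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)}"
    if version_ge "$installed" "$required"; then
        echo "driver $installed satisfies >= $required"
    else
        echo "driver $installed is older than $required"
        return 1
    fi
}
```

Usage: `check_driver 435.21` queries the local driver through nvidia-smi; `check_driver 435.21 440.100` compares two explicit versions without touching the GPU.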
Install
To install the Video Codec SDK, extract the downloaded archive and copy the headers and libraries into your CUDA path (e.g. /usr/local/cuda/). Writing there may require root permissions. For example:
unzip Video_Codec_SDK.zip
cd Video_Codec_SDK
cp include/* /usr/local/cuda/include
cp Lib/linux/stubs/x86_64/* /usr/local/cuda/lib64/stubs
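After copying, it is worth confirming that the files landed where the plugin build will look for them. The sketch below assumes the SDK 9.0 file names (nvEncodeAPI.h, cuviddec.h, nvcuvid.h and the libnvcuvid.so / libnvidia-encode.so stubs); other SDK releases may name or place them differently, so adjust the list to what your archive contains.

```shell
# Check that the Video Codec SDK headers and stub libraries are present
# under a CUDA prefix. The file list is an assumption based on SDK 9.0.
check_sdk_files() {
    prefix="${1:-/usr/local/cuda}"
    result=0
    for f in include/nvEncodeAPI.h include/cuviddec.h include/nvcuvid.h \
             lib64/stubs/libnvcuvid.so lib64/stubs/libnvidia-encode.so; do
        if [ -e "$prefix/$f" ]; then
            echo "ok       $f"
        else
            echo "MISSING  $f"
            result=1
        fi
    done
    return "$result"
}
```

Usage: `check_sdk_files` checks /usr/local/cuda; pass another prefix as the first argument if CUDA is installed elsewhere.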
Build nvenc, nvdec gstreamer plugins
The Nvidia gstreamer plugins are located in the gst-plugins-bad package.
Note: the gst-plugins-bad repository holds plugins that still need more quality work, testing or documentation.
Now, let's clone the repository and build the plugins.
git clone git://anongit.freedesktop.org/git/gstreamer/gst-plugins-bad
cd gst-plugins-bad
git checkout $(gst-launch-1.0 --version | \
grep version | tr -s ' ' '\n' | tail -1)
./autogen.sh --disable-gtk-doc --noconfigure
NVENCODE_CFLAGS="-I/usr/local/cuda/include" \
./configure --with-cuda-prefix="/usr/local/cuda"
Note: the additional NVENCODE_CFLAGS variable tells the build to look in CUDA's include folder, and the --with-cuda-prefix flag specifies the exact CUDA location.
Check the output to confirm that the nvenc and nvdec plugins are going to be built.
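The `git checkout` line above picks the repository tag that matches the installed Gstreamer version. To see what that command chain does, the sketch below isolates it in a function (the name `gst_version_from_output` is made up for this example) that reads the `gst-launch-1.0 --version` text on stdin.

```shell
# Read `gst-launch-1.0 --version` output on stdin and print the version
# number, using the same grep | tr | tail chain as the checkout command:
#   grep version    keeps the line "gst-launch-1.0 version 1.14.5"
#   tr -s ' ' '\n'  puts every word on its own line
#   tail -1         keeps the last word, i.e. the version number
gst_version_from_output() {
    grep version | tr -s ' ' '\n' | tail -1
}
```

Usage: `gst-launch-1.0 --version | gst_version_from_output` should print a tag name such as 1.14.5; `git tag -l 1.14.5` confirms that the tag exists before checking it out.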
Then go to the nvenc and nvdec folders and build the libraries.
cd sys/nvenc
make
make install
cd ../nvdec
make
make install
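The two build steps can also be wrapped in a function that builds each plugin in its own subshell, so the working directory of the calling shell never changes. This is only a sketch: `build_nv_plugins` is a name chosen for this example, and the `MAKE` override (for instance `MAKE="make -j4"`) is a convenience assumed here, not something the plugin build requires.

```shell
# Build and install nvenc and nvdec from a configured gst-plugins-bad tree.
#   $1: path to the gst-plugins-bad checkout (defaults to the current dir)
build_nv_plugins() {
    src_root="${1:-.}"
    for plugin in nvenc nvdec; do
        # The subshell keeps the `cd` local to this iteration.
        ( cd "$src_root/sys/$plugin" && ${MAKE:-make} && ${MAKE:-make} install ) || {
            echo "build failed in sys/$plugin" >&2
            return 1
        }
    done
}
```

Usage: run `build_nv_plugins` from the gst-plugins-bad folder after ./configure has finished, or pass the checkout path as the first argument.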
Note: By default gstreamer installs the libraries (*.so) into /usr/local/lib/gstreamer-1.0/ unless another location is specified with the --prefix option when configuring.
To load gstreamer plugins from a non-default location, set the GST_PLUGIN_PATH environment variable. For example:
export GST_PLUGIN_PATH=$GST_PLUGIN_PATH:/usr/local/lib/gstreamer-1.0/
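When this line ends up in a shell profile that is sourced repeatedly, the same folder gets appended again and again, and an empty GST_PLUGIN_PATH leaves a stray leading colon. A possible helper that avoids both (the function name is hypothetical) could look like this:

```shell
# Append a directory to GST_PLUGIN_PATH unless it is already listed.
add_gst_plugin_path() {
    dir="${1%/}"    # drop a trailing slash so comparisons are consistent
    case ":${GST_PLUGIN_PATH:-}:" in
        *":$dir:"*)
            ;;      # already present, nothing to do
        *)
            # Add a separating colon only when the variable is non-empty.
            GST_PLUGIN_PATH="${GST_PLUGIN_PATH:+$GST_PLUGIN_PATH:}$dir"
            ;;
    esac
    export GST_PLUGIN_PATH
}
```

Usage: `add_gst_plugin_path /usr/local/lib/gstreamer-1.0/` can be called any number of times and the folder is listed only once.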
Now check that both the nvenc and nvdec plugins are available with gst-inspect-1.0:
GST_DEBUG=nvdec*:6,nvenc*:6 gst-inspect-1.0 nvdec
Or
GST_PLUGIN_PATH=$GST_PLUGIN_PATH:/usr/local/lib/gstreamer-1.0/ \
GST_DEBUG=nvdec*:6,nvenc*:6 gst-inspect-1.0 nvdec
GST_DEBUG=nvdec*:6,nvenc*:6 gst-inspect-1.0 nvh264enc
Note: If something fails, enable verbose logging for the Nvidia gstreamer plugins (level 6 is LOG, which also includes errors and warnings) to find the exact reason:
export GST_DEBUG="nvdec*:6,nvenc*:6"
Now, let's launch some pipelines to check that video decoding and encoding with Nvidia's Gstreamer plugins work correctly.
Nvidia-accelerated pipelines
Plugins Performance
Download a test video with youtube-dl. Have a look at How to watch Youtube videos with Gstreamer to explore youtube-dl.
youtube-dl --format "best[ext=mp4][protocol=https]" \
https://www.youtube.com/watch?v=9eiaiVthVrk -o jumanji.mp4
Let’s check video information with gst-discoverer-1.0:
gst-discoverer-1.0 jumanji.mp4
Note: Pay attention to the container type (QuickTime), container format (MP4) and video codec (H264). This information helps you build the right pipeline to process the video.
In addition, let's use ffprobe to get the video's frame count and resolution:
ffprobe -v error -select_streams v:0 \
-show_entries stream=nb_frames,width,height \
-of default=noprint_wrappers=1 jumanji.mp4
width=1280
height=720
nb_frames=3948
Hardware specifications
- GPU GeForce GTX 1050
- CPU Intel(R) Core(TM) i7-7700HQ @ 2.80GHz
First, let's decode the video with the avdec_h264 plugin. It is a pure software decoder from Libav (an open-source audio and video processing library). A typical pipeline looks like this:
gst-launch-1.0 filesrc location=jumanji.mp4 ! qtdemux \
! h264parse ! avdec_h264 ! fakesink
# timing: 2.63s
Note: avdec_h264 is used because the video codec is H264.
Note: qtdemux is used to demux the QuickTime container of the video file.
Also, let's check the CPU and GPU load while the file is processed: the CPU is loaded at ~65% and the GPU is idle.
Now, try to decode the video with nvdec:
gst-launch-1.0 filesrc location=jumanji.mp4 ! qtdemux \
! h264parse ! nvdec ! fakesink
# timing: 4.173s
Note that the GPU is now loaded at ~20% and a gst-launch-1.0 process is running on the GPU. The CPU load is also around 20%, which is lower than when decoding with avdec_h264 (65%).
Note: Decoding with the NVIDIA plugin turned out to be slower (~4.1 s vs ~2.6 s for avdec_h264), possibly because of the CPU -> GPU buffer transfers and the initial memory allocation on the GPU. For bigger (e.g. 4K) and longer video files, decoding on the GPU should be faster.
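The timings above are quoted without the measuring method. One way to reproduce them is to wrap each pipeline in a small timer, sketched below; `time_cmd` is a name made up for this example, and it assumes GNU date (for the `%N` nanoseconds field), which is what Ubuntu ships. The shell's built-in `time` works just as well.

```shell
# Run a command, discard its output, and print the elapsed wall-clock
# time in seconds with millisecond resolution.
time_cmd() {
    start_ns=$(date +%s%N)
    cmd_status=0
    "$@" >/dev/null 2>&1 || cmd_status=$?
    end_ns=$(date +%s%N)
    elapsed_ms=$(( (end_ns - start_ns) / 1000000 ))
    printf '%d.%03d\n' $(( elapsed_ms / 1000 )) $(( elapsed_ms % 1000 ))
    # Pass the exit status of the timed command back to the caller.
    return "$cmd_status"
}
```

Usage: `time_cmd gst-launch-1.0 filesrc location=jumanji.mp4 ! qtdemux ! h264parse ! nvdec ! fakesink`. Running each pipeline a few times and ignoring the first run reduces the effect of disk caching and GPU initialisation.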
Now, check encoding performance using x264enc and nvh264enc plugins.
gst-launch-1.0 videotestsrc num-buffers=10000 ! x264enc ! fakesink
# timing: 7.704s
gst-launch-1.0 videotestsrc num-buffers=10000 ! nvh264enc ! fakesink
# timing: 1.912s
Decoding/Encoding plugin | CPU load | GPU load | Elapsed time | FPS | Slowdown vs. faster plugin |
avdec_h264 | ~65% | 0% | 2.63s | 1501.14 | 1x (baseline) |
nvdec | ~20% | ~24% | 4.173s | 946.08 | 1.58x |
x264enc | ~55% | 0% | 7.704s | 1298.02 | 4.03x |
nvh264enc | 1 core 100% | ~15% | 1.912s | 5230.12 | 1x (baseline) |
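The FPS column is the frame count divided by the elapsed time (3948 frames for the decoding tests, 10000 buffers for the encoding tests), and the last column is the ratio of the two elapsed times. A short sketch of that arithmetic with awk follows; the function names are made up for this example, and the last digit can differ from the table depending on rounding.

```shell
# Frames per second from a frame count and an elapsed time in seconds.
fps() {
    awk -v frames="$1" -v seconds="$2" 'BEGIN { printf "%.2f\n", frames / seconds }'
}

# How many times slower the first run was than the second.
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2fx\n", slow / fast }'
}
```

Usage: `fps 3948 2.63` gives the avdec_h264 figure (1501.14) and `slowdown 7.704 1.912` gives the x264enc vs nvh264enc ratio (4.03x).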
Example pipelines
Write to file
gst-launch-1.0 videotestsrc num-buffers=10000 ! nvh264enc ! h264parse \
! mp4mux ! filesink location=video.mp4
Note: mp4mux is used to mux the stream into the ISO MPEG-4 (MP4) container format.
Display video
The usual way to display video, with autovideosink, runs at ~245 FPS on the test machine:
gst-launch-1.0 filesrc location=jumanji.mp4 ! qtdemux ! h264parse ! \
avdec_h264 ! autovideosink sync=false
Since nvdec outputs raw buffers located on the GPU (video/x-raw(memory:GLMemory)), you can display them directly with glimagesink, a fast OpenGL plugin for rendering video frames. With this approach, video display runs at ~700 FPS.
gst-launch-1.0 filesrc location=jumanji.mp4 ! qtdemux ! h264parse ! \
nvdec ! glimagesink sync=false
And the last approach is to use nvdec for decoding, but autovideosink for video display. In such a case we need to transfer raw buffer from GPU to CPU memory with gldownload plugin:
video/x-raw(memory:GLMemory) -> video/x-raw
This pipeline runs at around 180 FPS (due to the additional memory conversion and allocation):
gst-launch-1.0 filesrc location=jumanji.mp4 ! qtdemux ! h264parse ! \
nvdec ! gldownload ! videoconvert n-threads=0 ! autovideosink sync=false
Note: for FPS measurements use fpsdisplaysink. For example:
gst-launch-1.0 filesrc location=jumanji.mp4 ! qtdemux ! h264parse ! \
nvdec ! fpsdisplaysink video-sink=glimagesink sync=false
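To switch quickly between the three display approaches while comparing FPS, the pipeline text can be generated by a helper. The sketch below only builds the strings already shown in this section; the function name and the mode names (cpu, gl, download) are made up for this example.

```shell
# Print the gst-launch-1.0 pipeline description for one display approach.
#   cpu      : avdec_h264 -> autovideosink
#   gl       : nvdec -> glimagesink (frames stay in GPU memory)
#   download : nvdec -> gldownload -> videoconvert -> autovideosink
display_pipeline() {
    file="$1"
    mode="$2"
    source_part="filesrc location=$file ! qtdemux ! h264parse"
    case "$mode" in
        cpu)      echo "$source_part ! avdec_h264 ! autovideosink sync=false" ;;
        gl)       echo "$source_part ! nvdec ! glimagesink sync=false" ;;
        download) echo "$source_part ! nvdec ! gldownload ! videoconvert n-threads=0 ! autovideosink sync=false" ;;
        *)        echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
```

Usage: `gst-launch-1.0 $(display_pipeline jumanji.mp4 gl)`. The substitution is left unquoted on purpose so the shell splits it into arguments, which also means file names containing spaces are not supported by this sketch.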
Check out more pipelines at gstreamer-commands page
Common Issues
Version mismatch
Problem. The Video Codec SDK version is not supported by the Nvidia driver.
ERROR:nvenc gstnvenc.c:331:plugin_init: Failed to get NVEncodeAPI function table!
Solution. Download an SDK version that matches your driver (see the Install NVIDIA Video Codec SDK section), or change your driver version (if available in Software & Updates).
No errors in output
Problem. Plugins are not working and no error messages are printed to the console.
Solution. Clear the gstreamer cache to reload the plugins. This should at least print an error message:
rm -rf ~/.cache/gstreamer-1.0/
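The cache folder lives under the user cache directory, so on systems where XDG_CACHE_HOME is set it is not necessarily ~/.cache. A slightly more defensive variant of the command above, written as a sketch with a made-up function name:

```shell
# Remove the gstreamer registry cache so plugins are rescanned on next use.
clear_gst_cache() {
    # Fall back to ~/.cache when XDG_CACHE_HOME is not set.
    cache_dir="${XDG_CACHE_HOME:-$HOME/.cache}/gstreamer-1.0"
    if [ -d "$cache_dir" ]; then
        rm -rf -- "$cache_dir"
        echo "removed $cache_dir"
    else
        echo "nothing to remove at $cache_dir"
    fi
}
```

Usage: run `clear_gst_cache`, then repeat `gst-inspect-1.0 nvdec` with GST_DEBUG set so that the plugin load is attempted again and any error is printed.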
Conclusion
In this guide we learned how to:
- build Nvidia gstreamer plugins nvenc (nvh264enc) and nvdec
- check the performance of different decoding (avdec_h264, nvdec) and encoding (x264enc, nvh264enc) plugins. With GPU-accelerated plugins we measured up to about a 4x speedup for encoding and about 3x for display, while nvdec decoding alone was slower than avdec_h264 on the short test clip.
- use OpenGL gstreamer plugins gldownload, glimagesink in order to display video faster using GPU.
Hope everything worked as expected 😉 If you run into trouble, have suggestions, or hit a case that hasn't been covered, leave a comment.
I tried to follow your guide with the following setup:
Ubuntu 18.04
Gstreamer 1.14.5
NVIDIA QUADRO P2000
NVIDIA-SMI 440.100 Driver Version: 440.100
CUDA Version 10.2.89
NVIDIA Video_Codec_SDK_9.0.20
I can successfully execute this command:
gst-launch-1.0 filesrc location=jumanji.mp4 ! qtdemux ! h264parse ! nvdec ! glimagesink sync=false
However, the following one fails:
gst-launch-1.0 videotestsrc num-buffers=10000 ! nvh264enc ! h264parse ! mp4mux ! filesink location=video.mp4
and I get the following error:
WARN videoencoder gstvideoencoder.c:1627:gst_video_encoder_change_state: error: Failed to open encoder
ERROR nvenc gstnvbaseenc.c:437:gst_nv_base_enc_open: Failed to create NVENC encoder session, ret=15
any clues on what the problem could be?
more detailed log:
nvenc gstnvenc.c:267:gst_nvenc_create_cuda_context: Initialising CUDA..
0:00:00.523634157 7971 0x56375974c600 INFO nvenc gstnvenc.c:276:gst_nvenc_create_cuda_context: Initialised CUDA
0:00:00.523654036 7971 0x56375974c600 INFO nvenc gstnvenc.c:284:gst_nvenc_create_cuda_context: 1 CUDA device(s) detected
0:00:00.523702909 7971 0x56375974c600 INFO nvenc gstnvenc.c:290:gst_nvenc_create_cuda_context: GPU #0 supports NVENC: yes (Quadro P2000) (Compute SM 6.1)
0:00:00.646223264 7971 0x56375974c600 INFO nvenc gstnvenc.c:312:gst_nvenc_create_cuda_context: Created CUDA context 0x5637599d78f0
0:00:00.646239492 7971 0x56375974c600 ERROR nvenc gstnvbaseenc.c:437:gst_nv_base_enc_open: Failed to create NVENC encoder session, ret=15
0:00:00.646262028 7971 0x56375974c600 INFO nvenc gstnvenc.c:320:gst_nvenc_destroy_cuda_context: Destroying CUDA context 0x5637599d78f0
0:00:00.755491991 7971 0x56375974c600 WARN videoencoder gstvideoencoder.c:1627:gst_video_encoder_change_state: error: Failed to open encoder
This tutorial really helped me install the NVIDIA plugins.
However, it appears that, at least on my machine, nvdec had higher fps on autovideosink instead of the glimagesink. Which I find very weird. How could the glimagesink be slower if the frames are already loaded to the GPU?
In my case, the CPU decoder was faster than the GPU one (when comparing with fakesink).
Any thoughts what may cause that? I find it unlikely that the CPU (Intel Core i7 9th gen) has so much more power than my GPU (GeForce RTX 2080).