December 23, 2019

Deep Learning Video Analytics Frameworks based on Gstreamer

By taras.lishchenko | Deep Learning, Gstreamer, Top List


Video analytics applications (e.g., smart cities, retail, industry) consist of two main parts: video streaming and computer vision/deep learning frameworks. Here we’ll go through the available frameworks that let developers focus on the analytics part and hide the nuances of video streaming.

Overview

The general architecture of a Video Analytics application looks like the following.

Source: http://on-demand.gputechconf.com/gtc-cn/2018/pdf/CH8307.pdf

What stands behind each part of this architecture?

Process	Variations
Collect	Web/IP Camera; HTTP/RTP/RTSP Streaming; Video Files (single, multiple); Image Files (single, multiple)
Decode/Encode	Video compression formats: MJPEG/H264/H265/…; Video containers: MPEG-4/AVI/MOV/…; Colorspaces: RGB/RGBA/BGR/YUV/…
Pre-Process	Crop/Scale/Draw/Enhance/Filters
Output	Video File (single, multiple) with Digital Video Record (DVR); Image File (single, multiple); Show Window (single, multiple, composite); Stream: TCP/UDP/HTTP/RTP/RTSP

Every step listed above requires solid expertise in image/video processing, but the frameworks below can hide most of that complexity.
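To make the four stages more concrete, here is a minimal sketch of how they could map onto a Gstreamer pipeline description. The helper name `build_pipeline`, the file names, and the default resolution are assumptions made for illustration; the element names (`filesrc`, `decodebin`, `videoconvert`, `videoscale`, `x264enc`, `mp4mux`, `filesink`) are standard Gstreamer elements, but whether they are installed depends on your plugin set. The sketch only builds the description string, it does not run the pipeline.

```python
def build_pipeline(source_path, output_path, width=640, height=360):
    """Build a gst-launch style description: Collect -> Decode -> Pre-Process -> Output.

    Hypothetical helper for illustration. Paths are assumed to contain no spaces.
    """
    stages = [
        # Collect: read a video file from disk
        f"filesrc location={source_path}",
        # Decode: let Gstreamer pick a suitable demuxer and decoder
        "decodebin",
        # Pre-Process: convert the colorspace and scale to the target size
        "videoconvert",
        "videoscale",
        f"video/x-raw,width={width},height={height}",
        # Output: encode to H.264, wrap in an MP4 container, write to a file
        "x264enc",
        "mp4mux",
        f"filesink location={output_path}",
    ]
    # Gstreamer links consecutive elements with " ! "
    return " ! ".join(stages)


if __name__ == "__main__":
    print(build_pipeline("input.mp4", "output.mp4"))
```

The resulting string can be tried with the `gst-launch-1.0` command line tool (add `-e` so the MP4 file is finalized properly), or handed to `Gst.parse_launch` from the Gstreamer Python bindings. Swapping the first or last stage (for example an RTSP source or a window sink) changes the Collect or Output step without touching the rest.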

General information about the top Video Analytics Frameworks.

Framework	Year	Maintainer	Language	Video Streaming Framework	Arch	OS
OpenCV	2010	Community	C/C++, Python, Java	FFmpeg, Gstreamer	arm, arm64, x64, x86	Linux, MacOS, Windows, iOS, Android
GstInference	2019	RidgeRun	C/C++	Gstreamer
NNStreamer	2018	Samsung	C/C++	Gstreamer	arm, arm64, x64, x86, *more	Tizen, Ubuntu, Android, Yocto, MacOS, *more
DeepStream	2018	Nvidia	C/C++, Python	Gstreamer (GPU-accelerated: Nvidia)	arm, arm64, *more	Ubuntu, *more
Gst-Video-Analytics	2019	OpenCV	C/C++	Gstreamer (GPU-CPU accelerated: VAAPI, OpenGL)	x64, x86	Linux*, *more

What is under the hood of each Video Analytics Framework? Video Streaming and CV/ML Frameworks Support.

Framework	Video Streaming Framework	CV/ML Frameworks Support
OpenCV	FFmpeg, Gstreamer	OpenCV
GstInference	Gstreamer	Neural Compute SDK (NCSDK), Tensorflow, Caffe, TensorRT, OpenCV, *more
NNStreamer*	Gstreamer	Tensorflow, Tensorflow-Lite, PyTorch, Caffe2
DeepStream	Gstreamer (GPU-accelerated: Nvidia)	TensorRT, Caffe
Gst-Video-Analytics	Gstreamer (GPU-CPU accelerated: VAAPI, OpenGL)	OpenVINO, OpenCV

* Documentation

In general, all of these frameworks are built on top of the open source media streaming libraries FFmpeg and Gstreamer. On the CV/ML side there is a variety of options: Tensorflow, Tensorflow-Lite, TensorRT, PyTorch, Caffe, OpenVINO, OpenCV.

Thoughts

From the client’s perspective, most product development goals can be reduced to the following (Video Analytics case):

Reduce Development Costs/Time
- Solutions:
  - reuse of existing solutions
  - balance between Software Engineering and Data Science common skill set
Reduce Product Cost
- Solutions:
  - process multiple video feeds; use fast and accurate models
    - efficient hardware usage
      - shared memory resources
      - up to 99% utilization of hardware capabilities
    - reduced data storage and transmission
Reduce Product Scaling Costs/Time
- Solutions:
  - generality: a single solution for multiple use cases

All of the listed frameworks help to build software faster and more efficiently.

Personal Experience

I started prototyping Video Analytics applications with OpenCV. When new project requirements (resolution/fps setup, video recording, custom operations, performance improvements) pushed us past its limits, we switched to Gstreamer on the advice of another expert.
I know C/C++ well, but I started my deep dive into Gstreamer with Python (dependency setup and development itself are easier). With Python I could focus mostly on how the framework works, so when prototyping we failed or succeeded faster.
Exploring Gstreamer is a challenging but rewarding process. The lack of resources and community is a huge problem. The main pain was setting everything up and making Python and Gstreamer friends.
Diving into Gstreamer helped me learn its architecture, its approaches to code development, and the basics of video processing. It was and still is an entertaining process 😉 . I think Gstreamer has one of the best architectures (interfaces, abstractions), which gives developers great flexibility and extensibility (sometimes the code might be dirty or not intuitive, but nothing is perfect …) 😉
Now I’m glad when I see other companies using Gstreamer for Video Analytics applications.
I often use FFmpeg as a command line tool (the commands are shorter and sometimes clearer).
OpenCV works for me when a prototype has to be delivered in a short time and there are no restrictions on hardware performance. OpenCV supports Gstreamer as well, but that requires building the library with additional options enabled (using the pip package is so much easier).
I constantly look for new repositories and frameworks that simplify the development of Video Analytics applications.
- Btw: the Google examples for the Coral TPU Dev Board include both OpenCV and Gstreamer examples.
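Regarding the note above that OpenCV needs a build with Gstreamer enabled: one way to check an installed build is to look at the text returned by `cv2.getBuildInformation()`. The sketch below is a minimal example under the assumption that the report contains a line of the form `GStreamer: YES (...)` or `GStreamer: NO` in its Video I/O section; the helper name `gstreamer_enabled` is made up for illustration, and the exact report layout may differ between OpenCV versions.

```python
def gstreamer_enabled(build_info: str) -> bool:
    """Return True if an OpenCV build report says GStreamer support is on.

    Hypothetical helper; assumes a report line like 'GStreamer:   YES (1.16.2)'.
    """
    for line in build_info.splitlines():
        stripped = line.strip()
        if stripped.startswith("GStreamer:"):
            # Everything after the first colon holds the YES/NO status
            return "YES" in stripped.split(":", 1)[1]
    # No GStreamer line at all: treat it as not enabled
    return False


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("OpenCV is not installed")
    else:
        print("Gstreamer enabled:", gstreamer_enabled(cv2.getBuildInformation()))
```

If the check passes, a pipeline description that ends in `appsink` can be opened with `cv2.VideoCapture(description, cv2.CAP_GSTREAMER)`, which keeps the analytics code in OpenCV while Gstreamer handles the streaming part.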

Conclusion

What features do all of them have in common?

Gstreamer is used for Video Streaming
Tendency to provide Python bindings (so more developers can dive deep faster)
Multiple Deep Learning/Computer Vision Frameworks Support (generality)

Which one to choose is definitely up to you and your project requirements:

OS?
Target Architecture?
Programming Language?
Deadlines?
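The questions above can be turned into a simple filter over the tables earlier in this post. The sketch below is only an illustration: the `FRAMEWORKS` dictionary copies the Language and CV/ML columns as listed in those tables (which are incomplete, see the "*more" entries), and the function name `candidates` is an assumption.

```python
# Data copied from the tables in this post; "*more" entries are omitted.
FRAMEWORKS = {
    "OpenCV": {
        "languages": {"C/C++", "Python", "Java"},
        "ml": {"OpenCV"},
    },
    "GstInference": {
        "languages": {"C/C++"},
        "ml": {"NCSDK", "Tensorflow", "Caffe", "TensorRT", "OpenCV"},
    },
    "NNStreamer": {
        "languages": {"C/C++"},
        "ml": {"Tensorflow", "Tensorflow-Lite", "PyTorch", "Caffe2"},
    },
    "DeepStream": {
        "languages": {"C/C++", "Python"},
        "ml": {"TensorRT", "Caffe"},
    },
    "Gst-Video-Analytics": {
        "languages": {"C/C++"},
        "ml": {"OpenVINO", "OpenCV"},
    },
}


def candidates(language=None, ml_framework=None):
    """Return framework names matching the given requirements (None = any)."""
    result = []
    for name, info in FRAMEWORKS.items():
        if language is not None and language not in info["languages"]:
            continue
        if ml_framework is not None and ml_framework not in info["ml"]:
            continue
        result.append(name)
    return result


if __name__ == "__main__":
    print(candidates(language="Python"))
    print(candidates(ml_framework="TensorRT"))
```

OS and target architecture could be added as extra keys in the same way; deadlines are harder to encode and remain a judgment call.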

Based on my experience:

1. start with the simplest
2. explore, understand, exceed the limits
3. make a more conscious decision (based on the experience you have now)
4. repeat steps 2-3

In any case, additional knowledge of Video Streaming (with Gstreamer or FFmpeg) can help you improve your Project Design, Performance, and Accuracy.

Hope you enjoyed reading. If I missed any framework, let me know in the comments 😉

Tags: computer vision, deep learning, deepstream, gstinference, gstreamer, nnstreamer, opencv, python, video streaming
