Deep Learning Video Analytics Frameworks based on Gstreamer


Video analytics applications (e.g., smart cities, retail, industry) consist of two main parts: Video Streaming and Computer Vision/Deep Learning Frameworks. Here we’ll go through the available frameworks that let developers focus on the analytics part and hide the nuances of video streaming.

Overview

The general architecture of a Video Analytics application looks like the following.

[Figure: deep learning video analytics]

Source: http://on-demand.gputechconf.com/gtc-cn/2018/pdf/CH8307.pdf

What stands behind each part of the architecture above?

| Process | Variations |
|---|---|
| Collect | Web/IP Camera; HTTP/RTP/RTSP Streaming; Video Files (single, multiple); Image Files (single, multiple) |
| Decode/Encode | Video compression formats: MJPEG/H264/H265/…; Video containers: MPEG-4/AVI/MOV/…; Colorspaces: RGB/RGBA/BGR/YUV/… |
| Pre-Process | Crop/Scale/Draw/Enhance/Filters |
| Output | Video File (single, multiple) with Digital Video Record (DVR); Image File (single, multiple); Show Window (single, multiple, composite); Stream: TCP/UDP/HTTP/RTP/RTSP |

All of these steps require solid expertise in image/video processing, but most of it can be hidden by the frameworks listed next.
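To make the stages concrete, here is a minimal sketch of how Collect, Decode, Pre-Process and Output could map onto a Gstreamer pipeline description. The function name `build_pipeline` and its parameters are hypothetical, chosen only for illustration; the element names (`filesrc`, `decodebin`, `videoconvert`, `videoscale`, `autovideosink`) are standard Gstreamer elements, but the exact chain a real application needs depends on the source and the target.

```python
def build_pipeline(location, width=640, height=480):
    """Compose a gst-launch-1.0 style description, one entry per stage.

    Hypothetical helper for illustration: it only builds the text,
    it does not start Gstreamer.
    """
    stages = [
        f"filesrc location={location}",                 # Collect: read a video file
        "decodebin",                                    # Decode: pick demuxer/decoder automatically
        "videoconvert",                                 # Pre-Process: colorspace conversion
        "videoscale",                                   # Pre-Process: resize
        f"video/x-raw,width={width},height={height}",   # caps filter that fixes the frame size
        "autovideosink",                                # Output: show a window
    ]
    # Gstreamer links elements with "!" in a textual pipeline description.
    return " ! ".join(stages)


if __name__ == "__main__":
    print(build_pipeline("video.mp4"))
```

The resulting string can be passed to the `gst-launch-1.0` command line tool or, assuming the Python bindings are installed, to `Gst.parse_launch`. Swapping the first entry for a network source or the last one for an encoder plus file sink covers the other rows of the table.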

General information on the top Video Analytics Frameworks:

| Framework | Year | Maintainer | Language | Video Streaming Framework | Arch | OS |
|---|---|---|---|---|---|---|
| OpenCV | 2010 | Community | C/C++, Python, Java | FFmpeg, Gstreamer | arm; arm64; x64; x86 | Linux, MacOS, Windows, iOS, Android |
| GstInference | 2019 | RidgeRun | C/C++ | Gstreamer | | |
| NNStreamer | 2018 | Samsung | C/C++ | Gstreamer | arm, arm64, x64, x86, *more | Tizen, Ubuntu, Android, Yocto, MacOS, *more |
| DeepStream | 2018 | Nvidia | C/C++, Python | Gstreamer (GPU-accelerated: Nvidia) | arm, arm64, *more | Ubuntu, *more |
| Gst-Video-Analytics | 2019 | OpenCV | C/C++ | Gstreamer (GPU-CPU accelerated: VAAPI, OpenGL) | x64, x86 | Linux*, *more |

What is under the hood of each Video Analytics Framework? Video Streaming and CV/ML Frameworks support:

| Framework | Video Streaming Framework | CV/ML Frameworks Support |
|---|---|---|
| OpenCV | FFmpeg, Gstreamer | OpenCV |
| GstInference | Gstreamer | Neural Compute SDK (NCSDK), Tensorflow, Caffe, TensorRT, OpenCV, * more |
| NNStreamer* | Gstreamer | Tensorflow, Tensorflow-Lite, pytorch, caffe2 |
| DeepStream | Gstreamer (GPU-accelerated: Nvidia) | TensorRT; Caffe |
| Gst-Video-Analytics | Gstreamer (GPU-CPU accelerated: VAAPI, OpenGL) | OpenVINO, OpenCV |

* Documentation

In general, all of these Frameworks are built on top of the open source media streaming libraries FFmpeg and Gstreamer. On the CV/ML side there is a variety of possible backends: Tensorflow, Tensorflow-Lite, TensorRT, Pytorch, Caffe, OpenVINO, OpenCV.

Thoughts

From the Client's perspective, most Product Development goals can be reduced to the following (Video Analytics case):

  • Reduce Development Costs/Time
    • Solutions:
      • reuse of existing solutions
      • a skill set balanced between Software Engineering and Data Science
  • Reduce Product Cost
    • Solutions:
      • process multiple video feeds; use fast and accurate models
        • efficient hardware usage
          • shared memory resources
          • up to 99% utilization of hardware capabilities
        • reduced data storage and transmission
  • Reduce Product Scaling Costs/Time
    • Solution:
      • generality: a single solution for multiple use cases

All of the listed frameworks help to build software faster and more efficiently.

Personal Experience

  • I started prototyping Video Analytics applications with OpenCV. When new project requirements (resolution/fps setup, video recording, custom operations, performance improvements) pushed us past its limits, we switched to Gstreamer (on the advice of another expert).
  • I know C/C++ well, but I started to dive deep into Gstreamer with Python (easier dependency setup and easier development). With Python I could focus mostly on how the framework works, so when prototyping we failed or succeeded faster.
  • Exploring Gstreamer is a challenging but rewarding process. The lack of resources and community is a huge problem. The main pain was setting everything up and making Python friends with Gstreamer.
  • Diving into Gstreamer helped me learn its architecture, code development approaches and the basics of video processing. It was and still is an entertaining process 😉. I think Gstreamer has one of the best architectures (interfaces, abstractions), which gives developers great flexibility and extensibility (sometimes the code might be dirty or unintuitive, but nothing is perfect …) 😉
  • Now, I’m glad when I see how other companies use Gstreamer for Video Analytics applications.
  • I often use FFmpeg as a command line tool (commands are shorter and sometimes clearer).
  • OpenCV works for me when a prototype has to be delivered in a short time and there are no restrictions on hardware performance. OpenCV supports Gstreamer as well, but that requires building the library yourself with the extra options enabled (installing the pip package is so much easier).
  • I constantly look for new repositories and frameworks that simplify the development of Video Analytics applications.
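Whether a given OpenCV install was built with Gstreamer can be checked from Python. The sketch below is a minimal, hedged example: `cv2.getBuildInformation()` is a real OpenCV call, but the helper name `gstreamer_enabled` is hypothetical, and the parsing assumes the build report contains a line of the form `GStreamer: YES (...)` or `GStreamer: NO`, which may differ between OpenCV versions.

```python
import re


def gstreamer_enabled(build_info: str) -> bool:
    """Return True if the build report text lists GStreamer as enabled.

    Assumption: the report has a line such as "GStreamer:   YES (1.16.2)".
    """
    match = re.search(r"^\s*GStreamer:\s*(\S+)", build_info, flags=re.MULTILINE)
    return bool(match) and match.group(1).upper().startswith("YES")


if __name__ == "__main__":
    try:
        import cv2  # only needed for the live check
    except ImportError:
        print("OpenCV is not installed")
    else:
        print("Gstreamer support:", gstreamer_enabled(cv2.getBuildInformation()))
```

When the check passes, a pipeline description ending in `appsink` can be opened with `cv2.VideoCapture(description, cv2.CAP_GSTREAMER)`; when it does not, OpenCV falls back to its other video backends and Gstreamer pipeline strings will not open.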

Conclusion

What features do all of them have in common?

  • Gstreamer is used for Video Streaming
  • Tendency to provide Python bindings (so more developers can dive deep faster)
  • Support for multiple Deep Learning/Computer Vision Frameworks (generality)

Which one to choose is definitely up to you and your project requirements:

  • OS?
  • Target Architecture?
  • Programming Language?
  • Deadlines?
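As a rough illustration, the first three questions can be turned into a filter over the comparison table from earlier in the post. The data below is transcribed from that table and is incomplete by design: the "*more" entries are left out, and GstInference has no Arch or OS listed, so it only matches on language. The names `FRAMEWORKS` and `shortlist` are hypothetical.

```python
# Transcribed from the comparison table in this post; "*more" entries are omitted,
# so an empty result means "not listed here", not "unsupported".
FRAMEWORKS = {
    "OpenCV": {
        "languages": {"C/C++", "Python", "Java"},
        "arch": {"arm", "arm64", "x64", "x86"},
        "os": {"Linux", "MacOS", "Windows", "iOS", "Android"},
    },
    "GstInference": {"languages": {"C/C++"}, "arch": set(), "os": set()},
    "NNStreamer": {
        "languages": {"C/C++"},
        "arch": {"arm", "arm64", "x64", "x86"},
        "os": {"Tizen", "Ubuntu", "Android", "Yocto", "MacOS"},
    },
    "DeepStream": {
        "languages": {"C/C++", "Python"},
        "arch": {"arm", "arm64"},
        "os": {"Ubuntu"},
    },
    "Gst-Video-Analytics": {
        "languages": {"C/C++"},
        "arch": {"x64", "x86"},
        "os": {"Linux"},
    },
}


def shortlist(language=None, arch=None, os_name=None):
    """Return frameworks whose listed values match every given requirement."""
    wanted = {"languages": language, "arch": arch, "os": os_name}
    return [
        name
        for name, props in FRAMEWORKS.items()
        if all(value is None or value in props[key] for key, value in wanted.items())
    ]


if __name__ == "__main__":
    print(shortlist(language="Python", arch="arm64"))
```

A shortlist like this is only a starting point; deadlines and the team's experience still decide between the remaining candidates.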

Based on my experience:

  • start with the simplest
  • explore, understand, exceed the limits
  • make a more conscious decision (based on the experience you have now)
  • repeat steps 2-3

In any case, additional knowledge of Video Streaming (with Gstreamer or FFmpeg) can help you improve your project's design, performance and accuracy.

Hope you enjoyed reading. In case I missed any framework – let me know in the comments 😉
