How to implement Object Detection in Video with Gstreamer in Python using Tensorflow?


In this tutorial we are going to implement an Object Detection plugin for Gstreamer using pre-trained models from the Tensorflow Models Zoo and inject it into a video streaming pipeline.

Requirements

Code

Learn how to:

  • Create a Gstreamer plugin that detects objects with Tensorflow in each video frame using models from the Tensorflow Models Zoo
  • Use Gstreamer plugins

Preface

In previous posts we’ve already learnt how to create a simple Gstreamer plugin in Python. Now let’s take a step forward.

Guide

Preparation

First, clone the repository with the prepared models, video and code, so we can work with the code samples from the beginning.

Use a virtual environment to keep the project’s dependencies isolated and in one place. So, activate a new one:
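For example, a minimal sketch assuming a virtualenv named venv (adjust names and paths to your setup):

    python3 -m venv venv
    source venv/bin/activate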

Then, install the project’s requirements:
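Assuming the repository ships a requirements.txt (the file name here is an assumption):

    pip install -r requirements.txt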

For model inference, install Tensorflow. Check whether your PC has a CUDA-enabled GPU first (otherwise install the CPU version):
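A sketch of the install step; the 1.x version pin below is an assumption, so pick the version that matches the frozen .pb model you use:

    # CUDA-enabled GPU available
    pip install tensorflow-gpu==1.15.*

    # CPU-only machine
    pip install tensorflow==1.15.*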

Additional. To keep projects reproducible at any time I prefer to use Data Version Control (DVC) for models, data and other large files. As a storage service I use Google Cloud Storage (free, easy to set up and use).
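If you follow the same setup, a typical DVC flow looks roughly like this (it assumes the Google Cloud Storage remote is already configured in the repository):

    pip install "dvc[gs]"   # DVC with Google Cloud Storage support
    dvc pull                # fetch the models/videos tracked by DVC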

Now check the data/ folder: it should contain a prepared model (.pb) and a video (.mp4), so you can easily run tests on your own.

Define baseline gstreamer pipeline

Launch the following pipeline in a terminal to check that Gstreamer works properly:
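A baseline pipeline along these lines (the file path is illustrative; decodebin is included here to decode the .mp4 before conversion):

    gst-launch-1.0 filesrc location=data/videos/sample.mp4 ! decodebin ! \
        videoconvert ! video/x-raw,format=RGB ! videoconvert ! autovideosink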

Basically, this pipeline:

  • captures frames from a video file using filesrc,
  • converts frames to the RGB colorspace using videoconvert and a capsfilter with the pre-defined format string “video/x-raw,format=RGB”,
  • displays frames in a window with autovideosink.

Run

Display mode

For now, let’s run a simple predefined command to check that everything is working:
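A sketch of such a command, chaining the two plugins described below (the video and config paths are illustrative):

    gst-launch-1.0 filesrc location=data/videos/sample.mp4 ! decodebin ! videoconvert ! \
        gst_tf_detection config=data/tf_object_api_cfg.yml ! videoconvert ! \
        gst_detection_overlay ! videoconvert ! autovideosink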


Text mode

Export the required paths to enable the plugin and make it visible to Gstreamer:
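For example (the exact directories depend on where gst-python was built and where the plugin scripts live, so these paths are assumptions):

    # directory containing libgstpython.cpython-36m-x86_64-linux-gnu.so
    export GST_PLUGIN_PATH=$GST_PLUGIN_PATH:$PWD/libs/gst/

    # directory containing the Python plugin scripts (gst_tf_detection, gst_detection_overlay)
    export GST_PLUGIN_PATH=$GST_PLUGIN_PATH:$PWD/gst/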

Note

  • Path to libgstpython.cpython-36m-x86_64-linux-gnu.so (built from gst-python)
  • Path to Gstreamer Plugins implementation (python scripts)

Run the command with debug messages enabled:
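For example, printing detections to the console while discarding the video output with fakesink (the python debug category and the paths are assumptions):

    GST_DEBUG=python:4 gst-launch-1.0 filesrc location=data/videos/sample.mp4 ! decodebin ! \
        videoconvert ! gst_tf_detection config=data/tf_object_api_cfg.yml ! fakesink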

Note: check gstreamer debugging tools to enable logging


Note: the application should print similar output (a list of dicts with each object’s class_name, confidence and bounding_box).

Great, now let’s go through the code.

Explanation

In a previous post (How to write a Gstreamer Plugin with Python) we discovered that we can easily get image data from within a Gstreamer plugin. Now let’s run a model on the retrieved image data and display the inference results in the console or on video.

Let’s begin by implementing the Object Detection plugin (gst_tf_detection).

Define plugin class

First, define a Plugin class that extends GstBase.BaseTransform (the base class for elements that process data). The plugin’s name is “gst_tf_detection” (with this name the plugin can be called inside a Gstreamer pipeline).
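A minimal sketch of the class skeleton and its registration (the metadata strings and the Python class name are illustrative; the element name is the one used in the pipelines above):

    import gi
    gi.require_version('Gst', '1.0')
    gi.require_version('GstBase', '1.0')
    from gi.repository import Gst, GObject, GstBase


    class GstTfDetectionPluginPy(GstBase.BaseTransform):

        # name under which the element is called inside a pipeline
        GST_PLUGIN_NAME = 'gst_tf_detection'

        __gstmetadata__ = ("Tensorflow object detection",    # longname
                           "Transform",                      # element class
                           "Detects objects in each frame",  # description
                           "author")                         # author

        def __init__(self):
            super().__init__()
            self.model = None    # set via the "model" property
            self.config = None   # set via the "config" property


    # register the element so Gstreamer can create it by name
    GObject.type_register(GstTfDetectionPluginPy)
    __gstelementfactory__ = (GstTfDetectionPluginPy.GST_PLUGIN_NAME,
                             Gst.Rank.NONE,
                             GstTfDetectionPluginPy)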

Fixate stream format

Define the input and output buffer format for the plugin. Since our model consumes RGB image data, let’s specify it:
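A sketch of the pad templates, placed inside the class above, that fix both pads to RGB:

    # identical RGB caps on the sink and src pads
    __gsttemplates__ = (
        Gst.PadTemplate.new("sink", Gst.PadDirection.SINK, Gst.PadPresence.ALWAYS,
                            Gst.Caps.from_string("video/x-raw,format=RGB")),
        Gst.PadTemplate.new("src", Gst.PadDirection.SRC, Gst.PadPresence.ALWAYS,
                            Gst.Caps.from_string("video/x-raw,format=RGB")),
    )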

Define properties

Additionally, let’s define the following properties so they can be passed to the plugin (see the sketch after this list):

  • model instance (with a proper interface).
    • to pass a Python object, use GObject.TYPE_PYOBJECT as the parameter type.
  • model config, so we can easily modify the model’s parameters without changing a single line of code.
    • to pass a string, use str as the parameter type.
    • a configuration file with parameters is common practice for setting up plugins once the number of parameters exceeds 3.
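A sketch of those property definitions, placed inside the class and following the standard PyGObject __gproperties__ layout (nicks and blurbs are illustrative):

    __gproperties__ = {
        # Python object property: pass an already created model instance
        "model": (GObject.TYPE_PYOBJECT,
                  "model",
                  "Model instance with a detection interface",
                  GObject.ParamFlags.READWRITE),

        # string property: path to the model's config file
        "config": (str,
                   "config",
                   "Path to the model's config file (*.yml)",
                   None,  # default value
                   GObject.ParamFlags.READWRITE),
    }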

Note: Passing the model as a parameter also reduces memory consumption. For example, if the model is initialized each time a plugin is created, the model’s weights are duplicated in memory, so the number of pipelines running simultaneously is limited by the amount of memory. But if the model is created once and passed to each plugin as a reference, the only limitation is hardware performance.

Now, to specify the model’s config, use the following command:
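For example, as an element property in a gst-launch pipeline (the config path is illustrative):

    gst-launch-1.0 videotestsrc ! videoconvert ! \
        gst_tf_detection config=data/tf_object_api_cfg.yml ! fakesink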

Implement get-set handlers for defined properties

GET
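A sketch of the getter (a method of the plugin class above):

    def do_get_property(self, prop: GObject.GParamSpec):
        if prop.name == 'model':
            return self.model
        if prop.name == 'config':
            return self.config
        raise AttributeError('Unknown property %s' % prop.name)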

SET

When the model’s config is updated, we need to shut down the previous model, then initialize and start a new one.
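A sketch of the setter; the _update_model() helper and the model’s from_config()/startup()/shutdown() interface are assumptions (see the model sketch further below):

    def do_set_property(self, prop: GObject.GParamSpec, value):
        if prop.name == 'model':
            self._update_model(value)
        elif prop.name == 'config':
            # build a new model from the config file and restart
            self._update_model(TfObjectDetectionModel.from_config(value))
            self.config = value
        else:
            raise AttributeError('Unknown property %s' % prop.name)

    def _update_model(self, model):
        # shut down the previous model, then start the new one
        if self.model is not None:
            self.model.shutdown()
        self.model = model
        if self.model is not None:
            self.model.startup()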

Implement transform()

First, define a function that processes the buffer in place, do_transform_ip(), which accepts a Gst.Buffer and returns a state (Gst.FlowReturn).

Then, if there is no model, the plugin should work in passthrough mode.

Otherwise, we convert the Gst.Buffer to an np.ndarray, feed the image to the model (inference), print the results to the console and write the objects to the Gst.Buffer as metadata (recap: How to add metadata to a gstreamer buffer), so the detected objects can be transmitted further down the pipeline.
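A sketch of the whole method (part of the plugin class); the buffer-to-ndarray and metadata helpers are assumed to come from the gstreamer-python utilities mentioned later (their exact names here are assumptions), and process_single() is the assumed inference method of the model:

    def do_transform_ip(self, buffer: Gst.Buffer) -> Gst.FlowReturn:
        # 1. No model -> passthrough
        if self.model is None:
            Gst.warning("No model specified for %s, working in passthrough mode" % self)
            return Gst.FlowReturn.OK

        try:
            # 2. Map Gst.Buffer to np.ndarray using the negotiated caps (width, height, format)
            image = gst_buffer_with_caps_to_ndarray(buffer, self.sinkpad.get_current_caps())

            # 3. Run inference and print the results to the console
            objects = self.model.process_single(image)
            Gst.info(str(objects))

            # 4. Attach the detected objects to the buffer as metadata
            gst_meta_write(buffer, objects)
        except Exception as err:
            Gst.error("Error %s: %s" % (self, err))

        return Gst.FlowReturn.OK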

Tensorflow Model Implementation

We won’t dive too deep into the Tensorflow model implementation; just have a look at the code (a minimal sketch follows the list). The TfObjectDetectionModel class hides:

  • tf.Graph import
  • device configuration
  • model parameters (ex.: threshold, labels, input size)
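A compressed sketch of such a class, assuming a TF 1.x frozen graph exported with the Tensorflow Object Detection API (the input/output tensor names below are that API’s standard ones; parameter and method names are illustrative, but match the sketches above):

    import numpy as np
    import tensorflow as tf   # tf.compat.v1 is used for the frozen-graph (.pb) workflow
    import yaml


    class TfObjectDetectionModel:

        def __init__(self, weights: str, threshold: float = 0.5,
                     device: str = "/device:CPU:0", labels: dict = None):
            self.weights = weights        # path to the frozen graph (.pb)
            self.threshold = threshold    # minimum confidence to keep a detection
            self.device = device          # device placement, e.g. "/device:GPU:0"
            self.labels = labels or {}    # {class_id: class_name}
            self.session = None

        @classmethod
        def from_config(cls, filename: str) -> "TfObjectDetectionModel":
            # hypothetical factory: constructor arguments are read from a YAML file;
            # "labels" is expected to point to a file of <class_id: class_name> lines
            with open(filename) as f:
                cfg = yaml.safe_load(f)
            with open(cfg.pop("labels")) as f:
                labels = yaml.safe_load(f)
            return cls(labels=labels, **cfg)

        def startup(self):
            # import the tf.Graph from the frozen .pb and open a session on the chosen device
            graph_def = tf.compat.v1.GraphDef()
            with tf.io.gfile.GFile(self.weights, "rb") as f:
                graph_def.ParseFromString(f.read())
            graph = tf.Graph()
            with graph.as_default(), graph.device(self.device):
                tf.compat.v1.import_graph_def(graph_def, name="")
            self.session = tf.compat.v1.Session(graph=graph)

        def shutdown(self):
            if self.session is not None:
                self.session.close()
                self.session = None

        def process_single(self, image: np.ndarray) -> list:
            graph = self.session.graph
            boxes, scores, classes = self.session.run(
                [graph.get_tensor_by_name("detection_boxes:0"),
                 graph.get_tensor_by_name("detection_scores:0"),
                 graph.get_tensor_by_name("detection_classes:0")],
                feed_dict={graph.get_tensor_by_name("image_tensor:0"): image[None, ...]})

            height, width = image.shape[:2]
            objects = []
            for box, score, class_id in zip(boxes[0], scores[0], classes[0]):
                if score < self.threshold:
                    continue
                # boxes are [ymin, xmin, ymax, xmax], normalized to [0, 1]
                ymin, xmin, ymax, xmax = box
                objects.append({
                    "class_name": self.labels.get(int(class_id), str(int(class_id))),
                    "confidence": float(score),
                    "bounding_box": [int(xmin * width), int(ymin * height),
                                     int((xmax - xmin) * width), int((ymax - ymin) * height)],
                })
            return objects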

Additional

Model Configuration file

  • a file with common editable parameters for model inference. For example:
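An illustrative example; the keys mirror the model sketch above and are assumptions, not necessarily the exact keys used in the repository:

    weights: data/models/frozen_inference_graph.pb   # path to the frozen graph (.pb)
    threshold: 0.4                                   # minimum confidence to keep a detection
    device: /device:CPU:0                            # or /device:GPU:0
    labels: data/labels.yml                          # file with <class_id: class_name> pairs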

Labels format

  • a file with lines of <class_id: class_name> pairs. For example:
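For instance (COCO-style ids, illustrative):

    1: person
    2: bicycle
    3: car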

Object Detection Overlay Plugin

In order to draw the detected objects on the video, there is an implementation of the gst_detection_overlay plugin (recap: “How to draw kitten with Gstreamer“).

The main differences compared to the gst_tf_detection plugin:

The input and output buffer format is now RGBx (a 4-channel format), so we can work with the buffer using the cairo library.

Detected objects info is requested from the buffer using the gstreamer-python package.

To enable drawing on the buffer (in place), the gstreamer-python package is used as well (see the sketch below).
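A rough sketch of how these pieces fit together in the overlay’s in-place transform; the helper names gst_meta_get() and gst_buffer_with_caps_to_ndarray() are assumptions about the gstreamer-python utilities, and the bounding_box layout is assumed to be [x, y, width, height] in pixels:

    import cairo  # pycairo; the 4-channel RGBx buffer maps onto cairo's 4-byte FORMAT_RGB24


    class GstDetectionOverlayPy(GstBase.BaseTransform):
        # pad templates with "video/x-raw,format=RGBx" and registration as
        # "gst_detection_overlay" are defined the same way as in gst_tf_detection

        def do_transform_ip(self, buffer: Gst.Buffer) -> Gst.FlowReturn:
            # read the objects previously attached to the buffer by gst_tf_detection
            objects = gst_meta_get(buffer)

            # map the writable RGBx buffer as a numpy array of shape (height, width, 4)
            frame = gst_buffer_with_caps_to_ndarray(buffer, self.sinkpad.get_current_caps())
            height, width = frame.shape[:2]

            # draw bounding boxes in place with cairo
            surface = cairo.ImageSurface.create_for_data(frame, cairo.FORMAT_RGB24, width, height)
            context = cairo.Context(surface)
            context.set_source_rgb(1.0, 0.0, 0.0)
            context.set_line_width(2.0)
            for obj in objects:
                x, y, w, h = obj["bounding_box"]
                context.rectangle(x, y, w, h)
                context.stroke()

            return Gst.FlowReturn.OK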

Tuning

  • change the video input
    • run the whole pipeline on your own video file, from a camera or a stream
  • change the model’s config
    • reduce false positives with a higher confidence threshold
    • improve quality by increasing the input size
    • keep only the target labels

Conclusion

With the Gstreamer Python bindings you can inject any Tensorflow model into any video streaming pipeline. Custom plugins with Tensorflow models are already used by popular video analytics frameworks.

Hope everything works as expected 😉 In case of trouble running the code, leave a comment or open an issue on Github.
