
Timeplus + Roboflow: Real-Time Video Analytics and Monitoring

  • Writer: Gang Tao
  • Apr 23
  • 5 min read

Updated: May 1

A machine learning vision model is a type of AI system trained to understand and interpret visual data—like images or videos—much like how humans use their eyes and brain to see and recognize things. In simple terms, it's a computer program that "learns" from visual examples and can then:

  • Identify what’s in a picture (e.g., cats vs. dogs)

  • Detect objects (like people, cars, or faces)

  • Understand scenes (e.g., "a person walking on a rainy street")

  • Generate new images or describe what's happening in a video


Vision models can be used to analyze video footage in real time and trigger actions in response to security or safety threats, with applications in surveillance, industrial safety, and healthcare.



Computer vision is transforming how we analyze and respond to video data, but building end-to-end solutions that connect powerful ML models with real-time analytics remains challenging. In this post, I'll share a project that combines Roboflow's computer vision capabilities with Timeplus's real-time analytics to create a complete video monitoring solution.



The Problem: From Video Insights to Actions


Organizations increasingly rely on video data for critical insights, from manufacturing quality control to retail analytics and security monitoring. While ML models can now detect objects with impressive accuracy, there's still a gap between detection and actionable insights. How do we:

  1. Deploy powerful ML models on video feeds without complex infrastructure?

  2. Process detection results in real-time to identify patterns and anomalies?

  3. Create alerts and dashboards based on what our models detect?



The Solution: A Real-Time Video Analytics Pipeline


I've built an open-source application that bridges this gap, creating a seamless pipeline from video input to analytics dashboards. The system:

  1. Uses Roboflow to run state-of-the-art computer vision models and inference flows on video streams

  2. Processes detection results and streams them to Timeplus for real-time analytics

  3. Provides a FastAPI-based web interface showing both the processed video and the structured detection data

  4. Enables complex analysis and alerting based on detection patterns




How It Works


The application has four main components:


1. Video Processing with Roboflow


The video processing workflow is built in Roboflow as follows:


There are two models:

  • An object detection model based on YOLO11n

  • A violent behavior detection model


There are also two visualization nodes that add bounding boxes and labels to the input video. The workflow therefore produces three outputs:

  1. The detected objects

  2. The classification result for violent behavior

  3. The processed video frames, with detected objects highlighted by bounding boxes and labels


Roboflow's inference SDK makes it easy to run sophisticated computer vision models on video feeds. Here is the Python code to run the workflow locally:


python

from inference import InferencePipeline

# Initialize Roboflow's inference pipeline from a workflow
self.pipeline = InferencePipeline.init_with_workflow(
    api_key=api_key,
    workspace_name=username,
    workflow_id=workflow_id,
    video_reference=video_path,  # file path, RTSP URL, or webcam index
    max_fps=30,
    on_prediction=self.sink      # callback invoked for every processed frame
)
self.pipeline.start()            # run the pipeline in background threads
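The `on_prediction=self.sink` callback receives each workflow result as it is produced. Here is a minimal sketch of what such a sink might look like — the class shape and the `"label_visualization"` output name are illustrative assumptions, not taken from the repo:

```python
import queue
import threading

class VideoProcessor:
    """Minimal sketch of a sink for workflow predictions (names illustrative)."""

    def __init__(self):
        self.frame_queue = queue.Queue(maxsize=10)  # latest annotated frames
        self.latest_results = {}                    # latest structured predictions
        self.lock = threading.Lock()

    def sink(self, result: dict, video_frame) -> None:
        # 'result' holds the workflow outputs keyed by output name; keep the
        # structured predictions for the API and route the annotated frame
        # (assumed to live under "label_visualization") to the video stream.
        with self.lock:
            self.latest_results = {
                k: v for k, v in result.items() if k != "label_visualization"
            }
        annotated = result.get("label_visualization")
        if annotated is not None and not self.frame_queue.full():
            self.frame_queue.put(annotated)
```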


2. Real-Time Video Streaming with FastAPI


One of the key components is the FastAPI server that streams the processed video to web browsers in real-time:


python

@app.get("/video_feed")
async def video_feed():
    """Stream video frames"""
    return StreamingResponse(
        video_processor.generate_frames(),
        media_type="multipart/x-mixed-replace; boundary=frame"
    )

This endpoint uses HTTP streaming to continuously send processed video frames to the browser. The frames are:

  • Captured from the Roboflow inference pipeline

  • Enhanced with detection visualizations

  • Encoded as JPEG images

  • Streamed using the multipart/x-mixed-replace content type
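A generator along these lines can produce that multipart stream. This is a sketch under the assumption that frames arrive on a queue already JPEG-encoded (in the real pipeline they would be encoded first, e.g. with cv2.imencode):

```python
import queue

def generate_frames(frame_queue: "queue.Queue[bytes]"):
    """Yield JPEG frames framed for multipart/x-mixed-replace streaming.

    Sketch: frames are assumed to arrive on 'frame_queue' as JPEG bytes.
    """
    while True:
        try:
            jpeg = frame_queue.get(timeout=1.0)
        except queue.Empty:
            break  # a real server would keep waiting; we stop for brevity
        # Each chunk is one part: boundary, headers, JPEG payload, trailing CRLF.
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + jpeg + b"\r\n")
```

The boundary string `frame` must match the `boundary=frame` declared in the `media_type` of the `StreamingResponse`.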


The web interface displays this stream alongside the structured detection data, creating a unified monitoring experience:


html

<div class="container">
    <div class="video-container">
        <img src="/video_feed" alt="Video Stream">
    </div>
    <div class="results-container">
        <h2>Inference Results</h2>
        <div id="inference-results">Loading...</div>
    </div>
</div>

JavaScript in the web page periodically fetches the latest detection data through another API endpoint, ensuring the information stays in sync with the video stream.



3. Ingesting Detection Results into Timeplus


The inference server sends Roboflow's detection results to Timeplus: it converts the detection objects into structured JSON and calls the Timeplus REST API to ingest them.


Here is a sample output from the inference server:

{
	"model_detection_prediction": [
		{
			"bbox": {
				"x1": 367,
				"y1": 157,
				"x2": 605,
				"y2": 417,
				"width": 238,
				"height": 260
			},
			"confidence": 0.8265051245689392,
			"class_id": 0,
			"class_name": "person",
			"detection_id": "c3eff046-c18d-451b-8b9f-9eadbfe2cbe5",
			"parent_id": "image.[0]",
			"image_dimensions": "[480 854]",
			"inference_id": "4f7d5e4c-bef0-4b49-a87f-f55c3d5720e5",
			"prediction_type": "object-detection",
			"root_parent_id": "image.[0]",
			"root_parent_coordinates": "[0 0]",
			"root_parent_dimensions": "[480 854]",
			"parent_coordinates": "[0 0]",
			"parent_dimensions": "[480 854]"
		}
	],
	"model_violence_predictions": {
		"inference_id": "36295c35-5459-4474-a5e8-1052d2670547",
		"time": 0.2452179160027299,
		"image": {
			"width": 854,
			"height": 480
		},
		"predictions": [
			{
				"class": "non_violence",
				"class_id": 0,
				"confidence": 0.9987
			}
		],
		"top": "non_violence",
		"confidence": 0.9987,
		"prediction_type": "classification",
		"parent_id": "image.[0]",
		"root_parent_id": "image.[0]"
	}
}

Part of my code processes the supervision.detection.core.Detections object that the detection model outputs; there is no convenient method like to_json() to serialize it, so I convert it by hand. The violence classification model's output, by contrast, is already JSON-serializable.


These outputs from the inference workflow are ingested in real time into a Timeplus stream called `video_stream_log`, which users can then query live, for example with `select * from video_stream_log`.
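The ingestion step can be sketched with the standard library. The endpoint path, the columnar request body, and the `X-Api-Key` header below are assumptions based on the Timeplus REST API and should be checked against your deployment's docs:

```python
import json
import urllib.request

TIMEPLUS_URL = "http://localhost:8000"  # assumption: local Timeplus instance
STREAM = "video_stream_log"

def build_ingest_payload(result: dict) -> bytes:
    """Pack one inference result into the columnar ingest body
    ({"columns": [...], "data": [[...]]}); here the stream is assumed
    to have a single string column named 'raw'."""
    return json.dumps({
        "columns": ["raw"],
        "data": [[json.dumps(result)]],
    }).encode()

def ingest(result: dict, api_key: str) -> None:
    # Endpoint path and auth header are assumptions - verify against
    # your Timeplus version's API reference.
    req = urllib.request.Request(
        f"{TIMEPLUS_URL}/api/v1beta2/streams/{STREAM}/ingest",
        data=build_ingest_payload(result),
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )
    urllib.request.urlopen(req)  # fire-and-forget; real code should handle errors
```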



4. Real-Time Analytics with Timeplus


Timeplus processes the detection data stream, enabling powerful analytics capabilities:

  • Real-time dashboards showing detection counts, types, and patterns

  • SQL queries to filter and aggregate detection events

  • Anomaly detection for unusual patterns

  • Alerts when specific objects appear or when counts exceed thresholds



There are two queries in this case:


Total Detected Objects in the Last 5 Seconds

WITH obj AS
 (
   SELECT
     _tp_time AS time, array_join(json_extract_array(raw, 'model_detection_prediction')) AS detected_objects, detected_objects:class_name AS name
   FROM
     video_stream_log
   WHERE
     _tp_time > (now() - 1h)
 )
SELECT
 count(*) AS count, name, window_start
FROM
 hop(obj, time, 1s, 5s)
GROUP BY
 window_start, name
ORDER BY
 count DESC

The above query answers: "Within the past hour, what are the most frequently detected objects in each 5-second window, with windows sliding forward every second?"



Violence Score in the Last 5 Seconds

WITH vio AS
 (
   SELECT
     _tp_time, raw:model_violence_predictions:top AS flag, if(flag = 'violence', cast(raw:model_violence_predictions:confidence, 'float64'), 0) AS vscore
   FROM
     video_stream_log
   WHERE
     _tp_time > (now() - 1h)
 )
SELECT
 window_start, count(*) AS count, sum(vscore) AS svscore, svscore / count AS violence_rate
FROM
 hop(vio, 1s, 5s)
GROUP BY
 window_start

The above query answers: "Within the past hour, what is the level of violence detected in each 5-second window of video, with windows sliding forward every second?"


Using an iframe, we also embed the video output from the inference flow in the dashboard, so users can easily monitor what is happening right now.




Conclusion


By combining Roboflow's powerful computer vision capabilities with Timeplus's real-time analytics, we've created a complete solution for video analytics that's both powerful and accessible. This integration demonstrates how modern tools can bridge the gap between ML insights and actionable analytics.


I'd love to hear how you might use this solution in your own projects. Feel free to try it out! The code can be found in our GitHub repo.
