CHALLENGES:
Storage and Compute Costs
Storing voluminous and complex JSON documents in Kafka and subsequently in OLAP data warehouses can lead to substantial storage and computational costs.
Complex Data Structures
The GitHub schema, while open and flexible, often results in verbose JSON documents with complex, nested structures, potentially spanning hundreds of lines per message. Parsing and filtering these documents efficiently and accurately presents a significant challenge.
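To make the parsing problem concrete, here is a minimal Python sketch of reducing one nested event document to a flat record. The sample event is an illustrative, heavily trimmed stand-in for a real GitHub event, and the helper name `extract_fields` and the chosen output columns are assumptions made for this example.

```python
import json

# Illustrative, heavily trimmed event; real GitHub events carry many more fields.
SAMPLE_EVENT = """
{
  "id": "1",
  "type": "PushEvent",
  "actor": {"id": 10, "login": "octocat"},
  "repo": {"id": 20, "name": "octocat/hello-world"},
  "payload": {"size": 2, "commits": [{"sha": "a1"}, {"sha": "b2"}]},
  "created_at": "2024-01-01T00:00:00Z"
}
"""


def extract_fields(raw: str) -> dict:
    """Pull a flat record out of one nested event document (hypothetical helper)."""
    event = json.loads(raw)
    return {
        "id": event.get("id"),
        "type": event.get("type"),
        # Nested objects may be absent, so fall back to an empty dict.
        "actor": event.get("actor", {}).get("login"),
        "repo": event.get("repo", {}).get("name"),
        # Only push events carry a commit list; other types yield 0.
        "commit_count": len(event.get("payload", {}).get("commits", [])),
        "created_at": event.get("created_at"),
    }


record = extract_fields(SAMPLE_EVENT)
```

Filtering then becomes a comparison on flat keys, for example `record["type"] == "PushEvent"`, instead of repeated traversal of the nested document.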
Continuous and Incremental Analytics Demand
In contrast to batch ETL and traditional BI, Timeplus supports continuous and incremental processing. Critical use cases include real-time event tracking, live aggregation of event data, and immediate anomaly or fraud detection.
SOLUTION:
Timeplus connects directly to the Kafka API and reads both incoming and existing messages. You can run SQL queries directly against a Kafka External Stream without saving any data in Timeplus.
To avoid the high cost of storing billions of events in Kafka and to improve query performance, we recommend creating Materialized Views in Timeplus that save Kafka messages locally and apply optimized columnar storage or indexes. Materialized Views also enable incremental updates, cutting BI report latency from minutes to sub-second.
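The idea behind incremental updates can be illustrated with a small conceptual sketch in Python. This is not how Timeplus implements Materialized Views; the class name `RepoEventCounter` and the event shape are assumptions chosen only to show why updating a stored result per event is cheaper than rescanning all events for every report.

```python
from collections import Counter


class RepoEventCounter:
    """Keeps per-repository event counts current as events arrive (illustrative only)."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def update(self, event: dict) -> None:
        # Constant work per event: only the affected key changes,
        # so earlier events are never re-read.
        self.counts[event["repo"]] += 1

    def top(self, n: int) -> list:
        # Reads the maintained result instead of scanning raw events.
        return self.counts.most_common(n)


counter = RepoEventCounter()
for incoming in [{"repo": "a/x"}, {"repo": "b/y"}, {"repo": "a/x"}]:
    counter.update(incoming)
```

A report that calls `counter.top(10)` reads only the maintained counts, so its cost does not grow with the number of raw events already processed.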
Step 1
Call the GitHub API to get real-time events and push them to Kafka
Step 2
Create an External Stream in Timeplus to read data from Kafka
Step 3
Create Materialized Views to process and analyze the data
Step 4
Create real-time alerts or visualizations in Timeplus or Grafana
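Step 1 above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the pipeline used in the original demo: it polls GitHub's public events endpoint without authentication (which is rate-limited), the topic name `github_events` and broker address `localhost:9092` are placeholders, and the `run` function assumes the third-party kafka-python package is installed. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Callable, Iterable

GITHUB_EVENTS_URL = "https://api.github.com/events"  # public events endpoint


def fetch_events(url: str = GITHUB_EVENTS_URL) -> list:
    """Fetch one page of recent public events (unauthenticated, rate-limited)."""
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "github-events-demo",  # placeholder client name
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def forward_new_events(
    events: Iterable[dict], send: Callable[[bytes], object], seen_ids: set
) -> int:
    """Send each event not seen before; return how many were forwarded."""
    forwarded = 0
    for event in events:
        event_id = event["id"]
        if event_id in seen_ids:
            continue  # consecutive polls return overlapping pages
        seen_ids.add(event_id)  # note: grows without bound in this sketch
        send(json.dumps(event).encode("utf-8"))
        forwarded += 1
    return forwarded


def run(
    topic: str = "github_events",      # placeholder topic name
    brokers: str = "localhost:9092",   # placeholder broker address
    interval_s: float = 5.0,
) -> None:
    """Poll forever and publish new events to Kafka."""
    import time

    from kafka import KafkaProducer  # assumption: kafka-python is installed

    producer = KafkaProducer(bootstrap_servers=brokers)
    seen_ids: set = set()
    while True:
        forward_new_events(
            fetch_events(), lambda value: producer.send(topic, value), seen_ids
        )
        producer.flush()
        time.sleep(interval_s)
```

Keeping the de-duplication in `forward_new_events` separate from the network and Kafka wiring means it can be exercised with a plain list in place of a producer, for example `forward_new_events(events, sent.append, set())`.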
SQL EXAMPLE:
SEE IT IN ACTION:
data lineage:

Imagine you have millions of GitHub events, such as commits, pull requests, issues, and releases, streaming in real time. You want to monitor these events to spot newly rising projects, popular projects, and developer contributions. Querying them directly from Apache Kafka is difficult because the data is complex and must be processed efficiently.
Timeplus addresses this problem by letting you create Materialized Views that process and analyze these events in real time, so you can get insights quickly.

