Streaming Processing Showdown
ksqlDB vs. Proton
Stream processors share a common goal: timely insights. But being aware of and reactive to data streams is not enough; a stream processor should also be easy to set up, use, and maintain.

ksqlDB and Proton are two engines for processing Kafka data in real time. They differ in design, efficiency, and open source license.

So, which one wins?
Two Open-Source Streaming Engines

ksqlDB (previously KSQL) is a database for Apache Kafka, a popular distributed streaming platform, and is licensed under the Confluent Community License. 

Proton is a fast and lightweight alternative to Apache Flink, powered by ClickHouse. It's also the core engine of Timeplus, a cloud native streaming analytics platform.

ksqlDB and Proton share many similar features, including support for:
Stateful stream processing
Data persistence
Stream/table concept
Dual query mode: unbounded and bounded*
Read data and write results back to Kafka

* A long-running, unbounded push-based query and a bounded pull-based query
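
The two query modes can be illustrated with a small Python sketch. This is not ksqlDB or Proton code: `push_query`, `pull_query`, and the page-view events are hypothetical names chosen for illustration, and an in-memory list stands in for a Kafka topic. The push-style function keeps state and emits an updated result for every incoming event, while the pull-style function reads the current state once and returns.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def push_query(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Unbounded, push-style query: emit an updated count after every event."""
    counts: Counter = Counter()  # state that survives across events
    for key in events:
        counts[key] += 1
        yield key, counts[key]


def pull_query(table: Dict[str, int], key: str) -> int:
    """Bounded, pull-style query: read the current value once and return."""
    return table.get(key, 0)


# Hypothetical page-view events keyed by user (stand-in for a Kafka topic).
events = ["alice", "bob", "alice"]

# On a real stream the push query would never finish; the list is finite
# here only so that the sketch terminates.
pushed = list(push_query(events))

# Stand-in for the materialized state that a pull query reads from.
table = dict(Counter(events))
latest_for_alice = pull_query(table, "alice")
```

In this sketch `pushed` holds one result per event, whereas `latest_for_alice` is a single point-in-time answer.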

How does ksqlDB compare with Proton?
ksqlDB offers a SQL interface, integration with Kafka, stateful processing, scalability, and strong security features. However, it has limitations, including deep coupling with Kafka, heavy resource consumption, and a design that does not target analytics.

Along with shared features, Proton offers additional benefits compared to ksqlDB.
Let's see 5 reasons why developers are choosing Proton as an alternative to ksqlDB.
Is it ready for prime-time?

ksqlDB

Not true open source

ksqlDB is licensed under the Confluent Community License (CCL), which is not an OSI-approved open source license. It permits use, modification, and redistribution, but it prohibits offering ksqlDB as a hosted service that competes with Confluent's own products.

PROTON

Apache License 2.0

Proton is licensed under Apache License 2.0, which is more permissive than the CCL. Developers can use, modify, and redistribute it free of charge, including commercially, provided they keep the license and attribution notices.

Is it flexible?

ksqlDB

Deep coupling with Kafka

ksqlDB is tightly coupled with Kafka at the deployment level. Each ksqlDB server is bound to a single Kafka cluster, and ksqlDB uses Kafka as storage for much of its internal state. There is no way to process streams from different clusters unless you first route their data into the same Kafka cluster.

Additionally, a running ksqlDB instance puts load on the Kafka cluster by creating internal topics, which add extra reads and writes.

PROTON

High flexibility consuming Kafka data

Proton reads from and writes to Kafka through external streams, unlike stream processing systems that treat Kafka only as a source or a sink. An external stream is queried like any other stream, but Proton does not persist its data. Users who need the data persisted can create a materialized view, so persistence is a choice rather than a requirement.

Proton is not coupled to any particular Kafka cluster, so users can query data from any Kafka cluster.

Is it efficient?

ksqlDB

Heavy Resource Consumption

Every SQL query run on ksqlDB is a Kafka Streams application, which creates its own worker threads, adding overhead to every query.


ksqlDB uses Kafka topics to store state changelogs and uses RocksDB to materialize these changelogs into tables, which means more resource consumption for the state.
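
To make the changelog-plus-table idea concrete, here is a minimal Python sketch. It is a simplification under stated assumptions, not ksqlDB's implementation: a list stands in for the changelog topic, a dict stands in for the RocksDB table, `materialize` is a hypothetical helper, and a `None` value is treated as a delete marker (tombstone).

```python
from typing import Dict, Iterable, Optional, Tuple

# One changelog entry: (key, new value). A value of None marks a deletion.
Change = Tuple[str, Optional[int]]


def materialize(changelog: Iterable[Change]) -> Dict[str, int]:
    """Replay a changelog in order to rebuild the latest value per key."""
    table: Dict[str, int] = {}  # stand-in for the key-value store
    for key, value in changelog:
        if value is None:
            table.pop(key, None)  # tombstone: remove the key if present
        else:
            table[key] = value    # later entries overwrite earlier ones
    return table


# Hypothetical running counts per user, followed by a delete for "bob".
changelog = [("alice", 1), ("bob", 1), ("alice", 2), ("bob", None)]
table = materialize(changelog)
```

The state exists twice in this model, once as the full log and once as the table built from it, which is the extra resource consumption the paragraph above describes.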

PROTON

Lightweight and efficient

Proton is lightweight: it is written in C++ and built on top of ClickHouse, which is known for its performance. By leveraging SIMD, a specially designed internal data format, and other optimization techniques, Proton can process over 1 million records per second on a commodity computer.

Is it for analytic workloads?

ksqlDB

Not designed for analytics

Together with Kafka, ksqlDB supports stream processing, but its tables are stored in RocksDB, a key-value store that is poorly suited to scanning large volumes of data quickly while skipping irrelevant data.

PROTON

Purpose-designed for analytics

Proton supports both unbounded streaming queries (append-only log) and bounded historical queries (column store based on ClickHouse). When joining streaming data with historical data, Proton can quickly scan huge amounts of historical data.
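
The difference between a key-value layout and a column store for analytic scans can be sketched in Python. This is a toy model, not RocksDB or ClickHouse: the order records, field names, and both helper functions are hypothetical, and "values read" is only a rough proxy for I/O cost. It assumes that reaching one field in the key-value layout means loading the whole record.

```python
from typing import Dict, List, Tuple

# Key-value layout: each key maps to a whole record.
rows: Dict[str, dict] = {
    "order-1": {"region": "EU", "amount": 10, "note": "gift wrap"},
    "order-2": {"region": "US", "amount": 25, "note": ""},
    "order-3": {"region": "EU", "amount": 5, "note": "fragile"},
}

# Columnar layout: each field is stored contiguously on its own.
columns: Dict[str, list] = {
    "region": ["EU", "US", "EU"],
    "amount": [10, 25, 5],
    "note": ["gift wrap", "", "fragile"],
}


def total_rowwise(rows: Dict[str, dict]) -> Tuple[int, int]:
    """Sum 'amount'; every field of every record is loaded along the way."""
    total, values_read = 0, 0
    for record in rows.values():
        values_read += len(record)  # whole record fetched for one field
        total += record["amount"]
    return total, values_read


def total_columnar(columns: Dict[str, list]) -> Tuple[int, int]:
    """Sum 'amount'; only that column is touched, the others are skipped."""
    amounts: List[int] = columns["amount"]
    return sum(amounts), len(amounts)


row_result = total_rowwise(rows)         # (40, 9)
column_result = total_columnar(columns)  # (40, 3)
```

Both functions return the same total, but the columnar version reads a third of the values in this example, and the gap widens as records gain more fields.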

Does it support User Defined Functions?

ksqlDB

Java UDF

ksqlDB uses Java-based UDFs. They are harder to write than JavaScript UDFs and add the complexity of managing JVM, Kafka, and dependency versions.

PROTON

JavaScript UDF

Proton supports complex computing with JavaScript UDFs. With an embedded JavaScript engine, Proton extends query capabilities to a wider range of users. JavaScript UDFs are easy to build and deploy.

More from Our Blog

Our CTO, Gang Tao, goes into more detail on ksqlDB vs. Proton in his blog post.

Summary
| | ksqlDB | Proton |
|---|---|---|
| License | Confluent Community License (CCL) | Apache 2.0 |
| Language | Java (on Kafka Streams) | C++ |
| Resource consumption | High | Low |
| Stateful stream processing | Yes | Yes |
| Bounded, pull-based query | Yes | Yes |
| Unbounded, continuous push-based query | Yes | Yes |
| Materialized view and table concept | Yes (on top of RocksDB) | Yes (on top of ClickHouse) |
| Kafka connection | Yes (source and sink) | Yes (external stream) |
| Support for other messaging systems | Kafka only (dependency) | Pulsar, Kinesis, and more (supported, but not a dependency) |
| Cluster and HA | Yes | No (supported by Timeplus Cloud or Timeplus Platform) |
| User-defined functions | Java | JavaScript |
| Performance | Strong | Stronger |
| Performance | Good | Excellent |
| Security | Yes, role-based access control | Yes, based on ClickHouse |
Get Started with Proton
Take control of historical and live data with unified batch and stream processing and streaming OLAP.