10× Faster Log Processing at Scale: Beating Logstash Bottlenecks with Timeplus
- Gang Tao
In recent months, we’ve been working with several enterprise customers on their Elastic and observability stacks — and a familiar pattern keeps emerging. As their systems scale and log volumes explode, Logstash pipelines begin to choke. Queues fill up, parsing lags behind, and real-time detection and visibility slip away.
The situation gets much worse when teams enable aggregations, more advanced transformations, or any in-pipeline alerting. Every time they turn on metrics or anomaly detection in Logstash, performance collapses. The only workaround? Deploy more instances, and watch infrastructure costs skyrocket.
Observability teams are increasingly demanding a more efficient and scalable solution to handle this growing flood of log data. That’s exactly the kind of challenge Timeplus was built to tackle.
Back in my Splunk days designing large-scale log pipelines, I saw these same bottlenecks first-hand. I was confident Timeplus could perform better, but I wanted proof. Together with our engineering team, we ran a benchmark comparing Timeplus and Logstash under identical conditions: same hardware, same grok patterns, same log formats.
The results confirmed it: Timeplus dramatically outperformed Logstash in both parsing and aggregation, while reusing the same grok rules, making migration seamless.
Before we dive into the test results, a quick overview of Logstash and Timeplus, and the problems we are solving:
Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources, transforms it, and then sends it to different targets, typically Elasticsearch.
Timeplus, powered by our vectorized stream processing engine in a single C++ binary, is purpose-built for high-throughput, large-scale real-time data processing, including log, metric, and tracing data. Unlike traditional log pipelines that adapted batch processing to streaming workloads, Timeplus treats streaming as its native mode. It handles the same core log processing challenges (centralized aggregation, parsing unstructured data with grok patterns, and enriching logs with additional context) but with a fundamentally different architecture optimized for speed and simplicity. The result: dramatically higher throughput, richer in-pipeline intelligence, and a fraction of the infrastructure cost.
The problems we are looking to solve:
- Centralized Log Aggregation: Modern distributed systems generate logs across thousands of servers in wildly different formats, making manual collection and correlation impossible at scale.
- Parsing Unstructured Data into Structured Fields: Raw log files contain unstructured text in diverse formats (Apache logs, firewall logs, stack traces) that cannot be easily searched or analyzed.
- Enrichment and Normalization: Log data often lacks essential context; IP addresses don't reveal locations, user IDs don't show roles, and timestamps arrive in inconsistent formats.
Let’s take a look at how Timeplus performs when parsing Cisco ASA logs compared with Logstash.
Test Scenario: Cisco ASA Firewall Log Processing
This test evaluates how Timeplus and Logstash handle real-world Cisco ASA firewall logs, a challenging use case because these devices generate hundreds of different event types, each with unique field structures and formats.
Why Cisco ASA logs are challenging: each event type represents a different operational activity (TCP connections, authentication attempts, ICMP traffic, VPN sessions, access denials) with a completely different field layout. Parsing these efficiently requires flexible pattern matching that can extract the right fields from each event type.
Here is a sample log:
<189>Apr 27 10:22:58 asa-fw01 %ASA-6-302013: Built outbound TCP connection 12345678 for outside:203.0.113.45/443 (203.0.113.45/443) to inside:192.168.1.101/52543 (192.168.1.101/52543)
<189>Apr 27 10:22:59 asa-fw01 %ASA-6-302014: Teardown TCP connection 12345678 for outside:203.0.113.45/443 to inside:192.168.1.101/52543 duration 0:00:01 bytes 527 TCP FINs
<189>Apr 27 10:23:02 asa-fw01 %ASA-4-106023: Deny tcp src inside:192.168.1.55/4444 dst outside:198.51.100.77/23 by access-group "INSIDE_OUT" [0x0, 0x0]
<189>Apr 27 10:23:05 asa-fw01 %ASA-6-305012: Teardown dynamic TCP translation from inside:192.168.1.88/51542 to outside:198.51.100.25/443 duration 0:00:05
<189>Apr 27 10:23:08 asa-fw01 %ASA-6-302015: Built inbound UDP connection 23456789 for inside:192.168.2.5/53 (192.168.2.5/53) to outside:8.8.8.8/54321 (8.8.8.8/54321)
<189>Apr 27 10:23:10 asa-fw01 %ASA-4-400013: IDS:2004 ICMP echo request from 192.168.2.15 to 8.8.8.8 on interface inside
<189>Apr 27 10:23:12 asa-fw01 %ASA-3-313001: Denied ICMP type=8, code=0 from 10.0.0.55 on interface dmz due to rate limit
<189>Apr 27 10:23:15 asa-fw01 %ASA-6-305011: Built dynamic TCP translation from inside:192.168.3.10/58942 to outside:203.0.113.22/443
<189>Apr 27 10:23:17 asa-fw01 %ASA-6-713172: Group = vpn_user1, IP = 203.0.113.77, Assigned private IP = 10.10.10.55
<189>Apr 27 10:23:19 asa-fw01 %ASA-6-113004: AAA user authentication Successful: server = 10.0.0.10, user = vpn_user1
<189>Apr 27 10:23:21 asa-fw01 %ASA-4-113015: AAA user authentication Rejected: reason = Invalid password: user = admin, server = 10.0.0.10
<189>Apr 27 10:23:23 asa-fw01 %ASA-6-302020: Built inbound ICMP connection for faddr 8.8.4.4/0 gaddr 203.0.113.10/0 laddr 192.168.1.50/0
<189>Apr 27 10:23:25 asa-fw01 %ASA-6-302021: Teardown ICMP connection for faddr 8.8.4.4/0 gaddr 203.0.113.10/0 laddr 192.168.1.50/0 duration 0:00:02 bytes 74
......
Two scenarios will be covered in our test:
Scenario 1: Parsing Events
Both systems use identical grok patterns to parse three representative event types (302013, 302014, and 106023), extracting common fields (timestamp, device, severity, event_id) and event-specific fields (source/destination IPs and ports, actions, connection IDs).
Test setup: Kafka → Parsing/Transformation → Kafka (eliminates downstream storage as a variable)
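To make the pipeline shape concrete, here is a minimal sketch of how a Kafka-to-Kafka parse-and-forward pipeline can be expressed in Timeplus SQL. This is illustrative only: the stream names, broker address, and extracted columns are assumptions rather than the benchmark's actual configuration, and it assumes Timeplus's grok() function (which returns a map of named captures) is available.

```sql
-- Kafka source topic carrying raw ASA syslog lines (hypothetical names/brokers)
CREATE EXTERNAL STREAM asa_raw (raw string)
SETTINGS type = 'kafka', brokers = 'kafka:9092', topic = 'asa-logs';

-- Kafka output topic for the parsed events
CREATE EXTERNAL STREAM asa_parsed (
  device string, severity string, event_id string, src_ip string, message string
)
SETTINGS type = 'kafka', brokers = 'kafka:9092', topic = 'asa-parsed',
         data_format = 'JSONEachRow';

-- A materialized view runs continuously, parsing every event in flight
CREATE MATERIALIZED VIEW asa_parse_mv INTO asa_parsed AS
SELECT
  m['device'] AS device,
  m['severity'] AS severity,
  m['event_id'] AS event_id,
  m['src_ip'] AS src_ip,
  m['message'] AS message
FROM (
  SELECT grok(raw,
    '<%{INT:pri}>%{CISCOTIMESTAMP:ts} %{HOSTNAME:device} %ASA-%{INT:severity}-%{INT:event_id}: %{GREEDYDATA:message}') AS m
  FROM asa_raw
);
```

The event-specific patterns (shown in the grok configuration section below) extend the same query shape.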
Scenario 2: Aggregation
For each event type, both systems compute total event counts to measure aggregation performance—a critical capability for security analytics that need to detect patterns like "count failed login attempts per IP address."
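In Timeplus, this kind of rolling aggregation is itself a streaming SQL query. A minimal sketch, reusing the hypothetical asa_parsed stream from the pipeline sketch above (EMIT PERIODIC controls how often updated results are pushed downstream):

```sql
-- Continuously maintain a running count per ASA event type,
-- emitting updated totals once per second
SELECT event_id, count() AS total
FROM asa_parsed
GROUP BY event_id
EMIT PERIODIC 1s;
```

The same shape handles security patterns such as "failed login attempts per IP address": filter on the rejected-authentication event (113015 in the sample logs) and group by src_ip instead.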
Test Setup
Software Versions
Logstash: 9.2.0
Timeplus: 3.0.1
Hardware
Both systems run on identical GCP n2-standard-8 virtual machines:
CPU: 8 cores
Memory: 16GB RAM
Test Architectures
Similar architectures were used for both Logstash and Timeplus:
Logstash: log generator → Kafka source topic → Logstash pipeline (grok parsing / aggregation) → Kafka output topic
Timeplus: log generator → Kafka source topic → Timeplus streaming SQL (grok parsing / aggregation) → Kafka output topic
- Identical input/output: Both systems consume from the same Kafka source topic and write to Kafka output topics, eliminating downstream storage (Elasticsearch, databases) as a performance variable.
- Identical parsing logic: Both use the same predefined grok patterns for a fair comparison.
- Continuous load: A log generator continuously produces Cisco ASA log entries to simulate real-world high-volume scenarios.
- CPU saturation testing: Load is increased until both systems reach 800% CPU utilization (all 8 cores fully utilized), establishing maximum throughput.
This setup ensures an apples-to-apples comparison of pure parsing and transformation performance.
Grok Pattern Configuration
Both systems use identical grok patterns; the syntax differs slightly between the two platforms, but the logic is the same.
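The exact patterns from the benchmark are not reproduced here, so the following is a representative sketch of the same shape: the shared syslog/ASA prefix plus the event-specific body for event 302013, written as a Timeplus grok() call. Field names are illustrative, and it assumes the standard grok pattern library (INT, IP, WORD, CISCOTIMESTAMP, ...) is available; the Logstash version would use the same pattern inside a grok filter block.

```sql
-- All three event types share the syslog/ASA prefix; only the body differs.
-- 302013: "Built outbound TCP connection ... for outside:IP/port (...) to inside:IP/port (...)"
SELECT grok(raw,
  '<%{INT:pri}>%{CISCOTIMESTAMP:ts} %{HOSTNAME:device} ' ||
  '%ASA-%{INT:severity}-302013: Built %{WORD:direction} TCP connection %{INT:connection_id} ' ||
  'for %{WORD:src_zone}:%{IP:src_ip}/%{INT:src_port} %{DATA} ' ||
  'to %{WORD:dst_zone}:%{IP:dst_ip}/%{INT:dst_port} %{GREEDYDATA}') AS fields
FROM asa_raw;
-- 302014 (Teardown TCP connection) and 106023 (Deny ...) follow the same prefix
-- with their own event-specific bodies.
```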
Test Results
Note: in the aggregation scenario, the number of Logstash pipeline workers (threads) has to be set to 1 to ensure correct aggregation results, which is why its throughput drops so sharply.
Performance Comparison
Cisco ASA Firewall Log Processing Benchmark (8 cores, 16GB RAM):

| Scenario | Logstash | Timeplus | Improvement |
| --- | --- | --- | --- |
| Parsing (grok) | 60,000 eps | 250,000 eps | ~4x |
| Aggregation | 25,000 eps | ~250,000 eps | ~10x |
What makes Timeplus faster in this case?
Based on our test, Timeplus delivers 4x faster log parsing and 10x faster aggregation compared to Logstash when processing Cisco ASA firewall logs with identical grok patterns.
The performance gap reveals fundamental architectural limitations:
Aggregation bottleneck (10x difference): Logstash requires single-threaded processing to guarantee correct aggregation results, as noted above. This architectural constraint caps throughput at 25,000 eps regardless of available CPU cores.
Parsing bottleneck (4x difference): Even with multi-threaded workers, Logstash's JRuby-based grok processing and single-threaded input stages limit parsing to 60,000 eps, while Timeplus leverages vectorized processing and parallel execution to achieve 250,000 eps.
These results trace back to fundamental differences between the two products.
The 4x parsing performance difference comes down to how each system was built and which programming language it is written in.
Logstash is written in Ruby and runs on JRuby (Ruby on the JVM), an interpreted scripting language. When Logstash parses a log with grok patterns, it runs those patterns through an interpreter that reads and executes the code line by line, rather than running pre-compiled machine instructions. Think of it like reading a recipe step by step versus cooking from muscle memory: one is inherently slower. Additionally, every log event creates Ruby objects in memory that need to be cleaned up by Java's garbage collector, which periodically pauses processing to free up memory. Most critically, Logstash's input stage (where logs first enter the system) runs on a single thread, creating a bottleneck where logs queue up waiting to be processed sequentially even though 7 other CPU cores are available.
Timeplus was built in C++, a compiled language where code is transformed into optimized machine instructions before it ever runs. Its grok patterns execute as native machine code that runs directly on the CPU without interpretation overhead. More importantly, Timeplus uses SIMD (Single Instruction, Multiple Data) technology—special CPU instructions that can process 4-16 log entries at the exact same time in one operation, like scanning multiple lines simultaneously instead of one at a time. It also spreads parsing work across all available CPU cores automatically, with no garbage collection pauses to interrupt processing.
The 10x aggregation performance difference comes down to a simple fact: Logstash can only use one CPU core at a time for aggregations, while Timeplus can use all available cores simultaneously.
Here's why this happens: when Logstash needs to count events or calculate statistics (like "how many failed logins per IP address?"), its architecture cannot split this work across multiple threads without producing wrong answers; the aggregate filter's own documentation requires running with a single pipeline worker. So even though the test machine has 8 CPU cores, Logstash forces aggregations to run on just one core while the other 7 sit completely idle. This isn't a bug or a configuration issue; it's a fundamental design limitation that cannot be fixed.
Timeplus was built differently from the ground up. Its architecture can correctly divide aggregation work across all 8 cores, with each core processing a portion of the data and producing accurate combined results. Additionally, Timeplus stores data in columns, so when counting events by severity level, it only reads the "severity" column instead of loading entire log records into memory. It also uses special CPU instructions that can process 4-16 values at once instead of one at a time.
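To make both points concrete, here is what the severity count from the paragraph above looks like as Timeplus streaming SQL (a sketch, reusing the hypothetical asa_parsed stream from earlier): a single declarative query that the engine parallelizes across cores while reading only the column it needs.

```sql
-- Reads only the severity column; each core aggregates a slice of the
-- stream, and the partial counts are merged into one correct result.
SELECT severity, count() AS events
FROM asa_parsed
GROUP BY severity
EMIT PERIODIC 1s;
```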
Summary
In today’s blog, we compared the performance of Logstash and Timeplus for parsing Cisco ASA firewall logs, using identical grok patterns and identical hardware (8-core, 16GB RAM machines). The results reveal significant performance differences caused by fundamental architectural choices.
For log ingest and transformation, each product fits different use cases:
Logstash is a good fit for:
- Scenarios requiring obscure plugins (legacy systems, niche enterprise software)
- Organizations with deep ELK stack investments and limited pressure to change
- Simple forwarding scenarios where transformation is minimal
- Teams with strong Logstash expertise but limited SQL skills
- Moderate volumes (<10,000 events per second)
Timeplus excels with:
- High-volume log streams (>50,000 events per second)
- Real-time requirements demanding sub-second latency
- Complex analytics involving aggregations and correlations
- Resource-constrained environments (edge computing, small instances)
- Operational simplicity preferences (fewer moving parts)
- Cost sensitivity around infrastructure spend
- Security operations requiring immediate threat detection
If your use case involves high-volume logs requiring real-time aggregation and analysis, Timeplus is a good choice. Try it yourself! See the install options.