From Logs to Context: Why Your SOC Detection Needs a Real-Time Context and Control Layer
By Ting Wang, CEO and Co-Founder of Timeplus
🎙️ Join our webinar, "From Logs to Context: Build Real-Time Security Context & Control to Detect Threats Before SIEM", hosted by Ting Wang and Gang Tao (CTO). Tuesday, March 31, 11am PT
The Midnight Alert Fatigue
It’s 00:03. You’re on call. A single source starts hammering your VPN login surface. Your Cisco ASA does what it’s supposed to do: it logs every rejection. Invalid password, user not found, account locked. Line after line.

Each event is “true.”
But dozens of these appear within seconds. Then hundreds. Every log line is still technically accurate. Each one on its own tells you almost nothing. Taken together, they paint an obvious picture — a brute-force or password-spray attack is underway.
The problem is, the questions a SOC analyst actually needs answered don't live in any single log line. How many failures in the last 60 seconds or 5 minutes? Is this IP behaving differently from its normal pattern? Is it targeting privileged users or critical assets? Did a successful login follow the spray?
No single event or stream can answer that, and analyzing data from 30 minutes or hours ago means you have already missed the attack. Real-time context can.
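As a preview of what answering the first of those questions looks like in streaming SQL, here is a minimal sketch. The stream name (asa_logs), column names, and the threshold of 20 are illustrative assumptions, not from a real deployment:

```sql
-- Failed logins per source IP over a sliding 60-second window,
-- re-evaluated every 5 seconds (illustrative schema and threshold).
SELECT
    window_start,
    src_ip,
    count() AS failures
FROM hop(asa_logs, 5s, 60s)
WHERE event_type = 'auth_failure'
GROUP BY window_start, src_ip
HAVING failures > 20;
```

A query like this never "finishes" — it keeps emitting results as new events arrive, which is exactly the property the rest of this post builds on.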
Where Today's SOC Architecture Falls Short
Most Security Operations Centers process data through a familiar pipeline. Raw logs flow into a SIEM. The SIEM indexes the data, and an analyst then runs filtering, transformations, correlations, and eventually an alert fires.
The problem is baked into the sequence. Every step — indexing, querying, correlating — happens after the data has been stored. Time windows get reconstructed retroactively. Aggregations are computed on historical records. Enrichment is layered on after the fact.
By the time a pattern surfaces, minutes have passed. Thousands of events have already been ingested and stored, which is expensive. Worse, the attacker may have already succeeded and moved laterally inside your network.
The detection logic itself is usually fine. The real bottleneck is when and where that logic runs.
A Useful Way to Think About the Gap
Three concepts help clarify this:
Logs are raw events. They tell you what happened at a point in time (e.g. a single firewall denial, a single authentication failure).
Signals are patterns that emerge from aggregating multiple events (e.g. 50 failed logins from the same IP in 10 seconds, a sudden traffic spike from a previously quiet host).
Context ties signals together across time, identity, and history. It connects a spike in failed logins to the fact that this IP has never accessed this subnet before, and that the targeted account belongs to a domain administrator.
In a typical SOC setup, signals and context are both produced after ingestion. They're reconstructed from stored data. That fundamental delay means even the signals themselves can be missed, or arrive too late to act on.
Why "Store Everything, Search Later" Is Breaking Down
This approach worked when attacks unfolded slowly. Correlation windows could stretch across hours. Data volumes were manageable, and latency of minutes or longer between an event and its detection didn't change outcomes.
That world is disappearing. AI-driven attacks compress entire kill chains — reconnaissance, exploitation, lateral movement, exfiltration — into seconds or minutes. The combination of faster attacks and exploding data volumes has created a time-and-scale problem that the old model wasn't built to handle.
The Missing Capability: Real-Time Dynamic Correlation (Live vs. Baseline)

Detecting "high traffic" from an IP address, by itself, is not very useful. A web server legitimately handling a traffic surge will also show high traffic. What matters is whether the behavior is abnormal for that specific entity at that specific moment.
Answering that question requires four things working together: live aggregation over short windows (seconds), a historical baseline computed over longer windows (minutes or hours), real-time correlation between the live data and the baseline, and dynamic thresholds that adapt per entity rather than relying on fixed rules.
Here's what this looks like concretely, using Cisco ASA logs processed through Timeplus streaming SQL.
Step 1 — Capture Live Dynamic Traffic in Short Windows
A materialized view computes total bytes per source IP over a sliding 5-second window, advancing every second:
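A sketch of that view in Timeplus streaming SQL follows. The stream name (asa_logs), view name, and column names are illustrative assumptions; exact syntax may vary across Timeplus versions:

```sql
-- Live short-term traffic: total bytes per source IP,
-- computed over a 5-second window that slides forward every second.
CREATE MATERIALIZED VIEW mv_live_traffic AS
SELECT
    window_start,
    src_ip,
    sum(bytes) AS total_bytes
FROM hop(asa_logs, 1s, 5s)   -- slide interval 1s, window size 5s
GROUP BY window_start, src_ip;
```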
This HOP function creates an overlapping window. Every second, it emits the sum of bytes for the last five seconds per IP. The result is a continuously updated picture of short-term traffic behavior.
Step 2 — Build a Dynamic Baseline
A separate materialized view computes, for each IP, a running average of the 5-second hop-window totals:
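A sketch of the baseline, again with illustrative names and hedged syntax (mutable streams are a Timeplus feature; the exact DDL may differ by version):

```sql
-- Baseline: exactly one row per source IP, updated in place.
CREATE MUTABLE STREAM ip_baseline (
    src_ip string,
    avg_bytes float64
)
PRIMARY KEY (src_ip);

-- Continuously recompute the running average of the
-- 5-second totals emitted by the live view.
CREATE MATERIALIZED VIEW mv_baseline
INTO ip_baseline AS
SELECT
    src_ip,
    avg(total_bytes) AS avg_bytes
FROM mv_live_traffic
GROUP BY src_ip;
```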
This creates a Mutable Stream keyed by src_ip, meaning it holds exactly one row per IP address and updates that row in place whenever a new value arrives. The materialized view continuously listens to the 5-second hop aggregation stream and recalculates the running average of 5-second total bytes for each IP.
Step 3 — Correlate Two Moving Targets (Live vs. Baseline) in Real Time
A third materialized view joins the live stream against the baseline stream on source IP and computes a spike ratio:
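A sketch of the correlation step, reusing the illustrative names from the previous sketches (cxt_ddos_stream is the output stream named in this post):

```sql
-- Correlate live traffic with the per-IP baseline and
-- write enriched events into cxt_ddos_stream.
CREATE MATERIALIZED VIEW mv_spike
INTO cxt_ddos_stream AS
SELECT
    l.window_start,
    l.src_ip,
    l.total_bytes               AS live_bytes,
    b.avg_bytes                 AS baseline_bytes,
    l.total_bytes / b.avg_bytes AS spike_ratio
FROM mv_live_traffic AS l
JOIN ip_baseline AS b
  ON l.src_ip = b.src_ip;
```

Because ip_baseline is mutable, the join always sees each IP's most recent average — both sides of the comparison keep moving.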
This materialized view is where real-time detection actually happens. It continuously joins the live 5-second traffic stream against the mutable baseline lookup table on src_ip. Every time a new 5-second window arrives for an IP, the join looks up that IP's historical average from the mutable stream and computes a spike ratio — the current traffic divided by the baseline.
The result flows into cxt_ddos_stream, giving downstream consumers (like the alert) a ready-to-use stream where every event already carries the live bytes, the baseline bytes, and how many times above normal the traffic is. This is essentially the streaming equivalent of a classic anomaly score: no fixed thresholds baked in, just a ratio that says "this IP is currently doing 8x its normal traffic" and lets the alert layer decide what to act on.
Step 4 — Trigger Detection and Response
An alert fires when the spike ratio exceeds the threshold:
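A sketch of the alert query, using the threshold and column names from the earlier examples. In Timeplus, throttling and event batching are typically configured on the alert definition rather than in the SQL itself — treat that placement as an assumption:

```sql
-- Alert condition: any IP currently above 5x its own baseline.
SELECT
    window_start,
    src_ip,
    live_bytes,
    baseline_bytes,
    spike_ratio
FROM cxt_ddos_stream
WHERE spike_ratio > 5;
```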
The throttle prevents alert storms. The batch event size groups related events into a single notification. Detection happens within seconds of the anomaly appearing, not minutes after ingestion.
What does this approach deliver? Detection in seconds rather than minutes. Thresholds that adapt dynamically per IP. Anomaly detection that runs continuously, not on a schedule.
Can Existing Tools Do This?
This is worth examining honestly. The detection logic above isn't conceptually new. What's different is where and how it executes.
Splunk
Splunk can express similar logic, but the implementation is constrained by its underlying "search-after-storage" architecture. This makes correlation operationally heavy and expensive, and, critically, it is not real-time.
A representative query might look like this:
```
index=asa earliest=-5m
| stats sum(bytes) as total_bytes by src_ip
| eval baseline = total_bytes / 300
| join src_ip
    [ search index=asa earliest=-30s
      | stats sum(bytes) as live_bytes by src_ip ]
| eval spike_ratio = live_bytes / baseline
| where spike_ratio > 5
```

The logic is sound. The constraints are operational:
Indexing Latency: This query only runs after data has been fully indexed. You are already behind the clock.
Expensive/Limited Joins: The join operation is resource-intensive at scale. Running this across high-volume firewall logs frequently can cripple search performance.
Scheduled, Not Continuous: It executes on a schedule rather than continuously, so the end-to-end latency from event to alert is measured in minutes, not seconds.
For SOC teams dealing with budget pressure, the cost of storing and re-querying all this raw data adds up quickly. You are paying to store the noise just to find the signal minutes too late.
Cribl
Cribl approaches the problem differently by processing data in-flight. It excels at applying filtering and routing logic as events pass through, which is effective for reducing SIEM ingestion costs:
```javascript
if (/ASA/.test(_raw)) {
  src_ip = extract(/src=([0-9.]+)/, _raw);
}
baseline = lookup("baseline_by_ip", src_ip);
if (bytes > baseline * 5) {
  route("alert");
}
```

While this is fast for simple comparisons, it hits a wall quickly in a SOC environment:
Static Baselines: The baseline must be precomputed elsewhere and loaded as a static lookup. It cannot adapt to changing network conditions in real-time.
Stateless Processing: Cribl does not maintain a continuous state across events. It cannot natively correlate data across different time windows—such as comparing a 5-second burst to a 5-minute average.
Routing vs. Correlation: Cribl handles data movement exceptionally well, but it does not analyze data in the way this use case demands. It can tell you where the data should go, but it cannot tell you what the data means when the answer requires temporal context.
Timeplus
The Timeplus approach, shown in the streaming SQL examples above, computes both the live stream and the baseline continuously. The join happens while data is still in motion. State is maintained across windows. Detection fires the moment the threshold is crossed.
The output can be routed to multiple downstream systems simultaneously—an enriched alert to the SIEM for long-term storage, a Slack channel for immediate notification, and a SOAR platform for automated response.
The key difference is structural:
Splunk answers questions after data is stored.
Cribl decides where data goes.
Timeplus determines what the data means while it is still moving.
The Architecture Shift
In the traditional pipeline, every step after ingestion is retroactive. The SIEM owns the data, and all analysis depends on querying stored records.
Alternatively, the new architecture inserts a real-time processing layer between the raw logs and the downstream stores:

In this model, the control layer handles correlation, context building, detection, and routing before data reaches any storage system. The SIEM still exists — it receives enriched, filtered, pre-correlated data. But it's no longer the only place where analysis happens, and it's no longer the bottleneck.
What Actually Changes
Moving detection upstream — from post-storage to in-motion — shifts several things at once.
Raw events become real-time signals. Instead of storing everything and searching for patterns later, the system produces aggregated signals as data flows through. A brute-force detection rule doesn't wait for a scheduled query. It fires the moment the pattern forms.
Fixed thresholds become dynamic baselines. Rather than alerting on "more than 100 failed logins per minute" (a number that's either too sensitive or too loose depending on the environment), the system compares each entity against its own recent history. An IP that normally generates two login attempts per minute and suddenly produces fifty triggers an alert. An IP that routinely generates a thousand does not.
Batch correlation becomes continuous correlation. Instead of running a correlation query every five minutes, the system maintains a join state across streams. When a new event arrives that completes a pattern, the match is emitted immediately.
Downstream costs drop. When signals, context, and detection happen before data reaches storage, the volume of data that needs to be stored and indexed decreases. The SIEM ingests enriched alerts and relevant context — not every raw syslog line from every firewall.
These aren't hypothetical architectures. One of the world's leading entertainment platforms uses Timeplus as its real-time security context and control layer for cyber threat detection. They moved away from traditional data platforms and SIEM-based detections in favor of Timeplus, and are now able to:
Handle millions of events per second (IPFIX flow data)
Track and baseline hundreds of thousands of unique IPs simultaneously
Identify and block malicious IPs globally in seconds, in real-time
Run hundreds of concurrent rules, including real-time correlation and dynamic baseline comparisons, even during massive traffic spikes
Operate across edge/hybrid environments

Try it Yourself
Want to see how Timeplus processes massive event volumes and runs security correlations in real time?
Download a 30-day free trial and get started today:
```shell
curl https://install.timeplus.com | sh
```

The security industry has spent years building better ways to store and search telemetry. That work is valuable, and SIEMs aren't going away. But the assumption that detection must happen after storage is increasingly outdated.
The real goal is producing trusted context early enough to act on — while events are still flowing and the attack is still in progress. A real-time control layer makes that possible. Not as a replacement for the SIEM, but as the missing layer in front of it.