Serverless Stream Processing In the Cloud with Timeplus and Redpanda

Updated: Apr 2

Introducing Timeplus built-in support for Redpanda Serverless – welcome to the fastest and easiest way to process streaming data.


 

Take on stream processing with Redpanda, a high-performance streaming data platform, paired with Timeplus, a simple, powerful, and cost-efficient stream processor. With this combination, developers can easily get started with only SQL—no infrastructure to deploy locally and no systems to configure.


Our customers love using Redpanda with Timeplus. This pairing is an excellent way for developers to build and scale powerful streaming applications, even with limited resources. The new serverless delivery model will make it even easier for data teams to deploy instantly as their requirements change.

 

A Tour of Timeplus Cloud and Redpanda Serverless


In this post, we will walk you through how to analyze web access logs with Redpanda Serverless and Timeplus Cloud. We will use streaming SQL to explore the live data, build a stream processing application to capture notable events, and send the results to a Redpanda topic without exposing the raw IP address.


1. Start your free 14-day trial of Timeplus Cloud at https://us.timeplus.cloud and create your workspace in seconds. Click “Data Ingestion” in the left navigation menu and choose Redpanda.




2. This will start a wizard to set up a connection between Timeplus and Redpanda. If you don’t have access to Redpanda yet, you can create an account at https://cloud.redpanda.com and receive $100 USD in free credits to spend in the first 14 days. Check out this video from the Redpanda team for more details.


Specify the Redpanda broker address and the authentication method in the Timeplus UI.




3. Click “Next” and Timeplus will list all available topics in the specified Redpanda cluster.



If you don’t have topics and streaming data ready in your Redpanda cluster, you can use the following Docker Compose file to generate sample “owl-shop” data in the cluster. Replace the ID, USER, and PASSWORD placeholders with your cluster’s values.

version: "3.7"
services:
  owl-shop:
    image: quay.io/cloudhut/owl-shop:sha-042112b
    platform: "linux/amd64"
    entrypoint: /bin/sh
    command: -c "echo \"$$OWLSHOP_CONFIG_FILE\" > /tmp/config.yml; /app/owlshop"
    environment:
      CONFIG_FILEPATH: /tmp/config.yml
      OWLSHOP_CONFIG_FILE: |
        shop:
          requestRate: 1
          interval: 0.1s
          topicReplicationFactor: 1
          topicPartitionCount: 1
        kafka:
          brokers: "ID.any.us-east-1.mpx.prd.cloud.redpanda.com:9092"
          sasl:
            enabled: true
            mechanism: SCRAM-SHA-256
            username: USER
            password: PASSWORD
          tls:
            enabled: true


4. I chose the “owlshop-frontend-events” topic and clicked “Next”. Timeplus will load sample data from the topic and suggest column names and data types.




5. Click “Next”. You will be asked to specify the name for the stream (I’ve called it ‘frontend_events’), with an optional description. Timeplus will create an External Stream to continuously read data from the Redpanda topic.




6. Click “Create External Stream”. On the next page, which lists the External Streams, click the “Explore” icon in the ‘frontend_events’ row. It will take you to the SQL Console, with the auto-generated streaming SQL SELECT * FROM frontend_events ready to explore the live data.




7. Unlike a traditional SQL query, which returns once and stops, a streaming SQL query in Timeplus keeps running and emitting new results until you cancel it. In the SQL Console, you can apply different SQL transformations or aggregations and watch the results update in streaming mode. The live table header makes data patterns and trends easy to spot.
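To make the "long-running query" idea concrete, here is a minimal Python sketch of a streaming aggregation: instead of returning one final answer, it emits an updated result every time a new event arrives. This is only an analogy for the behavior described above, not how Timeplus is implemented. The function name `running_status_counts` and the sample events are made up for illustration, and the `response.statusCode` shape is an assumption about the event payload.

```python
from collections import Counter
from typing import Iterable, Iterator


def running_status_counts(events: Iterable[dict]) -> Iterator[dict]:
    """Yield an updated count per status code after every incoming event."""
    counts: Counter = Counter()
    for event in events:
        # Normalize to a string so 200 and "200" are counted together.
        counts[str(event["response"]["statusCode"])] += 1
        # Emit a snapshot right away instead of waiting for the input to end.
        yield dict(counts)


# Made-up sample events; a real stream would never finish.
sample_events = [
    {"response": {"statusCode": 200}},
    {"response": {"statusCode": 404}},
    {"response": {"statusCode": 200}},
]
snapshots = list(running_status_counts(sample_events))
```

Each element of `snapshots` corresponds to one refresh of the result table: the answer keeps changing as long as data keeps arriving.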




8. Let’s implement a simple stream processing use case: from the web access logs, select the requests whose HTTP status code is not 200 and send them to a Redpanda topic, replacing the raw IP address with a hash. This can be done with the following streaming SQL:

SELECT response:statusCode AS code, hex(md5(ipAddress)) AS hashed_ip, method, requestedUrl
FROM frontend_events
WHERE response:statusCode != '200'
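For readers who want to sanity-check the transformation outside the SQL Console, the same logic can be sketched in plain Python: drop events with status 200, and replace the IP address with the hex form of its MD5 digest. The function name `mask_non_200` and the sample event are hypothetical. The `.upper()` call assumes the SQL `hex()` function emits uppercase digits, which is worth confirming against your actual output.

```python
import hashlib
from typing import Optional


def mask_non_200(event: dict) -> Optional[dict]:
    """Return a masked record for non-200 events, or None for 200 responses."""
    code = str(event["response"]["statusCode"])
    if code == "200":
        return None  # filtered out, like the WHERE clause
    digest = hashlib.md5(event["ipAddress"].encode("utf-8")).hexdigest()
    return {
        "code": code,
        # Assumption: uppercase hex to mirror hex(md5(...)) in the query.
        "hashed_ip": digest.upper(),
        "method": event["method"],
        "requestedUrl": event["requestedUrl"],
    }


# Made-up event using a documentation-range IP address.
sample_event = {
    "ipAddress": "192.0.2.10",
    "method": "GET",
    "requestedUrl": "/cart",
    "response": {"statusCode": 404},
}
masked = mask_non_200(sample_event)
```

The output record carries only the four selected fields, so the raw `ipAddress` value never reaches the downstream topic.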


9. Timeplus provides a great way to run this kind of ad-hoc streaming SQL. Unlike FlinkSQL or ksqlDB, you can immediately get results from Timeplus without waiting for job creation, scheduling, and pooling.




10. Once you confirm the processing logic is correct, you can click the “Send as Sink” button to set up a background job that runs the streaming SQL and continuously sends the results to Redpanda.




11. You may need to create a new topic in the Redpanda Serverless console.




12. Once you finish the previous step, Timeplus will create a new external stream connected to the target Redpanda topic, then create a Materialized View as a long-running job that reads the source topic, applies the transformation, and writes the results to the target topic. This is nicely visualized on the Data Lineage page in Timeplus.




13. Now if you check the Redpanda console, you will see new messages continuously arriving in the target topic. Each message is an HTTP access log entry with a non-200 status code, and the IP address is hashed to protect users’ privacy.
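If you copy a few messages out of the target topic, a quick check like the following can confirm that no raw IPv4 address slipped through and that no 200 responses were forwarded. It is a rough sketch that assumes the messages are JSON objects with a `code` field, matching the column aliases in the query from step 8; the function name `looks_masked` and the sample messages are made up.

```python
import json
import re

# Loose IPv4 pattern: four dot-separated groups of 1-3 digits.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def looks_masked(raw_message: str) -> bool:
    """True if the message has no IPv4-looking text and a non-200 code."""
    record = json.loads(raw_message)
    has_raw_ip = IPV4_PATTERN.search(raw_message) is not None
    is_ok_response = str(record.get("code")) == "200"
    return not has_raw_ip and not is_ok_response


# Made-up messages for illustration.
good = '{"code": "404", "hashed_ip": "0A1B2C", "method": "GET", "requestedUrl": "/cart"}'
leaky = '{"code": "500", "ipAddress": "192.0.2.10", "method": "POST", "requestedUrl": "/pay"}'
results = [looks_masked(good), looks_masked(leaky)]
```

The pattern is deliberately loose, so it may also flag version-like strings in URLs; treat a failed check as a prompt to look closer rather than as proof of a leak.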



 

With Timeplus Cloud and Redpanda Serverless, it’s easier than ever to get started with stream processing. Try it yourself: sign up for a free trial of Redpanda Serverless and Timeplus Cloud.

