
    Real-Time Fraud Detection: From Kafka to ClickHouse

Aivena Engineering · 2026-02-11 · 2 min read


    In the world of high-frequency finance, latency is the enemy. A fraud detection system that flags a transaction after the money has left the account is a failure. To build truly preventive security, you need an architecture that can ingest, process, and alert in under 200 milliseconds.

    The End-to-End Streaming Architecture

    Building this requires a "Zero-Copy" ingestion strategy where data moves from the producer to the analytics engine with minimal serialization overhead.

```mermaid
graph LR
    subgraph Producers [Producers]
        App[Transaction API]
        ATM[ATM Network]
    end
    subgraph Streaming [Managed Streaming]
        Kafka[Apache Kafka KRaft]
        Topic[Topic: transactions.raw]
    end
    subgraph Analytics [Real-Time Engine]
        CH_Queue[Kafka Engine Table]
        MV[Materialized View]
        CH_Store[(ClickHouse Storage)]
    end
    subgraph Alerting [Action Layer]
        Alert[Threshold Alert Service]
        UI[Superset Dashboard]
    end
    App -->|JSON| Kafka
    ATM -->|JSON| Kafka
    Kafka --> Topic
    Topic --> CH_Queue
    CH_Queue --> MV
    MV --> CH_Store
    CH_Store --> Alert
    CH_Store --> UI
    style Streaming fill:#fff9f0,stroke:#f59e0b
    style Analytics fill:#f5f7ff,stroke:#4a6cf7
    style Alerting fill:#f0fff4,stroke:#22c55e
```

    1. High-Throughput Ingestion with Managed Kafka

    On Aivena Data OS, we deploy Apache Kafka in KRaft mode. This eliminates the operational burden of ZooKeeper and provides higher throughput and faster leader elections.

    Pro Tip: Schema Enforcement

    Use the Aivena Schema Registry (integrated with Kafka) to ensure that your transaction producers are sending valid JSON. If an invalid payload is sent, it's rejected at the broker level, preventing downstream "poison pill" errors.
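The same contract can also be enforced on the producer side before a message ever leaves the service. The sketch below is an illustration only: the field names mirror the queue table defined later in this post, but the validation helper is an assumption, not the Aivena Schema Registry API.

```python
# Hypothetical producer-side check mirroring the transaction schema.
# Field names match the ClickHouse queue table; the helper itself is
# illustrative, not part of the Aivena Schema Registry.
REQUIRED_FIELDS = {
    "transaction_id": str,
    "user_id": int,
    "amount": float,
    "currency": str,
    "timestamp": str,  # ISO-8601, parsed downstream as DateTime64(3)
}

def validate_transaction(payload: dict) -> list:
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    return errors
```

A producer would call `validate_transaction` and refuse to publish (or route to a dead-letter topic) when the error list is non-empty, so malformed payloads never become downstream "poison pills".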

    2. Zero-Copy Ingestion: The Kafka Engine

    Instead of writing a custom Python or Go consumer to read from Kafka and write to ClickHouse, we use the ClickHouse Kafka Engine. This is a built-in consumer that treats a Kafka topic as a table.

    Configuration Snippet

```sql
-- 1. Create the Queue table (the consumer)
CREATE TABLE transaction_queue (
    transaction_id String,
    user_id UInt64,
    amount Float64,
    currency String,
    timestamp DateTime64(3)
) ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka.internal:9092',
         kafka_topic_list = 'transactions.raw',
         kafka_group_name = 'clickhouse_consumer_group',
         kafka_format = 'JSONEachRow';

-- 2. Create the Destination table (the storage)
CREATE TABLE transactions (
    transaction_id String,
    user_id UInt64,
    amount Float64,
    currency String,
    timestamp DateTime64(3)
) ENGINE = MergeTree()
ORDER BY (user_id, timestamp);

-- 3. Create the Glue (the Materialized View)
CREATE MATERIALIZED VIEW transaction_mv TO transactions AS
SELECT * FROM transaction_queue;
```

    3. Real-Time Threshold Alerting

The goal is to detect "Velocity Attacks" (e.g., more than 5 transactions in one minute). We can use another Materialized View to pre-calculate these thresholds.

```sql
CREATE MATERIALIZED VIEW velocity_alerts
ENGINE = MergeTree()
ORDER BY window_start
AS SELECT
    user_id,
    count() AS tx_count,
    sum(amount) AS total_spent,
    toStartOfMinute(timestamp) AS window_start
FROM transaction_queue
GROUP BY user_id, window_start
HAVING tx_count > 5;
```
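To make the windowing rule concrete, here is the same aggregation expressed in plain Python, a sketch useful for unit-testing the detection logic before wiring it into the pipeline (function and field names are illustrative):

```python
from collections import defaultdict
from datetime import datetime

def velocity_alerts(transactions, threshold=5):
    """Group transactions into one-minute windows per user and flag any
    (user_id, window) whose count exceeds the threshold -- mirroring the
    GROUP BY / HAVING tx_count > 5 logic of the SQL view."""
    windows = defaultdict(lambda: {"tx_count": 0, "total_spent": 0.0})
    for tx in transactions:
        ts = datetime.fromisoformat(tx["timestamp"])
        window_start = ts.replace(second=0, microsecond=0)  # toStartOfMinute
        key = (tx["user_id"], window_start)
        windows[key]["tx_count"] += 1
        windows[key]["total_spent"] += tx["amount"]
    return {k: v for k, v in windows.items() if v["tx_count"] > threshold}
```

Six transactions from one user inside a single minute would produce exactly one flagged (user, window) pair, while a user with two transactions in that minute stays below the threshold.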

    When a record hits the velocity_alerts table, a simple Aivena Webhook Node can trigger an immediate SMS or push notification to the user to verify the activity.
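The webhook payload itself might be shaped along these lines (a sketch; the endpoint URL and payload fields are assumptions, not an Aivena Webhook Node contract):

```python
import json
import urllib.request

WEBHOOK_URL = "https://hooks.example.internal/fraud-alert"  # hypothetical endpoint

def build_alert(row: dict) -> bytes:
    """Render one velocity_alerts row as a JSON webhook body."""
    return json.dumps({
        "user_id": row["user_id"],
        "tx_count": row["tx_count"],
        "total_spent": row["total_spent"],
        "window_start": row["window_start"],
        "message": f"Verify activity: {row['tx_count']} transactions in one minute",
    }).encode("utf-8")

def send_alert(row: dict) -> None:
    """POST the alert to the webhook endpoint."""
    req = urllib.request.Request(
        WEBHOOK_URL,
        data=build_alert(row),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=5)
```

In practice the SMS/push fan-out lives behind the webhook, so the pipeline only needs to deliver this one small payload per flagged window.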

    Why Aivena?

    Managing Kafka and ClickHouse clusters yourself is a full-time job. Aivena Data OS provides:

    * mTLS Everywhere: All data between Kafka and ClickHouse is encrypted by default.

    * Auto-Scaling Brokers: If your Black Friday traffic spikes 10x, Aivena scales your Kafka brokers automatically.

    * One-Click Observability: Monitor your Kafka consumer lag and ClickHouse insert rates in a pre-configured Grafana dashboard.


    Need to detect fraud in milliseconds? Deploy Kafka and ClickHouse on Aivena Data OS today.