MOBIRAN
Data Infrastructure

ARStreams

v2.0.0GA

Real-time CDC streaming

PostgreSQL logical replication to PostgreSQL or Apache Iceberg. Multi-target fan-out. Snapshot plus incremental pipeline.

Problem

What this product solves.

Enterprises run their transactional core on PostgreSQL but need that data — every insert, update and delete, with millisecond freshness — in downstream systems: analytical lakehouses, replicas across regions, audit pipelines, regulator-facing stores. Vendor CDC stacks bind you to a managed cloud, mix data and control planes, and leave the operator without an honest view of replication state. ARStreams replaces them with a self-hosted streaming layer that runs on PostgreSQL’s own logical replication and treats every target as a first-class sink.

Core capabilities

What the product does.

cap_01

pgoutput logical decoding

Reads WAL through PostgreSQL’s native pgoutput protocol — no extensions, no proprietary agents on source. Every commit becomes an ordered event stream the platform can replay, filter or fan out.

cap_02

Multi-target fan-out

One replication slot, many sinks. The same WAL stream lands in another PostgreSQL cluster and an Apache Iceberg table at the same time. Targets are added, paused and removed independently.

cap_03

Snapshot + incremental pipeline

Cold start runs a consistent snapshot, then transitions to the live WAL position without a gap and without operator intervention. New targets join the same pipeline at any time.

cap_04

Apply / log / filter modes

Each stream picks how it touches the target: apply (UPSERT/UPDATE/DELETE), log (append-only history with full event metadata), filter (subset replication — only the rows or columns you allowed).

cap_05

Per-table position tracking

Every table reports its LSN, row counts and lag in the control plane. Stalls and slot growth surface immediately — not when the disk fills up.

cap_06

Replication slot lifecycle

The platform owns publication and slot creation, advancement and tear-down. Operators stop chasing dangling slots after re-deployments.

cap_07

Backpressure-aware writers

Runner and writer are pipelined and decoupled. A slow target slows its own stream without blocking the rest of the platform.

cap_08

Operator-grade observability

Run history, rows_inserted / updated / deleted, per-stream audit trail. The same data the platform uses to make decisions is visible to your engineers.

Operator surface

What operators actually see.

Captured from a live evaluation environment. Same UI customers run; nothing reproduced from a brochure.

shot_01

CDC streams — apply / log / filter modes, WAL lag and change counters per stream.

ARStreams — CDC streams list
shot_02

Replication slot health — LSN, retained WAL and active state per slot.

ARStreams — replication slots
shot_03

Source and target connections — credentials owned by ARStreams, never re-typed in CI.

ARStreams — connections
Architecture

How it is built.

ARStreams runs as a Spring Boot service backed by PostgreSQL for metadata and JWT-secured REST + WebSocket APIs for control. The data plane is a pair of pipelined components — a CdcRunner that reads WAL via PGReplicationStream and decodes pgoutput events into batches, and CdcWriter instances that apply those batches to targets in parallel through an ExecutorService. Replication slots, publications and LSN advancement are owned by the platform; operators see them, they do not maintain them. The platform stores its own state in PostgreSQL with Flyway-versioned migrations, never in the source database it replicates from.

arch_01

Decoupled runner / writer pipeline

WAL reading, event decoding, batching and per-target writes are separate stages. Pressure on one target stays local to that target.

arch_02

PostgreSQL-resident state

Stream definitions, run history and audit logs live in the ARStreams metadata database. No reliance on external coordination services.

arch_03

Per-stream replication slot

Every CDC stream gets its own logical slot and publication. Slots advance only after target acknowledges the batch.

arch_04

Pluggable target writers

TargetWriter interface backs CdcPgWriter and SnapshotCopyWriter + incremental for Iceberg. Future targets implement the same contract.

arch_05

JWT-secured control surface

Same identity model as the rest of MOBIRAN. REST endpoints for CRUD on streams, connections, settings, users; WebSocket for live state.

REST API

Driven by a real REST surface.

Every product action available in the UI is reachable through a JWT-secured REST API. The control plane is the API; the UI is one of its consumers.

api_01JWT

List replication streams

Returns every configured stream with current LSN, lag and last error.

GET/api/v1/streams
Response
[
  {
    "id": "stream_billing_to_lake",
    "source": "pg-billing-prod",
    "target": "iceberg-warehouse",
    "mode": "log",
    "state": "running",
    "lag_ms": 142,
    "lsn": "0/3A2C9F18"
  }
]
api_02JWT

Create a stream

Provisions a publication, replication slot and target sink in one call. Idempotent on stream id.

POST/api/v1/streams
Response
{
  "id": "stream_billing_to_lake",
  "state": "snapshotting",
  "snapshot_progress": 0.0
}
api_03JWT

Inspect a single stream

Per-table progress, slot growth, replication health for one stream — same data the UI reads.

GET/api/v1/streams/{id}
Operator CLI

Operated from the terminal too.

The `arctl` CLI talks to the same control plane as the UI. Same primitives, scriptable, suitable for CI and on-call.

cli_01arctl

List streams

Same as the UI dashboard, scriptable — pipe into jq for filtering.

arctl streams list --format=json
Output
ID                          STATE       LAG    TARGET
stream_billing_to_lake      running     142ms  iceberg-warehouse
stream_orders_to_replica    snapshot    -      pg-orders-replica
cli_02arctl

Create a stream from a spec file

Declarative stream definition; safe to commit to git, replay on a recovered platform.

arctl streams create -f streams/billing-to-lake.yaml
cli_03arctl

Probe stream status

Exits with non-zero if the stream is degraded; suitable for on-call alerting.

arctl streams status stream_billing_to_lake
Integrations

What it connects to.

integration

PostgreSQL (source)

Any PostgreSQL 12+ with logical replication enabled. The platform creates and tears down its own publications and slots.

integration

PostgreSQL (target)

Production PostgreSQL clusters, read replicas, regional copies. Apply / log / filter modes per stream.

integration

Apache Iceberg

Lakehouse target via SnapshotCopyWriter plus incremental events. Schema evolution and time-travel queries remain native to Iceberg.

integration

ARStudio

Fleet console oversees source and target PostgreSQL clusters. Same identity model — credentials and roles flow across both products.

integration

ARFlow

Batch-mode reconciliation and one-off loads sit alongside streaming. The two products share connection definitions and the same metadata schema family.

integration

Object storage

S3-compatible object storage (including MinIO from ARCloud) backs the Iceberg target. No proprietary storage layer.

Use cases

Where it runs in production.

  1. case_01

    Cross-region replicas

    Mirror a banking core PostgreSQL into a regulator-facing region. Apply mode keeps the replica byte-for-byte aligned; the audit stream lands in a separate log target for compliance.

  2. case_02

    Streaming into a lakehouse

    Every change in the operational PostgreSQL lands in Apache Iceberg in near real time. Analytical workloads run against fresh data without touching the source.

  3. case_03

    Read replicas without WAL hops

    Replace fragile chained pg_basebackup setups. Direct logical replication with operator-visible lag, slot health and audit.

  4. case_04

    Audit log fan-out

    Log-mode targets capture every DML event with operator, timestamp and target. Append-only by default — exactly what regulators ask for.

  5. case_05

    Selective replication for partners

    Filter mode exposes only the rows and columns that partners are entitled to see. The source database keeps a single shape; the platform projects the right subset per partner.

Deployment

How it is operated.

ARStreams ships as a Spring Boot JAR with a React control surface. Drop the artefact on bare-metal, on Proxmox VE, or into a container. The metadata PostgreSQL lives wherever you allow it — same perimeter as the source, an isolated tenant in ARCloud, or a dedicated instance.

  • Single JAR, systemd unit, optional nginx reverse-proxy on top.
  • Source PostgreSQL needs wal_level=logical and a role with REPLICATION attribute.
  • Iceberg target connects via S3-compatible object storage credentials.
  • Air-gap-ready: no outbound calls; no SaaS dependency.
  • Upgrades roll forward Flyway migrations on the metadata database only.
Core capabilities

Evaluate this product.

Open the workspace if you already hold credentials, or request guided access through the briefing flow.