Data Engineering
Build production data pipelines — from streaming ingestion to custom Spark connectors to near real-time delivery.
Spark Declarative Pipelines: Medallion-architecture pipelines with streaming tables, materialized views, and Auto Loader.
Spark Structured Streaming: Production streaming with Kafka, stateful operations, watermarks, and multi-sink writes.
Custom Spark Data Sources: Python data sources for connecting Spark to external systems via the PySpark DataSource API.
Zerobus Ingest: Near real-time ingestion into Delta tables via gRPC, with no message bus required.