Core Concepts
Understanding the core concepts of Laminar is essential for building effective streaming pipelines. This guide introduces the fundamental building blocks: Profiles, Tables, Pipelines, and Jobs.
Pipeline Overview
A Laminar pipeline connects data sources to sinks through SQL transformations. Here's the high-level flow:
- Profiles store connection credentials (e.g., Kafka brokers, Iceberg catalog)
- Tables define schemas and link to profiles
- Pipelines run SQL that reads from source tables and writes to sink tables
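In its simplest form, a pipeline is a single INSERT INTO ... SELECT statement that copies every record from a source table to a sink table. The table names below are placeholders:
INSERT INTO my_sink_table
SELECT * FROM my_source_table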
Profiles
Profiles store reusable connection credentials and configuration. Instead of repeating connection details for every table, create a profile once and reference it across multiple tables.
Profile Config Example
{
  "name": "my_kafka_profile",
  "type": "kafka",
  "config": {
    "bootstrapServers": "broker1:9092,broker2:9092",
    "authentication": {
      "protocol": "SASL_SSL",
      "mechanism": "SCRAM-SHA-256",
      "username": "user",
      "password": "password"
    }
  }
}
Supported Profile Types
- Kafka - Apache Kafka, Amazon MSK, Redpanda
- Confluent - Confluent Cloud
- Iceberg - Apache Iceberg tables
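Profiles for other systems follow the same name / type / config shape. As a rough sketch only, an Iceberg profile could look like the example below; the config keys shown (catalogType, catalogUri, warehouse) are assumptions for illustration rather than confirmed Laminar settings, so check the Connectors documentation for the exact fields.
{
  "name": "my_iceberg_profile",
  "type": "iceberg",
  "config": {
    "catalogType": "rest",
    "catalogUri": "https://catalog.example.com",
    "warehouse": "s3://my-lakehouse/warehouse"
  }
}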
Tables
Tables represent external data sources and sinks. They define how Laminar connects to external systems and the schema of the data. A table consists of two parts: config and schema.
Table Config
The config section specifies connector-specific settings like topic name, offset handling, and commit mode.
{
  "name": "user_events",
  "profile": "my_kafka_profile",
  "config": {
    "topic": "user-events",
    "type": {
      "source": {
        "offset": "latest"
      }
    }
  }
}
Supported Connectors
- Kafka
- Confluent
- Kinesis
- Iceberg
- Delta Lake
- Filesystem (S3, GCS, local)
- Stdout (for debugging)
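Sink tables use the same structure, with a sink block in place of source. The example below is a sketch only: the commitMode key and its at_least_once value are assumed names used for illustration of the commit-mode setting mentioned above, so check the Kafka connector reference for the exact options.
{
  "name": "enriched_events",
  "profile": "my_kafka_profile",
  "config": {
    "topic": "enriched-events",
    "type": {
      "sink": {
        "commitMode": "at_least_once"
      }
    }
  }
}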
Table Schema
The schema section defines the data format and field definitions. See the Schema documentation for full details.
{
  "schema": {
    "format": { "json": {} },
    "fields": [
      {
        "field_name": "event_id",
        "field_type": { "type": { "primitive": "Utf8" } },
        "nullable": false
      },
      {
        "field_name": "user_id",
        "field_type": { "type": { "primitive": "Int64" } },
        "nullable": false
      },
      {
        "field_name": "event_time",
        "field_type": { "type": { "primitive": "DateTime" } },
        "nullable": false
      }
    ]
  }
}
Pipelines
Pipelines are the heart of Laminar. A pipeline is SQL-based stream processing logic that reads from source tables, transforms data, and writes to sink tables.
What is a Pipeline?
Think of a pipeline as a continuously running query that processes data as it arrives. Unlike batch queries that run once and finish, streaming pipelines run indefinitely, processing events in real time.
Pipeline SQL
Pipelines are defined using standard SQL. They reference tables you've already created:
INSERT INTO events_iceberg
SELECT
  event_id,
  user_id,
  event_type,
  event_time,
  properties
FROM user_events
WHERE event_type != 'heartbeat'
Jobs
Jobs are the runtime execution units of pipelines. When you start a pipeline, Laminar creates a job that manages the actual data processing.
What is a Job?
A job represents a running instance of a pipeline. It manages:
- Task parallelism
- Resource allocation
- State management
- Checkpointing
- Failure recovery
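State management and checkpointing matter most when the pipeline SQL is stateful, for example a windowed aggregation that must remember per-window counts between checkpoints. The sketch below illustrates the idea; the event_counts sink and the tumble window syntax are assumptions borrowed from common streaming SQL dialects, not confirmed Laminar syntax (see the SQL Reference for the supported functions).
INSERT INTO event_counts
SELECT
  user_id,
  tumble(interval '1 minute') AS window,
  count(*) AS event_count
FROM user_events
GROUP BY user_id, window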
Job Metrics
Key metrics to monitor:
- Records In: Events received from sources
- Records Out: Events written to sinks
- Throughput: Events per second
- Latency: Processing delay
- Backpressure: Buildup of unprocessed records caused by a slow downstream operator or sink
- Checkpoint Duration: Time to save state
Putting It All Together
Here's how all the concepts work together in a typical workflow:
1. Create a Kafka Profile
Configure the connection to your Kafka cluster.
2. Create an Iceberg Profile
Configure the connection to your lakehouse.
3. Create Source Table
Define a raw_events table that points to a Kafka topic, using the Kafka profile.
4. Create Sink Table
Define an events_iceberg table that points to an Iceberg table, using the Iceberg profile.
5. Create Pipeline
Write SQL to transform and route data:
INSERT INTO events_iceberg
SELECT
  event_id,
  user_id,
  event_type,
  event_time,
  CASE
    WHEN event_type = 'purchase' THEN 'transaction'
    ELSE 'activity'
  END AS category
FROM raw_events
WHERE user_id IS NOT NULL
6. Start Pipeline
Start the pipeline via the UI or the lmnr CLI. A job is created and begins processing.
7. Monitor
Watch the job's metrics, check its logs, and verify that data is landing in Iceberg, for example with a query like the one below.
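One way to verify that data is landing is to query the sink table from a query engine that can read the Iceberg table. A minimal sketch, assuming the table is exposed under the same events_iceberg name in that engine:
-- Run this against the Iceberg table from your query engine of choice, not inside Laminar;
-- events_iceberg is the sink table created in step 4, category is added by the pipeline SQL in step 5.
SELECT category, count(*) AS events
FROM events_iceberg
GROUP BY category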
Data Flow
Data flows from the Kafka topic into the raw_events source table, through the pipeline's SQL executed by a job, and out through the events_iceberg sink table into your Iceberg lakehouse.
What's Next?
- SQL Reference - Full SQL syntax and functions
- Connectors - Detailed connector configuration
- Deployment - Deploy to production