In Apache Flink (especially Table/SQL API), streams are often represented as changelogs rather than simple append-only rows.
The four event types tell downstream operators how a row has changed.
Here’s a clear breakdown 👇
The 4 Flink Changelog Event Types
| Event | Name | Meaning |
|---|---|---|
| +I | Insert | A new row is added |
| -U | Update Before | Old version of a row before an update |
| +U | Update After | New version of a row after an update |
| -D | Delete | A row is removed |
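You can watch these events yourself from the Flink SQL client by switching the result mode to changelog. A quick sketch, assuming an `orders` table already exists:

```sql
-- Show every result row together with its changelog op (+I/-U/+U/-D)
SET 'sql-client.execution.result-mode' = 'changelog';

-- Any updating query works; this aggregation is just illustrative
SELECT `user`, COUNT(*) AS order_cnt
FROM orders
GROUP BY `user`;
```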
1. +I (Insert)
What it means:
A brand-new row appears.
Example:
A user places their first order.
+I (order_id=101, user=Alice, amount=50)
This row did not exist before.
2. -U (Update Before)
What it means:
The old value of a row that is about to be updated.
It is emitted only in retract-style changelog streams; upsert streams carry only the new (update-after) row.
-U (order_id=101, user=Alice, amount=50)
3. +U (Update After)
What it means:
The new value of the same row after the update.
+U (order_id=101, user=Alice, amount=75)
👉 Together, -U and +U represent one logical update.
4. -D (Delete)
What it means:
A row is removed entirely.
-D (order_id=101, user=Alice, amount=75)
How Updates Work (Very Important)
Flink represents updates as two events:
-U (old row)
+U (new row)
Why?
Because this makes the stream fully consistent for:
- stateful operators
- joins
- sinks that need exact change semantics
Example Timeline
Imagine this SQL:
SELECT `user`, SUM(amount) FROM orders GROUP BY `user`;
Incoming orders:
- Alice orders 50
- Alice orders another 25
- Alice cancels the first order
Emitted changelog:
+I (Alice, 50)
-U (Alice, 50)
+U (Alice, 75)
-U (Alice, 75)
+U (Alice, 25)
No -D here: the aggregate row for Alice still exists (now with total 25), so the cancellation surfaces as an update, not a delete.
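To reproduce a timeline like this locally, the print connector is handy because it accepts all four event types. A minimal sketch, assuming made-up table and column names and a datagen source (a real orders stream would come from Kafka, CDC, etc.):

```sql
CREATE TABLE orders (
  order_id INT,
  `user`   STRING,
  amount   INT
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '1',
  'fields.user.length' = '1'  -- short keys so the same user repeats
);

CREATE TABLE user_totals (
  `user` STRING,
  total  INT
) WITH (
  'connector' = 'print'
);

-- The first row per user prints as +I; every later change to a
-- running total prints as a -U/+U pair
INSERT INTO user_totals
SELECT `user`, SUM(amount)
FROM orders
GROUP BY `user`;
```

(A true cancellation as in step 3 needs a retracting source such as CDC; datagen is append-only, so this sketch only shows +I/-U/+U.)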
Which Streams Produce Which Events?
| Stream Type | Possible Events |
|---|---|
| Append-only source | +I |
| Aggregation | +I, -U, +U |
| Join | +I, -U, +U, -D |
| Upsert stream | +I, +U, -D |
| Retract stream | all four |
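You can verify which events a particular query will produce with EXPLAIN CHANGELOG_MODE (a sketch; assumes the `orders` table from earlier):

```sql
EXPLAIN CHANGELOG_MODE
SELECT `user`, SUM(amount)
FROM orders
GROUP BY `user`;
-- The optimized plan annotates each operator with its changelog mode,
-- e.g. changelogMode=[I,UB,UA] (UB = update-before, UA = update-after)
```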
Mental Model
Think of Flink changelog events like database binlogs:
- +I → INSERT
- -U → UPDATE (old row)
- +U → UPDATE (new row)
- -D → DELETE
Next: (1) how Flink changelog events map to Kafka upsert topics, and (2) which sinks support which event types.
1️⃣ Mapping Flink Changelog Events to Kafka Upsert Topics
What is an upsert Kafka topic?
An upsert topic represents the latest state per key.
- Keyed by a primary key
- A record with the same key overwrites the previous value
- A tombstone (null value) means delete
Flink’s upsert-kafka connector is designed exactly for this model.
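A minimal sink definition for this model might look like the following (topic, server, and column names are placeholders):

```sql
-- The PRIMARY KEY determines the Kafka record key; a delete becomes
-- a tombstone (key with null value) on the topic
CREATE TABLE user_totals_upsert (
  user_id      STRING,
  total_amount DECIMAL(10, 2),
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'user-totals',
  'properties.bootstrap.servers' = 'localhost:9092',
  'key.format' = 'json',
  'value.format' = 'json'
);
```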
Event → Kafka Record Mapping
Assume:
Key = user_id, Value = (user_id, total_amount)

| Flink Event | Kafka Key | Kafka Value | Meaning |
|---|---|---|---|
| +I | key | value | Insert new row |
| +U | key | value | Update existing row |
| -D | key | null | Delete row (tombstone) |
| -U | ❌ ignored | ❌ ignored | Not written |
⚠️ Important:
Kafka upsert topics do NOT store -U events.
They only care about final state, not intermediate transitions.
Example
Flink emits:
+I (Alice, 50)
-U (Alice, 50)
+U (Alice, 75)
-D (Alice, 75)
Kafka upsert topic receives:
key=Alice, value=50
key=Alice, value=75
key=Alice, value=null ← tombstone (delete)
Why -U Is Dropped
Kafka has no concept of “update-before”.
Only the latest value per key matters.
When to Use upsert-kafka
✔ Aggregations
✔ Materialized views
✔ CDC-style streams
✔ State replication (idempotent, last-write-wins per key)
❌ Append-only event logs (use normal Kafka instead)
2️⃣ Which Sinks Support Which Event Types?
This is critical when designing Flink SQL pipelines.
Sink Capability Matrix
| Sink Type | +I | -U | +U | -D | Notes |
|---|---|---|---|---|---|
| Print / Debug | ✅ | ✅ | ✅ | ✅ | For debugging |
| Upsert Kafka | ✅ | ❌ | ✅ | ✅ | Requires primary key |
| Kafka (append) | ✅ | ❌ | ❌ | ❌ | Insert-only |
| JDBC (upsert mode) | ✅ | ❌ | ✅ | ✅ | PK required |
| JDBC (append) | ✅ | ❌ | ❌ | ❌ | No updates |
| Filesystem (CSV/Parquet) | ✅ | ❌ | ❌ | ❌ | Append-only |
| Hudi / Iceberg / Paimon | ✅ | ❌ | ✅ | ✅ | Table formats |
| Elasticsearch | ✅ | ❌ | ✅ | ✅ | Document ID = key |
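For example, a JDBC sink switches to upsert mode as soon as a primary key is declared (a sketch; URL and names are placeholders):

```sql
-- With a PRIMARY KEY the connector issues upsert statements
-- (e.g. INSERT ... ON CONFLICT for PostgreSQL) and DELETEs for -D
CREATE TABLE user_totals_jdbc (
  user_id      STRING,
  total_amount DECIMAL(10, 2),
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:postgresql://localhost:5432/analytics',
  'table-name' = 'user_totals'
);
```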
Why Many Sinks Reject -U
-U is an internal consistency signal, not a final state.
Most sinks only want:
- Insert
- Update (after)
- Delete
So Flink:
- consumes -U internally
- does NOT forward it to sinks like Kafka, JDBC, or Iceberg
Common Error You’ll See
Sink does not support consuming update-before records
Fix options:
- Use an upsert sink
- Define a PRIMARY KEY
- Convert to append-only where logically valid (see the sketch below)
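For the last option, one common pattern is a window TVF aggregation: each window's result is emitted once and never retracted, so the stream becomes append-only. A sketch, assuming `orders` has an event-time attribute `order_time` and `kafka_append_sink` is a hypothetical insert-only table:

```sql
-- One final row per (window, user): safe for insert-only sinks
INSERT INTO kafka_append_sink
SELECT window_start, `user`, SUM(amount) AS total
FROM TABLE(
  TUMBLE(TABLE orders, DESCRIPTOR(order_time), INTERVAL '1' HOUR))
GROUP BY window_start, window_end, `user`;
```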
Quick Rule of Thumb
- If your query has GROUP BY, JOIN, or DISTINCT → it produces updates
- If your sink doesn't support updates → the job will fail
- Kafka upsert sinks solve 80% of these cases
