Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Capture

An API that gets out of your way
It’s so easy, we’ve embedded a bunch of examples right here. Copy some of these requests into your terminal and check out what happens.
With wrappers in Ruby, PHP, Python and more, you can get started in minutes. Learn More ➤

As complexity grew…
Started out with
We had a problem, so we thought to use …
Then we had a ProblemFactory

As data volume grew…
Database scalability is a complicated topic…
Started out with
Had to make sure it was web scale
Distributed transactions
Change Data Capture

Flink Forward - San Francisco 2022
Jeff Chao
Staff Engineer / Tech Lead for Change Data Capture Infrastructure at Stripe

Agenda
1 CDC at Stripe
Change Data Capture (CDC) is widely used at Stripe to capture data changes from databases without critically impacting database reliability and scalability. CDC powers many critical financial use cases at Stripe such as the Stripe Dashboard, Stripe Search, Sigma, and Financial Reporting.
2 Aggregating Change Events
Change Event Streams are ubiquitous across Stripe given the vast number of applications and employees generating datasets worldwide. Change Event Streams are independent from one another, which leads to the typical challenges in distributed systems. One of the major use cases revolves around aggregating individual change events of a database transaction to support Stripe’s payments infrastructure.
3 How it Started, How it Ended
From idea to production, things may seem straightforward at first, but the details matter. We detail our journey of how we leveraged Flink for Change Data Capture at Stripe in order to uphold the highest data quality standards. Freshness, Coverage, and Correctness SLOs are paramount to the success of platforms and applications running on top of our CDC infrastructure.


Billing
Capital
Checkout
Connect
Invoicing
Corporate Card
Climate
Atlas
Radar
Sigma
Payouts
Payments
Terminal
Treasury
Issuing
Revenue Recognition
Payment Links
Tax
Identity
Elements
Data Pipeline
Financial Connections

CDC at Stripe
> 8000 Employees
23 Countries
30% Remote

Strict SLOs
Freshness
Coverage
Correctness

Building a Platform
Abstract Away Internals
Abstract away database internals such as sharding topology and ensure a datastore-agnostic transport.
Interoperable
Build a high-leverage platform that makes working with Change Events interoperable with other systems within the organization.
Operational Excellence
Keep toil minimal as we scale the number of datasets, ensure clean separation between infrastructure and user issues, create great operator experiences, reduce control plane and data plane blast radius, and maintain good operator tooling, developer experience, and processes.


Aggregating Change Events
Why?
● Product teams working with payments data use transactions
● Arbitrary number of tables in a database transaction
● They should be able to get transactions back out from the CDC path
● They shouldn’t have to become stream processing experts

Architecture
[Diagram: Vitess, Mongo, Debezium, Kafka, and Flink, with components labeled as either Platform or User.]

What is a Change Event?
{
"ts_utc" : 1659375300000,
"attributes": { ... },
"data": [
{
"operation": "CREATE",
"source": { ... },
"transaction": { ... },
"key": "some-unique-constraint",
"before": null,
"after": { ... },
"attributes": { ... }
}
]
}

{
"ts_utc" : 1659375300000,
"data": [
{
"source": { ... },
"transaction": {
"id" : "transaction-id",
"global_position": 1,
"source_position": 1
},
"before": null,
"after": { ... }
}
]
}
Stream: charges


Change Events Can Come From Anywhere
Stream: charges
{
"data": [
{"source": { ... }}
]
}
Stream: audits
{
"data": [
{"source": { ... }}
]
}
Stream: disputes
{
"data": [
{"source": { ... }}
]
}

Databases Have Transactions
BEGIN
INSERT INTO charges
UPDATE audits ...
COMMIT

What is a Transaction Metadata Event?
// BEGIN Marker
{
"ts_utc": 1659375300000,
"marker": "BEGIN",
"total_events": null,
"per_source_event_counts": null
}
// COMMIT Marker
{
"ts_utc": 1659375300000,
"marker": "COMMIT",
"total_events": 3,
"per_source_event_counts": [{ ... }]
}

// per_source_event_counts
[
{
"source" : "keyspace.table1",
"total_events": 1
},
{
"source" : "keyspace.table2",
"total_events": 1
}
]

Putting It All Together
-- 4 events
BEGIN -- Transaction Metadata Event
INSERT INTO charges -- Change Event
UPDATE audits ... -- Change Event
COMMIT -- Transaction Metadata Event

What is an Aggregated Change Event?
{
"ts_utc" : 1659375300000,
"data": [
{
"transaction": { "id": "txn1" },
"before": null,
"after": { ... }
},
{
"operation": "UPDATE",
"before": { ... },
"after": { ... }
}
]
}
● One transaction with two events having the same transaction ID.
● Events may arrive from an arbitrary number of tables.

Flink Job Graph
Transaction Metadata Event Stream (one) and Change Event Stream (many; one per table) → Flatmap → Windowed Aggregation → Aggregated Change Event Stream
Windowed Aggregation → Side Output

Multiple Sources
Join
Union
Connect

Join
Joins elements of the same key within the same window.
● Produces pairwise elements
[Diagram over time: Change Events (Event 1, Event 2, Event 3) and Transaction Metadata Events (BEGIN, COMMIT, BEGIN, COMMIT).]
Output: (Event 1, BEGIN), (Event 1, COMMIT), (Event 2, BEGIN), (Event 2, COMMIT), (Event 3, BEGIN), (Event 3, COMMIT)

Union
Unions multiple streams of the same type into a single stream.
● Requires streams of the same type
[Diagram over time: Change Events (Event 1, Event 2, Event 3) and Transaction Metadata Events.]
Output: none; won’t compile because the streams are of different types.

Connect
Unions multiple streams, potentially of different types.
● Similar to Unions
[Diagram over time: Change Events (Event 1, Event 2, Event 3) and Transaction Metadata Events (BEGIN, COMMIT).]
Output (single elements, not pairs): Event 1, BEGIN, Event 2, COMMIT, Event 3, BEGIN, COMMIT

What Do We Need?
● Support for streams of different types
● Support for flexible stream combination semantics
● Don’t need pairwise outputs

Flink Job Definition
val mainStream =
transactionMetadataEventStream // uid and name omitted.
.connect(changeEventStream) // Union different types.


Connected Streams
Either
Custom

Either
Wraps an event containing one of two types, either from left or right stream.
● Out-of-box
● No concept of keys
[Diagram over time: Change Events (Event 1, Event 2, Event 3) and Transaction Metadata Events. Event 1 is wrapped with Either.left set and Either.right = null; BEGIN is wrapped with Either.left = null and Either.right set; …]

Custom
Wraps an event containing one of two types, either from left or right stream, and a common key among both events.
● Small and simple code addition
● Need to extract keys
[Diagram over time: Change Events (Event 1, Event 2, Event 3) and Transaction Metadata Events. Event 1 is wrapped with WrappedEvent.key = txn-1, WrappedEvent.left set, and WrappedEvent.right = null; BEGIN is wrapped with WrappedEvent.key = txn-1, WrappedEvent.left = null, and WrappedEvent.right set; …]
What Do We Need?
● Wrap elements of a connected stream
● Be able to identify keys to support aggregations later

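val mainStream =
  transactionMetadataEventStream // uid and name omitted.
    .connect(changeEventStream) // Union different types.
    .flatMap(new WrappedEventFunction) // Like Either type, but with extra fields.
    .keyBy(_.key) // Group events with the same transaction ID.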
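The wrap-then-key step can be sketched outside Flink in a few lines of Python. This is only an illustration of the idea, not the talk's `WrappedEventFunction`: the `WrappedEvent` fields mirror the slide, but the choice of which side holds which event type, the helper names, and the assumption that a metadata event carries its transaction ID in an `id` field are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class WrappedEvent:
    key: str                      # transaction ID common to both event types
    left: Optional[dict] = None   # change event (assumed side), if any
    right: Optional[dict] = None  # transaction metadata event (assumed side), if any


def wrap_change(event: dict) -> WrappedEvent:
    # The change event exposes its transaction ID under "transaction".
    return WrappedEvent(key=event["transaction"]["id"], left=event)


def wrap_metadata(event: dict) -> WrappedEvent:
    # Hypothetical: assumes the metadata event carries its transaction ID as "id".
    return WrappedEvent(key=event["id"], right=event)


def key_by(events: List[WrappedEvent]) -> Dict[str, List[WrappedEvent]]:
    # Group wrapped events by transaction ID, preserving arrival order.
    grouped: Dict[str, List[WrappedEvent]] = {}
    for event in events:
        grouped.setdefault(event.key, []).append(event)
    return grouped


stream = [
    wrap_metadata({"id": "txn-1", "marker": "BEGIN"}),
    wrap_change({"transaction": {"id": "txn-1"}, "operation": "CREATE"}),
    wrap_change({"transaction": {"id": "txn-2"}, "operation": "UPDATE"}),
    wrap_metadata({"id": "txn-1", "marker": "COMMIT", "total_events": 1}),
]
by_txn = key_by(stream)
```

Because every wrapped element exposes the same `key`, a single keyed stream can carry both event types, which is what a plain Either type cannot offer.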


Aggregation Characteristics
● Arbitrary number of Change Event Streams
● One Transaction Metadata Event Stream
● Change Events must have the same transaction IDs
● Handle late arriving or duplicate Change Events and Transaction Metadata Events
● Don’t result in infinite state growth

Windowing
Tumbling
Sliding
Session

Tumbling Windows
Assigns elements to windows of a fixed size.
● Windows don’t overlap
● Late-arriving events? Add delay.
● Large delay? Trade-off: Freshness vs Correctness.
● Not quite right…
[Diagram over time: Change Events (Event 1, Event 2, Event 3) and Transaction Metadata Events (BEGIN, COMMIT) assigned to fixed-size windows.]
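A small Python sketch shows why fixed-size windows are "not quite right" for transactions. It uses the usual alignment rule for tumbling windows (window start = timestamp minus timestamp modulo size); the timestamps and the one-second window size are made up for illustration.

```python
from typing import Dict, List, Tuple


def tumbling_window_start(ts_ms: int, size_ms: int) -> int:
    # Align the timestamp down to the start of its fixed-size window.
    return ts_ms - (ts_ms % size_ms)


# Hypothetical events of ONE transaction that straddles a window boundary.
events: List[Tuple[str, int]] = [
    ("BEGIN", 1659375299000),
    ("Event 1", 1659375299500),
    ("Event 2", 1659375300200),
    ("COMMIT", 1659375300400),
]

windows: Dict[int, List[str]] = {}
for name, ts in events:
    windows.setdefault(tumbling_window_start(ts, 1000), []).append(name)
```

The transaction ends up split across two windows, so neither window alone holds a complete aggregate. Adding delay narrows the problem but cannot remove it, because the window boundaries are unrelated to the transaction boundaries.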

Sliding Windows
Assigns elements to windows of a fixed size, but with a slide interval.
● Almost like a tumbling window, but with windows overlapping
● Late-arriving events? Same as tumbling windows.
● Slide interval? Explosion of windows
[Diagram over time: Change Events (Event 1, Event 2, Event 3) and Transaction Metadata Events (BEGIN, COMMIT) assigned to overlapping windows.]

Session Windows
Groups elements that are seen relatively close to each other in time.
● Arbitrarily-sized windows; no fixed start and end
● Windows close based on a defined gap of inactivity
● Session gap too small? Incomplete aggregates
● Session gap too big? Trade-off: Freshness vs Correctness
[Diagram over time: Change Events (Event 1, Event 2, Event 3) and Transaction Metadata Events (BEGIN, COMMIT, BEGIN, COMMIT) grouped into sessions.]
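The gap trade-off can be made concrete with a short Python sketch of session grouping for a single key. It is a simplified model (sorted timestamps, a new session whenever the inactivity gap is reached), and the timestamps and gap values are invented for illustration.

```python
from typing import List


def session_windows(timestamps: List[int], gap_ms: int) -> List[List[int]]:
    # Start a new session whenever the pause since the previous element
    # reaches the inactivity gap; otherwise extend the current session.
    sessions: List[List[int]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap_ms:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


# Hypothetical arrival times (ms) for one transaction:
# BEGIN at 0, Event 1 at 40, then a slow stream delivers Event 2 at 300, COMMIT at 320.
arrivals = [0, 40, 300, 320]

small_gap = session_windows(arrivals, gap_ms=100)
big_gap = session_windows(arrivals, gap_ms=500)

# With the big gap the session can only close after 500 ms of silence,
# so the earliest emission time is the last arrival plus the gap.
earliest_emit_big_gap = big_gap[0][-1] + 500
```

The small gap splits the transaction into two incomplete aggregates, while the big gap keeps it whole but delays the result, which is the Freshness vs Correctness trade-off named on the slide.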

Global Windows
Assigns elements to a single window.
● Only a single window per key
● Window never closes
● Outputs never get evaluated and materialized
● Needs more…
[Diagram over time: Change Events (Event 1, Event 2, Event 3) and Transaction Metadata Events (BEGIN, COMMIT, BEGIN, COMMIT) in a single window.]

Global Windows + Custom Stateful Trigger
Assign elements to a Global Window and add a custom stateful trigger.
● Flexibly define open/close conditions for non-overlapping windows
● Reasonably handle late-arriving events
● Avoid infinite state growth and reduce likelihood of incomplete aggregates

What Makes an Aggregation Complete?
● BEGIN transaction marker seen
● COMMIT transaction marker seen
● All Change Events of the transaction seen
● All Change Events are globally and locally ordered

Custom Stateful Trigger: TransactionBoundaryTrigger
if transaction metadata event:
    if begin transaction marker:
        update begin marker state
    else:
        update commit marker state
        update bitmap state using commit marker’s total event count
    set timeout state and register event time timer
else:
    update bitmap state with change event’s global position
    set timeout state and register event time timer
if should trigger(begin, commit, total events):
    clear window
    TriggerResult.FIRE_AND_PURGE
else:
    TriggerResult.CONTINUE

Reference
// ChangeEvent#transaction
{
"id" : "transaction-id",
"global_position": 1,
"source_position": 1
}
// TransactionMetadataEvent
{
"ts_utc": 1659375300000,
"marker": "COMMIT",
"total_events": 3
}
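The completeness check in the pseudocode can be exercised with a plain Python model. This is a sketch under stated assumptions, not Stripe's trigger: a set stands in for the bitmap state, timers and timeouts are left out, and it assumes `total_events` counts change events only and that `global_position` values run from 1 to `total_events` within a transaction.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

FIRE_AND_PURGE = "FIRE_AND_PURGE"
CONTINUE = "CONTINUE"


@dataclass
class TransactionState:
    begin_seen: bool = False
    commit_seen: bool = False
    total_events: Optional[int] = None
    positions: Set[int] = field(default_factory=set)  # stands in for the bitmap

    def complete(self) -> bool:
        # Complete once both markers and every expected position were seen.
        return (
            self.begin_seen
            and self.commit_seen
            and self.total_events is not None
            and self.positions == set(range(1, self.total_events + 1))
        )


def on_element(state: TransactionState, event: dict) -> str:
    if "marker" in event:  # transaction metadata event
        if event["marker"] == "BEGIN":
            state.begin_seen = True
        else:
            state.commit_seen = True
            state.total_events = event["total_events"]
    else:  # change event; a duplicate position is absorbed by the set
        state.positions.add(event["transaction"]["global_position"])
    return FIRE_AND_PURGE if state.complete() else CONTINUE


state = TransactionState()
out_of_order = [
    {"transaction": {"id": "txn-1", "global_position": 2}},
    {"marker": "BEGIN"},
    {"marker": "COMMIT", "total_events": 2},
    {"transaction": {"id": "txn-1", "global_position": 2}},  # duplicate
    {"transaction": {"id": "txn-1", "global_position": 1}},
]
results = [on_element(state, event) for event in out_of_order]
```

Tracking positions rather than a simple counter is what lets the check tolerate duplicates and out-of-order arrival: the window fires only when the last missing position shows up.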


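val mainStream =
  transactionMetadataEventStream // uid and name omitted.
    .connect(changeEventStream) // Union different types.
    .flatMap(new WrappedEventFunction) // Like Either type, but with extra fields.
    .keyBy(_.key) // Group events with the same transaction ID.
    .window(GlobalWindows.create)
    .trigger(new TransactionBoundaryTrigger(...)) // Flexible windowing semantics.
    .process(new KeyedProcessor(...))

mainStream // Side output to DLQ.
  .getSideOutput(...)
  .addSink(...)

mainStream // Output aggregated change events.
  .addSink(...)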


How it Started, How it Ended
From Idea to Production
State
Platform
Coverage

State

Observations
● Infinite keys due to continuous stream of new transactions
● Using a Global Window; possible windows not closing properly
● No trigger timeouts firing
● No watermarks being generated

Observations
[Diagram: Source Sub Tasks reading charges (partitions = 2), audits (partitions = 1), disputes (partitions = 1), and Transaction Metadata Events; some Sub Tasks are idle.]

Fix
● Fixed an upstream issue where transaction IDs were getting mixed up
● Reduce parallelism on Source Sub Tasks for all streams
● Make sure parallelism ≤ ∑ Topic Partitions
● Generally, check with SplitEnumerator classes
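The parallelism rule above can be illustrated with a toy partition assignment in Python. The round-robin scheme here is a deliberate simplification and an assumption; the actual assignment is decided by the source's SplitEnumerator, which is why the slide suggests checking those classes.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, parallelism: int) -> Dict[int, List[int]]:
    # Simplified round-robin assignment of partitions to source sub tasks.
    assignment: Dict[int, List[int]] = {subtask: [] for subtask in range(parallelism)}
    for partition in range(num_partitions):
        assignment[partition % parallelism].append(partition)
    return assignment


def idle_subtasks(num_partitions: int, parallelism: int) -> List[int]:
    # Sub tasks that received no partition never read data or emit watermarks.
    assignment = assign_partitions(num_partitions, parallelism)
    return [subtask for subtask, parts in assignment.items() if not parts]


over_provisioned = idle_subtasks(num_partitions=1, parallelism=4)
right_sized = idle_subtasks(num_partitions=2, parallelism=2)
```

With one partition and four sub tasks, three sub tasks sit idle and never produce a watermark, which stalls event time downstream. Keeping parallelism at or below the total partition count leaves no sub task without input.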

Observations
● State size still growing, but slower
● Event time timers firing, sometimes
● Watermarks are being generated, but not for all sub tasks

New Observations
[Diagram: Source Sub Tasks reading charges (partitions = 2), audits (partitions = 1), disputes (partitions = 1), and Transaction Metadata Events; one input is a low volume stream.]

Possible Fix
Switch from event time to processing time
● Less precise
● Could cause premature trigger firing, resulting in incomplete aggregates

Actual Fix
Add idleness property on sources
● Can still use event time
● More precise
● Not perfect; can still result in incomplete aggregates in edge cases
● That’s the reality of streaming
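Why idleness unblocks event time can be shown with a small Python model of watermark combination: a downstream operator advances to the minimum watermark across its non-idle inputs. Flink exposes idleness through `WatermarkStrategy.withIdleness`, though the talk does not say which exact setting was used; the stream names and timestamps below are illustrative.

```python
from typing import Dict, Iterable, Optional


def combined_watermark(
    inputs: Dict[str, Optional[int]], idle: Iterable[str] = ()
) -> Optional[int]:
    # Downstream watermark = minimum over inputs that are not marked idle.
    idle_set = set(idle)
    active = [wm for name, wm in inputs.items() if name not in idle_set]
    if not active or any(wm is None for wm in active):
        return None  # an active input has no watermark yet, so time cannot advance
    return min(active)


# "disputes" is a low volume stream that has not produced a watermark.
inputs = {"charges": 1659375300000, "audits": 1659375299000, "disputes": None}

stalled = combined_watermark(inputs)
advancing = combined_watermark(inputs, idle=["disputes"])
```

Without idleness the quiet stream holds the watermark back, so event time timers never fire. Marking it idle lets time advance, at the cost that events arriving later on that stream may be late, which matches the "not perfect" caveat on the slide.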

Platform

Observations
● Don’t want to redeploy every time a new dataset (Kafka Topic) is added
● Blows away Freshness SLO’s error budget
● Poor developer onboarding experience

Fix
● Instead of Kafka Topic List Subscriber, use Regex Subscriber
● Subscribe to all topics (for a keyspace) by default
● Control plane (external) service produces an event to Broadcast Stream
● On broadcast element, use Broadcast State to keep onboarded datasets in state
● On element, check Broadcast State and filter for onboarded datasets
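The broadcast-and-filter flow can be modeled in Python without Flink. The method names echo the two callbacks of a broadcast process function, but the class itself, the control event shape (`action`, `topic`), and the `payments.` topic prefix are hypothetical stand-ins.

```python
import re
from typing import Optional, Set

# Hypothetical keyspace prefix; stands in for the regex topic subscription.
TOPIC_PATTERN = re.compile(r"^payments\..+$")


class OnboardingFilter:
    def __init__(self) -> None:
        self.onboarded: Set[str] = set()  # stands in for Broadcast State

    def process_broadcast_element(self, control_event: dict) -> None:
        # Control plane events add or remove datasets without a redeploy.
        if control_event["action"] == "onboard":
            self.onboarded.add(control_event["topic"])
        elif control_event["action"] == "offboard":
            self.onboarded.discard(control_event["topic"])

    def process_element(self, topic: str, event: dict) -> Optional[dict]:
        # Every matching topic is consumed, but only onboarded ones pass through.
        if TOPIC_PATTERN.match(topic) and topic in self.onboarded:
            return event
        return None


onboarding = OnboardingFilter()
before = onboarding.process_element("payments.charges", {"id": 1})
onboarding.process_broadcast_element({"action": "onboard", "topic": "payments.charges"})
after = onboarding.process_element("payments.charges", {"id": 1})
```

The job reads every topic matching the pattern from the start, so onboarding a dataset becomes a state update rather than a deployment.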

Coverage

Observations
● Incomplete aggregates still happening, but not frequently
● Kafka by default is at-least-once delivery
● Many independent streams operating at different speeds

Fix
● Move incomplete aggregate measurement out of the Flink Job and into a system downstream
● New system needs to dedupe events… for all time?
● Storage will be expensive. Trade-off between confidence and cost-efficiency: KV store or bloom filter
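The bloom filter side of that trade-off can be sketched in Python. This is a minimal, illustrative filter and not the downstream system from the talk: the sizes, the hashing scheme, and the event ID format are assumptions. A bloom filter never misses an item it has seen, but it can wrongly report an unseen item as seen, which is the loss of confidence traded for lower storage cost.

```python
import hashlib
from typing import Iterable, Iterator, List


class BloomFilter:
    def __init__(self, num_bits: int = 65536, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely unseen; True means probably seen.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def dedupe(event_ids: Iterable[str], seen: BloomFilter) -> List[str]:
    fresh = []
    for event_id in event_ids:
        if not seen.might_contain(event_id):
            seen.add(event_id)
            fresh.append(event_id)
    return fresh


seen = BloomFilter()
# Hypothetical event IDs built from transaction ID and global position.
fresh = dedupe(["txn-1:1", "txn-1:2", "txn-1:1"], seen)
```

A KV store gives exact answers at a storage cost that grows with every event ever seen, whereas the filter above stays at a fixed size and accepts a small chance of dropping a genuinely new event.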


Wrap Up
● Change Data Capture (CDC) is widely used at Stripe to improve database reliability and scalability
● Flink is a critical component in Stripe’s CDC infrastructure that allows us to work with financial streaming data with high data quality guarantees
● Aggregating Change Events is relatively straightforward, but the details matter

Thank you!
