One Tuesday at Kikoff
We merged a Dart-only UI fix at 11:03. Green CI at 11:41. Fastlane ran `shorebird patch ios` at 11:57. Patch live on the gated track by 12:04. By the time the team went home, many eligible users had already cold-started into it. No app store review. No "please update your app."
That's the loop React Native shops have been living in since 2015. It's new for Flutter, and it took a forked engine to get there.
The problem was never mobile. It was compiled mobile.
Every mobile app pays the store-review tax: a 1–3 day review window plus a multi-week user-update curve on every binary change. The web sidestepped this in 2010 because there's no binary to ship. How a mobile stack solves it depends entirely on how that stack compiles and runs code.
Interpreted runtimes (React Native, Cordova, Ionic). JavaScript is loaded and executed at runtime by an engine embedded in the native shell. Swap the JS bundle, you swap the app. Microsoft's CodePush shipped in 2015 and has been table stakes for RN shops ever since.
AOT-compiled runtimes (Flutter). Dart is compiled to native ARM ahead of time. No interpreter. The Flutter engine loads a pre-compiled binary and executes it. You can't just swap the code, because the code is machine instructions, baked into the binary your users installed from the store.
That constraint is what made Flutter fast, predictable, and platform-consistent. It's also what kept code push out of reach. Until it didn't.
Shorebird's trick
Shorebird forked the Flutter engine and taught it to load a second AOT Dart snapshot at runtime, merged over the one baked into the binary. Crucially, patches are still AOT-compiled Dart (not JIT'd, not interpreted), so the runtime characteristics that made you pick Flutter in the first place don't change. The only thing that changes is that "swap the code" is now a thing you can do.
From the app's perspective, Shorebird is a Dart package (shorebird_code_push) that ships a ShorebirdUpdater. Call checkForUpdate(), call update(), the next cold start runs on the patched snapshot. From the engineer's perspective, it's one CLI command, one YAML file, one Fastlane lane, and one GitHub Actions workflow.
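In sketch form, the client-side surface is small (method and type names as described in this post; gating and error handling elided):

```dart
import 'package:shorebird_code_push/shorebird_code_push.dart';

Future<void> maybePatch() async {
  final updater = ShorebirdUpdater();
  final status = await updater.checkForUpdate();
  if (status == UpdateStatus.outdated) {
    // Downloads and stages the new AOT snapshot on disk.
    await updater.update();
    // The patched snapshot loads at the next cold start, not mid-session.
  }
}
```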
One tradeoff worth naming up front: the Shorebird engine tracks Flutter stable at a lag. You don't upgrade Flutter when you want; you upgrade Flutter when Shorebird's fork catches up. In practice this is weeks, not months, and Shorebird has been prompt about it, but it's a real constraint on bleeding-edge Flutter adoption.
Three stages, one tax
Quick frame we'll refer back to throughout the post. Mobile teams live in one of three stages:

1. Stage 1 — store releases only. Every change, however small, pays the full review-and-adoption tax.
2. Stage 2 — code push. App-code-level changes ship over the air; native changes still ride store releases.
3. Stage 3 — server-driven UI. Screens are defined on the server and applied on the next network request.

Each stage is strictly faster than the one below, and strictly more expensive to maintain. Most apps belong in Stage 2. A handful belong in Stage 3; the rest have no business there.
This post is about landing Stage 2 cleanly with Flutter (the stage that didn't have a credible answer until Shorebird), and what Stage 3 looks like when you eventually graduate.
The cross-platform calculus just shifted
The conventional RN vs. Flutter matrix looked like this for nearly a decade: RN wins on JavaScript-ecosystem flexibility, Flutter on performance and rendering consistency, most rows a matter of team fit — and then the last row, continuous delivery, where only RN had an answer.

The last row was the ugly one. For a long time, if continuous delivery was a hard requirement, RN was the only real answer, regardless of how much you wanted Flutter's performance or widget consistency.
That row is now close to tied, not fully tied. RN's CodePush can swap JS in a running session; Shorebird patches only activate on cold start. The cold-start floor (more on it below) is a real gap, not a rounding error. But for most teams, most of the time, the delta between "seconds after the user taps a button" and "seconds after the user next opens the app" isn't what's driving the stack choice. Continuous delivery stops forcing your hand. The RN vs. Flutter decision is back to being about what it always should have been: flexibility vs. performance, rendering model, ecosystem, team fit.
Where code push stops: the cold-start floor
Code push on any stack has a hard floor: cold start. Patches apply at engine initialization, not on session resume. A user who backgrounds your app and comes back two hours later is still running the old code. The patch only activates when the process is fully killed and reopened.
Code push adoption ≈ user cold-start frequency. That's the floor. Store-release adoption is that floor plus review lag plus voluntary-update friction. Which is why a store release can take 30 days to reach the penetration a patch reaches in one.
Large native teams route around this with server-driven UI: changes apply on the next network request, not the next cold start. Genuinely faster than code push, but it costs you a rendering framework, a layout DTO protocol, and the engineering team to maintain both. Significant investment. Most teams don't need it.
For most apps, the cold-start floor is the right place to land. Dramatically better than the store curve, dramatically cheaper than SDUI.
Now, the wiring
Enough theory. Here's what the wiring looks like.
One YAML, three app IDs
shorebird.yaml sits at the repo root. Each Flutter flavor maps to a separate Shorebird app ID:
app_id: <app-uuid-1>
flavors:
  flavor_a: <app-uuid-1>
  flavor_b: <app-uuid-2>
  flavor_c: <app-uuid-3>
auto_update: false
Two choices worth naming.
Separate IDs per flavor. A bad patch in one flavor can't leak into another. Flavor isolation is free at setup time and priceless the day you ship a regression.
auto_update: false. This is the important one. The default behavior is "Shorebird checks and applies patches on launch, transparently." We don't want that. We want the client to own the state machine: decide when to check, decide whether to apply, decide how to tell the user. Everything downstream depends on flipping this one line.
Owning the state machine yourself
With auto_update: false, every patch check is explicit. Our ShorebirdClient sits in lib/clients/shorebird_client.dart, registered as a lazy singleton in dependencies.dart. Its entire life is four public methods: initialize(), checkForUpdates(), onPatchNotificationReceived(), and a static getPatchNumberSafely() that other clients call.
First, a gate:
bool get isEnabled =>
_flavorType == FlavorType.kikoff &&
_remoteExperimentClient.isEnabled(RemoteFeatureFlag.shorebirdEnabled);
Two dimensions of off switch: flavor (only kikoff rolls today) and a Statsig feature flag (shorebirdEnabled). If Shorebird itself has an outage, if we ship a bad patch infrastructure-wide, if anything at all surprises us in production, we flip Statsig and the whole pipeline goes dark without a release.
A patch can reach a user three ways
A patch can reach a user in three ways:
// 1. Cold start: from remote_experiment_client once Statsig is ready
sl<ShorebirdClient>().initialize();
// 2. App resume: from app_bloc, catches patches published while backgrounded
if (sl.isRegistered<ShorebirdClient>()) {
sl<ShorebirdClient>().checkForUpdates();
}
// 3. Silent FCM push: from plugin_client, for freshly shipped patches
if (sl.isRegistered<ShorebirdClient>()) {
sl<ShorebirdClient>().onPatchNotificationReceived();
}
The silent push is the geeky one. When we publish a patch to gated, we fan out a silent FCM to the target cohort. Any user with the app in the foreground or recently backgrounded pulls the patch within seconds, no cold start required. It doesn't eliminate the cold-start floor (the patch still only activates on next cold start), but it eliminates the propagation delay on top of it.
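The wiring for that hook, sketched with firebase_messaging (the `shorebird_patch` message key is our convention, not anything Shorebird defines):

```dart
import 'package:firebase_messaging/firebase_messaging.dart';

void listenForPatchPushes() {
  // Silent, data-only FCM: no notification payload, nothing visible to
  // the user. We only use it as a "check now" signal.
  FirebaseMessaging.onMessage.listen((RemoteMessage message) {
    if (message.data['type'] == 'shorebird_patch' &&
        sl.isRegistered<ShorebirdClient>()) {
      sl<ShorebirdClient>().onPatchNotificationReceived();
    }
  });
}
```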
Defense in depth: two gates, not one
Every check we make pins track: UpdateTrack('gated'). Patches ship to gated first (via shorebird patch --track=gated). Only clients asking for that track see them. Since the client-side isEnabled check gates whether to ask at all, we get defense in depth: a user without the Statsig flag never pings the gated track, and the gated track never serves stable users. Promotion to stable is a console operation that takes seconds once crash telemetry bakes.
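Composed, the two gates look roughly like this (a sketch using the names above: the early return is the client-side Statsig/flavor gate, the track argument is the server-side one):

```dart
Future<void> checkForUpdates() async {
  // Gate 1, client side: unflagged users never even ask.
  if (!isEnabled) return;
  // Gate 2, server side: only the gated track is queried, so stable
  // users never see a patch that hasn't baked.
  final status =
      await _updater.checkForUpdate(track: UpdateTrack('gated'));
  if (status == UpdateStatus.outdated) {
    await _updater.update(track: UpdateTrack('gated'));
  }
}
```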
The race we didn't see coming
Shorebird's readCurrentPatch() is an async call that, under the hood, hops through Isolate.run to the Rust-backed updater via FFI. During cold start, two different clients want the patch number for different reasons: our TrackerClient stamps it on every Amplitude event, and our RemoteExperimentClient passes it to Statsig as a targeting property. Both initialize in parallel. Both called readCurrentPatch() independently. Because they're racing against each other's async initialization, one would occasionally resolve before the updater finished its first FFI probe, and we'd get mismatched values across the two clients: Statsig bucketing on null, Amplitude events stamped with patch 42. Analytics gets noisy in exactly the way you only notice a week later when you're trying to debug something else.
Fix: a one-shot Completer, created by the first caller, awaited by everyone else.
static int? _cachedPatchNumber;
static Completer<int?>? _patchNumberCompleter;
static Future<int?> getPatchNumberSafely() async {
if (_patchNumberCompleter?.isCompleted ?? false) {
return _cachedPatchNumber;
}
if (_patchNumberCompleter != null) {
return _patchNumberCompleter!.future;
}
_patchNumberCompleter = Completer<int?>();
try {
final updater = ShorebirdUpdater();
final patch = await updater.readCurrentPatch();
_cachedPatchNumber = patch?.number;
} on Exception catch (_) {
_cachedPatchNumber = null;
}
_patchNumberCompleter!.complete(_cachedPatchNumber);
return _cachedPatchNumber;
}
Nothing exotic. But: the Shorebird docs don't tell you to do this, and the bug is the kind you write off as "flaky analytics" until you actually notice it.
The rollback detection trick
One failure mode is unique to code push: you ship a patch, it's broken, you pull it server-side. Shorebird removes the patch from disk during the next checkForUpdate() call. But the app in memory is still running the bad bundle. The user is running broken code and doesn't know it. Until they restart. Which they might not do for hours.
The trick: read the patch number before the check, and again after. If it was present and now it's gone, Shorebird just rolled you back.
case UpdateStatus.upToDate:
// Shorebird removes the patch from disk during checkForUpdate()
// on a rollback. The app is still running the rolled-back patch
// in memory, so we need to nudge the user to restart.
final patchAfterCheck = await _updater.readCurrentPatch();
if (patchBeforeCheck != null && patchAfterCheck == null) {
_log('Patch ${patchBeforeCheck.number} was rolled back, restart required');
_trackDiagnostic(
AnalyticsEvent.shorebirdPatchStatus,
properties: {
'step': 'rollback_detected',
'rolled_back_patch': patchBeforeCheck.number,
},
);
_notifyPatchReady();
}
If we rolled back a bad patch, the same "restart to apply" banner appears, except "applying" now means reverting. Users running the broken version get nudged out of it within one session instead of however long it takes them to close and reopen the app on their own.
The state machine, in Amplitude
A runtime that silently swaps code in production is a runtime you cannot debug with logs. By the time you'd read one, the patch would already be applied. Or worse, silently failing on a subset of devices you can't identify.
So we emit an Amplitude event at every transition, with a step property keyed to exactly where in the lifecycle we are.
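Concretely, the check path emits something like this (a sketch, not verbatim production code; the step values are our naming, and the broad Exception catch stands in for however you classify download failures):

```dart
Future<void> _checkAndTrack() async {
  _trackDiagnostic(AnalyticsEvent.shorebirdPatchStatus,
      properties: {'step': 'check'});
  final status = await _updater.checkForUpdate(track: UpdateTrack('gated'));
  if (status == UpdateStatus.outdated) {
    _trackDiagnostic(AnalyticsEvent.shorebirdPatchStatus,
        properties: {'step': 'outdated'});
    try {
      await _updater.update(track: UpdateTrack('gated'));
      _trackDiagnostic(AnalyticsEvent.shorebirdPatchStatus,
          properties: {'step': 'download_complete'});
    } on Exception catch (e) {
      _trackDiagnostic(AnalyticsEvent.shorebirdPatchStatus,
          properties: {'step': 'download_failed', 'error': '$e'});
    }
  }
}
```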
In Amplitude this composes into a funnel: check → outdated → download_complete → (next session) → current_patch advanced. When a patch isn't landing on a cohort, the funnel points straight at the drop-off. Download failing? Restart not happening? SDK refusing to apply? You see it.
This is the single highest-leverage piece of the integration. If you do nothing else from this post, instrument your patch lifecycle.
Ask, don't swap
When a patch is ready, we don't silently swap the Dart bundle mid-session. Not because we chose not to. We can't. Shorebird's trick is loading a second AOT snapshot at engine init; once the engine is running, the code is fused into memory and there's no API to unload it, rebind every live object to new type definitions, and resume without corrupting in-flight state. Mid-session swap would require a different engine, i.e., a Flutter fork of our own. And at the point you're maintaining a Flutter fork to skip cold start, that same engineering cost buys you server-driven UI, which gets fixes live on the next network call instead. Stage 3 wins that trade.
So we don't swap mid-session. Instead, we call _updater.update(track: track) to stage the patch, log a shorebirdPatchDownloaded Amplitude event, and flip a ValueNotifier<bool> that our banner system subscribes to. The patch activates on next cold start, and we use the banner to pull that cold start in.
One dismiss-semantics wrinkle: ValueNotifier only fires when the value changes. If the user dismissed a previous banner without restarting, the flag is still true, and a second successful download won't re-prompt them. Options are (a) reset the flag on dismiss and flip it on every ready event, or (b) reach into notifyListeners() directly:
void _notifyPatchReady() {
if (patchReadyNotifier.value) {
// ignore: invalid_use_of_protected_member
patchReadyNotifier.notifyListeners();
} else {
patchReadyNotifier.value = true;
}
}
We went with (b). It works, but it's a code smell. notifyListeners is @protected for a reason. Cleaner: a dedicated Stream<PatchReadyEvent> or a StateNotifier that emits a monotonic counter. Future refactor.
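For reference, the stream-based shape is only a few lines (hypothetical names; not what's in the codebase today):

```dart
import 'dart:async';

class PatchReadyNotifier {
  final _controller = StreamController<int>.broadcast();
  int _readyCount = 0;

  /// Every ready event is a distinct emission, so a dismissed banner
  /// can't swallow the next successful download.
  Stream<int> get events => _controller.stream;

  void notifyPatchReady() => _controller.add(++_readyCount);

  void dispose() => _controller.close();
}
```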
The banner subscribes via our top-level RootTabBanners priority list, slotted below the hard "app update required" banner and above most informational ones. If the user genuinely needs a full store update, we ask for that first. Otherwise: "An update is ready. Tap to restart." The user decides when.
One command, any platform, any track
A patch shouldn't require tribal knowledge. We trigger it with a GitHub Actions workflow_dispatch, inputs: release_version, flavor, platform, track. The Shorebird CLI authenticates with a single token (SHOREBIRD_TOKEN) stored as a GitHub secret; shorebirdtech/setup-shorebird@v1 picks it up from the env. The workflow shells out to Fastlane:
# ios/fastlane/Fastfile
lane :shorebird_patch do |options|
release_version = options[:release_version]
flavor = options[:flavor] || "kikoff"
track = options[:track] || "gated"
sync_appstore_certs
cocoapods(podfile: "#{mobile_root}/ios/Podfile", repo_update: true)
sh("cd #{mobile_root} && flutter clean")
sh("cd #{mobile_root} && shorebird patch ios " \
  "--release-version=#{release_version} " \
  "--flavor #{flavor} " \
  "--target lib/main_#{flavor}.dart " \
  "--track=#{track} " \
  "--allow-asset-diffs " \
  "--no-confirm")
end
Two flags worth naming:
`--track=gated` by default. Every patch ships to `gated` first. Promote to `stable` from the console after crash telemetry bakes.

`--allow-asset-diffs` permits Dart-only changes even if asset hashes drift. Without it, unrelated asset churn in a feature branch blocks the patch.
The Android lane is symmetric. Same command, one word swapped.
shorebird preview, the only safe dry run
One thing worth naming: shorebird preview. It pulls a specific release from the console, downloads a specific patch on top of it, and runs the combination locally. This is the only way to test a patch without publishing it. We wire it into our internal QA runbook for anything nontrivial. The alternative is publishing to gated, finding a bug, and burning a patch number on a fix.
Practices worth stealing
What's patchable, and what isn't
Shorebird patches Dart. That's the whole rule. Everything downstream follows from it.
The rule of thumb: if your change is in lib/, it's patchable. If it's in ios/ or android/, it isn't.
In practice, the majority of commits on a mature Flutter app are Dart-only: pure logic, widget tweaks, UI changes, business logic. Those are the commits Shorebird patches. Native changes tend to cluster (a Flutter upgrade, a new plugin adoption, a platform deeplink change) and can be batched into scheduled releases. The emergency-release drama is what Shorebird removes.
Where we are at Kikoff, and where it stops being enough
At Kikoff we're at the far end of where code push alone is the right tool. Our product is mostly fetch-data-render-form-submit. Pure Flutter nails that and will for years. But the cold-start floor is visible in our adoption data now in a way it wasn't before we adopted Shorebird, precisely because Shorebird stripped out everything above it.
The next architectural move for us isn't a better Shorebird integration. It's starting to route specific surfaces through server-driven UI: widget tree defined on the server, client-side renderer, the app fetches its screens instead of compiling them. In Flutter that could be Google's rfw, the third-party stac, or a custom protobuf schema that fits your existing infra. The widget tree is the easy part. The hard parts are actions, state reconciliation across re-renders, versioning against old clients, and the fact that the renderer itself is still release-blocking. Stage 3 is a commitment, not a weekend.
We'd apply it selectively at first. The surfaces that change most often, where cold-start lag hurts. Not all of them. Not soon. But the first candidates are already on the roadmap in our heads.
Shorebird bought us the runway to get there at our own pace, instead of being forced into it by bug-fix latency. That's the real value of landing Stage 2 cleanly: you get to pick when you graduate, not whether. We'll write the Stage 3 post when we've actually built it.
An invitation
The cross-platform runtime choice in 2026 should be about what kind of product you're building, not about which stack makes you pay the full store-review tax on every five-line bug fix. That was a real tradeoff for a long time. It isn't anymore.
At Kikoff, we're building systems that help us move faster without compromising quality, and this Shorebird integration is just one example of how we're rethinking what modern fintech infrastructure can look like. If building at this scale and pace excites you, check out the Kikoff careers page.
About the Author

Karthic Thangarasu is a Member of Technical Staff at Kikoff, where he owns the experimentation platform and is an expert on experimentation instrumentation. In his free time, he studies chess with a grandmaster because he likes losing. He also plays video games for the same reason 😛