Cross-platform testing on iOS and Android — device matrices, golden tests, Flutter integration tests, Detox/Maestro patterns

Mobile Testing

Mobile testing inherits everything from web testing — pyramid, FIRST, query priorities — and adds problems that do not exist on the web: two platforms with diverging behavior, a hardware matrix, animations that the test runner must let settle, and a build-and-install step that adds minutes to every run.

A senior engineer's mobile testing job is mostly about keeping the suite fast enough to be useful while still proving the app works on the device matrix that matters.

What Changes on Mobile

Concern	Web	Mobile
Browser/runtime variants	Chromium / Firefox / WebKit	iOS / Android, plus OS version spread
Where tests run	JSDOM (fast) or browser (medium)	Simulator/emulator (slow) or device (slowest)
Network mocking	MSW intercepts fetch	Per-platform network stubs (Detox, Charles, custom HTTP client)
Animations	CSS-controllable, can be disabled	Native; require disabling in test builds
Layout regression	Visual regression (Chromatic, Percy)	Golden tests / screenshot tests per platform
OS / hardware capabilities	Largely uniform	Camera, biometrics, push, deep links — each needs mock or device

The result: mobile suites tend to have more component tests, fewer E2E tests, and a sharper division between "test on simulator" and "test on real device."

The Device Matrix

A typical product app needs at minimum:

iOS: the oldest major version the app still supports, plus the latest release (current major at its newest minor).
Android: one minSdk, one current, one latest. Often a low-end and a high-end device.

That is four to six environments. Two strategies:

Tiered matrix

PR-time: one iOS + one Android (latest stable, simulator/emulator).
Merge-time: full matrix, on emulators and a small device farm.
Pre-release: real-device run on the full matrix.
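
One way to encode the tiers above is a lookup from CI stage to environment list. This is a minimal sketch: the stage names, device names, and OS versions are placeholders rather than recommendations, and the small device-farm slice of the merge-time tier is left out for brevity.

```javascript
// Hypothetical device matrix; every device name and OS version is a placeholder.
const MATRIX = [
  { platform: 'ios', os: '16', device: 'iPhone 12', kind: 'simulator' },
  { platform: 'ios', os: '17', device: 'iPhone 15', kind: 'simulator' },
  { platform: 'android', os: '10', device: 'low-end profile', kind: 'emulator' },
  { platform: 'android', os: '13', device: 'Pixel 6', kind: 'emulator' },
  { platform: 'android', os: '14', device: 'Pixel 7', kind: 'emulator' },
];

// Returns the environments to run for a CI stage: 'pr', 'merge', or 'prerelease'.
function environmentsForStage(stage) {
  switch (stage) {
    case 'pr':
      // One iOS + one Android, newest OS of each, simulator/emulator only.
      return ['ios', 'android'].map(
        (platform) =>
          MATRIX.filter((env) => env.platform === platform).sort(
            (a, b) => Number(b.os) - Number(a.os),
          )[0],
      );
    case 'merge':
      // Full matrix on simulators/emulators.
      return MATRIX;
    case 'prerelease':
      // Same matrix, but every entry runs on a real device.
      return MATRIX.map((env) => ({ ...env, kind: 'device' }));
    default:
      throw new Error(`Unknown stage: ${stage}`);
  }
}
```

Keeping the matrix in one data structure means a change to the supported OS range is a one-line edit that every tier picks up.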

Targeted matrix

If the team has known issues on specific versions (e.g., a flaky Intl API on Android 11), pin those versions in CI explicitly. Don't rely on "latest" — re-test on the version that bit you last time.

A common failure mode is testing only on the latest iOS and shipping a bug to Android 9 users. Pick the matrix to match the analytics, not the latest device on a colleague's desk.
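
Matching the matrix to analytics can be as simple as taking the most-used OS versions until a coverage target is met. The sketch below assumes usage is exported as whole-number percentages per version; the function name and the numbers are made up for illustration.

```javascript
// usage: percentage of active users per OS version (whole numbers summing to 100).
// Returns the most-used versions needed to reach the target coverage.
function versionsForCoverage(usage, targetPercent) {
  const ranked = Object.entries(usage).sort(([, a], [, b]) => b - a);
  const versions = [];
  let covered = 0;
  for (const [version, share] of ranked) {
    if (covered >= targetPercent) break;
    versions.push(version);
    covered += share;
  }
  return { versions, covered };
}

// Made-up Android numbers for illustration only.
const androidUsage = { '14': 35, '13': 30, '12': 15, '11': 8, '9': 12 };
const result = versionsForCoverage(androidUsage, 90);
// → { versions: ['14', '13', '12', '9'], covered: 92 }
```

In this made-up data set Android 9 makes the cut ahead of Android 11, which is exactly the kind of result that guessing from recent devices would miss.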

Flutter Integration Tests

Flutter's integration_test package runs real widget code on a device — same WidgetTester API as widget tests, plus a few extras.

// integration_test/app_test.dart
import 'package:flutter_test/flutter_test.dart';
import 'package:integration_test/integration_test.dart';
import 'package:my_app/main.dart' as app;

void main() {
  IntegrationTestWidgetsFlutterBinding.ensureInitialized();

  group('Onboarding', () {
    testWidgets('first launch shows tutorial', (tester) async {
      app.main();
      await tester.pumpAndSettle();

      expect(find.text('Welcome'), findsOneWidget);
      await tester.tap(find.text('Get started'));
      await tester.pumpAndSettle();

      expect(find.text('All set'), findsOneWidget);
    });
  });
}

Run:

flutter test integration_test/app_test.dart -d <device-id>

When pumpAndSettle hangs

pumpAndSettle keeps pumping frames until none are scheduled. If a widget animates indefinitely (a CircularProgressIndicator, a Lottie loop, a shimmer effect), the tree never settles: the call blocks until its timeout expires (10 minutes by default) and then fails the test.

// Bad: never settles if anything animates indefinitely
await tester.pumpAndSettle();

// Good: bounded pump, renders one frame after 500 ms
await tester.pump(const Duration(milliseconds: 500));

Or disable the offending animation in test mode:

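// Hypothetical flag; pass --dart-define=DISABLE_ANIMATIONS=true to test builds.
// (kReleaseMode would also hide the spinner in every debug build.)
const disableAnimations = bool.fromEnvironment('DISABLE_ANIMATIONS');

if (disableAnimations) {
  return const SizedBox.shrink();  // in tests, render nothing
} else {
  return const CircularProgressIndicator();
}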

Network in Flutter integration tests

The standard trick is overriding the HTTP client at app entry.

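import 'package:http/http.dart' as http;
import 'package:http/testing.dart';  // provides MockClient

// In production:
void main() => runApp(MyApp(httpClient: http.Client()));

// In test:
testWidgets('shows error on 500', (tester) async {
  // Every request gets a canned 500 response; no real network is touched.
  final mockClient = MockClient((request) async {
    return http.Response('{"error": "oops"}', 500);
  });
  await tester.pumpWidget(MyApp(httpClient: mockClient));
  // ...
});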

For dio, swap in a mock HttpClientAdapter (for example, DioAdapter from the http_mock_adapter package). For Riverpod, override the provider that supplies the client.

Golden tests on CI

Goldens are platform-sensitive:

Linux CI vs macOS dev: different font rendering, different antialiasing.
Different physical pixel density: different rasterization.

The fix is to run goldens only on CI, on a fixed OS, with the same font set every time.

# .github/workflows/test.yml
- name: Install dependencies
  run: flutter pub get
- name: Run golden tests
  # Fonts are loaded in test/flutter_test_config.dart, e.g. with golden_toolkit's loadAppFonts()
  run: flutter test --tags golden

flutter test --update-goldens --tags golden regenerates on the same CI image; checked-in goldens are CI-derived, not dev-derived.

React Native Component Tests

For most RN components, the test runs in Node under Jest via @testing-library/react-native, which renders to an in-memory tree rather than a DOM or a device. See Component Tests for the patterns. The mobile-specific additions:

Mocking native modules

Every native module needs a mock. The most common:

// jest.setup.js
import 'react-native-gesture-handler/jestSetup';

// Silences native-driver animation warnings. This path exists only in older
// React Native releases; drop the line if Jest cannot resolve it.
jest.mock('react-native/Libraries/Animated/NativeAnimatedHelper');
jest.mock('react-native-reanimated', () =>
  require('react-native-reanimated/mock'),
);
jest.mock('@react-native-async-storage/async-storage', () =>
  require('@react-native-async-storage/async-storage/jest/async-storage-mock'),
);
jest.mock('@react-native-firebase/messaging', () => ({
  __esModule: true, // lets `import messaging from ...` resolve to `default`
  default: () => ({
    requestPermission: jest.fn(() => Promise.resolve(1)), // 1 = AUTHORIZED
    getToken: jest.fn(() => Promise.resolve('mock-token')),
  }),
}));

When a test crashes with "Cannot read property of undefined" on a native object, the module needs a mock entry.

Tests rendering a screen that depends on navigation need the NavigationContainer set up.

import { render } from '@testing-library/react-native';
import { NavigationContainer } from '@react-navigation/native';

render(
  <NavigationContainer>
    <OrderScreen />
  </NavigationContainer>,
);

For tests that need specific route params, mount the screen inside a real navigator and pass initialParams:

import { render } from '@testing-library/react-native';
import { NavigationContainer } from '@react-navigation/native';
import { createNativeStackNavigator } from '@react-navigation/native-stack';

const Stack = createNativeStackNavigator();
render(
  <NavigationContainer>
    <Stack.Navigator>
      <Stack.Screen name="Order" component={OrderScreen} initialParams={{ id: 1 }} />
    </Stack.Navigator>
  </NavigationContainer>,
);

Detox vs Maestro: When to Pick Each

Both run real apps on real (or simulator) devices. Pick based on the team's needs:

Need	Detox	Maestro
Power and flexibility	✓ (full JS)	Limited (YAML, with JavaScript via runScript/evalScript)
Speed of writing tests	Slower	Much faster
Cross-platform support	RN, native	RN, native, Flutter, web
CI integration	Mature (RN-specific)	Mature, simpler
Reliability	Good once set up	Generally simpler, less to break
Test-as-code review	Yes	Yes (YAML is reviewable)

For a typical product team, Maestro is enough unless the test logic needs branching, conditional flows, or heavy data setup. Detox is the answer when Maestro's expressiveness ceiling becomes painful.

Don't run both in the same repo. Pick one, commit.

Detox: realistic setup

// .detoxrc.js
module.exports = {
  testRunner: { args: { config: 'e2e/jest.config.js' } },
  apps: {
    'ios.debug': {
      type: 'ios.app',
      binaryPath: 'ios/build/Build/Products/Debug-iphonesimulator/MyApp.app',
      build: 'xcodebuild -workspace ios/MyApp.xcworkspace -scheme MyApp -configuration Debug -sdk iphonesimulator -derivedDataPath ios/build',
    },
    'android.debug': {
      type: 'android.apk',
      binaryPath: 'android/app/build/outputs/apk/debug/app-debug.apk',
      build: 'cd android && ./gradlew assembleDebug assembleAndroidTest -DtestBuildType=debug',
    },
  },
  devices: {
    simulator: { type: 'ios.simulator', device: { type: 'iPhone 15' } },
    emulator:  { type: 'android.emulator', device: { avdName: 'Pixel_7_API_34' } },
  },
  configurations: {
    'ios.sim.debug':     { device: 'simulator', app: 'ios.debug' },
    'android.emu.debug': { device: 'emulator',  app: 'android.debug' },
  },
};

CI runs detox build --configuration ios.sim.debug then detox test --configuration ios.sim.debug. Build caching is the lever that keeps this from being painfully slow.
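
A spec driven by this configuration might contain steps like the following. This is a sketch, not a canonical Detox test: in a real spec `device`, `element`, `by`, and `waitFor` come from `require('detox')` and are passed in here only to keep the example self-contained, and every testID is hypothetical.

```javascript
// Sign-in steps written against Detox's matcher/action API.
// The Detox objects are injected so the function has no module dependencies.
async function signIn({ device, element, by, waitFor }, { email, password }) {
  // Start from a fresh app process so earlier tests cannot leak state.
  await device.launchApp({ newInstance: true });

  // Hypothetical testIDs; use whatever the app assigns to these inputs.
  await element(by.id('email-input')).typeText(email);
  await element(by.id('password-input')).typeText(password);
  await element(by.id('sign-in-button')).tap();

  // Bounded wait: fail after 5 s instead of waiting indefinitely.
  await waitFor(element(by.text('Dashboard')))
    .toBeVisible()
    .withTimeout(5000);
}
```

Inside a Jest `it` block this would be called as `await signIn(require('detox'), { email, password })`. Matching on testIDs rather than visible text keeps the steps stable across copy changes and locales.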

Maestro: realistic setup

# .maestro/flows/sign-in.yaml
appId: com.example.app
---
- launchApp
- tapOn: "Sign in"
- inputText: ${EMAIL}
- tapOn: "Password"
- inputText: ${PASSWORD}
- tapOn: "Sign in"
- assertVisible: "Dashboard"

CI runs maestro test -e EMAIL=<email> -e PASSWORD=<password> .maestro/flows/sign-in.yaml; the -e flags supply the ${EMAIL} and ${PASSWORD} parameters. The same flow file works against an iOS simulator, an Android emulator, or a real device.

E2E suites should not slog through a full sign-up every test. Patterns:

Pre-created test accounts. A pool of accounts allocated per worker; reset state in setup.
Magic-link login. A test-only endpoint that produces a valid session for a known account.
App state injection. A test build that accepts a deep link with a pre-authenticated session.

Pick the one that matches the architecture; do it before the suite has more than a handful of tests.
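
The account-pool pattern can be sketched as a mapping from worker id to account. The pool contents and the function name below are hypothetical; the only outside assumption is that the runner exposes a 1-based worker id, as Jest does through the JEST_WORKER_ID environment variable.

```javascript
// Hypothetical pool of pre-created accounts; the addresses are placeholders.
const ACCOUNT_POOL = [
  'e2e-user-1@example.com',
  'e2e-user-2@example.com',
  'e2e-user-3@example.com',
  'e2e-user-4@example.com',
];

// Maps a 1-based worker id to its own account.
// Throws instead of wrapping around, so two workers never share an account.
function accountForWorker(workerId, pool = ACCOUNT_POOL) {
  const index = Number(workerId) - 1;
  if (!Number.isInteger(index) || index < 0 || index >= pool.length) {
    throw new Error(`No account available for worker "${workerId}"`);
  }
  return pool[index];
}
```

In a suite setup file this would be called as `accountForWorker(process.env.JEST_WORKER_ID)`, followed by whatever reset call returns that account to a known state.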

Real Device vs Simulator

Run the bulk of the suite on simulators / emulators (fast, free, parallel). Run a smaller subset on real devices (slow, requires a farm).

What real devices catch that simulators do not:

Performance issues (simulators run on the host machine's CPU and GPU, so they do not reflect real-device speed or thermal throttling).
Native-only APIs (some camera and sensor behavior).
Push notifications under real OS conditions.
Network conditions (cellular vs Wi-Fi, captive portal, no network).
Battery and memory pressure behavior.

Device farms (Firebase Test Lab, AWS Device Farm, BrowserStack, Sauce Labs) let you run on real devices in CI without owning them. Use them sparingly — they are billed per device-minute.
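
Because billing is per device-minute, a quick estimate shows how fast a real-device suite adds up. The sketch below takes every input from the caller; the rate and the example numbers are made up and do not reflect any provider's pricing.

```javascript
// Device-minutes consumed per month by a real-device suite.
function monthlyDeviceMinutes({ devices, minutesPerRun, runsPerMonth }) {
  return devices * minutesPerRun * runsPerMonth;
}

// Cost for that usage at a caller-supplied rate per device-minute.
function monthlyCost(usage, ratePerDeviceMinute) {
  return monthlyDeviceMinutes(usage) * ratePerDeviceMinute;
}

// Made-up example: 5 devices x 12 minutes x 20 pre-release runs.
const usage = { devices: 5, minutesPerRun: 12, runsPerMonth: 20 };
const minutes = monthlyDeviceMinutes(usage); // 1200 device-minutes
```

Moving the same suite from pre-release runs to every merge multiplies `runsPerMonth`, which is why the real-device subset is worth keeping small.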

Performance Tests

Performance regressions do not show up in pass/fail tests. Patterns:

Frame rate during a critical scroll/animation. Capture with Flutter DevTools in profile mode for Flutter, or with react-native-performance or Flashlight for RN. Track as a metric.
Time to interactive on app launch. Measure cold-start time in CI on a fixed device profile. Alert on regression.
Memory under a long session. Run a soak test (1000-step Maestro flow) and watch memory growth.

These are tracked over time, not asserted as pass/fail. The signal is the trend.
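
One way to turn a trend into an alert is to compare the latest measurement with the median of recent runs. The sketch below assumes cold-start times in milliseconds are already being collected per CI run; the function names, the window of 10 runs, and the 10% tolerance are illustrative choices, not recommendations.

```javascript
// Median of a non-empty numeric array.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// history: cold-start times in ms from earlier CI runs, oldest first.
// Flags the latest run when it exceeds the median of the previous `window`
// runs by more than `tolerance` (a fraction; 0.1 means 10%).
function checkRegression(history, latest, { window = 10, tolerance = 0.1 } = {}) {
  if (history.length === 0) {
    throw new Error('Need at least one earlier measurement for a baseline');
  }
  const baseline = median(history.slice(-window));
  const ratio = latest / baseline;
  return { baseline, ratio, regressed: ratio > 1 + tolerance };
}
```

A median baseline ignores a single noisy run, so one slow CI machine does not raise an alert on its own or mask a real regression in the next run.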

Pre-Commit Checklist

Before adding a mobile test or merging changes that affect the suite:

If it's a component test, mocks for all native modules used are in place.
If it's an integration / E2E test, animations that would hang pumpAndSettle or block waits are disabled in the test build.
Goldens (if any) were regenerated in the CI environment, not locally.
The test passes on both iOS and Android matrices if it touches platform-sensitive code.
The device profile is pinned (not "latest"), so the run is reproducible.
Network is mocked at the transport boundary, not at the data-access layer.
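
To illustrate the last item: mocking at the transport boundary means the test replaces only the function that performs the request, so URL building, status handling, and parsing still run for real. The sketch below uses a hypothetical orders client with an injected fetch-like function; the names and the base URL are placeholders.

```javascript
// Hypothetical data-access client; the transport (`fetchImpl`) is injected
// so a test can replace only the network call.
function createOrdersApi(fetchImpl, baseUrl = 'https://api.example.com') {
  return {
    async getOrder(id) {
      const response = await fetchImpl(`${baseUrl}/orders/${id}`);
      if (!response.ok) {
        throw new Error(`Order request failed with status ${response.status}`);
      }
      return response.json();
    },
  };
}

// Test double for the transport: always answers with the given status and body.
function fakeFetch(status, body) {
  return async () => ({
    ok: status >= 200 && status < 300,
    status,
    json: async () => body,
  });
}
```

A test that builds the client with `fakeFetch(500, { error: 'oops' })` exercises the real error path in `getOrder`; mocking `getOrder` itself would skip it.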
