What is the most important criterion when choosing a code coverage tool for embedded software?

The tool must measure coverage on the same binary your cross-compiler produces for the target, without changing your source or build. If it can only measure a host build, the numbers do not represent the firmware that ships.

Can a code coverage tool work on a target with no file system?

Yes. A capable embedded code coverage tool streams hit data off the device over a debug channel such as SWO, RTT or JTAG, or buffers it in RAM and flushes on a trigger, so no file system is required on the target.

Does RKTracer generate tests for embedded code?

RKTracer is a code coverage tool; it measures structural coverage and identifies gaps. Test generation is handled by companion products: RKMCP exposes gap data to an AI agent over MCP, and RKTracerGen produces tests offline.

How to Choose a Code Coverage Tool for Embedded Software

Every embedded team eventually needs the same number: how much of this firmware did our tests actually exercise? Getting that number sounds simple, and on a laptop it nearly is. On a real target it is a different animal. The compiler is not the one on your machine, the device may have kilobytes of RAM rather than gigabytes, there is often no file system to write a report to, and instrumentation that slows a function by a few microseconds can change the very behaviour you were trying to measure.

So choosing a code coverage tool for embedded software is really a question about constraints. The wrong tool gives you a clean, reassuring percentage that describes a host build no customer will ever run. The right tool gives you the harder, truer number measured on the binary your cross-compiler emits for the target. This buyer's guide walks through the seven criteria that matter, ends with a checklist you can copy, and shows how RKTracer measures up against each one.

In one sentence

A good embedded code coverage tool measures the binary that ships, built with your cross-compiler, without changing your source or build, and reports the structural metrics your standard requires.

Why embedded code coverage is harder than host coverage

On a host, coverage is almost a solved problem. You compile with a coverage flag, run, and a runtime writes counters to a file. Embedded development strips away nearly every assumption that recipe relies on.

The compiler is not your compiler. Firmware is built with a cross-compiler that targets a different instruction set. A tool that only understands host GCC or Clang cannot instrument what IAR, Keil or a TI compiler produces, and a host coverage build can be optimised differently from the airborne or automotive build that actually ships.
The target is constrained. A microcontroller may have a few kilobytes of RAM. There is no room for fat instrumentation tables or a heavyweight runtime, and every extra byte of flash competes with your application.
There may be no file system. A bare-metal device cannot fopen("coverage.dat"). Hit data has to leave the chip another way, or the tool simply cannot report.
Timing is real. Interrupt handlers, control loops and communication stacks have deadlines. Instrumentation that adds unpredictable latency does not just slow things down, it can mask or create the very bugs you are hunting.

Each constraint becomes a buying criterion. Why a host-coverage number misleads on a cross-compiled target is a topic worth its own deep dive; we cover it in "Why Host Coverage Numbers Lie on Cross-Compiled Targets". The rest of this guide assumes you have accepted the premise: measure on the target, or do not bother.

Criterion 1: no source or build changes

The first thing to ask any vendor is brutally practical. What do I have to change in my project? The answer should be: almost nothing.

The best embedded coverage tools work by prefixing your existing build command. You keep your make, your CMake, your IAR or Keil project exactly as it is, and the tool instruments transparently during compilation. Compare that with approaches that ask you to swap in a wrapper compiler, edit your build scripts, or sprinkle macros and #include hooks through your source. Every one of those is a maintenance burden and, in a regulated program, a change you now have to justify and re-verify.

If adopting a coverage tool means editing the code you are trying to certify, you have traded one verification problem for two.

A prefix-the-build model also keeps your repository honest. The source under test is byte-for-byte the source you ship, and that matters when an auditor asks whether the measured artefact is the real one.

Criterion 2: every compiler and cross-compiler supported

Embedded teams rarely use one compiler. A single product might build application code with GCC, a safety library with a qualified Arm compiler, and a motor-control routine with a chip-vendor toolchain. Your coverage tool has to speak all of them, or you end up with blind spots exactly where the risk lives.

When you evaluate, list every compiler in your build and check it explicitly:

GCC and the GNU Arm Embedded toolchain (arm-none-eabi-gcc)
IAR Embedded Workbench compilers
Arm Compiler (armclang and the classic armcc)
Keil MDK
TI code generation tools for C2000, MSP and Sitara
Microchip XC compilers
Green Hills compilers for high-integrity targets

A tool that supports only the mainstream open-source compilers will look great in a demo and fail the moment you point it at the proprietary toolchain that builds your safety-relevant module. Insist on a list, and insist on seeing it run against your compiler version.

Criterion 3: coverage on the real target

This is the criterion that most clearly separates embedded-grade tools from repurposed host tools. The coverage you report should come from running the instrumented binary on the actual hardware, an instruction-set simulator, or an emulator, not from a host recompile that merely resembles it.

The hard case is a target with no file system. A capable tool does not need one. Instead it streams hit data off the device while tests run, or buffers it in a small RAM region and flushes on a trigger, using whatever channel the board already exposes:

SWO / ITM trace on Arm Cortex-M parts
SEGGER RTT over the existing debug probe
JTAG or semihosting via the debugger
A spare UART or a memory dump pulled by the host at the end of a run

If a vendor cannot explain how data leaves a device that has no printf destination, that is a red flag. For the full picture of how this works in practice, see our walkthrough on measuring coverage on targets without a file system.
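The buffer-and-flush approach is easier to judge with a concrete picture of what the host receives. The sketch below assumes a hypothetical layout: one bit per probe, packed least-significant-bit first, in a RAM region the host reads back as raw bytes over whichever channel the board offers. The function names and the layout are illustrative assumptions, not any vendor's actual format.

```python
def decode_hit_bitmap(dump: bytes, probe_count: int) -> list[int]:
    """Return the IDs of probes whose bit is set in a raw RAM dump.

    Assumed (hypothetical) layout: probe N lives in byte N // 8, bit N % 8.
    """
    if probe_count > len(dump) * 8:
        raise ValueError("dump is too short for the declared probe count")
    return [
        probe_id
        for probe_id in range(probe_count)
        if (dump[probe_id // 8] >> (probe_id % 8)) & 1
    ]


def merge_dumps(dumps: list[bytes]) -> bytes:
    """OR together dumps from several test runs so hits accumulate."""
    merged = bytearray(max(len(dump) for dump in dumps))
    for dump in dumps:
        for index, byte in enumerate(dump):
            merged[index] |= byte
    return bytes(merged)


# Two runs on a target with 12 probes: 2 bytes of RAM hold all the hit data.
run_a = bytes([0b00000101, 0b00000000])  # probes 0 and 2 were hit
run_b = bytes([0b00000000, 0b00001000])  # probe 11 was hit
hits = decode_hit_bitmap(merge_dumps([run_a, run_b]), probe_count=12)
print(hits)  # [0, 2, 11]
```

A bitmap like this records only whether a probe fired, not how often, which is what keeps the RAM cost to one bit per instrumented point in this sketch. Ask the vendor what their equivalent costs.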

Watch for the host shortcut

A vendor that quietly measures a host build and presents it as your coverage is selling you confidence, not evidence. The metric only means something when it comes from the binary your cross-compiler produced for the target.

Criterion 4: the structural metrics that matter

Line coverage is the floor, not the goal. Embedded and safety-critical work needs the full ladder of structural metrics so you can match the metric to the rigour your project demands.

Metric	What it proves	Typical use
Statement	Every statement executed at least once	Baseline for all code
Decision (branch)	Every decision took both true and false	Higher-integrity modules
Condition	Every condition took both values	Compound logic
MC/DC	Each condition independently affects the outcome	The most safety-critical logic
Multi-condition	Every combination of conditions exercised	Small, critical decisions

Make sure the tool measures MC/DC for C and C++, since that is the metric the most demanding logic is judged by, and that it handles short-circuit evaluation correctly so its independence analysis is real rather than fictional. If MC/DC is new to you, our primer, "MC/DC Explained for DO-178C", works through it from truth tables to evidence. Beyond C and C++, a tool that also covers CUDA, Rust, C#, Java, JavaScript, TypeScript, Go and Python lets one workflow span the whole product, from firmware to the cloud service it talks to.
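To see why short-circuit handling matters, it helps to work one decision by hand. The sketch below takes `a && (b || c)`, written here as a nested tuple, evaluates it the way C would, and searches for independence pairs. It uses one common reading of MC/DC under short-circuiting: a condition that was never evaluated counts as "don't care" when two test cases are compared. The representation and function names are assumptions made for this illustration, not how any particular tool implements the analysis.

```python
from itertools import product


def evaluate(expr, inputs, seen):
    """Evaluate with C-style short-circuiting, recording each condition actually read."""
    if isinstance(expr, str):
        seen[expr] = inputs[expr]
        return inputs[expr]
    op, left, right = expr
    left_value = evaluate(left, inputs, seen)
    if op == "and" and not left_value:
        return False  # right operand is never evaluated
    if op == "or" and left_value:
        return True   # right operand is never evaluated
    return evaluate(right, inputs, seen)


def executions(expr, names):
    """Distinct short-circuit executions as (conditions read, outcome) pairs."""
    found = []
    for values in product([False, True], repeat=len(names)):
        seen = {}
        outcome = evaluate(expr, dict(zip(names, values)), seen)
        if (seen, outcome) not in found:
            found.append((seen, outcome))
    return found


def independence_pairs(expr, names):
    """For each condition, list execution pairs where only it differs and the outcome flips."""
    runs = executions(expr, names)
    pairs = {name: [] for name in names}
    for i, (seen_1, out_1) in enumerate(runs):
        for seen_2, out_2 in runs[i + 1:]:
            if out_1 == out_2:
                continue
            shared = seen_1.keys() & seen_2.keys()
            differing = [n for n in shared if seen_1[n] != seen_2[n]]
            if len(differing) == 1:
                pairs[differing[0]].append((seen_1, seen_2))
    return pairs


decision = ("and", "a", ("or", "b", "c"))  # a && (b || c)
names = ["a", "b", "c"]
for name, found in independence_pairs(decision, names).items():
    print(f"{name}: {len(found)} independence pair(s)")
```

For this decision the eight input combinations collapse to four distinct executions, and those four are enough to show every condition independently affecting the outcome. A tool that ignores short-circuiting would compare cases in which `b` or `c` was never read, which is the fictional analysis to watch for.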

Criterion 5: low instrumentation overhead

On a constrained target, overhead is not a footnote, it is a constraint that can make a tool unusable. Two budgets matter: code size and time.

If instrumentation bloats the image past available flash, you cannot even load it. If it adds unpredictable latency, your real-time deadlines slip and you start chasing Heisenbugs that exist only because the measurement perturbs the system. Ask precisely how much flash and RAM the instrumentation adds per instrumented point, and how deterministic the per-probe cost is. A well-engineered tool keeps probes tiny and predictable, buffers efficiently, and lets you instrument selectively so a hot interrupt path can stay lean while the rest of the system is fully covered.
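Once a vendor gives you per-probe figures, the two budgets reduce to simple arithmetic you can run before the demo. The sketch below shows that calculation. Every number in it is invented for illustration, and the field names are assumptions; substitute the figures measured on your own compiler and target.

```python
from dataclasses import dataclass


@dataclass
class ProbeCost:
    flash_bytes: int  # code added per instrumented point
    ram_bits: int     # hit storage per instrumented point
    cycles: int       # worst-case execution cost each time a probe fires


def fits_budget(probes: int, cost: ProbeCost, free_flash: int, free_ram: int) -> dict:
    """Compare the instrumentation footprint against the free flash and RAM, in bytes."""
    flash_needed = probes * cost.flash_bytes
    ram_needed = (probes * cost.ram_bits + 7) // 8  # round up to whole bytes
    return {
        "flash_needed": flash_needed,
        "ram_needed": ram_needed,
        "fits": flash_needed <= free_flash and ram_needed <= free_ram,
    }


def added_latency_us(probes_on_path: int, cost: ProbeCost, cpu_hz: int) -> float:
    """Worst-case time added to one execution path, in microseconds."""
    return probes_on_path * cost.cycles * 1_000_000 / cpu_hz


# Illustrative figures only: 4000 probes, 48 KiB of spare flash, 2 KiB of spare RAM.
cost = ProbeCost(flash_bytes=8, ram_bits=1, cycles=6)
budget = fits_budget(probes=4000, cost=cost, free_flash=48 * 1024, free_ram=2048)
isr_cost = added_latency_us(probes_on_path=12, cost=cost, cpu_hz=48_000_000)
print(budget)    # {'flash_needed': 32000, 'ram_needed': 500, 'fits': True}
print(isr_cost)  # 1.5
```

If the latency figure for a hot interrupt path eats too much of its deadline, that path is a candidate for selective instrumentation.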

The quick gut check

Does it measure the binary my cross-compiler builds for the target?
Can I get coverage without changing a single line of source or build script?
Does it report MC/DC, not just lines, for my C and C++?
Can it get data off a device with no file system?

Criterion 6: CI/CD and SonarQube integration

Coverage that lives on one engineer's laptop dies there. To change behaviour, the number has to run on every commit and show up where the team already looks. So the tool has to fit a pipeline without friction.

Look for clean integration with Jenkins, GitLab CI, GitHub Actions and the rest, plus reports in formats the ecosystem already consumes:

HTML reports for human review, with per-file and per-decision drill-down.
XML reports (Cobertura-style and JUnit-adjacent formats) that CI servers and dashboards ingest directly.
SonarQube integration so coverage sits alongside your other quality gates rather than in a silo.
Delta coverage on the diff, so a pull request is judged on the lines it actually changed and the gate stays meaningful as the codebase grows.

Delta coverage in particular is what keeps a coverage gate honest over time. A blanket "must hit 90%" rule rots; "every new or changed line must be covered" scales. RKTracer's full features and metrics page lists the supported report formats and integrations in detail.
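The rule "every new or changed line must be covered" is simple enough to sketch. The example below assumes you already have two inputs: the executable lines a pull request changed, per file, and the lines the instrumented run covered, per file. How you obtain them (a diff parser, a coverage XML reader) is outside this sketch, and the function and file names are hypothetical.

```python
def delta_coverage(changed: dict[str, set[int]], covered: dict[str, set[int]]):
    """Return (percent of changed lines covered, uncovered changed lines per file)."""
    total = sum(len(lines) for lines in changed.values())
    missed = {}
    for path, lines in changed.items():
        uncovered = sorted(lines - covered.get(path, set()))
        if uncovered:
            missed[path] = uncovered
    hit = total - sum(len(lines) for lines in missed.values())
    percent = 100.0 if total == 0 else 100.0 * hit / total
    return percent, missed


# Changed executable lines in a pull request, and lines covered by the target run.
changed = {"src/motor.c": {10, 11, 12, 40}, "src/can.c": {7, 8}}
covered = {"src/motor.c": {10, 11, 12}, "src/can.c": {7, 8, 9}}

percent, missed = delta_coverage(changed, covered)
print(f"delta coverage: {percent:.1f}%")  # delta coverage: 83.3%
print(missed)                             # {'src/motor.c': [40]}
gate_passed = not missed                  # fail the pull request if any changed line is uncovered
```

The gate looks only at the diff, so it stays equally strict whether the codebase is ten files or ten thousand.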

Criterion 7: a path to closing gaps

Measuring coverage is half the job. The other half is what you do with the red lines a report turns up. A coverage tool's job is to find and quantify the gaps with precision; it should not pretend to be a test framework. But the workflow around it should give you a clear route to closing those gaps.

Be clear-eyed about the division of labour. RKTracer is a code coverage tool: it measures structural coverage and pinpoints exactly what is uncovered. It does not generate tests. That work is handled by companion products designed for it:

RKMCP exposes the precise gap data over the Model Context Protocol so an AI agent can read what is uncovered and propose the unit tests that close it.
RKTracerGen generates tests offline, for teams that need test creation without a live agent in the loop.

The point for your evaluation is that a good coverage tool produces gap data structured enough to act on, whether a human, an offline generator, or an AI agent picks it up. A pretty percentage with no machine-readable detail behind it leaves you to find the gaps by hand.
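As a rough illustration of what "structured enough to act on" means, the sketch below turns a list of gap records into a per-file worklist ordered by metric. The JSON shape and every field name are assumptions invented for this example. They do not describe the export format of RKTracer, RKMCP or any other product.

```python
import json
from collections import defaultdict

# Hypothetical gap export; every field name here is an assumption for illustration.
GAPS_JSON = """
[
  {"file": "src/can.c",   "line": 88, "metric": "statement", "missing": "never executed"},
  {"file": "src/motor.c", "line": 40, "metric": "mcdc",      "missing": "condition 'fault' not shown independent"},
  {"file": "src/motor.c", "line": 12, "metric": "decision",  "missing": "false branch"}
]
"""

# Work the most demanding metric first within each file.
PRIORITY = {"mcdc": 0, "decision": 1, "statement": 2}


def build_worklist(gaps: list[dict]) -> dict[str, list[dict]]:
    """Group gap records by file, sorted by metric priority and then line number."""
    by_file = defaultdict(list)
    for gap in gaps:
        by_file[gap["file"]].append(gap)
    for items in by_file.values():
        items.sort(key=lambda gap: (PRIORITY[gap["metric"]], gap["line"]))
    return dict(by_file)


worklist = build_worklist(json.loads(GAPS_JSON))
for path, items in sorted(worklist.items()):
    for gap in items:
        print(f'{path}:{gap["line"]} [{gap["metric"]}] {gap["missing"]}')
```

Whatever the real format looks like, the test to apply in a demo is the same: can a script, not just a person reading HTML, answer "which condition on which line is still missing"?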

A checklist you can copy

Print this, take it into the demo, and refuse to be impressed by anything that does not tick the boxes that matter to your program.

Instruments by prefixing the build, with no source edits or wrapper compilers.
Supports every compiler in our build, including the proprietary cross-compilers (IAR, Keil, Arm, TI, Microchip, Green Hills).
Measures coverage on the real target, simulator or emulator, not a host recompile.
Works on a target with no file system, streaming data over SWO, RTT, JTAG or UART.
Reports statement, decision, condition, MC/DC and multi-condition, with MC/DC for C and C++.
Adds low, predictable overhead in flash, RAM and time, with selective instrumentation.
Integrates with CI/CD and SonarQube, emits HTML and XML, and supports delta coverage on diffs.
Produces actionable gap data that a human, an offline generator, or an AI agent can use to close gaps.

How RKTracer measures up

RKTracer was built for exactly the constraints above. You prefix your existing build, and it instruments during compilation using the same cross-compiler that builds your shipped firmware, so the coverage you measure is the coverage that runs on the target.

terminal: coverage on the real target

# Prefix your normal build, no source edits, no wrappers
$ rktracer make firmware

  compiler: arm-none-eabi-gcc 12.2 (target config)
  ✓ instrumented 118 files, source unmodified

$ rktracer run --target swo # stream hits off a device with no file system
$ rkresults --report html --report xml --delta

  ✓ Statement 100%
  ✓ Decision  98.4%
  ✓ MC/DC     96.1%  (gaps exported for RKMCP)

RKTracer streams coverage off constrained targets and exports gap data your team, RKTracerGen, or an AI agent via RKMCP can act on.

It speaks the proprietary cross-compilers embedded teams actually use, reports the full structural ladder through MC/DC for C and C++, keeps instrumentation small and predictable, and drops into CI with HTML and XML reports, SonarQube integration and delta coverage on every diff. To see the instrumentation model in depth, the "How RKTracer works" page walks through the architecture end to end.

The bottom line

Choosing a code coverage tool for embedded software comes down to one question asked seven ways: does this measure the software that actually ships, on the target it ships to, without making me change it? A tool that prefixes your build, supports your cross-compilers, runs on the real target including ones with no file system, reports MC/DC, stays light, and fits your pipeline will give you a number you can defend. A tool that measures a host build will give you a number you will have to explain away.

Take the checklist into your next evaluation, and make every vendor prove each line on your own toolchain and your own target. The right embedded code coverage tool does not just produce a higher percentage; it produces a percentage that is true.

Priya Nair

Coverage Engineering, RKValidate

Priya works with avionics, automotive and industrial teams adopting structural coverage on cross-compiled and constrained embedded targets.
