What is AI unit test generation?

AI unit test generation is the practice of having an AI agent write the unit and functional tests for code that coverage analysis has flagged as untested. With RKMCP, the uncovered lines, decisions, multi-condition and MC/DC conditions are served to the agent as structured data, so the generated tests target the exact gaps rather than guessing.

Does RKTracer generate tests by itself?

No. RKTracer is a code coverage tool that measures statement, decision, condition, multi-condition and MC/DC coverage. Two companion tools generate the tests: RKMCP feeds the gaps to your AI agent, and RKTracerGen generates unit tests fully offline.

Can I generate unit tests without using AI tokens?

Yes. RKTracerGen is a deterministic, fully offline unit-test generator. It uses Boundary-Value Analysis, runs your code once to capture real expected values as an oracle, stubs every call made inside the function, emits a Unity, GoogleTest or doctest harness with a standalone Makefile, then builds and runs it. No AI, no tokens, no network.

AI Unit Test Generation from Coverage Gaps: RKMCP & RKTracerGen

Every team that takes testing seriously eventually runs a coverage report and feels a small jolt of dread. The number is lower than expected, and beneath the number is a list of red lines, untaken branches and Boolean conditions that no test has ever exercised. Measuring coverage was the part you could automate in an afternoon. Now comes the part that takes the rest of the quarter: actually writing tests that turn those red lines green.

This article is about that second half. AI unit test generation and offline test generation both exist to close coverage gaps, and the RKTracer toolset ships both. The key idea threaded through all of it is simple: a generator that knows exactly which lines and conditions are uncovered writes far better tests than one guessing in the dark. Let us walk through why the gaps are hard, the two paths to closing them, and when to reach for each.

In one sentence

RKTracer measures coverage; RKMCP and RKTracerGen generate the tests, each targeting the precise lines, branches and MC/DC conditions RKTracer reports as uncovered.

Measuring coverage is step one; the real work is the tests

A coverage tool answers one question well: of all the executable structure in your code, how much did your test suite actually run? That is genuinely useful. It tells you where the risk is hiding and stops a team from believing a thin suite is thorough. But a coverage percentage, on its own, changes nothing. The lines stay untested until somebody sits down and writes a test that reaches them.

The gap between knowing and doing is where most coverage initiatives stall. A team adopts a tool, sees 58% statement coverage, sets a target of 90%, and then discovers that the missing 32% is the awkward 32%: error branches that only fire on malformed input, defensive code for conditions the happy path never hits, and compound decisions whose individual conditions were never independently exercised. The easy tests were already written. What remains is exactly the code that is tedious, fiddly and slow to test by hand.

Why uncovered branches and MC/DC conditions are hard by hand

Not all uncovered code is equally annoying to test, and the worst of it clusters in a few predictable places. Understanding why helps explain why automation pays off here specifically.

Error and defensive branches. The else that handles a failed allocation or a corrupt header may require you to inject a fault that never occurs in normal operation. Reaching it means constructing an input or a stubbed dependency that fails on cue.
Deep decisions. A branch nested four conditions deep in a function needs a specific combination of upstream state to be reached at all, let alone toggled both ways.
MC/DC conditions. For C and C++, Modified Condition/Decision Coverage asks you to prove each condition in a decision can independently flip the outcome. That means constructing pairs of inputs that differ in exactly one condition, which is genuinely puzzle-like work to do by hand on every compound expression.
The oracle problem. Even once you reach a line, you still have to assert the right result. Knowing the expected value for a gnarly numeric routine often means running the code and reading what it produced, which is error-prone when done manually.

Each of these is mechanical but unforgiving. They are precisely the tasks where a generator that already knows the target, and can run the code, beats a human typing test fixtures one at a time.
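To make the MC/DC item concrete, here is a minimal sketch of the search for independence pairs. It is written in Python for brevity, the `decision` function is a hypothetical stand-in for a C expression such as `(a && b) || c`, and the brute-force enumeration is only an illustration of the idea, not how RKTracer or either generator works internally.

```python
from itertools import product

def decision(a: bool, b: bool, c: bool) -> bool:
    # Hypothetical compound decision, standing in for (a && b) || c in C.
    return (a and b) or c

def independence_pairs(fn, n_conditions):
    """For each condition index, collect pairs of input vectors that differ
    only in that condition and produce different decision outcomes."""
    pairs = {i: [] for i in range(n_conditions)}
    for vec in product([False, True], repeat=n_conditions):
        for i in range(n_conditions):
            if vec[i]:
                continue  # visit each pair once, starting from the False side
            flipped = vec[:i] + (True,) + vec[i + 1:]
            if fn(*vec) != fn(*flipped):
                pairs[i].append((vec, flipped))
    return pairs

pairs = independence_pairs(decision, 3)
for i, found in pairs.items():
    print(f"condition {i}: {len(found)} independence pair(s)")
```

For this expression, conditions `a` and `b` each have exactly one qualifying pair, which is why hand-written suites so often miss them: a test only counts if the other two conditions are held at the one combination that lets the flip show through.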

A coverage number tells you the size of the problem. It does not write a single line of the solution. That is the job the generator does.

Two ways to close the gaps, both in the RKTracer toolset

RKTracer is, and remains, a code coverage tool. It does not write tests itself. What the toolset adds around it are two distinct generators, and choosing between them is mostly a question of constraints, not capability.

RKMCP is the AI path. It exposes the coverage gaps as structured data over a standard protocol so the AI agent you already use can write, build and run the tests.
RKTracerGen is the offline path. It is a deterministic generator that needs no AI, no tokens and no network, and produces unit tests with real captured oracles.

Both start from the same input: the list of uncovered lines, decisions, multi-condition and MC/DC conditions that RKTracer produces. That shared starting point is the whole reason the generated tests land on real gaps instead of re-testing code that was already green.

RKMCP: serve coverage gaps to your AI agent

RKMCP is a C++ server that speaks the Model Context Protocol. It exposes the uncovered lines, decisions, multi-condition and MC/DC conditions as JSON-RPC, over stdio or HTTP, to whatever AI agent you already work with. Rather than pasting a coverage report into a chat window and hoping the model infers what matters, RKMCP hands the agent a clean, queryable description of exactly what is untested and where.
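As a rough illustration of what "structured data over JSON-RPC" means in practice, the sketch below builds one request in the general shape the Model Context Protocol uses for tool calls. The tool name `list_gaps` comes from the terminal output in this article; the argument names `file` and `kind` are assumptions made for the example, not RKMCP's documented schema.

```python
import json

def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build one JSON-RPC 2.0 request in the shape MCP uses for tool calls."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # Over stdio, each message is sent as a single line of JSON.
    return json.dumps(message)

# Hypothetical arguments: ask only for MC/DC gaps in one source file.
line = make_tool_call(1, "list_gaps", {"file": "src/parse_header.c", "kind": "mcdc"})
print(line)
```

The point of the shape is that the agent asks a narrow question and gets a machine-readable answer, instead of parsing a human-oriented report.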

With that context in hand, the agent does the work end to end. It writes the unit and functional tests, generates the Makefile or build glue they need, compiles them, runs them, and then re-checks coverage. If a gap is still open, it iterates: adjust the test, rebuild, re-run, re-measure, until the target line or condition is covered. The loop closes against real, measured coverage rather than the model's own guess about whether its test was sufficient.

terminal · RKMCP serving gaps to an agent

# Start the MCP server over stdio (or --http for HTTP)
$ rkmcp serve --coverage build/coverage.db

  ✓ exposing 41 uncovered lines, 12 decisions
  ✓ 7 MC/DC conditions without an independence pair
  listening on stdio · tools: list_gaps, get_source, run_tests

  agent → writes tests + Makefile, compiles, runs, re-checks
  ✓ gap closed: parse_header() error branch now covered

RKMCP turns a coverage database into structured tool calls your agent can act on, then re-measures after each pass.
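The write, build, run and re-measure loop described above can be sketched in a few lines. Everything here is a toy under stated assumptions: `fake_agent` and `fake_measure` are hypothetical stand-ins for the AI agent and for a real coverage re-check, and the gap names are made up for the example.

```python
def close_gaps(gaps, write_test, run_and_measure, max_passes=3):
    """Iterate until every gap is covered or the pass budget is spent.

    gaps:            identifiers of gaps that are still open
    write_test:      callable(gap, attempt) -> test (stands in for the agent)
    run_and_measure: callable(test) -> set of gaps the test actually covered
    """
    open_gaps = set(gaps)
    accepted = []
    for attempt in range(1, max_passes + 1):
        if not open_gaps:
            break
        for gap in sorted(open_gaps):
            if gap not in open_gaps:
                continue  # already closed by an earlier test in this pass
            test = write_test(gap, attempt)
            covered = run_and_measure(test) & open_gaps
            if covered:  # keep only tests that moved measured coverage
                accepted.append(test)
                open_gaps -= covered
    return accepted, open_gaps

# Toy stand-ins: the "agent" only gets the error branch right on its second try.
def fake_agent(gap, attempt):
    return {"target": gap, "good": attempt >= 2 or gap != "parse_header:error"}

def fake_measure(test):
    return {test["target"]} if test["good"] else set()

accepted, remaining = close_gaps(
    {"parse_header:error", "checksum:else"}, fake_agent, fake_measure
)
print(len(accepted), "tests accepted;", len(remaining), "gaps still open")
```

The design choice worth noticing is the acceptance rule: a test is kept only when measured coverage changes, so a failed attempt costs a pass but never pollutes the suite.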

The trade-off is worth stating plainly: every pass spends AI tokens and compute. Writing a test, building it, running it and looping back is several model calls per gap, and a large file with many open branches can run up a real bill. RKMCP buys you the flexibility and reasoning of a full agent, including functional tests and build scaffolding, at the cost of those tokens. For many teams that is an easy trade. For a tightly budgeted nightly job over a huge codebase, it may not be, which is exactly where the second path comes in.

RKTracerGen: deterministic, fully offline generation

RKTracerGen is a unit-test generator that uses no AI at all. It is deterministic, runs entirely offline, and never makes a network call or spends a token. Given a function and the coverage gaps around it, it produces a complete, buildable test in a sequence of concrete steps.

Boundary-Value Analysis. It derives input values that probe the edges of each condition, the values most likely to flip a branch or exercise an off-by-one, rather than random inputs that tend to re-walk the happy path.
A real oracle, not a guess. It runs your code once with those inputs and captures the actual values the function produced. The assertions are built from observed behavior, so the expected results are real, not invented.
Automatic stubbing. It stubs out every call the function makes internally, so the unit under test is isolated from its dependencies and the test is fast and repeatable.
A harness you can build. It emits a Unity, GoogleTest or doctest harness plus a standalone Makefile, then builds and runs the resulting binary to confirm it compiles and passes.
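The first two steps can be illustrated with a small sketch: pick boundary values around a condition, run the code once, and record what it returned. This is a simplified illustration in Python, not RKTracerGen's implementation; `clamp_percent` is a hypothetical function under test, and the six-value boundary set is the textbook choice for a closed range.

```python
def boundary_values(lo: int, hi: int) -> list:
    """Textbook boundary picks for a closed range [lo, hi]:
    just outside, on, and just inside each edge."""
    return sorted({lo - 1, lo, lo + 1, hi - 1, hi, hi + 1})

def clamp_percent(x: int) -> int:
    # Hypothetical function under test: clamps its input to 0..100.
    if x < 0:
        return 0
    if x > 100:
        return 100
    return x

def capture_oracle(fn, inputs):
    """Run the code once per input and record what it actually returned."""
    return [(value, fn(value)) for value in inputs]

cases = capture_oracle(clamp_percent, boundary_values(0, 100))

# Emit the recorded pairs as assertions, the way a generated test would carry them.
for value, expected in cases:
    print(f"assert clamp_percent({value}) == {expected}")
```

Recorded values pin down what the code does today, so a reviewer still decides whether that behavior is the intended one before committing the test.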

The result is a genuine unit test, with captured expected values, isolated from its collaborators, that you can read, edit and commit. The scope is deliberately narrower than the AI path: RKTracerGen generates unit tests, not functional tests, and it works per function or per file rather than reasoning across a whole project. In exchange you get determinism, repeatability, zero token cost and an oracle grounded in what the code actually did.
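Isolation from collaborators is easiest to see in a small example. The sketch below uses Python's standard `unittest.mock` to swap a dependency for a stub during one call; `read_sensor` and `is_overheated` are hypothetical names, and a generated C harness would achieve the same effect with link-time or source-level stubs rather than this mechanism.

```python
from unittest import mock

def read_sensor() -> int:
    # Hypothetical dependency: imagine this talks to hardware.
    raise RuntimeError("no hardware attached")

def is_overheated(limit: int) -> bool:
    # Unit under test: its only collaborator is read_sensor().
    return read_sensor() > limit

def run_with_stub(stub_value: int, limit: int) -> bool:
    """Replace read_sensor with a stub returning a fixed value, call the
    unit under test, and restore the real function afterwards."""
    stubs = {"read_sensor": lambda: stub_value}
    with mock.patch.dict(is_overheated.__globals__, stubs):
        return is_overheated(limit)

print(run_with_stub(101, 100))  # stubbed reading just above the limit
print(run_with_stub(100, 100))  # stubbed reading exactly at the limit
```

Because the stub controls the reading, the test can sit exactly on the boundary of the comparison without any hardware present, and it gives the same answer on every run.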

Why a captured oracle matters

An expected value that came from running the code is a real oracle. RKTracerGen records what the function produced rather than asking a model to predict it, which removes a whole class of plausible-but-wrong assertions.

When to use which

The two paths are complementary, not competing. A useful way to decide is to compare them on five dimensions at once.

Dimension	RKMCP (AI path)	RKTracerGen (offline path)
Engine	Your AI agent via MCP	Deterministic, no AI
Test scope	Unit and functional	Unit tests only
Cost per run	AI tokens and compute	None: no tokens, no network
Working unit	Project-wide reasoning	Per function or per file
Oracle	Agent-derived, re-checked vs coverage	Captured from a real run

In practice many teams run both. RKTracerGen sweeps the large body of straightforward functions cheaply and deterministically in CI, while RKMCP is pointed at the harder, cross-cutting gaps where an agent's reasoning and functional tests earn their token cost. You can read the full side-by-side on the test generation comparison, which lays out the two paths next to RKTracer's coverage measurement.

Why coverage-driven generation beats blind generation

The most common way AI test generation goes wrong is that it generates tests for code that was already covered. Point a model at a file and ask for tests, and it tends to write the obvious ones, the same happy-path cases your existing suite already exercises, while the awkward error branch and the unflipped MC/DC condition stay red. You spend tokens and reviewer time and barely move the needle.

Coverage-driven generation inverts that. Because RKTracer tells the generator exactly which lines, branches and conditions are uncovered, both RKMCP and RKTracerGen aim straight at the gaps. The agent is not asked to test parse_header in general; it is told that the error branch at a specific line never ran and is asked to make it run. RKTracerGen does not generate boundary values for the whole function; it targets the condition that lacks an independence pair. Every generated test aims at a real, measured deficiency, which is why the coverage number actually climbs.
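The difference between blind and coverage-driven tests can be demonstrated with a toy line tracer built on Python's standard `sys.settrace`. This is only a sketch of the principle: the `parse_header` below is a hypothetical stand-in for the function named above, and real coverage tools instrument the build rather than tracing at run time.

```python
import sys

# Hypothetical stand-in for parse_header(): one happy path, one error branch.
def parse_header(data: bytes) -> str:
    if len(data) < 4:
        return "error: truncated"
    return "ok"

def lines_run(fn, *args):
    """Return the line offsets (relative to the def line) executed by fn(*args)."""
    code = fn.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is code and event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        fn(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return seen

happy = lines_run(parse_header, b"RKHD")               # existing happy-path test
blind = lines_run(parse_header, b"RKHD-v2") - happy    # another happy-path test
newly_covered = lines_run(parse_header, b"") - happy   # input aimed at the gap

print("blind test adds lines:", sorted(blind))
print("targeted test adds lines:", sorted(newly_covered))
```

The second happy-path input adds no new lines, while the input chosen for the uncovered branch adds the error return. That is the whole argument for feeding the generator the gap list first.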

The mental model to keep

Coverage is the map of what is untested; the generator is what fills the map in.
RKMCP serves the gaps to your AI agent, which writes, builds, runs and re-checks, spending tokens each pass.
RKTracerGen generates unit tests offline with a captured oracle, no tokens and no network.
Both target the exact uncovered lines and conditions, so generated tests close real gaps.

An accuracy note on who does what

It is worth being precise about the division of labor, because conflating the roles leads to the wrong expectations. RKTracer is the measurement tool: it instruments your build and reports statement, decision, condition, multi-condition and, for C and C++, MC/DC coverage. It does not generate tests. The generation is done by RKMCP and RKTracerGen, the two tools this article is about.

That separation is a feature, not an accident. The coverage measurement stays a clean, trustworthy source of truth about what your tests actually exercise, on host and on the real target, across C, C++, CUDA, Rust, C#, Java, JavaScript, TypeScript, Go and Python. The generators consume that truth to produce tests. You can see how the measurement layer works in how RKTracer works, and the full metric set on the features and metrics page.

Generate, then re-measure

Always close the loop. A generated test is only worth committing once RKTracer confirms it moved the metric. RKMCP automates that re-check inside its loop; with RKTracerGen, re-run coverage after the harness builds to confirm the gap is closed.

The bottom line

Coverage measurement is the diagnosis. Test generation is the cure, and for a long time the cure has been the expensive, manual part: deriving boundary values, reaching defensive branches, building MC/DC independence pairs and figuring out the right expected value for each. The RKTracer toolset closes that gap two ways. RKMCP hands the precise list of uncovered lines and conditions to your AI agent and lets it write, build, run and iterate until covered, at the cost of tokens. RKTracerGen does it deterministically and offline, with boundary-value inputs and a real captured oracle, at no token cost, for unit tests.

Pick the path that fits the constraint in front of you, or run both. Either way, the principle holds: feed the generator the real gaps, and every test it writes counts. Measure with RKTracer, generate with RKMCP or RKTracerGen, and watch the red lines turn green for the right reason.

Arjun Rao

Test Automation, RKValidate

Arjun works with embedded and systems teams adopting coverage-driven test generation across C, C++ and mixed-language codebases.
