A practical guide to mutation testing with Pest

Hey fellow Laravel devs! Let's talk about something that can seriously boost the way we test our code: mutation testing. We all want clean, robust code, and we put in a lot of effort to write tests. How sure are we that our tests are actually catching everything? I mean, we've all seen tests that pass but maybe aren't as thorough as we thought. That's where mutation testing comes in.

Mutation testing goes beyond code coverage to check the quality of our tests. It works by making small changes (mutations) to our code and then rerunning our tests to see if they fail. If a test still passes with a mutation, it means the test isn't really covering that specific part of the code. Think of it like a stress test for your tests. It’s not about finding bugs in your code directly, but about highlighting areas where your tests might be a bit weak. It helps us find blind spots and make our tests more robust and reliable.

We all know that 100% code coverage doesn't guarantee perfect tests. You might have lines of code that are run by tests, but that doesn't mean you're testing every case. Mutation testing ensures that our tests are actually checking the right things. This is super important for larger projects. By using this method, we can make sure that our tests are doing their job and that they're ready to catch any issues we might accidentally introduce in the future.

So, let's dive in and see how we can use mutation testing to write better, more reliable tests!

What is mutation testing?

Mutation testing is a cool technique that helps us evaluate how effective our test suites are by introducing small, deliberate changes to our codebase. These changes, called mutations, are like mini-simulations of common coding errors or edge cases. The idea is that if our tests are well-written, they should detect these mutations and fail.

When we do mutation testing, the testing tool will:

Introduce mutations: The tool automatically makes small changes in the code, such as modifying return values, altering method calls, or changing method arguments.
Rerun tests: After each mutation, the test suite is re-executed.
Analyze results: If at least one test fails after a mutation, the tests have "caught" the change (the mutant is killed), which is good. If every test still passes with the mutation in place, the mutant has survived, which points to a weakness in our test suite.
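The three steps above can be sketched in plain PHP. This is only an illustration of the idea, not how Pest or Infection work internally: the function under test, the hand-written mutants, and the runMutationTesting() helper are all hypothetical, and real tools generate mutants automatically from your source code.

```php
<?php
declare(strict_types=1);

// Original implementation under test (hypothetical example).
$original = fn (int $age): bool => $age >= 18;

// Hand-written "mutants": each one is the original with one small change.
$mutants = [
    '>= changed to >' => fn (int $age): bool => $age > 18,
    '>= changed to <' => fn (int $age): bool => $age < 18,
    'always true'     => fn (int $age): bool => true,
];

// A tiny "test suite": returns true when every check passes.
// Note that it never checks the boundary value 18.
$testSuite = fn (callable $isAdult): bool =>
    $isAdult(30) === true
    && $isAdult(10) === false;

// Rerun the suite against each mutant and record whether it was caught.
function runMutationTesting(callable $testSuite, array $mutants): array
{
    $results = [];
    foreach ($mutants as $description => $mutant) {
        // Suite fails => mutant "killed"; suite passes => mutant "survived".
        $results[$description] = $testSuite($mutant) ? 'survived' : 'killed';
    }

    return $results;
}

$results = runMutationTesting($testSuite, $mutants);

foreach ($results as $description => $status) {
    echo "{$description}: {$status}\n";
}
```

The first mutant survives because no check exercises an age of exactly 18, which is the same kind of boundary gap the shipping example later in this post runs into.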

It's important to remember that mutation testing isn't about finding bugs in our application code. It's about finding gaps in our test suite. It helps us identify areas where our tests might not be thorough enough.

Why should we care about mutation testing?

The scenario: shipping cost calculation with a hidden flaw

Imagine a function to calculate shipping costs based on order weight and premium membership status. The logic is:

Base cost: Shipping starts at $10.00.
Weight surcharge: For every kilogram over 5kg, an additional $2.50 is added.
Premium discount: Premium members receive a 10% discount. This discount should only apply if the order weight is greater than 2kg to prevent abuse on very light, already cheap shipments.

Here's the PHP code with a tiny, but significant, bug:

declare(strict_types=1);
 
function calculateShippingCost(float $weightInKilograms, bool $isPremiumMember): float
{
    $shippingCost = 10.00; // Base shipping cost
 
    if ($weightInKilograms > 5) {
        $shippingCost += ($weightInKilograms - 5) * 2.50; // Additional cost per kg over 5kg
    }
 
    // Premium members get a discount, but only if weight is over 2kg
    if ($isPremiumMember && $weightInKilograms >= 2) { // Subtle bug: Should be > 2, not >= 2 for discount to apply correctly
        $shippingCost *= 0.90; // 10% discount for premium members
    }
 
    return $shippingCost;
}

The "invisible issue": a sneaky off-by-one error

The bug is in the premium discount condition: if ($weightInKilograms >= 2). It should be if ($weightInKilograms > 2). This >= operator means orders weighing exactly 2kg get the discount, which might be unintended. It's a subtle edge case, easy to overlook.

Pest test suite (initial, potentially flawed tests)

Let's create a Pest test suite that, while seemingly covering the function, might miss this subtle bug.

use function calculateShippingCost;
 
it('calculates base shipping cost for light orders', function () {
    expect(calculateShippingCost(1, false))->toBe(10.00);
    expect(calculateShippingCost(4.9, false))->toBe(10.00);
});
 
it('adds weight surcharge for heavier orders', function () {
    expect(calculateShippingCost(6, false))->toBe(12.50); // 10 + (1 * 2.50)
    expect(calculateShippingCost(10, false))->toBe(22.50); // 10 + (5 * 2.50)
});
 
it('applies premium discount for members on orders over 2kg', function () {
    expect(calculateShippingCost(2.1, true))->toBe(9.00); // 10.00 * 0.90
    expect(calculateShippingCost(6, true))->toBe(11.25); // 12.50 * 0.90
});
 
it('does not apply premium discount for members on very light orders', function () {
    expect(calculateShippingCost(1.9, true))->toBe(10.00); // No discount
    expect(calculateShippingCost(0.5, true))->toBe(10.00); // No discount
});
 
it('handles zero weight', function () {
    expect(calculateShippingCost(0, false))->toBe(10.00);
    expect(calculateShippingCost(0, true))->toBe(10.00); // No discount on zero weight
});

Why standard tests might miss the issue:

Notice that the test suite includes cases for:

Base shipping cost.
Weight surcharge.
Premium discount for orders over 2kg.
No premium discount for very light orders (under 2kg).
Zero weight orders.

However, it lacks a specific test case for an order weighing exactly 2kg for a premium member. The test suite makes assumptions about "over 2kg" but doesn't explicitly check the boundary condition at 2kg. Therefore, these tests will all pass even with the bug (>= 2 instead of > 2).

How mutation testing catches the invisible issue:

Now, let's introduce mutation testing using Infection PHP. Infection will automatically modify your code in small ways ("mutations") to see if your tests can "kill" these mutants (i.e., cause a test to fail).

When Infection runs, it might create a mutant by changing the >= operator to > in the premium discount condition:

Mutated code (hypothetical mutation by Infection):

if ($isPremiumMember && $weightInKilograms > 2) { // Mutation: >= changed to >
    $shippingCost *= 0.90;
}

With this mutation in place, the whole Pest test suite still passes. The discount test only checks weights above 2kg (2.1kg and 6kg), and the light-order test only checks weights below it (1.9kg and 0.5kg), so no test can tell >= 2 apart from > 2.

A mutant that no test kills is said to have survived (Infection's report lists these as escaped mutants). Infection would likely report this one, pointing straight at the untested 2kg boundary.

To specifically target this potential issue, we can add a new Pest test case that focuses on the 2kg boundary:

Enhanced Pest test suite (adding the crucial test):

use function calculateShippingCost;
 
// ... (previous tests remain) ...
 
it('does NOT apply premium discount for members on orders exactly 2kg (boundary test)', function () {
    expect(calculateShippingCost(2, true))->toBe(10.00); // Should be base cost, no discount
});

Now, if we run the original code with this new test, this test will fail because calculateShippingCost(2, true) currently returns 9.00 (due to the >= 2 bug), but the test expects 10.00.

Once the condition is fixed to > 2, the suite passes again and the boundary is protected. If we rerun mutation testing on the fixed code, Infection will likely generate the opposite mutant (> changed to >=), and the new test will kill it: at exactly 2kg the mutant returns 9.00 while the test expects 10.00.
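For reference, here is one possible fix, assuming the intended rule really is "strictly over 2kg". The name calculateShippingCostFixed is hypothetical and only used here to keep it apart from the buggy version above; in a real project you would simply change the operator in place.

```php
<?php
declare(strict_types=1);

// Hypothetical corrected version: the discount applies only above 2kg.
function calculateShippingCostFixed(float $weightInKilograms, bool $isPremiumMember): float
{
    $shippingCost = 10.00; // Base shipping cost

    if ($weightInKilograms > 5) {
        $shippingCost += ($weightInKilograms - 5) * 2.50; // Surcharge per kg over 5kg
    }

    if ($isPremiumMember && $weightInKilograms > 2) { // was >= 2
        $shippingCost *= 0.90; // 10% discount for premium members
    }

    return $shippingCost;
}

// Values just below, exactly on, and just above the boundary.
var_dump(calculateShippingCostFixed(1.9, true)); // float(10)
var_dump(calculateShippingCostFixed(2.0, true)); // float(10), no discount at exactly 2kg
var_dump(calculateShippingCostFixed(2.1, true)); // float(9)
```

Checking all three points around a threshold is what makes it hard for a changed comparison operator to slip through unnoticed.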

Mutation testing vs. code coverage

Code coverage tells you if your code is being executed by tests, while mutation testing tells you how well your code is being tested. You might have 100% code coverage, but if your tests aren't checking the right things, they won't catch mutations.
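To make that concrete, here is a minimal plain-PHP sketch. Everything in it is hypothetical (the function, the hand-written mutant, and the two "checks" standing in for tests). The weak check executes every line, so a coverage report would show 100%, yet it cannot tell the original from the mutant.

```php
<?php
declare(strict_types=1);

// Hypothetical function under test: members pay half price.
function applyMemberDiscount(float $price, bool $isMember): float
{
    return $isMember ? $price * 0.5 : $price;
}

// A mutant of the same function: the multiplication became a division.
function applyMemberDiscountMutant(float $price, bool $isMember): float
{
    return $isMember ? $price / 0.5 : $price;
}

// Weak check: runs both branches (full line coverage) but only looks at the type.
$weakCheck = fn (callable $fn): bool =>
    is_float($fn(100.0, true)) && is_float($fn(100.0, false));

// Strong check: same inputs, but compares the actual values.
$strongCheck = fn (callable $fn): bool =>
    $fn(100.0, true) === 50.0 && $fn(100.0, false) === 100.0;

var_dump($weakCheck('applyMemberDiscount'));         // bool(true)
var_dump($weakCheck('applyMemberDiscountMutant'));   // bool(true): the mutant survives
var_dump($strongCheck('applyMemberDiscount'));       // bool(true)
var_dump($strongCheck('applyMemberDiscountMutant')); // bool(false): the mutant is killed
```

Both checks give the same coverage number; only the second one would earn a good mutation score.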

Here's a breakdown of the key differences:

Code coverage:

Measures the percentage of your code that is executed by tests.
Indicates which parts of your code are "touched" by your tests.
Aims to ensure that all lines of code are executed at least once during testing.
Can be a good starting point, but high code coverage doesn't guarantee robust tests.
It can miss edge cases or logical errors that are not caught by simple execution.

Mutation testing:

Evaluates the effectiveness of your tests by introducing small changes to the code.
Checks if the tests can detect the changes introduced by mutations.
Aims to ensure that tests assert on specific conditions or data, rather than merely executing the code.
Identifies areas where tests may be too superficial or not comprehensive.
A high mutation score indicates a more thorough test suite that is less likely to miss regressions.

While hitting a 100% mutation score may not always be necessary or possible, it's a good goal to aim for. Focusing on improving your mutation score will help you find gaps in your test suite and write more effective tests.
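As a rough mental model, the score is the share of generated mutants that your tests detected. The helper below is a hypothetical sketch of that arithmetic only; each tool defines its own exact formula (for example, how uncovered or timed-out mutants are counted), so treat the numbers your tool reports as the source of truth.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: percentage of generated mutants that the tests detected.
function mutationScore(int $detectedMutants, int $totalMutants): float
{
    if ($totalMutants === 0) {
        return 0.0; // Nothing was mutated, so there is nothing to score.
    }

    return round($detectedMutants / $totalMutants * 100, 2);
}

echo mutationScore(45, 50), "%\n"; // 90%
```

In this example, 5 of the 50 mutants went undetected, and each of them is a concrete pointer to a missing or weak assertion.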

Getting started with mutation testing

Now that we know what mutation testing is and how it's different from code coverage, let's see how to use it in our Laravel projects with Pest. Pest is a testing framework for PHP that has mutation testing built in.

Before you start, make sure you have Xdebug 3.0+ or PCOV installed and configured. These tools are necessary for Pest to do mutation testing. If you use Laravel Herd, see the Herd documentation to learn how to set up Xdebug.

To get started with mutation testing in Pest, follow these steps:

Specify which parts of your code should be covered by your tests. At the beginning of your test file, call covers(...) or mutates(...) with the classes or methods that the tests in that file cover. Both functions are identical for mutation testing purposes; however, the covers() function will also filter the code coverage report.

// ...
 
covers(ProjectController::class, StoreProjectRequest::class, UpdateProjectRequest::class);
 
describe('ProjectController store', function () {
    it('should create projects enabled by default', function () {
        $user = User::factory()->create();
 
        $this->actingAs($user)
            ->post(route('projects.store'), [
                'name' => 'My project',
                'topic' => 'My topic',
                'description' => 'My description',
                'urls' => ['https://example.com'],
                'cron_expression' => '0 0 * * *',
            ])
            ->assertRedirect(route('dashboard'));
 
        $project = $user->projects()->first();
 
        expect($project)
            ->name->toBe('My project')
            ->topic->toBe('My topic')
            ->description->toBe('My description')
            ->urls->toBe(['https://example.com'])
            ->cron_expression->toBe('0 0 * * *')
            ->enabled->toBeTrue();
    });
 
    it('should redirect to login if user is not authenticated', function () {
        $response = $this->post(route('projects.store'));
 
        $response->assertRedirect(route('login'));
    });
 
    it('validates name is required', function () {
        $user = User::factory()->create();
 
        $this->actingAs($user)
            ->post(route('projects.store'), [
                'topic' => 'My topic',
                'description' => 'My description',
                'urls' => ['https://example.com'],
                'cron_expression' => '0 0 * * *',
            ])
            ->assertSessionHasErrors(['name' => 'The name field is required.']);
    });
});

Notice how we use covers(ProjectController::class, StoreProjectRequest::class, UpdateProjectRequest::class). This will run mutations in all these files so we can make sure the whole workflow is fully covered.

You can see more examples in the ProjectController tests.

Run Pest with the --mutate option. This command will start mutation testing. It's recommended to use the --parallel option to speed things up by running tests in parallel:

./vendor/bin/pest --mutate --parallel

Pest will then re-run your tests against mutated code. If every test still passes with a mutation in place, the tests are not really checking that specific part of the code, and Pest will output the mutation and the diff of the code.
Analyze the results. Pest will output information about:
- Tested mutations: These are mutations that were detected by your test suite, meaning that your tests were able to catch the changes introduced by the mutation.
- Untested mutations: These are mutations that were not detected by your test suite. This means that the test was not able to catch the change, indicating a gap in the tests.
- Mutation score: This is a percentage that indicates the quality of your test suite. A score of 100% means that all mutations were "tested," which is the goal of mutation testing.
Improve your tests. If you find untested mutations or a low mutation score, you'll need to write additional or better tests to cover the uncovered code or edge cases. After you've improved your tests, rerun Pest with the --mutate option to confirm that the mutations are now tested and that your mutation score has improved.

Key points to remember:

Pest will only run the tests covering the mutated code to speed up the process.
Pest caches mutations to speed up subsequent runs.
You can use parallel execution to run multiple tests to further speed up the process.
A higher mutation score means a better test suite. A score below 100% typically means that you have missing tests or that your tests are not covering all the edge cases.

By following these steps, you can start using mutation testing to find weaknesses in your test suite and improve the quality of your tests.

Options & modifiers

Pest's mutation testing has a bunch of options and modifiers to fine-tune the process. These options let you customize how mutations are generated, which tests are run, and how the results are handled. Here are some of the most important ones:

@pest-mutate-ignore: This modifier lets you ignore specific lines of code when generating mutations. This is helpful when you have code that shouldn't be mutated. To use it, just add the comment // @pest-mutate-ignore on the line you want to ignore.

public function rules(): array
{
    return [
        'name' => 'required',
        'email' => 'required|email', // @pest-mutate-ignore
    ];
}

--covered-only: This option restricts mutations to only the lines of code that are covered by your tests. This option can speed up the mutation testing process, as it will only target the parts of the code that are executed by your tests.
--bail: This option stops mutation testing as soon as an untested or uncovered mutation is detected. This option can be helpful for quickly identifying issues and speeding up the feedback loop.
--class: This option allows you to generate mutations for a specific class or classes. For example, if you only want to run mutation testing on the App\Models namespace you can specify --class=App\Models.
--ignore: This option ignores mutations in the specified class or classes. For example, if you want to skip mutations in the App\Http\Requests namespace you can specify --ignore=App\Http\Requests.
--stop-on-uncovered: This option stops mutation testing as soon as an uncovered mutation is detected. This is similar to the --bail option but will only stop on uncovered mutations and not untested mutations.
--stop-on-untested: This option stops mutation testing as soon as an untested mutation is detected.

Check the full list of options on the Options & Modifiers docs.

If you want to learn more about how to set up and write mutation tests, take a look at this project on GitHub, where you can find real-life examples of tests.

Mutation testing helps you write better tests by revealing areas of your code that are not adequately covered or have edge cases you may have missed. By understanding the difference between tested and untested mutations, and utilizing the various options and modifiers available, you can significantly improve the robustness and reliability of your code.