Do you test for sufficient randomness?

Posted on 10. September 2022

Some years ago, I used an existing random string generator to generate unique random strings. From one day to the next, the application started producing a lot of timeouts.
In the code, I checked whether the generated string already existed in the database. If it did, the generator ran again.
Once enough of the generated strings already existed in the database, the loop kept retrying until the maximum execution time was exceeded.

It turned out that the random string generator was not as random as I thought.

[Screenshot: calculation of the possible combinations]

With a length of 10 and 32 available characters for the generator, I would have 32^10 = 1.1258999e+15 possible combinations, so I shouldn’t run into duplicate strings that quickly - correct? Well, I did.
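To put that number into perspective, here is a minimal sketch of the arithmetic. It assumes a 64-bit PHP build and a generator that picks every character uniformly and independently. The function names are made up for this illustration, and the collision estimate uses the usual birthday-bound approximation.

```php
<?php

declare(strict_types=1);

// Number of distinct strings of a given length over an alphabet.
// Assumes the result fits into a 64-bit integer.
function possibleCombinations(int $alphabetSize, int $length): int
{
    return $alphabetSize ** $length;
}

// Birthday-bound approximation: p is roughly 1 - exp(-n(n-1) / (2N)),
// where n is the number of generated strings and N the number of combinations.
function approximateCollisionProbability(int $draws, int $combinations): float
{
    $exponent = $draws * ($draws - 1) / (2 * $combinations);

    // -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -expm1(-$exponent);
}

$combinations = possibleCombinations(32, 10);
echo $combinations, PHP_EOL; // 1125899906842624

// Chance of at least one duplicate among 1000 strings: roughly 4.4e-10.
printf("%.2e\n", approximateCollisionProbability(1000, $combinations));
```

With a truly uniform generator, a duplicate within 1000 strings would be practically impossible, so a collision in that range points to a problem in the generator itself.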

I wrote a PHPUnit test against the random generator to see what was going on.

<?php

use PHPUnit\Framework\TestCase;

class RandomGeneratorTest extends TestCase
{
    public function testForSufficientRandomness(): void
    {
        $randomGenerator = new RandomGenerator();

        $testRunCount = 1000;
        $randomStringLength = 10;
        $randomStrings = [];

        for ($i = 0; $i < $testRunCount; $i++) {
            $randomString = $randomGenerator->generateRandomStringFrom($randomStringLength);

            // Fail as soon as a string repeats one generated earlier in this run.
            $this->assertNotContains($randomString, $randomStrings, 'Random generator not sufficient');
            $randomStrings[] = $randomString;
        }
    }
}

The random generator produced a duplicate before it had even reached 1,000 generated strings.

[Screenshot: failed test against the random string generator]

The random generator had a bug. It had provided sufficient randomness for its original use case, but not for mine.

After fixing the bug, the test ran fine.

Would you include an automated test for a minimum level of randomness?

Let me know what you think about it!

Personally, I now think that, at the very least, an error should be reported to the error tracking application whenever a duplicate string is generated. Of course, you are using an error tracker like Sentry - right?

This way, the bug would have been noticed and fixed much earlier, before it reached the customer.
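One way this could look is sketched below. It is not the code from the original application. The function `generateUniqueString` and its three callables are hypothetical stand-ins for the generator, the database lookup and the error tracker call. The sketch reports every duplicate and gives up after a fixed number of attempts instead of retrying until the execution time limit is hit.

```php
<?php

declare(strict_types=1);

/**
 * All three callables are hypothetical placeholders:
 *  - $generate(): string       wraps the random string generator
 *  - $exists(string): bool     wraps the database lookup
 *  - $reportDuplicate(string)  wraps the call to the error tracker
 */
function generateUniqueString(
    callable $generate,
    callable $exists,
    callable $reportDuplicate,
    int $maxAttempts = 5
): string {
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $candidate = $generate();

        if (!$exists($candidate)) {
            return $candidate;
        }

        // A duplicate should be very rare, so every one is worth reporting.
        $reportDuplicate($candidate);
    }

    // Fail loudly instead of looping until the execution time limit.
    throw new RuntimeException(
        sprintf('No unique string found after %d attempts', $maxAttempts)
    );
}

// Usage with in-memory stand-ins for the generator, database and tracker.
$existing = ['AAAAAAAAAA' => true];
$queue = ['AAAAAAAAAA', 'BBBBBBBBBB'];
$reported = [];

$unique = generateUniqueString(
    function () use (&$queue): string { return array_shift($queue); },
    fn (string $s): bool => isset($existing[$s]),
    function (string $s) use (&$reported): void { $reported[] = $s; }
);
// $unique is 'BBBBBBBBBB', and $reported contains 'AAAAAAAAAA'.
```

The attempt limit of 5 is an arbitrary choice for the example. With a healthy generator and this many possible combinations, even a second attempt should almost never be needed.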

Also, an application monitoring tool like Tideways would have shown that the execution time was getting longer and longer over time.

Other learnings:
Do not assume that something that works for other use cases also works for yours.
Test, monitor and TEST!