Some years ago, I used an existing random string generator to generate unique random strings. From one day to the next, the application started producing
a lot of timeouts.
In the code, I checked whether the generated string already existed in the database. If it did, the
generator ran again.
Once enough of the generated strings already existed in the database, this loop kept running until the
maximum execution time was exceeded.
It turned out that the random string generator was not as random as I thought.
With a length of 10 and 32 available characters, the generator has 32^10 ≈ 1.1259e+15 possible combinations, so I shouldn't run into duplicate strings anytime soon - correct? Well, I did.
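To put a number on that expectation, here is a minimal sketch that computes the size of the space and the chance of a duplicate for a truly uniform generator, using the usual birthday approximation. The function name `collisionProbability` is made up for this illustration and is not part of the generator I used.

```php
<?php

// Hypothetical helper: approximate probability of at least one duplicate
// among $n strings drawn uniformly at random from $space possible values.
function collisionProbability(int $n, float $space): float
{
    // Birthday approximation: p ≈ 1 - exp(-n(n-1) / (2 * space)).
    // expm1() keeps precision when the exponent is very close to zero.
    return -expm1(-($n * ($n - 1)) / (2.0 * $space));
}

// 32 characters, length 10 (fits into an integer on 64-bit PHP).
$space = 32 ** 10;

printf("combinations: %.7e\n", $space);
printf("p(duplicate within 1,000 strings): %.3e\n", collisionProbability(1000, $space));
printf("p(duplicate within 1,000,000 strings): %.3e\n", collisionProbability(1000000, $space));
```

With a uniform generator, a duplicate within the first 1,000 strings should be on the order of a few in ten billion, so seeing one is a strong hint that the generator itself is broken.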
I wrote a PHPUnit test against the random generator to see what was going on.
```php
<?php

use PHPUnit\Framework\TestCase;

class RandomGeneratorTest extends TestCase
{
    public function testForSufficientRandomness(): void
    {
        $randomGenerator = new RandomGenerator();
        $testRunCount = 1000;
        $randomStringLength = 10;
        $randomStrings = [];

        for ($i = 0; $i < $testRunCount; $i++) {
            $randomString = $randomGenerator->generateRandomStringFrom($randomStringLength);

            // Fail as soon as a string shows up that was already generated in this run.
            $this->assertNotContains($randomString, $randomStrings, 'Random generator not sufficient');
            $randomStrings[] = $randomString;
        }
    }
}
```
The random generator already failed to provide sufficient randomness before reaching 1,000 generated strings.
The random generator had a bug. It had provided sufficient randomness for its original use case, but not for mine.
After fixing the bug, the test ran fine.
Would you include an automated test for a minimum level of randomness?
Let me know what you think about it!
Personally, I now think that at the very least an error should be sent to the error tracking application whenever a duplicate string is generated. Of course, you are using an error tracker like Sentry - right?
That way, the bug would have been noticed and fixed much earlier, before reaching the customer.
Also, an application monitoring tool like Tideways would have shown that the execution time was growing longer and longer over time.
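The idea of reporting every duplicate, combined with a limit on retries so the loop can never run into the execution time limit, could be sketched roughly like this. Everything here is an assumption for illustration: the function `generateUniqueString`, its callbacks and the default of 5 attempts are not taken from my original code. With the Sentry PHP SDK, the `$report` callback could for example wrap `\Sentry\captureMessage()`.

```php
<?php

/**
 * Hypothetical helper: returns a string that is not taken yet, but gives up
 * after $maxAttempts instead of retrying until the execution time is exceeded.
 *
 * $generate: produces a candidate string
 * $exists:   returns true if the candidate is already stored
 * $report:   forwards a message to the error tracker
 */
function generateUniqueString(
    callable $generate,
    callable $exists,
    callable $report,
    int $maxAttempts = 5
): string {
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $candidate = $generate();

        if (!$exists($candidate)) {
            return $candidate;
        }

        // A duplicate should be extremely rare, so each one is worth reporting.
        $report(sprintf('Duplicate random string generated (attempt %d of %d)', $attempt, $maxAttempts));
    }

    throw new RuntimeException("No unique string found after {$maxAttempts} attempts");
}

// Usage with stand-ins: an array instead of the database, error_log() as tracker.
$taken = ['aaaaaaaaaa' => true];

$unique = generateUniqueString(
    fn (): string => bin2hex(random_bytes(5)), // stand-in generator, 10 hex characters
    fn (string $candidate): bool => isset($taken[$candidate]),
    fn (string $message) => error_log($message)
);
```

A broken generator then fails fast with an exception and a trail of reports, instead of silently eating up the execution time.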
Other lessons learned:
Do not assume that something that works for other use cases will also work for yours.
Test, monitor and TEST!