Approaches for PHPUnit tests using a database
10 Jun 2017To be executed often unit tests need to run fast and there should be no problems with their initial configuration. Otherwise, nobody would use them as a tool to improve the whole process, but threat them like annoying requirements.
Most problems cause tests using a database. How to setup? How to make it run fast and without problem on CI machine as on developers computers?
Approaches for unit tests using a database
There are several ways to maintain a database schema for test purposes. The most popular are:
-
Prepare the DB at once
The point here is to keep scripts(migrations) to create an up-to-date schema in the version control system. Scripts can also be used to supply the initial data. Tests are executed in a transaction and are always rolled back, so the initial data remains and you keep the database clean. The main issue is to remember to update the test schema when updating production. Sometimes developers don’t care to create migration database script. They just update the production/common database. In this case they can’t do it. Test schema must be updated on every developer’s computer too. -
Drop & Create
Generate and drop the schema every time the test is run. To avoid creating useless tables and save execution time for every test there is list of table names to create. When used with a memory database, like SQLite it’s very fast. The problem starts when you want to use your production database engine’s specific features what SQLite don’t have. The slight disadvantage is that the developers need to write the tables for every test class. Sometimes on an old project, with many references between tables, it can be very long list to write. -
Update when needed
Before test execution there is check in code whether the schema is up to date and update it if it isn’t. It’s a very convenient solution because you don’t have to worry about maintaining every schema separately. But the check does take a slight amount of additional test execution time. Using this approach tests can be executed in a transaction to remove saved data or truncate used tables.
What about the initial data in the database
Again there are three ways(have I mentioned 3 is the magic number?) to use initial data in the schema:
- Factories
Write factories to populate the schema with the required data. It’s a class which helps us to create specific database data, ie create user with assigned department . This gives great test readability, look at the following example:
/**
* @test
*/
public function shouldGetCompanyUsers()
{
//given
$company = \Tests\CompanyFactory::createCompany(); //create company and insert to db
$newUser = \Tests\UserFactory::createUserForCompany($company->id);
//when
$users = User::getAllForCompany($company->id);
//then
$this->assertEquals($newUser->id, $users[0]->id);
}
When reading it the developer can see exactly what data was created and it’s easy to understand how the tested function should work.
- Fixtures
Another option is to use fixtures and execute them (all or selected) before every test. This is clean solution and well supported by most frameworks. It does give more control of the initial data in tests. When you need to populate a single test with specific data you can make a specific fixture (which can sometimes lead to having a large number of fixture files and a mess) or use factories. There is a small problem with fixtures - you don’t know the values of the inserted records. You have to make an additional fetch if you need them in the tested function/assertion.
/**
* @test
*/
public function shouldGetCategories()
{
//given
$this->addFixtures([
new Companies(),
new Users()
]);
$company = Company::where()->first(); //additional fetch for fixture data
//when
$fetchUsers = User::getAllForCompany($company->id);
//then
$this->assertCount(1, $fetchUsers);
$newUser = User::where()->first(); //additional fetch for fixture data
$this->assertEquals($newUser->id, $fetchUsers[0]->id); //to make this assertion works
}
- SQL script
Use script with SQL queries to populate the database with data when creating tables. I would not recommend it because when someone else is looking at the test case it’s not obvious where the data comes from. Also every developer would need to have good understanding of what the initial data in a project is. An additional problem is that rows in the initial data are bounded to specific tests by ids/names/etc. That makes it hard to edit in the future.
/**
* @test
*/
public function shouldGetCategories()
{
//given
$companyId = 1; //where that data comes from? Ah... specific `id` from initial data - bad approach!
//or:
$company = Company::where(['name'=>'companyWithUser'])->first(); //need to distinguish somehow companies
//when
$fetchUsers = User::getAllForCompany($company->id);
//then
$this->assertCount(1, $fetchUsers);
$newUser = User::where(['name'=>'userWithCompany'])->first(); //additional fetch for schema data
$this->assertEquals($newUser->id, $fetchUsers[0]->id); //to make this assertion works
}
Many developers use an id
to bind schema data to the tests. At the beginning it looks harmless, but later it makes complete madness when reading the tests.
Every time you have to go to the file initial_data_new_new.sql
and look for id=2321365
, to know what is happening in test,it really doesn’t make your life easier.