Several things are true:
- We want to be confident that we can deploy "master" of our code to production.
- We want to cross-check our various deployed code bases with each other prior to them hitting production.
- We want "gating" code checks to not take too long (~10 mins total), as that reduces developer productivity.
- We want to be able to merge patches quickly and with confidence, especially for incident response.
This means we want all our code to be covered "adequately" by tests, we want to test all our extensions / skins / etc. together, and we want these tests not to take forever. These goals are in moderate opposition to each other.
As of mid-April 2019, the interlock "gate" only covers 31 of the 190 extensions and skins in production, and yet is already far too slow (~25 minutes for a full test run; 12 minutes for the wmf-quibble-core-vendor-mysql-hhvm-docker job, of which 5 minutes is the actual PHPUnit run). Requests to add new, critical extensions to the gate (like Scribunto in T125050) are objected to on the grounds that this would be the straw that breaks the camel's back. Even with no new code bases added, the gate gets slower as teams increase the depth and breadth of their test coverage, whether as maintenance in response to discovered bugs or, in particular, as they add features and configuration options.
Very roughly, we can approximate that every test you write steals time from every other developer, so you should have a good reason to add that twelfth edge case test.
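To put the figures above side by side, here is a minimal Python sketch of the arithmetic. The constants are the numbers quoted in this description; the function `developer_seconds_per_day` and its `gate_runs_per_day` parameter are hypothetical, introduced only to illustrate the "every test steals time" approximation, and no real run count is implied.

```python
# Figures quoted in the description above (mid-April 2019).
GATED_REPOS = 31        # extensions and skins covered by the gate
PRODUCTION_REPOS = 190  # extensions and skins deployed to production
FULL_RUN_MIN = 25       # approximate duration of a full gate run
TARGET_MIN = 10         # desired upper bound for gating checks
JOB_MIN = 12            # wmf-quibble-core-vendor-mysql-hhvm-docker job
PHPUNIT_MIN = 5         # PHPUnit portion of that job

coverage = GATED_REPOS / PRODUCTION_REPOS
overrun = FULL_RUN_MIN / TARGET_MIN
phpunit_share = PHPUNIT_MIN / JOB_MIN


def developer_seconds_per_day(test_seconds: float, gate_runs_per_day: int) -> float:
    """Hypothetical cost model: every gate run pays for every test once."""
    return test_seconds * gate_runs_per_day


print(f"gate coverage: {coverage:.0%}")
print(f"full run vs. target: {overrun:.1f}x")
print(f"PHPUnit share of the job: {phpunit_share:.0%}")
```

Under this model, a test that adds one second costs as many seconds of waiting per day as there are gate runs, which is why a marginal edge-case test needs a good justification.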
In the PHPUnit test report for the most recent merge of MediaWiki core, 77 entries (test classes and individual test methods) take roughly 1000 ms or more, an entirely arbitrary round threshold:
Class or method | Duration (longest first) | Fail | Skip | Pass | Total |
---|---|---|---|---|---|
CirrusSearch\SearcherTest | 1 min 18 sec | 0 | 0 | 4 | 4 |
CirrusSearch\SearcherTest::testSearchText | 1 min 18 sec | 0 | 0 | 1203 | 1203 |
Wikibase\Repo\Tests\Store\Sql\WikiPageEntityMetaDataLookupTest | 9.7 sec | 0 | 0 | 36 | 36 |
Babel\Tests\BabelTest | 9.4 sec | 0 | 0 | 8 | 8 |
Wikibase\Repo\Tests\Api\ApiUserBlockedTest | 8.6 sec | 0 | 0 | 3 | 3 |
Wikibase\Lib\Tests\SimpleCacheWithBagOStuffTest | 8 sec | 0 | 0 | 35 | 35 |
SpecialPageFatalTest | 7.5 sec | 0 | 0 | 2 | 2 |
SpecialPageFatalTest::testSpecialPageDoesNotFatal | 7.5 sec | 0 | 0 | 193 | 193 |
Wikibase\Repo\Tests\Api\ApiUserBlockedTest::testBlock | 7.4 sec | 0 | 0 | 18 | 18 |
CirrusSearch\LanguageDetectTest | 6.9 sec | 0 | 0 | 3 | 3 |
CirrusSearch\LanguageDetectTest::testTextCatDetector | 6.8 sec | 0 | 0 | 9 | 9 |
EchoDiscussionParserTest | 6.3 sec | 0 | 0 | 11 | 11 |
Wikibase\Lib\Tests\Store\Sql\SqlEntityInfoBuilderTest | 5.9 sec | 0 | 0 | 21 | 21 |
Wikibase\Repo\Tests\Store\StoreTest | 5.7 sec | 0 | 0 | 3 | 3 |
Wikibase\Repo\Tests\Store\StoreTest::testRebuild | 5.7 sec | 0 | 0 | 1 | 1 |
Wikibase\Repo\Tests\Api\EditEntityTest | 5.4 sec | 0 | 0 | 10 | 10 |
AbuseFilterConsequencesTest | 5 sec | 0 | 0 | 2 | 2 |
ResourcesTest | 5 sec | 0 | 0 | 8 | 8 |
Wikibase\Repo\Tests\Actions\EditEntityActionTest | 3.9 sec | 0 | 0 | 3 | 3 |
Wikibase\Repo\Tests\Api\GetEntitiesTest | 3.8 sec | 0 | 0 | 5 | 5 |
ResourcesTest::testFileExistence | 3.7 sec | 0 | 0 | 2947 | 2947 |
CirrusSearch\Maintenance\ScriptsRunnableTest::testScriptCanBeLoaded | 3.7 sec | 0 | 0 | 14 | 14 |
Wikibase\Repo\Tests\Api\SetClaimTest | 3.7 sec | 0 | 0 | 8 | 8 |
Wikibase\Repo\Tests\Api\GetEntitiesTest::testGetEntities | 3.5 sec | 0 | 0 | 168 | 168 |
Wikibase\Repo\Tests\Store\Sql\WikiPageEntityStoreTest | 3.4 sec | 0 | 0 | 27 | 27 |
MessageIndexTest | 3.2 sec | 0 | 0 | 2 | 2 |
MessageIndexTest::testMessageIndexImplementation | 3.2 sec | 0 | 0 | 4 | 4 |
Wikibase\Repo\Tests\Api\SetSiteLinkTest | 3.2 sec | 0 | 0 | 7 | 7 |
Wikibase\Repo\Tests\Api\EditEntityTest::testEditEntity | 3.1 sec | 0 | 0 | 28 | 28 |
Wikibase\Repo\Tests\Api\SetAliasesTest | 3 sec | 0 | 0 | 13 | 13 |
AutoLoaderStructureTest | 3 sec | 0 | 0 | 4 | 4 |
EchoDiscussionParserTest::testGenerateEventsForRevision_mentionStatus | 3 sec | 0 | 0 | 11 | 11 |
Wikibase\Client\Tests\Usage\UsageTrackingIntegrationTest | 3 sec | 0 | 0 | 6 | 6 |
EchoDiscussionParserTest::testGenerateEventsForRevision | 2.7 sec | 0 | 0 | 8 | 8 |
Wikibase\Repo\Tests\Api\ApiXmlFormatTest | 2.7 sec | 0 | 0 | 12 | 12 |
AbuseFilterTest | 2.4 sec | 0 | 0 | 4 | 4 |
Wikibase\Repo\Tests\Api\SetSiteLinkTest::testSetSiteLink | 2.3 sec | 0 | 0 | 14 | 14 |
Wikibase\Repo\Tests\Specials\SpecialNewItemTest | 2.2 sec | 0 | 0 | 10 | 10 |
Wikibase\Lib\Tests\Formatters\ItemPropertyIdHtmlLinkFormatterTest | 2 sec | 0 | 0 | 22 | 22 |
Wikibase\Repo\Tests\Specials\SpecialNewPropertyTest | 2 sec | 0 | 0 | 5 | 5 |
Wikibase\Repo\Tests\Api\BotEditTest | 2 sec | 0 | 0 | 3 | 3 |
Wikibase\Repo\Tests\Api\BotEditTest::testBotEdits | 1.9 sec | 0 | 0 | 14 | 14 |
Wikibase\Repo\Tests\Api\FormatSnakValueTest | 1.9 sec | 0 | 0 | 4 | 4 |
Wikibase\Repo\Tests\Api\FormatSnakValueTest::testApiRequest | 1.8 sec | 0 | 0 | 15 | 15 |
Wikibase\Repo\Tests\Api\SetDescriptionTest | 1.6 sec | 0 | 0 | 8 | 8 |
Wikibase\Repo\Tests\Api\RemoveClaimsTest | 1.6 sec | 0 | 0 | 4 | 4 |
Wikibase\Repo\Tests\Api\SetAliasesTest::testSetAliases | 1.6 sec | 0 | 0 | 14 | 14 |
AbuseFilterTest::testGenerateTitleVars | 1.5 sec | 0 | 0 | 42 | 42 |
AbuseFilterConsequencesTest::testFilterConsequences | 1.5 sec | 0 | 0 | 16 | 16 |
ExtensionJsonValidationTest | 1.5 sec | 0 | 0 | 1 | 1 |
ExtensionJsonValidationTest::testPassesValidation | 1.5 sec | 0 | 0 | 43 | 43 |
Wikibase\Repo\Tests\Actions\EditEntityActionTest::testUndoRevisions | 1.5 sec | 0 | 0 | 6 | 6 |
Wikibase\Repo\Tests\Actions\EditEntityActionTest::testUndoSubmit | 1.4 sec | 0 | 0 | 20 | 20 |
Flow\Tests\PermissionsTest | 1.4 sec | 0 | 0 | 2 | 2 |
Flow\Tests\PermissionsTest::testPermissions | 1.4 sec | 0 | 0 | 104 | 104 |
ApiCoreThankIntegrationTest | 1.4 sec | 0 | 0 | 10 | 10 |
Wikibase\Repo\Tests\Api\PermissionsTest | 1.4 sec | 0 | 0 | 3 | 3 |
Wikibase\Repo\Tests\Api\SetReferenceTest | 1.4 sec | 0 | 0 | 6 | 6 |
Wikibase\Repo\Tests\Api\SetLabelTest | 1.4 sec | 0 | 0 | 7 | 7 |
Wikibase\Lib\Tests\LanguageFallbackChainFactoryTest | 1.3 sec | 0 | 0 | 4 | 4 |
ApiStructureTest | 1.3 sec | 0 | 0 | 2 | 2 |
Flow\Tests\Collection\PostCollectionTest | 1.2 sec | 0 | 0 | 8 | 8 |
LanguageSearchTest::testSearch | 1.2 sec | 0 | 0 | 16 | 16 |
Wikibase\Repo\Tests\Store\Sql\WikiPageEntityRedirectLookupTest | 1.2 sec | 0 | 0 | 8 | 8 |
Wikibase\Lib\Tests\LanguageFallbackChainFactoryTest::testNewFromLanguage | 1.2 sec | 0 | 0 | 25 | 25 |
Wikibase\Repo\Tests\Api\RemoveReferencesTest | 1.2 sec | 0 | 0 | 4 | 4 |
Babel\Tests\BabelTest::testGetUserLanguageInfo | 1.1 sec | 0 | 0 | 2 | 2 |
Wikibase\Repo\Tests\Api\SetQualifierTest | 1.1 sec | 0 | 0 | 3 | 3 |
Babel\Tests\BabelTest::testGetUserLanguages | 1.1 sec | 0 | 0 | 2 | 2 |
Wikibase\Lib\Tests\Store\Sql\SqlEntityInfoBuilderTest::testGivenInvalidArguments_constructorThrowsException | 1.1 sec | 0 | 0 | 5 | 5 |
ApiMobileViewConvertTitleTest | 1 sec | 0 | 0 | 5 | 5 |
Babel\Tests\BabelTest::testRenderDefaultLevelNoCategory | 1 sec | 0 | 0 | 2 | 2 |
Wikibase\Repo\Tests\Api\RemoveQualifiersTest | 1 sec | 0 | 0 | 4 | 4 |
Wikibase\Repo\Tests\Store\Sql\PropertyInfoTableBuilderTest | 1 sec | 0 | 0 | 4 | 4 |
Wikibase\Repo\Tests\Store\WikiPageEntityStorePermissionCheckerTest | 1 sec | 0 | 0 | 4 | 4 |
Babel\Tests\BabelTest::testRenderDefaultLevel | 1 sec | 0 | 0 | 2 | 2 |
Wikibase\Repo\Tests\Api\SetDescriptionTest::testSetDescription | 1 sec | 0 | 0 | 9 | 9 |
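For anyone who wants to re-derive such a list, the following is a rough Python sketch that parses rows in the format shown in the table and applies a duration threshold. The helper names `parse_duration` and `parse_row` are made up for this illustration, and the sketch assumes the report only ever prints durations as "N min M sec" or "N sec".

```python
import re


def parse_duration(text: str) -> float:
    """Convert a duration such as '1 min 18 sec' or '9.7 sec' to seconds."""
    match = re.fullmatch(r"(?:(\d+)\s*min\s*)?(\d+(?:\.\d+)?)\s*sec", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + float(seconds)


def parse_row(line: str) -> tuple[str, float, int]:
    """Split one table row into (name, duration in seconds, total cases)."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    name, duration, _fail, _skip, _passed, total = cells
    return name, parse_duration(duration), int(total)


# Three rows copied from the table above.
SAMPLE = [
    r"CirrusSearch\SearcherTest | 1 min 18 sec | 0 | 0 | 4 | 4 |",
    r"SpecialPageFatalTest | 7.5 sec | 0 | 0 | 2 | 2 |",
    r"ApiMobileViewConvertTitleTest | 1 sec | 0 | 0 | 5 | 5 |",
]

THRESHOLD_SEC = 1.0  # the arbitrary 1000 ms cut-off
slow = [row for row in map(parse_row, SAMPLE) if row[1] > THRESHOLD_SEC]
print(slow)
```

Because the report rounds durations for display, an entry shown as "1 sec" fails a strict comparison against the displayed value; filtering on the raw timings would be more faithful.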
Obviously these are not directly comparable for a variety of reasons:
- These are shown by class; some extensions, like Cirrus, merge all their tests into a few classes, whereas others, like Wikibase, spread them out. This makes Cirrus look massive and flatters Wikibase, whose huge number of tests is split across many smaller entries.
- Individual tests can be trivial (e.g. ResourcesTest::testFileExistence runs all 2947 file checks in 3.7 sec) or very complex (the solitary Wikibase\Repo\Tests\Store\StoreTest::testRebuild takes 5.7 sec).
- Some tests are for features/behaviour that is particularly critical and it's reasonable to test every edge case we can think of; similarly, some tests are in areas where we've had repeated major issues, and we want to avoid further regressions.
- Some tests are driven by use (e.g. ResourcesTest is a structure test for files registered by extensions and skins; although it might well be made more efficient, its run time is mostly driven by the rest of us).
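One way to make the entries somewhat more comparable is to divide each duration by the number of cases it covers. The sketch below does this for three rows taken from the table; the helper `ms_per_case` is a name invented here, and the result is only a rough average that ignores setup cost and the other caveats listed above.

```python
# (name, duration in seconds, number of cases), copied from the table above.
ROWS = [
    (r"ResourcesTest::testFileExistence", 3.7, 2947),
    (r"CirrusSearch\SearcherTest::testSearchText", 78.0, 1203),
    (r"Wikibase\Repo\Tests\Store\StoreTest::testRebuild", 5.7, 1),
]


def ms_per_case(seconds: float, cases: int) -> float:
    """Average wall-clock time per case, in milliseconds."""
    return seconds * 1000 / cases


# Cheapest per case first.
for name, seconds, cases in sorted(ROWS, key=lambda r: ms_per_case(r[1], r[2])):
    print(f"{name}: {ms_per_case(seconds, cases):.1f} ms per case")
```

On these figures the file-existence checks average a little over 1 ms each, the Cirrus search cases about 65 ms each, and the single Wikibase rebuild test the full 5.7 seconds.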
However, it's worth reflecting on whether we're over-testing in the areas most likely to break and most important for users.