====================
Bazaar Testing Guide
====================


The Importance of Testing
=========================

Reliability is a critical success factor for any version control system.
We want Bazaar to be highly reliable across multiple platforms while
evolving over time to meet the needs of its community.

In a nutshell, this is what we expect and encourage:

* New functionality should have test cases.  Preferably write the
  test before writing the code.

  In general, you can test at either the command-line level or the
  internal API level.  See `Writing tests`_ below for more detail.

* Try to practice Test-Driven Development: before fixing a bug, write a
  test case so that it does not regress.  Similarly for adding a new
  feature: write a test case for a small version of the new feature before
  starting on the code itself.  Check the test fails on the old code, then
  add the feature or fix and check it passes.

By doing these things, the Bazaar team gets increased confidence that
changes do what they claim to do, whether provided by the core team or
by community members.  Equally importantly, we can be surer that changes
down the track do not break new features or bug fixes that you are
contributing today.

As of September 2009, Bazaar ships with a test suite containing over
23,000 tests and growing.  We are proud of it and want to remain so.  As
community members, we all benefit from it.  Would you trust version control
on your project to a product *without* a test suite like Bazaar has?


Running the Test Suite
======================

As of Bazaar 2.1, you must have the testtools_ library installed to run
the bzr test suite.

.. _testtools: https://launchpad.net/testtools/

To test all of Bazaar, just run::

  bzr selftest

With ``--verbose`` bzr will print the name of every test as it is run.

This should always pass, whether run from a source tree or an installed
copy of Bazaar.  Please investigate and/or report any failures.

Running particular tests
------------------------

Currently, ``bzr selftest`` is used to invoke tests.
You can provide a pattern argument to run a subset.  For example,
to run just the blackbox tests, run::

  ./bzr selftest -v blackbox

To skip a particular test (or set of tests), use the ``--exclude`` option
(shorthand ``-x``) like so::

  ./bzr selftest -v -x blackbox

To ensure that all tests are being run and succeeding, you can use the
``--strict`` option, which will fail if there are any missing features or
known failures, like so::

  ./bzr selftest --strict

To list tests without running them, use the ``--list-only`` option like so::

  ./bzr selftest --list-only

This option can be combined with other selftest options (like ``-x``) and
filter patterns to understand their effect.

Once you understand how to create a list of tests, you can use the
``--load-list`` option to run only a restricted set of tests that you kept
in a file, one test id per line.  Keep in mind that this is never
sufficient to validate your modifications; you still need to run the full
test suite for that.  It can still help in some cases (such as rerunning
only the failing tests for a while)::

  ./bzr selftest --load-list my_failing_tests

This option can also be combined with other selftest options, including
patterns.  It has a drawback, though: the list can become out of date
quickly when doing Test-Driven Development.

To address this concern, there is another way to run a restricted set of
tests: the ``--starting-with`` option will run only the tests whose name
starts with the specified string.  It will also avoid loading the other
tests and, as a consequence, starts running your tests sooner::

  ./bzr selftest --starting-with bzrlib.tests.blackbox

This option can be combined with all the other selftest options, including
``--load-list``.  That combination is rarely used, but it allows you to
run, for example, a subset of a list of failing tests.
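
As a rough mental model of how the selection options above narrow the set
of tests, the following Python sketch filters a list of test ids.  It is
not bzr's implementation: ``select_tests`` and its parameters are
hypothetical names, the test ids are made up, and the sketch assumes that
the pattern and ``--exclude`` arguments act as regular expressions searched
within each test id, that ``--starting-with`` is a plain prefix match, and
that ``--load-list`` keeps only the ids present in the list file.

```python
import re


def select_tests(test_ids, pattern=None, exclude=None,
                 starting_with=None, load_list=None):
    """Return the test ids that pass every filter that was supplied.

    Hypothetical helper: it models the selftest options described above,
    it is not taken from bzrlib.
    """
    selected = []
    for test_id in test_ids:
        # --starting-with: keep only ids beginning with the given prefix.
        if starting_with is not None and not test_id.startswith(starting_with):
            continue
        # --load-list: keep only ids named in the list (one id per line).
        if load_list is not None and test_id not in load_list:
            continue
        # Pattern argument: assumed to be a regex searched within the id.
        if pattern is not None and not re.search(pattern, test_id):
            continue
        # --exclude / -x: drop ids that match the exclusion regex.
        if exclude is not None and re.search(exclude, test_id):
            continue
        selected.append(test_id)
    return selected


# Made-up test ids, for illustration only.
all_ids = [
    "bzrlib.tests.blackbox.test_add.TestAdd.test_add_reports",
    "bzrlib.tests.blackbox.test_log.TestLog.test_log_null_end_revspec",
    "bzrlib.tests.test_merge3.TestMerge3.test_no_changes",
]

only_blackbox = select_tests(all_ids, pattern="blackbox")
without_blackbox = select_tests(all_ids, exclude="blackbox")
prefixed = select_tests(all_ids, starting_with="bzrlib.tests.blackbox",
                        load_list={all_ids[1], all_ids[2]})
```

In this model every supplied filter must pass, so combining options can
only narrow the selection.  With the real tool, ``--list-only`` remains the
reliable way to check what a given combination would actually run.
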

Disabling plugins
-----------------

To test only the bzr core, ignoring any plugins you may have installed,
use::

  ./bzr --no-plugins selftest

Disabling crash reporting
-------------------------

By default Bazaar uses apport_ to report program crashes.  In developing
Bazaar it's normal and expected to have it crash from time to time, if
only because a test failed.

Therefore you should probably add ``debug_flags = no_apport`` to your
``bazaar.conf`` file (in ``~/.bazaar/`` on Unix), so that failures just
print a traceback rather than writing a crash file.

.. _apport: https://launchpad.net/apport/


Test suite debug flags
----------------------

Similar to the global ``-Dfoo`` debug options, bzr selftest accepts
``-E=foo`` debug flags.  These flags are:

:allow_debug: do *not* clear the global debug flags when running a test.
  This can provide useful logging to help debug test failures when used
  with e.g. ``bzr -Dhpss selftest -E=allow_debug``

  Note that this will probably cause some tests to fail, because they
  don't expect to run with any debug flags on.


Using subunit
-------------

Bazaar can optionally produce output in the machine-readable subunit_
format, so that test output can be post-processed by various tools.  To
generate a subunit test stream::

  $ ./bzr selftest --subunit

Processing such a stream can be done using a variety of tools including:

* The builtin ``subunit2pyunit``, ``subunit-filter``, ``subunit-ls``,
  ``subunit2junitxml`` from the subunit project.

* tribunal_, a GUI for showing test results.

* testrepository_, a tool for gathering and managing test runs.

.. _subunit: https://launchpad.net/subunit/
.. _tribunal: https://launchpad.net/tribunal/


Using testrepository
--------------------

Bazaar ships with a config file for testrepository_.  This can be very
useful for keeping track of failing tests and doing general workflow
support.

To run tests using testrepository::

  $ testr run

To run only failing tests::

  $ testr run --failing

To run only some tests, without plugins::

  $ testr run test_selftest -- --no-plugins

See the testrepository documentation for more details.

.. _testrepository: https://launchpad.net/testrepository


Babune continuous integration
-----------------------------

We have a Hudson continuous-integration system that automatically runs
tests across various platforms.  In the future we plan to add more
combinations including testing plugins.  See .
(Babune = Bazaar Buildbot Network.)


Running tests in parallel
-------------------------

Bazaar can use subunit to spawn multiple test processes.  There is
slightly more chance you will hit ordering or timing-dependent bugs but
it's much faster::

  $ ./bzr selftest --parallel=fork

Note that you will need the Subunit library to use this, which is in
``python-subunit`` on Ubuntu.


Running tests from a ramdisk
----------------------------

The tests create and delete a lot of temporary files.  In some cases you
can make the test suite run much faster by running it on a ramdisk.  For
example::

  $ sudo mkdir /ram
  $ sudo mount -t tmpfs none /ram
  $ TMPDIR=/ram ./bzr selftest ...

You could also change ``/tmp`` in ``/etc/fstab`` to have type ``tmpfs``,
if you don't mind possibly losing other files in there when the machine
restarts.  Add this line (if there is none for ``/tmp`` already)::

  none           /tmp            tmpfs  defaults        0       0

With a 6-core machine and ``--parallel=fork``, using a tmpfs doubles the
test execution speed.


Writing Tests
=============

Normally you should add or update a test for all bug fixes or new features
in Bazaar.


Where should I put a new test?
------------------------------

Bzrlib's tests are organised by the type of test.
Most of the tests in bzr's test suite belong to one of these categories:

- Unit tests
- Blackbox (UI) tests
- Per-implementation tests
- Doctests

A quick description of these test types and where they belong in bzrlib's
source follows.  Not all tests fall neatly into one of these categories;
in those cases use your judgement.


Unit tests
~~~~~~~~~~

Unit tests make up the bulk of our test suite.  These are tests that are
focused on exercising a single, specific unit of the code as directly
as possible.  Each unit test is generally fairly short and runs very
quickly.

They are found in ``bzrlib/tests/test_*.py``.  So in general tests should
be placed in a file named test_FOO.py where FOO is the logical thing under
test.

For example, tests for merge3 in bzrlib belong in
bzrlib/tests/test_merge3.py.  See bzrlib/tests/test_sampler.py for a
template test script.


Blackbox (UI) tests
~~~~~~~~~~~~~~~~~~~

Tests can be written for the UI or for individual areas of the library.
Choose whichever is appropriate: if adding a new command, or a new command
option, then you should be writing a UI test.  If you are both adding UI
functionality and library functionality, you will want to write tests for
both the UI and the core behaviours.  We call UI tests 'blackbox' tests
and they belong in ``bzrlib/tests/blackbox/*.py``.

When writing blackbox tests please honour the following conventions:

1. Place the tests for the command 'name' in
   bzrlib/tests/blackbox/test_name.py.  This makes it easy for developers
   to locate the test script for a faulty command.

2. Use the 'self.run_bzr("name")' utility function to invoke the command
   rather than running bzr in a subprocess or invoking the
   cmd_object.run() method directly.  This is a lot faster than
   subprocesses and generates the same logging output as running it in a
   subprocess (which invoking the method directly does not).

3. Only test the one command in a single test script.  Use the bzrlib
   library when setting up tests and when evaluating the side-effects of
   the command.  We do this so that the library api has continual pressure
   on it to be as functional as the command line in a simple manner, and
   to isolate knock-on effects throughout the blackbox test suite when a
   command changes its name or signature.  Ideally only the tests for a
   given command are affected when a given command is changed.

4. If you have a test which does actually require running bzr in a
   subprocess you can use ``run_bzr_subprocess``.  By default the spawned
   process will not load plugins unless ``--allow-plugins`` is supplied.


Per-implementation tests
~~~~~~~~~~~~~~~~~~~~~~~~

Per-implementation tests are tests that are defined once and then run
against multiple implementations of an interface.  For example,
``per_transport.py`` defines tests that all Transport implementations
(local filesystem, HTTP, and so on) must pass.  They are found in
``bzrlib/tests/per_*/*.py``, and ``bzrlib/tests/per_*.py``.

These are really a sub-category of unit tests, but an important one.

Along the same lines are tests for extension modules.  We generally have
both a pure-python and a compiled implementation for each module.  As
such, we want to run the same tests against both implementations.  These
can generally be found in ``bzrlib/tests/*__*.py`` since extension modules
are usually prefixed with an underscore.  Since there are only two
implementations, we have a helper function
``bzrlib.tests.permute_for_extension``, which can simplify the
``load_tests`` implementation.


Doctests
~~~~~~~~

We make selective use of doctests__.  In general they should provide
*examples* within the API documentation which can incidentally be tested.
We don't try to test every important case using doctests |--| regular
Python tests are generally a better solution.  That is, we just use
doctests to make our documentation testable, rather than as a way to make
tests.

Most of these are in ``bzrlib/doc/api``.
More additions are welcome.

__ http://docs.python.org/lib/module-doctest.html


Shell-like tests
----------------

``bzrlib/tests/script.py`` allows users to write tests in a syntax very
close to a shell session, using a restricted and limited set of commands
that should be enough to mimic most of the behaviours.

A script is a set of commands; each command is composed of:

* one mandatory command line,
* one optional set of input lines to feed the command,
* one optional set of expected output lines,
* one optional set of expected error lines.

Input, output and error lines can be specified in any order.

Except for the expected output, all lines start with a special
string (based on their origin when used under a Unix shell):

* '$ ' for the command,
* '<' for input,
* nothing for output,
* '2>' for errors.

Comments can be added anywhere; they start with '#' and run to the end of
the line.

The execution stops as soon as an expected output or an expected error is
not matched.

When no output is specified, any output from the command is accepted
and execution continues.

If an error occurs and no expected error is specified, the execution
stops.

An error is defined by a returned status different from zero, not by the
presence of text on the error stream.

The matching is done on a full string comparison basis unless '...' is
used, in which case expected output/errors can be less precise.

Examples:

The following will succeed only if 'bzr add' outputs 'adding file'::

  $ bzr add file
  adding file

If you want the command to succeed for any output, just use::

  $ bzr add file

The following will stop with an error::

  $ bzr not-a-command

If you want it to succeed, use::

  $ bzr not-a-command
  2>bzr: ERROR: unknown command "not-a-command"

You can use ellipsis (...) to replace any piece of text you don't want to
be matched exactly::

  $ bzr branch not-a-branch
  2>bzr: ERROR: Not a branch...not-a-branch/".

This can be used to ignore entire lines too::

  $ cat
  <first line
  <second line
  <last line
  first line
  ...
  last line

You can check the content of a file with cat::

  $ cat file
  expected content

You can also check the existence of a file with cat; the following will
fail if the file doesn't exist::

  $ cat file

The actual use of ScriptRunner within a TestCase looks something like
this::

    from bzrlib.tests import script

    def test_unshelve_keep(self):
        # some setup here
        script.run_script(self, '''
            $ bzr add file
            $ bzr shelve --all -m Foo
            $ bzr shelve --list
            1: Foo
            $ bzr unshelve --keep
            $ bzr shelve --list
            1: Foo
            $ cat file
            contents of file
            ''')

You can also test commands that read user interaction::

    def test_confirm_action(self):
        """You can write tests that demonstrate user confirmation"""
        commands.builtin_command_registry.register(cmd_test_confirm)
        self.addCleanup(commands.builtin_command_registry.remove, 'test-confirm')
        self.run_script("""
            $ bzr test-confirm
            2>Really do it? [y/n]: