====================
Bazaar Testing Guide
====================
The Importance of Testing
=========================
Reliability is a critical success factor for any version control system.
We want Bazaar to be highly reliable across multiple platforms while
evolving over time to meet the needs of its community.
In a nutshell, this is what we expect and encourage:
* New functionality should have test cases. Preferably write the
test before writing the code.
In general, you can test at either the command-line level or the
internal API level. See `Writing tests`_ below for more detail.
* Try to practice Test-Driven Development: before fixing a bug, write a
test case so that it does not regress. Similarly for adding a new
feature: write a test case for a small version of the new feature before
starting on the code itself. Check the test fails on the old code, then
add the feature or fix and check it passes.
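A minimal sketch of what such a test might look like, using the
``TestCaseWithTransport`` base class from ``bzrlib.tests`` (the file name,
commit message and assertion are purely illustrative)::

    from bzrlib.tests import TestCaseWithTransport

    class TestCommitMessage(TestCaseWithTransport):

        def test_commit_records_message(self):
            # Set up a working tree with one versioned file.
            tree = self.make_branch_and_tree('.')
            self.build_tree(['hello.txt'])
            tree.add(['hello.txt'])
            # Exercise the code under test and check the recorded message.
            revid = tree.commit('add hello')
            rev = tree.branch.repository.get_revision(revid)
            self.assertEqual('add hello', rev.message)

Written before the fix or feature, such a test should fail; once the change
lands it should pass, and from then on it guards against regressions.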
By doing these things, the Bazaar team gets increased confidence that
changes do what they claim to do, whether provided by the core team or
by community members. Equally importantly, we can be surer that changes
down the track do not break new features or bug fixes that you are
contributing today.
As of September 2009, Bazaar ships with a test suite containing over
23,000 tests and growing. We are proud of it and want to remain so. As
community members, we all benefit from it. Would you trust version control
on your project to a product *without* a test suite like Bazaar has?
Running the Test Suite
======================
As of Bazaar 2.1, you must have the testtools_ library installed to run
the bzr test suite.
.. _testtools: https://launchpad.net/testtools/
To test all of Bazaar, just run::
bzr selftest
With ``--verbose`` bzr will print the name of every test as it is run.
This should always pass, whether run from a source tree or an installed
copy of Bazaar. Please investigate and/or report any failures.
Running particular tests
------------------------
Currently, bzr selftest is used to invoke tests.
You can provide a pattern argument to run a subset. For example,
to run just the blackbox tests, run::
./bzr selftest -v blackbox
To skip a particular test (or set of tests), use the --exclude option
(shorthand -x) like so::
./bzr selftest -v -x blackbox
To ensure that all tests are being run and succeeding, you can use the
--strict option which will fail if there are any missing features or known
failures, like so::
./bzr selftest --strict
To list tests without running them, use the --list-only option like so::
./bzr selftest --list-only
This option can be combined with other selftest options (like -x) and
filter patterns to understand their effect.
Once you understand how to create a list of tests, you can use the --load-list
option to run only a restricted set of tests that you keep in a file, one test
id per line. Keep in mind that this will never be sufficient to validate your
modifications; you still need to run the full test suite for that. But using it
can help in some cases (like repeatedly re-running only the tests that failed)::
./bzr selftest --load-list my_failing_tests
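The list file itself is plain text with one test id per line; for example
(these particular ids are invented for illustration)::

    bzrlib.tests.test_merge3.TestMerge3.test_no_changes
    bzrlib.tests.blackbox.test_add.TestAdd.test_adding_missing_file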
This option can also be combined with other selftest options, including
patterns. It has some drawbacks though: the list can become out of date pretty
quickly when doing Test-Driven Development.
To address this concern, there is another way to run a restricted set of tests:
the --starting-with option will run only the tests whose name starts with the
specified string. It also avoids loading the other tests and as a
consequence starts running your tests more quickly::
./bzr selftest --starting-with bzrlib.tests.blackbox
This option can be combined with all the other selftest options, including
--load-list. The latter is rarely used but allows you to run, for example, a
subset of a list of failing tests.
Disabling plugins
-----------------
To test only the bzr core, ignoring any plugins you may have installed,
use::
./bzr --no-plugins selftest
Disabling crash reporting
-------------------------
By default Bazaar uses apport_ to report program crashes. When developing
Bazaar it's normal and expected for it to crash from time to time, if only
because a test failed.
Therefore you should probably add ``debug_flags = no_apport`` to your
``bazaar.conf`` file (in ``~/.bazaar/`` on Unix), so that failures just
print a traceback rather than writing a crash file.
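For example, assuming the standard ``[DEFAULT]`` section, the relevant
fragment of ``bazaar.conf`` would look like this::

    [DEFAULT]
    debug_flags = no_apport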
.. _apport: https://launchpad.net/apport/
Test suite debug flags
----------------------
Similar to the global ``-Dfoo`` debug options, bzr selftest accepts
``-E=foo`` debug flags. These flags are:
:allow_debug: do *not* clear the global debug flags when running a test.
This can provide useful logging to help debug test failures when used
with e.g. ``bzr -Dhpss selftest -E=allow_debug``
Note that this will probably cause some tests to fail, because they
don't expect to run with any debug flags on.
Using subunit
-------------
Bazaar can optionally produce output in the machine-readable subunit_
format, so that test output can be post-processed by various tools. To
generate a subunit test stream::
$ ./bzr selftest --subunit
Processing such a stream can be done using a variety of tools including:
* The ``subunit2pyunit``, ``subunit-filter``, ``subunit-ls`` and
``subunit2junitxml`` tools shipped with the subunit project.
* tribunal_, a GUI for showing test results.
* testrepository_, a tool for gathering and managing test runs.
.. _subunit: https://launchpad.net/subunit/
.. _tribunal: https://launchpad.net/tribunal/
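If you prefer to post-process a stream programmatically, the
``python-subunit`` package also exposes a Python API. A minimal sketch,
assuming you have first saved a stream with
``./bzr selftest --subunit > test.subunit`` (the file name is arbitrary)::

    import unittest
    import subunit  # provided by the python-subunit package

    # Replay a previously saved subunit stream into a standard TestResult.
    stream = open('test.subunit', 'rb')
    try:
        result = unittest.TestResult()
        subunit.ProtocolTestCase(stream).run(result)
    finally:
        stream.close()
    # result.testsRun, result.failures and result.errors now describe the run.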
Using testrepository
--------------------
Bazaar ships with a config file for testrepository_. This can be very
useful for keeping track of failing tests and for general workflow
support. To run tests using testrepository::
$ testr run
To run only failing tests::
$ testr run --failing
To run only some tests, without plugins::
$ testr run test_selftest -- --no-plugins
See the testrepository documentation for more details.
.. _testrepository: https://launchpad.net/testrepository
Babune continuous integration
-----------------------------
We have a Hudson continuous-integration system that automatically runs
tests across various platforms; in the future we plan to add more
combinations, including testing plugins. (Babune = Bazaar Buildbot Network.)
Running tests in parallel
-------------------------
Bazaar can use subunit to spawn multiple test processes. There is
slightly more chance you will hit ordering or timing-dependent bugs but
it's much faster::
$ ./bzr selftest --parallel=fork
Note that you will need the subunit library to use this; on Ubuntu it is
available in the ``python-subunit`` package.
Running tests from a ramdisk
----------------------------
The tests create and delete a lot of temporary files. In some cases you
can make the test suite run much faster by running it on a ramdisk. For
example::
$ sudo mkdir /ram
$ sudo mount -t tmpfs none /ram
$ TMPDIR=/ram ./bzr selftest ...
You could also change ``/tmp`` in ``/etc/fstab`` to have type ``tmpfs``,
if you don't mind possibly losing other files in there when the machine
restarts. Add this line (if there is none for ``/tmp`` already)::
none /tmp tmpfs defaults 0 0
On a 6-core machine with ``--parallel=fork``, using a tmpfs doubles the
test execution speed.
Writing Tests
=============
Normally you should add or update a test for all bug fixes or new features
in Bazaar.
Where should I put a new test?
------------------------------
Bzrlib's tests are organised by the type of test. Most of the tests in
bzr's test suite belong to one of these categories:
- Unit tests
- Blackbox (UI) tests
- Per-implementation tests
- Doctests
A quick description of these test types and where they belong in bzrlib's
source follows. Not all tests fall neatly into one of these categories;
in those cases use your judgement.
Unit tests
~~~~~~~~~~
Unit tests make up the bulk of our test suite. These are tests that are
focused on exercising a single, specific unit of the code as directly
as possible. Each unit test is generally fairly short and runs very
quickly.
They are found in ``bzrlib/tests/test_*.py``. So in general tests should
be placed in a file named test_FOO.py where FOO is the logical thing under
test.
For example, tests for merge3 in bzrlib belong in bzrlib/tests/test_merge3.py.
See bzrlib/tests/test_sampler.py for a template test script.
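As an illustration only (``osutils.split_lines`` simply stands in for
whatever you are testing), a unit test in ``bzrlib/tests/test_osutils.py``
might look like::

    from bzrlib import osutils
    from bzrlib.tests import TestCase

    class TestSplitLines(TestCase):

        def test_split_lines_keeps_newlines(self):
            # split_lines breaks text into lines without discarding the
            # line endings.
            self.assertEqual(['foo\n', 'bar\n'],
                             osutils.split_lines('foo\nbar\n'))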
Blackbox (UI) tests
~~~~~~~~~~~~~~~~~~~
Tests can be written for the UI or for individual areas of the library.
Choose whichever is appropriate: if adding a new command, or a new command
option, then you should be writing a UI test. If you are both adding UI
functionality and library functionality, you will want to write tests for
both the UI and the core behaviours. We call UI tests 'blackbox' tests
and they belong in ``bzrlib/tests/blackbox/*.py``.
When writing blackbox tests please honour the following conventions:
1. Place the tests for the command 'name' in
bzrlib/tests/blackbox/test_name.py. This makes it easy for developers
to locate the test script for a faulty command.
2. Use the 'self.run_bzr("name")' utility function to invoke the command
rather than running bzr in a subprocess or invoking the
cmd_object.run() method directly. This is a lot faster than
subprocesses and generates the same logging output as running it in a
subprocess (which invoking the method directly does not).
3. Only test the one command in a single test script. Use the bzrlib
library when setting up tests and when evaluating the side-effects of
the command. We do this so that the library api has continual pressure
on it to be as functional as the command line in a simple manner, and
to isolate knock-on effects throughout the blackbox test suite when a
command changes its name or signature. Ideally only the tests for a
given command are affected when a given command is changed.
4. If you have a test which does actually require running bzr in a
subprocess you can use ``run_bzr_subprocess``. By default the spawned
process will not load plugins unless ``--allow-plugins`` is supplied.
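Putting these conventions together, a blackbox test is typically shaped like
the following sketch (the command and the assertion are illustrative)::

    from bzrlib.tests import TestCaseWithTransport

    class TestVersion(TestCaseWithTransport):

        def test_version_mentions_bazaar(self):
            # run_bzr returns (stdout, stderr) and fails the test if the
            # command exits with an unexpected return code.
            out, err = self.run_bzr('version')
            self.assertContainsRe(out, 'Bazaar')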
Per-implementation tests
~~~~~~~~~~~~~~~~~~~~~~~~
Per-implementation tests are tests that are defined once and then run
against multiple implementations of an interface. For example,
``per_transport.py`` defines tests that all Transport implementations
(local filesystem, HTTP, and so on) must pass. They are found in
``bzrlib/tests/per_*/*.py``, and ``bzrlib/tests/per_*.py``.
These are really a sub-category of unit tests, but an important one.
Along the same lines are tests for extension modules. We generally have
both a pure-python and a compiled implementation for each module. As such,
we want to run the same tests against both implementations. These can
generally be found in ``bzrlib/tests/*__*.py`` since extension modules are
usually prefixed with an underscore. Since there are only two
implementations, we have a helper function
``bzrlib.tests.permute_for_extension``, which can simplify the
``load_tests`` implementation.
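For example, a ``load_tests`` hook along the following lines multiplies a
module's tests across the pure-Python and compiled implementations using
``bzrlib.tests.multiply_tests`` (the module names and the ``module`` scenario
parameter here are placeholders)::

    from bzrlib import tests

    def load_tests(standard_tests, module, loader):
        # Run every test in this module once per implementation.
        from bzrlib import _groupcompress_py, _groupcompress_pyx
        scenarios = [
            ('python', {'module': _groupcompress_py}),
            ('C', {'module': _groupcompress_pyx}),
        ]
        suite = loader.suiteClass()
        tests.multiply_tests(standard_tests, scenarios, suite)
        return suite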
Doctests
~~~~~~~~
We make selective use of doctests__. In general they should provide
*examples* within the API documentation which can incidentally be tested. We
don't try to test every important case using doctests |--| regular Python
tests are generally a better solution. That is, we just use doctests to
make our documentation testable, rather than as a way to make tests.
Most of these are in ``bzrlib/doc/api``. More additions are welcome.
__ http://docs.python.org/lib/module-doctest.html
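For instance, a docstring example like the following (an illustrative
function, not part of bzrlib) is both documentation and a test that the
doctest machinery can check::

    def squares(n):
        """Return the first n square numbers.

        >>> squares(4)
        [0, 1, 4, 9]
        """
        return [i * i for i in range(n)]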
Shell-like tests
----------------
``bzrlib/tests/script.py`` allows users to write tests in a syntax very close
to a shell session, using a restricted set of commands that should be enough to
mimic most common behaviours.
A script is a set of commands, where each command is composed of:
* one mandatory command line,
* one optional set of input lines to feed the command,
* one optional set of expected output lines,
* one optional set of expected error lines.
Input, output and error lines can be specified in any order.
Except for the expected output, all lines start with a special
string (based on their origin when used under a Unix shell):
* '$ ' for the command,
* '<' for input,
* nothing for output,
* '2>' for errors.
Comments can be added anywhere; they start with '#' and run to the end of
the line.
The execution stops as soon as an expected output or an expected error is not
matched.
When no output is specified, any output from the command is accepted
and execution continues.
If an error occurs and no expected error is specified, the execution stops.
An error is defined by a returned status different from zero, not by the
presence of text on the error stream.
The matching is done on a full string comparison basis unless '...' is used, in
which case expected output/errors can be less precise.
Examples:
The following will succeed only if 'bzr add' outputs 'adding file'::
$ bzr add file
>adding file
If you want the command to succeed for any output, just use::
$ bzr add file
The following will stop with an error::
$ bzr not-a-command
If you want it to succeed, use::
$ bzr not-a-command
2> bzr: ERROR: unknown command "not-a-command"
You can use ellipsis (...) to replace any piece of text you don't want to be
matched exactly::
$ bzr branch not-a-branch
2>bzr: ERROR: Not a branch...not-a-branch/".
This can be used to ignore entire lines too::
$ cat
first line
>...
>last line
You can check the content of a file with cat::
$ cat file
expected content
You can also check the existence of a file with cat; the following will fail
if the file doesn't exist::
$ cat file
The actual use of ScriptRunner within a TestCase looks something like
this::
from bzrlib.tests import script

def test_unshelve_keep(self):
    # some setup here
    script.run_script(self, '''
        $ bzr add file
        $ bzr shelve --all -m Foo
        $ bzr shelve --list
        1: Foo
        $ bzr unshelve --keep
        $ bzr shelve --list
        1: Foo
        $ cat file
        contents of file
        ''')
You can also test commands that read user interaction; the example below
assumes that ``cmd_test_confirm`` is a command object defined earlier in the
test module and that ``commands`` is ``bzrlib.commands``::

def test_confirm_action(self):
    """You can write tests that demonstrate user confirmation"""
    commands.builtin_command_registry.register(cmd_test_confirm)
    self.addCleanup(commands.builtin_command_registry.remove, 'test-confirm')
    self.run_script("""
        $ bzr test-confirm
        2>Really do it? [y/n]:
        # answer the prompt (the reply below is illustrative)
        <yes
        """)
.. |--| unicode:: U+2014
..
vim: ft=rst tw=74 ai et sw=4