
Archived | Make your life easy with

Archived content

Archive date: 2019-06-24

This content is no longer being updated or maintained. The content is provided “as is.” Given the rapid evolution of technology, some content, steps, or illustrations may have changed.

The days of the Wild West are coming to their end in the world of Python testing. It was not many years ago that nearly every project built with Python seemed to have its own idioms and practices for writing and running tests. But now, the frontier is finally beginning to close. The community is rallying around a few leading solutions that are bringing convenience and common standards to the test suites of hundreds of popular projects.

This article will serve as a guide to the new testing frameworks. In it, you will be introduced to three popular testing frameworks and see the radically simpler test style that the newest generation of tools is encouraging. The second article, Discovering and selecting tests, will step back and look at the larger question of how these frameworks automate the task of finding and cataloging your project’s tests in the first place. Finally, Test reporting with a Python test framework will look at the powerful features these frameworks provide for viewing the results of your test runs.

By learning the common idioms of these three frameworks, you will be better prepared not only to read through other programmers’ Python packages, but also to build elegant and powerful test suites for your own applications.

The candidates: Three Python testing frameworks

There are three Python testing frameworks that seem to be in use on large code bases today. Taking them in chronological order, they are:

  • zope.testing

    As usual, the developers working on the Zope project seem to have been early innovators. They needed a uniform way to discover and run tests across their large code base, and their answer was the zope.testing package, which remains heavily used to this day.

    The zope.testing package only supports traditional Python test styles like unittest and doctest, and not the radically simpler styles permitted by the more recent frameworks. But it does offer a powerful system of layers with which whole directories full of tests can rely on common setup code that creates once, for the layer (rather than once for each test), the environment in which the tests need to run.

  • py.test

    It was in 2004 that Holger Krekel renamed his std package, whose name was often confused with that of the Standard Library that ships with Python, to the (only slightly less confusing) name ‘py.’ Though the package contains several sub-packages, it is now known almost entirely for its py.test framework.

    The py.test framework sets a new standard for Python testing, and is popular with many developers today. The elegant and Pythonic idioms it introduced for test writing have made it possible for test suites to be written in a far more compact style than was possible before, as you shall see below.

  • nose

    The nose project was released in 2005, the year after py.test received its modern guise. It was written by Jason Pellerin to support the same test idioms that had been pioneered by py.test, but in a package that is easier to install and maintain. Though py.test has in several ways caught up, and today is quite easy to install, nose has retained its reputation for being very sleek and easy to use.

    At Python conventions, it is now common to see developers wearing black T-shirts showing the nosetests command, followed by the field of periods with which it denotes successful tests. Interest in nose continues to increase, and one often sees posts on other project mailing lists in which the local developers ask the project leads when their project will be permitted to make the switch to nose.

Of the three projects, it looks like nose might well become the standard, with py.test having a smaller but loyal community and zope.testing remaining popular only for projects built atop the Zope framework. But all are actively maintained, and each has some unique features. Keep reading, and learn about the features and differences among the three so that you can make the right choice for your own projects.

The testing revolution

The py.test framework transformed the world of Python testing by accepting plain Python functions as tests instead of insisting that tests be packaged inside of larger and heavier-weight test classes. Since the nose framework supports the same idiom, these patterns are likely to become more and more popular.

Imagine that you want to check whether the Python truth values True and False are really, as Python promises, equivalent to the Boolean numbers 1 and 0. Either py.test or nose will accept and run the following few lines of code as valid tests that answer this question:

# test_new.py - simple test functions

def testTrue():
    assert True == 1

def testFalse():
    assert False == 0

In contrast to the simplicity of the above example, you will find that older documentation about Python testing is replete with verbose example tests that all go something like this:

# test_old.py - The old way of doing things

import unittest

class TruthTest(unittest.TestCase):
    def testTrue(self):
        assert True == 1

    def testFalse(self):
        assert False == 0

if __name__ == '__main__':
    unittest.main()

Look at all of the scaffolding that was necessary to support two actual lines of test code! First, this code requires an import statement that is completely irrelevant to the code under test, since the tests themselves simply ignore the module and use only built-in Python values like True and False. Furthermore, a class is created that does not support or enhance the tests, since they do not actually use their self argument for anything. And, finally, it requires two lines of boilerplate down at the bottom, so that this particular test can be run by itself from the command line.

Experienced users of unittest might try to argue that the above example should use the testing methods that my new TruthTest class has inherited from TestCase. For example, they would encourage me to use assertEqual() instead of an assert statement that tests manually for equality, in which case the test would indeed use self instead of ignoring it:

# alternate version of the testTrue method
    def testTrue(self):
        self.assertEqual(True, 1)

There are three responses to this recommendation.

First, calling a method hurts readability. While the assertEqual() method name does indicate that the two values are being tested for equality, the code still does not look like a comparison in the way that the Python == operator looks like a comparison to someone familiar with the language.

Second, as you will see in the third article in this series, the new testing frameworks now know how to introspect assert statements to inspect the condition that made the test fail, which means that a bare assert statement can now lead to test failure messages that are just as informative as the results of calling the old methods like assertEqual().

Finally, even if assertEqual() were still necessary, it would surely be simpler and more Pythonic to import such a function from a testing module, instead of using class inheritance merely to make functions available! You will see below, in fact, that when both py.test and nose want to make additional routines available to support tests, they simply define them as functions and expect users to import them into their code.
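To make the third point concrete, here is a minimal sketch of an equality check supplied as a plain function rather than through inheritance. The helper assert_equal below is a hypothetical stand-in written for this illustration; it is not the implementation used by py.test or nose.

```python
def assert_equal(first, second, msg=None):
    """Hypothetical helper: raise AssertionError unless first == second."""
    if not first == second:
        raise AssertionError(msg or "%r != %r" % (first, second))


# Test functions simply call the helper; no class is required.
def test_true():
    assert_equal(True, 1)


def test_false():
    assert_equal(False, 0)
```

In a real project the helper would live in its own module and each test file would import it, which keeps the test functions free of any class scaffolding.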

Of course, when authors actually need setup and teardown routines that cache state for later use in test cases, unittest subclasses still make eminent sense, and both py.test and nose fully support them. And many Python tests these days are written as doctests, which are supported by Python’s standard library and need not make use of either functions or classes:

Doctest For The Above Example

The truth values in Python, named "True" and "False",
are equivalent to the Boolean numbers one and zero.

>>> True == 1
True
>>> False == 0
True
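A doctest like the one above can be checked with nothing but the standard library. The following sketch feeds the text to the doctest module directly; the variable names are chosen for this illustration, and in practice the text would more likely live in a file or a docstring.

```python
import doctest

# The doctest text, with the expected output after each prompt line.
TRUTH_DOCTEST = """
The truth values in Python, named "True" and "False",
are equivalent to the Boolean numbers one and zero.

>>> True == 1
True
>>> False == 0
True
"""

parser = doctest.DocTestParser()
test = parser.get_doctest(TRUTH_DOCTEST, {}, "truth", None, 0)

runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)  # a (failed, attempted) result tuple
```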

But when programmers want to write simple test code without all the verbiage involved in a doctest, then test functions are a wonderful way to write. Above all, test functions vastly enhance what might be called the writability of tests. Instead of making each programmer remember, re-invent, or copy the test scaffolding from the last test they wrote, the new conventions enable a Python programmer to write tests as one usually writes Python code: by simply opening an empty file, and typing!

Framework-specific conveniences

Both py.test and nose provide special routines that make writing tests easier. You might say that they each allow you to write tests using their own particular dialect of convenience functions. This can make test writing simpler and less error-prone, and also result in test code that is shorter and more readable. But using these routines also carries an important consequence: your tests are then tied to the framework whose functions you are using.

The trade-off is one of convenience versus compatibility. If you write all of your tests from the ground up using only the clunky standard Python unittest module, then they will work under any testing framework you choose. Going a step further, if you adopt the simpler and sleeker practice of writing test functions (as described above), then your tests will at least work under both py.test and nose. But if you start using features peculiar to one testing framework, then a good deal of rewriting might be necessary in the future if another one of the frameworks develops important new features and you decide to migrate.

Both py.test and nose provide an alternative for the assertRaises() method of TestCase. The version provided by py.test is a bit fancier, because it can also accept a string to execute, which is more powerful because you can test expressions that raise exceptions rather than only function calls:

# conveniences.py
import math

import py.test
# math.exp(1000) overflows; math.log(0) raises ValueError on current Pythons
py.test.raises(OverflowError, math.exp, 1000)
py.test.raises(ValueError, math.sqrt, -1)
py.test.raises(ZeroDivisionError, "1 / 0")

import nose.tools
nose.tools.assert_raises(OverflowError, math.exp, 1000)
nose.tools.assert_raises(ValueError, math.sqrt, -1)
# No equivalent for third example!
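For comparison, here is a sketch of the TestCase.assertRaises() style that both helpers replace, using only the standard library. The class name ExceptionTest is chosen for this illustration. Note that an API which accepts only callables needs the expression 1 / 0 wrapped in a lambda.

```python
import math
import unittest


class ExceptionTest(unittest.TestCase):
    def test_overflow(self):
        self.assertRaises(OverflowError, math.exp, 1000)

    def test_domain_error(self):
        self.assertRaises(ValueError, math.sqrt, -1)

    def test_zero_division(self):
        # An expression has to be wrapped in a callable.
        self.assertRaises(ZeroDivisionError, lambda: 1 / 0)


# Run the three tests and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExceptionTest)
result = unittest.TestResult()
suite.run(result)
```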

Beyond the testing of exceptions, however, the two frameworks part ways. The only other py.test convenience seems to be a function to determine whether a particular call triggers a DeprecationWarning:

py.test.deprecated_call(my.old.function, arg1, arg2)
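What such a check involves can be sketched with the standard warnings module. The helper triggers_deprecation and the sample old_function below are hypothetical names used only for this illustration; this is not how py.test implements deprecated_call.

```python
import warnings


def old_function(value):
    """Sample function that announces its own deprecation."""
    warnings.warn("old_function is deprecated", DeprecationWarning, stacklevel=2)
    return value * 2


def triggers_deprecation(func, *args, **kwargs):
    """Return True if calling func emits at least one DeprecationWarning."""
    with warnings.catch_warnings(record=True) as caught:
        # Make sure no filter hides the warning while we are recording.
        warnings.simplefilter("always")
        func(*args, **kwargs)
    return any(issubclass(w.category, DeprecationWarning) for w in caught)
```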

On the other hand, nose seems to have a quite rich set of assertion functions, both for cases where you want to avoid a bare assert statement, and where you need to do something more complicated. You should consult its documentation for details, but here is a quick synopsis of the possibilities offered by nose.tools:

# nose.tools support functions for writing tests

assert_almost_equal(first, second, places=7, msg=None)
assert_almost_equals(first, second, places=7, msg=None)
assert_equal(first, second, msg=None)
assert_equals(first, second, msg=None)
assert_false(expr, msg=None)
assert_not_almost_equal(first, second, places=7, msg=None)
assert_not_almost_equals(first, second, places=7, msg=None)
assert_not_equal(first, second, msg=None)
assert_not_equals(first, second, msg=None)
assert_true(expr, msg=None)
eq_(a, b, msg=None)
ok_(expr, msg=None)

The routines above that check for an approximate value are especially important when dealing with floating-point results, if you want to write tests flexible enough to succeed on Python implementations with subtle differences in their handling of floating point.
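The need for approximate comparison is easy to demonstrate. The sketch below assumes the same rounding rule as the standard library's assertAlmostEqual(), which rounds the difference to the given number of decimal places and compares it with zero; the helper almost_equal is a hypothetical name for this illustration.

```python
def almost_equal(first, second, places=7):
    """Hypothetical helper: compare after rounding the difference."""
    return round(abs(second - first), places) == 0


total = 0.1 + 0.2

exact_match = (total == 0.3)             # fails because of binary rounding
approx_match = almost_equal(total, 0.3)  # succeeds to seven decimal places
```

A test written with the exact comparison would fail even though the arithmetic is correct for all practical purposes, which is exactly the situation the approximate assertions are meant to handle.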

Distributed testing

Tests seem to get run more and more often these days. The practice of continuous testing has now been adopted in many shops, where project tests are run with every check-in to the team’s version-control system. And as test-driven development grows in popularity, many developers now write and run the tests for a new module before they even bring up their editor to start writing the module’s code. If tests take a long time to run, then they can become an important roadblock to developer productivity.

It is therefore an advantage to be able to bring as much computing power as possible to bear against the task of running tests. On a small scale, this can mean running multiple testing processes to take advantage of all of the CPU cores on your machine. For larger projects, whole farms of test machines are configured, either using dedicated servers ready to run tests in parallel, or even using the combined idle time of all of the developers’ workstations together.

In the area of parallel and distributed testing, the three testing frameworks this article looks at have quite significant differences:

  • The zope.testing command line has a -j option that specifies that several testing processes should be started instead of all tests being done in the same process. Since each process can run on a different CPU core, running -j 4 on a four-CPU machine would allow all four CPUs to be active in running tests at once.
  • The nose project reports that they have support for parallel tests now committed to their project trunk, but normal users will have to wait for the next release before trying it out.
  • The py.test tool not only supports a multiprocessing option (-n) for running on several CPU cores like zope.testing, but it actually has the tools to distribute tests among an entire farm of test servers.

Of these three frameworks, py.test looks like the clear leader in this area. Not only can you give it multiple --tx options, each describing an environment or remote server on which you want to run tests, but it actually supports distributing tests for two quite different reasons! With --dist=load, it will use your server farm for the traditional task of spreading your running tests across several machines to reduce the time you spend waiting. But with --dist=each, it does something more sophisticated; it will make sure that each test gets run on each of the different testing environments that you have made available to py.test.

This means that py.test can simultaneously test your product on multiple versions of the Python interpreter, and on multiple operating systems. This makes py.test a very strong contender if your project supports multiple platforms and you want a testing solution that will support you out of the box, without requiring you to write your own scripts for copying tests to several different platforms and running them.
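The difference between the two distribution modes can be shown with a small conceptual sketch. The functions and the test and environment names below are hypothetical, and the round-robin assignment is a simplification chosen for illustration; it does not describe how py.test actually schedules work.

```python
def distribute_load(tests, environments):
    """Load mode: each test goes to exactly one environment (round-robin)."""
    plan = {env: [] for env in environments}
    for index, test in enumerate(tests):
        env = environments[index % len(environments)]
        plan[env].append(test)
    return plan


def distribute_each(tests, environments):
    """Each mode: every test is run on every environment."""
    return {env: list(tests) for env in environments}


tests = ["test_a", "test_b", "test_c"]
environments = ["python2.5", "python2.6"]

load_plan = distribute_load(tests, environments)
each_plan = distribute_each(tests, environments)
```

In load mode the total work stays the same and is merely divided, so the run finishes sooner. In each mode the total work grows with the number of environments, and what you gain is coverage across platforms.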

Customization and extensibility

All three testing frameworks provide ways for both individual users and for whole projects to select the behaviors and options they want from their test framework.

  • The zope.testing module is, in Zope packages, often called by a buildout recipe that specifies default options. This means that developers running the tests will get a uniform set of results, while still being free to specify their own command-line switches when the behaviors selected at the project level do not fit their needs.
  • Per-user customization is supported by the nose framework through a nose.cfg or a .noserc file in a user’s home directory, where users can specify their own personal preferences for how test results are displayed.
  • Per-project options can be provided for either framework. The py.test framework will detect conftest.py files in any project which it is testing, where it will look for per-project options like whether to detect and run doctests and for the patterns that it should use to detect test files and functions in the first place. The nose framework, on the other hand, looks for a project-wide setup.cfg file, which is an already-standard way of providing information about a Python package, and looks for a [nosetests] section inside of it.
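To show the shape of the [nosetests] section mentioned above, the sketch below parses a small sample with the standard configparser module. The option values are assumptions chosen for illustration; consult the nose documentation for the options your version supports.

```python
import configparser

# Sample contents of a project's setup.cfg (values are illustrative).
SAMPLE_SETUP_CFG = """\
[nosetests]
verbosity=2
with-doctest=1
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_SETUP_CFG)

# Options in the section correspond to nosetests command-line switches.
nose_options = dict(parser["nosetests"])
```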

And, going beyond what can be accomplished by varying their configuration, both py.test and nose provide support for plug-ins, user-supplied modules that can install new command-line options and add new behaviors to both tools.