Part 2 of the Cognitive System Testing series. Read part 1 here.
A cognitive system, like most software systems, is a collection of modular components, each with varying degrees of complexity. One quick way of testing a cognitive system is to use a “smoke test”, which tells you if the system is behaving so poorly that it may as well be on fire. The smoke test suite should tell you if the system as a whole is responsive to user inputs and if the inner components are generally speaking to each other.
With the whole system at your fingertips, you may have a desire to make tests as comprehensive as possible. Resist the urge! Our ideal testing approach is a layered approach and the smoke test is just the tip of the spear. The smoke test should give you just enough information to decide if you want to run the rest of your test suites (you do have multiple batches, don’t you?). In fact, in our Watson solutions, our smoke tests don’t even verify if Watson provides the correct answer to a question, just that Watson provides an answer!
In any build process (hopefully an automated one), software that does not compile will not be delivered. Compile-time errors are fairly easy to catch and provide a fail-fast mechanism. If all of your software modules compile, that is a good start, but they may still not run together. Smoke testing is a way of quickly flushing out batches of runtime errors by forcing the components to talk to each other.
A good smoke test suite
Your smoke test suite should aim for the extreme version of the 80/20 rule: write as few tests as possible to cover the major integration points within your application. Some of our Watson solutions started with only one test in their test suite, asking a single question of the Watson system. A medical Watson solution started with just a handful: each test covered one of the major disease types handled by the system, with varied mock patient data to cover interesting patient characteristics (gender, age classes, etc.). Your solution may need a few more test cases. The important thing is to hit each runtime component. In the Watson solutions described above, this included a UIMA pipeline, a REST layer, a database, and a machine learning model.
I mentioned earlier that our smoke tests only verify that Watson provides an answer, not the answer. This is an important distinction. By allowing some freedom in the responses, we prevent the smoke test from being brittle. As previously noted, Watson systems are probabilistic, non-deterministic systems, and that can play havoc with a strict test. We rely on other test suites to verify that the system is fully functioning.
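The "an answer, not the answer" idea can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the Watson solutions: `smoke_check`, `ask`, and `fake_ask` are hypothetical names, and `ask` stands in for whatever call reaches your deployed system (for example, a thin wrapper around its REST layer).

```python
from typing import Callable, Iterable, List


def smoke_check(ask: Callable[[str], str], questions: Iterable[str]) -> List[str]:
    """Return the questions that failed to produce any answer.

    `ask` is a hypothetical stand-in for a call into the deployed system.
    """
    failures = []
    for question in questions:
        try:
            answer = ask(question)
        except Exception:
            # Any runtime error between components counts as a smoke failure.
            failures.append(question)
            continue
        # Deliberately loose: require a non-empty answer, never a specific one.
        if not isinstance(answer, str) or not answer.strip():
            failures.append(question)
    return failures


def fake_ask(question: str) -> str:
    """Stand-in for the real system, used only to demonstrate the check."""
    if "broken" in question:
        raise RuntimeError("pipeline unavailable")
    return "some answer"
```

Calling `smoke_check(fake_ask, ["What is X?", "broken question"])` returns only the question that raised an error. Because the check never compares against an expected answer, a retrained model or a reordered answer list does not break it.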
In case of errors
Ideally all of our tests pass. If the smoke test suite fails, STOP: it’s as bad as if the system is on fire! A smoke test failure should prevent the other test suites from running and should start an immediate triage process. Our smoke test suites include a series of log scanners which look for error and exception messages and send a detailed email to everyone who contributed to the failed build. The failure is immediately given high priority. Other channels, such as SMS alerts and Slack channel notifications, are also great for communicating the failure.
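A log scanner of this kind, together with the gate that keeps later suites from running, might look roughly like the sketch below. The function names, the `*.log` file pattern, and the regular expression are assumptions made for illustration, not the scanners used in our builds.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Assumed pattern: the word ERROR, or any name ending in "Exception".
ERROR_PATTERN = re.compile(r"\bERROR\b|\b\w*Exception\b")


def scan_logs(log_dir: str) -> List[Tuple[str, str]]:
    """Return (file name, line) pairs for every suspicious log line."""
    findings = []
    for path in sorted(Path(log_dir).rglob("*.log")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if ERROR_PATTERN.search(line):
                    findings.append((path.name, line.strip()))
    return findings


def should_run_remaining_suites(
    smoke_failures: List[str], findings: List[Tuple[str, str]]
) -> bool:
    """Gate: later suites run only when the smoke test is completely clean."""
    return not smoke_failures and not findings
```

The gate is intentionally strict: a single failed smoke test or a single suspicious log line is enough to stop the pipeline and start triage.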
Example failure email
The smoke test for build 20160916_0830 on system node123 failed with several errors.
grep -R ERROR /solution/20160916_0830/logs found
install.log:  ERROR    Could not load messages.properties
runtime.log:  ERROR    Failed to initialize module ComponentA due to IllegalArgumentException
testcase.log: ERROR    Question 1 did not return a response
The following code updates were added to the build:
abcd1234 (johndoe) Integrate ComponentA into build
efgh5678 (janedoe) Fix NullPointerException in installer
This email includes the key details: which build was installed, where it was installed, where the logs can be found, which errors indicate the failure, and which code changes went into the build.
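Assembling a message like the one above is mostly string formatting. The sketch below is one possible shape, with `build_failure_report` and its parameters invented for illustration. It only builds the text and leaves delivery (email, SMS, Slack) to whatever notification tooling you already use.

```python
from typing import List, Tuple


def build_failure_report(
    build_id: str,
    host: str,
    log_dir: str,
    findings: List[Tuple[str, str]],
    commits: List[Tuple[str, str, str]],
) -> str:
    """Format a plain-text failure report.

    `findings` holds (log file, log line) pairs; `commits` holds
    (revision, author, summary) triples for the changes in the build.
    """
    lines = [
        f"The smoke test for build {build_id} on system {host} failed.",
        f"Errors found under {log_dir}:",
    ]
    lines += [f"{name}: {text}" for name, text in findings]
    lines.append("The following code updates were added to the build:")
    lines += [f"{rev} ({author}) {summary}" for rev, author, summary in commits]
    return "\n".join(lines)
```

Keeping the report as plain text means the same body can be reused across notification channels.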
A smoke test suite needs to quickly hit the major integration points and code paths in your application and determine whether the application is in a seriously failed state. The smoke test suite should be small, run quickly, and tolerate variation in the system's output. It is just the first part of your overall testing solution, and it can leave some work to the other suites.