I've recently been testing a desktop application I built with WPF and .NET 4 which collects data from the serial port and draws various charts on screen in real time. The volume of data isn't immense (one data frame of approximately 200 bytes arrives every 200 milliseconds), but there is additional processing happening in the background. One of the steps being performed is regression analysis on various data points to construct curves of best fit.
In production this process will likely run for no more than 30 minutes, but during testing I wanted to find the limits of the application by running the process for 2 hours, to be sure there were no memory leaks or performance issues. I was surprised when, after just 1 hour, the UI abruptly became totally unresponsive.
I did find the issue and made the fix, but this got me thinking about where else this sort of issue could occur in software - because all I'd done was apply a simple form of software performance testing, known as endurance testing, which just involved running the software for significantly longer than normal. The endurance test has something in common with its hardware equivalent, soak testing.
Figure: The bathtub curve. The name derives from the cross-sectional shape of a bathtub. Image source: Wikipedia
This leads to another hardware concept which I think can be partially applied to software - the bathtub curve, as shown above. The bathtub curve is used in reliability engineering to describe a particular form of the hazard function which comprises three parts.
Applied to hardware, the bathtub curve means:
- The first part is a decreasing failure rate, known as early failures. Burn-in testing aims to detect (and discard) products which fail at this stage. If the burn-in period is made sufficiently long (and artificially stressful), the system can then be trusted to be mostly free of further early failures once the burn-in test is complete.
- The second part is a constant failure rate, known as random failures. These are like the "background" level of failures, which can usually never be totally eliminated from the production process. QC managers will often aim for a low level of failure here, via various quality control measures (such as TQM or Six Sigma).
- The third part is an increasing failure rate, known as wear-out failures. These can be detected via soak testing.
In electronics, soak testing involves testing a system up to or above its maximum ratings for a long period of time. Often, a soak test can continue for months, while also applying additional stresses like extreme temperatures and/or pressures, depending on the intended environment.
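To make the three-part shape concrete, here is a minimal sketch that builds a bathtub-shaped hazard function as the sum of a decreasing term, a constant term, and an increasing term. Using Weibull hazards for the first and third terms is one common modelling choice, not the only one, and every parameter value below is an illustrative assumption rather than measured data.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three failure-rate components (all parameters are made up)."""
    early = weibull_hazard(t, shape=0.5, scale=100.0)      # shape < 1: decreasing rate
    random_rate = 0.01                                     # constant background rate
    wear_out = weibull_hazard(t, shape=5.0, scale=1000.0)  # shape > 1: increasing rate
    return early + random_rate + wear_out


if __name__ == "__main__":
    # Print the hazard at a few points in time to see the high-low-high shape.
    for t in (1, 10, 100, 300, 1000, 1500):
        print(f"t={t:>5}  hazard={bathtub_hazard(t):.5f}")
```

With these assumed parameters the rate starts high, falls towards the constant background level, and rises again late in life, which is the shape that burn-in testing (left side) and soak testing (right side) are designed to probe.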
Applied to software, the bathtub curve might show bug count on the y-axis and total application run-time on the x-axis. Then, the three parts could mean:
- First part (early failures): In software, this could apply to bugs which break the software's functional requirements or specifications, and might be tested with various functional testing methods such as unit testing or integration testing. These are normally found early in the test cycle.
- Second part (random failures): Might apply to random bugs which are hard to detect, occur in specific conditions, and are outside the existing test coverage. An example might be a heisenbug.
- Third part (wear-out failures): Bugs which appear due to memory leaks or performance degradation. These can be tested with endurance testing. During endurance tests, memory utilization is monitored to detect potential memory leaks, and throughput/response times are monitored to detect performance degradation.
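The memory-monitoring side of an endurance test can be sketched roughly as follows. This is not the code from my WPF application; it is a small Python illustration using the standard `tracemalloc` module. The `leaky_process` function, the frame counts, and the sampling interval are all hypothetical, and a real endurance run would last hours against the actual application rather than a tight loop.

```python
import tracemalloc

retained = []  # hypothetical leak: every frame is kept alive forever


def leaky_process(frame):
    """Stand-in for per-frame processing that accidentally retains data."""
    retained.append(bytes(frame))


def slope(ys):
    """Least-squares slope of ys against the sample index 0..n-1."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def endurance_run(process, frames=5000, sample_every=500):
    """Feed simulated 200-byte frames to `process`, sampling traced memory."""
    samples = []
    tracemalloc.start()
    for i in range(frames):
        process(bytearray(200))
        if (i + 1) % sample_every == 0:
            current, _peak = tracemalloc.get_traced_memory()
            samples.append(current)
    tracemalloc.stop()
    return samples


if __name__ == "__main__":
    samples = endurance_run(leaky_process)
    growth = slope(samples)  # bytes gained per sampling interval
    print(f"memory samples: {samples}")
    print(f"trend: {growth:.0f} bytes per interval")
```

A trend that stays clearly positive over a long run suggests a leak worth investigating, whereas a flat trend with some noise is what a healthy process should show. The same sampling loop could record per-frame processing time to watch for response-time degradation.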