How to Assess Testing Resource Allocation

When I come in as a consultant to look at how software testing is done, it's common to start with a conversation about strategy — in other words, the major pieces of the test effort, and how they should ideally fit together to reduce risk.

One of the worst responses I hear is, "Well, this is how we do it," with no explanation of why. On one project, the test staff could not even explain what they were doing — ask me about that story another day.

You might expect a test strategy document to better explain the organization’s test effort. Sadly, I find these documents are fairly useless; most are just a list of ideas. A test strategy document might include a mix of test types and approaches, such as:

  • unit tests

  • performance tests

  • test features as they are created

  • regression tests at the GUI and API levels

  • continuous integration

  • crowdtesting

  • browser compatibility

  • security

  • accessibility

  • localization.

The result is a big list with very little clarity, detail or emphasis on any particular area.

Now imagine that your company has 100 points of "test effort" to spend on the items listed above. How much do you spend on each, and why? For that matter, imagine that every project has a percentage of its budget to spend on testing. You'd expect critical projects to receive a higher percentage of budget relative to the number of features covered. Yet most companies do not allocate resources this way. Instead of "testers" with an integrated vision of the work, there are people doing testing activities. People whose role involves testing pull the items they know how to do from the strategy document, or the things they did in the last sprint, and do some of them, maybe.
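
To make the thought experiment concrete, here is a minimal sketch of that 100-point budget as code. The categories and numbers are invented purely for illustration; the only rule is that the total is fixed, so spending more in one area means spending less somewhere else.

```python
# Hypothetical 100-point test-effort budget. Categories and numbers are
# invented for illustration; the only constraint is that the total is fixed.
budget = {
    "unit tests": 30,
    "feature testing": 25,
    "GUI/API regression": 20,
    "performance": 10,
    "security": 10,
    "accessibility": 5,
}

assert sum(budget.values()) == 100, "test effort is a zero-sum allocation"
```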

We used to ask "How much testing is enough?" and shrug, because testing ended when we hit a key date and had no known show-stopper defects. Today, we could ask "How much of each of these testing types is enough?" To take it a step further, we could ask the organization, "What is the value produced by investing this much in each of these types of testing? Should we move the slider to invest more or less?"

If you can answer those questions, you'll be much closer to explaining testing as an investment, rather than a cost.

Assessing the value of test effort

In the past, the classic measurement of test effort was the developer-to-tester ratio. My old mentor, Dr. Cem Kaner, co-wrote a classic paper on this, presenting it at the Pacific Northwest Software Quality Conference nearly 20 years ago. At that time, build systems were standard, while continuous integration and automated unit testing were just entering the early adopter phase. Kaner, along with co-authors Elisabeth Hendrickson and Jennifer Smith-Brock, concluded that the measurement was a mixed bag — after all, any individual, developer or tester, can do a lazy or poor job. They suggested staffing projects according to risk, with high-risk and larger projects earning more resources. Here's how they suggested determining risk:

The level of risk is affected by such factors as the technical difficulty of the programming task, the skills of the programmers, the expectations of the customers, and the types of harm that errors might cause. The more risk, the more thoroughly you'll have to test, and the more times you'll probably have to retest.

To do more testing on a project because it is riskier, we must first determine a baseline, a standard amount of work to perform. Here's an exercise to help with that.

First, lay out the techniques the team uses to reduce risk. Use a survey if necessary. Without anything else to measure, you can use the 100 points of effort I suggested above. The spreadsheet could look something like this:

What | Points we should spend | Points we do spend
--- | --- | ---
Unit testing | |
Feature testing | |
Human regression | |
Automated regression (GUI) | |
Automated regression (API) | |
Performance | |
Crowdtesting | |
Accessibility | |
Localization | |
Continuous integration | |
Blue/green rollouts | |
User acceptance testing | |
Usability | |
Platform engineering (environments and data) | |
Device compatibility | |
Hardware testing | |

Make a single template, copy it into an online spreadsheet tool, and invite team members — not just QA — to populate it. The results should yield a few interesting metrics:

  • the average (mean) result for each category;

  • the median (middle) result for each category;

  • the standard deviation within each category;

  • the extreme (minimum and maximum) results for each category.

The metrics above provide a window into how the team assesses its current and potential testing resource allocation. There is no right answer for this exercise. Points have no meaning; they are an imaginary concept used to compare relative effort.
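
If the survey export lands in a spreadsheet or CSV, those summary statistics take only a few lines to compute. Here is a minimal sketch in Python; the category names and point values are invented for illustration, and each list holds one respondent's answer for that category.

```python
import statistics

# Hypothetical survey responses: category -> points each respondent thinks
# the team should spend. All numbers are invented for illustration.
responses = {
    "Unit testing": [30, 25, 40, 10, 35],
    "Feature testing": [20, 30, 15, 40, 25],
    "Automated regression (GUI)": [10, 5, 20, 15, 0],
    "Performance": [5, 10, 0, 5, 10],
}

for category, points in responses.items():
    print(
        f"{category:30s}"
        f" mean={statistics.mean(points):5.1f}"
        f" median={statistics.median(points):5.1f}"
        f" stdev={statistics.stdev(points):5.1f}"
        f" min={min(points):3d} max={max(points):3d}"
    )
```

A large standard deviation, or a big gap between the extremes, is usually the interesting signal: it marks a category the team does not see the same way.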

That said, there can be wrong answers. If the results vary widely and the differences line up with job descriptions, that would indicate the various roles do not really understand what their peers in other jobs are doing, or why.

The discovery process

Two ideas help advance a real, useful test strategy. The first is to consider the level of risk on each project and staff accordingly. Risk exposure is the probability that something goes wrong multiplied by the cost if it does. Even when that probability is low, weigh what is at stake in how you staff a project: the potential reward, or the dollars that flow through the system.
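
As a rough sketch of that arithmetic, with entirely invented probabilities and costs:

```python
# Hypothetical projects: (probability something goes wrong, cost if it does).
# Both numbers are invented; the point is the multiplication.
projects = {
    "billing rewrite": (0.30, 500_000),     # risky, expensive to get wrong
    "marketing microsite": (0.50, 10_000),  # bug-prone, but cheap to fix
}

for name, (probability, cost) in projects.items():
    exposure = probability * cost
    print(f"{name:20s} risk exposure = ${exposure:,.0f}")
```

Here the billing rewrite carries roughly $150,000 of exposure against $5,000 for the microsite, which argues for giving it a much larger share of the testing budget even though the microsite is more likely to ship with bugs.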

The easiest way to identify potential risks is to look at what went wrong on previous projects. Those issues generally fall into three broad categories:

  • defects, especially show-stopping bugs;

  • higher-level strategic problems like technology failure, marketplace failure or late project delivery;

  • regulatory compliance and conformance challenges.

If other projects ran late, odds are this one will run late too. We can mitigate lateness by structuring the work in phases, so a late project can still ship something. Likewise, we can reduce the chance that a feature will be rejected by the marketplace through A/B split testing. Comparing the kinds of testing outlined above with the actual bugs on recent work can help us understand whether we’re using the right techniques.

The second approach is to take a high-level look at how the team spends its time. If, for example, the team doesn't spend much time on compatibility testing, but that is where serious defects are found — by customers, no less — then it may be time to adjust. Anyone can do this. Management can gather the numbers to make decisions over time. Line workers can either create a survey, or just publish the numbers themselves.
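
One way to make that comparison concrete is to put the share of effort next to the share of escaped defects for each category and flag the mismatches. A minimal sketch, with invented numbers:

```python
# Hypothetical data: share of test effort spent per category versus the
# share of customer-reported defects that land in that category.
effort_share = {"feature testing": 0.50, "regression": 0.35,
                "compatibility": 0.05, "performance": 0.10}
defect_share = {"feature testing": 0.20, "regression": 0.15,
                "compatibility": 0.55, "performance": 0.10}

for category in effort_share:
    gap = defect_share[category] - effort_share[category]
    if gap > 0.10:  # arbitrary threshold: defects outpace effort by 10+ points
        print(f"Consider investing more in {category}: "
              f"{defect_share[category]:.0%} of escaped defects, "
              f"{effort_share[category]:.0%} of effort")
```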

It’s much easier to start a conversation about a spreadsheet than one that begins by pointing fingers and saying, "Why didn't you find that bug?"

Give it a try – see what you find. And share it with the rest of the testing community to provide value outside your own walls.
