*This is an adaptation of a section from our DOE ebook: a biologist’s guide to Design of Experiments. Download it for free!*

- Design of experiments (DOE) offers a wide array of experimental choices, depending on the structure of your DOE campaign and the goals at any particular stage.
- Screening, the first step in most DOE campaigns, usually employs a fractional factorial design, often progressing to higher-order designs as you move through the campaign.
- Response surface methodology designs are typically used for optimization and robustness.
- You can use several other DOE design types, including space-filling designs and definitive screens, during particular campaigns.
- Design of experiments software will not always guide you in making these choices, although it will usually prevent you from making major mistakes.

Design of experiments (DOE) offers a seemingly daunting compilation of statistical approaches. So, this blog offers a users’ guide to the main DOE designs. We’ll look at the designs’ structures, the inherent trade-offs, and how the structure relates to their uses and analysis.

Need a quick primer on DOE before we get started? Check out some of our previous blogs:

- What is Design of Experiments (DOE)?
- Why should I use Design of Experiments in the Life Sciences?
- Overcoming barriers to Design of Experiments (DOE)
- The DOE process: an overview

## Why are there so many types of designs?

As we saw in the last blog, a DOE campaign broadly encompasses screening, refinement and iteration, optimization, and assessing robustness (figure 1). But it’s not a rigid flow. You can drop steps and move backward and forward. The experimental design at each step is linked intimately to the phase in the DOE campaign, which sets the goals and guides the choice of design.

*Figure 1: The stages of a DOE campaign*

### It's a trade-off

In our last blog, we also explained that repeated iterations can move you rapidly from your initial ‘thought experiment’ to optimized conditions and robust data. So, don’t feel you have to design a single experiment that will answer all your questions. DOE is best used sequentially, with each iteration taking you closer to your goal.

As the campaign progresses, the design types involve investing more experimental effort to answer more detailed questions. So, you may have to leave some things for later. You need to make a choice. Do you need an answer now? Can you do a couple of iterations? Some DOE designs allow you to compress some stages (see below) but, of course, this results in trade-offs and limitations. Design of experiments software will often prevent you from making certain choices that your design does not permit.

At any stage in a DOE campaign, you could use one of several designs depending on your assumptions, goals, available run numbers, and so on. We won’t cover all the designs in detail. We’ll highlight the most commonly used and important.

### There are two broad design categories

There are two broad DOE categories: pre-computed and optimal designs. This blog considers pre-computed designs, which, despite some limitations, are useful in a wide array of circumstances. You can, essentially, just look them up. These, however, may be difficult to use if you have special requirements such as constraints or specific models in mind. In these circumstances, optimal (also called ‘custom’) designs will fit your needs, as we’ll discover in a forthcoming blog.

## Unleashing the power of DOE

As we saw in the last blog, to unleash the full power of DOE you need to think in terms of a campaign of sub-experiments with defined objectives for each stage (figure 1). Broadly, a DOE campaign encompasses:

- **Screening**: identifying important factors and interactions.
- **Refinement and iteration**: searching for the optimum value in the potential range of factors.
- **Optimization**: creating a high-quality predictive model to infer optimal conditions.
- **Assessing robustness**: determining the extent to which the system is sensitive to changes in your factor levels.

Some DOE designs (see definitive screening below) lend themselves to achieving the goals of more than one stage at a time, such as screening and optimization. Often, however, this isn’t possible. The choice depends on your factors and the system’s complexity.

### Factorial designs

Factorial designs are typically used at the early stages of a campaign: screening, iteration, and refinement. The goal is to explore a lot of factors in outline. Factorial designs can be used with any type of factor, typically keeping the number of levels small to explore more factors. For example, continuous factors are typically kept to two levels representing the maximum and minimum of an interval of interest.

To help understand the experimental space better, many designs add a central point, with each continuous factor set to the middle of its range, to help determine if there is curvature. We’ve been growing a crop of virtual strawberries through this series of blogs. You can see the benefits of adding a central point in figure 2.

There are two main types of factorial design: full and fractional.

*Figure 2: Adding a single central point can guide you to the actual local maximum rather than an incorrect one.*
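As a rough illustration of how a central point hints at curvature, the sketch below compares the mean response at the corners of a two-level design with the mean response at replicated center runs. The yields are invented for illustration, and `curvature_estimate` is a hypothetical helper name, not part of any DOE package.

```python
from statistics import mean

def curvature_estimate(corner_responses, center_responses):
    """Return mean(center) - mean(corners).

    If the response were purely linear in every factor, the center of the
    design would sit at the average of the corners, so a difference far
    from zero (relative to the noise) suggests curvature.
    """
    return mean(center_responses) - mean(corner_responses)

# Hypothetical strawberry yields (kg per plot) at the four corners of a
# two-factor, two-level design, plus three replicated center runs.
corner_yields = [4.0, 6.0, 5.0, 7.0]
center_yields = [7.4, 7.6, 7.5]

curvature = curvature_estimate(corner_yields, center_yields)
print(f"Center mean minus corner mean: {curvature:.2f}")
```

Here the center runs sit well above the corner average, which is the pattern you would expect when the optimum lies inside the range rather than at one of its edges. In practice you would judge the difference against the spread of the replicated center runs.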

### Full factorial designs

Full factorial designs investigate all possible combinations of factors and levels. So, you can determine main effects and any order of interaction. Full factorial designs, however, usually involve a large number of runs (the number of levels raised to the power of the number of factors: e.g. 2³ = 8 runs for a full factorial with 3 factors and 2 levels of each factor).
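The run count is easy to check by enumerating the combinations. The following is a minimal sketch using only the Python standard library; the factor names and levels are made up for the virtual strawberries and are not taken from the ebook.

```python
from itertools import product

def full_factorial(levels_by_factor):
    """Return every combination of factor levels as a list of dicts."""
    names = list(levels_by_factor)
    return [
        dict(zip(names, combo))
        for combo in product(*levels_by_factor.values())
    ]

# Hypothetical factors, each at two levels (low, high).
factors = {
    "light_hours": [8, 16],
    "water_ml": [100, 300],
    "fertilizer_g": [1, 5],
}

runs = full_factorial(factors)
for run in runs:
    print(run)
print(f"{len(runs)} runs")
```

Adding a fourth two-level factor doubles the list to 16 runs, which is why full factorials become expensive quickly.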

Full factorial designs are often most appropriate when screening has identified a few important factors to optimize or when using liquid handling automation affording an increase in throughput. Figure 3 shows that a full factorial design works through every combination of every factor that screening suggested could influence the yield of our virtual strawberries.

*Figure 3: A full factorial design to uncover the best conditions for yield in a strawberry crop*

### Fractional factorial designs

When factors or levels increase, full factorial designs can become infeasible even in sophisticated, high-capacity facilities. You can use fractional factorial designs when you have a large number of factors to screen or where resources are limited.

Fractional factorial designs rest on two assumptions: first, that while there may be many effects, only a few are important; and second, that lower-order effects (e.g. interactions between two factors) are more common and more influential than higher-order interactions (typically more than three). So, the design does not try to separate the higher-order interactions from each other, which drastically reduces run numbers. We’ll cover aliasing, an important concept in understanding factorial designs, in the next section.

A fractional factorial design takes a rational sample of the experimental landscape to provide a balanced, structured design that generates explanatory and predictive models. Fractional factorial DOE is not, however, suitable for sophisticated modeling. Returning to our strawberries: figure 4 shows that a fractional factorial design still covers a lot of ground in only half the runs, which saves time and resources.

*Figure 4: A fractional factorial design, again for the same strawberry crop*
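One textbook way to build a half fraction is to run a full factorial in all but one factor and set the last factor to the product of the others, using coded levels of -1 and +1. The sketch below assumes that construction; `half_fraction` is a hypothetical helper, and DOE software may choose a different fraction.

```python
from itertools import product
from math import prod

def half_fraction(n_factors):
    """Build a two-level half fraction in coded units (-1 / +1).

    The first n_factors - 1 columns form a full factorial; the final
    column is the product of the others (e.g. D = A*B*C for 4 factors).
    """
    return [
        base + (prod(base),)
        for base in product([-1, 1], repeat=n_factors - 1)
    ]

design = half_fraction(4)
for run in design:
    print(run)
print(f"{len(design)} runs instead of {2 ** 4}")
```

Each factor still appears at its low and high level equally often, which is part of what keeps the design balanced.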

## Assessing factorial designs: aliasing, resolution, and power

### Aliasing

Aliasing (also known as confounding) between two effects means they cannot be distinguished from each other. Taking runs out of a factorial inevitably results in aliasing, since every run is required to distinguish some combination of factors. So, when analyzing data you can encounter a situation where you can’t tell whether, for example, an interaction between two or three factors causes a particular effect. You either have to use your existing knowledge to decide or, if necessary, do more experiments.

Aliasing is useful during screening to explore more factors without incurring a huge experimental cost. If you assume that higher-order effects are likely to be less important than lower-order effects (e.g. main effects and interactions between two factors) then you can alias higher-order interactions with each other and substantially reduce the number of runs. This approach forms the foundation of fractional factorial designs. Assessing a particular fractional factorial is then about working out if you can determine any of the particular higher-order effects that you think might be interesting.
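To make aliasing concrete, the sketch below takes a four-factor half fraction (assumed here to be generated by D = A*B*C in coded -1/+1 units) and groups effects whose columns of signs are identical. Effects in the same group cannot be told apart from the data in this design. The names are illustrative only.

```python
from itertools import combinations, product
from math import prod

factor_names = "ABCD"

# Half fraction of a 2^4 design: full factorial in A, B, C with D = A*B*C.
design = [base + (prod(base),) for base in product([-1, 1], repeat=3)]

def effect_column(effect):
    """Column of signs for an effect such as 'AB' across all runs."""
    indices = [factor_names.index(name) for name in effect]
    return tuple(prod(run[i] for i in indices) for run in design)

# Group every main effect and interaction by its column of signs.
alias_groups = {}
for order in range(1, len(factor_names) + 1):
    for combo in combinations(factor_names, order):
        effect = "".join(combo)
        alias_groups.setdefault(effect_column(effect), []).append(effect)

for effects in alias_groups.values():
    print(" = ".join(effects))
```

In this fraction every main effect shares a column with a three-factor interaction, and the two-factor interactions pair up with each other (for example AB with CD). The four-factor interaction ABCD is constant across the runs, so it is aliased with the overall mean.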

### Resolution and power

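Resolution summarizes which orders of effect are aliased with each other and helps you choose a fractional factorial design. The higher the resolution, the higher the order of interactions you can separate from the main effects and from each other. For example, resolution 4 designs (often written resolution IV), a common starting point, keep main effects clear of each other and of 2-factor interactions, although some 2-factor interactions remain aliased with other 2-factor interactions. It takes a resolution 5 design to distinguish any 2-factor interaction from any other. The design should also have adequate power: in other words, the number of runs should have a good chance of distinguishing signal from noise. We’ll cover aliasing, resolution, and power in more detail in future blogs.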
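For regular two-level fractions, resolution is conventionally defined as the length of the shortest "defining word": the lowest-order effect whose column of signs is constant across the design. The sketch below computes that by brute force for small designs in coded -1/+1 units. `design_resolution` and `fraction_with_product_generator` are hypothetical helpers written for this illustration.

```python
from itertools import combinations, product
from math import prod

def fraction_with_product_generator(n_factors):
    """Half fraction where the last factor is the product of the others."""
    return [
        base + (prod(base),)
        for base in product([-1, 1], repeat=n_factors - 1)
    ]

def design_resolution(design):
    """Length of the shortest effect whose sign column never changes.

    Returns None when no such effect exists (e.g. a full factorial).
    """
    n_factors = len(design[0])
    for order in range(1, n_factors + 1):
        for combo in combinations(range(n_factors), order):
            signs = {prod(run[i] for i in combo) for run in design}
            if len(signs) == 1:
                return order
    return None

for k in (3, 4, 5):
    design = fraction_with_product_generator(k)
    print(f"{k} factors, {len(design)} runs: resolution {design_resolution(design)}")
```

With four factors in eight runs the shortest defining word is ABCD, giving resolution 4: main effects are only aliased with three-factor interactions, while two-factor interactions are aliased in pairs.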

## Response surface methodology designs

Response surface methodology (RSM) designs are typically used for optimization and robustness. (You may see RSM designs referred to by specific names such as Box-Behnken or central composite designs. Future blogs will consider these and other RSM designs.) RSM designs can be applied to many kinds of factors. In general, however, RSM designs are not applied to categorical and discrete factors, the analysis of which can become very expensive using RSM.

You could use RSM designs if you detect significant factors during screening that display curvature (figure 2). RSM models that curvature. Some types of RSM designs can be thought of as full factorial designs across two levels for each factor, with center and axial points to sample additional levels without needing to do a full factorial across all levels.

So, if you used a two-level full factorial design during the refinement and iteration stage, you need to add only the axial points and replicated center points to achieve an RSM design. This translates into a small amount of additional experimental work and can be a useful pattern when iterating from stage to stage of your DOE campaign. In reality, there are nuances to the different types of RSM designs, which we will consider in future blogs.
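The sketch below shows that augmentation for a central composite design in coded units: the two-level factorial corners, plus a pair of axial points per factor, plus replicated center points. The axial distance `alpha` and the number of center runs are assumptions made for illustration (the default here is the commonly used "rotatable" choice, the fourth root of the number of corner runs); real campaigns may use other values.

```python
from itertools import product

def central_composite(n_factors, alpha=None, n_center=3):
    """Return (corners, axial, centers) for a central composite design."""
    if alpha is None:
        # Assumed default: rotatable axial distance.
        alpha = (2 ** n_factors) ** 0.25

    # Two-level full factorial corners at -1 / +1.
    corners = [
        tuple(float(level) for level in run)
        for run in product([-1, 1], repeat=n_factors)
    ]

    # Axial points: one factor at -alpha or +alpha, the rest at 0.
    axial = []
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[i] = sign * alpha
            axial.append(tuple(point))

    # Replicated center points, all factors at 0.
    centers = [tuple([0.0] * n_factors) for _ in range(n_center)]
    return corners, axial, centers

corners, axial, centers = central_composite(2)
print("Already run (factorial):", corners)
print("New axial runs:", axial)
print("New center runs:", centers)
print("Total runs:", len(corners) + len(axial) + len(centers))
```

For two factors, the four factorial runs from the earlier stage are reused and only seven new runs are added.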

### Optimizing your response

RSM designs allow you to build a predictive model of your system’s response surface, which you then use to find the factor settings or region that will optimize your response. If your design space is a mountain, the optimum response may be the peak (figure 5). You are searching for the combination of factor settings that shows you where the peak is on this particular topographical map.
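As a deliberately simplified, one-factor illustration of "finding the peak", the sketch below fits a quadratic through responses measured at the coded levels -1, 0, and +1 and locates its stationary point. Real RSM models involve several factors and a least-squares fit over replicated data; the response values and function names here are invented.

```python
def fit_quadratic_coded(y_low, y_center, y_high):
    """Fit y = b0 + b1*x + b2*x**2 exactly through x = -1, 0, +1."""
    b0 = y_center
    b1 = (y_high - y_low) / 2
    b2 = (y_high + y_low) / 2 - y_center
    return b0, b1, b2

def stationary_point(b1, b2):
    """Coded x where the fitted slope is zero, or None if there is no curvature."""
    if b2 == 0:
        return None
    return -b1 / (2 * b2)

# Hypothetical yields at the low, center, and high setting of one factor.
b0, b1, b2 = fit_quadratic_coded(y_low=4.0, y_center=7.0, y_high=6.0)
x_opt = stationary_point(b1, b2)
y_opt = b0 + b1 * x_opt + b2 * x_opt ** 2

kind = "maximum" if b2 < 0 else "minimum"
print(f"Fitted model: y = {b0} + {b1}*x + {b2}*x^2")
print(f"Predicted {kind} at coded x = {x_opt:.2f}, y = {y_opt:.3f}")
```

A negative quadratic coefficient means the stationary point is a peak. If the predicted optimum fell outside the range from -1 to +1, that would be a cue to shift the factor range in a further iteration rather than trust the extrapolation.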

RSM designs share many features with factorial designs. The assessment, however, differs. Instead of concepts like power and aliasing, which are relevant to determining which effects are real, RSM uses concepts relating to the errors when making predictions in the relevant design space. We’ll cover assessing designs more fully in a future blog.

*Figure 5: Response surface methodology*

## Other DOE designs

Researchers can benefit from several other DOE designs, which are not necessarily pre-computed. We’ll briefly introduce two: space-filling designs and definitive screens. We’ll explore these and other designs in future blogs.

### Space-filling designs

Space-filling designs investigate factors at many different levels, without making any assumptions about the structure of the space or the type of model. This means that you lose the efficiency and some statistical properties of classical DOE designs. Space-filling designs are, however, useful if you do not have much prior knowledge of your system, if you want to investigate the system more broadly, or if you need a starting point for future optimization during pre-screening.
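Latin hypercube sampling is one widely used space-filling approach, chosen here purely as an example since this blog does not single out a specific method. The standard-library sketch below splits each factor's range (scaled to 0 to 1) into as many equal slices as there are runs and places exactly one run in each slice, shuffling the slices independently per factor.

```python
import random

def latin_hypercube(n_runs, n_factors, seed=0):
    """Return n_runs points in the unit hypercube, one per slice per factor."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_factors):
        slices = list(range(n_runs))
        rng.shuffle(slices)  # independent random slice order per factor
        # Draw a random position inside each slice of width 1/n_runs.
        columns.append([(s + rng.random()) / n_runs for s in slices])
    # Transpose the columns into a list of runs.
    return [tuple(column[i] for column in columns) for i in range(n_runs)]

points = latin_hypercube(n_runs=6, n_factors=2)
for point in points:
    print(tuple(round(value, 3) for value in point))
```

To use the points, rescale each coordinate from the 0 to 1 interval to the real range of the corresponding factor.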

### Definitive screening designs

Definitive screening designs are a relatively new class of design that combines aspects of screening and optimization. Definitive screening designs investigate continuous factors at three levels and can accommodate two-level categorical factors. This type of design is hugely efficient in terms of the number of runs and you may be able to go straight to optimization after just one experiment. The stripped-down nature of these designs makes them an excellent choice when appropriate. But definitive screening designs may be less appropriate than other designs in very noisy systems.
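One published way to construct a definitive screening design is to stack a conference matrix (a square matrix with zeros on the diagonal, -1/+1 elsewhere, and mutually orthogonal columns), its mirror image, and a single center run. The sketch below assumes that construction for four continuous factors; the particular conference matrix is hard-coded for illustration, and DOE software may build these designs differently.

```python
# A 4x4 conference matrix: zero diagonal, -1/+1 elsewhere, orthogonal columns.
CONFERENCE_4 = [
    (0, 1, 1, 1),
    (1, 0, -1, 1),
    (1, 1, 0, -1),
    (1, -1, 1, 0),
]

def definitive_screening(conference):
    """Stack the matrix, its mirror image, and one center run."""
    n_factors = len(conference)
    mirrored = [tuple(-level for level in row) for row in conference]
    center = [tuple([0] * n_factors)]
    return [tuple(row) for row in conference] + mirrored + center

dsd = definitive_screening(CONFERENCE_4)
for run in dsd:
    print(run)
print(f"{len(dsd)} runs for {len(CONFERENCE_4)} factors, each at three levels")
```

Nine runs cover four factors at three coded levels (-1, 0, +1), compared with 81 runs for a three-level full factorial.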

## Summary

As we’ve seen, it’s easy to get overwhelmed by all the design choices most DOE software gives you. But while there are many design choices, the main determinant is the structure of your DOE campaign and the goals of your particular stage. For example, at screening or refinement and iteration you would probably use a fractional factorial design, often progressing to a full factorial design and then to RSM designs as you move through the stages in your campaign. As we’ve alluded to, technical restrictions govern your use of specific designs in particular circumstances. Most software packages will help identify when these apply and help you avoid making the wrong choice. But this still leaves a lot of options. We hope that this blog will help you choose the most appropriate design and take the next step toward unleashing DOE’s full power.


## Michael "Sid" Sadowski, PhD

Michael Sadowski, aka Sid, is the Director of Scientific Software at Synthace, where he leads the company’s DOE product development. In his 10 years at the company he has consulted on dozens of DOE campaigns, many of which included aspects of QbD.