    October 24, 2022

    What is Design of Experiments (DOE)?

    • Design of Experiments is a framework that allows us to investigate the impact of multiple different factors on an experimental process
    • It identifies and explores the interactions between factors and allows researchers to optimize the performance and robustness of processes or assays
    • The conventional approach to scientific experimentation (one-factor-at-a-time, or “OFAT”) is limited in the number of variables you can investigate and, critically, precludes investigating how variables interact
    • This blog introduces the principles of Design of Experiments, beginning with its origins

    A richer understanding of biological complexity

    What makes a good cup of tea? A discussion about whether adding milk before or after the tea influences the taste may seem a long way from ensuring that Escherichia coli expresses a particular plasmid, optimizing vaccine formulation and delivery,1,2 or dissecting the intricacies of metabolomics.3 

    But it’s closer than you think.

    After all, scientific revolutions can arise from everyday observations: a falling apple inspired Isaac Newton to formulate gravitational theory. And, of all the places for a revolution to start, a tea party in 1920s Cambridge laid the foundations of a statistical technique called Design of Experiments (DOE), which allows researchers to investigate the impact of simultaneously changing multiple factors.

    Design of Experiments: a surprising origin story

    One afternoon some dons, their wives, and guests were having afternoon tea. One lady said she could taste whether tea or milk was poured into the cup first. (Some people believe that hot tea scorches milk, for example.) The statistician Ronald Fisher, who attended the tea party, devised an experiment to test her claim. The lady was given eight cups in random order: four in which the tea was poured before the milk and four in which the milk was poured first. To analyze the results, Fisher devised what became known as Fisher’s Exact Test. This determines whether any association between two categorical variables (here, the order of pouring and the lady’s judgment) is statistically significant.4 

    As Figure 1 shows, even this handful of cups can give rise to numerous possible permutations. But this only scratches the surface of tea-making’s complexity. A perfect cup of tea depends on multiple other factors, such as the blend, brewing time, and the addition of sugar. In other words, making a perfect cup of tea is complex and multidimensional. DOE allows researchers to investigate the effect of changing multiple factors simultaneously.

    Figure 1: Distribution of possible outcomes, assuming that the lady could not tell whether milk was added before the tea (null hypothesis). 0 = incorrect; X = correct
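The null distribution behind the tea-tasting experiment can be sketched in a few lines of Python. This is an illustrative calculation, not Fisher's own working. It assumes the standard form of the design, in which the taster knows there are four cups of each kind and must pick exactly four as "milk first". The function name `null_distribution` is made up for this example.

```python
from math import comb

CUPS_PER_KIND = 4  # four milk-first and four tea-first cups, as in the story


def null_distribution(n=CUPS_PER_KIND):
    """Probability of getting k milk-first cups right by pure guessing.

    Assumption: the taster picks exactly n of the 2n cups as "milk first".
    """
    total = comb(2 * n, n)  # number of ways to choose which n cups to label
    return {k: comb(n, k) * comb(n, n - k) / total for k in range(n + 1)}


dist = null_distribution()
for k, p in dist.items():
    print(f"{k} of {CUPS_PER_KIND} milk-first cups correct: p = {p:.4f}")

# Chance of a perfect score if the lady is only guessing (1 in 70)
p_all_correct = dist[CUPS_PER_KIND]
print(f"P(all correct by chance) = {p_all_correct:.4f}")
```

Under these assumptions a perfect score would occur by luck only once in 70 attempts, which is why it counts as evidence against the null hypothesis.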

    In a series of blogs, we’re going to explore the basis of DOE, who should consider DOE, and some ways in which this methodology helps experimental biologists deal with life’s inherent complexity. We’ll begin, however, by going back to school.

    School’s out (and so is OFAT)

    Our school teachers advocated a one-factor-at-a-time (OFAT) approach to scientific experimentation: pick a variable (the factor) and vary its value (the levels), while keeping everything else constant. 

    That may be fine in the school lab. Unfortunately, biology doesn’t work that way. Biological variation, for example, can mean results vary randomly around a set point even in a constant environment. Sample collection, transport, preservation, and measurement systems can introduce further sources of variation.5 

    DOE helps us understand emergent phenomena

    Biological phenomena, even life itself, are typically emergent. In other words, new patterns and structures appear through the interactions between autonomous elements.6 Every living thing consists of numerous autonomous parts that interact dynamically and unpredictably as part of one or more systems. This means, for example, that you can’t predict cellular diversity by examining nucleotides’ chemical and physical properties. You also can’t predict the products of cognition by analyzing neuroarchitecture. Emergence is one reason biologists often lack well-developed, robust theoretical frameworks to guide their experiments.

    DOE is better for exploring biological complexity

    Most biological processes are complicated, complex, and multidimensional.7 So, changing one factor probably changes something else. For example, it isn't possible to fully understand the functional consequences of changing a protein's structure without understanding all the contexts in which it appears. Its interactions within biological networks are what really define its function, so even minor changes can produce a plethora of unpredictable down- and upstream effects. DOE allows the exploration of complex, multidimensional experimental design spaces despite such methodological, biological, or chemical variations.7 

    OFAT ignores biology’s inherent complexity. It limits the number of variables that you can investigate and, critically, it precludes any investigation of how variables interact. It’s a bit like trying to analyze the perfect cup of tea by ignoring the temperature of the water, brew time, and blend, and instead just focusing on whether you add the milk first or second. 

    Figure 2: OFAT may convince you you’ve found an optimum… but it may not be the real one.

    Unsurprisingly, OFAT can often identify the wrong system state as the optimum. Moreover, the lack of well-developed, robust theoretical frameworks can result in unconscious cognitive bias: it’s all too easy to develop OFAT experiments that confirm, rather than test, hypotheses.7 DOE helps avoid unconscious cognitive bias and allows researchers to look behind the curtain of biological complexity to see what’s really going on.
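To see how OFAT can settle on the wrong optimum, here is a minimal Python sketch. Everything in it is invented for illustration: the two factors, their five levels, the baseline, and above all the `response` function, which is a toy surface with a strong interaction between the factors and not a model of any real assay.

```python
from itertools import product

LEVELS = [0, 1, 2, 3, 4]  # hypothetical levels, shared by both factors


def response(x, y):
    """Toy response: the factors only pay off when they are raised together."""
    return 10 - 3 * (x - y) ** 2 + x + y


def ofat(baseline):
    """One pass of OFAT: tune x with y fixed, then tune y with the new x."""
    _, y0 = baseline
    best_x = max(LEVELS, key=lambda x: response(x, y0))
    best_y = max(LEVELS, key=lambda y: response(best_x, y))
    return best_x, best_y


def full_factorial():
    """Evaluate every combination of levels and return the best one."""
    return max(product(LEVELS, LEVELS), key=lambda point: response(*point))


ofat_point = ofat(baseline=(0, 0))
best_point = full_factorial()
print("OFAT settles on", ofat_point, "with response", response(*ofat_point))
print("Full factorial finds", best_point, "with response", response(*best_point))
```

With this toy surface, moving either factor alone from the baseline makes things worse, so OFAT never leaves its starting point, while testing the combinations reveals a much better condition. A full factorial is only the simplest design that exposes the interaction; real DOE designs usually need far fewer runs than an exhaustive grid.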

    What is Design of Experiments (DOE)?

    Design of Experiments is a framework that allows us to investigate the impact of multiple different factors—changed simultaneously—on an experimental process. DOE also identifies and explores the interactions between those factors. This allows us to optimize the performance and robustness of our processes or assays. Let’s apply DOE to another simple example: the strawberries you may have with the tea you’ve just added your milk to... 

    DOE looks at different ranges within factors

    Numerous quantitative factors (e.g. hours of sunlight, grams of plant food, and liters of water) or qualitative factors (e.g. the cultivar) can influence the strawberry crop (Figure 3). You need to begin by setting a realistic range for each factor. For example, applying 1 kg of plant food could prove toxic and expensive. Strawberries also need plenty of water to ensure juiciness; applying just 1 ml of water would be difficult to do accurately and could trigger drought stress responses.

     


    Figure 3: How different factors and levels may impact the yield, weight, and taste of a crop of strawberries
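As a rough sketch of what "factors and levels" look like in practice, the Python below lays out a two-level full factorial design for the strawberry example. The factor names follow the text, but the specific low and high values are placeholders chosen for illustration, not horticultural recommendations, and `full_factorial` is a hypothetical helper.

```python
from itertools import product
import random

# Hypothetical low/high levels for each factor (values are placeholders)
factors = {
    "sunlight_hours": [6, 10],
    "plant_food_g": [5, 20],
    "water_l": [0.5, 2.0],
    "cultivar": ["nursery A", "nursery B"],
}


def full_factorial(factors):
    """Return one run (a dict of settings) per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


design = full_factorial(factors)

# Randomize the run order so that drift over time is not confused with a factor
run_order = design[:]
random.Random(0).shuffle(run_order)

print(f"{len(design)} runs in total")
for number, run in enumerate(run_order, start=1):
    print(number, run)
```

Four factors at two levels each give 2 x 2 x 2 x 2 = 16 runs. Shuffling the run order echoes the randomization in the tea-tasting experiment.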

    DOE tests many factors at the same time

    The responses we are looking for in this experiment are the yield, the weight, and the taste of the strawberries. You may decide you want a high yield of the tastiest strawberries. 

    Design of Experiments allows you to test numerous factors to determine which make the largest contributions to yield and taste. Based on this, you can fine-tune the experiment and use DOE to determine which combination of factors at specific levels gives the optimal balance of yield and taste. You can also compare different levels for given factors, such as whether a cultivar from nursery A produces a higher yield, better taste, or both than a cultivar from nursery B.
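One common way to judge which factors contribute most is to compare the average response at a factor's high level with the average at its low level. The sketch below does this for a two-factor, two-level design with coded levels (-1 for low, +1 for high). The yields are made-up numbers used only to show the arithmetic, and `effect` is a hypothetical helper.

```python
from statistics import mean

# Each run: (plant food level, water level, yield). Yields are invented.
runs = [
    (-1, -1, 20.0),
    (+1, -1, 30.0),
    (-1, +1, 25.0),
    (+1, +1, 55.0),
]


def effect(runs, contrast):
    """Mean yield where the contrast is +1 minus mean yield where it is -1."""
    high = [y for food, water, y in runs if contrast(food, water) > 0]
    low = [y for food, water, y in runs if contrast(food, water) < 0]
    return mean(high) - mean(low)


food_effect = effect(runs, lambda food, water: food)
water_effect = effect(runs, lambda food, water: water)
# The interaction contrast is the product of the two coded levels
interaction = effect(runs, lambda food, water: food * water)

print("Main effect of plant food:", food_effect)
print("Main effect of water:", water_effect)
print("Food x water interaction:", interaction)
```

In this invented data set, extra plant food helps far more when water is also high, which shows up as a non-zero interaction term. An OFAT experiment that never ran the high/high combination could not estimate it.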

    DOE lets you investigate specific outcomes

    Design of Experiments also allows you to investigate specific outcomes (what combinations produce the best balance of yield and taste in a robust way) and reduce variability (define new conditions so the strawberry yield remains consistent). Cost may be another consideration. DOE lets you balance trade-offs, such as which conditions achieve the highest yield of strawberries most cost-effectively.

    DOE 101: the blog series

    As we will see over the course of this series of blogs, DOE helps reduce the time, materials, and experiments needed to yield a given amount of information compared with OFAT. As well as these savings, DOE estimates the effect of each factor or interaction with higher precision and lower variability than OFAT. It also systematically estimates the interactions between factors, which is not possible with OFAT experiments.
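The precision argument can be illustrated with a small calculation for the simplest case: two factors at two levels. The sketch assumes every run has the same independent measurement noise (variance `sigma2`) and that an effect is estimated as a difference between two averages. These are textbook simplifications rather than anything specific to this blog, and `effect_variance` is a hypothetical helper.

```python
from fractions import Fraction


def effect_variance(n_high, n_low, sigma2=1):
    """Variance of (mean of n_high runs) minus (mean of n_low runs)."""
    return Fraction(sigma2, n_high) + Fraction(sigma2, n_low)


# 2 x 2 factorial, 4 runs: every main effect compares 2 runs against 2 runs
factorial_runs = 4
factorial_var = effect_variance(2, 2)

# OFAT, 3 runs (baseline, A changed, B changed): each effect compares 1 vs 1
ofat_runs = 3
ofat_var = effect_variance(1, 1)

# OFAT with every condition run twice, 6 runs: each effect compares 2 vs 2
ofat_rep_runs = 6
ofat_rep_var = effect_variance(2, 2)

print(f"Factorial: {factorial_runs} runs, effect variance {factorial_var} x sigma^2")
print(f"OFAT: {ofat_runs} runs, effect variance {ofat_var} x sigma^2")
print(f"Replicated OFAT: {ofat_rep_runs} runs, effect variance {ofat_rep_var} x sigma^2")
```

Under these assumptions, OFAT needs six runs to match the precision the factorial design reaches in four, because the factorial reuses every run in every effect estimate. Even then, OFAT still provides no estimate of the interaction.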

    This introductory blog offers only a very brief overview of DOE. The next few blogs in this series will explore the various elements in more detail, beginning with a deeper exploration of why DOE is worth doing, and who should consider doing it.

    Until then, I’m off for a cup of tea.

    References

    1. Ahl PL, Mensch C, Hu B et al. Accelerating vaccine formulation development using design of experiment stability studies. Journal of Pharmaceutical Sciences 2016;105:3046-3056
    2. Hashiba A, Toyooka M, Sato Y et al. The use of design of experiments with multiple responses to determine optimal formulations for in vivo hepatic mRNA delivery. Journal of Controlled Release 2020;327:467-476
    3. Surowiec I, Johansson E, Torell F et al. Multivariate strategy for the sample selection and integration of multi-batch data in metabolomics. Metabolomics 2017;13:114
    4. Bi J and Kuesten C. Revisiting Fisher’s ‘Lady Tasting Tea’ from a perspective of sensory discrimination testing. Food Quality and Preference 2015;43:47-52
    5. Badrick T. Biological variation: Understanding why it is so important? Practical Laboratory Medicine 2021;23:e00199
    6. Ikegami T, Mototake Y-i, Kobori S et al. Life as an emergent phenomenon: studies from a large-scale boid simulation and web data. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 2017;375:20160351
    7. Lendrem DW, Lendrem BC, Woods D et al. Lost in space: design of experiments and scientific exploration in a Hogarth Universe. Drug Discovery Today 2015;20:1365-1371

    Michael "Sid" Sadowski, PhD

    Director of Scientific Software at Synthace
