Don’t get hung up on gene synthesis

‘A critical limitation in synthetic biology today is the time and effort expended during fabrication of engineered genetic sequences.’ Current Wikipedia article on synthetic biology
‘Be hard on your beliefs. Take them out onto the verandah and beat them with a cricket bat.’ Tim Minchin

Synthetic biology is a new field, but it sometimes feels like we already have dogma.  The idea that de novo DNA synthesis costs are limiting seems deeply ingrained: many of the major news stories we hear in synbio are the latest milestones of ever larger chunks of DNA being stitched together, ever increasing numbers of strains being created, and the cost of DNA synthesis falling exponentially.  There’s no doubt that the advances to date have been phenomenal, and I certainly tell anyone who gives me half a chance how processes that took months when I was a PhD student can now be completed in minutes (and it really wasn’t that long ago).  The problem is, I’m not sure our attitudes have kept pace with this change.  The thesis of this post is that, thanks to these immensely rapid advances, DNA synthesis is no longer such an issue, and we should now focus as a community on the bigger problem of effectively addressing biological complexity.

There are two reasons why we might need to make more DNA than is economic at the moment: either to make more constructs, or larger ones.

Larger constructs

The fundamental issue with building larger constructs (e.g. >50 kb) is not making the DNA, but knowing how to make such large and complex constructs work.  Either you need full knowledge of how dozens of biological parts will work together (or not), or you resort to copying large chunks (or all) of what you know works in nature.  The main limitation on making larger and more complex synbio constructs is therefore not DNA synthesis, but our knowledge of what to do with it.  Fundamentally, life is still too complex and unpredictable for us to productively utilise large, engineered constructs at this point in our understanding of biology.  We can make great advances with DNA that can be directly and easily assembled from small building blocks, and along the way develop the tools that will eventually allow us to truly engineer the more ambitious systems that will need larger constructs.

More constructs: Brute-forcing biology

There has been a trend in biology to address the huge complexity and unpredictability of biological systems by gathering ever increasing amounts of data and performing ever-larger experiments.

Figure 1: Lots of data (Source: NCBI)

The equivalent in biological engineering is the screen.  The theory goes that if we can perform enough high-throughput screens fast enough, we can cover enough random or semi-random space to successfully optimise any biological system to do what we need, no matter how complex.  This can work, and has been used across biotechnology for decades to develop new drugs, better antibodies, better enzymes and more productive microbes.  However, the problems are manifold.  Setting up a screen that can assess the tens or hundreds of thousands of possibilities required can be exceptionally challenging and expensive in both time and infrastructure, and you often have to make do with a proxy assay, because the test you’d really like to run is too time- and resource-hungry.  Each test in a screen has to be exceptionally cheap and small in scale: you can’t screen using scaled systems, yet it’s usually scale that makes a bioprocess actually matter to the world.  Running bioreactors is expensive; if you can run dozens economically then you’re doing well.
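To make the scale of the problem concrete, here is a back-of-the-envelope sketch in Python.  All the factor counts are invented for illustration (they are not taken from any real screen), but they show how quickly even a modest combinatorial design space outgrows what any screen can cover:

```python
# Hypothetical factor counts for a small pathway-optimisation problem.
# The numbers are invented for illustration only.
promoters = 5         # promoter variants available per gene
rbs_variants = 4      # ribosome-binding-site variants per gene
genes = 3             # promoter/RBS chosen independently for each of 3 genes
copy_numbers = 3      # plasmid copy-number options
induction_levels = 4  # inducer concentrations to test
media = 3             # media formulations to test

per_gene = promoters * rbs_variants                   # 20 options per gene
constructs = per_gene ** genes                        # 8,000 distinct constructs
conditions = copy_numbers * induction_levels * media  # 36 process conditions

total = constructs * conditions
print(f"{total:,} construct/condition combinations to screen exhaustively")
```

Even this toy example yields hundreds of thousands of combinations, which is exactly the regime where proxy assays and miniaturised screens become unavoidable.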

I think that the cry for ever-cheaper synthetic DNA is perhaps symptomatic of this brute-force ideology.  Instead of always trying to pour more troops over the top of the trenches, maybe we need a new strategy that makes the most of every data point.

Engineering biology: finding the few, vital data points in megadimensional space

The good news is that we don’t necessarily need huge experiments to address big challenges.  With the correct application of multidimensional mathematics, bioinformatics, machine learning and automation, it is possible to carry out highly complex but exceptionally lean experimentation that not only produces more effective biological systems, but also provides a quartz-clear window into the complexities underlying the results.  Without going into detail (a book would be more appropriate than a paragraph in a blog post), here are a couple of examples of the power of this kind of approach: one from us at Synthace, and one from DNA2.0 on a project they performed with Pfizer.  The methods apply equally whether it’s an enzyme’s primary sequence you’re optimising:

Figure 2: Multifactorial optimisation of the primary sequence of a biocatalyst (four iterations, reproduced with thanks to DNA2.0)

Or genetic and environmental factors that affect the yield of a bioprocess:

Figure 3: Optimisation of heterologous enzyme yield (five iterations of design cycle)

In both cases, the methods used not only gave dramatic improvement in ~300 runs or fewer, over four or five iterations of the design cycle; they did so by defining how the mutations interacted with each other (DNA2.0), or how genetic factors interacted with the environment and each other to give optimal yields (Synthace).  In this way, we gain insight into the beautiful intricacy of the biology we are working with, and with understanding we can go with it, use it and engineer it, rather than battle it and lose.
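To give a flavour of the multifactorial idea (this is not the actual Synthace or DNA2.0 methodology, just a minimal textbook sketch), here is a classic two-level fractional factorial: four factors covered in eight runs instead of sixteen, with main effects recovered from the responses.  The factors and the toy yield function are entirely invented:

```python
from itertools import product

# Full 2**3 design in factors A, B, C (-1 = low level, +1 = high level)...
base = list(product([-1, 1], repeat=3))
# ...with factor D aliased to the ABC interaction (generator D = ABC),
# giving a 2**(4-1) fractional factorial: 4 factors in 8 runs.
design = [(a, b, c, a * b * c) for a, b, c in base]

def toy_yield(a, b, c, d):
    # Invented response surface: A and C matter most, with an A:C interaction.
    return 10 + 3 * a + 0.5 * b + 2 * c + 1.5 * a * c + 0.1 * d

responses = [toy_yield(*run) for run in design]

# Main effect of each factor: mean response at high level minus at low level.
effects = {}
for i, name in enumerate("ABCD"):
    high = [y for run, y in zip(design, responses) if run[i] == 1]
    low = [y for run, y in zip(design, responses) if run[i] == -1]
    effects[name] = sum(high) / len(high) - sum(low) / len(low)
    print(f"estimated main effect of {name}: {effects[name]:+.2f}")
```

Eight runs identify A and C as the dominant factors, which is the essence of lean multifactorial experimentation: spend runs where the information is, not uniformly across the whole space.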

Moore’s law no more…

…or maybe it just never was.  Rob Carlson has written numerous enlightening posts on the nature of Moore’s law, how it should be interpreted, and how we should and shouldn’t draw parallels with advances in reading and writing DNA.  Rob has recently updated his extremely useful analysis of the price of DNA sequencing and synthesis.  I like the graphs, but I have to say that I’ve always found the words that accompany them considerably more important.  The key point (with apologies to Rob for brutal paraphrasing) is that Moore’s law wasn’t so much a law as a self-fulfilling prophecy, and the forces that created it don’t carry over into the biosciences.  So we have an interesting and highly enabling historical phenomenon to comment on, but no prediction of what the future might bring.  Costs (or prices) may keep going down, or they may not.

What I’m arguing is that if we need more DNA just to do more tests, then maybe we should be looking more at the wisdom of the tests we’re running than at the cost of this one element of our lab consumables budget.  DNA synthesis makes up just a fraction of the overall cost of doing synthetic biology, and it is a testament to the great price reductions that have already happened that this is the case.

This blog was first published on SynBioBeta.