Grokking Synthetic Biology | Dmitriy Ryaboy (Twitter, Ginkgo Bioworks)
Doing data engineering before it was cool, transitioning into biotech, and the "AlexNet" moment for biology
From building the data platform and Parquet at Twitter to using AI to make biology easier to engineer at Ginkgo Bioworks, Dmitriy joins the show to chat about the early days of big data, the conversation that made him jump into SynBio, LLMs for proteins and more.
Ronak & Guang’s Picks
#1 The value of a CS education in 2024
With recent articles proclaiming the end of junior devs (replaced by LLMs), Ronak and I keep hearing friends ask whether it's still worth studying computer science in undergrad.
Dmitriy makes the argument for why he still remembers CS 170, a class he took almost 24 years ago: "These days, I barely write any code, but having a background in computer science is hugely valuable for my job as a VP or manager, right? Because, like, I need to understand what's going on. I need to understand how systems work together. In fact, organizations or systems. And if you understand distributed systems and how they interact, right, like with a little bit of transformation, you can start thinking about the organizations, how they direct, and how to build that system. So I think systematic thinking is still critical, and you're not going to get that from, like, ChatGPT, right?
"Like, you can have ChatGPT do some routine work for you and incorporate that into a creative process. But the actual thinking about what needs to happen and why it needs to happen, right? You need to have your brain prepared to be able to do that. And to me, the CS curriculum is a really good one for that."
#2 AlphaFold2 was an “AlexNet” moment for Biology
"AlphaFold2 was a major moment for biology - it didn't solve protein folding, but it was a massive leap forward, like the AlexNet moment in the AI world, where the game changed, right? What was impossible before became fairly straightforward now, and you don't have to be an expert."
But as with AlexNet, showing that a solution is possible doesn't mean the problem is solved: "Immediately, it was like, well, yes, that problem is solved, but that's not actually the problem we care about. What we actually care about is designing a drug that's going to be expressed only in the liver and is going to pass all the phase three trials and whatnot, and not have any side effects or unpredictable side effects, et cetera. Can AlphaFold give me that? The answer is no, it can't. It tells you how the thing folds."
Segments:
(00:03:18) Data engineering roots
(00:05:40) Early influences at Lawrence Berkeley Lab
(00:09:46) Value of a "gentleman's education in computer science"
(00:14:34) The end of junior software engineers
(00:20:10) Deciding to go back to school
(00:21:36) Early experiments with distributed systems
(00:23:33) The early days of big data
(00:29:16) "The thing we used to call big data is now AI"
(00:31:02) The maturation of data engineering
(00:35:05) From consumer tech to biotech
(00:37:42) "The 21st century is the century of biology"
(00:40:54) The science of lab automation
(00:47:22) Software development in biotech vs. consumer tech
(00:50:34) SWEs make more $$ than scientists?
(00:54:27) LLMs for language is boring. LLMs for proteins? That's cool
(01:02:52) Protein engineering 101
(01:06:01) Model explainability in biology
Show Notes:
Tech and Bio slack community: https://www.bitsinbio.org/
The Death of the Junior Developer: https://sourcegraph.com/blog/the-death-of-the-junior-developer
Dmitriy on Twitter: https://x.com/squarecog?lang=en
Stay in touch:
👋 Make Ronak’s day by leaving us a review and let us know who we should talk to next! hello@softwaremisadventures.com
Music: Vlad Gluschenko — Forest License: Creative Commons Attribution 3.0 Unported: https://creativecommons.org/licenses/by/3.0/deed.en