LLMs are like your weird, over-confident intern | Simon Willison (Datasette)
"vibes-based evaluation", "conference-driven development", and being an independent developer.
Known for co-creating Django and Datasette, as well as his thoughtful writing on LLMs, Simon Willison joins the show to chat about blogging as an accountability mechanism, how to build intuition with LLMs, building a startup with his partner on their honeymoon, and more.
Ronak & Guang’s Picks
#1 Writing as an Accountability Mechanism
“My blog has been an accountability mechanism for me for a while now. Because I'm independent, I don't have an employer, so I started doing this thing I call week notes. Every two or three weeks, I post a blog entry saying, 'Here are the things I've worked on in the past couple of weeks.'
That means that when I'm thinking about what to do, occasionally I'll think, 'You know what, I haven't done anything I can write about yet.' So I should really invest in one of my open-source projects or do something so I've got something to show for it. And yeah, I love that. I think writing is thinking, and it's a great way of forcing you to structure your thinking. You know, the best way to learn something is to try and explain it to somebody else.
So if you've got a blog, even my shortest little link blog things - where it's just a link and two sentences of text - I always try and put something valuable in there. Partly it's to prove that I read the thing I'm linking to, but also it's to add something extra from my perspective on it.
It might just be saying, 'This is related to something else.' For example, when I wrote about Claude's prompt caching, I linked back to Google Gemini, which has a similar feature. I could compare how Google Gemini pricing works and how Claude pricing works. That's a little bit of extra perspective that you won't get from Anthropic. They're not going to write about Google Gemini in their announcement of a feature. So it's that kind of thing - forcing you to engage with the material just a tiny bit more thoughtfully, so that you can try and say something interesting about it as well as linking to it.”
#2 LLMs aren’t actually easy to use
“The problem with LLMs is that they're actually really difficult to use, which is very unintuitive. Everyone assumes they're easy because it's a chatbot - you type things to it, and it responds. But to use them effectively, you need to build a really deep model of what they can and can't do.
For example, I would never ask an LLM to count all the instances of something in a paragraph, because I know they can't count. Which is totally non-obvious, right? It's a super sophisticated computer system. How can it not count? Computers are great at counting; that's what they've been doing since we invented them.
If I have a question that a friend of mine could answer by reading a Wikipedia page, then I know the LLM will be able to answer it. But if it's the kind of thing that the Wikipedia page probably won't cover, it's less likely that the LLM will be able to answer it. The challenge is that you really do have to put in the time. A friend of mine says that 10 hours is the minimum you need to spend with a GPT-4 model before it really starts to click - before you understand what these things are and how to use them. And I think that's what it takes to develop the level of expertise where I can look at a prompt and, 90 percent of the time, correctly predict whether it will work or not.”
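A tiny sketch of the division of labour Simon is describing: counting is a job for ordinary code, and only the language-shaped part of a task should go to the model. It uses Simon's llm Python library (pip install llm); the model name, prompt, and example text are illustrative rather than from the episode, and it assumes an API key has already been configured (for example with 'llm keys set openai').

import re
import llm  # Simon's llm CLI / Python library

paragraph = (
    "The pelican watched the other pelicans. One pelican dived; "
    "two pelicans followed."
)

# Counting instances is exactly what LLMs are bad at, so do it
# deterministically in code instead of asking the model.
count = len(re.findall(r"\bpelicans?\b", paragraph, flags=re.IGNORECASE))
print(f"'pelican' appears {count} times")

# The fuzzy, Wikipedia-page-shaped part of the task is what the model is for.
# The model name here is illustrative; any model configured for llm works.
model = llm.get_model("gpt-4o-mini")
response = model.prompt(f"Rewrite this paragraph without mentioning birds: {paragraph}")
print(response.text())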
Segments:
(00:00:00) The weird intern
(00:01:50) The early days of LLMs
(00:04:59) Blogging as an accountability mechanism
(00:09:24) The low-pressure approach to blogging
(00:11:47) GitHub issues as a system of record
(00:16:15) Temporal documentation and design docs
(00:18:19) GitHub issues for team collaboration
(00:21:53) Copy-paste as an API
(00:26:54) Observable notebooks
(00:28:50) pip install llm
(00:32:26) The evolution of using LLMs daily
(00:34:47) Building intuition with LLMs
(00:43:24) Democratizing access to automation
(00:47:45) Alternative interfaces for language models
(00:53:39) Is prompt engineering really engineering?
(00:58:39) The frustrations of working with LLMs
(01:01:59) Structured data extraction with LLMs
(01:06:08) How Simon would go about building an LLM app
(01:09:49) LLMs making developers more ambitious
(01:13:32) Typical workflow with LLMs
(01:19:58) Vibes-based evaluation
(01:23:25) Staying up-to-date with LLMs
(01:27:49) The impact of LLMs on new programmers
(01:29:37) The rise of 'Goop' and the future of software development
(01:40:20) Being an independent developer
(01:42:26) Staying focused and accountable
(01:47:30) Building a startup with your partner on the honeymoon
(01:51:30) The responsibility of AI practitioners
(01:53:07) The hidden dangers of prompt injection
(01:53:44) “Artificial intelligence” is really “imitation intelligence”
Show Notes:
Simon’s blog: https://simonwillison.net/
Natalie’s post on them building a startup together: https://blog.natbat.net/post/61658401806/lanyrd-from-idea-to-exit
Simon’s talk from DjangoCon: https://www.youtube.com/watch?v=GLkRK2rJGB0
Simon on twitter: https://x.com/simonw
Datasette: https://github.com/simonw/datasette
Stay in touch:
👋 Make Ronak’s day by leaving us a review, and let us know who we should talk to next! hello@softwaremisadventures.com
Music: Vlad Gluschenko — Forest. License: Creative Commons Attribution 3.0 Unported: https://creativecommons.org/licenses/by/3.0/deed.en