The LLM Show
Nick Nisi (00:01.026)
Hello dysfunctional people. I'm your host today, Nick Nisi. Ahoy, ahoy. And I am joined by Amy Dutton. Amy, how's it going?
Amy (00:08.487)
Pretty good. Glad to be here.
Nick Nisi (00:11.054)
Glad you're here. K-ball, how are you doing?
KBall (00:14.369)
Hanging in there, a little more dysfunctional than usual on account of some dentistry on the left side of my mouth, but if y'all don't laugh, I won't laugh.
Nick Nisi (00:23.128)
We're laughing with you. Just remember that. And of course we're joined-
KBall (00:27.891)
I should say if y'all don't laugh, I won't cry, I guess, but no, we're good.
Nick Nisi (00:33.294)
And the silence here is, of course, Boneskull. Boneskull, how are you doing?
Boneskull (00:39.843)
I don't really have an excuse.
Nick Nisi (00:43.51)
An excuse for not being here. There we go.
Boneskull (00:48.081)
For being this dysfunctional.
Nick Nisi (00:52.088)
Same. And speaking of dysfunctional, we're talking about AI today. I don't think anybody's ever talked about AI on a podcast before, so this will be a first. I don't know. Amy, do you want to give us a definition of what we're talking about specifically?
Amy (01:12.103)
A definition. AI, Webster's dictionary defines... I'm joking. Oh man, artificial intelligence. You know what's funny? I think back to a few years ago, when I was getting invites to be part of webinars where people would talk about the future of chatbots, and I remember thinking, really? Really, chatbots? But I mean, here we are. And I think some of it was that, at the time, I just couldn't foresee what that would look like. I imagined the help pop-ups, the Intercom interfaces, things like that, and didn't really understand all that AI can do. So typically, when we talk about AI, that's the format we think of it in, the chat interface, but there are some really interesting use cases even outside of chat.
Nick Nisi (02:02.882)
Yeah, especially for programmers. And it is wild, to your point. I just remember being in college and hearing people talk about the Turing test, where they'd have you chatting with a real human and chatting with a bot: can you tell the difference? And if you can't, then we have achieved artificial intelligence, whatever that means. I think the goalposts have just moved on that now. But you can effectively talk to these things and not really, or go a long time at least, until you realize that all it's doing is agreeing with you, and then...
Amy (02:04.986)
Mm-hmm.
Nick Nisi (02:32.386)
Wait a minute.
KBall (02:34.889)
Yeah, it turns out the Turing test is the wrong test because it's really not very hard to fool humans. We really, really want there to be a mind behind whatever we're conversing with.
Amy (02:34.96)
There is a part of
Boneskull (02:48.614)
Does anybody have a conflict of interest here on this topic? Does anybody work for an AI company?
KBall (02:56.519)
I would not say that the company that I work for is an AI company, however, we are building an AI product.
Amy (03:06.118)
I was going to say, the AI response is usually agreeable.
Nick Nisi (03:06.796)
sus.
Boneskull (03:14.254)
Yeah, that's what bugs me. I use AI, I use Copilot, right, to help me do stupid stuff when I'm coding. And I half expect it to be like, no, I can't figure that out. And it doesn't figure it out, like, half the time, but it sure acts like it does. And I'm just surprised. Why isn't there a confidence score, like, this answer is 85% confident, or something like that? Is that something that can happen? I don't know, but that would be nice.
Nick Nisi (03:57.454)
Well, it is. The AI just thinks it's 100% confident that it's correct.
Amy (04:01.51)
all the time.
KBall (04:04.119)
So there are underlying metrics you can get if you dive a little deeper into some of the APIs, the lower-level interactions. It's not exactly confidence. What these things are doing is sampling over a probability distribution for each thing they suggest, and you can look at how concentrated that probability distribution was. Was there an 80% chance on the answer that you picked and a 20% chance of other things, or was it a random-ass thing it pulled out of the hat of a flat distribution? But that doesn't get exposed in end-user products.
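To make that concrete, here is a minimal sketch of the concentration idea, with made-up tokens and probabilities rather than anything pulled from a real model:

```typescript
// How "confident" was the model in the token it picked? One rough measure
// is the highest probability in the candidate distribution: near 1.0 is a
// peaked, confident distribution; near 1/N is a flat "out of a hat" one.
type TokenProb = { token: string; prob: number };

function topProbability(dist: TokenProb[]): number {
  return Math.max(...dist.map((t) => t.prob));
}

const peaked: TokenProb[] = [
  { token: "blue", prob: 0.8 },
  { token: "cloudy", prob: 0.15 },
  { token: "red", prob: 0.05 },
];

const flat: TokenProb[] = [
  { token: "blue", prob: 0.26 },
  { token: "cloudy", prob: 0.25 },
  { token: "red", prob: 0.25 },
  { token: "green", prob: 0.24 },
];

console.log(topProbability(peaked)); // 0.8  -- concentrated, "confident"
console.log(topProbability(flat));   // 0.26 -- barely better than chance
```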
Nick Nisi (04:47.31)
Is that the temperature setting?
KBall (04:49.623)
The temperature setting, that's in some ways the inverse. You set the temperature, which is basically how much you're willing to accept bumping around in randomness. If your temperature is set all the way to the deterministic end, it's always picking the most probable token, with no room for any sort of exception. But you could still be picking the most probable token out of a relatively flat distribution, which is what I would call low confidence. Temperature reduces the variability, because it says: don't do any random selection by probability, just always pick the most probable answer, period.
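As a rough sketch of what temperature does mechanically (this is the standard temperature-scaled softmax, not any particular vendor's implementation, and the scores are made up):

```typescript
// Temperature rescales raw model scores (logits) before they become
// probabilities. Low temperature sharpens the distribution toward the
// single most probable token; high temperature flattens it out.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract the max for numeric stability
  const exps = scaled.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Sample an index from a probability distribution.
function sample(probs: number[]): number {
  let r = Math.random();
  for (let i = 0; i < probs.length; i++) {
    r -= probs[i];
    if (r <= 0) return i;
  }
  return probs.length - 1;
}

const logits = [2.0, 1.0, 0.5]; // made-up scores for "blue", "cloudy", "red"
console.log(softmaxWithTemperature(logits, 0.1)); // ~[1, 0, 0]: effectively greedy
console.log(softmaxWithTemperature(logits, 2.0)); // much flatter: more randomness
console.log(sample(softmaxWithTemperature(logits, 1.0))); // pick a token index
```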
Boneskull (05:34.631)
So KBall, because of your confidence in speaking about this topic, I believe you know much more about it than I do. So you know what the temperature is, and you're explaining that this is all essentially statistics. I mean, is that what an LLM is? It's just a statistical model, and it's working with probabilities.
KBall (06:03.765)
That, I mean, at its core, yes.
Boneskull (06:04.198)
That's all it does.
Boneskull (06:07.8)
Is that unique to a transformer? What's a transformer? Anyway.
KBall (06:11.927)
What's a transformer? A transformer is a sort of architecture for a set of neural networks, more or less, and how they pass data to each other. And you stack up, like anything in computing, abstractions on top of abstractions on top of abstractions. A transformer consists of these smaller modules, which consist of certain numbers of neural networks and the ways they pass data to each other, and then you stack them together, get them passing data around, and create this larger model, which is a blob that is essentially a very large number of neural networks with connections between them and data flowing through.
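Very loosely, the stacking he describes looks like this. A real transformer block is attention plus a feed-forward network with learned weights, which this toy sketch waves away entirely:

```typescript
// A conceptual sketch only: blocks are functions that transform a vector
// and pass it along; the "model" is the stack of them composed together.
type Vector = number[];
type Block = (x: Vector) => Vector;

// Stand-in for a real block (attention + feed-forward with learned weights).
const makeBlock = (weight: number): Block => (x) => x.map((v) => v * weight + 1);

// Stack the modules: each block's output becomes the next block's input.
function runModel(blocks: Block[], input: Vector): Vector {
  return blocks.reduce((x, block) => block(x), input);
}

const model = [makeBlock(0.5), makeBlock(1.2), makeBlock(0.9)];
console.log(runModel(model, [1, 2, 3]));
```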
Boneskull (06:58.926)
And that is built by training? Or is that built by data scientists or whatever?
KBall (07:07.679)
Hmm. So I think there are a couple of different pieces to conceptually understand here. There's the structure of the model, which is essentially code. You are writing a thing that says there are this many neural networks that connect in this way, and all these different things. Now, when I say connect, and when I say neural network, those each have some amount of data in them. These are what are referred to as model weights. Weights are saying: how much is this connection prioritized? How does this transform it? Each one of these is like a floating-point transform with some sort of value associated with it. I should say, I mostly interact with these things at the application layer, so you're down in areas that I learned because I had to, not because I work with it every day, so something may not be exactly right. But at a high level: you have the structure of the model defined in code. You have the weights, which are data, and you come to those weights by doing training. Training essentially looks like this. You set a function for defining how correct something is. You give it an input set, you run it through the whole model with its weights, starting from some random set of weights. You say, how close was I? And then you run a function backwards to change all those weights based on how far away I was, in the direction that would have made this slightly more correct. It's doing lots and lots of, essentially, differential calculus, differentiating to find the vector for a weight that would move this in the direction that would be more correct, across the many-many-dimensional space that is all of these little data weights. And you run this over lots and lots of inputs and outputs, and that changes your model. So all of this about how expensive it is to train a model, it's mostly that. It's mostly: we run a piece of data through, we get an output, we say that output isn't quite right.
KBall (09:16.535)
And then we do what's called backpropagation, which is: go back through the weights at each level and say, okay, what is the direction to change these weights that would have moved it toward a more correct answer? Update them, and keep going.
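Shrunk down to a single weight so the loop is visible, the procedure he's describing looks roughly like this (real training does the same dance over billions of weights via backpropagation):

```typescript
// Toy training loop: forward pass, measure the error, nudge the weight in
// the direction that would have made the output slightly more correct.
const inputs = [1, 2, 3, 4];
const targets = [2, 4, 6, 8]; // "correct" outputs; the true rule is y = 2x

let weight = Math.random(); // start from a random weight
const learningRate = 0.01;

for (let step = 0; step < 1000; step++) {
  for (let i = 0; i < inputs.length; i++) {
    const prediction = weight * inputs[i];  // forward pass through the "model"
    const error = prediction - targets[i];  // how far off were we?
    const gradient = 2 * error * inputs[i]; // derivative of squared error w.r.t. the weight
    weight -= learningRate * gradient;      // update toward "more correct"
  }
}

console.log(weight); // converges to ~2.0
```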
Nick Nisi (09:34.582)
And there's reinforcement training as well, as part of this, right? Where real humans are providing some corrections.
KBall (09:43.679)
Yes. So fundamentally, and there's probably nuance that I'm missing here, but anytime you're updating your weights, you do it by having some sort of correctness function: input goes in, output comes out, and you say, how far off is this output? That correctness function might have multiple things that go into it. It might be some absolute form of correctness, it might be how random it was, or how expensive it was. There are a bunch of different ways you might do that. One way you could do it is to actually have it be a human: is this a good answer, or is this not a good answer?
Boneskull (10:22.982)
Is that also how they tell an AI, don't cuss, or don't talk about, you know, events in 1989 in China? Or is it like a, what...
KBall (10:35.243)
I think there's a lot that is done that way, yes; that is done using reinforcement learning. You'll also, at the application layer, often do that with prompting. So, moving up a few levels of abstraction: once you have this concept of a large language model, it works by taking a set of tokens and predicting what the next token is, and doing that over and over again. Now, that set of tokens is
Boneskull (10:40.088)
I see.
KBall (11:06.455)
generated by transforming a set of words. We translate, usually it's roughly one word to one symbol, which is a token. There are caveats, but a lot of the quote-unquote hacks you'll see, people saying, oh, it doesn't understand spelling or whatever, are because it's encoding at the word level, not actually at the letter level. But you'll encode it into a set of numbers, because all of these things are numbers underneath. You pass it in, you generate a token. And then they also have this auto-regressive nature, which means it generates a token, then puts that back into its input and generates the next token. Then it puts that back into its input and generates the next token, until it gets to a special token, a stop token, which says: I think I've generated enough.
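The auto-regressive loop itself is simple enough to sketch. Here `nextToken` is a stub standing in for a real model, which would sample from a probability distribution conditioned on the whole context:

```typescript
const STOP = "<stop>";
const prompt = ["The", "sky"];
const cannedReply = ["is", "blue", ".", STOP];

// Stub model: replays a fixed answer so the loop structure is visible.
function nextToken(context: string[]): string {
  const generated = context.length - prompt.length;
  return cannedReply[Math.min(generated, cannedReply.length - 1)];
}

const context = [...prompt];
while (true) {
  const token = nextToken(context);
  if (token === STOP) break; // the model says: I've generated enough
  context.push(token);       // feed the output back in as part of the input
}

console.log(context.join(" ")); // "The sky is blue ."
```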
Boneskull (11:56.037)
When we talk about context, that sounds like it's considering more tokens at once, essentially.
KBall (12:05.423)
Yes. So the context that you put in is the set of tokens that are being used to generate whatever the next one is. So when you talk about the size of a context window, that's how many tokens one of these models can look at at any one time. And when we talk about something like a system prompt or prompt tuning, those are things that we're putting in that token window before whatever the user input is. Most of the really advanced models now have been trained and reinforced to treat a system message differently from a user message, including all sorts of underlying rules, like you're talking about: don't curse, don't talk about this. They embed a lot of these things in the weights of the model, but then they'll also reinforce them as context that gets prepended before anything that comes from you.
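In the common chat-style APIs, that prepending looks something like this (message shapes vary a bit by provider; this mirrors the widely used role/content format):

```typescript
type Message = { role: "system" | "user" | "assistant"; content: string };

// Rules the application wants enforced, invisible to the end user.
const systemPrompt: Message = {
  role: "system",
  content: "You are a helpful assistant. Never curse. Decline off-limits topics.",
};

// The system message is prepended ahead of the conversation history and
// whatever the user just typed, all inside the same context window.
function buildContext(history: Message[], userInput: string): Message[] {
  return [systemPrompt, ...history, { role: "user", content: userInput }];
}

console.log(buildContext([], "Why is the sky blue?"));
// [ { role: "system", ... }, { role: "user", content: "Why is the sky blue?" } ]
```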
Boneskull (13:02.79)
All this seems terribly inefficient, right?
KBall (13:07.145)
It depends, for what? Because it's certainly inefficient with regard to computation, but so is JavaScript, or Ruby, or PHP, or any one of these other languages relative to writing assembler.
Boneskull (13:14.459)
Mm-hmm.
Boneskull (13:30.453)
Yeah
Amy (13:34.479)
I think what feels the most inefficient for me when I'm working with it is you can't guarantee the same response or the same answer every single time. And so it becomes very hard to reproduce that. And even computational things that programming is good at, assuming you're talking about numbers, one plus one is two. That's very obvious. But sometimes AI seems to have trouble doing math.
Nick Nisi (13:44.013)
Mm-hmm.
Amy (14:00.87)
because you'll say, oh, I need a thousand words, and it'll give you 2,000, or 500. And, like, it can't count? If computers can do one thing, they can count. So to me, that's where the most inefficiencies lie.
Nick Nisi (14:14.786)
Yeah, and I think that's also where the inefficiencies lie in the day-to-day usage of it, because that non-determinism going in is exhausting.
Boneskull (14:26.97)
So why is it non-deterministic? Is there a pseudo-random number generator somewhere?
Nick Nisi (14:35.288)
I think that's the weight piece of it. I could be wrong, but I thought it was like, if it's completing a sentence, "the" is the first token, and then "sky", "the sky", "the sky is", and then it could randomly say "the sky is red", and then something would have corrected that to say "blue", right? But the weight could adjust how creative, quote-unquote, it is.
Boneskull (14:56.698)
Is that right?
KBall (14:56.951)
So yeah, I mean, a fundamental thing, and this is not limited to LLMs: any sort of neural-network-driven machine learning is probabilistic, based on a distribution. What you see when you say something to an LLM is a word coming out. But you could peel back a couple of layers below that. If you peel back one layer, you have a token that corresponds to that word, which hasn't yet been translated into the English word it's responding with. If you peel back one more layer, what you actually have is a vector of tokens and probabilities. So, if we think about it as words, like "the sky is": if you started with "the sky is", it's going to generate a vector of tokens that will include the tokens for, probably, blue, red, cloudy, sunny, all these different things, with different probability numbers. And you normalize the probabilities so that the sum across the whole vector is one. Then you can use one of a number of different algorithms to pick which token you emit. If you want it to be purely deterministic, you set the temperature all the way down, and the algorithm says: pick the one with the highest probability. If you do that, every time you put that initial context in, you will get the same token out.
Boneskull (16:29.966)
It's just that that is not used in practice because it's actually undesirable.
KBall (16:34.889)
Exactly. It is used in some situations, but it is generally not used in practice because it is often undesirable. I will say, one of the really interesting things to me about this is, if we move away from the model layer and start talking about applications, what are the applications or ways of using this where that lack of determinism is a feature and not a bug? Because it is a core attribute of these models. It's a core attribute of any neural-network, probability-based model: you get a probability distribution based on the data sample that you trained on. That can be a bug in cases where you really want determinism. I've seen somebody actually build software where, if you think of MVC, like a Ruby on Rails application or something like that, they would have an LLM in the controller telling things what to do. That to me seems banana pants, because it's a really slow and unreliable way to do logic that is better expressed in code. However, for some of these generative applications, where you want to interpret a concept with your specific data and weave those things together, code generation is a reasonable example. It actually is shockingly powerful to have that amount of flexibility, because it doesn't just reproduce the code that you saw on Stack Overflow. It knows how to apply your function and your thing to that pattern and put it here. Or, I've been using it a lot today to take JavaScript code in this fugly third-party open-source application that we're using and say, okay, I need a reproduction of this in another language, because I need to be able to do it somewhere else. And it does a shockingly good job of just translating across languages.
Nick Nisi (18:49.198)
So that's the gist of how it works. Mostly over my head. I did read Stephen Wolfram's book on it, and it's still over my head. But as software developers, we're probably the most primed to use this day to day, and it sounds like we are. Starting with you, Boneskull: you mentioned at the beginning that you're using Copilot. Is that the extent of your AI usage today, day to day?
Boneskull (19:22.904)
Yeah, yeah. I'll reach out to ChatGPT once in a blue moon for something or other, but it's not in my workflow or what have you. I just use Copilot, because if you work in open source, GitHub lets you use it. So I use it in VS Code, and I try to have it generate tests. It's usually pretty good. I mean, the tests you get out won't necessarily pass, but it's more than enough to get started. Just, you know: give me unit tests for this file, written in TypeScript, for the AVA test runner, or something. And if it wasn't AVA, I can tell it to use describe and nest things, and tell it how I want the tests to look, and it'll poop out a skeleton at least, and save a fair amount of time. I feel it's good for automating tedious things, you know: reformat this comment because it's formatted weird, I want it formatted a specific way, or something like that. It's good at those things. I don't really trust it enough to say, write an algorithm that does this, that, and the other thing. I have done it where I have, say, two functions which are really similar, and I'll highlight them and be like, okay,
Boneskull (21:18.576)
generalize this, make a new function that accepts inputs, reduce the code duplication here. And yeah, it can do some of that stuff. But I do use it pretty much every day, one way or the other. Usually it's in the Copilot chat window in VS Code. And I find that it's most helpful when you give it these slash commands to tell it: consider the workspace, consider that we're working with tests here, and various other things, which seem to help quite a bit. Because if you just use the intentions or whatever, where you highlight a function and say modify with Copilot, that doesn't take everything else into account; it's good for one-off things where it doesn't matter whether you have that context or not. But yeah, I find it's pretty helpful to take the whole workspace into consideration. I've heard people have success with Cursor. I've never used it, but I'd be interested to learn how it compares.
KBall (22:58.559)
My only gripe with Cursor is it's VS Code, but I've been using it quite a bit. You know, I've spent the last 20 years using Vim in my terminal, and Cursor has been enough of an upgrade for me that I now do my software development, my writing code, in Cursor, and everything else is still in my terminal.
Nick Nisi (23:01.507)
Yep. Deal breaker for me.
Nick Nisi (23:25.144)
Okay, so we've got Copilot, Cursor. Amy, what are you doing?
Amy (23:29.882)
Cursor's my daily driver. I've also experimented a little bit with Windsurf.
Nick Nisi (23:36.472)
How do they compare, Cursor and Windsurf?
Amy (23:38.499)
Yeah. I think Windsurf has a beautiful interface for agent mode, where you just say... maybe we should explain what that is. Well, I'll do my best to explain it. Agent mode has a little bit more autonomy. So when I'm trying to edit something and I say, I need to do this, agent mode's like, okay, and it'll kind of take the wheel, start driving, and start modifying files. It has a little bit more agency in what it's doing. I also feel like it has a little bit wider context; it'll start reaching for files and I'm like, whoa, whoa, whoa, what are you doing there? But the chat interface feels more like I'm reviewing a PR, where it makes suggestions, or it'll say, hey, why don't you do this, and then I'm the one that actually copies and pastes or approves that modification to the code. So Windsurf does a good job of that agency piece. Although you do have agent mode in Cursor too; there's a little dropdown where you can select it, both in the chat view and the composer view. The main model I'm using is Claude 3.7, which was recently released. I've been a big fan of that.
Nick Nisi (24:51.726)
Mm-hmm.
Nick Nisi (24:55.522)
Yeah, fascinating. Okay, so Cursor, a little bit of Windsurf, Copilot, and then me, who is kind of everywhere, but not Cursor, for VS Code reasons. I just don't want to be in there, but I hear it's cool. There are plugins for Neovim that give you some of the functionality; one is called Avante, and you give it an Anthropic key, and then it will let you chat, and you can say @file or @workspace to give it context on things, and set up a rules file similar to Cursor rules, which I hear is very valuable and interesting. Mysterious and important, I guess, if you're a Severance fan. But I'm also less into having it do a ton of code generation for me. I like talking about code, and for that, I think reaching out to one of the models to talk to instead is a little more up my alley, at least to have higher-level architectural discussions, because that's what I want to have day to day, and then kind of start thinking about the code. Maybe I'll write more of that myself; I go back and forth. But for that I use Claude, ChatGPT, and Raycast with its AI features, where you can run Claude or ChatGPT or DeepSeek or whatever model you want, but then also Perplexity. And I also have GitHub Copilot. And I've also been playing with Claude Code, which is next level; maybe it's equal to Cursor, but it's fascinating. Chris, you were talking about letting it run tests for you, or write tests for you,
Amy (26:43.344)
Really?
Nick Nisi (26:51.822)
with GitHub Copilot. Claude Code only came out like a week ago, and I've been playing with it a little bit. It's an npm module that you install, and then you authenticate with Claude, or with Anthropic, and you have a key that has some funding on it. I put $25 on the key, and then you can just start asking it questions, and the first time, it will generate a CLAUDE.md file that describes things about your project for you. That's like: here's where the tests are located, here's where the important files are, here's, if you have multiple workspaces, like a monorepo, where these things are. And it'll give a brief description of those. And I was working on a project, a Node library, that's using very Node-specific things like process.env and crypto and HTTP, these Node-specific libraries. And I wanted it to work in Bun and Deno and Cloudflare and all of these other places. So I just asked Claude, hey, what do I have to do to get this to do that? And Claude looked at the files, asked for permission, and then said: here's crypto, here's HTTP, here are the process calls, you need to replace all of those. Would you like me to go ahead and do it? And I said yes, and it went and did it. And that cost roughly a dollar, which was kind of expensive. But then I went and ran the tests.
Nick Nisi (28:18.37)
I had that open in one tmux pane, and I went to another tmux pane and just ran the tests, and 300 tests were broken. So then I went back to Claude and I said, this looks correct, but the tests are now broken. And it said, how do you run the tests? Is it just npm test? And I said yes. And then it just started running the tests, and it was like, okay, I see what the problem is, let me fix that. It got into a loop and spent $4 as it went through and fixed the tests.
Boneskull (28:25.05)
Mmm.
Boneskull (28:44.198)
My next question for you, Nick, was going to be: for real, who pays for this stuff, right? Is it your company? My company pays for ChatGPT for the company, sure, and I get Copilot for free. But what about y'all?
Nick Nisi (28:53.292)
huh.
Nick Nisi (29:08.182)
Yeah, for me specifically, that was me just dorking around with it. So I put $25 on the key, and I also use that for Avante, and Avante used, I don't know, like 19 cents total over the course of months. So it was very, very cheap. Claude Code, I used five dollars in a single day, so it goes down pretty quick. But my company will give me a Cursor license if I want it. They'll recharge my Claude key; they've told me that. So they're all about experimenting and finding where AI fits. But I know that other companies won't be as permissive.
Amy (29:51.643)
My company's paying for my Cursor license as well. But one of the things that is interesting about Cursor is, if you don't want to pay for it, you just get added to a queue. You can still use it for free; it will just take a lot longer to get the response that you want. You might have to sit there, and it'll tell you you're 56th in the queue, and you just sit there and wait for it to process. So depending on how patient you are, you could go that route.
Nick Nisi (30:11.704)
Interesting.
Boneskull (30:18.136)
And have any of you tried to run a model locally for any practical purpose?
Nick Nisi (30:25.378)
Yes, actually, I've been experimenting with that. There's a very easy GUI tool called LM Studio. It just works on Macs and it's free. You download it, and it's got a search tool right in there where you can search for whatever models. So I downloaded Codestral, which is from Mistral AI but trained specifically for code, and I downloaded DeepSeek, and I downloaded one other one, I think Claude 3.5, because 3.7 wasn't out at the time. Then that just sits on my machine. I've got an M1 Mac Studio that has 128 gigs of RAM, so I just let it run as a server, and it presents all of those models as an OpenAI-compatible API server. So I can just point Avante at that, for example, and Avante would use it, and it becomes free because it's all just running locally. It is significantly slower. But it's free.
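Since the local server speaks the OpenAI-compatible format, pointing a client at it is a few lines. The port and model name below are assumptions (LM Studio defaults to port 1234, but check what your server actually reports):

```typescript
// Chat with a locally served model over the OpenAI-compatible endpoint.
const response = await fetch("http://localhost:1234/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "codestral", // whatever identifier your local server exposes
    messages: [
      { role: "user", content: "Write a TypeScript function that reverses a string." },
    ],
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);
```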
Boneskull (31:28.154)
You'll have to add a link or something to that somewhere.
Nick Nisi (31:32.002)
Yeah, for sure.
Amy (31:33.254)
I was going to say, that is a different mental shift, when you are executing prompts and you're like, there goes my money. You just watch it drain your account.
Nick Nisi (31:50.924)
Yeah, that is interesting. And I'm trying to figure out where these local models can fit in for me, because right now I'm experimenting a lot. I have ChatGPT Pro, I have Claude Pro, I have Perplexity Pro, and I have Raycast Pro. And I use them all. And when I think about which one I want to drop, I'm like, I don't know. And I don't want to keep doing that. But...
Amy (32:12.058)
Wow, it's $100, easy.
Nick Nisi (32:19.778)
I use them for different things, and I talk to them in different ways. Specifically, where I go when I want to have an architecture discussion, like, I want to make a change like this, let's talk about it: I will go to Claude. And Claude 3.7 Sonnet is amazing, it's the best to me. So I start my conversation there. But I'll be going along, and I start getting a little worried, because in purple at the bottom it'll say, this chat's running long, you should wrap it up, basically.
Amy (32:35.686)
Mm-hmm.
Amy (32:46.845)
yeah.
You know the hack for that, right? Once that message comes up, you tell it: hey, will you summarize everything, including the decisions and everything? And then you can add that as a prompt to your project, or download that MD file and stick it in the next chat.
Nick Nisi (32:49.578)
Eh... No.
Nick Nisi (33:06.546)
Oh, okay. Yeah, I have done that. I didn't know that. Yeah.
Amy (33:07.62)
Yeah. Yeah.
KBall (33:09.175)
Claude is also really good at generating prompts for itself. So you can say, turn this into a prompt, or something like that.
Nick Nisi (33:16.93)
Yeah, and 3.7 has the reasoning, it does the thinking now too, which is really cool. But I will have that conversation, and then... I don't know if you've used Raycast. Does anybody else use Raycast at all? It's an amazing Alfred- or Spotlight-like tool, but it has AI built in. So instead of hitting Command-Space, I hit Option-Space and a chat window pops up, and I use that for one-off things, because I'm very mindful of my context,
Amy (33:22.661)
Mm.
Amy (33:32.006)
Mm-hmm.
Nick Nisi (33:47.234)
like, I don't want to go off on a tangent with the main Claude chat that I'm having a conversation with. If I'm going through and we're talking about, say, the performance of proxies or something, then I'll ask Raycast, in a separate chat, more questions about that, so I'm not clouding its context. That's where I'm very mindful of it. And then,
Amy (33:53.476)
Mm.
Amy (34:10.374)
Mm.
Nick Nisi (34:14.464)
if I want to do deeper research on things, that's where I have Perplexity right now, just going through, and it'll give me a detailed report on whatever it's searching from the web. It seems to be better, but I don't know, it's all subjective.
Amy (34:25.702)
Have you, have you used Perplexity research? Sorry, I didn't mean to cut you off. Okay, because they just released that.
Nick Nisi (34:31.256)
Yeah, yeah.
Nick Nisi (34:35.106)
Yes, yeah, it's really cool. It gives you a ton of detailed information, it summarizes everything, and it gives you links to everything that it's looked at, so you can dig into it. I really like that, so I use it for that. And then, honestly, ChatGPT is the least used for me, but I still have it because my kids and I play with it, specifically the talking feature of it.
Amy (34:41.039)
Interesting.
Amy (34:58.214)
Mm.
Nick Nisi (35:02.264)
For a long time it was Santa Claus; they were talking to Santa Claus. Yeah.
Amy (35:05.542)
We use it to generate images. There is a custom GPT where you can upload your image and it'll Pixar-ify it and create an image in that style. My kids love that. Now I'm going to ask you: when you say you just want to talk code, are you rubber-ducking with AI, or are you really trying to distill information and ask it specific questions about the architecture?
Nick Nisi (35:17.431)
cool.
Nick Nisi (35:33.472)
It's often about the architecture, and then I might start into some high-level rubber ducking, or just plotting out what it would look like, because the main thing I'm focused on is good DX. And it needs, even Claude needs, a lot of help to not just be totally dumb about things. But it's pretty good. It is more of a rubber-ducking thing in that sense; I can just have back-and-forth conversations, and it's usually giving me
Amy (35:54.191)
Mm-hmm.
Nick Nisi (36:03.154)
some pretty decent advice, but you've got to watch it very closely. Because even today, I was doing something and trying to plot this out: right now, one of our libraries relies completely on configuration through environment variables, and I don't want that to be the case. I want you to be able to pass them in, or get them from somewhere else, right? Like a Cloudflare Worker, for example.
Amy (36:06.468)
Mm-hmm. Mm-hmm.
Nick Nisi (36:29.462)
So you need to be able to configure that a little better, but I wanted to have good DX on it, and it just was like, yeah, you can do this and you can do this. It was specifically trying to get me to use Proxy, and I was like, finally, a use for Proxy. But then I had a side conversation with Raycast about the performance implications of something like that. And then it just started going off: you can do this, and it showed all of this async code. And that would add a ton of complexity that it just glossed over, to make it so that you could run this thing async. And then you have to, and this is the worst part of having the conversation with them, you just point out the obvious. You're like, wait, how is it going to run that async code? That's not going to work. And it's like, "you caught me", or "you're absolutely right". And it's way too aggressively nice to me about it.
Amy (37:24.368)
The worst, though, is when it's like, you're right, and then it tries to give you the same response. I had that happen because I was trying to set up the new Tailwind v4 on a project, and its model is not updated; it's always v3. I've even passed it the v4 documentation, but it was like, you're right, we don't have to use the tailwind.config.js. And then it's like: so you should add a tailwind.config.js.
Nick Nisi (37:29.474)
Yes.
Amy (37:53.479)
It's like, no. Yeah, but you're talking about all these different systems that you're reaching for. For me, I keep turning ChatGPT on and off; I'll subscribe to it for a month and then turn it off. But I have so much context now within Claude Projects that I don't think I will ever be able to turn it off. So, they got me. But I have found that creating those project documents, where you create context documents to
Nick Nisi (38:15.998)
yeah.
Amy (38:23.526)
help it, makes all the difference in the world. The big mental unlock for me was when I was like, I don't even know what context to give it for that document. And then I was like, I know, I'll have it write it for me. So then it just asks me questions and generates its own context that it needs. And it really does level up the answers that it gives you.
Nick Nisi (38:44.93)
Mm-hmm, for sure.
KBall (38:45.783)
There's a version of that for the coding type of stuff that Nick was talking about, right? You're having your architecture discussion, you get to something you like, and then you say, hey, write me a spec for this. Then you drop that in your docs folder, RFCs folder, or whatever, in Cursor, and now you reference that when you ask it to generate code or something like that. Or, I guess you're not using Cursor, you're using something else. But I have a conversation with it where I'm prodding it back and forth and getting to a place where I like it,
Amy (38:57.274)
Mm-hmm.
Amy (39:03.194)
Yes. Yes.
KBall (39:14.955)
followed by, generate a doc for this, followed by, you edit it. Anything it generates might have a little LLM slop; review the doc, cut out anything that's not referenced. But now that is killer context for actually getting it to build an implementation. And if you go through that, by the time you get to implementation, a lot of times it can one-shot it. Or you can be like: here, write my spec. Okay, based on the spec, write me a set of unit tests. Okay, based on the spec and the unit tests, write me the code.
Amy (39:32.976)
Mm-hmm.
Nick Nisi (39:42.04)
Yeah, in my experience, and I haven't used Cursor or anything, but in my experience talking to Claude, there's a really cool tool called gitingest, where you take whatever GitHub URL and replace the "hub" with "ingest", and it will give you a text digest of the entire repo. Then you can just paste that into whatever LLM and start asking questions. So it's pretty great, right? Because you can tell it: here's the entire library, and here's even the README and everything about how to use it,
Amy (39:59.963)
Mmm.
Nick Nisi (40:11.722)
and it will still make up API calls or functions that don't exist. And I'll be like, that doesn't exist. And it'll be like, sorry, I'll try and stick to things that exist from now on. Great, thanks.
Amy (40:25.574)
Well, on that note: have you written documentation for AI?
Nick Nisi (40:33.128)
Oh, yes. Thank you, I wanted to transition into this. No, I haven't specifically, but I am very curious what y'all think of that, because that's the next thing, right? How do we quickly get things out there? I was just recently talking to Tanner Linsley about TanStack Start, and my question for him was: this is one of the first frameworks that came out in the post-AI revolution, but AIs don't really know much about it, so where does that leave you, right? How do you get yourself out there if people no longer go to Google for things? And that's a fascinating question.
Amy (41:04.24)
Mm-hmm.
Amy (41:12.954)
Yes, I've been... yeah. So I've been working on this problem that we're talking about. With Redwood, we're rewriting a lot of stuff. Same thing. If we want to aim towards juniors, which, I mean, that's kind of our target, eh, maybe, but keeping AI in mind: AI is not going to recommend Redwood, because it doesn't know about it. And it definitely doesn't know about the new features that we're creating. So I've thought a lot about this, and I even asked AI: if I'm going to write documentation for you, what do you want to know? Kind of like you were talking about, having it write that context. The big thing it said was that it wants structured content, usually in a markdown file. It also does well with JSDoc, so if you want to provide that kind of information, that works. It also wants examples, and potentially questions, like an FAQ: what people will ask, and what the correct answers to those questions would be. So I don't have a result yet, but that is at least my plan for building that out.
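As a sketch of what that structured, example-heavy documentation might look like, here's a hypothetical function documented with JSDoc (the function and its names are invented for illustration, not from Redwood):

```typescript
/**
 * Builds a concrete URL path from a route pattern.
 *
 * @param pattern - A route pattern such as "/posts/{id}".
 * @param params - Values to interpolate into the pattern.
 * @returns The concrete path, e.g. "/posts/42".
 *
 * @example
 * routePath("/posts/{id}", { id: 42 }); // => "/posts/42"
 */
export function routePath(
  pattern: string,
  params: Record<string, string | number>
): string {
  return pattern.replace(/\{(\w+)\}/g, (_match: string, key: string) =>
    String(params[key] ?? "")
  );
}
```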
Boneskull (42:19.046)
Sorry, you're talking about SEO for AI.
Amy (42:25.38)
Mm-hmm. Basically.
KBall (42:27.103)
Yes and no, right? SEO is about what gets located. What I think she's talking about is more like prompt tuning for AI: making it so that this can be useful context, so that an LLM will use it well to generate code.
Amy (42:43.93)
Yeah, that's probably a better answer. The other thing is, with Cursor, you can point at docs. You can say @docs, and there is a set of documentation that's already loaded into Cursor's directory or file system or whatever. And if it doesn't exist, you can also pass it a URL, and it will scrape that particular website for its documentation. So, for example, I mentioned Tailwind v4: if you type in tailwind, it will automatically register v3. And so I said, no, I want v4. Here's the URL.
Boneskull (43:15.77)
But you said something about how people are going to know about Redwood. Is that not...
Boneskull (43:26.182)
I don't know, I don't understand the topic.
Nick Nisi (43:31.438)
If I'm no longer going to Google or Stack Overflow or anywhere, and you come to me, say I'm a contractor or something, and you're like, I want this thing built in Redwood, or in TanStack Start, or something that didn't exist before the LLM's training cutoff: you're at a disadvantage using that new tool, or Tailwind v4, for example. You're at a disadvantage using that new tool, because the tools that you have now come to rely on don't know enough about it.
Boneskull (44:01.882)
I see, I see. So not so much, tell me what framework to use for this, or what tool will do X and Y. It's: okay, I want to use this tool, and the LLM needs to be able to understand how to use that tool. That's interesting. I'd be happy to learn more about how to do that, how to write docs for that.
Amy (44:30.81)
But I think it is kind of what you were saying, because I said yes, SEO for AI; it's like a yes-and, or a no-and. Because if you are having these rubber-duck conversations with ChatGPT about architectural decisions and it doesn't know about Redwood, it's not like Cursor, where I can feed it the docs. And for somebody that knows absolutely nothing, who doesn't even know that's an option, we're kind of SOL.
KBall (44:56.695)
So this gets to a bigger-picture thing about using LLMs that I think is really important, which is: treating them as a source of truth is a road to heartbreak. They are incredible at manipulating language, at manipulating code, at even doing some amount of pseudo-reasoning around things, but they are a terrible source of truth. And so the
Nick Nisi (45:11.95)
100%.
KBall (45:24.779)
by far the most interesting and successful applications out there right now are pulling in other sources of truth and then using the LLM as a way to reason about them, do things with them, maybe figure out what else it should be pulling in. If you look at Perplexity, right, they're doing a search engine using the web as the source of truth, combined with an LLM. Look at Cursor: it's using your code base, plus maybe docs you've put in, plus these other things as the source of truth, and the LLM is the manipulator or reasoning engine. I mean, it does okay as a source of truth for things that are extremely overrepresented on the web. An LLM out of the box will do a pretty good job of writing vanilla React code. It'll probably look like React code from a year or two ago, but there's so much documentation on the web, which is the bulk of their training data, that it actually does a pretty good job of representing that in the core training set. But I still wouldn't trust it, because it'll get outdated. It's not a good source of truth. The way I think about it is, all of these weights are sort of creating this model of the world, and that accidentally crystallizes some knowledge into it, but it's accidental. The thing that's being modeled is the structural nature of it, not the facts. So the things that LLMs make mistakes on are things that are factual, right? If you ask it to do something with numbers: to an LLM, one number looks a lot like another number, because it fills the same language role. It doesn't matter. Now, two plus two equals four is very overrepresented; it'll probably get that right, that's a high probability. But you ask it about any two random numbers, and it's just going to plug in a number and a number. It's not doing math behind the scenes. So yeah, having some other way to think about how you're sourcing truth
Nick Nisi (47:14.84)
Right.
KBall (47:21.121)
for your project, for whatever you're doing with an LLM, and then using the LLM to manipulate it is, in my experience at least, far more likely to succeed and far less likely to send you down a trail of heartbreak.
Nick Nisi (47:37.1)
Yeah, but people are changing, even myself. I don't like Googling for things anymore. A perfect example is a TypeScript error, right? You get this big, long, verbose thing, and I would have to copy that out, and then pull out the pieces that are contextually specific to me, like the path to the file and the line numbers and all of that, because I'm not going to Google something that finds that. I'm having to adjust to match what Google can find, right? Whereas today, I can just take the error, with my file path and line numbers and everything, and say, why am I getting this? And it can usually do a pretty good job of telling me why. It might not be perfect, but it is better, and I can ask follow-up clarifying questions to continue that. Whereas Google is one-and-done, and if you didn't get it right, you're going back and refining, putting quotes around things. So it's changing the language of how we find things, and you become lazy at the old way. So I think making it easier for these things, especially as you're trying to promote or adopt new technologies, you've got to go where the people are, and that's people striving to be lazy with LLMs, right?
Boneskull (49:04.224)
I would trust it more if I could show it an error I'm getting in TypeScript, and it would be able to say: this is a known bug in TypeScript, it appeared in version point-blah-blah-blah, here's the GitHub issue.
Boneskull (49:27.11)
I cannot fathom that any LLM would be able to do that.
KBall (49:34.901)
Have you tried Perplexity? Because what you've described combines a few different things: it combines looking at that trace, looking across the web for where that trace has shown up, and doing some reasoning about what's a relevant resource. That is the problem Perplexity is trying to solve.
Boneskull (49:57.519)
It would also have to understand your code base, and what version of TypeScript is installed, and all sorts of things. Whatever it is, it would have to cover a lot of ground.
Nick Nisi (50:09.078)
Yes, and that's where designing for that can come into play. One example that I've been seeing a lot is llms.txt, some kind of textual, markdown-style digest that an LLM can consume to get more context about things. And then there's a whole protocol that Anthropic is working on called MCP, the Model Context Protocol, that standardizes how apps can provide context to LLMs, to provide more up-to-date and relevant information continuously.
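For reference, llms.txt is just a markdown file served from a site's root; a minimal sketch for a hypothetical library (made-up name and URLs) might look like:

```markdown
# ExampleLib

> ExampleLib is a TypeScript routing library. This file is a digest of the
> docs for LLMs, per the llms.txt proposal.

## Docs

- [Quick start](https://example.com/docs/quickstart.md): install and define a first route
- [API reference](https://example.com/docs/api.md): every public function, with examples

## Examples

- [Todo app](https://example.com/examples/todo.md): end-to-end usage
```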
Nick Nisi (50:47.084)
Anybody looked into either of those?
KBall (50:54.145)
So the Model Context Protocol, which I think is what you're talking about there with Anthropic, is very interesting. Particularly, you can use it in Claude Desktop. You set up an MCP server that can talk to some of your things, and it is wild what you can do with it very quickly. It's, I think, not the productionized,
Nick Nisi (51:02.104)
Mm-hmm.
KBall (51:23.199)
or the thing you're going to want to put into production, necessarily. But for hacking around on your own, for a developer, if you like chatting about stuff, you can very quickly write yourself some code that you expose via an MCP server: search my code base, search my Obsidian index, search my this, search my that, as a tool that now Claude has access to and can use to load up your context. And it feels like magic when you do it. I have one of my team members who is way far down the rabbit hole on that, and he shows me stuff that's just absolutely bonkers. It does require, I think, you to be a little technical to get it set up right now. One of the things this ties into is LLMs and tool calling, and thinking about what makes for good and useful tools for LLMs. One of the things that I've seen, at least, is that they do a much better job if the tools feel like English and are very special-purpose. So a generic API, handleEvent, that's a terrible function for an LLM. If you're writing code with an LLM, it's a terrible function, and if you're trying to get it to call it as a tool, it's terrible, because it will make up all different types of events. If you have handleKeyboardEvent, handleTyping, something like that, make it very explicit, and it will do a much more reliable job of either generating code related to it or generating the right tool call at the right time. I think the joke my team member made is: I don't want a "schedule calendar event" tool. I want a "schedule meeting with Kevin because he F'd this up" tool, right? That's what he wants.
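To make the tool-naming point concrete, here's a sketch of two tool definitions in the generic name/description/JSON-schema shape most tool-calling APIs use (names and fields invented for illustration):

```typescript
// Too generic: the model will invent event types and payloads to fit it.
const badTool = {
  name: "handle_event",
  description: "Handles an event.",
  parameters: {
    type: "object",
    properties: { event: { type: "object" } },
  },
};

// Special-purpose and English-like: easy for the model to pick at the
// right moment and to fill in correctly.
const goodTool = {
  name: "schedule_meeting_with_kevin",
  description:
    "Schedule a 30-minute meeting with Kevin to discuss something he broke.",
  parameters: {
    type: "object",
    properties: {
      reason: { type: "string", description: "What Kevin messed up" },
      startTime: { type: "string", description: "ISO 8601 start time" },
    },
    required: ["reason"],
  },
};
```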
Nick Nisi (53:16.782)
Raycast can do that.
KBall (53:19.905)
There you go.
Boneskull (53:21.67)
A question: aren't these documents that provide this context mostly information that you have summarized from the project, that already exists in some other form?
Nick Nisi (53:40.642)
They can be. Another fascinating place to look is Cursor rules files. People share those; there's an "awesome cursor rules" collection. And they're tailored to the flaws, or the downsides, that Cursor would have. So if you were going to say, I need a React component that is managing state for a counter or for a to-do list or something, your Cursor rules file might say, based on people's experience with it generating crappy code: don't write code using Redux, nobody uses that anymore, use this instead. Things like that. They can encode biases that are like: I know you're going to think to do it this way based on how you were trained; I don't want that, I want it like this instead. So it's adding more specific context to point out where the LLMs are wrong, context that wouldn't necessarily make sense as general documentation that a human would want to consume.
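A rules file is just plain-text instructions the editor prepends for you; a made-up example in the spirit Nick describes:

```text
# Example Cursor rules (illustrative, not from a real project)

- Use React function components with hooks; never suggest class components.
- Do not reach for Redux; use useReducer or our existing store.
- All new files are TypeScript (.tsx) with strict mode enabled.
- Tests use Vitest and are colocated as <name>.test.tsx.
```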
Boneskull (54:50.926)
I mean, it seems like the end game would just be for the LLM to be able to figure all this stuff out itself, without... you know? But maybe, again, that's all the compute needed to do that.
KBall (55:11.477)
I think a thing to bear in mind with these is that the more guidance you give them, the better the outputs you get. I think a lot of us had negative first reactions to using LLMs, and a lot of people listening to this may still have negative emotions or reactions around it. I don't know if they've made it this far; they probably tuned out very quickly. But if you have, and you're listening to this, and you still have a negative reaction:
Nick Nisi (55:35.33)
Yeah
KBall (55:39.531)
If you put in very little direction, you get middle-of-the-road junk out. And when it comes to code, that's usually code that doesn't do what you want or doesn't work very well. When you give very high levels of "this is what I want and why", essentially you become the source of truth, and then you let the LLM do the manipulation. At least in my experience, your quality of output just skyrockets. I find that if I'm building something in an architecture that we have well documented, for example, and I know what I want to build, I can tell the LLM: go and read this doc about how this piece and this piece work, combine it with this doc, generate this tool that connects to this tool, go. And it will get me 90% of the way there, and the things that it screws up I can fix in five minutes, where writing all of that code would have taken me an hour. It's just a shocking speed-up. If I don't understand the system I'm working in, and I say, I want something that does X and Y, it'll create something based on some weird idea out there of what X and Y are that doesn't connect to my system at all and uses some library I don't use, and all these other things, and it'll be garbage, and it'll slow me down more than it speeds me up. So you can't turn your brain off. You are still telling the machine what you want it to do. You just can operate at a much higher level of abstraction than you can when you're writing lines of code.
Nick Nisi (57:10.092)
Yeah, I think that's an important piece. In everything that I do, and right now I'm paying like $80 for it, I am going to drop that, by the way, I'm just experimenting, I have to take everything it says with a huge grain of salt. It's very useful for speeding me up by not letting me stare at a blank editor, and I can have a conversation that's very specific to exactly what I want to talk about, and they're always listening.
Nick Nisi (57:39.47)
Which is great. But yeah, it lies constantly and it's wrong constantly. So that's where it really feels like it's more useful for people who are on the senior side of development, because you can discern: well, is that right? I'm not just going to take what it says at face value. I can quickly see, no, that's wrong, and ask a clarifying question to get it on track. But, as KBall said, you have to put those guardrails in, and sometimes it's that continued trial and error of the discussion.
Nick Nisi (58:25.228)
Amy, do you want to ask your question first?
Amy (58:27.428)
Yeah. So I'm just curious what you think the future of AI looks like. There have been a few companies, like Arc, the Browser Company, I'm still bitter about Arc, but they said: we're going to go all in on 2.0, and AI is the future, we're going to try and do this, so you need us. You have Apple, which has kind of approached it from a hardware perspective of,
Nick Nisi (58:41.261)
Yes.
Amy (58:56.036)
you're already on our device, so we already know everything. And then you mentioned Raycast earlier, which is also something, I haven't watched the demo for it, but I know they just announced Raycast AI, and you can go to an app and say, do this, and it will control it and do it. Which is crazy, that now it's not just limited to these chatbots that we've been talking about, but all of a sudden it can actually do tasks on your machine, and do things across applications.
Nick Nisi (59:24.226)
Yeah, that is where Raycast extensions just came out, and they came with a couple of pre-built ones already that are third-party, like Google Calendar, for example. I can just say, schedule a calendar invite with KBall for 30 minutes, and find a time between, you know, whatever. I think it can... no, maybe not, because it doesn't know his calendar. But it knows mine.
Amy (59:46.276)
Find a time.
Amy (59:51.174)
It'd be amazing.
Nick Nisi (59:53.326)
I can say, find a time on my calendar that would work for both of these, and I can say, name it one-on-one, whatever. And the next time, I think it remembers; so the next time I say, schedule a meeting with KBall, it'll default to what I had set before for the name of the meeting and things like that. But it never does anything on its own; it'll always prompt you with a confirm message first, so you can see exactly what it's going to do. But it's automatically integrated with GitHub now too, so I can say, find that issue where they're talking about this, and close it, or write a comment, or take me to that page, and it will just go find it and bring me to it. It's pretty cool.
Boneskull (01:00:34.502)
Amy, that sounds really useful for automated testing and automation of legacy systems. I can't remember. There's an industry term for it. But yeah, that's kind of an area I worked on with Appium. But yeah, that's cool. I should check that out.
Nick Nisi (01:00:50.232)
pain.
Boneskull (01:01:03.353)
Raycast?
Amy (01:01:05.03)
It's replaced, I mean, AI aside, it's replaced a bunch of other tiny apps that I had kind of stitched together, and it has it all built in, mostly for free, which is crazy. And you can tack on... snippets are huge. And you can even use AI with your snippets, right? Where you have certain prompts and it'll plug stuff in. But if you are just
Nick Nisi (01:01:15.938)
Yep. Snippets.
Amy (01:01:30.969)
experimenting with AI and trying to figure out what model to use, granted, it is limited in context, but it's probably the cheapest way to do it, because you can tack on AI for $10. I think it's $10, or maybe you have to have the Pro, it might be 20. Either way, you get all the models, is what I'm trying to say. You don't necessarily have to pay for Claude, ChatGPT, and DeepSeek separately. And Llama, yes. It'll just add everything. It's all right there. It's all included.
Nick Nisi (01:02:00.046)
And with Anthropic, I noticed when Claude 3.7 came out that they had a waitlist you can join for whatever their browser is going to be called. I forgot the name, but a Claude browser sounds very interesting. I don't know exactly, but I'm just picturing me being able to say, you know, the family wants pizza tonight, get us a pizza, and it will just do it.
Amy (01:02:13.999)
Interesting.
Nick Nisi (01:02:29.762)
But maybe not. I don't know.
Amy (01:02:32.911)
crazy.
Nick Nisi (01:02:36.91)
So I'm curious, kind of building on that, about the future: where do you see this going? I guess to ask bluntly, do you feel secure in your current role, knowing what's coming with all of this?
Amy (01:02:54.928)
Some days I do, some days I don't. Some days, when it hallucinates, or, I mean, even when I'm doing graphic design stuff and it generates this crazy image that's not real, I do feel like I have job security. But I'm also not under any illusions; I know that it's just going to get better, so I don't know how long that space lasts. Sarah Drasner said something that I've really clung to,
where she was talking about how a lot of times when new technology comes out, it doesn't necessarily eliminate all jobs; it just changes the nature of the job. And her point was, I love what I do, I love coding, I love the craft of coding, and I'm just not sure that the new job we get assigned is going to be a job that I like as much. And I resonate with that so much, because...
as you guys were talking, a lot of it is more of that prompt engineering. It's thinking more like a product owner, or a project manager creating a PRD to tell the AI what to do. It's thinking about it at a higher level. It's editing code, not necessarily writing code. And so it's getting comfortable with that new job.
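For what Amy describes, a PRD-style prompt can be quite small. Here is a hedged sketch; the feature and its requirements are invented purely for illustration:

```
PRD: Saved searches (hypothetical example)
- Problem: users re-enter the same filters every session.
- Scope: persist up to 10 named searches per user; no sharing in v1.
- Acceptance: saving, renaming, and deleting a search each round-trip the API.
- Out of scope: team-shared searches, analytics.
Implement only what is in scope; ask before adding anything else.
```

The point is the shift in role: the human spends effort deciding scope and acceptance criteria, and the model does the mechanical translation into code.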
Nick Nisi (01:04:11.79)
You took exactly what I was going to say, honestly. I don't think it'll replace my job. I just don't know that I will enjoy the job later when it becomes less about programming and more about, as I've been putting it, context shepherding.
Amy (01:04:21.53)
Mm-hmm.
Amy (01:04:28.326)
Mm.
Boneskull (01:04:30.36)
I don't feel like it's going to replace high-level engineers, at least. I'm probably going to retire in 20 years, and I don't think we're going to see that before then.
So I guess I'm not really worried about it. And you know, my plan when I retire is to become an indie game dev, and I don't see AI doing that either. God, what horrible slop it would make. So no, I'm not worried about it.
Nick Nisi (01:05:10.146)
KBall, how about you?
KBall (01:05:12.353)
So I'm in maybe a slightly different position than you all, which is that I run, I manage, a team. And one of the things that has been actually very interesting is that having Cursor doing the dev work lets me do a lot more coding than I could do before, because to
write code by hand, you have to remember a lot of very detailed context. And if you're a manager, you're context switching a lot, so it's actually very hard to load all of that up in your brain. When I'm using the LLM to write code, I can operate at a higher level of abstraction and not have to load up nearly as much of the detailed context. It's in the docs, and I don't need to grok all those details from the docs. I just need to know, okay,
the way that we approach this type of thing is in this doc; let me tell Cursor, when I'm manipulating this thing, go and look at this doc. And so I find that for me as a manager, it actually makes it much, much easier to stay technical in a lot of ways and to keep contributing on the technical side. I can have architecture conversations, I can prototype an architectural piece very quickly, and I can even, you know, run down the implementation of it as needed. So in some ways that makes me like my job more, because I actually get to write more code. Though I do like the management side of my job as well, mostly.
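To make the doc-pointer idea concrete, here is a minimal sketch of what such a rule could look like in a plain-text Cursor rules file. The directory and doc names are invented for the example, and newer Cursor versions organize rules under .cursor/rules instead of a single file:

```
# .cursorrules (hypothetical example; plain English that the agent reads)
When editing anything under src/payments/, first read
docs/payments-architecture.md and follow the approach it describes.

# Point the agent at decisions that are already documented.
Retry and idempotency behavior are specified in docs/idempotency.md;
do not invent alternatives.
```

The rules do nothing clever themselves; they just route the model to the team's written-down decisions so the human doesn't have to hold that context in their head.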
In terms of feeling threatened, I think there is an interesting transformation going on, which is that for a long time we've been in this world where many, many companies have sort of settled on this two-pizza team: okay, you have five or six engineers, and you have a product manager, and you have a designer, and you have a manager.
And maybe you have an SRE or a subject matter expert, an SME or something like that, associated with it, right? And you have this team of eight or nine people, and that's what it is. It's sort of the core productive unit.
KBall (01:07:19.927)
Cursor, Windsurf, Claude Code, Copilot, all these different things, when used well, dramatically increase the productivity of the coding parts of the job, such that a much higher percentage of the work done to get any particular thing out is the decision making that a designer might be doing, or that a PM might be doing. If your designer is just, you know,
making screenshots without making decisions, or producing assets without making decisions, it's a different world; AI can do some of that. But there's a lot of decision making happening. So I think that core team of a PM, a designer, and five or six engineers may not actually look like that anymore. It may look like a designer and a PM, or maybe a designer who also covers some PM responsibilities, plus two engineers, and get the same amount of productivity or more.
At least, that's what I'm seeing on the team that I'm running: I have two engineers, me, and a designer, and it's more productive than my team of a year ago that had two or three times as many engineers. So that doesn't mean there are no engineers, but it does mean that for any particular amount of product development or code, you have fewer engineers involved. And so there is a question to me of,
are there enough unmet software needs that this still results in full demand for all the software engineers in the world, or not? Right now I'm actually leaning toward the side of: software is dreams made reality, and we can make a lot more dreams reality now because we're more productive. So it's going to balance out, but we're in this adjustment period, and I do think any particular company may need fewer engineers. And that means that during the transition,
Like there's a heck of a lot of engineers out there who are having trouble finding work.
Amy (01:09:17.208)
The other thing to consider, though, when you're talking about the balance of work, is that there are also AI tools for designers and AI tools for product managers. So some of it is also just the speed at which we're all able to work together, and maybe the speed at which AI enables each of those roles.
KBall (01:09:36.823)
Yes, and I think what AI cannot do is make decisions for you. And it cannot reason or understand user needs or things like that. And so my impression is that the percentage of work that it automates away for a PM or a designer is lower than the percentage of work that it automates away for a software engineer. Because those roles tend to have a higher percentage of decision making involved. Now, to be fair, as we've highlighted, this is a killer tool for
Amy (01:09:42.352)
Mm-hmm.
Amy (01:09:55.599)
Interesting.
KBall (01:10:06.391)
very senior engineers, because a lot of our work as very senior engineers is actually that decision-making, and this automates away a lot of the mess around it. It lets me, as a manager, director, VP, whatever, still do the implementation even as I'm doing the decision-making; it lets you do a lot of that work. So it is a big boon for engineers who are doing a lot of decision-making. And that is, to your point about the job changing,
if you don't like reasoning about the business, the product, and the users, that's going to be a much bigger part of the job when the writing-code part shrinks.
Nick Nisi (01:10:45.398)
Yeah, and I'm fine with that. I was thinking more about it in, I guess, the way that we interact with it today. I don't necessarily like talking to Claude all day. It just gets boring and, you know, that's what I was talking about specifically. But I do think that you're right, that it will help us to speed through to the results that we're looking for. And I think that that's going to be...
valuable, and probably a new way to scratch that creative itch. The way you scratch that itch will change through this, with the tools getting easier, just as higher-level languages made it easier to get things done than lower-level languages did.
Nick Nisi (01:11:35.054)
Alright, well, I think we could go on for hours more; there are a lot more questions that I have on this. I do think this topic isn't going away. We tend to bring it up everywhere, all the time, whether we want to or not, and I'm sure we'll continue to talk about it. But it's cool to hear where everybody is in terms of day-to-day usage, where it's most useful, and even how we're thinking about things like exposing
docs and documentation and all of those artifacts to LLMs to simplify things as well. So, really cool discussion. Any parting thoughts from anyone before we head out?
KBall (01:12:23.915)
I do think if you're still a curmudgeon about using it at all, you're missing the boat.
Nick Nisi (01:12:29.486)
100%.
KBall (01:12:33.343)
Many, many years ago, there were concerns about going from assembly language to compiled languages and all the inefficiencies and all of this, that and the other. And what's it going to do for software engineering jobs or coding jobs or whatever? Going to compiled languages did not get rid of software engineers. There's a heck of a lot more of us, but it did get rid of assembly programmers.
Nick Nisi (01:12:56.192)
Mm-hmm. Yeah, I think that AI won't necessarily take your job, but the engineer using AI can definitely have a clear advantage.
Nick Nisi (01:13:10.956)
On that uplifting note, we will catch you next week.
KBall (01:13:16.469)
We need an outro that is dysfunctional.
Nick Nisi (01:13:17.966)
Hahaha
Amy (01:13:19.27)
Noodles!
KBall (01:13:22.803)
I got noodles!
Amy (01:13:25.574)
I told my kids that this morning; they thought it was hilarious. I know we were trying to wrap it up, so I didn't add this, but Kent C. Dodds has an interesting take on AI taking your jobs, which is that it is taking your job because companies do all this AI processing and filtering to decide who they should interview. So it is kind of taking your job.
Nick Nisi (01:13:47.096)
Yeah.
Nick Nisi (01:13:50.702)
That's true.
KBall (01:13:51.019)
Well, I mean, on the flip side, as a hiring manager, the number of AI-generated resumes is something else. You can tell some are done better than others, but a lot of them are taking your job description and inserting a bunch of random stuff related to the skills you're looking for. And
Amy (01:13:59.717)
Unreal.
KBall (01:14:15.669)
some of them are probably fabricated, because they'll have this consulting job with no details, but they're like, we worked on this thing that you asked for, and this thing that you asked for, and this thing that you asked for. And you're like, really?
That is a place where I think AI is making the world unilaterally worse. There are companies out there that are like, well, we generate job applications and blast them out everywhere. It's like email spam; it's just making it worse for everyone. A race to the bottom.
Amy (01:14:36.024)
Mm.
Nick Nisi (01:14:45.838)
I think the one place where AI is just undeniably better for me is meetings. I've used tools like Granola or Spark AI, and even Zoom has built-in stuff. I used to take notes during meetings, and I still do, but now I get a detailed summary of everything, with action items and all. It's so nice.
KBall (01:15:19.189)
All right, we said we were done, but maybe now we can really be done.
Amy (01:15:21.498)
Yep. Yeah, we can really be done.
Nick Nisi (01:15:22.67)
All right, let me stop