From the ExperiencedDevs subreddit, we're getting into a topic very relevant for all of us: What agents and models are you using?
Auto-Generated Transcript
Transcript is auto-generated and may contain errors.
Hey folks, I'm just headed to CrossFit here, and we're going to go through a post from the ExperiencedDevs subreddit. This one is, I don't know, kind of a thought process through programming agents. Honestly, I didn't even read the post, just the title, but the question was essentially: "What agents have you settled on for programming?" And as I was thinking about it, the TL;DR of this video is: I haven't. It's kind of neat, because when you have these companies competing against each other to have the best model or whatever, as a consumer I feel like no matter what, I'm winning, and it's great.
Yeah, there's a bit of cognitive overload with everything changing, but at least on the model front, this has been my experience so far. Companies keep putting out new models and updating their existing ones, and then the tools say, "Well, we want to be the best tool, so not only will we offer our own model, but we know it's important to you as the consumer to have access to all these other models, so we'll offer those too." That's pretty cool, because as a consumer it means I have a bunch of options. So the TL;DR is that I go between a bunch of different models. At the time of recording, it's the end of November, the 25th or 26th, something like that.
And in 2025, so depending on when you're watching this... probably no one in the world will watch this after 2025. We have GPT-5, and I think 5.1 is out. We just had Opus 4.5 launch yesterday or something, and I can't remember what the Cursor model is called. By the time I even post this on the internet, there will be new models out. It's funny: someone posted a meme on LinkedIn that was this circle of some company saying, "We have the best, most advanced AI model that's ever existed, best for programming," with an arrow pointing to the next company saying the same thing, everyone literally doing this one after the other in a loop.
And honestly, as a consumer and user of these models, that's exactly how it feels. Every day someone says, "We're the best," and the next day someone else says, "We're the best." Interestingly enough, the results always seem mixed. You'll have people saying, "This is the most groundbreaking model I've ever laid my hands on, oh my goodness, the world will never be the same," and then you scroll a little further on social media and the next post is, "I have never seen such a dog of a model in my entire life." It's funny, because I think things are going to keep advancing no matter what, which is cool. In terms of the technology behind LLMs, I don't know when they start to plateau relative to the amount of time and money invested into each next iteration.
I've already seen people claiming it does taper off. I don't know the details behind the technology there, and I don't want to pretend to. So at least for now, while they keep improving, that's great. And because these things are non-deterministic, I don't expect that when I try a new model, everything I do is infinitely better. Just to give you an example: in BrandGhost, when we were building out our Ghostwriter feature, we were selecting a model. We're on Azure, so I was using the Azure OpenAI offering we have there, and I thought, well, time to pick a model. In my head, I'm like, let me pick the latest one they grant me access to.
Why would I not do that, right? There are actually a lot of good reasons why you might not. I think GPT-5 was recently out when we started, so we picked it and went to do some content generation. Just so you have some understanding, the use case is this: you're in the application and you say, "I have a whole bunch of social media posts, and I'd like you to generate new social media posts based on my existing content." If I write about race cars, make me three new posts, and because I write about race cars, you can leverage content I already have as a basis for that. So we build this feature out, and while we're prototyping and testing it, it's taking about 10 seconds per post, and in my head I'm like, "Oh man, you know what?
I guarantee what I did was something silly." Locally, when I was running things, getting set up, and writing tests, I bet it was because I was using a local database, and now I'm doing something obscenely inefficient, like a ton of query round trips. "Oh, it's got to be that. It's probably the RAG stuff I set up, and I have a crappy code path that calls things too many times without caching." No. Everything was fine on that front. I wanted to blame myself originally, but the only blame I have is for picking GPT-5 in this case, because it's insanely slow for what we were trying to do. We switched to GPT-4o mini, and in about 2 seconds we had all three posts done.
It's fast enough that when you see the toast go away in the UI (we didn't have it auto-refreshing at the time), if you refresh right then, the posts are already there. So let's go, buddy. Sorry, two lanes are merging onto the highway and someone thought they were going to go in front of me, which is not happening. In this case, was the quality of the posts worse? I don't think anyone could objectively look at the output and say so, but the user experience was way more than 100% worse: it was just terrible on GPT-5 compared to 4o mini. So that's a great use case for not jumping to the next model.
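For context, on Azure OpenAI a model swap like that is usually just a matter of pointing the same chat-completions call at a different deployment. Here's a minimal sketch of that kind of "generate new posts from existing ones" call; the wrapper, prompt shape, and deployment names are my assumptions, not the actual BrandGhost code:

```python
# Hypothetical Ghostwriter-style helper: generate n new posts from existing
# ones. The point is that swapping models (e.g. a "gpt-5" deployment for
# "gpt-4o-mini") is a one-argument change, so latency comparisons are direct.
def generate_posts(client, deployment: str, source_posts: list[str], n: int = 3) -> list[str]:
    prompt = (
        "Here are some of my existing social media posts:\n---\n"
        + "\n---\n".join(source_posts)
        + f"\n\nWrite {n} new posts in the same voice, separated by blank lines."
    )
    resp = client.chat.completions.create(
        model=deployment,  # on Azure, this is the deployment name you created
        messages=[{"role": "user", "content": prompt}],
    )
    # Split the single completion into individual posts.
    return [p.strip() for p in resp.choices[0].message.content.split("\n\n")][:n]
```

With the `openai` package, `client` would be an `AzureOpenAI` instance; the same `chat.completions.create` call works against either deployment.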
When it comes to coding, though, here's roughly what my rotation looks like, because I am rotating between things; even last night I decided it was time to try some things out in Cursor. And I want to be careful here, because it's not just models. It's models and the tools. If you have good tooling, meaning that baked into how the agents run you get task lists or some kind of planning integrated, and you pair that with a reasoning model, you're going to be way ahead. But sometimes the tool calls, for example, are crap. I'll talk about that in just a sec.
My typical rotation: I program in Visual Studio when I'm actually writing code, or when I'm using an agent in supervised mode, as in I don't just want to let these agents run off and build whatever the feature or refactoring is. I don't want to sit there doing the grunt work, but I do want to actively participate, watch, and stop the thing if it goes off the rails. So when I'm in Visual Studio, at this point I'm generally on GPT-5. Otherwise, I don't think I have the Opus models available. I don't know why; I just dropped my Claude Max plan down to whatever the Pro tier is, but even with that I couldn't use my Claude models inside of Visual Studio.
I have no idea why. So I think I was using the Sonnet models. I'd go between GPT-5 and whatever the latest Sonnet model is, like 4.5 or something, and just flip back and forth. When I'd flip is when I noticed that, even with a new chat, the agent just seemed to be stupid. Maybe my prompt was bad; maybe how I structured that particular prompt just wasn't effective and I was missing it. But what I'd notice is that it hits a period of being stupid, or that's how it feels, and I'm sure I'm partly to blame. At that point I switch the model, either from Sonnet to GPT-5 or the other way around, and it seems to break through that plateau a little bit.
So I don't feel like I'm talking to a stupid agent. And again, I'm not using the word "stupid" to be condescending toward the people putting these agents together; I just mean the agents seem stupid, and I'm trying to take some of the blame for that too. That's how it feels in those interactions. The other thing I rotate between, and I use it a lot, is GitHub Copilot. I use it an insane amount. I've seen people complaining about it online or making fun of it, and honestly, I'm getting a little nauseated by people claiming they're so good at AI when this stuff has been out for almost no time at all.
When you try to talk like an expert on it, you almost by definition sound stupid, because you cannot have as much expertise as you're claiming just from the physical constraints of time. Anyway, that aside, I've seen a lot of people poking at GitHub Copilot like, "Oh, you're only using it because Microsoft." And I'm like, no, man. I use it because I can make a GitHub issue, which I'm familiar with doing, assign it to an agent, and then go do whatever I want and come back. I can do this times 10, have all of these things going in parallel, and use my computer to play video games or go to sleep. I don't have to sit at a terminal, or keep multiple environments with different terminals running things; I'm kicking all this stuff off asynchronously.
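That fan-out can be scripted too. A rough sketch, assuming the `gh` CLI is installed and authenticated; the repo name and task file are hypothetical, and assigning the Copilot coding agent itself is typically done afterward via the Assignees picker in the GitHub UI:

```shell
#!/usr/bin/env sh
# Fan a list of task titles (one per line) out as GitHub issues so an
# agent can pick them up while you go do something else.
create_issues() {
    repo="$1"
    task_file="$2"
    while IFS= read -r title; do
        [ -n "$title" ] || continue  # skip blank lines
        gh issue create \
            --repo "$repo" \
            --title "$title" \
            --body "Queued for the coding agent; details are in the title for now."
    done < "$task_file"
}

# Example (hypothetical repo and file):
# create_issues "me/brandghost" "tonight-tasks.txt"
```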
Okay, buddy. Fast lane. Let's go, because I've got to switch some lanes here. Okay, let's get over. One sec. Two more. Come on. Here we go. So yeah, when I'm in GitHub Copilot, I don't even think I'm picking the model being used there, to be honest. I don't think I have selection over that, and if I do, I don't know what I'm using. But it seems to be doing a good job, and I'm pretty happy with it most of the time. And I know the limits: there are certain things I'll give it knowing it's not going to do them right, but it'll at least give me a bunch of code I can build on and massage. I'd also say that within GitHub there's this Spaces concept.
I'm not sure how many people use this yet, but it's kind of like having ChatGPT conversations, except in Spaces you can pull in different repositories. So it feels like talking to a ChatGPT equivalent that has all the context of my code. I don't use whatever OpenAI has, Codex or whatever; I've never used it, so I don't know, though I'm sure it's a similar experience. In Spaces, I'll pick the latest GPT model if I want to do something more complicated. My thinking is that if it's a more advanced model that takes longer to think and reason through things, I want that experience when attempting a big refactoring or architectural overhaul. Otherwise, if I'm just riffing with the thing, I'll pick one of the mini models I can select there.
So that's been the experience. Now, yesterday I switched over to Cursor, because I was doing some supervised coding, but what I had was a starting point from GitHub Copilot. I had given Copilot some work that I knew it wasn't going to be able to complete, and by "complete" I mean to my liking. I shouldn't say I knew; I felt confident it would get the code going in a direction and then probably fall apart on some of the details. And that's exactly what happened. But then I said, cool, in a supervised way I still think AI could do this. GitHub Copilot is a bit of a pain for supervised coding, because the sessions just have too much overhead.
You leave a comment, and you're not going to sit there and watch the session, because the overhead is insane. It's really nice for leaving things running in the background, though. So I said I'd use Cursor, because it has a plan mode. I wanted to talk to this thing about the state of the code, have it come up with a plan, and modify the plan with Cursor. I can't remember what model it's using. Damn it, I can visualize it in my head; I think it ends in a one, and I feel like the model has "code" in the name. Anyway, it's awesome because of the speed. It's planning, and when I send it a message to update the plan, it happens instantaneously.
And what I noticed working in Cursor in this supervised way is that having models this responsive is an amazing experience for this kind of development. Like I said, when it's not supervised, I don't care if it takes 20 minutes for an environment to spin up or for whatever agent to run with whatever model while I'm asleep; I just don't care, even though the overhead can be really high. But when I'm doing a supervised coding session, I need that immediate feedback, and I didn't realize how good that felt. So I was getting Cursor to put these plans together and reviewing them, and it felt super fluid. Look at all the parking spots. So that experience was really good, and then when it was actually coding things, it was very fast to go make the edits.
And as it's coding, I could stop it and say, "Hey, no, we missed this; go do it this way," and the responsiveness of whatever that Cursor model is was a bit of a new experience for me in this mode, because otherwise it has always felt kind of slow and I've just dealt with it. So I think, until there's another model from another group that's like this, if I'm going to do supervised coding sessions I'll probably jump over to Cursor for now, and I'll continue to use GitHub Copilot for blasting out tons of asynchronous issues. And that's less about the model and more about the tooling, just for clarity. So that's kind of where I'm at with this stuff.
I'd be curious to hear from other folks, and I'm sure in a week, two weeks, a month, all this stuff will change again. So, kind of exciting times. But that's the video for today. Thanks so much for watching, and I'll see you in the next one.
Frequently Asked Questions
These Q&A summaries are AI-generated from the video transcript and may not reflect my exact wording. Watch the video for the full context.
- What models do you currently rotate between for programming in Visual Studio?
- I typically rotate between GPT-5 and the latest Sonnet model, like 4.5, when programming in Visual Studio. If I notice the agent becoming less effective or 'stupid,' I switch between these models to break through that plateau and improve interaction quality.
- Why did you choose not to use GPT-5 for generating social media posts in your project?
- I initially picked GPT-5 for generating social media posts, but it was insanely slow, taking about 10 seconds per post. Switching to GPT-4o mini reduced the time to about 2 seconds for all three posts without a noticeable drop in quality, greatly improving the user experience.
- How do you use GitHub Copilot and Cursor differently in your software development workflow?
- I use GitHub Copilot extensively for asynchronous tasks like generating code from GitHub issues, allowing me to work on multiple things in parallel without constant supervision. For supervised coding sessions where I want immediate feedback and plan modifications, I prefer Cursor because of its fast, responsive planning and coding capabilities.