REALLY? These AI Agents Are Replacing Software Engineers?

While on a little staycation from work, I decided to ramp up my usage of AI agents in my coding. I played around with Cursor AI and VS Code to help me refactor some code in BrandGhost.

The results? Let's see if my job remains safe...

📄 Auto-Generated Transcript

Transcript is auto-generated and may contain errors.

Hey folks, I'm headed to CrossFit. It's Wednesday, halfway through the staycation. Um, I'm going to talk about AI tooling today and specifically agent mode. Um, because that's what I've been trying out a little bit more of and wanted to share my results so far. I haven't been like I don't have like oodles and oodles of experience with it, but I am like actively using it. I think this whole this whole week I've been using it. Um I guess it's only Wednesday morning, but uh like for like over the weekend and then Monday, Tuesday, I've been coding a lot and really trying to lean into leveraging it. Um and I just wanted to like share some some thoughts and I'll I'll kind of explain try to talk about like what I think is working, what I think is absolutely not working, but um try to look at it from some different angles.

So, friendly reminder, if you're new to the channel, this channel is all about questions that you submit. And if I don't have a backlog of those, then I either jump to Reddit or I share like an insight that I have that I want to go through. Um, so if you have questions, leave them below in the comments. Otherwise, send them into Dev Leader, which is my main YouTube channel and my social media handle that I use for everything. And uh if you send them in that way, I'll keep you anonymous so you can write whatever you want and uh try to help you out. Okay, so the uh experience that I've had with agent mode, and this is with VS Code and Cursor, is currently lackluster. Uh when I first started using it, it was like I think I tried VS Code and Cursor and right away I was like, I'm never going to use this.

Like I'm I'm so disappointed that like it just can't do anything. So, I I had taken a break from it. Um and then like I said this past weekend was like, "Okay, I'm going to like I have time and I I don't want to give up on it cuz I think the the premise is very promising like especially if I can be productive at the same time." So what I mean by that is one of the use cases that I am really interested in for agent mode is not probably like the vibe coding like just go build it for me kind of scenario. I know how to build software. I I enjoy building software, but there's some parts of building software that I don't enjoy and they are tedious and they are monotonous and they're repetitive and they got to get done or they should get done.

But we end up saying, "Well, maybe not right now, right?" Like it ends up being, in my opinion, kind of like a a tech debt overlap situation. And I think that having agents to be able to help with that would be tremendous, right? Like, hey, I want to go build this feature and you go screw off and fix up all this stuff that like I can explain how to go fix. I just don't want to do it. And I have a perfect uh situation that's like this right now. So, I'm building a platform for posting social media content that's called BrandGhost. Um, so shameless plug if you're interested in starting out with content creation or you just want to do some learning in public or you have a, you know, a business where you're like, man, keeping up with social media sucks. I don't like posting everywhere.

Check out BrandGhost. It's just at brandghost.ai. It's it's absolutely free if you just want to use it for scheduling and cross-posting. No strings attached. Uh, and you know where to find me, right? I'm the person who's making it and using it. So uh but with BrandGhost I'm working on a feature or sort of enabling some functionality to build some features that uh is working around how authentication is done. And in order to make this happen um I have to basically do refactoring and and pay down some it's truly paying down some tech debt across all of these plugins that we have. And because BrandGhost is a a platform that posts to social media, that basically means every social media platform that we integrate with is is like a plugin in our codebase. Okay. So, there's some some patterns, there's some stuff because we didn't add them all at the same time.

Those patterns have grown and adjusted over time. So, it's kind of like as a codebase is evolving, if you find a better or like a cleaner, more uh, you know, scalable way to do something, you're probably going to adjust as you're going and and use the new pattern. So, because of this, and I need to unify how we're doing some of the auth stuff, I Oh, man. Sorry. I'm watching a car pull into a lane out in front. I got to move over a lane. Come on, let me in. One more. Nice. Okay. So, because I have to basically do this pattern across like literally like 20 different plugins, this is like the least enjoyable type of work, right? Because it's almost the same thing, but slightly different because, like I said, the patterns have changed. So, I can't just do a fancy search and replace or write a script that looks for the pattern, replaces it.

It's a really good job for AI in my opinion because it's all like I could describe this how how to do this refactor to you as a human and I could say like this type of thing is going to cover 95% of what you see. Here's a couple of variations of this. And then beyond that, basically, if you try to compile, you'll see where there's a few outliers, right? Like some real edge cases, and they're not even going to be something that you can't solve. It's just that they don't fit the other patterns quite exactly. Like, we might be using an extra property in some place and like, cool, just make sure you're passing that property along. Like, no big deal. So, I set out to do the first version. Come on, man. Let me on the highway. What are you doing? Got two people in a row.

Three people in a row. This person's going to do the same thing and box me in again. You're not helping, stupid. That's what you are. Okay. So, I do the first refactor, right? I'm gonna, as the human, I'm going to go do this properly. I'm going to set the bar for how I want it to be done. And for context, a lot of like this is really how simple it is. Okay, I have a database query that's doing a join on a table or two tables, right? So, joining them together. What I'm doing is undoing the join. Okay, so undo the join. That means the record that I'm creating as a result of that um no longer is going to have some of those fields that I was pulling in from the join. And one layer above that I'm going to be combining that data from another database query that already exists.

Okay. So all that I'm doing is getting rid of a join and restructuring uh like a data transfer object. It's just holding some fields. I'm just like basically splitting it apart and recombining it from two queries now, one of which already exists. That's like that's it in a nutshell. Everything else that has to change as a result of that is just the fact that if you were using some of those properties on the object, you now have to like say object.subobject.property because I've composed the um the data transfer objects. It's it's all it is. Okay. So, the only like additional thing aside from like changing all the spots where that's getting consumed, the only additional thing is that um there are like tests, right? So, not every one of these um plugin like uh particular paths for auth uh on the database code has tests, but like I said, they're almost all the same.
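To make the refactor described above concrete, here is a minimal Python sketch. It is not the BrandGhost code, and the video does not show the real schema or language: every class, field, and function name below is a hypothetical stand-in. It only illustrates the shape of the change, where a flat record that used to come from a join is replaced by a DTO composed from two separate queries.

```python
from dataclasses import dataclass


# All names here are hypothetical; the real schema is not shown in the video.
@dataclass(frozen=True)
class AccountRecord:
    """Row returned by the plugin query once the join is removed."""
    account_id: int
    platform_user_id: str


@dataclass(frozen=True)
class AuthRecord:
    """Row returned by the second query, which is assumed to already exist."""
    account_id: int
    access_token: str


@dataclass(frozen=True)
class AccountWithAuth:
    """Composed DTO, built one layer above the two queries."""
    account: AccountRecord
    auth: AuthRecord


def load_account_with_auth(account_id, fetch_account, fetch_auth):
    """Run both queries and compose the results instead of joining in SQL."""
    account = fetch_account(account_id)
    auth = fetch_auth(account_id)
    return AccountWithAuth(account=account, auth=auth)


# Stand-in query functions so the sketch runs without a database.
combined = load_account_with_auth(
    7,
    fetch_account=lambda i: AccountRecord(i, "user-7"),
    fetch_auth=lambda i: AuthRecord(i, "token-7"),
)
# Callers that used to read a flat field now go through the sub-object.
print(combined.auth.access_token)
```

The consumer-side change is the last line: code that previously read a joined field directly from the record now reaches it through the composed sub-object, which is the kind of edit that has to be repeated at every call site.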

So, like just write the tests, right? I already have across 20 plugins, I probably have like over half of them done. So, it's not it's not like invent it from scratch. Like, you have all of these existing options and you can look at a group of them. You could even see if there's small differences and adjust. So, the the way that I did this was like I explained it to Cursor. I don't have the prompt off the top of my head, but essentially said here is a version of this that I did. Okay. So, look at this social media platform. Look at how this code is structured. Here's what I did. So I explained it and then said I would like you to go apply this to the other social media platforms. And so I know that would be a big ask to do all at once.

So it suggested like hey I see these social media platforms. Do you want me to start with one? And I said I think that would be a lovely idea right before setting you loose. Let's go with one. And um it tried and I think that it what's interesting is I think it starts to do the right things. Like it's looking in the right spots and that's already like it's pretty cool. It's impressive that it's like I know it's not the same code but I see similar patterns like let me go adjust it. But a couple things were happening. One is that it would do it for some of it but not all of it. And I couldn't find a pattern. So it's like, "Oh, I know I should go touch this and let me adjust it." And then it would just miss like 50% of the spots that needed the same treatment.

So I'm thinking like, how do how does it not and I don't mean this facetiously, it's a genuine question, right? How does it not know? But as a human, I could literally if someone gave me the delta for their change and applied it, if I wanted to see if it was working or done, I would just try to compile and I would notice right away like, oh I missed a ton of it. Which brings me to my next point is like it would touch a lot of the spots, but it wouldn't even do it properly. Like it was I don't know if it you'd call it hallucinating, but it's making changes that that don't compile. And I feel like that's a super easy like litmus test for a like Cursor or Visual Studio Code. You're literally in the IDE. Just check if it compiles. Now, the other crazy thing is that even when prompting it, right, cuz I had to do this for I'm not even done.

I have to do one more, I think, today. But across these 20 plugins, you know, one of the variations and prompting I started doing was like, make sure it compiles. And it would tell me like, oh, I this uh this this code definitely works now. Like, it definitely compiles. And I'm like, no, it doesn't. Like, I could still see the squigglies in my browser or my my IDE. The code browser, right? Like I know it doesn't work. It's very obvious. So I don't part of me is confused, right? So it's those two pieces. It's uh it's touching some of the spots, not all, and then it's not creating code that compiles. Um, one of the other things that comes up is that I'm still finding that it's like just deleting code. So, it's doing the refactor like it's supposed to like which is again it's I'm I know it sounds like I'm I'm on this.
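The "just check if it compiles" litmus test can be sketched as a small gate that runs the project's build and trusts only the exit code, not the agent's claim. This is a minimal Python sketch under assumptions: the function names are hypothetical, the build command is whatever your project actually uses, and nothing here reflects how Cursor or VS Code work internally.

```python
import subprocess
import sys


def build_succeeds(build_command, cwd=None):
    """Run the build command and return (ok, combined output)."""
    result = subprocess.run(
        build_command, cwd=cwd, capture_output=True, text=True
    )
    return result.returncode == 0, result.stdout + result.stderr


def review_agent_change(build_command, cwd=None):
    """Accept an agent's change only if the build exits with code 0."""
    ok, output = build_succeeds(build_command, cwd)
    if not ok:
        # The compiler output is what you would paste back into the agent.
        return "rejected: build failed\n" + output
    return "accepted"


# Stand-in "build" that always fails, so the sketch runs anywhere.
failing_build = [sys.executable, "-c", "raise SystemExit(1)"]
print(review_agent_change(failing_build))
```

A passing build is only a floor. It would not catch the deleted method calls mentioned here, since code that compiles can still be missing behavior, so the manual review is still needed.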

It's it's very impressive. Okay. Very impressive. But it's doing some things like it's just deleting some method calls and I'm like nope. We need that. I put that there because it does something. I don't know why it's deciding just like out of nowhere this code's gone. So, what I have to do after this refactor when I'm reviewing it is I have to go comb through everything in great detail because I've already come across a few things where I'm like, "No, man. Like, why did you remove that? Like, that's a very important method call." So, um that's been the gist of it so far. I tried tailoring my my prompt cuz I shared this in a previous video, right? Like one of the things I tried that seemed to work really well is I gave it literally the diff from git.

So between each plugin I'm checking in and then I have um just checking in locally and then I have a commit uh delta that I can do and give it to Cursor and say like here's what I need to do. Here's literally a working example of it for this other plugin. Like take these patterns you're seeing for these spots that are touched and go apply it to this platform. And even with the diff, it's not doing it's it seems like it didn't improve it much. Maybe it did a little bit, but certainly not to the point where I can say go do this refactor. Um it likes to pause periodically to say like, do you want me to keep going? And um if I keep saying continue continue, by the time it's air quotes done and I look at it, I'm like doesn't compile. Um like it's it's just not done.
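The diff-as-example approach described above could be scripted roughly like this. This is a hedged Python sketch, not the prompt actually used in the video (which is not shown): the function names, plugin names, and prompt wording are all hypothetical. The only external tool it relies on is the standard `git diff <commit> <commit>` command.

```python
import subprocess


def git_diff_between(base_commit, head_commit, repo_dir="."):
    """Return the textual diff between two local commits via `git diff`."""
    result = subprocess.run(
        ["git", "diff", base_commit, head_commit],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout


def build_refactor_prompt(reference_plugin, target_plugin, reference_diff):
    """Assemble a prompt that uses a finished refactor as the worked example."""
    return (
        f"Here is a working example of the refactor, applied to {reference_plugin}:\n\n"
        f"{reference_diff}\n\n"
        f"Apply the same patterns to {target_plugin}. "
        "Update every call site, do not delete unrelated method calls, "
        "and stop only when the project compiles."
    )


# Hypothetical plugin names and a tiny inline diff, so the sketch runs without a repo.
example_diff = "-    return record.access_token\n+    return record.auth.access_token"
print(build_refactor_prompt("PluginA", "PluginB", example_diff))
```

In real use, `git_diff_between` would supply the diff of the commit where the reference plugin was refactored by hand. As the transcript notes, even this much context did not get the agent to a complete, compiling result.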

It makes it and you know this is where I want to start giving it some credit because I'm almost at CrossFit. It is it's doing a lot more than than nothing, right? It's like it's not zero. But what I'm finding like a fascinating thing to try and think about is like is it actually faster than me doing it myself? And right now the honest answer is it's not. It's what I'll tell it or what I'll give it is like it's less painful, right? Like and and maybe marginally it's less painful because someone else did the shitty work, but now I have a different type of shitty work, which is like now I'm reviewing code and trying to find the gaps that it missed. Like that's also pretty shitty. I don't want to be in a spot where I'm guessing like did we get it all? Um right.

If I had, you know, amazing test coverage on everything, then I could go run the tests and be like, "Oh no, you screwed up." But I don't. And part of it is like I need you to help create the tests cuz I know there's gaps. So, it's it's kind of challenging because I see there's such a good potential here, but I don't know if it's screwing up because there's too much context. I was trying to explain this to someone yesterday, I think, where I I'm describing this challenge of like and I don't I just simply don't know enough about LLM functionality given that it's in the IDE. Is it too much context? And it's like it starts to water down the ask almost because I've been having really good success with ChatGPT like getting it to like design individual uh methods, like write an algorithm that does whatever, smaller classes.

It does a great job. Um and I've been using that extensively recently. So in my mind I'm like okay if it's able to do that imagine if it had the context of the codebase. But then we go to use something like Cursor or VS Code was even worse somehow. Um it's like it it can't keep the the momentum going. So if I have to handhold it so much like it's kind of defeating the whole reason I want to use it which is I I wanted to go refactor this without me having to to be paying attention at least that extensively. Right. So anyway, I'm going to keep using it the rest of the week. I I you know, I'm not going to give up on it. Um that means I have to learn how to prompt more effectively, but I got to learn sort of the ins and outs because um I see this I do see this being the way forward for a lot of stuff.

But like I said earlier, I enjoy building software. I'm not going to tell it. Like if there's like features and stuff I want to go build because they're fun or I'm interested, like I'm going to go build those. But for the other stuff that should get done that I don't want to do? Thanks AI. Like glad you're here. So anyway, that's the video for today. Thanks for watching and I'll see you next

Frequently Asked Questions

These Q&A summaries are AI-generated from the video transcript and may not reflect my exact wording. Watch the video for the full context.

What has been my overall experience using AI agent mode in VS Code and Cursor for software refactoring?
My experience with AI agent mode in VS Code and Cursor has been lackluster so far. While it shows promise and can handle some parts of the refactoring, it often misses many spots that need changes and sometimes produces code that doesn't compile. I find myself spending a lot of time reviewing and fixing the AI-generated code, which makes it less efficient than doing the work myself.

How do I use AI agents to help with tedious and repetitive coding tasks?
I use AI agents to help with monotonous and repetitive tasks that I don't enjoy but need to get done, such as refactoring similar patterns across multiple plugins. I explain the pattern and provide an example to the AI, then ask it to apply the changes to other parts of the codebase. Although the AI can start the work and touch many spots, I still have to carefully review and fix issues it misses or incorrectly modifies.

What challenges do I face when prompting AI agents to perform code refactoring?
One challenge is that even when I provide detailed prompts, including git diffs and working examples, the AI often fails to apply the refactor completely or correctly. It sometimes deletes important code or makes changes that cause compilation errors. Additionally, the AI may claim the code compiles when it clearly does not, which requires me to manually verify and fix the output. Learning to prompt more effectively is necessary to improve results.