Are You Sick of AI Yet? No? Let's Dive In!

Are You Sick of AI Yet? No? Let's Dive In!

• 145 views
vlogvloggervloggingmercedesmercedes AMGMercedes AMG GTAMG GTbig techsoftware engineeringsoftware engineercar vlogvlogssoftware developmentsoftware engineersmicrosoftprogrammingtips for developerscareer in techfaangwork vlogdevleaderdev leadernick cosentinoengineering managerleadershipmsftsoftware developercode commutecodecommutecommuteredditreddit storiesreddit storyask redditaskredditaskreddit storiesredditorlinkedin

Just what you wanted! Another AI video! Let's talk about some more AI usage patterns in my own development.

📄 Auto-Generated Transcript

Transcript is auto-generated and may contain errors.

Hey folks, I'm driving in the office. We're going to talk AI today. We talk AI every day. Um, going to give a little update on some agent usage because I mentioned that I would before. And more recently, the thing I've been trying to do is um put up some more guard rails for agents, I guess, is the best way to put it. And so I am a net developer like uh in particular like outside of what I do at work I write a lot of car code. I've been doing that for many years. And in C# we have something called analyzers. So using Roslin we can write basically it's kind of neat. You write C to do C. Oh my god this person to turn man. you write C to do um like C rules. Uh so it's pretty cool kind of meta and um the idea behind that is that if you have code that is not meeting the expectation of your analyzer then it can cause uh build failures and things like that.

Um and the idea is of course if we have that happening then when AI is making uh stupid decisions then sorry this turn is just a nightmare buddy that's okay we get to go we get to go um when you have AI that's making stupid choices is despite how hard you're prompting it, hopefully the analyzers tell it what's up. So, uh, this has been working pretty well. Um, I made a couple of videos, I put them out on YouTube and actually on my main channel, Dev Leader, and actually showed an example of like a really simple situation where, um, if you're familiar with like the arrange act, assert pattern in tests. If you've been using AI to write tests, a lot of the time it will do this naturally following a range act assert. But the thing that I hate, it drives me nuts and maybe this is just me being nitpicky, but I hate that it comments a range act assert.

It's in my opinion just completely ridiculous to have in my codebase. Um, I know that it's following a range act assert because that's how every test is done. I don't need a comment that tells me that. Um, so I actually don't have this rule turned on my own codebase, but uh I did write a uh analyzer rule for this YouTube tutorial. Vibe coded it and just kind of gave instructions to co-pilot and said like this is the scenario that I want in an analyzer rule. And what we did was had co-pilot generate tests without it. And then we put the rule in place after it was vibe coded by co-pilot and deleted the test and basically said you know same prompt go write test following arrange act assert and um the first pass through it it put the comments in and then it tried to compile and it was like oh like I can't all of these errors show up because of these comments.

So then it it went back and deleted them. So it is helping. Um there's absolutely situations where it's it's doing that and you like from that example alone and I kind of called this out in the video but um you could you could do better with the prompt on that like you could say do a range act assert style test and and do not put the comments in. And in fact, the the reason that I use this as an example is because I've literally had um in my own development, you know, co-pilot instructions that say to to not leave those comments. I put it in the prompt. I have it in a custom agent. I have it in the, you know, documentation for the project and it still would put in a range act comments. It's like I don't can only tell you so many times, man, until I need to make it a rule that you cannot.

So, overall, this has been working pretty well. I'm hoping to use some of this drive to, as I'm blabbing away here, to come up with a a few more examples to talk about. But the fun thing that I encountered and I posted it to Twitter, uh, by the way, I post a lot of stuff on social media. So, if you're like, hey, like these code videos are kind of fun. Uh most of my content for software engineering is just under the dev leader brand. So you can look for dev leader on social media. Um code commute also has posts and stuff that go up. But like I don't write, you know, long form or like other types of posts for code commute. I just put that all under dev leader. And so I posted this to to Twitter and showed an example where um co-pilot was writing code for me.

And I have a custom agent instruction that says that you cannot uh suppress analyzer messages. And um I said unless explicitly told to do so. And sure enough, in the screenshot, it shows uh co-pilot kind of giving up trying to fix something. It's just like, oh, there's an error with my code. Let me just turn off this analyzer rule. So, so it's still not bulletproof. Um, I think like you'd have to find some way to like I don't know allow any suppression or something like do we need an analyzer rule that looks for the uh conditional compilation like pragmas? I don't even know if you can do that but it's just it's like it's silly, right? Um, I don't know. Oh, I got a hiccup. I don't know how I'm going to do better than that right now. I'm kind of out of ideas.

So, if you're a .NET developer and you're hearing what I'm saying and you're like, "Oo, like I wonder if you could let me know because like I said, I'm out of ideas." Um, I I've kind of I think I've moved the problem or I've enhanced it uh enhanced the uh the resolution of this kind of stuff. And the way that I'm thinking about this type of thing is like you can put like I think we need to address it in different stages. And you've probably heard people saying like shift left, right? Like everything's got to shift left. And I think that's kind of what's happening is that if we didn't put helpful stuff in the prompts and we just use things like analyzers, it would go code all sorts of random stuff and then hopefully you catch it with an analyzer rule, right?

And it might waste a lot of cycles trying to do the right thing, building things, and then it's like, oh wait, when I compile it doesn't work, like let me kind of go from scratch depending on how many analyzer rules it's tripping up. So like the analyzers I think are a helpful guard rail but I think we have to do this stuff um at different stages in like the development pipeline and you know to start you have things in your prompt right you're literally asking an LLM to do some work for you so you need to have some constraints in your prompt and I think that if you can be more specific you could have better results but um like that's not new anyone, right? Then you layer on top of that like your co-pilot instructions or um agents MD file. So like that's also kind of like I think if you're using uh something in agent mode like I think that's kind of a minimum you have to do that.

Then we go one step beyond that. So, not only do you have your your agents MD file or co-pilot instructions, I I'm assuming in not too long from now, maybe we'll always stop saying co-pilot instructions and just refer to it as agents MD, but for now, then you have your custom agents. So, like in co-pilot, I can use like in GitHub copilot, I can assign a custom agent that's, you know, written up in markdown. Um, kind of like your agents MD file. uh in claude you can have custom agents. So like this is a third level your prompt your um sort of repository wide or folderwide like agents MD file and then a custom agent. So three levels now. Um I also think that kind of like with documentation I think you can have docs in your repository but I do think this like nested agents MD file is kind of supposed to help with that.

Uh, I personally haven't really had a lot of success with agents leveraging docs, even if it's markdown. I feel that I've put into my instructions and stuff, make sure you're reading docs, here's the structure you should expect for that. And I feel like it doesn't do it like at all. Um, I don't think I've ever seen it be successful with that. Um, maybe that's just me, but I would think that like docs could be like this fourth level. Um, but maybe that's just really agents MD is is the thing that's taking precedent over that in terms of benefit. So those are multiple layers and then what I was describing on top of that is like analyzer rules and if you're not a .NET developer like the idea is like you know you could have a llinter depending on what language and stuff you're using.

So you have like things for stylistic changes and depending on what you're building in like I don't know how aggressive your li uh llinter can be with rules but um kind of just like finding these programmatic ways to enforce patterns. So I think like for me that's what Roslin is. It's like the built-in rules are pretty helpful and like help a lot with style and and code safety, but there's also just like patterns in my code base that if you don't follow them, the code still is like right, still compiles, might still work, but um it's not like what's what's a good example? Like I use a result pattern for everything, but like you could also throw exceptions and like it's not that one of those is right or wrong. It's just like I don't throw exceptions in my code by choice.

So I will write code in ways where I try not to throw uh not that I don't use exceptions at all but uh I use I have like a custom result type and it's either an exception that comes back or the value that I want to return. So like I I use exceptions but I don't throw them. Uh that way I can still have try catch blocks if something goes wrong and then like it's not necessarily a exception that needs to bubble up out of my code and like you know um be caught at higher levels kind of skipping up the call stack. I can return it up and then make decisions about what I want to do with that. Um it's just a sort of different paradigm I guess but that's my preferred way. Is it better or worse or right or wrong? It's just different.

So even when I have instructions that say follow the patterns in the codebase, like there's a result pattern. Here's what it looks like with examples doesn't doesn't do it sometimes, but like not all the time. And then it sucks because sometimes co-pilot or claude or whatever cursor, it'll put a like a PR together. it'll put the changes together. And I'm looking at it, I'm like, this is like it's mostly right. Like it's it's basically doing a lot of the things I like, but it just like completely avoided the result pattern as an example. So I'm like, I don't want to commit that into my codebase because now I have like everything uses a result pattern except this spot. Like it's kind kind of weird, right? And then over time it's like there's more and more spots that just aren't following the result pattern. And now I have like these two completely different paradigms just like kind of scattered across the codebase and it's just more convoluted.

But take this concept and apply it to like to anything. I'm using result pattern in this case. Uh you could have something like maybe uh you're a fan. I say the word tpple instead of tupole by the way. I hear it both ways. I prefer tpple. I always give this disclaimer when I say it because I feel like someone's like, "Oh, you say tpple, not tupil." Whatever. I call it tpple. Um, some people might want to return tpples from things, right? Like uh unnamed sort of anonymous tpples and that might be fine and like I'm personally okay to do that with like private methods and stuff, but I don't like that on a public API of a class personally. Is it right or wrong? Like it's not right or wrong. It's just preference. I prefer to have a type. My reason for that is I find it easier to refactor things when there's a dedicated type on the public API.

But I like the convenience of just having a tpple inside a class, right? Because if I'm going to refactor stuff and break things inside the class, it's kind of scoped and I'm I'm like convenience-wise, I like it um because it's scoped. So that's another thing, right? But you can just keep dreaming up all of these different scenarios where like you have this style or this approach or this design pattern you like using and the LLM doesn't have to use that and the code will still work. So how can we start enforcing these things? Um the other example that I've kind of extended on, I'm not sure if I talked about this in one of the earlier videos, but it's on, you know, the one of the more recent Dev Leader YouTube videos at least.

I show vibe coding another analyzer where for assert messages this is this is one that I do have in my um my actual code but for assert true and assert false um in xunit these take another parameter that is a message personally if I have assert true or assert false without a message I'm not okay with that uh you might say that's kind of overkill like who cares errors. But when you're looking at test errors at a glance and you just see like test failed because asserted true and got false, it's like no matter what, I have one more step to do. No matter what, one extra step. And that's like I have like to understand anything about that. I have to go read the code. Don't get me wrong, if the test is broken, I still have to go read the code and fix it.

But it would be super handy to not have to go one step further just to like see what the assertion's talking about. So you can put a message there. So when it said instead of saying assert true got false which like obviously that's the failure. It's a it's a boolean check. It's one or the other. Um instead of saying that your assert true or assert false error message will say whatever you define like you know expected um expected the conversion process to to succeed like okay like that that tells me something more helpful. You might also say well just write your test with a single assert and like that's cool just not my style. I don't do that. So I wanted an analyzer that would um guarantee that like the code won't compile unless you put a message in. So little things like that. But I made, you know, a video walking people through that.

The really I think the cool part is like all the analyzers that I have are vibe coded like completely. And so the the pattern that I use for this is usually like depends on when the idea strikes. Sometimes like I'm lying in bed like thinking about code like before I'm going to sleep. For some reason that's something that like really knocks me out like solving tricky problems in my head. I just like pass out while I'm doing it. And it's not stressful. Like it's kind of enjoyable cuz I get to problem solve and think but without like trying to type it all out. I'm just like how do I think through this? Usually exhausts me and I fall asleep pretty fast. Um, so sometimes I'm thinking about things and I'm like, "Oh, like you know what? Like that's actually something I've seen co-pilot mess up a lot.

Like I should probably put an analyzer together for that." So sometimes that happens and then like I just go to to GitHub on my phone and I I get the analyzer started, right? Make an issue assign it to C-pilot and um see how far it gets in the morning. But almost always when I'm having co-pilot do it this way like from GitHub I will pull down the branch and I will mess around with it because I want to see for myself in the IDE like in Visual Studio playing around with things. Okay, like I said, you know, this was the pattern but like what if I change the code a little bit like is does it work or does it get caught or what do I expect? Right? like I want this analyzer for this scenario, but maybe maybe there's like more variations of the code and the analyzer is too strict or it's too loose.

I've seen times where um it's too strict and like it ended up changing a whole lot of code and I'm like no no no like all of those other scenarios are fine. Um you got to address this one specifically. So, I have to tell it to back off a little bit and revert a bunch of changes. The one that uh I was working on most recently, um, how do I describe this? Okay, so I have a a helper class that returns my result object or my result type. And in this case, it's a result type that's holding a stream. Again, if you're not a C# developer, um I have well, I haven't said the part that's C# specific. There's something called I disposable in C. And a stream is a type that is I disposable. So the pattern that you're supposed to do is when you are creating objects that implement an I disposable pattern, you need to dispose of them.

So you tell the runtime like, hey, I'm done using this thing. Like clean it up for me. Otherwise, um depending on what the types are, sometimes they'll hold on to memory longer than they should. Uh they might get cleaned up later, uh when the garbage collector's running, and other times like they're holding on to unmanaged resources and it doesn't know to go freed up and you get memory leaks and things like that. So, you want to make sure that you're uh disposing of them. Then there's some patterns around that. So I realized that there's and even like I have screwed this up in my own codebase. There's a lot of spots that that call this helper class and they say basically it's to download stuff from the internet.

Um, but I have some some things wrapping it to like I can't it's kind of weird to explain, but I want to guarantee that I get a stream back with a length on it because the type supports a length, but it's not always you can call it and it will crash. Like, it's really dumb the API for it. So, anyway, I I enforce it. There's a few more things. But because it's my result type, it's not obvious that you need to dispose it because you're not disposing of the result type. You're disposing of the result types value. So in my opinion, it's so easy to miss it because it's not like call this thing and give me a stream. It's call this thing, give me a result object, result object, give me your stream. Just one level of indirection. But the side effect is that I missed a bunch of spots that need to dispose of this resource.

I realize it's probably pretty wordy. Um, but what I was doing was like, okay, like I need an analyzer rule that basically will check if I have code that's not disposing of this stuff, catch it. Because it's not wrong like from a compilation perspective, it's not wrong to go, you know, create a stream and then not dispose of it. It's totally valid code. It's just depending on what that stream is doing or hold what resources it's holding on to might not be what you want. So, okay, co-pilot, go make me an analyzer. And it did. It was cool. But then I realized it was only doing it for like one of my three helper methods. And each one is slightly different slightly. So the first analyzer was working, had tests for it. Um, and I was doing the thing where I'm like in Visual Studio changing the code around and and being like, okay, does the analyzer pick this variation up?

Does it miss it? What do I expect? So that's like a really good feedback loop to like to play around with. Um, I really like that for building analyzers versus just like crossing my fingers that Copilot will get it perfect. And so I realized, oh crap, it's not covering the two other helper methods, even though they're very, very similar. And I said, Copilot, we got to cover those. Same idea. And what happened was that it ended up breaking like all of it. So this is why I think it's really important in your analyzers to have like you know uh happy paths that like you know don't trigger the analyzer and and paths where the analyzer is triggered but um yeah basically it broke the an like the analyzer on the scenarios that it's supposed to catch and so it just was like sitting there in a loop.

trying to like to fix this stuff to fix these other variations of the analyzer. And um you know if I would have left that running in uh in GitHub, what would have happened was that it would create the analyzer, create the test, it probably would have fixed up a few spots, but this is something that applies to the whole codebase. And like when you're looking at a diff of what it's changing, it might show you a bunch of spots and you're like, "Oh yeah, like it got it good." But like that's not sufficient for reviewing because like in my case, I had to go poke around in the code base manually and go, "Oh, like it missed these two variations." Like, "Oops, um, got to make sure those are covered." And then even when it said it covered them, I'm like, I'm going to go play around with the code even more and see like, you know, did it actually catch these?

Because it was confidently telling me that it wrote the analyzers, the test pass and stuff. And they did, but like the tests are not representing the code that it needs to run an analyzer on. So anyway, just a really I think it's a cool feedback loop, but this is just an example where like this one I think is a little bit too complicated for it to get. I think we'll get there on it cuz it got the one variation and the other ones are honestly they're very similar. So, it's probably trying to reuse too much code that's like you've probably done this yourself with code where you refactor something to be reusable and like then you change one part for one one of your scenarios and it breaks the other scenario and then you like kind of like a pendulum swing. You fix it for the one scenario and it breaks it back for the other scenario.

So, I think that's just what's happening right now. But overall, I'm I've been pretty happy with this kind of flow. I think I have 40 analyzers now. All of them vibe coded this way. And um it's been helpful. I find that it's at least it's not perfect cuz I already told you the example earlier where it's kind of screwing off and like suppressing the analyzer rules, but I think for the most part it's uh it's been helpful. And you can actually put a a descriptive message in when the analyzer um is breaking on a line. And I think if you think about that like a prompt, right? Like this is a prompt for that an LLM's going to see when the build breaks. So give it something descriptive and helpful that it it can make a informed decision about like what the right pattern is. Let me in.

Jeez. Um, okay. So, closing thought on this though, I I wanted to touch on this, not like the specifics of code necessarily, but this is continuing to highlight that like this is all the same kind of problems you have with people, right? So, I I've been talking about analyzer rules to like help enforce patterns in a codebase, right? That's not going to stop people from making the wrong decisions just like it's not stopping an LLM because a person can do the same thing. They can suppress the messages, right? Analyzer is triggered. It's like, nope, I'm going to suppress this. Like, I'm ignoring the rules. Push it up for for code review. And then what? Right? person who's on the team is going to go, "Hey, like what the hell are you doing?" Those rules are there for a reason. Now, it might be a valid reason where you're like, "This is a a special scenario.

We do need to ignore them and that might be valid." But point is, people can break the rules the same way that the LLM's can, right? And I think if we think about all these guard rails and guidance that we can give to LLMs, it's the same with people. Tried to give someone an example recently where it's like if you think about prompting an LLM and you give it like vague instructions it might go build what you want, right? If you were to give a person vague instructions, they might also build what you want. But like you actually can't be disappointed with someone if you give them vague instructions and they build you something that they think is what you want, but it's not what you want, right? You can't get mad at someone for that. You might be like, "Oh crap, that sucks.

Like, it's not what I want, but you can't be mad at them." You would go, "Oh crap, I wasn't specific enough about that." So, yes, you did build me a um I don't know like a products website. You did do that. Great. Excellent. But like I wanted it in C and you gave it to me in React, right? I like I can't be disappointed in you for that. It's still technically correct. So that can happen like you can be you know LM gets stuck on something and doesn't deliver the result at all like people can also do that right you can leave documentation at instructions and stuff and then like people can follow it for the most part and miss some of the documentation.

You can say follow the patterns in the codebase and people can try and still go you know you review it and you're like hey that's that's a pattern in the codebase but that's the the legacy one that we haven't deprecated and like so you copied the wrong pattern right there's there's all these opportunities that like that happen with people too so I think that like that's been one of my realizations and I I keep trying to remind myself that like when I get frustrated by AI doing dumb dumb I'm like, people do dumb too. And like, it's very easy for me to sit at my computer and be like furious with AI because it's like I've talked about this in many videos now, but it's like seemingly doing these really smart things where I'm like, "Oh, this is awesome." And then it's just like doing

stupid and you're like, "How can you be so like seemingly so smart and so stupid?" And it's not a person, so it's very easy to like be angry at it. But like with people, if I had a, you know, developer on my team and they did something where I'm like, oh, like that's I don't think I feel this way about people, but they make the same like, you know, air quotes like stupid decision. I wouldn't, first of all, I wouldn't think, oh, you're stupid. That's not even a natural thing that that comes up because I work with people a lot and my intuition would tell me like people people aren't like people aren't just like purposefully stupid. So they probably missed a detail or they didn't know something existed or they read the wrong thing or they got, you know, wrong information. Like they're just doing their best.

So, I would be already trying to understand like, oh, like where did where did we go off track? Like, how do we make that better? That's my first instinct with people, but it's not my first instinct with AI. My first instinct with AI is like this model is the stupidest I've ever touched. Let me switch to a different one to be frustrated as well. But like the reality is it's like I should be having the same kind of outlook where it's it's like hey it's it's just doing like you know it's just try I don't want to say it's doing its best like it doesn't it's not a person but um it's kind of doing based on the guidance you've given it. So, if it's doing dumb maybe it's a combination of like prompt, better probe, but um you know, putting more guard rails and stuff in place, being more descriptive.

A lot of the time when I if I reflect on when AI is doing stuff that seems dumb, a lot of the time it's because I wasn't being specific enough and I'm I'm disappointed in the result of not being specific enough. There's certainly other times where I'm like, "Hey man, like I don't know how many times I can put this into your various forms of context here, but like this is just not right." And like, you know, we're still early. We are definitely still early. But that is a frustration. Nice. We got prime parking. This is wicked. So, I think that's it for this video. So hopefully that was kind of interesting. But like I said, I wanted to do a bit of a an update on some of this stuff because it's been very top of mind. We use it a lot in our brand goes codebase.

So it's been helpful. But um if I come up with any more interesting advancements, I will share back. And of course, if you're interested in seeing some of these things play out in more detail, head over to Dev Leader on my main YouTube channel. And then, of course, ask, right? If you want tutorials, you want to see examples, just ask. This channel's all about you ask, I try and answer. And if that means a a programming tutorial on my main channel, happy to do it. So, thanks for watching. Leave your questions below. You can go to codecame.com, submit questions anonymously. can't speak an anonymously and uh otherwise just uh send me a message on social media. I'm happy to help. I will see you in the next one. Take care.

Frequently Asked Questions

These Q&A summaries are AI-generated from the video transcript and may not reflect my exact wording. Watch the video for the full context.

How do you use analyzers to enforce coding patterns when using AI code generation?
I write custom analyzer rules in C# using Roslyn to enforce specific coding patterns in my codebase. When AI generates code that doesn't meet these rules, the build fails, prompting the AI to fix the issues. This helps catch unwanted patterns like unnecessary comments or missing assert messages, providing guard rails for AI-generated code.
What challenges have you encountered when using AI to write code that follows your preferred result pattern?
AI sometimes ignores my preferred result pattern, leading to inconsistent code where some parts follow the pattern and others don't. Even with instructions and examples, AI can produce code that throws exceptions instead of using my custom result type. This inconsistency makes the codebase convoluted, so I use analyzers to help enforce the pattern more strictly.
How do you compare AI's mistakes in coding to human developers' mistakes?
I realize that AI making dumb decisions is similar to humans making mistakes due to missed details or misunderstandings. While I might get frustrated with AI, I try to remind myself that people also make errors and usually do their best. The key is to provide clearer instructions and guard rails, whether for AI or people, to improve outcomes rather than blaming the intelligence behind the errors.