Burning Through Tokens With Copilot CLI and Claude Code


An update on burning through tokens with Copilot CLI and Claude, as well as getting burned by vibe-coded output.

📄 Auto-Generated Transcript

Transcript is auto-generated and may contain errors.

Hey folks, I'm going to do an AI recap. Um, couldn't really find any good topics on experienced devs, so we'll skip that today. Um, so more recently for me, I've been telling people that I've been using Copilot CLI a lot, and that I need to start switching it up. I still haven't gone to Codex, which is on the top of my list of things to explore, but a more recent forcing function has moved me to at least split between Copilot CLI and going back to Claude Code. That's just because, out of nowhere, and I haven't changed how I've been coding over the past couple of weeks, I'm hitting rate limits and API errors nonstop on Copilot CLI now. And originally I was like, "Oh, I probably have some budget thing that I'm hitting." And it still says I have tons of budget left, and that's a budget I set myself anyway.

So I don't know what's happening. I don't know if, you know, on the Azure side they started rolling out more limits or what, to be honest. But yeah, basically it went from being something that was extremely useful to me to something that just literally does not work, because it's completely out of some kind of budget. And it's always a mix: sometimes the errors are API limits, other times it's saying transient API errors, and I happened to notice that another session I have is hitting API errors or throttling limits. Can't speak. It's going to be one of those days. So I said, okay, well, I'm in the middle of work on a bunch of these projects. I'm not just going to go, okay, let me derail everything and get Codex going.

So, switched back over to Claude for some of these things, and now I'm kind of balancing between Claude and Copilot CLI. So, that's all goodness. You know, they're both extremely similar kinds of interfaces now. And I realize some people that are diehard Claude fans will be screaming about me saying that, but in my experience, they are. And as a .NET developer, I don't know, I find that Claude constantly takes tons of guesses the wrong way at how to go execute things that I would just expect to work the first time. Like, for it to go run a PowerShell script, it decides it wants to write a PowerShell script. And for it to get something trivial right takes like six times before it works. Um, just kind of ridiculous stuff. It ends up working. It's not the end of the world, but at least for the stuff I find myself doing, I don't find Claude to be better.

Uh, I use the Opus and Sonnet models in Copilot CLI or Claude, but, um, certainly not better. But I think the most troubling thing right now is however I've been organizing my instruction files. I have been using an AGENTS.md file, which I thought everything was supposed to be using. Maybe not. I feel like Claude is back in stupid mode. And what I mean specifically is that I have been working with Copilot for weeks in these setups, building up AGENTS.md files or, in some cases, you know, trying to enforce things with pre-commit hooks and whatever else. This varies from repository to repository and project to project, but for the ones I've tried switching over to Claude, it's almost like whatever mechanisms I was using to help enforce and guide are just gone.

Um, things like, just for example, it would write code and then decidedly not write the tests for that code. And okay, that's weird, cuz that's in my AGENTS.md file. It's actually supposed to do TDD in this one repo. So it should have written the tests in the first place. Didn't. Okay. So, no worries. Like, okay, Claude, go write the tests. It's not a big deal. I just want to make sure I have them. Okay. Writes the tests. Move on to the next thing. Then I realize, a couple of things in, that it's not committing. And again, one of the AGENTS.md instructions is to commit between features. So, not doing that. Okay. So now I get to this point where I have three or four features kind of mashed together and I have to tell it to go commit this code individually.

So it does. Push it up to go build, and it fails. Like, the tests don't even compile. Okay, so that's annoying, because why won't the tests compile? Oh, well, when it was writing tests originally, it was writing them in one of the projects and only running those tests. Apparently running those tests. Okay, fix that. Push it up. Fails. So, it fixed the compilation errors, but didn't run the tests. And it's just back into this absolutely stupid mode. And I'm not saying that Claude is bad, for the record, cuz I don't want someone's blood to boil here. I'm not saying Claude is bad. I'm just saying that switching between these setups, I seem to have lost all of the guidance that the LLM agents have been using to not do stupid things. And so that's infuriating for me. So I have to go figure out, um, what it is.
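A quick way to start that investigation is to list which instruction files a repo actually contains. The sketch below is a minimal Python example and not something from the video: the three paths are the commonly used conventions (`AGENTS.md`, `CLAUDE.md` for Claude Code, `.github/copilot-instructions.md` for Copilot), and which of them a given tool really reads is an assumption to confirm against that tool's documentation. The function names are hypothetical.

```python
from pathlib import Path

# Conventional instruction-file locations. Which tool reads which file is an
# assumption here; confirm it in each tool's own documentation.
CANDIDATES = {
    "AGENTS.md": "shared agent instructions",
    "CLAUDE.md": "Claude Code project instructions",
    ".github/copilot-instructions.md": "Copilot repository instructions",
}


def find_instruction_files(repo_root):
    """Return {relative_path: exists} for each candidate instruction file."""
    root = Path(repo_root)
    return {rel: (root / rel).is_file() for rel in CANDIDATES}


def report(repo_root):
    """Build a human-readable found/missing summary for one repository."""
    lines = []
    for rel, present in find_instruction_files(repo_root).items():
        status = "found" if present else "missing"
        lines.append(f"{status:8} {rel} ({CANDIDATES[rel]})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Run from the repository root.
    print(report("."))
```

If only the Copilot file shows up as found, that would match the "I never migrated this repo" explanation.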

Like, maybe I'm just being stupid and in this repo I only had a Copilot instructions file and I hadn't migrated over to an AGENTS.md file. That would explain absolutely everything. I just don't know if that's true or not. So I have to go look that up. So, kind of silly, but that's a frustrating experience, kind of transitioning back to Claude. Um, but it's good to make sure that I have like a against something kind of forcing me into a different tool.

It's really cool to see they added voice mode. I don't know why that's not, um, like, top of every tool's list, to be honest. I talked about this before, but months back I got myself a foot pedal, so I have a dedicated mic and a foot pedal, so that when I'm coding, if I need to get into a mode where I'm not just talking about specific lines of code and copying and pasting stack traces and stuff like that, then just being able to talk is so much more convenient. And so the way that they've done it, I thought it was going to be silly, but I tried it out. You just hold the space bar and talk. I think that's really nice. Um, it's a little clunky. Like, I've had it where it thinks I'm trying to hold down a repeating space before it starts going. And so, just little clunky things. But this is, in my opinion, the side effect of trying to use a terminal. Yeah.

How do you make a terminal a rich user interface? Like, okay, well, we need to detect long presses of keys. It's like, hey, we don't want a UI cuz we want to stay in a terminal, but we want, you know, a rich user experience. Okay. So I think it ended up being pretty nice though, um, to just be able to hold space and talk to it. It's nice, too, cuz some of the transcription tools will wait until you're done everything and then blast the text in. This seems to be doing it in some kind of chunked way. So it's not immediate streaming as soon as every word is done, but it does it in some type of chunks. So, pretty neat. Um, and yeah, I have no reason to move back away from Claude, unless I can't fix this instruction stuff. Then I'm going to lose my mind.

So, that's been more recent on the tooling front, and I will say it again out loud: I need to carve out some time for Codex. So, I will explore that. Um, one of the things I've been playing with more recently is Microsoft Agent Framework. And I finally went to go record a YouTube video of some of the Needler functionality I was building, basically to have additional support for Microsoft Agent Framework. And I just wanted to talk a little bit about, uh, probably what seems obvious to people when you step back from it, but just some vibe coding blunders. And it's not super terrible, or, I don't know, it's not like it couldn't have been caught. It's just this idea of trying to get to output faster. Okay. So, a couple of things. One is that I have this dependency injection assembly scanning library called Needler, and it uses source generation as well.

And so I have this library and I wanted to build Microsoft Agent Framework functionality into it. So, I basically vibe coded things with Copilot to come up with some source-generated shortcuts. So, you can... Oh, man. There's a lot of water on the road. This part of the highway is terrible. Um, you can use attributes and decorate a class that is an agent. And so, you don't really have to go write the code. You just kind of decorate it, put your prompts and other sort of functionality onto it, and it will source generate an agent for you. That kind of stuff. So, I built a few things like this, and then I had that in this reusable library. Great. That's Needler. And then in another spot, I was trying to basically go from having some skills, like the markdown files, to porting that over into a code-first, dedicated Microsoft Agent Framework orchestration.

So these skills, if you were running them in Copilot or in Claude, would tell the LLM, like, do these things. Now at this point we need to do some things in parallel, then after that we need to come together and summarize or whatever, then do some other stuff in parallel, whatever. It's a pipeline, right? There's some orchestration there. And so I told Copilot, like, I have these skills, I want you to go port this stuff over to be orchestrated. And I told it to use Needler, because that's what I set myself up with, right? I said I wanted to start using this from the beginning, and that way I can also see what things I should add or tune, and had it just kind of vibe code porting these skills over to this orchestration. Thought, you know, so far so good on this idea, and of course just wanting to focus on making progress, or what looks like progress.

Uh, I was kind of doing this vibe coding loop. Let's get moved over here. Sorry, one sec. I just want to move these lanes. And it's very rainy. We'll do one more. Okay. And so the way that this looked is that I would go run this orchestration, and I needed to use the LLM for real so that I can see that it's actually doing the right things. And then I was really just trying to optimize it. So I was looking at how much it was failing between stages, having to rerun things, the token usage, the runtime, and, um, really got to this point where, when it was doing some work, I noticed that this orchestration that I had it build would rerun entire stages completely. And so I was like, well, why is that happening?

And then it would tell me, hey, one of the later orchestration steps was validating it and said, like, not valid. Okay. So I'm like, well, it tells what the conditions are, right? Like, what is not valid, and it does it in a JSON structured way? Yes, it does. Okay. So then I realized that basically when it's feeding that back to the other orchestration step, the other orchestration step doesn't know how to surgically fix things, because all it can do in this context is just write text. Um, it basically has to give a full output. Come on, buddy. What's going on here? That's not how it works. Bus. This bus is, like, stopping in the middle of the road. Um... Oh, he was trying to get over to this lane. He just didn't know what light it was at. Interesting. Uh, so the problem is, like, it can't do anything surgically.

It has to write all this text back out. So anytime it needs to validate what was happening, the validator goes, "Hey, that's not right. Here's all the reasons why." And then the other thing goes, "Well, guess we got to try again and hope that we regenerate the whole thing and get it right." And in some cases it was looking for some C# code and it's like, "Oh, there's a syntax error on this line." Okay. So rewrite this entire thing because one line was not right, and it was outrageously wasteful. So, um, I was just experimenting with some different surgical fixes for things. And then I got to a point where I'm like, "Okay, I think it's working how I'd like before I move on to the next wave of optimizations or functionality here. This is feeling pretty good."

Now I can see, from asking Copilot and looking at the data that comes out of it, that it looks like it's not being crazy wasteful anymore. Awesome. So the surgical fix is working. This is good. In rare cases it has to redo. Um, and that's okay, because for the surgical fix we don't have a good tool for it. So then I end up looking at this code. What is this person doing? Are you dumb? Okay, there's two lanes and they made a right turn from the left lane when they were coming towards me. So, I thought they were switching lanes, but they were actually making a right-hand turn across a lane of traffic. Amazing. So, I'm like, "Okay, I got to check this code before we get further." And this is my reminder to you to check the code sooner rather than later.
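To make the "surgical fix" idea concrete: when the validator reports which lines are wrong, only those lines need to be replaced, and the rest of the output can stay untouched. Here is a minimal Python sketch under stated assumptions. `Issue`, `apply_line_fixes`, `repair`, and the `fix_line` callback are hypothetical names for illustration only; none of this is the orchestration code from the video or any Microsoft Agent Framework API.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    line: int      # 1-based line number the validator flagged
    message: str   # why that line is invalid


def apply_line_fixes(text, fixes):
    """Replace only the flagged lines; fixes maps 1-based line number to new text."""
    lines = text.split("\n")
    for line_no, new_line in fixes.items():
        if not 1 <= line_no <= len(lines):
            raise ValueError(f"line {line_no} is out of range")
        lines[line_no - 1] = new_line
    return "\n".join(lines)


def repair(text, issues, fix_line):
    """Ask fix_line (for example, a small LLM call) for one replacement per issue.

    Assumes every issue points at a line that exists in text.
    """
    lines = text.split("\n")
    fixes = {i.line: fix_line(lines[i.line - 1], i.message) for i in issues}
    return apply_line_fixes(text, fixes)


# Usage: one broken line gets patched, the other line is left untouched.
draft = "int x = 1\nint y = 2;"
issues = [Issue(line=1, message="missing semicolon")]
fixed = repair(draft, issues, lambda line, message: line + ";")
print(fixed)
```

The point of the design is that the cost of a repair scales with the number of flagged lines, not with the size of the whole output.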

So all of these optimizations it was trying to do to tune this thing were the absolute worst. And it's, like I said, probably easily predicted. But instead of solving these problems and optimizations in what I would call extensible ways, it just started carving up this code and adding in all of these very specific conditions. So, I was just getting lucky that all of these brittle checks it was putting in place happened to be lining up. And so I feel like it was this facade of, um, it confidently being like, "Oh, we have this good system," and me believing it, um, because it happened to be working in those scenarios. But, man, it was, you know, thousands of lines long. It just kept tacking stuff on here, and every time there was another edge case, it was like, well, what does one do when there's more edge cases?

Well, you add more to the if statements. And yeah, so I ended up with this nightmare of code. Um, and so we've got to go back to the drawing board. Now, a couple of interesting things did come out of this, and I'm going to tie it back to the things I was talking about with Needler. There was a point in mentioning that, and that's that I said, okay, you know, one of the good learnings here is I know I needed to do analytics on the orchestration. I needed to get success rates for tools, all this stuff. I've built it by vibe coding it, but it's not done in a good way. It's kind of scattered everywhere, not really reusable. Let's address that. So, I went back to Needler to go start building it. And then I ended up realizing, when I checked the documentation, that Microsoft did already build a bunch of that.
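For a sense of what reusable "success rates for tools" tracking can look like when it lives in one place instead of being scattered, here is a small Python sketch. It is purely illustrative: `ToolStats` and its methods are hypothetical names, and this is not the Needler implementation or the Microsoft Agent Framework API, whose built-in support should be checked in the official documentation.

```python
from collections import defaultdict


class ToolStats:
    """Counts successes and failures per tool name."""

    def __init__(self):
        self._calls = defaultdict(lambda: {"ok": 0, "failed": 0})

    def record(self, tool_name, succeeded):
        key = "ok" if succeeded else "failed"
        self._calls[tool_name][key] += 1

    def success_rate(self, tool_name):
        """Return the fraction of successful calls, or None if never called."""
        counts = self._calls.get(tool_name)
        if counts is None:
            return None
        return counts["ok"] / (counts["ok"] + counts["failed"])

    def wrap(self, tool_name, func):
        """Return func wrapped so every call is recorded, middleware style."""
        def wrapper(*args, **kwargs):
            try:
                result = func(*args, **kwargs)
            except Exception:
                self.record(tool_name, False)
                raise  # the caller still sees the original error
            self.record(tool_name, True)
            return result
        return wrapper


# Usage: wrap a tool once, then call it as normal.
stats = ToolStats()

def double(value):
    if value < 0:
        raise ValueError("negative input")
    return value * 2

tracked_double = stats.wrap("double", double)
```

Wrapping at the call boundary keeps the tool code itself free of bookkeeping, which is the same reason a middleware hook is attractive.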

So, um, it was helpful because it made me say, "Maybe you should go read the documentation before you dive into any of this stuff, you big dummy." Um, yeah. So, I wasted a ton of time, and I'm going to go back to the drawing board on a bunch of these pieces now that I see what's available. I see that they had some middleware type of, um, tech that I can hook into. So, yeah. Now, having that knowledge, I'm like, "Oh, man. I would have done things a lot more differently." So, um, at least I got that learning. So, that's what I've been up to. Um, feel kind of stupid after going through a bunch of that, but that's okay. So, I'll be taking those learnings and still playing around with Microsoft Agent Framework and looking forward to it. So, thanks for watching. I will see you in the next one.

Take care.

Frequently Asked Questions

These Q&A summaries are AI-generated from the video transcript and may not reflect my exact wording. Watch the video for the full context.

How do you handle switching between Copilot CLI and Claude Code when you hit API rate limits?
I switched back to Claude because Copilot CLI started hitting rate limits and API errors nonstop, even though my budget said I had plenty left. I tried to balance between Claude and Copilot CLI rather than derail my work to switch completely. I don’t think Claude is clearly better in my experience, and both interfaces are quite similar, but the outages pushed me back to Claude.
What issues did you experience with Claude regarding instruction files and guidance?
I felt like Claude was back in stupid mode, and the guidance from my AGENTS.md files and pre-commit hooks that I relied on with Copilot vanished. I found that it would write code but not write the tests, wouldn’t commit between features, and sometimes the tests didn’t compile. I wasn’t sure whether that repo only had a Copilot instructions file that I had never migrated to an AGENTS.md file, so I had to go check.
What did you learn about building orchestration with Needler and Microsoft Agent Framework?
I learned I needed to do analytics on the orchestration and to build reusable components instead of vibing through it. I checked the Microsoft documentation and realized they had already built a lot of that tooling, which made me rethink my approach. I wasted a ton of time and will go back to the drawing board, using the documented middleware and Needler in a more structured way.