Exploring Agentic Skills And The Art of Not Repeating Yourself


Let's talk about building agentic skills! This is a recap regarding some of the things I was focused on exploring with agents, and in particular, focusing on how to build skills so that I don't need to keep repeating myself.

📄 Auto-Generated Transcript

Transcript is auto-generated and may contain errors.

It is Thursday. We're going to talk about AI, because that's all we do now, right? Um, I want to talk about building skills and a little sort of manual loop I'm trying out, and of course, anything manual I would like to not be manual, and to improve that process over time. So... whoa, there's a siren. I don't see... is he going the other way? Yeah. Nice. We're safe. Good start to the video. Um, so yeah, I've been trying to play around with skills a little bit more. Um, and if you're not familiar with what skills are... I guess depending on when you watch this, maybe this is the most mainstream thing, or it falls off, or whatever. Who knows? Um, skills with AI agents. My god, this car with the beeping, man. Um, there's nothing in front of me. Skills are... what's a good way to describe skills?

Um, we have so many things that are similar-ish, with different names. Like, you can have a custom agent, and, well, what's a custom agent? That sounds fancy. Well, it's a markdown file that has instructions. Well, wait, don't we have an instructions file? Yeah, but this one's a custom agent. Okay. Well, what's a skill? Well, it's... a markdown file. Conceptually, though, the way that I try to talk about it is that a skill is kind of like a specific action or scenario or workflow that you could have an agent go carry out. And it is confusing and weird, and there's overlap, because, for example, you could build a custom agent, which is just a markdown file, and it could be like your optimization agent, and you give it a whole bunch of different data or prompts or instructions around how to optimize, and so on.

So a lot of the context that it starts to preload with is really around optimizing, and maybe you work in C#, so you're kind of mentioning C# tooling and debugging steps and optimization techniques, or it's a different language, so you're loading it with some ideas around that. So you've got an agent that you talk to that does optimization. Okay. So then why do you need a skill? Well, a skill is something that another agent can invoke. It can accomplish a similar thing, I suppose, but I like thinking about it more like building blocks. It's a smaller unit of things that you can compose with. Um, so literally, I'm starting to think about it just like building software at this point.

Um, yeah, you could go have, you know, duplicate logic everywhere for different agents that are interested in doing optimization work, right? Or you could have an optimization skill, and then any agent that needs to perform an optimization step has a skill that it can go leverage. So it's almost like a dedicated subprompt. But the neat thing about skills is that there's a folder that you have for your skill with a markdown file. There's some standard to it. Um, does every company have the same standard for their skills? Who knows. It's so frustrating with everyone kind of doing their own thing, but I think there's an open standard for it, so ideally we all converge to that. And so you can have your markdown file, and then you can sometimes have your associated scripts. So for me, maybe there's a PowerShell script or something like that that's going to carry out some work.
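Just as a rough sketch of what that folder-plus-markdown layout can look like, following the convention of a skill folder containing a SKILL.md with some metadata at the top (the names and paths here are hypothetical placeholders, not my actual skill):

```markdown
<!-- .claude/skills/generate-image/SKILL.md — hypothetical example layout -->
---
name: generate-image
description: Generate an image from a text prompt and save it locally.
---

# Generate Image

1. Read the connection info from an environment variable (never hardcode it).
2. Run scripts/generate.ps1 with the prompt and the output file path.
3. Report the saved file path back to the user.
```

The metadata block at the top is what lets an agent discover the skill and decide when to use it; the body is the instructions it follows once invoked.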

And I think I mentioned this in a video the other day. An example of a skill that I have is when I'm talking with, like, Copilot or Claude, it can go invoke a skill if I ask it to go generate an image. And interestingly, if I were in ChatGPT, I can just type it in the chat, and because it's multimodal, it will go make an image for me. Um, but if I were to just say that to Copilot on the CLI... and I don't know if I've tried it recently with Claude, maybe they have some of this stuff built in by default, but... I have to sneeze. It's coming. Oh, pardon me. We got two. Pardon me again. Okay. Um, maybe some of these other tools have it, even on the command line and stuff, or maybe in Cursor.

I don't know. Maybe they can go do image generation now. But from my experience, that's been a no. So I have a skill that I built... you know, it's a vibe-coded skill, which is where this loop idea is going to come in... but I can say, generate an image for me, and then it will use an environment variable to go use a connection string to Azure and then generate images for me, which is pretty cool. And so that's a dedicated operation to go do, right? It's pretty specific: go make an image for me. I also have another one, because a lot of the times when I'm generating images, I need that image uploaded specifically to Azure. So I have a skill for uploading, and again, it will use an environment variable for, you know, the right auth information and connection information to go upload to Azure for me.
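The environment-variable part is worth sketching, because it's how the skill's script gets secrets without them ever living in the markdown. This is a toy illustration, not my actual script, and the variable names are made-up placeholders:

```python
import os


def load_azure_config():
    """Read the skill's Azure connection info from environment variables,
    so no secrets ever live in the skill's markdown or scripts."""
    # These variable names are placeholders for illustration.
    conn = os.environ.get("AZURE_STORAGE_CONNECTION_STRING")
    container = os.environ.get("AZURE_IMAGE_CONTAINER", "images")
    if not conn:
        # Fail loudly so the agent reports the misconfiguration
        # instead of silently working around it.
        raise RuntimeError(
            "Skill misconfigured: AZURE_STORAGE_CONNECTION_STRING is not set"
        )
    return conn, container
```

The skill's markdown then just tells the agent which variables need to be set, and the script fails fast when they aren't, which matters later when we talk about skills silently failing.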

So now, instead of having a dedicated agent that's like my image creation agent, I have a skill that any agent I work with... this guy goes down to one lane, buddy. Come on. Just like beside me in one lane. Um, not very smart. Uh, now any agent I have can go leverage that skill if I'm talking to it and I ask it to go make an image, because it can go discover the skills, and the skills have some meta information at the top that explains how to use them. So yeah, overlap. Hope that's not super confusing. I'm still kind of navigating this stuff and figuring it out. And like I said, maybe by the time you're watching this, maybe this all doesn't exist, or maybe everyone's using it and it's the common thing. I don't know. We'll see. But so far it's been pretty helpful.

And I'm trying to lean into this more, because I think there are a lot of things, like all of us do, that I find are pretty repetitive. So when things were repetitive, I was making, or maybe I should have been making, custom agents for them. And I had a couple of things like that. Like, I have a dedicated C# developer agent, but I also split out one specifically for architecture. Maybe I should have done a testing one, but in practice, the way that they get used, I'm not manually specifying different agents to go run. I think Claude will actually do a good job, you know, delegating that way. And I think the Copilot CLI does too, but historically, when I've been using GitHub Copilot in the cloud, I just assign it one agent.

We've got to switch lanes here, because this isn't going so well. Um, so it's kind of like the opportunity's been there, but I'm not leaning into it specifically to go make dedicated agents. But skills, they seem to be a good sweet spot where I'm like, okay, I'm doing this thing, I do it a lot, let me go make a skill for it. Um, and I know the image generation one might seem kind of silly, because you're like, dude, we're talking about coding, who cares about image generation? Um, it's true, but it feels like a relevant example just because it's a thing that I need to do that my LLM is not doing for me in the chat.

Um, other skills that I started doing, especially for article revisions and stuff: when I have markdown files that have articles, I can have a skill that does an SEO analysis or a structure analysis. And the skill itself doesn't go fix the things; it does the analysis and reports on the structure. So for SEO, it would go figure out, or I tell it, the topic or the keyword for the article. It will go look for the spots where you should have your keywords, the frequency, and then kind of report back on where SEO optimizations are, or where they could be, for improvements. And just to give you another example, there's one for when I'm trying to draft an article. Okay, well, if I want to be competitive for SEO, what do I need to do?
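The report-only part of that SEO skill could look something like this toy sketch (this is an illustration of the idea, not my actual skill, and it only handles single-word keywords): count keyword frequency and flag headings that are missing it, but deliberately change nothing.

```python
import re


def seo_report(markdown_text, keyword):
    """Report-only SEO check: count keyword frequency and flag headings
    that don't mention the keyword. Never edits the article itself."""
    kw = keyword.lower()
    words = re.findall(r"[\w'-]+", markdown_text.lower())
    frequency = sum(1 for w in words if w == kw)
    # Collect markdown heading text (lines starting with '#').
    headings = [
        line.lstrip("#").strip()
        for line in markdown_text.splitlines()
        if line.startswith("#")
    ]
    missing = [h for h in headings if kw not in h.lower()]
    return {
        "keyword": keyword,
        "frequency": frequency,
        "total_words": len(words),
        "headings_missing_keyword": missing,
    }
```

Keeping the skill to analysis-and-report means any agent can call it, then separately decide what (if anything) to fix.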

Like, what are we talking about for word count, for structure, for the people that are ranking top on the internet? How do I have a competitive article with them? So there's a skill that will go do some of that research. So again, could that be an agent? Yes. Um, but what's nice is that I don't need to pick a dedicated agent. I can talk to any of them, and it will discover a skill that can go do that. Um, okay. So I've been doing this kind of thing because I'm like, if it's something that I regularly do, why don't I wrap it up into a skill? That way, I can ask an LLM to do it. To bring this loop idea into the picture: I've noticed, especially going between tools, because I've been talking about that a lot for the beginning of this year, I'm trying to make sure that I am using Claude more.

I am using Copilot CLI more. I have to try using Codex. So these skills should work across the different agents that support the concept of skills, but there are little differences. So, for example, I was using Copilot CLI, and for whatever reason, the tools that are built in... it can use PowerShell pretty straightforwardly. And if I use Claude, and it was using the skills, I'm looking in the console, in the terminal, and I'm like, it's literally running these things, and there are errors everywhere, and it's still going. And so I'm like, okay, obviously there's something going on here that we need to make better. And so, you know, then I'm talking to Claude about it, or looking at what it's doing, and it's trying to work around these glaring issues that are in the script, or it's ignoring them.

And so then the work that it has to go do is like, well, the skill's not working, so let me just go write my own script and try to, you know, answer what the user's asking for. And so initially I'm like, okay, well, hey man, why are you doing that? Right? And so it explains itself: oh, we're hitting an issue, the parameters aren't escaping properly. So initially I'm like, okay, I just want to make sure these skills work across the agents. So I would tell Claude, in this example: go fix up this PowerShell script or whatever so that you're escaping things, or update the skill so that the callers know it's going to be PowerShell, here's how you get environment variables passed, and so on. So, all goodness, have it working across agents. And then, as I'm using these skills more and more, I start to realize there are scenarios where it's just not working.
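That escaping problem is a classic one: if arguments get interpolated into a command string, quotes and spaces break it. A hedged sketch of the general fix direction (shown in Python rather than PowerShell; the idea is to pass arguments as a list, never as an interpolated string):

```python
import subprocess
import sys


def run_skill_script(command_parts):
    """Run a skill's helper script with arguments passed as a list,
    so values containing spaces or quotes need no manual escaping."""
    return subprocess.run(command_parts, capture_output=True, text=True)


# A value that would break naive string interpolation into a shell command:
tricky_title = 'My "Draft" Article (v2)'
result = run_skill_script(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", tricky_title]
)
```

Documenting this kind of invocation detail in the skill itself is exactly the "update the skill so the callers know" step: the next agent that discovers the skill doesn't have to rediscover the escaping problem.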

So the SEO example... no, the structure analysis is a better one. If I'm trying to analyze the structure of an article, it's really funny. I'm like, hey, I have this article written, and I'm talking to an LLM to go optimize the structure, right? And I've noticed, if I use Cursor, for example... Cursor, because it writes fast, is really nice for this kind of thing... it would be chugging away at this thing, and then it scores the structure at the end. And I'm like, you didn't even improve it. You just ran for like 10 minutes, burning all kinds of tokens, and the structure score is still the same. Uh, so obviously step one is: be frustrated, because I'm human and I'm allowed to get frustrated. Step two is: okay, well, what the hell's going on, right?

So I ask, in this case, Cursor: I want you to go back through your, you know, your tool calls and the chat history here, and see why you were basically not making progress, right? I can't remember exactly how I prompted it, but it was something along the lines of: I need you to do the analysis to understand what was going on. So then it does, and it reports back, and then the first thing it says is like, hey, given that, do you want me to go fix the structure of the article? And I'm like, no. I mean, I do, but that's not the next step. What I want to do actually is make the skill better, right? Because I don't want you, next time, to go waste 10 minutes doing nothing productive, just to afterwards go, hey, do you just want me to go do it and figure it out?

Like, no, I want you to fix the skill. So then I said: based on your analysis, what would we do to make this skill better? And this was a pretty obvious one in hindsight, but it was like: hey, this skill that you have, yes, it's trying to score things properly and so on, but what's happening is that it's trying to do the fixups. In some cases, this skill is trying to fix up, not just report. And in other cases, just to give you an example, it's saying there are 10 paragraphs that are too short, which is probably accurate, right? But when it goes to fix up, it's like, I don't know which paragraphs are too short, so it's just picking random paragraphs and going, oh, let's try this.

So that's why it's potentially not making progress, right? If it guessed at the right paragraphs, it gets it right, but it's not being smart. It's just picking paragraphs and trying to do some restructuring. So the suggestion it made was that when it reports back, it should actually say, you know, this paragraph or this list or whatever, this is the thing specifically that needs to be addressed. And so it updated the skill. Then I said, "Cool. Now I want you to use the skill again and optimize the structure of this article." And then it reported back saying that it could do a good job. The tricky part here is, one, I didn't want to lose the context of the conversation in the general sense, but what I probably should have done to verify that the skill was improved is to go use a new conversation when running it. Because I feel like maybe it's cheating a little bit; the context is maybe, I don't know, a little bit tainted, because I had this prior conversation with it about, you know, what's wrong with the structure analysis. So maybe, given that, it's cheating a little bit rather than just relying on the skill.

Um, so anyway, all that to say that I'm trying to use the agent itself to reflect back and make improvements to the skill. The problem that I have right now, at this current moment, is that it's still very manual. So, I notice it's doing something dumb, I have to basically tell it to stop, do the reflection, come up with an analysis, then make proposals for the skill, and then re-evaluate it.
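The fix it proposed maps to something like this sketch (a toy illustration; my 40-word threshold is an arbitrary stand-in): instead of reporting "10 paragraphs are too short," report exactly which paragraphs, so the fixer doesn't have to guess.

```python
def find_short_paragraphs(markdown_text, min_words=40):
    """Report-only structure check: return the index, word count, and an
    excerpt of every paragraph below the word threshold, so the fixer
    knows exactly what to target instead of picking paragraphs at random."""
    paragraphs = [p.strip() for p in markdown_text.split("\n\n") if p.strip()]
    findings = []
    for i, para in enumerate(paragraphs):
        word_count = len(para.split())
        # Skip headings; only flag real body paragraphs.
        if word_count < min_words and not para.startswith("#"):
            findings.append(
                {"paragraph_index": i, "word_count": word_count,
                 "excerpt": para[:60]}
            )
    return findings
```

The excerpt and index are the key part: they turn a vague aggregate score into actionable, targeted findings.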

And I'm sure there are better things than this. I think there are literally eval harnesses and stuff like that. So I need to explore what I can do. Oh man, there's no parking. I need to explore what I can do in that regard. There's literally no parking at the gym. Come on. Um, so that would be more of an automated loop for me. Um, obviously I'm not at hyperscale, doing a million terminals at once, orchestrating a world's worth of agents. But the more things I can have be automatic like that, the better. Man, I can't see any traffic coming up. Yeah, there is a car. There's no parking at the gym. Is this a spot even? We'll pretend it is. I think it is. Okay.
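The manual loop I just described (run the skill, score, reflect, patch the skill, re-score) could be sketched as a tiny harness like this. Everything here is hypothetical scaffolding: run_skill, score_output, and propose_skill_fix are stand-ins for real agent and LLM calls.

```python
def improve_skill(run_skill, score_output, propose_skill_fix,
                  skill, task, max_rounds=3):
    """Toy self-improvement loop: run the skill, score the result, and if
    the score doesn't improve, ask for an analysis-driven revision of the
    skill itself (not the output). All callables are stand-ins for real
    agent/LLM calls."""
    best_score = score_output(run_skill(skill, task))
    for _ in range(max_rounds):
        revised = propose_skill_fix(skill, task)   # reflect + patch the skill
        new_score = score_output(run_skill(revised, task))
        if new_score <= best_score:
            break                                  # no progress; stop burning tokens
        skill, best_score = revised, new_score
    return skill, best_score
```

The important design choice, which mirrors what I was doing by hand, is that the revision step targets the skill, not the article, so next time the skill runs cold in a fresh conversation, it should do better on its own.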

Um, so yeah, I'm kind of interested in this skill loop, to figure out what I can do to work on skills but also have them self-improve. Because I've seen too many times now where it uses a skill, and the skill's either not effective and it's trying to use it, which is at least good... it's trying to use the skill... or the skill is broken, and it's just kind of silently failing, and then it's working around it. So it's almost like it runs the skill, good job, but then it's just creating more work in general. So I'm looking to see how I can make that feedback loop, and self-correction in terms of skills, better. The final thing I'll say is that I've noticed, too, as I've been building out a bunch of these skills, it's just like I said around software and composing these things: refactoring and deduplicating.

I'm like, "Hey, we're doing this common thing across these skills." Talking to the LLM: extract the common logic into a new skill. It works. It's pretty cool. So, I'll leave that with you. Just some things I'm working on. See you in the next video.

Frequently Asked Questions

These Q&A summaries are AI-generated from the video transcript and may not reflect my exact wording. Watch the video for the full context.

What are AI agent skills and how do they differ from custom agents?
I think of skills as specific actions or workflows that an AI agent can carry out, acting like building blocks that can be composed together. A custom agent is usually a markdown file with instructions for a broader role, while a skill is a smaller, dedicated unit that other agents can invoke to perform particular tasks.
How do you use skills to improve repetitive tasks with AI agents?
When I notice something I do regularly, I create a skill for it so any AI agent I work with can leverage that skill instead of duplicating logic. For example, I have skills for generating images or uploading files to Azure, which makes these operations reusable across different agents without needing dedicated agents for each task.
How do you handle improving skills when they don't work effectively across different AI agents?
I manually intervene by asking the AI agent to analyze why a skill isn't working properly, then have it propose fixes to the skill's script or logic. For instance, if a skill is not improving article structure as expected, I prompt the agent to reflect on the issues, update the skill accordingly, and then re-run it to verify improvements. This feedback loop is still manual but helps make skills more reliable.