Are AI Skills Really Just A Loose Version Of Methods?


In this video, I talk about some of my AI usage in building out and exploring agentic skills.

📄 Auto-Generated Transcript

Transcript is auto-generated and may contain errors.

Hey folks, I've got a bit of driving to do today, so I'll probably try to get four videos done. This one is just going to be an AI update. I'll talk a bit about what I'm doing and what I'm trying to explore. I've been talking recently in these videos about trying to build out skills and things like that for working in Copilot. And yes, I will keep saying it out loud: I have to switch over to some other tools to make sure I'm trying them out. Codex is still on my list. I will keep saying it; I do want to try it. And I think the Claude thing that just came out, Cowork, is that what it's called? I want to try that too. I know it's not specifically for coding necessarily, but I still want to try it out.

I'll give that a look. But I've been using Copilot CLI a lot for building things. I'll say this again: I'm not against Claude or anything like that. It's just that I've used it before, and I'm sure I'll switch back to it soon to give it another go. Copilot CLI is just working really well for me right now, and I don't have a reason to switch to something else aside from forcing myself to learn and experience other tools. So Copilot CLI has been working pretty well for me, and I've been trying to spend time building skills. If you're not familiar with what skills are from the perspective of working with these agent harnesses, a skill is basically a prompt.

In a nutshell, it's just a prompt. Everything's just a prompt, isn't it? The idea is that these skills have some metadata associated with them, and, as I understand it, some of that metadata gets preloaded into the LLM's context. So when you're having a conversation with it and it's doing its LLM things, it can ask: based on what's being asked of me in the current task, is there a skill, which is to say a prompt, that matches up with this? Then it can make the decision to invoke that skill and follow the prompt to go do some work. They're pretty neat, because the way I've been thinking about them more and more is that they're like function calls in code, except looser and more flexible, which has pros and cons.
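To make that concrete, here's roughly the shape of a skill file. This is a sketch, not from any real skill I ship: I'm following the SKILL.md-with-YAML-frontmatter layout that Anthropic's Agent Skills use, and the name, description, and steps are invented. The frontmatter is the metadata the harness can preload into context.

```markdown
---
name: summarize-article
description: Fetch an article from a URL and produce a short summary.
  Use when the user asks for a summary of a link.
---

# Summarize Article

Input: the article URL. If the user has not provided one, ask for it.

1. Fetch the page at the URL.
2. Extract the main article text, ignoring navigation and ads.
3. Reply with a three-bullet summary and one suggested title.
```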

Just to give you an example: when I think about a function call in code, it's a set of instructions that you go execute, and a function can take parameters and return things. A skill can do the same thing: it can say that it needs the caller to provide things to it. The cool thing about how that ends up working is that when the LLM is using the skill and doesn't have what it needs, it will ask the user. So if the skill needs a URL and the LLM doesn't know which URL you're talking about, it will ask you: "Hey, give me the URL for this." But if you were already talking to the LLM about that URL and you say, "Hey, go use this skill."

You don't have to literally call the function, writing it out in your prompt in the terminal to execute it. The LLM is, I'm going to use the word "smart," and I realize it's not an intelligence thing... try to pass me on the ramp, you get out of here... it has enough context to be able to figure out how to call this thing. So it's really neat in that there's this flexibility where you don't have to be strict with the formatting. But of course, that comes with some potentially weird behavior, because the LLM can make an assumption based on how the prompt reads. Maybe you're talking about three URLs, and based on how the conversation is going, it decides, "I think the user is referring to URL number two in this case."

And maybe that's not what you want at all, but it makes that decision and then executes the skill. So, pros and cons. But I kind of think about them like functions now. Oh, this is going to come on. Thank you. When I think about functions in code, you have the ability for a function to call another function and chain them together, of course. And not only chain them: you can have a function get called, return a value, and then pass that return value into the next function as a parameter. So you can call them sequentially, or you can nest them. You have that ability in code, and you can do similar things with skills.

I'll give you an example. You can have a skill, which is a prompt, where step one is to call another skill and get the return value, then call the next skill and get its return value. It's just prompts chained to other prompts. So not only can you call them sequentially like that, you can also kind of nest them, because skills can refer to other skills. It's pretty neat that way. But I had this situation where I wanted a sort of generic version of a skill to use, but in a particular context I needed to customize it, and it wasn't really enough just to provide parameters. When I was talking to Copilot about this, because I wasn't actually sure how it would operate, I was saying, "Hey, I have this generic skill."

"I have this wrapper around it." So it's almost like a function with another function nested within it. Same concept. And I said, "I basically need to override this behavior that the other skill is doing. Do I have to duplicate it?" Which would suck; I don't really want to maintain two versions of it. "Or is there a way I can override or overload this?" And it said sure, because it's just a prompt: in the wrapper, you can say that when you call this other skill, don't do steps two and three, and do X instead. Pretty neat. Some of you might already know this and it's not that neat to you, but to me it was pretty cool, because with functions you have to get kind of creative, refactoring code, whatever else.

In this case, there was just enough flexibility in natural language. It's kind of like if you were talking to a person and said, "Hey, there's this recipe I want you to follow. Follow this recipe to bake this cake, but when you get to the subsection for making the icing, skip steps two and three and do this instead." That's the metaphor here. Instead of duplicating the whole recipe, you can say, "Dude, it's the same as the chocolate cake, but when you go to make the icing, if you want it fudgier, do this instead." I thought that was really cool.
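As a sketch of what that wrapper-and-override pattern can look like as a skill file (the skill names and step numbers here are invented for illustration, not from a real setup):

```markdown
---
name: write-blog-post-db
description: Write a blog post using data stored in our database.
---

Follow the generic `write-blog-post` skill, with these overrides:

- Skip its steps 2 and 3 (asking the user where the source content lives
  and where the output should go).
- Instead, read the draft notes from the `posts` table, and when you are
  done, write the finished post back to the same row.
```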

So I've been refactoring skills to be generic for reuse across my repositories and such, so I can get better use out of them, and then in some context-specific situations making sure I have that behavior overridden to tailor it. An example for a little more context: there's some stuff I'm doing with blog writing, and documentation writing in general. In one particular situation, the data I'm working with is in a database, so the skill should literally get the information from the database and write back to the database. In some other situations, we don't know where the data is stored; someone has to tell you that specifically, and where it's coming from and where it's going might look completely different.

I have to leave that open and be generic about it. So that's an example of how I'm trying to make things generic while also having specific situations where that behavior is overridden. I've been doing a bunch of that. The other thing is that as I'm building these skills up, they're super helpful, not only as a developer but as anyone doing stuff at a terminal, doing stuff you're trying to repeat. If I have to have the same conversation with an LLM over and over to do something, I make it into a skill. These have been really effective, but at the end of the day, I'm building software for things, so there are situations where I'm still the human sitting at the terminal having the conversation with the LLM.

And yeah, it's great that there are skills to speed some of this stuff up, but I need to hook this thing up to a schedule, or hook it up to a trigger. What I don't want to do is hack together something that opens up a Copilot CLI session and does something. And of course, it doesn't need to look like that, because, and this isn't new, we've had AI chat clients and SDKs in code for sending prompts and getting responses back in a chat for a couple of years now, at least. Codifying the chat experience isn't new. What is more newish is having the LLM in that chat be able to call tools.

It's the equivalent of having an MCP server or a skill available in an agent harness to go do some other thing, to pull data in or execute something, except we need it directly in code, in whatever language or framework you're using. For the last little bit, in C# that's been Semantic Kernel. Semantic Kernel gives you the opportunity to wrap these things up, to have this conversation with an LLM where you can hook up tools, and the tools in this case are just other C# functions. That's really cool, because now you can have code: anything you've ever built before, you can basically expose whatever you want just by having a function decorated a certain way.
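For what "decorated a certain way" means in Semantic Kernel's C# SDK, here's a minimal sketch. Treat it as an assumption-laden example rather than production code: the `NewsTools` class, the headline lookup, and the model id are placeholders I made up, and you'd plug in a real API call and your own credentials.

```csharp
using System;
using System.ComponentModel;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// A plain C# class whose public methods become tools the LLM can call.
public class NewsTools
{
    [KernelFunction, Description("Gets today's top headline for a topic.")]
    public string GetHeadline(string topic) =>
        $"(placeholder) top headline about {topic}"; // imagine a real API call here
}

public static class Program
{
    public static async Task Main()
    {
        var builder = Kernel.CreateBuilder();
        builder.AddOpenAIChatCompletion(
            modelId: "gpt-4o", // placeholder model id
            apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!);
        builder.Plugins.AddFromType<NewsTools>(); // expose the tools above
        var kernel = builder.Build();

        // Let the model decide on its own when to invoke GetHeadline.
        var settings = new OpenAIPromptExecutionSettings
        {
            FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
        };
        var result = await kernel.InvokePromptAsync(
            "What's the news about space?", new KernelArguments(settings));
        Console.WriteLine(result);
    }
}
```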

So that's really neat. But these things are evolving, and I don't know the whole roadmap for this kind of thing because I don't work in this area at all at Microsoft, but Microsoft has recently, at the time I'm recording this, come out with Microsoft Agent Framework. I don't know if it's officially been deemed this way, but it seems like at least the successor to Semantic Kernel, and maybe to LangChain too, I guess; I don't really know. Microsoft Agent Framework seems to have a very similar configuration to Semantic Kernel, so you can hook up tool calls and such in a similar way. What's more interesting, in my opinion, is that it seems more first-class for building agent orchestration. The reason this is important to me is that when I'm building out these skills, I've been talking about them like function calls.

Going back to this idea of function calls in code, there are times where you have a function, and it calls something else, and you want to run that stuff in parallel, or kick off, say, three asynchronous tasks and get the results back so you can go to the next step. These are basically orchestrated pipelines that you're running in code. And we can do the same thing with skills. I don't know the exact syntax for a skill in Claude, but I know it's very doable, and you can tell Copilot to go execute whatever and use sub-agents across however many of these things. So that's all there, and you can do it. But when you want to go codify it, it's not really that obvious in Semantic Kernel how to do that.
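The fan-out-then-continue shape I mean is the same one plain C# gives you with tasks; the steps here are stand-ins for real work like tool calls or sub-agent runs:

```csharp
using System;
using System.Threading.Tasks;

public static class Pipeline
{
    // Stand-in for real work: a tool call, an HTTP request, a sub-agent run, etc.
    private static async Task<string> StepAsync(string name)
    {
        await Task.Delay(10); // simulate some work
        return $"result of {name}";
    }

    public static async Task Main()
    {
        // Kick off three independent steps in parallel...
        Task<string> a = StepAsync("a");
        Task<string> b = StepAsync("b");
        Task<string> c = StepAsync("c");

        // ...then wait for all of them before the next stage runs.
        string[] results = await Task.WhenAll(a, b, c);
        Console.WriteLine(string.Join(", ", results)); // all three results in hand
    }
}
```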

But Microsoft Agent Framework really is set up nicely for orchestrating across multiple agents, which is really cool. So I've been trying to explore that more and more, and I'm still very early in that process. It's not something where I'm ready to say, "Okay, I'm an expert on this, let me tell you all about it." I'm just sharing the thing I'm experimenting with and trying to learn about. Right now, that process for me is basically taking these skills I've written, and in situations where I want to be able to use a set of skills more programmatically, not just for reuse in my development flow but literally programmatically, working with Copilot to convert them over to Microsoft Agent Framework. And then, because I'm a C# developer, I have C# code that can essentially do the same things as my skills.
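Here's roughly what a converted pair of skills might look like as agents. This is a loose sketch based on my reading of the preview Microsoft.Agents.AI packages; the type and method names, the agent instructions, and the chat-client setup are all assumptions that may not match the current release, so check the official docs before leaning on any of it.

```csharp
using System;
using Microsoft.Agents.AI;
using Microsoft.Agents.AI.Workflows;
using Microsoft.Extensions.AI;

public static class SkillsAsAgents
{
    // Placeholder: build an IChatClient however you normally would
    // (OpenAI, Azure OpenAI, etc.).
    private static IChatClient GetChatClient() =>
        throw new NotImplementedException("plug in a real chat client here");

    public static void Main()
    {
        IChatClient chatClient = GetChatClient();

        // Each agent is roughly one converted "skill": instructions plus tools.
        AIAgent writer = chatClient.CreateAIAgent(
            instructions: "Draft a blog post from the notes you are given.",
            name: "Writer");

        AIAgent editor = chatClient.CreateAIAgent(
            instructions: "Tighten the draft and fix errors.",
            name: "Editor");

        // Chain them the way a wrapper skill chains skills at the prompt level.
        var workflow = AgentWorkflowBuilder.BuildSequential(writer, editor);
    }
}
```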

The tricky part is that then you have this agent orchestration hooked up in code, but it still needs the tools. And it's funny, because working in these agent harnesses like Copilot CLI or Claude Code, you don't realize how spoiled we are by what they can do by default, pulling in data, accessing stuff on your file system, whatever else, until you go to wrap it all up in code. By default, all you have is a chat context with the LLM: whatever the LLM has been trained on. So if you need to know something current, like the news, it can't help by default, because you haven't given it any tools to do so.

But if you're using Copilot CLI or Claude Code, it already has stuff it can access: it can do web search, find files on your computer, and so on. So it's not a problem; it's just that we have this luxury in those harnesses, and you don't have the same luxury out of the box with something like Semantic Kernel or Microsoft Agent Framework. So you've got to go build the tools, which is fine. Honestly, it's funny, but that's probably the easier problem. When I think about it conceptually, how do you build a tool that's going to hit an API? Well, we've been doing that for years. Years before we were talking about LLMs, we had so many different REST clients we could use, and there are SDKs that businesses and developer platforms put out.

Those are well-understood problems we've been solving for years. I would have thought that putting the orchestration together, having agents working together, would have been significantly more challenging, but my experience so far is that Copilot, at least, is definitely able to take what your skills are doing and break it out into an agent orchestration in code. So I'm exploring that more; it's one of the big things I'm spending a lot more time on. Skills for myself in my development flow, and agent orchestration in Microsoft Agent Framework: honestly, I think those are the big things for right now.

The other aspect to both of these cases, and I talked about this a few videos back and don't want to lose sight of it, is efficiency. In actual code, I feel like it's a lot more in the foreground: you could write some code that's terribly inefficient, and as you're writing it you think, "Oh man, this is going to be pretty nasty, but whatever, it's going to work, I'll optimize it later." For many of us, asking "how do we make this not terribly inefficient?" is almost built into how we solve problems. But I've been noticing with the skills I'm putting together, it's almost just "get it to work, man."

I'm vibe-coding skills with an LLM for it to use: just get it to work. But the problem is that over time you keep adding more functionality, just like code, and now you have situations where the LLM using the skill is getting confused. Now it needs up to ten parameters to use the thing effectively. What instructions is it actually following? Is that even the optimal way to do it? When you're orchestrating these things, are they in a loop doing stupid things? Are you blasting tokens away without even knowing? All sorts of stuff like this. So, in some of the other conversations I recorded on here, I was saying I'm doing some things where I have almost like an analysis orchestration that I kick off afterwards.

I probably need to do this in a more generic way, because the same types of things keep coming up, but essentially it's taking another session, kind of like breaking it across two dedicated agents. With the first, you try the skill or the workflow out, it executes things, and you hopefully get to a result, whether it's a perfect result or a passable one; even if it fails in that situation, you take the entire context of that session. For me, that could be pointing at the session folder for Copilot CLI, or in some cases just highlighting a bunch of the text from the terminal. I take that to another dedicated session and tell it: "I have this skill I want you to look at, and I want you to see how this other agent followed along, and I want you to do an analysis."

What we focus on for the analysis is perhaps where this becomes a little more customized. But the idea is that you have a third party going, "Cool, this is what the skill or the workflow says it's supposed to be doing." You're not going to merge into me, buddy. Think about what you're doing. You have this workflow that's being followed, and the analyzer can see, based on the conversation, the decisions the LLM was making. And as an LLM itself, it can figure out, say, how to make the tool calls more obvious. It could say, "Hey, I noticed it tried invoking this thing three times; it failed the first two because it was guessing at what to do, and the third time it guessed right." So I use this analysis flow to figure out how to optimize.
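The second-session prompt can be as plain as something like this (the skill name and session path are made up for illustration):

```markdown
I ran the `write-blog-post` skill in another session. The full transcript
is in `./sessions/2026-01-12/` (or pasted below, if you can't read files).

Compare what the skill says to do against what the agent actually did:

1. Where did it retry or guess before a tool call succeeded?
2. Which instructions did it skip, misread, or follow in the wrong order?
3. Suggest concrete wording changes to the skill that would fix each issue.
```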

It could also be situations I've had where it tried using one tool because it thought that was the right thing to do, then tried a couple of things, going, "I think I'm supposed to be doing this, but it's not working," and finally got to some other tool as a fallback that ended up working. So it's about giving the skill, the prompt itself, better debuggability. All sorts of things like that for a refinement process. And one of the things I'm continuing to focus on, to make that whole thing more effective, is that instead of just getting the work done with the skill, it's almost like having logging, debugging, or traceability baked into your skill, or whatever orchestration or harness you're using. So that as it's executing, there's enough information for something reviewing it afterwards. Do you know how long certain stages took?

Do you know how many times tool calls failed? Do you know how much confidence it had at different points? If you don't, it's hard for the thing analyzing afterwards to make good decisions. So how do you bake that into what you're doing? I'm trying to do a little more of that, so I have this flow: do the work; great, it did something, I'm happy with it; now how do we make it better for next time? That's some of the stuff I'm thinking about. See you in the next one.

Frequently Asked Questions

These Q&A summaries are AI-generated from the video transcript and may not reflect my exact wording. Watch the video for the full context.

What are 'skills' in the context of agent harnesses, and how do they work with an LLM?
I see skills as basically a prompt with metadata that the LLM can preload into context so it can decide to invoke a skill. I think of them like function calls in code, but looser and more flexible. I can chain and nest them, and if the LLM lacks a required piece of information, it will ask the user.
How can you customize or override a generic skill without duplicating it?
I wrap the generic skill and override behavior in the wrapper so I don't have to duplicate it. I learned that it's just a prompt, so you can say when you call this other skill, don’t do steps 2 and 3; instead do X.
What are you focusing on regarding orchestration and tooling like Microsoft Agent Framework?
I am exploring Microsoft Agent Framework as a successor to Semantic Kernel for agent orchestration. I want to convert my skills from Copilot CLI into Microsoft Agent Framework so I can run code and orchestrate across multiple agents. I think the goal is to coordinate tool calls across agents and build orchestrations inside code rather than relying only on the chat context.