INVOKE THIS SKILL when optimizing, improving, or debugging LLM prompts using production trace data, evaluations, and annotations. Covers extracting prompts from spans, gathering performance signals, and running a data-driven optimization loop using the ax CLI.
LLM applications emit spans following OpenInference semantic conventions. Prompts are stored in different span attributes depending on the span kind and instrumentation:
| Column | What it contains | When to use |
|---|---|---|
| `attributes.llm.input_messages` | Structured chat messages (system, user, assistant, tool) in role-based format | Primary source for chat-based LLM prompts |
| `attributes.llm.input_messages.roles` | Array of roles: system, user, assistant, tool | Extract individual message roles |
| `attributes.llm.input_messages.contents` | Array of message content strings | Extract message text |
| `attributes.input.value` | Serialized prompt or user question (generic, all span kinds) | Fallback when structured messages are not available |
| `attributes.llm.prompt_template.template` | Template with `{variable}` placeholders (e.g., `"Answer {question} using {context}"`) | When the app uses prompt templates |
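The fallback order in the table above can be sketched as a small resolver. This is a minimal sketch, not part of any library: `extract_prompt` and the flat-dict span shape are hypothetical, and the message key names (`message.role`, `message.content`) are an assumption about how a given exporter flattens structured messages.

```python
# Sketch: resolve a prompt from an exported span's attributes, trying the
# columns from the table above in priority order. The flat-dict keys and
# the per-message key names are assumptions about one possible export shape.

def extract_prompt(attrs: dict):
    # 1. Structured chat messages (primary source for chat-based LLM spans)
    messages = attrs.get("attributes.llm.input_messages")
    if messages:
        return "\n".join(
            f"{m.get('message.role', '?')}: {m.get('message.content', '')}"
            for m in messages
        )
    # 2. Generic serialized input (works for all span kinds)
    value = attrs.get("attributes.input.value")
    if value:
        return value
    # 3. Prompt template with {variable} placeholders, if the app uses one
    return attrs.get("attributes.llm.prompt_template.template")

span = {
    "attributes.llm.input_messages": [
        {"message.role": "system", "message.content": "You are a helpful assistant."},
        {"message.role": "user", "message.content": "Summarize this ticket."},
    ],
}
print(extract_prompt(span))
```

Trying the structured column first preserves role boundaries, which matters when the optimization loop needs to rewrite only the system message.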
Install the skill with:

```shell
npx skills add github/awesome-copilot --skill arize-prompt-optimization
```