Use this skill when image files (.png, .jpg, .jpeg, .gif, .webp, .bmp) are the primary input and the user needs to understand, extract data from, or analyze image content. Provides a pre-configured caption script (scripts/caption.py) that converts images to text descriptions via a vision model — no API key setup needed. Covers: (1) captioning charts/tables/screenshots/diagrams via scripts/caption.py, (2) parsing caption text into structured DataFrames, (3) re-creating visualizations from extracted data, (4) exporting to Excel/CSV. Trigger when user uploads images and wants: data extraction, table OCR, chart analysis, UI description, or diagram understanding. Do NOT trigger for image editing (resize, crop, filter) or image generation.
Analyze, extract data from, or understand image files (.png, .jpg, .jpeg, .gif, .webp, .bmp). The core workflow:
Run scripts/caption.py to get a text description of the image.
The script converts images to text descriptions via a vision model. Set the VISION_API_KEY and VISION_API_BASE environment variables before running.
# Basic — get text description
python3 scripts/caption.py /mnt/data/image.png
# Custom prompt — guide what to extract
python3 scripts/caption.py /mnt/data/chart.png --prompt "Extract all numeric values as a Markdown table"
# JSON output — includes detected type, usage stats, cache info
python3 scripts/caption.py /mnt/data/image.png --json
# Batch — process all images in a directory
python3 scripts/caption.py /mnt/data/images/ --batch --output /mnt/data/captions.json
# Override model (optional)
python3 scripts/caption.py /mnt/data/image.png --model gemini-3.1-flash-lite-preview
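The invocations above can also be assembled from Python, which helps when captioning is one step in a longer analysis. The sketch below is illustrative only: the helper names (missing_env_vars, build_caption_command, run_caption) are hypothetical and not part of the skill, and it uses only the flags shown in the commands above. It assumes the script writes its result to standard output.

```python
import os
import subprocess

# Variables the skill text says must be set before running the script.
REQUIRED_ENV = ("VISION_API_KEY", "VISION_API_BASE")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]


def build_caption_command(image_path, prompt=None, json_output=False, model=None):
    """Assemble the argv list for scripts/caption.py (hypothetical helper)."""
    cmd = ["python3", "scripts/caption.py", image_path]
    if prompt is not None:
        cmd += ["--prompt", prompt]
    if json_output:
        cmd.append("--json")
    if model is not None:
        cmd += ["--model", model]
    return cmd


def run_caption(image_path, **options):
    """Run the caption script and return whatever it prints to stdout.

    Assumption: the script prints its result to stdout and exits non-zero
    on failure; adjust if the real script behaves differently.
    """
    missing = missing_env_vars()
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    result = subprocess.run(
        build_caption_command(image_path, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the prompt contains spaces or non-ASCII characters.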
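| Option | Description |
|---|---|
| --prompt | Custom prompt that guides what to extract |
| --json | JSON output, including detected type, usage stats, and cache info |
| --batch | Process all images in a directory |
| --output | File to write the captions to (shown with --batch above) |
| --model | Override the vision model (optional) |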
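When the caption is requested as a Markdown table, the later workflow steps (parsing into structured rows and exporting to CSV) can be sketched with the standard library alone. This is a minimal sketch, not the skill's own parser: parse_markdown_table and rows_to_csv are hypothetical names, the sample caption is made up for illustration, and real captions may need more tolerant parsing.

```python
import csv
import io


def parse_markdown_table(text):
    """Parse the first pipe-delimited Markdown table in text into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            if rows:
                break  # the first table has ended
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        # Skip the header separator row, e.g. |---|:---:|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def rows_to_csv(rows):
    """Serialize a list of dicts to CSV text, using the first row's keys as columns."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up caption text standing in for the script's output.
sample_caption = """The chart shows quarterly revenue.

| Quarter | Revenue |
|---|---|
| Q1 | 120 |
| Q2 | 135 |
"""

sample_rows = parse_markdown_table(sample_caption)
print(rows_to_csv(sample_rows))
```

All values stay as strings here; the same list of dicts can be passed to pandas.DataFrame(rows) when a DataFrame and numeric conversion are needed.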
npx skills add OpenSenseNova/SenseNova-Skills --skill sn-da-image-caption