Use when the user is writing datafusion-python (Apache DataFusion Python bindings) DataFrame or SQL code. Covers imports, data loading, DataFrame operations, expression building, SQL-to-DataFrame mappings, idiomatic patterns, and common pitfalls.
DataFusion is an in-process query engine built on Apache Arrow. It is not a
database -- there is no server, no connection string, and no external
dependencies. You create a SessionContext, point it at data (Parquet, CSV,
JSON, Arrow IPC, Pandas, Polars, or raw Python dicts/lists), and run queries
using either SQL or the DataFrame API described below.
All data flows through Apache Arrow. The canonical Python implementation is
PyArrow (pyarrow.RecordBatch / pyarrow.Table), but any library that
conforms to the Arrow C Data Interface
can interoperate with DataFusion.
| Abstraction | Role | Key import |
|---|---|---|
| `SessionContext` | Entry point. Loads data, runs SQL, produces DataFrames. | `from datafusion import SessionContext` |
| `DataFrame` | Lazy query builder. Each method returns a new `DataFrame`. | Returned by context methods |
| `Expr` | Expression tree node (column ref, literal, function call, ...). | `from datafusion import col, lit` |
To install the skill:

```shell
npx skills add apache/datafusion-python --skill datafusion_python
```