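Count rows across multi-sheet Excel files, preprocess large files by converting them to Parquet, clean the data, run group-by aggregation analysis, and generate a statistics table with style markers plus visualization charts.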
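The row-counting part of this description might look roughly like the sketch below. It assumes the workbook has already been loaded into a dict of DataFrames keyed by sheet name, which is what `pd.read_excel(path, sheet_name=None)` returns. The helper name `count_rows_per_sheet`, the column names `sheet` and `rows`, and the sample frames are all hypothetical and not taken from the skill itself.

```python
import pandas as pd

def count_rows_per_sheet(sheets):
    """Return one row per sheet with its data-row count, plus a 'Total' row."""
    counts = pd.DataFrame({
        "sheet": list(sheets.keys()),
        "rows": [len(frame) for frame in sheets.values()],
    })
    total = pd.DataFrame({"sheet": ["Total"], "rows": [counts["rows"].sum()]})
    return pd.concat([counts, total], ignore_index=True)

# In practice the dict would come from something like:
#   sheets = pd.read_excel("workbook.xlsx", sheet_name=None)  # hypothetical path
# Two small in-memory frames stand in for the sheets here.
sheets = {
    "Sheet1": pd.DataFrame({"a": [1, 2, 3]}),
    "Sheet2": pd.DataFrame({"a": [4, 5]}),
}
row_counts = count_rows_per_sheet(sheets)
print(row_counts)
```

Taking the dict as input keeps the counting logic independent of how the file is read, so the same helper would work after a Parquet round trip as well.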
Step1 Clean and preprocess the data: handle merged cells, apply regex filtering, and map values to categories.
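import re
import pandas as pd

# Assumes `df` is a DataFrame that has already been loaded (e.g. via pd.read_excel).

# 1. Handle merged cells: forward-fill the blanks they leave behind
target_col = 'category_column'
df[target_col] = df[target_col].ffill()

# 2. Regex cleaning: strip invalid characters (anything other than word characters and whitespace)
def clean_text(text):
    if pd.isna(text):
        return text
    return re.sub(r'[^\w\s]', '', str(text)).strip()

df[target_col] = df[target_col].apply(clean_text)

# 3. Category mapping function skeleton
def map_categories(value):
    mapping = {
        'example_key_1': 'Group_A',
        'example_key_2': 'Group_B'
    }
    return mapping.get(value, 'Others')

df['group_tag'] = df[target_col].apply(map_categories)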
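To see the three cleaning steps end to end, here is a small self-contained run on made-up data. The sample values are hypothetical; `None` stands in for the cells that a merged region leaves empty after loading.

```python
import re
import pandas as pd

# Hypothetical sample: None marks cells left empty by a merged region.
df = pd.DataFrame({
    "category_column": ["example_key_1!", None, "example_key_2", None, "unknown#"],
    "value_column": [10, 20, 30, 40, 50],
})

def clean_text(text):
    # Leave missing values untouched; otherwise drop punctuation and trim whitespace.
    if pd.isna(text):
        return text
    return re.sub(r"[^\w\s]", "", str(text)).strip()

def map_categories(value):
    mapping = {"example_key_1": "Group_A", "example_key_2": "Group_B"}
    return mapping.get(value, "Others")

# Forward-fill first so every row has a category, then clean and map.
df["category_column"] = df["category_column"].ffill().apply(clean_text)
df["group_tag"] = df["category_column"].apply(map_categories)
print(df["group_tag"].tolist())
# ['Group_A', 'Group_A', 'Group_B', 'Group_B', 'Others']
```

Because `\w` matches underscores, keys such as `example_key_1` survive the regex unchanged, while the trailing `!` and `#` are removed.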
Step2 Run the grouped statistics: compute frequencies and percentage shares, then append a total row.
group_col = 'group_tag'
value_col = 'value_column'

# Grouped aggregation: count and sum
summary = df.groupby(group_col)[value_col].agg(['count', 'sum']).reset_index()

# Compute each group's share of the overall sum
total_sum = summary['sum'].sum()
summary['percentage'] = (summary['sum'] / total_sum).map(lambda x: f"{x:.2%}")

# Append a total row
total_row = pd.DataFrame({
    group_col: ['Total'],
    'count': [summary['count'].sum()],
    'sum': [total_sum],
    'percentage': ['100.00%']
})
summary_final = pd.concat([summary, total_row], ignore_index=True)
print(summary_final)
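The aggregation above can be wrapped in a function and checked on a tiny input. This is only a sketch: the function name `summarize_groups` and the sample numbers are hypothetical, and the percentage is the share of the summed values, as in the step above, not the share of the row count.

```python
import pandas as pd

def summarize_groups(df, group_col, value_col):
    """Count and sum per group, add a percentage-of-sum column and a Total row."""
    summary = df.groupby(group_col)[value_col].agg(["count", "sum"]).reset_index()
    total_sum = summary["sum"].sum()
    summary["percentage"] = (summary["sum"] / total_sum).map(lambda x: f"{x:.2%}")
    total_row = pd.DataFrame({
        group_col: ["Total"],
        "count": [summary["count"].sum()],
        "sum": [total_sum],
        "percentage": ["100.00%"],
    })
    return pd.concat([summary, total_row], ignore_index=True)

# Hypothetical input with three groups.
sample = pd.DataFrame({
    "group_tag": ["Group_A", "Group_A", "Group_B", "Group_B", "Others"],
    "value_column": [10, 20, 30, 40, 50],
})
result = summarize_groups(sample, "group_tag", "value_column")
print(result)
```

With this input the group sums are 30, 70, and 50 out of 150, so the shares come out as 20.00%, 46.67%, and 33.33%. A total of zero would make the division undefined, so real data may need a guard for that case.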
Step3 Generate a bar chart for visualization, configuring a Chinese font, numeric value labels, and grid styling.
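The page gives no code for this step, so the following is one possible sketch with matplotlib, not the skill's own implementation. The group names, totals, font list, and output path are assumptions; which Chinese-capable fonts are installed varies by machine, and matplotlib falls back to the next entry in the list (with a warning) when one is missing.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical per-group totals; in practice these would come from the Step2 summary.
groups = ["Group_A", "Group_B", "Others"]
totals = [30, 70, 50]

# Candidate fonts that can render Chinese labels (assumed names; availability varies).
plt.rcParams["font.sans-serif"] = ["SimHei", "Microsoft YaHei", "Noto Sans CJK SC", "DejaVu Sans"]
plt.rcParams["axes.unicode_minus"] = False  # keep the minus sign rendering correctly

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(groups, totals, color="#4C72B0")
labels = ax.bar_label(bars, padding=3)  # numeric label above each bar (matplotlib >= 3.4)
ax.set_xlabel("Group")
ax.set_ylabel("Sum")
ax.set_title("Sum by group")
ax.yaxis.grid(True, linestyle="--", alpha=0.5)  # light horizontal grid lines
ax.set_axisbelow(True)  # draw the grid behind the bars
fig.tight_layout()

# Hypothetical output location in the system temp directory.
out_path = os.path.join(tempfile.gettempdir(), "group_summary.png")
fig.savefig(out_path, dpi=150)
```

Drawing the grid only on the y-axis and placing it behind the bars keeps the value labels readable.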
npx skills add OpenSenseNova/SenseNova-Skills --skill group-by-analysis
How clear and easy to understand the SKILL.md instructions are, rated from 1 to 5.
Mostly clear, but there are still a few confusing or poorly structured parts.
How directly an agent can act on the SKILL.md instructions, rated from 1 to 5.
Partially actionable with several concrete steps, but still missing important details.