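Extract and clean numeric values from a column of strings that carry units, then generate a composite multi-panel distribution chart (histogram, pie chart, bar chart, and cumulative distribution plot) that shows the data's central tendency and distribution characteristics.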
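Step1 Extract the target columns from the raw data, remove invalid and missing values, and safely convert the unit-bearing strings to a numeric type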
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Configure fonts that cover both Chinese and Latin glyphs so chart text does not render as garbled boxes
plt.rcParams['font.sans-serif'] = ['SimHei', 'DejaVu Sans']
plt.rcParams['axes.unicode_minus'] = False
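item_col = '项目名称'  # placeholder: category or name column ("item name")
value_col = '带单位的数值'  # placeholder: raw column to extract numbers from ("value with unit")
numeric_col = '提取数值'  # name of the new numeric column ("extracted value")
unit_str = 'g'  # placeholder: unit string to strip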
def extract_numeric_value(val_str):
    """Extract the numeric value from a string that carries a unit."""
    if pd.isna(val_str):
        return None
    try:
        # Remove the unit and convert to float
        return float(str(val_str).replace(unit_str, '').strip())
    except ValueError:
        return None
# Drop missing values and placeholder rows (assumes `df` is an already-loaded DataFrame)
df_clean = df.dropna(subset=[item_col, value_col]).copy()
df_clean = df_clean[df_clean[item_col] != '...']
# Apply the extraction function and drop rows where conversion failed
df_clean[numeric_col] = df_clean[value_col].apply(extract_numeric_value)
df_clean = df_clean.dropna(subset=[numeric_col])
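To make the cleaning step concrete, here is a minimal sketch on a small, made-up DataFrame. It is a vectorized alternative rather than the skill's own method: instead of `apply` with a per-row function, it uses `Series.str.replace` and `pd.to_numeric(..., errors="coerce")`. The sample rows and the helper name `extract_numeric_series` are assumptions for illustration only.

```python
import pandas as pd

item_col = "项目名称"        # same placeholder column names as in Step1
value_col = "带单位的数值"
numeric_col = "提取数值"
unit_str = "g"

# Hypothetical sample data covering the cases Step1 guards against
df = pd.DataFrame({
    item_col: ["A", "B", "...", "D", None, "F"],
    value_col: ["12.5g", " 30 g", "7g", "n/a", "4g", None],
})

def extract_numeric_series(series, unit):
    """Strip a literal unit from every string; unparseable entries become NaN."""
    stripped = series.astype(str).str.replace(unit, "", regex=False).str.strip()
    return pd.to_numeric(stripped, errors="coerce")

# Drop rows with missing item or value, then the '...' placeholder row
df_clean = df.dropna(subset=[item_col, value_col]).copy()
df_clean = df_clean[df_clean[item_col] != "..."]

# Convert and drop rows where conversion failed (here: "n/a")
df_clean[numeric_col] = extract_numeric_series(df_clean[value_col], unit_str)
df_clean = df_clean.dropna(subset=[numeric_col])

print(df_clean)
```

Like the original function, this removes every occurrence of the unit string, so a unit such as `g` would also be stripped out of `kg`. Whether that is acceptable depends on the data.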
Step2 Create the base distribution histogram and add reference lines for the mean and median to show the data's central tendency
plt.figure(figsize=(12, 8))
# Draw the histogram
plt.hist(df_clean[numeric_col], bins=10, alpha=0.7, color='skyblue', edgecolor='black')
# Compute the mean and median and add them as reference lines
mean_val = df_clean[numeric_col].mean()
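The Step2 snippet ends once the mean is computed. As a hedged sketch of how the rest of the step might look, the block below computes the median as well and draws both as vertical reference lines with `axvline`. The sample values, colors, line styles, and labels are assumptions, and the sketch uses matplotlib's object-oriented API with the `Agg` backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical cleaned values standing in for df_clean[numeric_col]
values = pd.Series([5.0, 8.0, 12.0, 12.5, 15.0, 20.0, 30.0, 45.0])

fig, ax = plt.subplots(figsize=(12, 8))
ax.hist(values, bins=10, alpha=0.7, color="skyblue", edgecolor="black")

# Central-tendency statistics
mean_val = values.mean()
median_val = values.median()

# Vertical reference lines, labelled so they appear in the legend
ax.axvline(mean_val, color="red", linestyle="--", label=f"Mean: {mean_val:.2f}")
ax.axvline(median_val, color="green", linestyle="-.", label=f"Median: {median_val:.2f}")

ax.set_xlabel("Value (g)")
ax.set_ylabel("Count")
ax.set_title("Distribution with mean and median")
ax.legend()
```

When the mean sits well to the right of the median, as it does for these sample values, the distribution is right-skewed. Call `fig.savefig(...)` or `plt.show()` to view the result.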
npx skills add OpenSenseNova/SenseNova-Skills --skill data-bar-formatting
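The description at the top mentions a composite figure that combines a histogram, a pie chart, a bar chart, and a cumulative distribution plot, but only the histogram step appears here. The following is one possible layout, not the skill's actual implementation. It assumes a 2x2 grid, a pie chart of item counts per value range, a bar chart of values per item, and an empirical cumulative distribution. The sample data, the English column names `item` and `value`, and the bin edges are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical cleaned data (stand-in for df_clean)
df_clean = pd.DataFrame({
    "item": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "value": [5.0, 8.0, 12.0, 12.5, 15.0, 20.0, 30.0, 45.0],
})
values = df_clean["value"]

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Panel 1: histogram with mean and median reference lines
axes[0, 0].hist(values, bins=10, alpha=0.7, color="skyblue", edgecolor="black")
axes[0, 0].axvline(values.mean(), color="red", linestyle="--", label="Mean")
axes[0, 0].axvline(values.median(), color="green", linestyle="-.", label="Median")
axes[0, 0].set_title("Histogram")
axes[0, 0].legend()

# Panel 2: pie chart of item counts per value range (bin edges are an assumption)
bin_edges = [0, 10, 20, 50]
bin_labels = ["0-10", "10-20", "20-50"]
range_counts = pd.cut(values, bins=bin_edges, labels=bin_labels).value_counts(sort=False)
axes[0, 1].pie(range_counts.to_numpy(), labels=bin_labels, autopct="%1.1f%%")
axes[0, 1].set_title("Share by value range")

# Panel 3: bar chart of values per item, largest first
ordered = df_clean.sort_values("value", ascending=False)
axes[1, 0].bar(ordered["item"], ordered["value"], color="steelblue")
axes[1, 0].set_title("Value per item")

# Panel 4: empirical cumulative distribution
sorted_vals = np.sort(values.to_numpy())
cum_frac = np.arange(1, len(sorted_vals) + 1) / len(sorted_vals)
axes[1, 1].step(sorted_vals, cum_frac, where="post")
axes[1, 1].set_ylim(0, 1.05)
axes[1, 1].set_title("Cumulative distribution")

fig.tight_layout()
```

`pd.cut` uses right-closed intervals by default, so a value of exactly 20 falls into the `10-20` range. Adjust the edges or pass `right=False` if the opposite convention is wanted.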