range-filtering

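Filters Excel data by multi-dimensional numeric conditions and exports the results, with automatic performance optimization for large datasets.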

Best for Excel data analysts

Works with GitHub

#excel #conditional filtering #optimization #pandas #data cleaning

⌘source

author: @OpenSenseNova
repo: OpenSenseNova/SenseNova-Skills
language: Python

✦overview.md

Key Features

·Multi-condition numerical filtering
·Automatic performance optimization for large data
·Sheet-aware row counting and summary
·Data cleaning: header offsets, merged cells
·Force numeric column conversion

Use Cases

→Filtering Excel rows based on multiple numerical criteria for targeted analysis
→Preprocessing large Excel datasets before import into another system
→Extracting specific records from complex spreadsheets with merged cells or shifted headers
→Auditing data quality by identifying and coercing non-numeric values

Best for

✓Excel data analysts
✓Pipeline preprocessing steps
✓Large dataset optimization

Not ideal for

!Non-numerical filtering criteria
!Real-time or streaming data
!One-off manual operations

FAQs

skills/sn-da-excel-workflow/capability/excel-data-filtering/range-filtering/SKILL.md

name

excel-conditional-filtering-optimization

description

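Filters Excel data by multi-dimensional numeric conditions and exports the results, with automatic performance optimization for large datasets.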

Excel_Conditional_Filtering_Optimization

Note: This sub-skill covers one step of the Excel analysis workflow. For the full pipeline (file reading, row counting, large-file optimization, export), see the parent workflow SKILL.md.

Step 1 Read the data from every worksheet in the Excel file, count the rows in each sheet, and total them to gauge the size of the data.

import pandas as pd

file_path = "input_data.xlsx"

# Read every sheet and count its rows
xls = pd.ExcelFile(file_path)
print("Sheet names:", xls.sheet_names)

total_rows = 0
sheet_details = []
for sheet in xls.sheet_names:
    df_temp = pd.read_excel(xls, sheet_name=sheet)  # reuse the already-opened workbook
    row_count = len(df_temp)
    sheet_details.append({"sheet": sheet, "rows": row_count})
    total_rows += row_count

print(f"Sheet details: {sheet_details}")
print(f"Total rows across all sheets: {total_rows}")
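The row-counting step above can also be wrapped in a small helper, so that the total can feed a later decision about large-file handling. The sketch below is an illustration only: the function name `summarize_sheets` and the `large_threshold` value are hypothetical and do not come from the skill. It takes a dict of DataFrames, such as the one returned by `pd.read_excel(file_path, sheet_name=None)`.

```python
import pandas as pd


def summarize_sheets(sheets, large_threshold=100_000):
    """Count rows per sheet and flag whether the total looks 'large'.

    `sheets` maps sheet name -> DataFrame.
    `large_threshold` is an assumed cut-off, not a value from the skill.
    """
    details = [{"sheet": name, "rows": len(df)} for name, df in sheets.items()]
    total = sum(item["rows"] for item in details)
    return {
        "details": details,
        "total_rows": total,
        "is_large": total >= large_threshold,
    }


if __name__ == "__main__":
    # In-memory stand-ins for two worksheets (illustrative data only).
    demo = {
        "Sheet1": pd.DataFrame({"val_a": [1, 2, 3]}),
        "Sheet2": pd.DataFrame({"val_a": [4, 5]}),
    }
    print(summarize_sheets(demo, large_threshold=4))
```

With a real workbook, `summarize_sheets(pd.read_excel(file_path, sheet_name=None))` would read every sheet in a single call.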

Step 2 Clean the target data: handle header offsets and convert the key columns to a numeric type so that calculations are accurate.

# Read the target sheet
target_sheet = 'Sheet1'
df = pd.read_excel(file_path, sheet_name=target_sheet, header=0)

# Handle a possible sub-header or blank-row offset (example: skip the first row)
# df = df.iloc[1:].reset_index(drop=True)

# Set uniform column names (adjust the placeholders to the actual business logic)
# df.columns = ['col_1', 'col_2', 'col_3', 'target_id', 'val_a', 'val_b', 'val_c']

# Force-convert the numeric columns; non-numeric values become NaN
numeric_cols = ['val_a', 'val_b', 'val_c', 'target_id']

...
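The listing is truncated after `numeric_cols`, so the conversion and filtering code itself is not shown here. As a rough sketch of what such a step could look like, the example below assumes `pd.to_numeric` with `errors="coerce"` for the conversion and inclusive `(low, high)` bounds per column for the filter. The helper name `filter_by_ranges`, the bounds, the sample data, and the output filename are hypothetical and not taken from the skill.

```python
import pandas as pd


def filter_by_ranges(df, ranges):
    """Keep rows whose values fall inside every (low, high) range, inclusive.

    `ranges` maps column name -> (low, high). Values that cannot be parsed
    as numbers become NaN and therefore fail the range test.
    """
    out = df.copy()  # leave the caller's DataFrame untouched
    mask = pd.Series(True, index=out.index)
    for col, (low, high) in ranges.items():
        out[col] = pd.to_numeric(out[col], errors="coerce")
        mask &= out[col].between(low, high)
    return out[mask].reset_index(drop=True)


if __name__ == "__main__":
    # Illustrative data only; column names follow the placeholders above.
    sample = pd.DataFrame(
        {
            "target_id": [1, 2, 3, 4],
            "val_a": ["10", "n/a", 25, 40],
            "val_b": [0.5, 0.7, 0.9, 0.2],
        }
    )
    result = filter_by_ranges(sample, {"val_a": (5, 30), "val_b": (0.4, 1.0)})
    print(result)
    # Hypothetical export step; requires an Excel writer such as openpyxl:
    # result.to_excel("filtered_output.xlsx", index=False)
```

Combining the per-column tests into one boolean mask keeps all conditions in a single pass over the frame, and rows with unparseable values drop out without a separate cleaning step.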

$install


npx skills add OpenSenseNova/SenseNova-Skills --skill range-filtering

Safety assessment

★

Clarity score

How clear and easy to understand the SKILL.md instructions are, rated from 1 to 5.

4/ 5

very good

Clear and well structured, with only minor parts that might need a second read.

◎

Actionability score

How directly an agent can act on the SKILL.md instructions, rated from 1 to 5.

4/ 5

high

Mostly actionable with clear steps; only a few small gaps remain.

~community cookbook

~you might also like


duplicate-value-coloring

★70

testing #excel

[✓] from @OpenSenseNova

Compares specific coefficients across multiple Excel sheets and color-marks the outliers.

April 30, 2026

line-chart-visualization

★70

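performance #data visualization

[✓] from @OpenSenseNova

Extracts structured data, performs feature cleaning and cluster analysis, and generates comprehensive multi-dimensional charts covering trend comparison, distribution characteristics, and parameter sensitivity; suited to trend forecasting and multi-dimensional comparison scenarios.

April 30, 2026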

trend-analysis

★70

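performance #data trends

[✓] from @OpenSenseNova

Performs tiered evaluation and trend forecasting on multi-dimensional data, computes forecast values from differentiated growth rates, and generates comparison charts; suited to performance evaluation, target setting, and similar scenarios.

April 30, 2026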

numeric-format-normalization

★70

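testing #excel

[✓] from @OpenSenseNova

Normalizes and cleans numeric formats in Excel data, supports a Parquet conversion workflow for large datasets, and reconciles totals for key metrics before exporting the result file.

April 30, 2026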

bar-chart-visualization

★70

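testing #excel

[✓] from @OpenSenseNova

Reads multi-sheet Excel files, automatically handles merged cells and data cleaning, computes cross-grouped statistics into a result table with a totals row, and finally draws a styled bar chart with Chinese and English font support; suited to multi-dimensional data summarization and visual analysis.

April 30, 2026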

percentage-calculation

★70

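testing #excel

[✓] from @OpenSenseNova

Switches the large-file handling strategy (Parquet conversion) dynamically based on the file's row count, extracts key metrics by row-by-row scanning or column matching, computes statistics such as percentages and means, and outputs a structured Excel report with charts.

April 30, 2026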



FAQs

Does this skill handle multiple sheets?

What file format is required?

Does it automatically optimize for large files?

Can I customize the filtering logic?

