verl-feature-deploy

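One-click launch and configuration of the Verl distributed training service. Trigger scenarios: (1) the user wants to start a Verl training job or deploy an RLHF/DAPO training environment (2) on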

Best for: NPU cluster training teams · Works with: GitHub · Low risk

#verl #distributed-training #npu #rlhf #dapo #ray #swanlab #qwen

⌘source

author: @ascend-ai-coding
repo: ascend-ai-coding/awesome-ascend-skills
language: Python

✦overview.md

Key Features

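· One-click launch of the Verl training container
· Ray cluster and SwanLab monitoring configuration
· Acceleration features configured through a 7-bit binary mask
· Single-node and multi-node modes
· Automatic environment detection with pre-filled values
· A blocking user-confirmation step for safety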
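A 7-bit binary mask can be read as one on/off switch per acceleration feature. This page does not list which feature each bit controls, so the sketch below uses placeholder names (`feature_0` to `feature_6`) and assumes the leftmost digit maps to the first feature. Both choices are assumptions made for illustration, not the skill's actual mapping.

```python
from typing import Dict, List

# Hypothetical names: the source does not say what each bit controls.
FEATURE_NAMES = [f"feature_{i}" for i in range(7)]


def decode_mask(mask: str) -> Dict[str, bool]:
    """Turn a 7-character string of 0s and 1s into a name -> enabled mapping."""
    if len(mask) != 7 or set(mask) - {"0", "1"}:
        raise ValueError(f"expected 7 binary digits, got {mask!r}")
    # Assumption: the leftmost digit corresponds to FEATURE_NAMES[0].
    return {name: bit == "1" for name, bit in zip(FEATURE_NAMES, mask)}


def enabled_features(mask: str) -> List[str]:
    """Return only the names whose bit is set, in mask order."""
    return [name for name, on in decode_mask(mask).items() if on]
```

Rejecting malformed masks up front fits the skill's stated rule that configuration is validated rather than guessed.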

Use Cases

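→ Start DAPO/GRPO training jobs on an NPU cluster
→ Deploy an RLHF training environment
→ Quickly configure and start distributed training for models such as Qwen3-8B
→ Adjust acceleration features flexibly for performance tuning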

Best for

✓ NPU cluster training teams
✓ RLHF/DAPO developers
✓ Distributed training operations

Not ideal for

! CPU training environments
! Debugging small models on a single card

FAQs

external/gitcode-ascend/verl-feature-deploy/SKILL.md

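name: external-gitcode-ascend-verl-feature-deploy
description: One-click launch and configuration of the Verl distributed training service. Trigger scenarios: (1) the user wants to start a Verl training job or deploy an RLHF/DAPO training environment; (2) launching a Verl training container on an NPU cluster; (3) configuring a Ray cluster and SwanLab monitoring; (4) flexibly configuring acceleration features with a 7-bit binary mask. Supports the full DAPO training workflow for Megatron models such as Qwen3-8B.
license: UNKNOWN
original-name: verl-deploy
synced-from: https://gitcode.com/Ascend/agent-skills
synced-date: 2026-04-29
synced-commit: 8faee0275e457955c8f50989aef8972c0838db31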

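Verl Deploy - One-Click Training Service Launch

Launch the Verl distributed training service on an NPU cluster and flexibly configure its acceleration features. RLHF algorithms such as DAPO and GRPO are supported.

Core principles

User configuration takes priority: explicit user input > auto-detected value > default value
Configuration must be validated: check that paths, images, and similar inputs are valid before running; never guess
User confirmation is a blocking point: show the configuration checklist and wait for confirmation before executing
Keep host and container strictly separate: run docker cp on the host; once inside the container, stop using docker-prefixed commands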
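The precedence rule in the first principle (explicit user input, then auto-detected value, then default) can be sketched as a small merge function. The key names and paths in the usage note are hypothetical, and treating `None` and the empty string as "not provided" is an assumption of this sketch, not something the source specifies.

```python
from typing import Any, Dict, Mapping, Tuple


def resolve_config(
    user: Mapping[str, Any],
    detected: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Dict[str, Tuple[Any, str]]:
    """Pick each setting from the highest-priority source that provides it.

    Returns key -> (value, source_name) so a later confirmation step can show
    where every value came from. Keys that no source fills are left out.
    """
    sources = (("user", user), ("detected", detected), ("default", defaults))
    resolved: Dict[str, Tuple[Any, str]] = {}
    for key in set(user) | set(detected) | set(defaults):
        for source_name, source in sources:
            value = source.get(key)
            # Assumption: None and "" both mean "not provided".
            if value is None or value == "":
                continue
            resolved[key] = (value, source_name)
            break
    return resolved
```

For example, with `user={"model_path": "/data/my-model"}` and a detected `model_path` of `/mnt/public/Qwen3-8B`, the user's path is kept and tagged `"user"`.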

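Overall flow

1. Environment pre-check → 2. User interaction → 3. Configuration confirmation → 4. Image preparation + container launch → 5. SwanLab configuration → 6. Generate the two scripts → 7. Copy + execute → 8. Verification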
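Step 3 is the blocking point named in the core principles: nothing after it runs until the user has seen the configuration and agreed. A minimal sketch of such a gate follows. The function names, the prompt wording, and the `[y/N]` convention are assumptions for illustration; the skill itself collects answers through AskUserQuestion.

```python
from typing import Callable, Iterable, Mapping


def format_checklist(config: Mapping[str, object]) -> str:
    """Render the configuration as aligned 'key : value' lines."""
    width = max((len(key) for key in config), default=0)
    return "\n".join(f"{key.ljust(width)} : {value}" for key, value in config.items())


def confirm(config: Mapping[str, object], ask: Callable[[str], str] = input) -> bool:
    """Show the checklist and return True only on an explicit yes."""
    print(format_checklist(config))
    answer = ask("Proceed with this configuration? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}


def run_after_confirmation(
    config: Mapping[str, object],
    later_stages: Iterable[Callable[[Mapping[str, object]], None]],
    ask: Callable[[str], str] = input,
) -> bool:
    """Run the remaining stages in order, but only if the user confirms."""
    if not confirm(config, ask):
        return False  # Blocked: no stage has been executed.
    for stage in later_stages:
        stage(config)
    return True
```

Defaulting to "no" on anything other than an explicit yes keeps an accidental Enter from launching a container.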

Stage 1: Environment pre-check

Run the following bash commands to probe the machine environment automatically:

# NPU information
npu-smi info
npu-smi info -t board 2>/dev/null

# Docker containers and images
docker ps -a | grep verl
docker images | grep -E "verl|ascend"

# Model weights
find /mnt/public /mnt2 /mnt/project -maxdepth 4 -type d -name "Qwen*" 2>/dev/null

# Datasets
find /mnt /mnt2 -name "*.parquet" 2>/dev/null | head -20

# Network interfaces and IPs
hostname -I

Record the results for use in Stage 2.
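To record the probe results in a form Stage 2 can pre-fill from, the commands above could be wrapped as follows. This is a sketch, not the skill's own implementation: the dictionary keys and the 30-second timeout are assumptions. Exit codes are deliberately ignored, since `grep` exits non-zero when nothing matches and `find` does so when one of its search roots is missing, even if it found results elsewhere.

```python
import subprocess
from typing import Callable, Dict, List

# Probe commands copied from the list above; the key names are illustrative.
PROBES: Dict[str, str] = {
    "npu": "npu-smi info",
    "npu_board": "npu-smi info -t board 2>/dev/null",
    "containers": "docker ps -a | grep verl",
    "images": 'docker images | grep -E "verl|ascend"',
    "models": 'find /mnt/public /mnt2 /mnt/project -maxdepth 4 -type d -name "Qwen*" 2>/dev/null',
    "datasets": 'find /mnt /mnt2 -name "*.parquet" 2>/dev/null | head -20',
    "ips": "hostname -I",
}


def run_probe(command: str, timeout: float = 30.0) -> str:
    """Run one probe through the shell (the pipes need it); '' on failure."""
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return proc.stdout


def collect_environment(run: Callable[[str], str] = run_probe) -> Dict[str, List[str]]:
    """Run every probe and store its output as a list of entries."""
    results: Dict[str, List[str]] = {}
    for name, command in PROBES.items():
        output = run(command)
        if name == "ips":
            # hostname -I prints all addresses on one space-separated line.
            results[name] = output.split()
        else:
            results[name] = [line for line in output.splitlines() if line.strip()]
    return results
```

An empty list means "nothing detected", which Stage 2 can treat as "ask the user instead of pre-filling".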

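Stage 2: Collect information from the user

Collect the information in rounds via AskUserQuestion, offering pre-filled values in each round (taken from the Stage 1 detection results).

Round 1: Basic configuration

Training mode: single-node / multi-node
NPU card IDs (pre-filled from detection; the user can change them)
Model path (choose from the detected models, or enter a path manually)
Training dataset path
Test dataset path
Checkpoint save path

Round 2: Image and container configuration

Container handling: use an existing container / create a new container
If creating a new one:
- Image selection: list local verl images / enter manually / pull from quay.io
- Default pull address: quay.io/ascend/verl:verl-8.3.rc1-910b-ubuntu22.04-py3.11-v0.7.0

Round 3: SwanLab configuration

Enable SwanLab logging (yes/no)

If enabled, ask for each item in turn (with default values):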
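The Round 1 answers map naturally onto a small record that is validated before anything runs, in line with the "never guess" principle. In the sketch below the field names, the mode strings `"single-node"` and `"multi-node"`, and the decision to only require a non-empty checkpoint path (on the assumption that the directory may be created later) are all illustrative choices, not taken from the source.

```python
import os
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BasicConfig:
    mode: str              # "single-node" or "multi-node" (labels assumed here)
    npu_ids: List[int]
    model_path: str
    train_data: str
    test_data: str
    checkpoint_dir: str


def validate(cfg: BasicConfig, exists: Callable[[str], bool] = os.path.exists) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    if cfg.mode not in ("single-node", "multi-node"):
        problems.append(f"unknown training mode: {cfg.mode!r}")
    if not cfg.npu_ids:
        problems.append("no NPU card IDs selected")
    elif len(set(cfg.npu_ids)) != len(cfg.npu_ids):
        problems.append("duplicate NPU card IDs")
    for label, path in (
        ("model path", cfg.model_path),
        ("training dataset", cfg.train_data),
        ("test dataset", cfg.test_data),
    ):
        if not exists(path):
            problems.append(f"{label} not found: {path}")
    # Assumption: the checkpoint directory may not exist yet, so only
    # require that a path was given.
    if not cfg.checkpoint_dir:
        problems.append("checkpoint save path is empty")
    return problems
```

Collecting every problem instead of stopping at the first lets the user fix all of them in one pass through the questions.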
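One way to turn the image choice into an action is to compare the chosen reference against what is already on the host and pull only when it is missing. The sketch assumes the local image list comes from `docker images --format '{{.Repository}}:{{.Tag}}'` (one reference per line); the function names are illustrative.

```python
from typing import List, Optional, Tuple

# Default pull address as given in Round 2.
DEFAULT_IMAGE = "quay.io/ascend/verl:verl-8.3.rc1-910b-ubuntu22.04-py3.11-v0.7.0"


def parse_listing(listing: str) -> List[str]:
    """Split 'repository:tag' lines, dropping blanks."""
    return [line.strip() for line in listing.splitlines() if line.strip()]


def verl_candidates(images: List[str]) -> List[str]:
    """Local images to offer the user as choices."""
    return [image for image in images if "verl" in image]


def plan_image(choice: Optional[str], images: List[str]) -> Tuple[str, Optional[List[str]]]:
    """Return the image to use and the pull command, or None if already local."""
    image = choice or DEFAULT_IMAGE
    pull = None if image in images else ["docker", "pull", image]
    return image, pull
```

Returning the command as an argument list rather than running it leaves execution until after the user has confirmed the configuration.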

...

$install

npx skills add ascend-ai-coding/awesome-ascend-skills --skill verl-feature-deploy

Safety assessment

★

Clarity score

How clear and easy to understand the SKILL.md instructions are, rated from 1 to 5.

5/5

excellent

Very clear and well structured, with almost no room for misunderstanding.

◎

Actionability score

How directly an agent can act on the SKILL.md instructions, rated from 1 to 5.

5/5

very high

Highly actionable with clear, concrete steps that an agent can follow directly.

~community cookbook

April 30, 2026

FAQs

Which training modes are supported?

How are acceleration features configured?

Does Docker need to be pre-installed?

Are custom images supported?
