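Before starting the training service, the Agent must ask the user the following questions, in this order:
The following containers were detected:
[1] jins (running)
[2] Create a new container
Select a container (1/2):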
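The container prompt above can be sketched as a small shell helper. This is only an illustration: `print_container_menu` is a hypothetical name, and the source does not say how the container list is obtained (on a Docker host, `docker ps --format '{{.Names}}'` would be one way to get the names).

```shell
# Hypothetical helper: print a numbered menu, one entry per container name
# passed as an argument. The last entry always offers a new container.
print_container_menu() {
  local i=1 name
  echo "Containers detected:"
  for name in "$@"; do
    echo "[$i] $name (running)"
    i=$((i + 1))
  done
  echo "[$i] Create a new container"
}

# Example with the single container name used in this document.
print_container_menu jins
```

Passing the names as arguments keeps the menu logic independent of whichever container runtime is in use.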
Provide the proxy configuration (reply "none" if no proxy is needed):
http_proxy: ?
https_proxy: ?
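One way the proxy answers might be applied is sketched below. `apply_proxy` is a hypothetical helper, not part of the skill's scripts, and the proxy URL is a placeholder. An answer of "none" (or "无", as in the original prompt) leaves the variable unset.

```shell
# Hypothetical helper: export a proxy variable from the user's answer.
# $1 = variable name (http_proxy or https_proxy), $2 = the user's answer.
apply_proxy() {
  case "$2" in
    ""|none|无) unset "$1" ;;        # no proxy requested: make sure it is unset
    *)          export "$1=$2" ;;    # otherwise export the given URL
  esac
}

apply_proxy http_proxy "http://proxy.example.com:3128"   # placeholder URL
apply_proxy https_proxy "none"
```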
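Do you have any custom feature requirements? (Reply "default" to use the default configuration.)
Performance features (all enabled by default):
- flash_attn: Flash Attention acceleration
- dynamic_batch: dynamic batch size
- remove_padding: remove-padding optimization
- gradient_checkpointing: gradient checkpointing
Memory features (disabled by default, enabled automatically on OOM):
- offload: parameter/optimizer offload
- recompute: recomputation
Optional features: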
- prefix_cache: Prefix Cache
- chunked_prefill: Chunked Prefill
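The default policy described above (performance features on, memory features switched on only after an OOM, optional features off unless requested) can be sketched as follows. `enabled_features` is a hypothetical function. The feature names come from the list above, but how the skill's scripts actually consume them is not stated in the source.

```shell
# Hypothetical sketch: print the features that would be enabled by default.
# $1 = "yes" if an out-of-memory error was observed, anything else otherwise.
enabled_features() {
  local features="flash_attn dynamic_batch remove_padding gradient_checkpointing"
  if [ "$1" = "yes" ]; then
    # Memory features are added only as a fallback after an OOM.
    features="$features offload recompute"
  fi
  echo "$features"
}

enabled_features no
enabled_features yes
```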
SwanLab monitoring configuration:
Host: ?
API Key: ?
# Option 1: quick-start script (recommended)
CONTAINER=jins TRAIN_STEPS=100 \
SWANLAB_HOST=http://10.143.2.129:8000 \
SWANLAB_API_KEY=your-key \
bash scripts/quick_start.sh
# Option 2: step-by-step execution
# 2.1 Pre-training check
bash scripts/preflight_check.sh
# 2.2 Start single-async DAPO training (uses the default features)
export TRAIN_STEPS=4
bash scripts/run_dapo.sh
# 2.3 Start One-Step-Off-Policy training
export TRAIN_STEPS=4
bash scripts/run_one_step_off_policy.sh
One-Step-Off-Policy is a resource-isolated single-async training mode in which the Trainer and Rollout use separate GPU groups.
# Basic usage
export TRAIN_STEPS=100
export TRAINER_GPUS=4
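Because the Trainer and Rollout use separate GPU groups, the GPUs reserved for the Trainer must leave some for Rollout. The check below is a minimal sketch under that assumption. `rollout_gpus` and the total of 8 GPUs are hypothetical, and the source does not say whether `scripts/run_one_step_off_policy.sh` derives the Rollout group this way.

```shell
# Hypothetical check: the Rollout group gets the GPUs the Trainer does not use.
# $1 = total GPUs on the node, $2 = GPUs reserved for the Trainer.
rollout_gpus() {
  if [ "$2" -ge "$1" ]; then
    echo "error: no GPUs left for Rollout" >&2
    return 1
  fi
  echo $(( $1 - $2 ))
}

# With an assumed 8-GPU node and TRAINER_GPUS=4, four GPUs remain for Rollout.
rollout_gpus 8 4
```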
npx skills add ascend-ai-coding/awesome-ascend-skills --skill verl-async-dapo
How clear and easy to understand the SKILL.md instructions are, rated from 1 to 5.
Mostly clear, but there are still a few confusing or poorly structured parts.
How directly an agent can act on the SKILL.md instructions, rated from 1 to 5.
Mostly actionable with clear steps; only a few small gaps remain.