Enables voice synthesis, voice cloning, voice design, and audio post-processing using MiniMax Voice API and FFmpeg. Use when converting text to speech, creating custom voices, or processing/merging audio.
Professional text-to-speech skill with emotion detection, voice cloning, and audio processing capabilities powered by MiniMax Voice API and FFmpeg.
| Area | Features |
|---|---|
| TTS | Sync (HTTP/WebSocket), async (long text), streaming |
| Segment-based | Multi-voice, multi-emotion synthesis from segments.json, auto merge |
| Voice | Cloning (10s–5min), design (text prompt), management |
| Audio | Format conversion, merge, normalize, trim, remove silence (FFmpeg) |
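The audio row above maps onto a handful of standard FFmpeg invocations. The following is a minimal sketch of how such commands might be assembled in Python. The function names are hypothetical and are not taken from the skill's `scripts/` package, and the filter settings (loudness target, silence threshold, sample rate) are illustrative assumptions rather than the skill's defaults.

```python
from pathlib import Path


def convert_cmd(src: str, dst: str, sample_rate: int = 32000) -> list[str]:
    """Format conversion: FFmpeg picks the output codec from dst's extension."""
    return ["ffmpeg", "-y", "-i", src, "-ar", str(sample_rate), dst]


def normalize_cmd(src: str, dst: str) -> list[str]:
    """Loudness normalization using FFmpeg's loudnorm filter (assumed targets)."""
    return ["ffmpeg", "-y", "-i", src, "-af", "loudnorm=I=-16:TP=-1.5:LRA=11", dst]


def trim_cmd(src: str, dst: str, start: float, duration: float) -> list[str]:
    """Keep `duration` seconds of audio beginning at `start` seconds."""
    return ["ffmpeg", "-y", "-i", src, "-ss", str(start), "-t", str(duration), dst]


def remove_silence_cmd(src: str, dst: str, threshold_db: int = -50) -> list[str]:
    """Strip leading silence with the silenceremove filter (assumed threshold)."""
    flt = f"silenceremove=start_periods=1:start_threshold={threshold_db}dB"
    return ["ffmpeg", "-y", "-i", src, "-af", flt, dst]


def concat_list(paths: list[str]) -> str:
    """Text of a list file for FFmpeg's concat demuxer, one entry per input."""
    return "".join(f"file '{Path(p).as_posix()}'\n" for p in paths)


def merge_cmd(list_file: str, dst: str) -> list[str]:
    """Merge the files named in list_file without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]
```

Each function only builds an argument list, so nothing runs until the list is passed to something like `subprocess.run(cmd, check=True)` on a machine with FFmpeg installed. Stream-copy merging (`-c copy`) assumes all inputs share the same codec and parameters; otherwise the merge step would need to re-encode.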
```
mmVoice_Maker/
├── SKILL.md                 # This overview
├── mmvoice.py               # CLI tool (recommended for Agents)
├── check_environment.py     # Environment verification
├── requirements.txt
├── scripts/                 # Entry: scripts/__init__.py
│   ├── utils.py             # Config, data classes
│   ├── sync_tts.py          # HTTP/WebSocket TTS
│   ├── async_tts.py         # Long text TTS
│   ├── segment_tts.py       # Segment-based TTS (multi-voice, multi-emotion)
│   ├── voice_clone.py       # Voice cloning
│   ├── voice_design.py      # Voice design
```
npx skills add LeoYeAI/openclaw-master-skills --skill mm-voice-maker