MiniMax-AI/cli
MiniMax-AI/cli is the official MiniMax AI Platform CLI to generate text, images, video, speech, music, plus vision & web search.
What is MiniMax-AI/cli?
MiniMax-AI/cli is the official command-line interface (CLI) for the MiniMax AI Platform. It lets you generate and process multiple media types—text, images, video, speech, and music—directly from an agent workflow, terminal, or automation pipeline.
The CLI is designed to be usable across agent environments (“from any agent or terminal”) and supports both global and CN regions via different API endpoints.
Key Features
- Multi-modal generation in one CLI: Generate text, images, video, speech (TTS), and music from command-line prompts and inputs.
- Text chat with streaming and structured output: Supports multi-turn chat, streaming, system prompts, and JSON output using the mmx text chat command.
- Image generation controls: Create images with aspect ratio settings and batch generation (--n), and save results to an output directory.
- Async video generation with progress tracking: Start video jobs asynchronously (--async) and later download results using task/file identifiers.
- Speech synthesis with voice, speed, and streaming: Generate TTS with 30+ voices, adjust speed, and stream audio output to a media player.
- Music generation features: Produce lyrics-based songs, generate auto lyrics from prompts (--lyrics-optimizer), create instrumental tracks, and generate covers from reference audio.
- Vision and search from the command line: Use mmx vision to describe images and mmx search for web search, including JSON output mode.
- Authentication and region configuration: Log in with an API key and manage region settings (the example sets the region to cn).
How to Use MiniMax-AI/cli
- Install the CLI.
  - For AI agents (OpenClaw, Cursor, Claude Code, etc.): add the skill using npx skills add MiniMax-AI/cli -y -g
  - For terminal use: install globally with npm install -g mmx-cli
- Authenticate with your MiniMax token plan API key: mmx auth login --api-key sk-xxxxx
- Run a media command. For example:
  - Text: mmx text chat --message "What is MiniMax?"
  - Image: mmx image "A cat in a spacesuit"
  - Speech: mmx speech synthesize --text "Hello!" --out hello.mp3
  - Video: mmx video generate --prompt "Ocean waves at sunset"
  - Music: mmx music generate --prompt "Upbeat pop" --lyrics "[verse] La da dee, sunny day"
- Use JSON mode when needed: pipe input (e.g., cat messages.json) into the chat command and request --output json.
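The install, authenticate, and run steps above can be strung together in a small script. The following is a minimal sketch, not an official setup script: the `run` helper, the `quick_start` function, and the `DRY_RUN` and `MMX_API_KEY` variables are names invented for this example, while the `npm` and `mmx` commands are the ones quoted in the steps.

```shell
#!/bin/sh
# Sketch of the quick-start flow. DRY_RUN and MMX_API_KEY are hypothetical
# names used only in this example; they are not part of the CLI.

# Print the command instead of executing it when DRY_RUN=1.
# The printed form joins arguments with spaces, so quoting is not preserved.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Install the CLI, log in, and send one chat message.
quick_start() {
  run npm install -g mmx-cli
  run mmx auth login --api-key "${MMX_API_KEY:-sk-xxxxx}"
  run mmx text chat --message "What is MiniMax?"
}
```

Setting `DRY_RUN=1` before calling `quick_start` lists the three commands without running them, which can be a convenient way to review the sequence before installing anything globally.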
Use Cases
- Agent workflows (coding assistants): Add this CLI as a “skill” to an AI agent so the agent can call commands like mmx text chat, mmx image, or mmx video generate while following agent conventions.
- Terminal-based content creation: Generate images, speech, or music from scripts without building a separate UI (for example, creating assets and saving them to an output path).
- Streaming text responses for interactive work: Use mmx text chat --stream to handle incremental output in terminal sessions when you want to observe responses as they generate.
- Async media pipelines: Start a video generation job with --async, then retrieve and download results later using mmx video task get --task-id ... and mmx video download --file-id ....
- Media transformation and music covers: Generate instrumental tracks or create cover versions from a reference audio file using mmx music cover with --audio-file or --audio.
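The async media pipeline described above splits naturally into a submit step and a collect step. The sketch below assumes the caller already knows the task and file identifiers, because the source does not show how the CLI reports them; `run`, `video_submit`, `video_collect`, and `DRY_RUN` are hypothetical names for this example only.

```shell
#!/bin/sh
# Sketch of a two-phase async video flow. DRY_RUN is a hypothetical switch
# for this example, not a CLI feature.

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Phase 1: submit the job without waiting for the render to finish.
video_submit() {
  run mmx video generate --prompt "$1" --async
}

# Phase 2: look up the task, then download the finished file.
# $1 is the task ID and $2 is the file ID; both are supplied by the caller
# because the output format that carries them is not shown in the source.
video_collect() {
  run mmx video task get --task-id "$1"
  run mmx video download --file-id "$2"
}
```

Keeping the two phases in separate functions lets a scheduler or agent submit many jobs first and collect results later, rather than blocking on each render.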
FAQ
- What media types can the CLI generate? The README lists support for text, images, video, speech (TTS), and music, plus vision (image understanding/description) and web search.
- How do I authenticate? Use mmx auth login --api-key sk-xxxxx. The CLI also provides commands like mmx auth status, mmx auth refresh, and mmx auth logout.
- Can I use streaming output? Yes. Text chat includes a --stream option, and speech synthesis supports a --stream mode (example pipes output to mpv -).
- How do I work with JSON outputs for chat/search? The CLI examples show --output json for commands like text chat (including piping messages from a file/STDIN) and for search.
- Is there support for both Global and CN endpoints? The project notes “Seamless Global (api.minimax.io) and CN (api.minimaxi.com) support,” and includes an example command to set the region to cn (mmx config set --key region --value cn).
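For scripts that need to know which endpoint a region corresponds to, the two hosts quoted above can be captured in a small lookup. This is an illustrative sketch: `region_host` is a hypothetical helper, and the label `global` is an assumption, since the source only shows `cn` as a value passed to `mmx config set --key region --value cn`.

```shell
#!/bin/sh
# Map a region label to the API host named in the project notes.
# "global" as a label is an assumption; only "cn" appears as a config value
# in the source.
region_host() {
  case "$1" in
    global) printf '%s\n' "api.minimax.io" ;;
    cn)     printf '%s\n' "api.minimaxi.com" ;;
    *)
      printf 'unknown region: %s\n' "$1" >&2
      return 1
      ;;
  esac
}
```

For example, `region_host cn` prints the CN host, and an unrecognized label returns a non-zero status so a calling script can stop early.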
Alternatives
- HTTP API clients for the MiniMax Platform: If you prefer direct integration, you can call the platform endpoints from your own scripts instead of using this CLI. This offers more control but requires handling authentication and request logic.
- Other agent “tool/skill” CLIs: Many AI agents support attaching tools/skills; you could use a different tool connector for agent-driven media generation. The difference is how the tool is surfaced to the agent and how commands are invoked.
- Dedicated UI-based media generators: For non-developer workflows, browser-based tools may simplify prompt-to-output interaction. Compared with a CLI, they typically trade automation and scripting flexibility for a guided interface.