Agentic coding focus
Built as a coding-focused agentic model on top of Kimi K2.6, with improved support for long-horizon software engineering tasks and end-to-end task completion.
Kimi-K2.7-Code is a coding-focused agentic model from Moonshot AI on Hugging Face with thinking mode, long context, tool use, and official API access.
Kimi-K2.7-Code is a coding-focused agentic model from Moonshot AI available on Hugging Face. It is presented as an update to Kimi-K2.6 with stronger performance on real-world, long-horizon coding tasks and improved token efficiency.
The model summary describes a Mixture-of-Experts architecture with 1T total parameters, 32B activated parameters, a 256K context length, and support for thinking mode, tool calling, and image/video inputs through the official API. The deployment guide says the same architecture as Kimi-K2.5/K2.6 can be reused and provides examples for vLLM, SGLang, and KTransformers.
For teams building software engineering assistants or internal coding workflows, the documentation emphasizes end-to-end task completion, reasoning-focused usage, and deployment on common inference engines. The model also exposes OpenAI/Anthropic-compatible API access through Moonshot AI’s platform.
Built as a coding-focused agentic model on top of Kimi K2.6, with improved support for long-horizon software engineering tasks and end-to-end task completion.
The model page reports roughly 30% lower thinking-token usage than Kimi K2.6, which points to more token-efficient reasoning during coding workflows.
It uses a Mixture-of-Experts architecture with 1T total parameters, 32B activated parameters, 384 experts, and 8 selected experts per token.
The context length is listed as 256K, which supports long codebase interactions and extended task context.
The deployment guide recommends official support for vLLM, SGLang, and KTransformers, and the usage examples show OpenAI/Anthropic-compatible APIs.
The model documentation includes tool calling, thinking-mode reasoning, and image/video input examples in the official API.
Use the model as a coding assistant for multi-step software engineering work that benefits from long context, reasoning, and tool use across a repository or project plan.
Deploy it behind an internal API for teams that want OpenAI- or Anthropic-compatible access to a coding model without changing client-side request patterns.
Run it with vLLM, SGLang, or KTransformers when you need a self-hosted inference setup and want to follow the deployment patterns documented by Moonshot AI.
Use the official API examples to process text prompts together with images or video for workflows that need visual understanding alongside coding-oriented reasoning.
Apply it to persistent agent-style jobs where the model needs to keep working through long-horizon tasks rather than answer a single isolated prompt.
Kimi-K2.7-Code is a coding-focused agentic model on Hugging Face. The deployment guide says the same architecture as Kimi-K2.5/K2.6 can be reused, and example deployments are provided for vLLM, SGLang, and KTransformers.
The model is documented as supporting thinking mode only. The usage notes also say instant mode is not supported, and third-party deployments should keep the reasoning parser set appropriately.
Yes. The usage examples and deployment guide show both text chat and visual inputs, and note that image and video input are supported in the official API.
The model page says you can access the API on platform.moonshot.ai, with OpenAI-compatible and Anthropic-compatible API options.
The source pages do not provide a full public pricing breakdown for this model. The Hugging Face pricing page is linked, but no model-specific price or quota is listed in the collected evidence.
Ghost is a terminal-based AI assistant for chatting, code generation, and CLI tasks. Includes free models, supports Linux, macOS, Windows, and is open source.
Devin is an AI coding agent and software engineer for planning and executing complex software tasks, with desktop, cloud, JetBrains, and CLI access.
imgcook is a design-to-code tool that converts design drafts into front-end code. It supports plugin-based and developer workflows for Sketch, Photoshop, VS Code, and CLI usage.
Pi Coding Agent is a terminal-based coding agent for developers who want a minimal, extensible harness for interactive work and automation. It supports model switching, session branching, and TUI, print/JSON, RPC, and SDK modes.
Assemble by Cohesium AI is an open-source prompt orchestration system for AI coding tools. It generates native config files that turn one project into a structured multi-agent setup across 21 platforms.
Ably Chat is a chat API platform for building custom realtime chat applications. It supports room-based messaging, typing indicators, presence, reactions, and message updates, with usage-based pricing options for different deployment stages.