UStack

Mercury Edit 2

Mercury Edit 2 is a diffusion-based LLM for low-latency next-edit prediction in coding, suggesting your next change from recent edits and context.

What is Mercury Edit 2?

Mercury Edit 2 is a purpose-built diffusion LLM (dLLM) for next-edit prediction in software development workflows. It’s designed to help with the most latency-sensitive step in coding assistance: suggesting what you’re likely to change next based on your recent edits and the surrounding codebase context.

The model complements Inception’s existing auto-complete endpoint by focusing specifically on edit suggestions. In practice, you can accept a predicted edit (for example, via tab) when the suggestion fits what you’re working on.

Key Features

  • Next-edit prediction from edit history and code context: Uses “recent edits” plus codebase context to generate a targeted suggestion for what to change next.
  • Diffusion-based token generation in parallel: Generates tokens using diffusion and runs them in parallel to reduce time to first suggestion for low-latency UX.
  • Preference-aligned training using human feedback: Builds a human preference dataset from explicit accept/reject feedback, then applies an unpaired reinforcement learning method (KTO) to align suggestions with human preferences.
  • More selective, less distracting edits: The post reports 48% more accepted edits and 27% greater selectivity in which suggestions are displayed.
  • Benchmark coverage for edit correctness and speed: Quality is evaluated against a set of benchmarks (including open-sourced ones such as Instinct, Fill-in-the-middle (FIM), and Next-edit Prediction (NEP)) plus an internal next-edit benchmark; speed is measured via end-to-end latency on representative requests.
  • Available via the Inception Platform API: You can access Mercury Edit 2 through the Inception API; the post also mentions an API proxy for Zed users.
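The preference-alignment feature above references KTO, an unpaired reinforcement-learning method that works from individual accept/reject labels rather than paired comparisons. The sketch below illustrates the shape of the published KTO objective on a scalar reward margin; it is not Inception's training code, and β, the weights, and the reference point are hypothetical defaults.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def kto_loss(reward_margin: float, desirable: bool,
             beta: float = 0.1, lam_d: float = 1.0, lam_u: float = 1.0) -> float:
    """Sketch of the unpaired KTO objective for one example.

    reward_margin: log(pi(y|x) / pi_ref(y|x)) minus a reference point,
    i.e. how much more the policy prefers this edit than the reference
    model does.
    desirable: True for accepted edits, False for rejected ones.
    """
    if desirable:
        # Accepted edits: loss shrinks as the policy's margin grows.
        return lam_d * (1.0 - sigmoid(beta * reward_margin))
    # Rejected edits: loss shrinks as the margin is pushed down.
    return lam_u * (1.0 - sigmoid(-beta * reward_margin))
```

Because each example carries only its own accept/reject label, this kind of loss fits naturally with the explicit feedback signal the post describes collecting from users.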

How to Use Mercury Edit 2

  • Get access on the Inception Platform: Create an account on the Inception API Platform to start using Mercury Edit 2.
  • Call the model through the API: Use the Inception API to send requests for next-edit predictions (the post references an API workflow, including an API proxy for Zed integration).
  • Integrate into an editor workflow: If you’re embedding it into a development environment, use the model’s next-edit predictions alongside your editor’s acceptance action (e.g., “Just Tab to accept,” as described in the post).
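The API workflow above can be sketched as assembling a request payload from recent edits and surrounding context. The endpoint shape, field names, and model identifier below are assumptions made for illustration, not documented Inception API parameters; consult the platform documentation for the real schema.

```python
import json

def build_next_edit_request(recent_edits, context, cursor_line):
    """Assemble a hypothetical next-edit-prediction request.

    recent_edits: list of (old_text, new_text) pairs the user just made.
    context: surrounding source code for the file being edited.
    cursor_line: where the user's cursor currently sits.
    NOTE: all field names and the model id are illustrative assumptions.
    """
    return {
        "model": "mercury-edit-2",  # hypothetical model id
        "recent_edits": [
            {"before": old, "after": new} for old, new in recent_edits
        ],
        "context": context,
        "cursor_line": cursor_line,
    }

payload = build_next_edit_request(
    recent_edits=[("def add(a, b):", "def add(a: int, b: int):")],
    context="def add(a: int, b: int):\n    return a + b\n",
    cursor_line=2,
)
body = json.dumps(payload)  # would be POSTed with an API key attached
```

The key design point, as the post frames it, is that the request carries an edit *history* (before/after pairs), not just a prefix to complete.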

Use Cases

  • IDE/editor next-change suggestions during active coding: When you make a series of edits, use Mercury Edit 2 to suggest what you’ll likely change next, aiming for low-latency responses.
  • Refactoring help with edit-targeted proposals: Generate suggestions for changes like renaming, refactoring steps, or other structured edits where “next edit” framing fits the workflow.
  • FIM/line completion-style workflows adapted to edits: In contexts where completion alone isn’t enough, use next-edit prediction to propose the edit that follows from your current edit sequence and surrounding code.
  • Feature implementation iteration: As you add functionality, rely on next-edit prediction to suggest subsequent changes (such as follow-up modifications) based on recent edits.
  • Reducing unwanted suggestions via preference alignment: Use the preference-trained behavior to reduce the frequency and length of edits that would otherwise distract you, which the post describes as an explicit training motivation.
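For the editor-integration cases above, a predicted edit ultimately has to be applied to the buffer before the user can accept it. A minimal sketch of that step (the search/replace edit shape is an assumption for illustration, not Inception's actual response format):

```python
def apply_edit(source: str, old: str, new: str) -> str:
    """Apply a single search/replace edit to a source buffer.

    Fails loudly if the target text is missing or ambiguous, since
    silently patching the wrong spot is worse than rejecting the edit.
    """
    count = source.count(old)
    if count == 0:
        raise ValueError("edit target not found in source")
    if count > 1:
        raise ValueError("edit target is ambiguous (multiple matches)")
    return source.replace(old, new, 1)

buffer = "total = 0\nfor x in items:\n    total += x\n"
buffer = apply_edit(buffer, "total += x", "total += x * weight")
```

Rejecting ambiguous or missing targets mirrors the post's emphasis on selectivity: an edit that cannot be placed confidently should not be shown at all.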

FAQ

  • What problem does Mercury Edit 2 target? It targets next-edit prediction in coding workflows, where the system needs to suggest what you will change next with low latency.

  • How does it differ from auto-complete? The post states that Mercury Edit 2 complements an existing auto-complete endpoint by focusing specifically on edit suggestions rather than general completion.

  • How is the model trained to be more useful? The post describes using a human preference dataset built from accept vs. reject feedback, then applying an unpaired reinforcement learning method called KTO for alignment.

  • How does the post evaluate model quality and speed? Quality is benchmarked across open-sourced edit-related benchmarks (Instinct, FIM, NEP) plus an internal next-edit benchmark, using LLM-as-a-judge for correctness (and test-case execution for FIM). Speed is measured using end-to-end latency on representative requests.
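The end-to-end latency measurement mentioned above can be sketched as a simple timing harness. The stub predictor and the percentile summary are assumptions for illustration; a real measurement would time actual API round-trips against representative requests.

```python
import time
import statistics

def measure_latency_ms(predict, requests):
    """Time each request end-to-end and summarize in milliseconds."""
    latencies = []
    for req in requests:
        start = time.perf_counter()
        predict(req)  # stub standing in for a real next-edit API call
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
    return {
        "p50_ms": statistics.median(latencies),
        "p95_ms": latencies[p95_index],
    }

# Stub predictor: a real harness would call the next-edit endpoint here.
stats = measure_latency_ms(lambda req: len(req), ["example request"] * 20)
```

Reporting a tail percentile alongside the median matters for this use case: a suggestion that usually arrives quickly but occasionally stalls still disrupts typing.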

  • Where can I use the model? It’s available via the Inception Platform API.

Alternatives

  • Auto-complete–focused coding assistants: These aim to predict upcoming tokens or text rather than targeted next edits; they may be simpler for prefix completion but won’t specialize in “what you’ll change next.”
  • General-purpose completion models for code: You can prompt general code LLMs to propose diffs or edits, but they may not be optimized specifically for next-edit prediction latency and edit accept/reject alignment.
  • Other next-edit / fill-in-the-middle style edit predictors: Alternatives in the same category would be models evaluated on similar edit scenarios (line completion, variable renaming, refactoring, feature implementation), differing in how they generate edits and how they balance quality vs. speed.
  • Test-driven edit generation systems: Some approaches validate edits by running test cases (the post notes FIM uses test-case execution). Those systems can emphasize correctness through execution but may differ in workflow speed and latency tradeoffs.