UNI-1

UNI-1, Luma’s multimodal reasoning model, generates pixels and supports directable, reference-guided image creation for scene completion and transformations.

What is UNI-1?

UNI-1 is a multimodal reasoning model from Luma that generates pixels directly. The page positions UNI-1 as a system that works with both input guidance and structured references, aiming to understand intention, respond to direction, and “think with you.”

On the product page, UNI-1’s capabilities are described in terms of scene completion, spatial reasoning, and plausibility-driven transformations, along with reference-guided and source-grounded generation controls.

Key Features

  • Multimodal reasoning for pixel generation: UNI-1 is described as a multimodal model that can generate pixels, supporting tasks that involve interpreting more than one kind of input.
  • Common-sense scene completion and spatial reasoning: The page highlights scene completion, spatial reasoning, and plausibility-driven transformation as core capabilities.
  • Directable, reference-guided generation: UNI-1 is presented as responsive to direction, using source-grounded controls to steer outputs.
  • Culture-aware visual generation: The page describes visual generation across aesthetics, memes, and manga.
  • Character references as input: The interface includes character references (e.g., portrait and full body), indicating support for reference-based generation workflows.

How to Use UNI-1

  • Start by trying UNI-1 for free from the product page, which also links to a technical report.
  • Provide your creative goal and guidance (the page describes the model as directable and responsive to direction).
  • Use references when needed: the page shows character reference inputs (portrait and full body) that can guide generation; a hedged request sketch follows this list.
  • Explore outputs across the formats listed in the pricing section, such as editing, image-to-image, and reference-based generation.
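
If you want to prototype against UNI-1 programmatically once the API opens up, a request might look like the sketch below. This is a hypothetical illustration only: the page says the API is “available soon” and publishes no schema, so the endpoint URL, field names, and auth header here are assumptions rather than Luma’s documented interface.

    # Hypothetical sketch: Luma has not published UNI-1 API details
    # (the page says an API is "available soon"), so the endpoint,
    # field names, and auth scheme below are illustrative assumptions.
    import requests

    API_URL = "https://api.example.com/uni-1/generate"  # placeholder endpoint
    API_KEY = "YOUR_API_KEY"                             # placeholder credential

    payload = {
        # Creative goal / direction, per the "directable" framing on the page
        "prompt": "Complete the kitchen scene beyond the doorway, keeping the lighting consistent",
        # Character references (portrait and full body) as shown on the page
        "references": [
            {"type": "character", "view": "portrait", "url": "https://example.com/char_portrait.png"},
            {"type": "character", "view": "full_body", "url": "https://example.com/char_full.png"},
        ],
        "mode": "reference_guided",  # assumed name for reference-guided generation
    }

    response = requests.post(
        API_URL,
        json=payload,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=60,
    )
    response.raise_for_status()
    print(response.json())  # assumed to return a URL or job ID for the generated image

Until the real API ships, treat the field names above simply as one way to organize the inputs the page describes: a prompt carrying your direction plus one or more reference images.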

Use Cases

  • Scene completion from a partial view: Use UNI-1 for common-sense scene completion where spatial relationships and plausibility matter.
  • Reference-guided generation with character inputs: Provide character references (portrait or full body) to influence how a generated result is styled or composed.
  • Source-grounded image edits and transformations: Use directable controls to perform plausibility-driven transformations rather than purely unconstrained generation.
  • Style and cultural framing: Generate visuals aligned with requested aesthetics, memes, or manga references as described on the page.
  • Reference-based generation evaluation workflows: If you’re comparing outputs for overall preference or reference-based generation quality, the page cites UNI-1’s human-preference Elo rankings across several categories.

FAQ

  • What is UNI-1 used for? The page describes UNI-1 for intelligent image generation tasks such as scene completion, spatial reasoning, plausibility-driven transformation, and reference-guided generation.

  • How does UNI-1 differ from standard text-to-image generation? The page emphasizes that UNI-1 is directable and can be guided with source-grounded controls, and it highlights reference-guided generation and character references as inputs.

  • Can I access UNI-1 via an API? The page states that an API is “available soon” and provides a waitlist form for early API access.

  • Where can I find UNI-1? The product page indicates you can try UNI-1 for free and also links to a technical report. It does not describe other distribution channels.

  • What inputs does UNI-1 support? The page lists character references (e.g., portrait and full body) as inputs and describes workflows such as image generation, image edit/i2i, and multi-reference generation; a hypothetical request shape for each is sketched after this list.
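
To make the workflow names above concrete, the dictionaries below sketch one way the three described modes might be expressed as request payloads. The field names are assumptions for illustration, not a documented UNI-1 schema.

    # Illustrative only: these dicts mirror the workflows named on the page
    # (image generation, image edit / i2i, multi-reference generation); the
    # field names are assumptions, not a documented UNI-1 schema.
    text_to_image = {
        "mode": "generate",
        "prompt": "A rainy street corner drawn in manga style",
    }

    image_edit = {
        "mode": "edit",  # image edit / i2i
        "prompt": "Replace the sky with a sunset; keep the buildings unchanged",
        "source_image": "https://example.com/street.png",  # source-grounded input
    }

    multi_reference = {
        "mode": "multi_reference",
        "prompt": "The same character sitting at a cafe table",
        "references": [
            {"type": "character", "view": "portrait", "url": "https://example.com/portrait.png"},
            {"type": "character", "view": "full_body", "url": "https://example.com/full_body.png"},
        ],
    }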

Alternatives

  • Other multimodal image generation models: If you need models that combine instruction and visual inputs, compare multimodal image generators that support guided edits and reference conditioning.
  • Text-to-image and image-editing models: For purely text-driven or standard image-edit workflows, consider dedicated text-to-image or image-to-image tools and compare how well they support reference guidance.
  • Reference-conditioned generation tools: If your primary requirement is steering outputs with reference images (characters, styles, or source grounding), look for models or editors focused on reference conditioning rather than direction-only generation.
  • AI research demo platforms: If you’re evaluating reasoning quality and preference-based results, compare against research-oriented model platforms that publish technical reports and benchmark-style evaluations.