SlimSnap
SlimSnap is a macOS app that turns screenshots into JSON for terminal-based coding agents and text-only workflows. Capture, annotate, OCR, and export locally.
What is SlimSnap?
SlimSnap is a macOS app that turns screenshots into JSON that terminal-based coding agents can read. It is designed for situations where you want to communicate UI details to tools like Claude Code, Aider, or Codex CLI without pasting an image.
The workflow combines capture, annotation, OCR, and export. Users select an area on screen, add visual markers such as arrows or callouts, and copy a structured JSON representation that includes element text, bounding boxes, and annotation data. The goal is to make screenshot content available in text-only environments while keeping the capture process local on the Mac.
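To make the description above concrete, here is a minimal sketch of what a capture might look like and how a script could flatten it into plain lines for an agent. The field names (`elements`, `text`, `bbox`, `annotations`, `kind`, `target`, `note`) are assumptions chosen for illustration and are not taken from the published schema.

```python
import json

# Hypothetical capture payload. The field names are illustrative assumptions,
# not the published SlimSnap schema.
SAMPLE_CAPTURE = """
{
  "elements": [
    {"text": "Save", "bbox": {"x": 0.10, "y": 0.80, "w": 0.20, "h": 0.05}},
    {"text": "Network error", "bbox": {"x": 0.10, "y": 0.20, "w": 0.60, "h": 0.05}}
  ],
  "annotations": [
    {"kind": "arrow", "target": 1, "note": "this message should not appear"}
  ]
}
"""


def summarize(capture: dict) -> list[str]:
    """Return one plain-text line per annotation, naming the element it points at."""
    elements = capture.get("elements", [])
    lines = []
    for ann in capture.get("annotations", []):
        idx = ann.get("target")
        # Guard against a missing or out-of-range element index.
        if isinstance(idx, int) and 0 <= idx < len(elements):
            label = elements[idx].get("text", "")
        else:
            label = "<unknown element>"
        lines.append(f'{ann.get("kind", "annotation")} -> "{label}": {ann.get("note", "")}')
    return lines


capture = json.loads(SAMPLE_CAPTURE)
for line in summarize(capture):
    print(line)
```

A real capture should be checked against the schema published on GitHub; the structure here only mirrors the three kinds of data the text mentions.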
SlimSnap also publishes an open JSON schema on GitHub under the MIT license, so the exported format can be validated or reused outside the app. The product page presents it as a way to move screenshot context into places where images are not accepted, such as terminals, SSH sessions, and other text-only workflows.
Key Features
- Native macOS screen capture: users press ⌘⇧S, drag to select an area, and release to capture a screenshot region.
- Annotation tools: arrows, callouts, and highlights let users point the agent at a specific UI element or issue.
- JSON export: captures are copied as structured JSON so they can be pasted into terminal agents and other text-only tools.
- OCR built in: the app reads visible labels, buttons, and error messages from the screenshot and includes them in the output.
- Element bounding boxes: output elements include normalized coordinates, which helps downstream tools reason about layout and position.
- Local processing on Mac: capture and OCR run on the device, with no account or server upload required.
- Open schema: the JSON schema is published on GitHub under MIT, allowing validation or custom exporters.
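As an illustration of why normalized coordinates are convenient, the sketch below scales a box back to pixels for a given image size. It assumes that "normalized" means fractions of the capture width and height in the range 0 to 1 with a top-left origin, and that a box is stored under the hypothetical keys `x`, `y`, `w`, and `h`; the published schema may define this differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelBox:
    left: int
    top: int
    width: int
    height: int


def to_pixels(bbox: dict, image_width: int, image_height: int) -> PixelBox:
    """Scale a normalized box (fractions of the image, top-left origin assumed) to pixels."""
    for key in ("x", "y", "w", "h"):
        if not 0.0 <= bbox[key] <= 1.0:
            raise ValueError(f"{key}={bbox[key]} is outside the normalized range 0..1")
    return PixelBox(
        left=round(bbox["x"] * image_width),
        top=round(bbox["y"] * image_height),
        width=round(bbox["w"] * image_width),
        height=round(bbox["h"] * image_height),
    )


# Hypothetical box covering the middle half of a 1600x800 capture.
box = to_pixels({"x": 0.25, "y": 0.5, "w": 0.5, "h": 0.125}, image_width=1600, image_height=800)
print(box)
```

Because the stored values do not depend on resolution, the same box can be mapped onto a Retina capture or a downscaled copy by changing only the two size arguments.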
How to Use SlimSnap
Install the Mac app, then use the capture shortcut to select the part of the screen you want to share. Add any annotations needed to direct attention, such as highlighting a button or marking an error message.
After capture, copy the generated JSON and paste it into a tool that accepts text, such as a CLI coding agent. If you use the Claude Code skill, SlimSnap also writes a small local config file so the skill can find the saved JSON captures automatically.
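The listing does not describe the config file's location or contents, so the following is only a sketch of how a helper might use such a file to locate the newest saved capture. The `captures_dir` key is hypothetical, and the config path is passed in by the caller rather than assumed.

```python
import json
from pathlib import Path
from typing import Optional


def latest_capture(config_path: Path) -> Optional[Path]:
    """Return the most recently modified .json capture, or None if there is none."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # "captures_dir" is a hypothetical key, not SlimSnap's documented config format.
    captures_dir = Path(config["captures_dir"]).expanduser()
    # Sort by modification time so the newest capture is last.
    candidates = sorted(captures_dir.glob("*.json"), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None
```

The sketch assumes the config file lives outside the captures directory; otherwise the glob would also match the config itself.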
Use Cases
- UI debugging in a terminal agent: share a screenshot of a broken interface with layout and text details already extracted into JSON.
- Iterative code review or fix prompts: point an agent at a specific button, form field, or error state without writing a long visual description.
- SSH or remote sessions: move screenshot context into an environment where image pasting is not available.
- CI or log-based troubleshooting: paste structured UI context into text-only logs or commit messages when a screenshot would not fit.
- Custom workflows: use the published schema to generate compatible JSON from another OCR pipeline or a hand-written exporter.
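For the custom-workflow case, a hand-written exporter could look roughly like the sketch below. It takes OCR results as pixel-space tuples and emits JSON with normalized boxes. The output field names (`elements`, `text`, `bbox`, `annotations`) are placeholders; a real exporter would follow the published schema and validate against it.

```python
import json


def export_capture(ocr_results, image_width, image_height):
    """Build a JSON string from (text, left, top, width, height) pixel tuples."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    elements = []
    for text, left, top, width, height in ocr_results:
        elements.append({
            "text": text,
            # Divide by the image size so coordinates become fractions in 0..1.
            "bbox": {
                "x": round(left / image_width, 4),
                "y": round(top / image_height, 4),
                "w": round(width / image_width, 4),
                "h": round(height / image_height, 4),
            },
        })
    # Field names are hypothetical; no annotations are produced by this exporter.
    return json.dumps({"elements": elements, "annotations": []}, indent=2)


# Hypothetical OCR hit: a "Submit" button found in a 1000x500 image.
payload = export_capture([("Submit", 100, 200, 300, 50)], image_width=1000, image_height=500)
print(payload)
```

Rounding to four decimal places keeps the pasted JSON short while staying well below one pixel of error at typical screen sizes.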
FAQ
Does SlimSnap upload my screenshots to a server? No. According to the product page, capture and OCR run locally on your Mac, and screenshots do not leave the machine.
Does it work with tools that cannot read images? Yes. The product is built for terminal agents and other text-only destinations where screenshots cannot be pasted directly.
Is the schema open? Yes. The JSON schema is published on GitHub under the MIT license, and the product page says the Claude Code skill is open as well.
Do I need the Mac app to use the Claude Code skill? No. The skill can work with any valid SlimSnap JSON file, even if it was created outside the app.
Is SlimSnap available on Windows or Linux? Not currently. The product page says it is Mac-only today and invites requests for other platforms.
Alternatives
- Native screenshot sharing in AI chat apps: useful for one-off image questions, but not designed for terminal agents or text-only workflows.
- Manual text descriptions of the UI: workable when the screenshot is simple, but slower and more error-prone for detailed layouts.
- OCR plus custom JSON exporters: a flexible option for teams that want to build their own pipeline from screenshots to structured text.
- General screen recording or annotation tools: can capture and mark up interfaces, but usually do not export agent-readable JSON with OCR and bounding boxes.