PandaProbe Cloud is a managed platform for agent tracing, evals, and monitoring. It helps teams run production observability and evaluation workflows without maintaining their own infrastructure.
AEVS is a developer tool for recording and verifying AI agent tool calls with signed, tamper-evident receipts. It helps developers inspect what an agent actually executed through an API or explorer instead of relying on model text alone.
Proxee is a macOS menu bar app for previewing a local web app on an iPhone while you build on a Mac. It pairs devices over QR code, syncs key browsing state, and includes an iPhone companion app for longer testing sessions.
Bugpilot is a Chrome extension that captures console, network, DOM, and user-action context as AI-ready Markdown for debugging with Claude, ChatGPT, Cursor, and similar assistants. It includes always-on redaction, local-only capture, and a free tier with optional Pro export formats.
Patchrooms is a visual feedback tool for AI-built apps. It lets reviewers comment on elements in staging or preview builds, then exports agent-ready Markdown or MCP-readable reports for tools like Claude Code and Cursor.
Backplanes Spotlight generates automatic reports for Claude Code and Codex sessions so developers and teams can review what an agent actually did after a run finishes. It focuses on post-session visibility, report summaries, and local redaction before data leaves the laptop.
browse.sh is an open catalog of browser automation skills for AI agents and developers. It combines reusable `SKILL.md` recipes with a CLI for browser actions, debugging, and cloud sessions.
FixtureKit generates fixtures, mocks, and test data from TypeScript and Zod schemas in the browser. It runs locally with no API calls and keeps schema data on your machine.
FlintLab Sirius is an AI-powered device infrastructure PaaS for managing tests across real devices, emulators, and containerized execution environments. It supports automation and observability through a web UI, CLI, and REST APIs.
Humans Not Invited is a machine-oriented challenge site that asks users or agents to select matching squares and verify the result. Its pricing page returns a 404, so no public plan details are available.
TestSprite is an AI testing agent and automation platform that helps software teams plan, generate, run, debug, and report tests with minimal input. It supports cloud-based verification workflows and integrates with MCP, IDE, and CI-based development setups.
Chunk sidecars are CircleCI validation environments that help AI coding agents catch failures in the inner loop before changes reach CI. They run lightweight checks in a remote microVM and are available to all CircleCI users, including Free plans.
The Incident Challenge is a bi-weekly production incident challenge for developers and engineers. Participants inspect logs, code, docs, architecture, and clues to identify the issue and submit the fastest correct fix.
UI Design Analyzer is a free browser-based tool for scoring UI screenshots across seven criteria, including spacing, alignment, color, hierarchy, and typography. It supports JPG, PNG, and WebP uploads with no login or signup.
OpenStatus MCP Server Health Check is a browser-based tool for validating an MCP endpoint’s JSON-RPC handshake, tool exposure, and authentication behavior. It helps operators and developers quickly spot protocol errors, missing headers, and unreachable servers without installing anything.
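To make the handshake check concrete, here is a minimal sketch of the JSON-RPC `initialize` exchange that MCP defines. It does not show how the OpenStatus tool is implemented. The helper names `build_initialize_request` and `check_initialize_response`, the client name, and the default protocol version string are assumptions chosen for illustration, and no network call is made.

```python
def build_initialize_request(request_id, protocol_version="2025-03-26"):
    """Build the JSON-RPC 2.0 `initialize` request an MCP client sends first.

    The client name and version below are placeholders (assumptions).
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            "capabilities": {},
            "clientInfo": {"name": "example-health-check", "version": "0.0.1"},
        },
    }


def check_initialize_response(response, request_id):
    """Return a list of problems found in an `initialize` response (empty if none)."""
    problems = []
    if response.get("jsonrpc") != "2.0":
        problems.append("jsonrpc field is not '2.0'")
    if response.get("id") != request_id:
        problems.append("response id does not match request id")
    if "error" in response:
        # A JSON-RPC error object replaces `result`, so stop here.
        problems.append("server returned a JSON-RPC error")
        return problems
    result = response.get("result")
    if not isinstance(result, dict):
        problems.append("missing result object")
        return problems
    # Fields the MCP lifecycle expects in a successful initialize result.
    for field in ("protocolVersion", "capabilities", "serverInfo"):
        if field not in result:
            problems.append(f"result is missing '{field}'")
    return problems
```

A caller would send the request body to the endpoint with whatever HTTP client and authentication headers the server expects, then pass the decoded JSON reply to `check_initialize_response`.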
FixMyCWV is a web-based Core Web Vitals audit tool that produces developer-focused recommendations for improving LCP, INP, and CLS. It works across many site stacks and claims to handle sites with bot protection.
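For context on the three metrics, the sketch below rates a measurement against the publicly documented Core Web Vitals thresholds. It is an illustration only, not FixMyCWV's scoring logic; the function name `rate_metric` is an assumption.

```python
# Published Core Web Vitals thresholds as (good upper bound, poor lower bound).
# LCP and INP are in milliseconds; CLS is a unitless layout-shift score.
THRESHOLDS = {
    "LCP": (2500.0, 4000.0),
    "INP": (200.0, 500.0),
    "CLS": (0.1, 0.25),
}


def rate_metric(name, value):
    """Classify a measurement as 'good', 'needs improvement', or 'poor'."""
    good_max, poor_min = THRESHOLDS[name.upper()]
    if value <= good_max:
        return "good"
    if value <= poor_min:
        return "needs improvement"
    return "poor"
```

For example, `rate_metric("INP", 350)` falls between the 200 ms and 500 ms bounds and is rated "needs improvement".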
Area Contrast Checker is a Chrome extension for measuring color contrast in rendered web content, with support for WCAG 2.1/2.2 and APCA. It is aimed at accessibility checks and design work involving images, gradients, overlays, and other complex visuals.
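As background for the WCAG side of that check, this is a minimal sketch of the WCAG 2.x contrast ratio formula for two solid sRGB colors. It does not cover APCA, and it is not the extension's own code; the function names are assumptions. Measuring images, gradients, or overlays would require sampling rendered pixels first, which is out of scope here.

```python
def relative_luminance(rgb):
    """Relative luminance of an 8-bit sRGB color, per the WCAG 2.x definition."""
    def linearize(channel):
        c = channel / 255.0
        # WCAG 2.x uses 0.03928 as the cutoff for the linear segment.
        if c <= 0.03928:
            return c / 12.92
        return ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(color_a, color_b, large_text=False):
    """WCAG AA requires at least 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(color_a, color_b) >= (3.0 if large_text else 4.5)
```

The ratio is symmetric, so the order of foreground and background does not matter.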
Polarity provides sandboxed eval infrastructure for AI agents, with Keystone for isolated testing, benchmarking, replay, and production observability. It is aimed at teams shipping long-running or stateful agents that need real-service sandboxes and reproducible debugging.
Vibeocus Lens is a Chrome extension that captures frontend element context—DOM, selector, visual state, and notes—and streams it into a local Vibeocus MCP workspace. It helps developers hand off precise browser context to AI coding agents for bug fixing and refactoring.
Screen Ruler is a Chrome extension for designers and developers to inspect elements, measure spacing, copy computed CSS, sample colors, and review accessibility or SEO details directly on web pages. A PRO tier adds live CSS editing, responsive testing, and deeper inspection tools.