LLM

424 products

Aymo AI

Aymo AI is an all-in-one AI platform for teams with model switching, comparison, file analysis, web search, and shared workflows.

Argmin AI

Argmin AI turns your rules, docs, and examples into AI evaluations you can run before release, without custom code or an ML team.

BaseRT

BaseRT is an LLM runtime for Apple Silicon Macs that runs local models on your own device for on-device inference and a local coding-agent workflow.

Kimi K3

Kimi K3 is Moonshot AI’s frontier model for coding, knowledge work, and reasoning. It has a 1-million-token context window and native vision, and is offered through Kimi.com, Kimi Work, Kimi Code, and the Kimi API.

Zro

Zro is a private inference endpoint for coding agents on EU infrastructure, with OpenAI-compatible and Anthropic-compatible access and zero request retention.

derouter.ai

derouter.ai is a web API for Claude and GPT models with fixed, discounted token pricing, plus support for Claude Code, Codex CLI, and GPT Image 2.

SuperCompress

SuperCompress is a query-aware context compressor for LLM apps that cuts input tokens before inference while preserving key evidence. It is open source, runs on CPU, and is available as an API and a Python package.

Muse Spark 1.1

Muse Spark 1.1 is a multimodal reasoning model for agentic tasks, coding, computer use, and multimodal understanding. Available in public preview via the Meta Model API, Meta AI app, and meta.ai.

Auriko

Auriko is an LLM inference routing API that reaches multiple providers through one integration, offering cache-aware cost optimization, routing control, and reliability for AI apps.

Opper AI

Opper AI is an EU-hosted AI gateway for 300+ models via an OpenAI SDK-compatible API, with routing, observability, guardrails, and compliance.

Constellation Gate AI

Constellation Gate AI is a gateway for AI agents that screens requests, redacts sensitive data, records tamper-evident activity, and can reduce token usage. Supports desktop tools, CLI routing, and SDK setup without code changes.

LongCat-2.0

LongCat-2.0 is a 1.6-trillion-parameter model announced by LongCat AI, trained entirely on domestic chips.

TuneLLM

TuneLLM turns recurring Claude- or GPT-style workflows into smaller fine-tuned models inside your infrastructure for benchmarked quality at lower inference cost.

Alvoff Inference

Alvoff Inference is an OpenAI-compatible API for speech-to-text, text-to-speech, embeddings, and chat/code generation.

RunInfra

RunInfra benchmarks GPUs, tunes runtime paths, and turns open-source models into production inference stacks with managed API or export for self-hosting.

ClinePass

ClinePass is a paid subscription for curated open-weight models in Cline, with a first-month Product Hunt offer for developers using IDE and CLI workflows.

discode.ai

discode.ai is a browser-based AI chat product that routes prompts across models with controls for eco impact, local privacy, and multi-model verification.

Heron

Heron is a passive observability tool for AI agents and LLM APIs. It reconstructs agent turns, tool calls, and LLM interactions from network traffic without SDK changes or an in-path proxy.

Oxlo.ai

Oxlo.ai is an AI inference API with OpenAI-compatible access and request-based monthly pricing for predictable costs.

Crewdle Chat

Crewdle Chat brings GPT, Claude, Gemini, and Grok into one workspace for business teams. It supports web search, uploaded documents, and token-based billing with no per-seat fees.

...

