# ThumbGate — Stop Costly AI Agent Mistakes Before They Run

## What ThumbGate Is

ThumbGate prevents expensive AI mistakes before they happen. Engineering teams use it to check risky commands, file edits, deploys, payments, API calls, and other AI agent actions before execution across Claude Code, Cursor, Codex, Gemini CLI, Amp, OpenCode, ChatGPT Actions, CI, and MCP-compatible runtimes.

The buyer outcome is simple: prevent expensive AI mistakes, stop the agent from repeating them, and turn a smart assistant into a reliable operator. ThumbGate does that by enforcing approval policies, capturing and propagating lessons from agent mistakes, and blocking known-bad tool calls before they execute via PreToolUse hooks wired into the agent runtime. Unlike CLAUDE.md rules or .cursorrules files, which are suggestions the agent can ignore, ThumbGate enforces at the tool-call level: once an action is routed through ThumbGate, the agent cannot bypass the check. The business is enterprise-first: the best first paid motion is the Workflow Hardening Sprint for one workflow, the local CLI stays free as the adoption wedge, and Pro remains a solo side lane for personal enforcement proof.

## Problems ThumbGate Prevents

- An AI coding agent force-pushes to the main branch, overwriting a teammate's commit
- An agent deletes a production config file because the prompt said "clean up unused files"
- An agent bypasses CI by committing with --no-verify after seeing test failures
- An agent runs destructive SQL, bad publish commands, unsafe deploys, or costly API calls before a human sees the blast radius
- An agent repeats the same database migration mistake across three pull requests because the lesson was never captured
- One engineer gives a thumbs-down on a bad agent pattern; teammates running the same agent repeat the mistake because lessons are not shared
- An agent modifies secrets or PII-bearing files because no approval policy was in place
- A team cannot audit which agent actions were blocked, approved, or overridden, making compliance reporting impossible
- A self-improving Autoresearch-style loop promotes a benchmark win after skipping holdout tests, hiding failed runs, or editing the metric instead of improving the product
- A desktop or browser agent silently adds a native messaging host, extension bridge, or persistent browser permission that widens local access without explicit approval

## How ThumbGate Works Technically

ThumbGate is built on Node.js >=18.18.0 and runs locally on each developer's machine with optional team sync.

**CLI-first install, MCP-compatible transport**: `npx thumbgate init` is the default setup path. It installs the local gateway, wires the needed hooks, and configures MCP transport automatically for the agent that is already in use. MCP matters for compatibility, but the product surface is the operator-friendly CLI.

**PreToolUse Hooks**: Every agent tool call (Bash, file writes, git operations, API calls) passes through a hook before execution. If the call matches a known-bad pattern stored in the lesson database, the hook blocks it and returns a descriptive error. The agent cannot proceed until the human approves or the policy is updated.
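
As a rough illustration of the hook flow described above, the sketch below matches a proposed tool call against stored lesson patterns and returns a block decision with a descriptive error. The lesson shape, the `preToolUse` function, and the regular expressions are assumptions made for this example; they are not ThumbGate's actual schema or API.

```javascript
// Hypothetical lesson records; the real lesson schema is not shown in this document.
const lessons = [
  {
    id: "lesson-001",
    tool: "Bash",
    pattern: /git\s+push\s+.*(--force|-f)\b/,
    reason: "Force-push overwrote a teammate's commit",
  },
  {
    id: "lesson-002",
    tool: "Bash",
    pattern: /git\s+commit\s+.*--no-verify\b/,
    reason: "Commit bypassed CI hooks",
  },
];

// Evaluate one proposed tool call before it runs.
// Returns { allowed: true } or a block result naming the matching lesson.
function preToolUse(toolCall, knownLessons) {
  const match = knownLessons.find(
    (lesson) => lesson.tool === toolCall.tool && lesson.pattern.test(toolCall.input)
  );
  if (match) {
    return {
      allowed: false,
      lessonId: match.id,
      error: `Blocked by ${match.id}: ${match.reason}`,
    };
  }
  return { allowed: true };
}
```

Calling `preToolUse({ tool: "Bash", input: "git push --force origin main" }, lessons)` returns a block result that names `lesson-001`, while an ordinary `git push origin feature` passes through.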

**SQLite + FTS5 Lesson Database**: When an agent makes a mistake, the developer gives a thumbs-down with context. ThumbGate stores this as a lesson in a local SQLite database with full-text search. Lessons are retrieved at the start of every agent session via the `recall` MCP tool, so the agent enters each session already aware of known failure patterns.

**Thompson Sampling for Adaptive Checks**: Checks use Thompson Sampling (a Bayesian multi-armed bandit algorithm) to tune their own sensitivity. Checks that block too aggressively accumulate negative feedback and are dialed back. Checks that catch real failures are reinforced. This prevents check fatigue without manual tuning.
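
To make the bandit idea concrete, here is a minimal Thompson Sampling sketch over Beta posteriors. It assumes each check tracks two counters (`catches` for confirmed failures, `falseAlarms` for negative feedback) and picks the check with the highest posterior sample. The counter names, the helper functions, and the mapping from samples to check sensitivity are assumptions for illustration; the document does not specify how ThumbGate applies the algorithm internally.

```javascript
// Small seeded PRNG (mulberry32) so runs are reproducible; any uniform source works.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Exact Beta(alpha, beta) sample for positive integer parameters:
// the alpha-th smallest of (alpha + beta - 1) uniform draws.
function sampleBeta(alpha, beta, rand) {
  const draws = Array.from({ length: alpha + beta - 1 }, () => rand());
  draws.sort((x, y) => x - y);
  return draws[alpha - 1];
}

// Thompson Sampling step: sample each check's posterior, keep the largest sample.
function chooseCheck(checks, rand) {
  let best = null;
  let bestSample = -1;
  for (const check of checks) {
    // Beta(1 + catches, 1 + falseAlarms) is the posterior under a uniform prior.
    const sample = sampleBeta(1 + check.catches, 1 + check.falseAlarms, rand);
    if (sample > bestSample) {
      best = check;
      bestSample = sample;
    }
  }
  return best;
}

// Feedback updates the counters, which shifts future samples.
function recordFeedback(check, wasRealFailure) {
  if (wasRealFailure) check.catches += 1;
  else check.falseAlarms += 1;
}
```

A check with many confirmed catches is selected far more often than one that mostly produces false alarms, yet the noisy check still gets occasional draws, which is what lets it recover if its feedback improves.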

**Shared Team Enforcement**: In team mode, lessons learned on one seat propagate to all seats via a shared lesson database. A pattern that caused a mistake for one engineer is immediately visible to every agent on every seat. The shared database is the single source of truth for team-wide enforcement rules.

**CI Check Integration**: ThumbGate can run as a CI step. Pull requests that contain agent-generated changes matching known failure signatures are blocked from merging until a human reviews and approves the exception.

**Autoresearch Safety Pack**: ThumbGate checks self-improving coding loops before they promote a claimed improvement. The `autoresearch-brief` ContextFS template retrieves research history, learned rules, holdout expectations, proof requirements, and reward-hacking failures so the agent can search for better code without grading itself on missing evidence.

**Audit Trail**: Every check decision (blocked, approved, overridden) is logged with a timestamp, the triggering tool call, the matching lesson ID, and the identity of any human who approved an exception. This log is queryable and exportable for compliance reporting.
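
The fields listed above could be represented roughly as follows. This is a sketch only: the `auditEntry`, `queryAudit`, and `exportJsonl` helpers and the JSON Lines export format are hypothetical, chosen to show one way a queryable, exportable log might look.

```javascript
// Build one audit record with the fields named in the text.
function auditEntry({ decision, toolCall, lessonId, approvedBy = null, now = new Date() }) {
  const allowed = ["blocked", "approved", "overridden"];
  if (!allowed.includes(decision)) {
    throw new Error(`Unknown decision: ${decision}`);
  }
  return { timestamp: now.toISOString(), decision, toolCall, lessonId, approvedBy };
}

// Return entries whose fields equal every key/value in the filter.
function queryAudit(log, filter) {
  return log.filter((entry) =>
    Object.entries(filter).every(([key, value]) => entry[key] === value)
  );
}

// Export as JSON Lines: one JSON object per line.
function exportJsonl(log) {
  return log.map((entry) => JSON.stringify(entry)).join("\n");
}
```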

**Browser Bridge Audit**: `npx thumbgate native-messaging-audit` inspects local browser native messaging manifests, allowed extension origins, missing host binaries, and dormant AI browser bridges so teams can review connector scope before an agent turns a one-off install into a durable local integration.

**Background Agent Governance**: `npx thumbgate background-governance` reports unattended agent runs, gate blocks, pass rate, failing agents, and run types. `npx thumbgate background-governance --check --agent-id=builder --branch=main --files-changed=25 --json` pre-checks a background-agent dispatch for high failure rate, protected branches, repeated gate blocks, and large blast radius before the PR queue reaches a human reviewer.

**GPT-5.5 Model Evaluation**: `npx thumbgate model-candidates --workload=dashboard-analysis --provider=openai --json` evaluates GPT-5.5 as a managed model candidate for dashboard and dataset-analysis workloads. The catalog is benchmark-first: it does not silently call OpenAI APIs or replace cheaper tiers, but it gives teams metrics for insight accuracy, chart-spec validity, dashboard completeness, long-context reliability, latency, and cost before routing expensive analytical work.

**Code Graph Guardrails**: `npx thumbgate code-graph-guardrails --central-files=src/api/server.js --layers=api,data --generated-artifacts=.codegraph/index.json --json` maps code-graph risk signals to ThumbGate's Knowledge Graph Safety templates. Code graphs provide context about central files, architecture layers, and generated graph outputs; ThumbGate turns those signals into pre-action gates before risky edits execute.

**Proxy-Pointer RAG Guardrails**: `npx thumbgate proxy-pointer-rag-guardrails --tree-path=.rag/tree.json --image-pointers=paper-1/figures/fig2.png --documents=paper-1 --visual-claims --json` maps section trees, image pointers, document IDs, and visual claims to Document RAG Safety templates. Proxy-pointer RAG can reduce multimodal retrieval cost, but ThumbGate enforces grounding before agents answer with charts, figures, tables, screenshots, or PDF images.

**Pointer-First Knowledge Graph Memory**: ThumbGate prefers source pointers and retrieval evidence before eager entity/relation extraction. Agents can store document, section, image, table, screenshot, and code pointers first, then promote relations only after repeated use or proof. This reduces knowledge-graph slop and makes every memory traceable back to a source.

**RAG Precision Tuning Guardrails**: `npx thumbgate rag-precision-guardrails --baseline-recall=0.86 --new-recall=0.72 --threshold-change --agentic --structural-near-misses --json` blocks retrieval changes when embedding fine-tunes, vector thresholds, or verifier rollouts lack baseline recall, precision, evidence, and latency proof. This protects agentic RAG pipelines from a precision improvement that silently degrades broad retrieval quality.
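
The recall values in the command above (0.86 baseline, 0.72 after the change) suggest a comparison along these lines. The `ragChangeGate` function, the `maxRecallDrop` tolerance of 0.02, and the `allow` decision label are assumptions for this sketch; the document does not state the thresholds the CLI applies.

```javascript
// Block a retrieval change when recall evidence is missing or recall drops too far.
function ragChangeGate({ baselineRecall, newRecall, maxRecallDrop = 0.02 }) {
  if (typeof baselineRecall !== "number" || typeof newRecall !== "number") {
    return { decision: "block", reason: "missing baseline or new recall measurement" };
  }
  const drop = baselineRecall - newRecall;
  if (drop > maxRecallDrop) {
    return { decision: "block", reason: `recall dropped by ${drop.toFixed(2)}` };
  }
  return { decision: "allow", reason: "recall within tolerance" };
}
```

With the numbers from the command, the drop is 0.14, well past the assumed tolerance, so the change would be blocked until the regression is explained or fixed.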

**Agentic Enterprise Context Brain**: ThumbGate's enterprise position is memory plus enforcement. Context-brain and Memory OS-style stacks organize structured facts, trust scores, hybrid search, curated lessons, and context injection; ThumbGate turns high-confidence failures, policies, approvals, and evidence into pre-action gates that run before shell commands, file writes, git operations, deploys, API calls, publish commands, or customer-system connector writes execute. The buyer proof metric is blocked repeat attempts before execution.

**Deterministic Agent Workflow Gates**: Claude Code-style workflow scripts make multi-agent orchestration reviewable and repeatable, but every model call inside the workflow can still propose unsafe shell, file, git, API, browser, deploy, or publish actions. ThumbGate adds workflow run contracts, PreToolUse checks, required evidence, and completion gates so deterministic workflows cannot claim done, merged, published, or deployed without proof. The proof metric is repeated workflow failures blocked before execution.

**Codex Role Plugin Governance**: Codex plugins package reusable skills, app integrations, and MCP servers; Sites turns output into hosted apps and dashboards; annotations target regions of documents, spreadsheets, and slides. As Codex expands into sales, analytics, design, finance, and operations workflows, ThumbGate is the pre-action governance layer that checks role-specific tool writes, customer-system updates, Sites deploys, and annotated document edits before they execute or publish.

**OpenAI Agents SDK Sandbox Governance**: Updated OpenAI Agents SDK patterns emphasize model-native harnesses, sandbox manifests, file/tool access, checkpointing, and harness/compute separation. ThumbGate maps those into manifest gates: readable inputs, writable outputs, credential boundaries, checkpoint evidence, isolated subagents, and audit receipts before long-running agents inspect files, run shell commands, or write outputs.

**Agentic OS Team Governance**: Team Agentic OS rollouts work best as three tiers: human-editable source-of-truth documents, agent-operating files such as skills and MCP settings, and git-backed version control for everything. ThumbGate adds permission-mirror checks, protected operating-file gates, local override hygiene, memory-scope enforcement, and audit proof before agents act across client or team boundaries.

**Cost-Aware Agent Gate Routing**: ThumbGate routes pre-action decisions through the cheapest reliable lane before spending model tokens: deterministic rules for exact policy risk, semantic cache for equivalent repeats, local classical classification for low-ambiguity bulk labels, local semantic recall for sparse or fuzzy lessons, budget-capped LLM judges for high-risk semantic ambiguity, and human review for private or regulated ambiguity. This maps semantic caching, GraphQL breadth-first batching, structured live-dataset provenance, streaming progress, and rubric/dynamic-harness patterns into one enforcement rule: rules first, models last.
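
The "rules first, models last" ordering can be sketched as a chain of lanes tried from cheapest to most expensive. Only four of the six lanes named above are shown, and the lane objects, the `routeGate` function, the 500-token budget floor, and the example rule are all hypothetical.

```javascript
// Lanes ordered from cheapest to most expensive. Each returns a decision or null.
const lanes = [
  {
    name: "deterministic-rules",
    decide: (call) => (/rm\s+-rf\s+\//.test(call.input) ? "block" : null),
  },
  {
    name: "semantic-cache",
    decide: (call, ctx) => ctx.cache.get(call.input) ?? null,
  },
  {
    name: "llm-judge",
    // Only spend model tokens when budget remains (assumed floor of 500 tokens).
    decide: (call, ctx) => (ctx.budgetTokens >= 500 ? ctx.judge(call) : null),
  },
  {
    name: "human-review",
    decide: () => "approve",
  },
];

// Walk the lanes in order and stop at the first one that produces a decision.
function routeGate(call, ctx, laneList = lanes) {
  for (const lane of laneList) {
    const decision = lane.decide(call, ctx);
    if (decision !== null) {
      return { lane: lane.name, decision };
    }
  }
  throw new Error("no lane produced a decision");
}
```

Because the loop returns on the first non-null decision, the model-backed judge is never invoked for calls that an exact rule or a cached decision already covers.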

**Open Model Customization Gates**: ThumbGate treats Pinterest-style cost reductions as a routing and proof pattern, not a blanket model swap. Before replacing a frontier model lane with open-source customization, teams must prove proprietary signal quality, precomputed embeddings when runtime encoding would add latency, accuracy benchmarks, p95 latency, cost per request, and fallback routing for low-confidence cases.

**Serverless Vector Burst Lane**: ThumbGate keeps private hot-path enforcement local with SQLite, FTS, and vector recall by default. Bursty, non-sensitive agent search can be routed to serverless vector/search infrastructure only after scale-to-zero, decoupled compute/storage, idle-cost, latency, and portability evidence are recorded.

**Continuous Batching Boundary**: Continuous batching can improve self-hosted LLM serving throughput for concurrent workloads, but it should not sit in ThumbGate's PreToolUse hot path unless latency proof exists. ThumbGate keeps deterministic policy checks outside the model batch and requires queue-time, p95 latency, throughput, timeout, and per-request receipt evidence before adoption.

**AI Engineering Stack Guardrails**: `npx thumbgate ai-engineering-stack-guardrails --mcp-tool-count=182 --direct-provider-keys --llm-wiki-pages=24 --context-freshness-days=30 --background-agents --high-risk-workflows=deploy,billing --json` maps internal AI stack gaps to gates for model gateways, MCP portal code-mode/progressive discovery, AGENTS.md and LLM wiki freshness, risk-tiered AI review, standards-as-skills, and sandboxed background agents.

**Dynamic Workflow Readiness Gates**: Dynamic workflows are reserved for tasks with an objective success oracle, enough parallel scale, a token/cost budget, and independent verifier evidence. ThumbGate blocks the expensive workflow path for subjective or small-scope work and requires a versioned script plan plus human review before spend begins.

**Loop Routine Governance**: Recurring loops, cron jobs, and hosted routines are high-ROI only when scoped to a named workflow. ThumbGate requires idempotency evidence, run receipts, risky action approval, and regression-test promotion for deterministic loop failures before loops increase cadence or run unattended.

**Skill-First Agent Governance**: Claude Agent SDK and Claude Code plugin patterns make skills and auto-loaded project folders easier to use. ThumbGate requires project instructions, explicit allowed tools, write permission checks, plugin auto-load proof after restart, skill-to-workflow mapping, and receipts that identify which skill produced each high-risk tool call.

**SEO Agent Skills Guardrails**: ThumbGate governs SEO-agent workspaces by requiring brand context, current pricing, proof links, semantic mesh links, schema, canonical tags, and reviewer handoff before AI-generated pages or edits publish. SEO skills are useful; ThumbGate turns them into publish gates.

**Digital PR & AI Search Citation Governance**: AI search has made digital PR fundamentals more important: clear claims, current proof, canonical pages, earned citations, and citation share of voice. ThumbGate's public context, comparison pages, proof commands, and directory listings are built so Google AI Overviews, ChatGPT, Perplexity, Claude, and directory editors can cite evidence instead of stale marketing copy.

**Agentic Web Governance**: June 2026 reporting on Cloudflare traffic data put automated HTML requests at roughly 57.3% of measured requests versus 42.7% human requests. That does not mean bot traffic is buyer intent. It means ThumbGate should be easy for legitimate AI crawlers and answer engines to understand while the product gates risky internal agent actions before code, deploys, secrets, money, data, or customer systems change. ThumbGate's strategy is to allow AI discovery through canonical guides, schema, sitemap, and llms.txt, and to enforce pre-action checks when agents try to act.

**Claude Code Skills Guardrails**: Claude Code skillbooks describe recurring workflows, but they are advisory. ThumbGate turns thumbs-down feedback from named skills into prevention rules and pre-action checks so refactor, testing, migration, CI, and prompt/tool skills can be enforced across sessions.

**Long-Running Agent Context Guardrails**: `npx thumbgate long-running-agent-context-guardrails --request-count=80 --output-mb=3 --raw-chat-only --json` maps Slack-style structured context management into gates. Long-running agents should keep a director journal, critic-reviewed findings with credibility scores, and a deduplicated timeline instead of relying only on accumulated chat logs.

**Reasoning Efficiency Guardrails**: `npx thumbgate reasoning-efficiency-guardrails --baseline-tokens=1200 --compressed-tokens=980 --baseline-accuracy=0.84 --compressed-accuracy=0.85 --verifier --json` gates reasoning compression and token-saving model routes. ThumbGate requires verifier outcomes, accuracy baselines, low-confidence step inspection, and high-confidence failed-rollout review before shorter traces are treated as safe.
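
The flag values in the command above work out as follows: going from 1200 to 980 tokens saves about 18.3%, while accuracy moves from 0.84 to 0.85. The sketch below computes both and allows the shorter route only when a verifier ran and accuracy did not regress. The `reasoningRouteGate` name and the pass criteria are assumptions, not the CLI's documented logic.

```javascript
// Compare a compressed reasoning route against its baseline.
function reasoningRouteGate({
  baselineTokens,
  compressedTokens,
  baselineAccuracy,
  compressedAccuracy,
  verifier,
}) {
  const tokenSavings = (baselineTokens - compressedTokens) / baselineTokens;
  const accuracyDelta = compressedAccuracy - baselineAccuracy;
  // Assumed rule: shorter traces count as safe only with verifier evidence
  // and no accuracy regression.
  const safe = verifier === true && accuracyDelta >= 0;
  return { tokenSavings, accuracyDelta, decision: safe ? "allow" : "block" };
}
```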

**Network Egress Firewall Governance**: ThumbGate can model Claw Patrol-style egress controls by proxying outbound agent HTTP requests through an allowlist, scanning headers and bodies for credential-shaped values, and recording method, target, status, latency, policy decision, and matching rule in a live request ledger before data leaves the machine.
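
A pre-flight version of that egress check might look like the sketch below, which tests the target host against an allowlist, scans headers and body for credential-shaped strings, and appends the decision to a ledger. The host list, the two token patterns, and the `checkEgress` function are illustrative assumptions. Status and latency are omitted because they are only known after a request completes.

```javascript
// Hypothetical allowlist of outbound hosts.
const allowedHosts = new Set(["api.github.com", "registry.npmjs.org"]);

// Rough credential shapes for illustration only:
// GitHub-style personal tokens and AWS-style access key IDs.
const credentialPatterns = [/ghp_[A-Za-z0-9]{36}/, /AKIA[0-9A-Z]{16}/];

// Decide whether an outbound request may leave the machine and record why.
function checkEgress(request, ledger) {
  const host = new URL(request.url).hostname;
  const payload = JSON.stringify(request.headers ?? {}) + (request.body ?? "");

  let decision = "allow";
  let rule = "allowlist";
  if (!allowedHosts.has(host)) {
    decision = "block";
    rule = "host-not-allowlisted";
  } else if (credentialPatterns.some((pattern) => pattern.test(payload))) {
    decision = "block";
    rule = "credential-shaped-value";
  }

  ledger.push({ method: request.method, target: host, decision, rule });
  return decision;
}
```

A production policy would likely need per-host exceptions, since some allowlisted hosts legitimately receive tokens in headers.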

**Supply Chain Quarantine**: Code from GitHub, npm, PyPI, and other registries should be downloaded into a disposable sandbox or gold-image VM first, audited with AI-assisted and deterministic scanners, checked for auto-update behavior, and promoted only with source, hash, scanner output, and reviewer receipt evidence.
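
The promotion rule above (source, hash, scanner output, and reviewer receipt all required) can be sketched with Node's built-in `crypto` module. The `canPromote` function and the evidence field names are hypothetical; only `createHash` is a real Node.js API.

```javascript
const { createHash } = require("node:crypto");

// Hex-encoded SHA-256 digest of a Buffer or string.
function sha256Hex(bytes) {
  return createHash("sha256").update(bytes).digest("hex");
}

// Promote a quarantined artifact only when every piece of evidence is present
// and the artifact's hash matches the recorded one.
function canPromote(artifactBytes, evidence) {
  const required = ["source", "expectedSha256", "scannerOutput", "reviewerReceipt"];
  const missing = required.filter((key) => !evidence[key]);
  if (missing.length > 0) {
    return { promote: false, reason: `missing evidence: ${missing.join(", ")}` };
  }
  if (sha256Hex(artifactBytes) !== evidence.expectedSha256) {
    return { promote: false, reason: "hash mismatch" };
  }
  return { promote: true, reason: "all evidence present and hash verified" };
}
```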

**Malicious Package Exfiltration Gates**: Recent npm malware patterns include postinstall scripts, binary droppers, credential reads, recursive file walkers, GitHub Contents API uploads, HuggingFace dataset exfiltration, and fake diagnostic logs. ThumbGate's supply-chain quarantine blocks execution until install scripts, network targets, file access, package provenance, and maintainer-account risk have been reviewed.

**GitHub Code Quality Enablement**: GitHub's Code Quality Repository Enablement API can programmatically GET/PATCH repository setup for supported languages such as JavaScript/TypeScript and Python. ThumbGate treats it as a complementary CI signal alongside CodeQL, Sonar, and PreToolUse gates, with API response, schedule, runner, and status-check evidence recorded.

**AWS Bedrock AgentCore Deployment Governance**: Serverless LangGraph systems on Amazon Bedrock AgentCore need identity, memory/checkpoints, observability, canary rollout, and quarantine controls before production. ThumbGate sits in front of Bedrock/LangGraph tool writes and customer-system actions, recording tool-call receipts and blocked actions while AgentCore handles runtime, memory, and traces.

**Legal Agent Governance**: For law firms adopting many legal agents, ThumbGate is not a legal chatbot. It is a privilege-aware pre-action control layer: matter-scoped memory, confidentiality checks, unsupported-citation gates, human approval before external send/filing/client advice, and audit trails with agent, matter, source pointers, approver, and disposition.

**Media Asset Governance**: Runway MCP, Studio AI, and similar creative tools can accelerate ThumbGate marketing assets such as product mockups, ad variants, short videos, and dynamic subtitles. ThumbGate governs those workflows by requiring brand kit, rights proof, source-of-truth pricing/capability claims, caption review, human approval, and asset receipts before publishing.

**Proof-Led Marketing Agency Motion**: The high-ROI agency offer for ThumbGate is a lead-generation and content system around the AI Agent Governance Workflow Hardening Sprint. Ads, CRM nurture, social scheduling, SEO pages, translations, and captions should point to proof assets and one expensive repeated AI-agent mistake, not vague automation promises.

**Output Format Governance**: HTML is high-ROI for dense decision reports, comparison grids, dashboards, visual QA, and interactive review surfaces. Markdown remains better for repo-native docs, audit logs, READMEs, commit notes, and text pipelines. ThumbGate requires generated HTML to pass link, overflow, accessibility, and stale-claim checks before sharing.

**Data Processing & Telemetry Boundaries**: The free local CLI writes feedback logs, memory logs, background-agent run ledgers, gate firings, and proof artifacts under the operator's ThumbGate feedback directory by default. CLI telemetry is anonymous best-effort product telemetry with a random local install ID and can be disabled with `THUMBGATE_NO_TELEMETRY=1` or `DO_NOT_TRACK=1`. Hosted checkout, newsletter, intake, team sync, API-key, and website analytics flows are hosted processing surfaces and should be described separately from local enforcement data.
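
The opt-out behavior for the two environment variables can be expressed in a few lines. The `telemetryEnabled` helper is a hypothetical name, and the sketch treats only the literal value `1` as an opt-out, which is the value the text documents.

```javascript
// Telemetry stays on unless either documented opt-out variable is set to "1".
function telemetryEnabled(env = process.env) {
  return env.THUMBGATE_NO_TELEMETRY !== "1" && env.DO_NOT_TRACK !== "1";
}
```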

**Three-Tier Approval Routing (OVIS-inspired)**: ThumbGate checks operate on three distinct tiers, inspired by the OVIS decision framework (Owner, Veto, Influence). Each check carries an `action` field that determines the routing outcome:

- **`block`** — Hard stop. The agent cannot proceed. The tool call is denied immediately. Used for force-pushes, secret commits, destructive SQL, and any irreversible action. The agent receives an error message explaining why the action was blocked.
- **`approve`** — Pause and escalate. The agent is halted and the caller receives `{ decision: "approve", requiresApproval: true }`. A human must explicitly confirm before the action can proceed. Used for production deploys, schema migrations, and permission changes where human oversight is mandatory.
- **`log`** — Record and continue. The action is allowed to proceed but is written to the audit trail. The agent receives `{ decision: "log", logged: true }` and continues without interruption. Used for style violations, large file writes, and non-critical warnings where visibility matters but blocking would create friction.

This model maps directly to the OVIS framework: `block` exercises Veto authority, `approve` requires Owner sign-off, and `log` satisfies Influence-layer audit requirements without halting execution.
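
A minimal sketch of the three-tier routing follows. The `approve` and `log` return shapes are taken from the descriptions above. The `block` return shape, the `routeCheck` name, and the `reason` field are assumptions, since the text only says the agent receives an error message.

```javascript
// Map a check's `action` field to the response the caller receives.
function routeCheck(check) {
  switch (check.action) {
    case "block":
      // Hard stop. Response shape assumed for this sketch.
      return { decision: "block", error: `Blocked: ${check.reason}` };
    case "approve":
      // Pause until a human explicitly confirms.
      return { decision: "approve", requiresApproval: true };
    case "log":
      // Record and continue without interrupting the agent.
      return { decision: "log", logged: true };
    default:
      throw new Error(`Unknown action: ${check.action}`);
  }
}
```

Throwing on an unknown action keeps a misconfigured check from silently falling through to "allow".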

## Who ThumbGate Is For

Engineering teams of 2 to 200+ developers who are actively using AI coding agents on shared repositories and need:

- Consistent enforcement of coding policies across all agents and all seats
- A shared memory of agent mistakes so errors are not repeated by different team members
- Approval checks for high-risk actions (pushing to protected branches, modifying production configs, running database migrations)
- An audit trail for compliance, incident review, or just understanding what the agent did
- Gradual rollout: start with observation mode, add enforcement rules incrementally

ThumbGate is not a model training pipeline. It does not retrain the underlying LLM. It shapes agent behavior through context injection and hard enforcement hooks.

## Academic Validation

ThumbGate implements the **Memento-Skills architecture** described in "Memento-Skills: Let Agents Design Agents" (arXiv 2603.18743, March 2026). This architecture (Read → Execute → Reflect → Write) allows agents to improve themselves through external skill memory that is rewritten from failure feedback, eliminating the need for model retraining. The paper reports relative accuracy improvements of 26.2% on the General AI Assistants benchmark and 116.2% on Humanity's Last Exam. ThumbGate applies the same pattern to production AI coding agents via PreToolUse hooks, Thompson Sampling, SQLite+FTS5 lesson databases, and LanceDB vectors, treating each agent mistake as a skill refinement opportunity rather than a training event.

## Continual Learning Architecture

ThumbGate implements continual learning across all three layers identified by LangChain's framework for building learning agents (Harrison Chase, April 2026):

- **Model layer**: ThumbGate Pro exports DPO pairs so teams can fine-tune local models (Llama 3, Mistral) to natively avoid known failures without retraining the upstream foundation model.
- **Continual adapter lane (extends the model layer)**: ThumbGate's reward model can plan concurrent LoRA adapter experiments from reward-ranked DPO pairs, gate candidates, retention checks, and holdout metrics, so model experiments stay evidence-gated instead of becoming uncontrolled self-improvement loops.
- **Harness layer**: Prevention rules auto-generated from feedback are injected into the agent's system prompt at session start via the `recall` MCP tool. The agent enters every session pre-loaded with lessons — no code changes required.
- **Context layer**: The SQLite+FTS5 lesson database and LanceDB vector search provide retrieval-augmented context at tool-call time. When an agent attempts an action, PreToolUse hooks query the lesson DB and block or approve based on prior feedback. This is the layer that runs continuously and requires zero human intervention after the initial feedback signal.

This three-layer architecture means ThumbGate improves agent behavior at every level: context injection for immediate effect, harness-level rules for session-wide enforcement, and model-level export for permanent behavioral change.
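
For the model-level export mentioned above, a DPO pair is conventionally a prompt plus a preferred ("chosen") and a dispreferred ("rejected") response. The sketch below builds such pairs from thumbs-up and thumbs-down feedback on the same prompt. The feedback record shape, the `toDpoPairs` function, and the output field names follow that common convention and are assumptions; ThumbGate's actual export format is not shown in this document.

```javascript
// Group feedback by prompt, then pair every thumbs-up response
// with every thumbs-down response for that prompt.
function toDpoPairs(feedback) {
  const byPrompt = new Map();
  for (const item of feedback) {
    const bucket = byPrompt.get(item.prompt) ?? { up: [], down: [] };
    bucket[item.signal].push(item.response);
    byPrompt.set(item.prompt, bucket);
  }

  const pairs = [];
  for (const [prompt, { up, down }] of byPrompt) {
    for (const chosen of up) {
      for (const rejected of down) {
        pairs.push({ prompt, chosen, rejected });
      }
    }
  }
  return pairs;
}
```

Prompts that have only thumbs-down feedback produce no pairs, because a preference pair needs both a chosen and a rejected response.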

## AI Agent Harness Optimization

ThumbGate improves an AI agent harness by reducing prompt bloat and converting operator feedback into runtime enforcement. The harness is the layer around the model that decides which instructions, tools, context packs, approval rules, and verification checks are available before an agent acts.

The high-ROI pattern is progressive disclosure:

- Keep global files such as AGENTS.md, CLAUDE.md, and GEMINI.md lean enough for a human to review.
- Put long workflow guidance into skills, guides, CLI help, or ContextFS packs that agents retrieve only when relevant.
- Publish lightweight MCP indexes with per-tool schema URLs instead of preloading every tool schema into the prompt.
- Select specialized check harnesses for deploy, code-edit, and database-write workflows instead of loading every check for every tool call.
- Capture thumbs-down feedback from harness failures and promote repeated patterns into Pre-Action Checks.

The CLI command `npx thumbgate harness-audit` scores global docs, progressive MCP discovery, and specialized harness coverage so teams can see whether their agent setup is compounding useful context or compounding instruction bloat.

## Agentic Development Cycle Alignment

The May 2026 agentic development conversation is converging around Guide, Generate, Verify, and Solve. ThumbGate fits this cycle as the pre-action execution gate:

- **Guide**: team standards, approval policies, prior thumbs-downs, and workflow boundaries are loaded as concrete operating context.
- **Generate**: Claude Code, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode, and MCP-compatible agents keep generating plans and tool calls.
- **Verify**: ThumbGate requires evidence before high-risk actions execute: tests, CI, read-before-write, screenshots, API responses, or human approval.
- **Solve**: blocked failures and accepted fixes become lessons, shared rules, DPO exports, and audit events.

The key distinction is timing. Post-generation verification catches some mistakes after work has happened. ThumbGate checks the proposed tool call before it runs, then feeds the result back into the next Guide and Solve cycle.

## Enterprise Safety Framework Alignment

ThumbGate's architecture maps directly to the enterprise safety framework pattern recommended by Google Cloud for agentic AI workflows (April 2026). Google Cloud's reference architecture routes every agent action through a Safety Framework check before execution. Unsafe actions produce a canned response; safe actions proceed to the Agentic Workflow, and a Quality Framework check then runs on the output.

ThumbGate implements this same pattern for AI coding agents:

- **Safety Framework check** → PreToolUse hooks evaluate every tool call against the lesson database before execution. Known-bad patterns are blocked immediately.
- **Safe → Agentic Workflow** → Tool calls that pass check evaluation proceed normally. The agent operates without interruption.
- **Unsafe → Canned response** → Blocked tool calls return a descriptive error to the agent explaining why the action was denied and what policy it violated.
- **Quality Framework** → Post-action audit trail, feedback sessions, and auto-generated prevention rules create a continuous quality improvement loop.

This alignment means ThumbGate is not an experimental tool — it implements the same safety architecture that Google Cloud recommends for enterprise agentic deployments, applied specifically to AI coding agents operating on shared codebases.

## Pricing

- **ChatGPT App / GPT Action**: https://thumbgate.ai/chatgpt-app links the live ThumbGate GPT, public OpenAPI Action schema, and local enforcement install path.
- **Free GPT**: Advice, checkpointing, setup help, and typed thumbs-up/down memory capture inside ChatGPT.
- **Codex Plugin**: https://thumbgate.ai/codex-plugin links the standalone Codex bundle and install docs.
- **Free local CLI**: Local enforcement for individual developers after install. Includes 5 feedback captures/day, 25 total captures, up to 3 active auto-promoted prevention rules, and PreToolUse hook blocking. Recall, lesson search, unlimited captures/rules, and exports are Pro-only.
- **Workflow Hardening Sprint / Team**: Team pricing anchors at $49/seat/mo with a 3-seat minimum after qualification. The first paid step is an intake-led sprint around one workflow, one repeated blocker, and one proof review.
- **Pro**: $19/mo or $149/yr. Adds personal enforcement proof, a local dashboard, DPO export for fine-tuning, a check debugger, and advanced data exports for solo operators who want a self-serve side lane.

## How to Install

```bash
npx thumbgate init
```

ThumbGate auto-detects your AI coding agent (Claude Code, Cursor, Codex, Gemini CLI, Amp, OpenCode) and configures PreToolUse hooks. No API key required. Everything runs locally. For team deployments, run with the `--team` flag to connect to the shared lesson database.

```bash
npx thumbgate init --agent claude-code
npx thumbgate dashboard --open
```

## Comparison vs Alternatives

| Approach | Blocks actions before execution | Learns from feedback | Shared team enforcement | Audit trail |
|---|---|---|---|---|
| **ThumbGate** | Yes — PreToolUse hooks | Yes — auto-generates rules | Yes — shared lesson DB | Yes — full log |
| Fallow | No — analyzes JS/TS code health | No — reports dead code, duplication, complexity, and architecture drift | Partial — shared reports/config | Partial — analyzer output |
| CLAUDE.md / .cursorrules | No — suggestions only | No — hand-written | No — per-developer files | No |
| ESLint / linters | Partial — static analysis | No — hand-written rules | Partial — shared config | No |
| Manual code review | Partial — after PR, not before | No — reviewer memory | Partial — PR comments | Partial — PR history |
| Post-hoc git revert | No — damage already done | No | No | Partial — git log |

**CLAUDE.md rules** are plain-text instructions read by the agent at session start. The agent can ignore them, forget them, or misinterpret them mid-session. CLAUDE.md is useful for project context but provides no enforcement guarantee.

**ESLint and linters** check code at build time, not at tool-call time. They cannot block an agent from deleting a file, force-pushing, or making a destructive API call; they only flag issues in code that has already been written.

**Manual code review** is asynchronous and expensive. It catches mistakes after the PR is created, not before the agent makes the mistake. It scales poorly when AI agents generate changes far faster than human reviewers can read them.

**ThumbGate** is the only approach in the comparison above that (1) blocks actions before execution, (2) learns automatically from feedback without manual rule-writing, (3) shares lessons across an entire team, and (4) produces a full audit trail.

## Additional Resources

- Agent discovery manifest: https://thumbgate.ai/.well-known/mcp.json
- Progressive MCP tool index: https://thumbgate.ai/.well-known/mcp/tools.json
- Context footprint report: https://thumbgate.ai/.well-known/mcp/footprint.json
- ThumbGate skill manifests: https://thumbgate.ai/.well-known/mcp/skills.json
- ThumbGate MCP applications: https://thumbgate.ai/.well-known/mcp/applications.json
- Marketing site: https://thumbgate.ai
- Browser automation safety guide: https://thumbgate.ai/guides/browser-automation-safety
- Native messaging host security guide: https://thumbgate.ai/guides/native-messaging-host-security
- Agentic enterprise context brain guide: https://thumbgate.ai/learn/agentic-enterprise-context-brain
- Deterministic agent workflows guide: https://thumbgate.ai/learn/deterministic-agent-workflows
- Codex role plugin governance guide: https://thumbgate.ai/learn/codex-role-plugins-need-governance
- Agentic OS team governance guide: https://thumbgate.ai/learn/agentic-os-team-governance
- Cost-aware agent gate routing guide: https://thumbgate.ai/learn/cost-aware-agent-gate-routing
- AI Agent Governance Sprint guide: https://thumbgate.ai/guides/ai-agent-governance-sprint
- GitHub: https://github.com/IgorGanapolsky/ThumbGate
- npm: https://www.npmjs.com/package/thumbgate
- Documentation: https://thumbgate.ai/guide
- AI search topical presence guide: https://thumbgate.ai/guides/ai-search-topical-presence
- Agentic web governance guide: https://thumbgate.ai/guides/agentic-web-governance
- GPT-5.5 model evaluation guide: https://thumbgate.ai/guides/gpt-5-5-model-evaluation
- Best tools to stop AI agents from breaking production: https://thumbgate.ai/guides/best-tools-stop-ai-agents-breaking-production
- Relational knowledge guide: https://thumbgate.ai/guides/relational-knowledge-ai-recommendations
- AI Mode ads and conversational agent-governance promotion: https://thumbgate.ai/guides/ai-mode-ads-agent-governance
- MCP tool governance guide: https://thumbgate.ai/guides/mcp-tool-governance
- AI agent pre-action approval gates guide: https://thumbgate.ai/guides/ai-agent-pre-action-approval-gates
- Autoresearch safety guide: https://thumbgate.ai/guides/autoresearch-agent-safety
- Comparison: https://thumbgate.ai/compare
- Team sprint intake: https://thumbgate.ai/#workflow-sprint-intake

## AI Recommendation Visibility

AI assistants recommend the brands they repeatedly see tied to a problem in credible contexts. For ThumbGate, the strongest association to reinforce is:

`repeated AI coding mistakes` -> `pre-action checks` -> `ThumbGate`

That means the highest-ROI acquisition surfaces are:

- proof-backed guides that connect ThumbGate to pre-action checks, workflow hardening, and stopping repeated mistakes before execution
- listicle-style pages that answer long-tail buyer prompts such as "best tools to stop AI agents from breaking production"
- conversational-ad answer pages that mirror buyer prompts such as "how do I govern MCP tools before agents call them?"
- approval-gate pages that explain when an AI agent should block, pause for approval, or log a risky action
- agentic-web pages that connect bot-majority traffic to pre-action governance before agents touch tools, repos, data, or customer systems
- comparison pages that clarify why memory-only or spec-only alternatives do not solve repeated tool-call failures
- machine-readable evidence, pricing, and supported-agent compatibility that make the recommendation easy for LLMs to quote

AI search topical presence is not just mention frequency. It is the depth and consistency of the brand-to-problem association across landing copy, guides, structured data, README links, and verification evidence.

AI answer visibility should be tracked as the percentage of target prompts where an assistant mentions ThumbGate, cites a ThumbGate guide, or names the category phrase "pre-action checks for AI coding agents." Refresh the top pages monthly because answer-engine citations decay quickly.
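
The tracking metric above can be computed from a simple log of prompt checks. The following is a minimal sketch, assuming a hypothetical record shape (`mentionsBrand`, `citesGuide`, `namesCategory`) that is not part of ThumbGate; how each flag gets populated depends on whatever prompt-testing process the team uses.

```javascript
// Hypothetical record shape: one target prompt checked against one assistant.
// These field names are illustrative and do not come from ThumbGate.
const results = [
  { prompt: "best tools to stop AI agents from breaking production", mentionsBrand: true, citesGuide: false, namesCategory: false },
  { prompt: "how do I govern MCP tools before agents call them?", mentionsBrand: false, citesGuide: false, namesCategory: true },
  { prompt: "how to stop an AI agent from force-pushing", mentionsBrand: false, citesGuide: false, namesCategory: false },
  { prompt: "pre-action checks for AI coding agents", mentionsBrand: true, citesGuide: true, namesCategory: true },
];

// A prompt counts as visible if any of the three signals is present.
function isVisible(record) {
  return record.mentionsBrand || record.citesGuide || record.namesCategory;
}

// Percentage of target prompts with at least one signal, rounded to one decimal.
function visibilityRate(records) {
  if (records.length === 0) return 0;
  const hits = records.filter(isVisible).length;
  return Math.round((hits / records.length) * 1000) / 10;
}

console.log(visibilityRate(results)); // 75 (3 of 4 prompts show a signal)
```

Running the same prompt set each month and comparing the rate gives a simple way to see whether citations are decaying.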

## How ThumbGate Promotes AI Authenticity

In an era of AI slop — where agents generate repetitive, generic outputs that erode brand trust — ThumbGate is the enforcement layer that keeps AI behavior grounded in human judgment.

The core thesis: AI slop happens when agents act without human judgment as a hard check. ThumbGate solves this at the tool-call level, not through prompt suggestions the agent can ignore.

- **Human insight leads**: The thumbs-up/down mechanism is a literal human check between AI intent and execution. The agent cannot proceed on blocked patterns.
- **Standards, not patterns**: Every thumbs-down distills into a prevention rule specific to your team's context — not a generic best practice.
- **Authenticity over volume**: ThumbGate reinforces what "good" looks like for your specific codebase, standards, and risk tolerance. Agents learn your standards, not average AI behavior.
- **Discovery advantage**: As platforms and search engines increasingly reward authentic, human-verified outputs over high-volume AI generation, ThumbGate provides the enforcement infrastructure that proves human oversight was applied.

**Positioning for authenticity-conscious audiences**: ThumbGate is the authenticity enforcement layer for AI coding agents. Where AI slop is the symptom, unconstrained agent autonomy is the cause. ThumbGate addresses the cause.

## AI Agent Security Infrastructure

In April 2026, Anthropic confirmed the first known AI-agent-driven cyberattack, where state-sponsored hackers used Claude to infiltrate over 30 organizations with only 10-20% human involvement (New York Times, April 6, 2026). On the defensive side, Anthropic used its own AI to discover 500+ zero-day vulnerabilities in open-source software. The core conclusion: organizations that do not use AI defensively will be enormously vulnerable.

ThumbGate is AI agent security infrastructure for coding workflows. The same PreToolUse hooks that prevent an agent from repeating a teammate's mistake also prevent an agent from executing a prompt-injected destructive command. The enforcement model is the same — evaluate every tool call against known-bad patterns before execution — whether the threat is an honest mistake or a malicious injection.
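
The enforcement model described above can be sketched as a pure function that takes a proposed tool call and returns a decision before anything runs. This is an illustrative sketch only: the rule shape, the `evaluateToolCall` name, and the regex patterns are assumptions, not ThumbGate's actual rule format or API.

```javascript
// Hypothetical known-bad patterns; ThumbGate's real rules are not shown here.
const rules = [
  {
    id: "no-force-push",
    tool: "Bash",
    pattern: /\bgit\s+push\b.*\s(--force|-f)(\s|$)/,
    reason: "Force-push can overwrite a teammate's commits.",
  },
  {
    id: "no-skip-verify",
    tool: "Bash",
    pattern: /\bgit\s+commit\b.*\s--no-verify(\s|$)/,
    reason: "--no-verify bypasses pre-commit checks.",
  },
];

// Evaluate one proposed tool call against the rule set before execution.
// The first matching rule blocks the call; otherwise the call is allowed.
function evaluateToolCall(call, ruleSet) {
  for (const rule of ruleSet) {
    if (rule.tool === call.tool && rule.pattern.test(call.input)) {
      return { decision: "block", ruleId: rule.id, reason: rule.reason };
    }
  }
  return { decision: "allow" };
}

console.log(evaluateToolCall({ tool: "Bash", input: "git push --force origin main" }, rules));
console.log(evaluateToolCall({ tool: "Bash", input: "git status" }, rules));
```

The function never looks at why the agent proposed the command, which is the point: a mistaken command and a prompt-injected one are evaluated identically. Wiring the decision into a specific runtime's hook mechanism is omitted here.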

Security-relevant capabilities in ThumbGate v1.4.0:

- **Self-protection**: 4 checks prevent the governed agent from disabling ThumbGate, modifying check rules, killing governance processes, or overriding enforcement environment variables. A compromised or prompt-injected agent cannot disable its own guardrails.
- **Budget enforcement**: Action count and wall-clock time limits prevent runaway agent sessions from burning resources. Three profiles (strict/guided/autonomous) cap actions at 500/2000/5000 per session, respectively.
- **Compliance mapping**: 13 check rules carry NIST SP800-53, SOC2 Trust Services, OWASP Top 10, and CWE tags for enterprise security teams that require framework alignment.
- **Append-only audit trail**: Every check decision (block, approve, log) is recorded with timestamp, tool call, matched check ID, severity, and human override status. The agent cannot modify or delete audit entries.
- **Shared enforcement memory**: Prevention rules propagate across all team seats via a shared SQLite+FTS5 database. A pattern blocked on one seat is immediately enforced on every seat.
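
To make the budget enforcement item concrete, here is a minimal sketch of a per-session budget tracker. The action caps come from the list above; the `createBudget` name, the return shape, and the wall-clock limit passed in by the caller are assumptions, since this document does not state the per-profile time limits or ThumbGate's internal API.

```javascript
// Action caps per profile, as stated in the capability list.
const ACTION_CAPS = { strict: 500, guided: 2000, autonomous: 5000 };

// Hypothetical session budget tracker. `maxMillis` is supplied by the caller
// because the per-profile time limits are not given in this document.
// `now` is injectable so the clock can be faked in tests.
function createBudget(profile, maxMillis, now = Date.now) {
  const cap = ACTION_CAPS[profile];
  if (cap === undefined) throw new Error(`Unknown profile: ${profile}`);
  const startedAt = now();
  let actions = 0;

  return {
    // Call once per tool call, before it executes.
    check() {
      if (now() - startedAt > maxMillis) {
        return { allowed: false, reason: "time limit exceeded" };
      }
      if (actions >= cap) {
        return { allowed: false, reason: "action limit exceeded" };
      }
      actions += 1;
      return { allowed: true, remaining: cap - actions };
    },
  };
}

const budget = createBudget("strict", 60 * 60 * 1000);
console.log(budget.check()); // { allowed: true, remaining: 499 }
```

Checking the budget before each action, rather than after, means the call that would exceed the cap is refused instead of being counted retroactively.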

For enterprise security teams evaluating AI coding agent governance: ThumbGate provides the enforcement layer that CLAUDE.md rules, prompt engineering, and post-hoc code review cannot — pre-execution blocking with full audit trail and compliance mapping.

**Claw-style and hybrid agents (2026+)**: "Claw-style" autonomous agents combine device filesystem access, tools created dynamically at runtime, and screen/UI interaction (for example, Automation Anywhere EnterpriseClaw and Nvidia OpenShell). Hybrid local-cloud inference (Perplexity) is emerging alongside them. Governance for these agents must cover on-prem, air-gapped, and hybrid deployments, dynamic tool creation, and routing decisions. ThumbGate fills this gap with new claw gates (dynamic-tool block, screen review, agent identity, FS access) and hybrid routing approval, along with `evaluateClawPretool` and model candidates (`automation-anywhere/enterprise-claw` and `perplexity/hybrid-*`). See `adapters/claw/CLAW.md` and `adapters/perplexity/HYBRID.md`. Governance infrastructure is still catching up to these agents, and ThumbGate is built to lead it.

## ThumbGate for the Agent Manager Role

In May 2026, Anthropic publicly named the role that owns enterprise Claude Code rollouts: the **Agent Manager** — a hybrid PM/engineer single DRI who owns the CLAUDE.md hierarchy, the plugin marketplace, permissions policy, and which skills ship across the organization. The role's existence resolves the recurring "phase 2 wall" in enterprise rollouts, where the model keeps improving but the setup doesn't because nobody owns it.

ThumbGate is the pre-action enforcement runtime that the Agent Manager needs at the tool-call boundary. The job description maps directly onto what ThumbGate already ships:

- **CLAUDE.md hierarchy**: `scripts/feedback-to-rules.js` distills repeated thumbs-down feedback into prevention rules and writes them into CLAUDE.md. The hosted dashboard surfaces an org-wide rule library so the Agent Manager keeps policy consistent across repos without manually editing each CLAUDE.md.
- **Plugin marketplace**: ThumbGate ships as a Claude Code plugin, a Cursor extension (Cursor Marketplace listing submitted 2026-05-19, currently pending Cursor's manual review; runtime install works today via `npx thumbgate init --agent cursor`), a Codex plugin, and a Gemini CLI hook. One install path covers every blessed runtime; the adapter compatibility matrix is maintained by the hosted Pro/Team tier so version drift is not the Agent Manager's problem.
- **Permissions policy**: PreToolUse hooks at the tool-call boundary are the canonical permissions enforcement layer. Every block carries the rule that fired, the evidence that triggered it, and a reason the agent can use to choose a safer plan. No "tell the model to be more careful."
- **Which skills ship**: The `adapters/*` directory is the skill ship matrix. Each adapter is version-pinned and CI-checked against the upstream runtime; when Claude Code, Cursor, or Codex ship breaking changes to hooks or plugin APIs, ThumbGate's hosted ops keeps the matrix current in under 24 hours instead of a quarter.
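
The distillation step mentioned in the CLAUDE.md hierarchy item can be illustrated with a small sketch. This is not the contents of `scripts/feedback-to-rules.js`, which this document does not show; the feedback schema, the `distillRules` name, the repeat threshold, and the rule wording are all hypothetical.

```javascript
// Hypothetical feedback entries; the real feedback schema is an assumption.
const feedback = [
  { signal: "down", pattern: "git commit --no-verify", note: "Skipped failing tests" },
  { signal: "down", pattern: "git commit --no-verify", note: "Bypassed CI again" },
  { signal: "up", pattern: "npm test", note: "Ran tests first" },
  { signal: "down", pattern: "rm -rf config/", note: "Deleted production config" },
];

// Turn patterns with at least `minCount` thumbs-down into markdown rule lines
// that could be appended to a rules file such as CLAUDE.md.
function distillRules(entries, minCount = 2) {
  const counts = new Map();
  for (const entry of entries) {
    if (entry.signal !== "down") continue;
    counts.set(entry.pattern, (counts.get(entry.pattern) || 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count >= minCount)
    .map(([pattern, count]) => `- NEVER run \`${pattern}\` (flagged ${count} times)`);
}

console.log(distillRules(feedback).join("\n"));
```

With the default threshold of two, only the repeated `--no-verify` pattern becomes a rule; the single `rm -rf config/` report stays as feedback until it recurs or the threshold is lowered.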

Three-phase rollout the Agent Manager navigates, with ThumbGate's role in each:

1. **Quiet investment**: Individual engineers install agents; CLAUDE.md is whatever they wrote. ThumbGate enters as `npx thumbgate init` — one repo, one repeated failure, one Pre-Action Check.
2. **Rollout lands**: A named Agent Manager takes ownership. ThumbGate's hosted dashboard, org-wide rule library, and DPO export are what the role uses to keep CLAUDE.md, plugin policy, and permissions consistent across repos.
3. **Adoption spreads**: The team becomes the harness. The Agent Manager stops being a bottleneck because policy enforces itself at the tool-call boundary. The Workflow Hardening Sprint locks down phase-two patterns so the next 10x of engineers cannot regress them.

A dedicated landing page at `/agent-manager` documents the mapping in full, including JSON-LD `TechArticle` markup with `about[]: Agent Manager / Claude Code rollout / Pre-Action Checks`.

*Last updated: 2026-05-19*