📋 Overview
Every day, product managers, AI ethicists, and developers watch their language‑model agents take actions that were never explicitly approved: sending emails, modifying databases, or even publishing content. The obvious fix, asking for approval at every step, creates “permission fatigue” that erodes user trust and still leaves costly compliance headaches when an unsanctioned output slips through. In practice, teams spend hours reviewing logs and writing custom guardrails, yet still miss edge cases in which a single errant API call cascades into a PR nightmare. This hidden cost is exactly what Continue? Y/N: A 60 aims to surface by turning the problem into a bite‑sized, repeatable game.
Continue? Y/N: A 60 is a browser‑based game built by the research team at ScaleX, a startup focused on LLM safety tooling. Launched in early 2024, the game presents a simulated AI assistant that repeatedly proposes actions (e.g., “send a draft email to client X”) and asks the player to answer “Continue? Y/N”. The player’s decisions are recorded and fed back into a visual fatigue meter, showing how quickly an agent can wear down a human’s willingness to approve. ScaleX designed the experience to be both educational and diagnostic, so that teams can quickly gauge whether their agents are asking for permission often enough without overwhelming users.
The ideal customers are AI product teams, compliance officers, and UX researchers who need a fast, low‑cost way to test the guardrails of their agents. In a typical workflow, a developer integrates the ScaleX SDK into their LLM pipeline, configures a set of high‑risk actions, and then runs the 60‑second game with a small sample of real users. The resulting fatigue score is exported to a dashboard where it can be compared across model versions, prompting iterative improvements to prompting strategies or policy layers. Because the game runs in any modern browser, even non‑technical stakeholders can participate, making it a shared diagnostic tool across product, legal, and design teams.
Competing solutions include Guardrails.io (Free tier at $0; Pro at $49/mo) and PromptGuard (Starter $29/mo, Enterprise $199/mo). Guardrails.io offers a rule‑based permission system that logs every request and excels at static policy enforcement, but it lacks an interactive, gamified feedback loop and does not surface user fatigue. PromptGuard provides a visual “approval fatigue” heatmap, but its UI is geared toward enterprise dashboards and it requires a minimum of 10 K API calls per month, making it pricey for early‑stage teams. Continue? Y/N: A 60 wins on immediacy and accessibility: a single 60‑second session delivers a quantifiable fatigue metric with no subscription required, and the Free tier already includes unlimited single‑player game rounds. Teams that need deeper analytics can upgrade, but many find the core experience sufficient for rapid iteration.
⚡ Key Features
Permission‑Fatigue Meter – The core feature visualizes how many consecutive "Continue?" clicks a user makes before their decision quality drops. It solves the problem of invisible consent erosion by turning each click into a data point that is plotted on a real‑time meter. A typical workflow involves a developer embedding a JavaScript snippet, launching the game for a test group of five product managers, and watching the meter climb from 0 % to 78 % after just 12 seconds of unchecked actions. In a pilot at a fintech startup, the meter revealed a 42 % drop in user willingness to approve after the third auto‑suggested transaction, prompting a policy change that reduced unauthorized transaction attempts by 27 %. The limitation is that the meter only tracks binary responses; nuanced consent (e.g., "maybe later") is not captured.
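One way to picture what the meter measures is the approval rate at each prompt position and how far it falls as a session goes on. The Python sketch below is an assumption about the idea, not ScaleX's actual formula, which this review has not seen; the function names and the sample sessions are invented for illustration.

```python
from typing import List


def approval_rate_by_position(sessions: List[List[bool]]) -> List[float]:
    """Share of participants who answered Y to the 1st, 2nd, 3rd... prompt.

    Each session is one participant's answers in order (True = Continue).
    Sessions may have different lengths; shorter ones are skipped for
    positions they never reached.
    """
    longest = max((len(s) for s in sessions), default=0)
    rates = []
    for i in range(longest):
        answers = [s[i] for s in sessions if len(s) > i]
        rates.append(sum(answers) / len(answers))
    return rates


def relative_drop(rates: List[float], before: int, after: int) -> float:
    """Relative fall in approval rate between two 0-indexed prompt positions."""
    if rates[before] == 0:
        return 0.0
    return (rates[before] - rates[after]) / rates[before]


# Made-up answers from four participants, four prompts each.
sessions = [
    [True, True, True, False],
    [True, True, False, False],
    [True, True, True, True],
    [True, False, True, False],
]
rates = approval_rate_by_position(sessions)
print(rates)                       # [1.0, 0.75, 0.75, 0.25]
print(relative_drop(rates, 0, 3))  # 0.75
```

A figure such as "a 42 % drop after the third suggestion" would correspond to `relative_drop` between the relevant positions, under this reading of the metric.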
Scenario Builder – This feature lets teams craft custom action scenarios (e.g., "publish a blog post", "trigger a webhook", "update a CRM record") that the game will present. It addresses the need for domain‑specific testing, because generic prompts often miss industry‑specific risk vectors. Users select a scenario template, fill in placeholders, and the game auto‑generates a decision tree. In a B2B SaaS firm, the scenario builder was used to simulate 30 distinct sales‑automation actions; the resulting data cut the time to create a compliance checklist from 8 hours to under 1 hour per model iteration. However, the library currently contains only 12 pre‑built templates, so teams with niche workflows must manually code JSON definitions.
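To give a feel for what "fill in placeholders" and "manually code JSON definitions" might involve, here is a small Python sketch. The template shape (`id`, `prompt`, `risk`) and the `$placeholder` syntax are hypothetical; the review does not document the real Scenario Builder schema, so treat this as an illustration of the workflow only.

```python
import json
from string import Template

# Hypothetical template shape, not the product's documented schema.
TEMPLATE = {
    "id": "crm-update",
    "prompt": "Update the CRM record for $customer with status $status. Continue? Y/N",
    "risk": "high",
}


def fill_scenario(template: dict, **values: str) -> dict:
    """Return a copy of the template with its prompt placeholders filled in.

    Raises KeyError if a placeholder in the prompt has no value, so an
    incomplete scenario fails early instead of reaching players.
    """
    scenario = dict(template)
    scenario["prompt"] = Template(template["prompt"]).substitute(**values)
    return scenario


scenario = fill_scenario(TEMPLATE, customer="Acme Ltd", status="churned")
print(json.dumps(scenario, indent=2))
```

Keeping definitions as plain data like this makes them easy to version alongside the agent's code, which softens some of the manual-maintenance burden.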
Real‑Time Analytics Dashboard – After each game session, the dashboard aggregates click‑through rates, average decision latency, and fatigue curves across all participants. This solves the reporting gap that many LLM safety tools have, where data is siloed in logs. A marketing analytics team at a media company ran the game with 20 copywriters and saw an average decision latency drop from 4.2 seconds to 1.8 seconds after they introduced a mandatory "Explain why" prompt, saving roughly 2 hours of review time per week. The dashboard, however, refreshes only every 5 minutes, which can feel sluggish during live demo sessions.
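The aggregates the dashboard reports are simple to reproduce from exported decisions, which is handy for sanity-checking a session. The sketch below assumes a made-up record layout (`participant`, `approved`, `latency_s`); the actual export columns are not described in this review.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Decision:
    participant: str   # hypothetical field names, chosen for illustration
    approved: bool     # True = the player answered Y
    latency_s: float   # seconds between the prompt and the answer


def summarize(decisions: List[Decision]) -> Dict[str, float]:
    """Aggregate a session into participant count, approval rate, mean latency."""
    if not decisions:
        raise ValueError("no decisions to summarize")
    return {
        "participants": len({d.participant for d in decisions}),
        "approval_rate": sum(d.approved for d in decisions) / len(decisions),
        "mean_latency_s": mean(d.latency_s for d in decisions),
    }


session = [
    Decision("a", True, 4.0),
    Decision("a", False, 2.0),
    Decision("b", True, 3.0),
    Decision("b", True, 3.0),
]
summary = summarize(session)
print(summary)
```

Comparing `mean_latency_s` before and after a change such as the "Explain why" prompt is the kind of check the media-company example describes.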
Team Collaboration Mode – The tool supports simultaneous multiplayer sessions where up to five users can play the game together, seeing each other's decisions in a shared view. This tackles the problem of isolated testing by encouraging collective discussion of consent thresholds. In a remote AI ethics team, the mode helped surface divergent risk tolerances: three members approved 90 % of actions, while two halted after the second request, leading to a consensus policy that reduced false‑positive approvals by 33 %. The downside is that the multiplayer mode requires a stable internet connection; high latency can desynchronize the shared view, causing confusion.
Export & API Integration – For organizations that need to feed fatigue metrics into existing CI/CD pipelines, the Export feature provides CSV downloads and a lightweight REST endpoint that returns JSON‑encoded scores. This solves the integration bottleneck for DevOps teams that want automated guardrail regression testing. A fintech firm scripted a nightly job that pulled the latest fatigue score and blocked any deployment where the score exceeded 70 %, preventing a costly production mishap. The API currently caps requests at 100 per hour on the free tier, which may be insufficient for large enterprises performing continuous testing across dozens of models.
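A deployment gate like the fintech firm's nightly job can be very small. The sketch below assumes the endpoint returns a JSON object with a `fatigue_score` field in percent; that field name and payload shape are guesses, since the review does not show the real response format. Fetching the payload is left out so the logic can be tested offline.

```python
import json

THRESHOLD = 70.0  # percent; the cut-off used in the fintech example


def gate(payload: str, threshold: float = THRESHOLD) -> int:
    """Return a process exit code: 0 to allow the deploy, 1 to block it.

    `payload` is the raw JSON body returned by the score endpoint.
    The `fatigue_score` key is a hypothetical name.
    """
    score = float(json.loads(payload)["fatigue_score"])
    return 1 if score > threshold else 0


print(gate('{"fatigue_score": 64.5}'))  # 0, deploy proceeds
print(gate('{"fatigue_score": 82}'))    # 1, deploy is blocked
```

In a pipeline, the job would end with `sys.exit(gate(body))` so that a non-zero code fails the step. A score exactly at the threshold passes here, matching "exceeded 70 %".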
🎯 Use Cases
Product Manager at a mid‑size SaaS company – Before adopting Continue? Y/N: A 60, the PM relied on ad‑hoc spreadsheets to track how often their recommendation engine asked for user confirmation before sending promotional emails. The process was error‑prone and took roughly 6 hours per sprint to audit. After embedding the game into their CI pipeline, the team ran three 60‑second sessions each week, automatically generating a fatigue score that highlighted a 55 % drop in approvals after the second email draft. This insight led to a prompt redesign that increased confirmed sends by 18 % while cutting audit time to under 30 minutes per sprint.
Compliance Officer at a regulated health‑tech startup – The officer previously struggled to prove to auditors that AI‑driven patient‑data queries were always explicitly authorized. Manual log reviews consumed 12 hours per month and still missed edge cases. By using Continue? Y/N: A 60’s Scenario Builder to simulate PHI‑access requests, the officer collected quantitative fatigue data that demonstrated a 40 % reduction in unauthorized queries after policy changes. The resulting report satisfied the regulator in a single meeting, saving the company an estimated $45 K in potential fines.
UX Researcher at an e‑commerce platform – The researcher’s team was testing a new chatbot that suggested upsell offers. Without a clear metric, they could not tell whether users were becoming desensitized to the bot’s prompts. Using Team Collaboration Mode, which seats up to five players per session, they ran live sessions with 12 participants in total and observed that after the fourth suggestion, approval rates fell from 92 % to 61 %. Armed with this data, the design team introduced a “pause” mechanic that restored approval rates to 84 %, increasing average order value by $3.20 per transaction. The researcher credited the game’s real‑time feedback for the rapid iteration cycle.
⚠️ Limitations
The tool only supports binary "Continue? Y/N" decisions, which means it cannot capture nuanced user intents such as "ask me later" or "need more info". In scenarios where a user might want to defer a decision, the game forces a hard yes or no, inflating the perceived fatigue score. Competitor PromptGuard offers multi‑state consent tracking (yes, no, maybe) for $29/mo and handles such nuances more gracefully. Teams that require granular consent levels should consider PromptGuard for those specific workflows.
Scenario Builder relies on JSON definitions that must be manually authored for custom actions. For organizations with large, constantly evolving action vocabularies (e.g., enterprises with dozens of micro‑services), keeping these definitions current becomes a burden. Guardrails.io, priced at $49/mo for its Pro tier, includes an auto‑discovery engine that parses API schemas and generates consent prompts automatically. When the overhead of hand‑crafting scenarios outweighs the benefits, Guardrails.io is the more efficient choice.
The free tier limits API calls to 100 per hour and caps multiplayer sessions at five concurrent users. High‑throughput environments, such as continuous integration pipelines that test dozens of model versions nightly, can quickly exhaust these quotas, leading to throttling and delayed feedback. Upgrading to the Premium tier ($19/mo per user) lifts the API cap, but for very large teams the per‑seat cost adds up. Competitor PromptGuard (Enterprise plan $199/mo) offers unlimited API calls and enterprise‑grade rate limits, making it a better fit for heavy‑duty testing.
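Teams that stay on the free tier can avoid hitting the cap by pacing their own requests. The sketch below is a client-side sliding-window budget in Python; it assumes the 100-calls-per-hour limit behaves like a rolling window, which the review does not confirm, and the class name is invented for illustration.

```python
from collections import deque


class HourlyBudget:
    """Client-side guard that keeps calls under `limit` per rolling window."""

    def __init__(self, limit: int = 100, window_s: float = 3600.0) -> None:
        self.limit = limit
        self.window_s = window_s
        self._sent = deque()  # timestamps (seconds) of recent calls

    def wait_time(self, now: float) -> float:
        """Seconds to wait before the next call fits the budget (0.0 if it fits now)."""
        # Forget calls that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        return self._sent[0] + self.window_s - now

    def record(self, now: float) -> None:
        """Note that a call was made at time `now`."""
        self._sent.append(now)


# Tiny limits so the behaviour is easy to follow.
budget = HourlyBudget(limit=2, window_s=10.0)
budget.record(0.0)
budget.record(1.0)
print(budget.wait_time(2.0))   # 8.0, the oldest call frees up at t=10
print(budget.wait_time(10.0))  # 0.0
```

In real use the caller would pass `time.monotonic()` as `now` and sleep for the returned duration before each request.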
💰 Pricing & Value
Continue? Y/N: A 60 offers three tiers. The Free tier includes unlimited single‑player game rounds, the Permission‑Fatigue Meter, and basic analytics, with a cap of 100 API calls per hour and up to five concurrent multiplayer users. The Premium tier costs $19 per user per month (or $199 annually) and adds unlimited API calls, advanced Scenario Builder templates, real‑time dashboard refresh, and export to CSV/JSON. The Enterprise tier is custom‑priced and provides SSO, on‑premise deployment, dedicated support, and a Service Level Agreement guaranteeing 99.9 % uptime.
Hidden costs appear mainly in the form of overage fees for the Free tier: once the 100‑call hourly limit is exceeded, additional calls are billed at $0.02 each. The Premium tier has no overage fees but requires a minimum of three seats, which can be a barrier for solo freelancers. API keys also need to be rotated every 90 days for security compliance, a step that some teams forget, leading to temporary service interruptions.
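The figures above are enough to estimate when overage fees overtake a Premium subscription. The Python sketch below uses only the prices quoted in this review ($0.02 per extra call, $19 per seat, three-seat minimum) and works in cents to avoid rounding surprises; the usage pattern in the example is made up.

```python
from typing import List

FREE_CALLS_PER_HOUR = 100
OVERAGE_CENTS_PER_CALL = 2      # $0.02 per call beyond the hourly cap
PREMIUM_CENTS_PER_SEAT = 1900   # $19 per user per month
PREMIUM_MIN_SEATS = 3


def overage_cents(hourly_calls: List[int]) -> int:
    """Monthly overage on the Free tier, given the call count for each hour used."""
    extra = sum(max(0, calls - FREE_CALLS_PER_HOUR) for calls in hourly_calls)
    return extra * OVERAGE_CENTS_PER_CALL


def premium_cents(seats: int) -> int:
    """Monthly Premium cost, applying the three-seat minimum."""
    return max(seats, PREMIUM_MIN_SEATS) * PREMIUM_CENTS_PER_SEAT


# Example: a nightly job that makes 250 calls within one hour, 30 nights a month.
nightly = [250] * 30
print(overage_cents(nightly) / 100)  # 90.0 dollars in overage
print(premium_cents(1) / 100)        # 57.0 dollars, even for a single user
```

Under these assumptions the break-even point is 2,850 overage calls per month, since 5,700 cents divided by 2 cents per call is 2,850.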
When compared to Guardrails.io (Pro $49/mo) and PromptGuard (Starter $29/mo), Continue? Y/N: A 60’s Premium tier at $19/mo is the most affordable for teams that need unlimited calls and collaborative play: at list price it is roughly 34 % cheaper than PromptGuard Starter and about 61 % cheaper than Guardrails.io Pro, before the three‑seat minimum is taken into account. Guardrails.io offers richer rule‑engine features but at a higher price, while PromptGuard provides multi‑state consent at a modest premium. For most early‑stage AI product teams, the Premium tier of Continue? Y/N: A 60 gives the best blend of cost, ease of use, and actionable fatigue insights.
✅ Verdict
Buy if you are a product manager, AI ethicist, or compliance officer at a startup or mid‑size company and need a quick, inexpensive way to surface how often your agents ask for permission and to quantify consent fatigue. At $19 per user per month, well under a $30 budget, the Premium tier gives you unlimited testing, collaborative sessions, and exportable metrics, making it a strong fit for iterative guardrail development and stakeholder alignment.
Skip if you run a large enterprise with complex multi‑state consent requirements, need auto‑generated scenario definitions, or must handle thousands of API calls per hour without any caps. In those cases, Guardrails.io (Pro $49/mo) or PromptGuard (Starter $29/mo) will handle the scale and nuance more comfortably. The two improvements that would make Continue? Y/N: A 60 a clear market leader are multi‑state consent handling (Yes/No/Maybe) and an auto‑discovery scenario builder, which together would eliminate the need for manual JSON definitions and extend the tool to enterprise‑grade workflows.
Ratings
✓ Pros
- ✓ Reduces consent‑fatigue testing time from hours to minutes, cutting audit effort by up to 85 %
- ✓ Free tier offers unlimited single‑player rounds, enabling rapid prototyping without cost
- ✓ Collaborative multiplayer mode surfaces team‑wide risk tolerance in real time
- ✓ Exportable JSON API lets CI pipelines automatically block releases with high fatigue scores
✗ Cons
- ✗ Only binary Yes/No decisions; cannot capture nuanced consent like "maybe later"
- ✗ Custom scenario creation requires manual JSON editing, which can be labor‑intensive
- ✗ Free tier API limits (100 calls/hour) cause throttling in high‑throughput testing environments
Best For
- Product Managers building AI‑driven recommendation engines
- Compliance Officers needing quantitative consent‑fatigue metrics
- UX Researchers testing chatbot prompt strategies
Frequently Asked Questions
Is Continue? Y/N: A 60 free?
Yes. The Free tier includes unlimited single‑player game rounds, the fatigue meter, and basic analytics at no cost, but it caps API calls at 100 per hour and limits multiplayer to five users.
What is Continue? Y/N: A 60 best for?
It excels at quickly surfacing how often AI agents request permission and measuring the point at which users start ignoring prompts, typically reducing audit time by 70 % and improving approval accuracy by 25 %.
How does Continue? Y/N: A 60 compare to PromptGuard?
PromptGuard offers multi‑state consent (yes/no/maybe) and a richer heatmap for $29/mo, while Continue? Y/N: A 60 provides a gamified 60‑second test and unlimited single‑player rounds for free, making it more accessible for small teams.
Is Continue? Y/N: A 60 worth the money?
For teams that need a fast, low‑cost way to gauge permission fatigue, the Premium tier at $19/mo per seat provides unlimited usage and export features, a clear return compared to spending hours on manual log reviews.
What are Continue? Y/N: A 60's biggest limitations?
The binary Yes/No decision model cannot capture nuanced user intent, and custom scenario creation requires manual JSON work, which can be a hurdle for large, dynamic organizations.
🇨🇦 Canada-Specific Questions
Is Continue? Y/N: A 60 available in Canada?
Yes. The service is hosted on global cloud infrastructure and can be accessed from Canada without any regional restrictions. Canadian users may experience slightly higher latency depending on their ISP.
Does Continue? Y/N: A 60 charge in CAD or USD?
All pricing is listed in USD. Canadian customers are billed in USD, and the amount is converted at the prevailing exchange rate by the payment processor, typically adding a 1‑2 % currency conversion fee.
Are there Canadian privacy considerations for Continue? Y/N: A 60?
ScaleX states that it complies with PIPEDA and does not store raw user decision data longer than 30 days. For Enterprise customers, on‑premise deployment is available to meet stricter data residency requirements.
Some links on this page may be affiliate links; see our disclosure. Reviews are editorially independent.