Every score is calculated. Not curated.
The NGS is a structured scoring framework that converts real testing data into comparable, objective scores across 8 pillars, 3 personas, and 4 analysis lenses.
Built on data, not opinions
Most AI tool rankings tell you what someone thinks. The NGS tells you what the tool actually did, measured against a fixed protocol and converted by fixed rules.
Real-workflow testing
Every tool runs the same benchmark protocol: same prompts, same workflow stages, same verification steps. Tools are tested on free-tier access where possible, so the score reflects what you get before paying.
Persona-weighted scoring
A Freelancer and an SEO Specialist have fundamentally different needs. Three persona weight matrices shift how pillar scores contribute to the final NGS. The same tool scores differently for each persona.
Threshold-based mapping
Raw inputs (seconds, grade levels, feature counts, binary checkpoints) convert to 1–10 pillar scores via fixed mapping tables. Zero human judgment in the conversion. The same input always produces the same score.
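A minimal sketch of how a fixed threshold table could turn a raw measurement into a 1–10 pillar score. The cut points in `SPEED_THRESHOLDS` and the function name are hypothetical, chosen only for illustration; the actual NGS mapping tables are not reproduced here.

```python
from bisect import bisect_left

# Hypothetical upper bounds (in seconds) for a "lower is better" input.
# These cut points are illustrative, not the real NGS table.
SPEED_THRESHOLDS = [5, 10, 15, 20, 30, 45, 60, 90, 120]


def map_lower_is_better(value, thresholds):
    """Map a raw measurement to a 1-10 score using fixed, ascending upper bounds."""
    if len(thresholds) != 9:
        raise ValueError("nine cut points are needed to define ten score bands")
    # bisect_left counts how many bounds are strictly below the value,
    # which is the number of bands the value has fallen past.
    bands_passed = bisect_left(thresholds, value)
    return 10 - bands_passed


if __name__ == "__main__":
    for seconds in (4, 12, 61, 500):
        print(seconds, "->", map_lower_is_better(seconds, SPEED_THRESHOLDS))
```

Because the function is a pure table lookup, the same input always yields the same score, which is the property the mapping stage relies on.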
From raw test data to a comparable score
Every NGS score is produced by the same three-stage conversion, so any two tools can be placed on the same scale and compared directly.
Raw Tool Data
Real measurements captured from the benchmark protocol during live testing.
NGS Mapping Engine
Inputs convert to 1–10 pillar scores using fixed threshold tables. No manual overrides on active tools.
Final NGS Score
Weighted pillar scores produce the composite 0–10 score, 3 persona breakdowns, and 4 lens scores.
The 8 performance pillars
Every tool is tested and scored across all 8 pillars. The underlying measurements never change; only the persona weights that determine each pillar's contribution to the final score do.
Quality of generated content. Evaluated via Hemingway readability grade, tone fit across 5 criteria, SEO keyword compliance checkpoints, and a 5-point content quality checklist.
Friction between start and a usable result. Measured by counting required workflow stages from blank screen to complete output; fewer stages score higher.
Factual reliability under verification. Three specific claims from generated output are checked against named primary sources. Unverifiable claims count as incorrect, regardless of plausibility.
Time from final trigger action to complete, usable output. Stopwatch-measured in seconds: generation time only, not setup. Consistent measurement across all tools in the same archetype.
Feature sophistication for the tool's archetype. For SEO tools: keyword integration, SERP competitor analysis, real-time web research, citation capability. Only features accessible on the tested plan count.
Workflow compatibility. Count of native integrations, API access, Zapier connectors, and platform plugins available on the tested plan tier. Critical weight for agency stacks.
Economic value as price per 1,000 words (PPU1000) on the available plan. Tools with unlimited-word plans use the monthly entry price. Monthly billing rate only; annual discounts are excluded.
Barrier to first usable output. Scored via two 10-point binary checklists: Setup Complexity (friction before generation begins) and Documentation Quality (ability to self-onboard without support).
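Three of the raw inputs described above are simple enough to sketch directly. The function names, the verdict labels, and the reading of "use the monthly entry price" for unlimited plans (the price itself stands in for PPU1000) are assumptions made for illustration; how each raw value then maps to a pillar score is left to the fixed tables.

```python
from typing import List, Optional


def price_per_1000_words(monthly_price: float, monthly_word_limit: Optional[int]) -> float:
    """PPU1000 on the monthly billing rate (annual discounts are not considered)."""
    if monthly_word_limit is None:
        # Assumption: an unlimited-word plan is represented by its entry price.
        return monthly_price
    if monthly_word_limit <= 0:
        raise ValueError("word limit must be positive")
    return monthly_price / monthly_word_limit * 1000


def verified_claim_count(verdicts: List[str]) -> int:
    """Count claims confirmed against a primary source.

    Anything other than "verified" (including "unverifiable") counts as incorrect.
    """
    if len(verdicts) != 3:
        raise ValueError("the protocol checks exactly three claims")
    return sum(1 for verdict in verdicts if verdict == "verified")


def checklist_points(checks: List[bool]) -> int:
    """Total for one 10-point binary checklist (one point per passed item)."""
    if len(checks) != 10:
        raise ValueError("a checklist has exactly ten binary items")
    return sum(1 for passed in checks if passed)


if __name__ == "__main__":
    print(price_per_1000_words(9.0, 50_000))
    print(verified_claim_count(["verified", "unverifiable", "incorrect"]))
    print(checklist_points([True] * 7 + [False] * 3))
```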
Why the same tool can score differently
Pillar scores never change. Their contribution to the final NGS does. Select a persona to see how the weight distribution shifts, and why.
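The effect of persona weighting can be sketched as a weighted sum over one fixed set of pillar scores. Everything numeric here is hypothetical: the pillar keys, the example scores, and both weight matrices are invented for illustration, and the third persona is omitted because it is not named in this section.

```python
# Illustrative pillar scores for one tool (1-10 each). Keys are hypothetical labels.
PILLAR_SCORES = {
    "output_quality": 7, "ease": 8, "accuracy": 5, "speed": 9,
    "features": 4, "integration": 3, "value": 9, "onboarding": 8,
}

# Hypothetical weight matrices; each row sums to 1.0.
PERSONA_WEIGHTS = {
    "freelancer": {
        "output_quality": 0.20, "ease": 0.15, "accuracy": 0.10, "speed": 0.10,
        "features": 0.05, "integration": 0.05, "value": 0.25, "onboarding": 0.10,
    },
    "seo_specialist": {
        "output_quality": 0.20, "ease": 0.05, "accuracy": 0.20, "speed": 0.05,
        "features": 0.25, "integration": 0.10, "value": 0.10, "onboarding": 0.05,
    },
}


def composite_score(pillar_scores, weights):
    """Weighted sum of pillar scores; the pillar scores themselves are never altered."""
    if set(weights) != set(pillar_scores):
        raise ValueError("weights must cover exactly the scored pillars")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(pillar_scores[name] * weight for name, weight in weights.items())


if __name__ == "__main__":
    for persona, weights in PERSONA_WEIGHTS.items():
        print(persona, round(composite_score(PILLAR_SCORES, weights), 2))
```

The same eight numbers go in for every persona; only the weights differ, so any gap between the two composites comes entirely from the weight matrices.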
The 4 analysis lenses
Beyond the composite NGS score, four lenses group pillar scores into interpretable dimensions. You can see exactly where a tool excels and where it falls short.
The tool's composite score against the best-fit persona's weighted priorities. High Persona Fit means the NGS is persona-consistent. The tool genuinely performs where that persona needs it to.
Economic efficiency. How much output quality and usability you get relative to cost and barrier to entry. A high Value score means the tool punches above its price point.
Throughput efficiency. How quickly and smoothly the tool moves you from blank page to usable draft, accounting for speed, ease, integration, and output quality.
Raw AI capability. What does the tool produce, how factually reliable is it, how sophisticated are its features, and how fast does it deliver? Capability-first, cost-agnostic.
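As a rough sketch, a lens can be read as an aggregate over a subset of pillar scores. The pillar groupings for the two lenses below follow the descriptions above; the pillar keys, the lens keys, and the use of an unweighted mean are assumptions, since the section does not state how pillars are combined within a lens.

```python
from statistics import mean

# Pillar subsets taken from the lens descriptions; key names are hypothetical.
LENS_PILLARS = {
    "throughput": ["speed", "ease", "integration", "output_quality"],
    "capability": ["output_quality", "accuracy", "features", "speed"],
}


def lens_score(pillar_scores, lens):
    """Unweighted mean of the pillars in a lens (the aggregation rule is assumed)."""
    return mean([pillar_scores[name] for name in LENS_PILLARS[lens]])


if __name__ == "__main__":
    scores = {
        "output_quality": 7, "ease": 8, "accuracy": 5, "speed": 9,
        "features": 4, "integration": 3, "value": 9, "onboarding": 8,
    }
    for lens in LENS_PILLARS:
        print(lens, lens_score(scores, lens))
```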
The NGS in action: Rytr
Real score. Real test. Rytr was evaluated using the Generalist AI Writer protocol on Apr 13, 2026. Every number below came directly from that session.
What the number actually means
Every NGS maps to one of four tiers. Thresholds are fixed. A tool cannot move tiers without a measurable improvement in its real-world test data.
Significant gaps in core performance. Not recommended for production use without heavy editorial oversight on every output.
Functional for low-stakes use cases. Meaningful weaknesses exist. Fit is highly persona and workflow dependent.
Reliable across most workflows. Clear strengths with identifiable trade-offs. Recommended for matched personas.
Consistently strong across pillars with no critical weaknesses. Top-tier recommendation for matched workflows.
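Fixed tier thresholds amount to a lookup over ascending cut points. In the sketch below the cut points (4.0, 6.0, 8.0) and the generic tier labels are hypothetical placeholders, since neither the real thresholds nor the tier names appear in this section.

```python
from bisect import bisect_right

# Hypothetical lower bounds for tiers 2, 3 and 4 on the 0-10 composite scale.
TIER_CUTS = [4.0, 6.0, 8.0]
# Placeholder labels, ordered from the weakest tier to the strongest.
TIER_LABELS = ["tier_1", "tier_2", "tier_3", "tier_4"]


def tier_for(ngs_score):
    """Return the tier whose fixed range contains the score (lower bounds inclusive)."""
    if not 0 <= ngs_score <= 10:
        raise ValueError("NGS composite scores lie between 0 and 10")
    return TIER_LABELS[bisect_right(TIER_CUTS, ngs_score)]


if __name__ == "__main__":
    for score in (3.9, 4.0, 7.9, 8.0):
        print(score, "->", tier_for(score))
```

With fixed cut points, a tool changes tier only when its composite score crosses a boundary, which in turn requires different test data.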
The 6 NGS Archetypes
Not all AI writing tools are built for the same job. Before scoring begins, each tool is classified into one of six archetypes, based on what it's actually designed to produce. Archetype determines which benchmark protocol is applied, and it's the second axis in every NGS result alongside your persona.
SEO Content System
Tools built end-to-end for search-optimized article production. Must demonstrate: keyword structuring, heading hierarchy, SERP-aware output, and real-time citation capability.
Generalist AI Writer
Broad-purpose tools that cover multiple content formats: blogs, emails, ads, social, and more. Scored across the widest format range with moderate depth expectations per format.
Conversion Copy System
Tools optimized for performance-driven copy: ads, landing pages, email sequences, product descriptions. Benchmarked for persuasion structure, CTA clarity, and A/B testability.
Brand Content Ops
Tools built for teams maintaining a consistent brand voice across channels and collaborators. Scored on tone control, multi-user workflows, approval layers, and style guide adherence.
Rewrite / Polish Layer
Tools that operate on existing content: paraphrasing, tone-shifting, grammar correction, and structural refinement. Benchmarked on fidelity to source, transformation quality, and detection resistance.
GTM Workflow Platform
All-in-one tools that go beyond writing, combining content creation with project management, publishing, distribution, or analytics. Scored on end-to-end workflow coverage and integration depth.
Benchmark protocols
Every tool is tested using a defined benchmark protocol matched to its archetype. Protocols are not changed between tools in the same archetype; if the protocol changes, all tools in that archetype are re-tested together.
Protocol assignment
Each archetype maps to a named protocol: longform_seo, conversion_copy, rewrite_polish, generalist_multiformat, brand_voice_ops, or gtm_workflow. The protocol defines exactly which prompts, formats, and evaluation criteria apply.
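The archetype-to-protocol assignment is naturally a fixed lookup. The protocol identifiers below are the ones listed above; pairing each with an archetype by name similarity is an assumption, as is the function name.

```python
# Pairing inferred from the names; the page lists the identifiers but not the pairs.
ARCHETYPE_PROTOCOLS = {
    "SEO Content System": "longform_seo",
    "Generalist AI Writer": "generalist_multiformat",
    "Conversion Copy System": "conversion_copy",
    "Brand Content Ops": "brand_voice_ops",
    "Rewrite / Polish Layer": "rewrite_polish",
    "GTM Workflow Platform": "gtm_workflow",
}


def protocol_for(archetype):
    """Return the benchmark protocol for an archetype; unknown archetypes are rejected."""
    try:
        return ARCHETYPE_PROTOCOLS[archetype]
    except KeyError:
        raise ValueError(f"unknown archetype: {archetype!r}") from None


if __name__ == "__main__":
    print(protocol_for("Generalist AI Writer"))
```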
Controlled prompt set
Each tool receives identical prompts within its protocol. Prompts are drawn from real workflows (an actual SEO brief, a real client email sequence, or a live product page), not synthetic test cases. No prompt is modified between tools.
Pillar-by-pillar scoring
Each of the 8 pillars is scored independently on the 1–10 scale. Output Quality and Accuracy are scored using the NGS verification protocol: 3 factual claims per output are independently verified. Speed is clocked. Ease is measured in clicks-to-first-output.
Persona weighting applied
Raw pillar scores are the same for all personas. Persona weighting is applied at the score calculation layer: the same 8 numbers produce three distinct composite scores. No pillar is re-tested per persona.
Re-test triggers
A tool is re-tested when pricing changes significantly, a major feature update ships, or its NGS score deviates by more than 0.5 points from community-reported performance. The lastUpdated field in every tool record shows the date of the most recent benchmark run.
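The three re-test triggers can be expressed as a small check. Only the 0.5-point deviation limit comes from the text; the `ToolStatus` record, its field names, the 20% cutoff for a "significant" price change, and the idea that community-reported performance is available as a comparable number are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List

MAX_SCORE_DEVIATION = 0.5            # stated in the re-test rule
SIGNIFICANT_PRICE_CHANGE_PCT = 20.0  # hypothetical cutoff; the page gives no figure


@dataclass
class ToolStatus:
    """Hypothetical snapshot of a tool since its last benchmark run."""
    ngs_score: float
    community_score: float
    price_change_pct: float
    major_feature_update: bool


def retest_reasons(status: ToolStatus) -> List[str]:
    """Return every trigger that currently applies; an empty list means no re-test."""
    reasons = []
    if abs(status.price_change_pct) >= SIGNIFICANT_PRICE_CHANGE_PCT:
        reasons.append("pricing_change")
    if status.major_feature_update:
        reasons.append("major_feature_update")
    if abs(status.ngs_score - status.community_score) > MAX_SCORE_DEVIATION:
        reasons.append("score_deviation")
    return reasons


if __name__ == "__main__":
    print(retest_reasons(ToolStatus(7.4, 6.8, 0.0, False)))
```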
See the NGS scores for every tool
Every tool in the leaderboard is scored using this exact framework. Find the right fit for your workflow.