Infinite Quorum
A swarm-intelligence platform that runs hundreds of mock jurors against a brief in five minutes. The name is the product: an infinite quorum.
The Idea
Empaneling twelve real jurors for a focus group costs tens of thousands of dollars and takes weeks. The signal you get is excellent, but you can only afford to pull it once or twice in a case. So most arguments and briefs go out the door having been read by their authors and maybe one or two colleagues. The rest of the testing happens at trial, when the test is real and the stakes are everything.
Instead of empaneling twelve jurors or running one focus group, Infinite Quorum runs hundreds or thousands of deliberative panels simultaneously.
Infinite Quorum runs hundreds of LLM-powered agents — each one with a distinct demographic profile, personality, set of cognitive biases, life experiences, and reading style — against a brief or argument. Each agent answers a structured battery of survey questions: persuasiveness, clarity, credibility, emotional impact, logical coherence, the strongest argument, the weakest argument, what would change their mind. Subsets of the population get assembled into focus-group panels and run multi-turn moderated deliberations against the same materials. The output is a structured report: how different audience segments reacted, where the argument lost people, what objections came up most often, and how multi-turn deliberation shifted opinions.
The natural abbreviation is IQ. "We ran it through IQ." "What did IQ say about the closing?"
The Pipeline
Population Generation
A configurable population of agents is built from real demographic and psychographic distributions. Each agent gets a full persona: age, region, education, occupation, political lean, life events, reading style, cognitive biases. Seeded RNG throughout — same seed produces the same population, which matters for reproducible A/B tests.
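The seeded-population idea can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual generator: the `Persona` fields, the `generate_population` name, and the category weights are all assumptions made up for the example, and the weights are not real demographic data.

```python
import random
from dataclasses import dataclass

# Hypothetical category weights, for illustration only (not real census data).
REGIONS = {"Northeast": 0.17, "Midwest": 0.21, "South": 0.38, "West": 0.24}
EDUCATION = {"high school": 0.40, "some college": 0.25,
             "bachelor's": 0.22, "graduate": 0.13}
READING_STYLES = ["skimmer", "close reader", "skeptic"]


@dataclass(frozen=True)
class Persona:
    agent_id: int
    age: int
    region: str
    education: str
    reading_style: str


def weighted_pick(rng, weights):
    """Draw one label from a {label: weight} mapping using the given RNG."""
    labels = list(weights)
    return rng.choices(labels, weights=[weights[k] for k in labels], k=1)[0]


def generate_population(size, seed):
    """Build `size` personas; the same seed always yields the same list."""
    rng = random.Random(seed)  # private generator, global random state untouched
    return [
        Persona(
            agent_id=i,
            age=rng.randint(18, 85),
            region=weighted_pick(rng, REGIONS),
            education=weighted_pick(rng, EDUCATION),
            reading_style=rng.choice(READING_STYLES),
        )
        for i in range(size)
    ]
```

Using a dedicated `random.Random(seed)` instance, rather than the module-level functions, keeps the population reproducible even if other code draws random numbers in between.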
Multi-Model Diversity
Agents are assigned round-robin across nine different LLMs from four providers, spanning free-tier, budget, mid-tier, and premium models. Different models produce different response characteristics, which crudely mimics real human variation in reading and reasoning. A single OpenRouter key gets all of them.
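Round-robin assignment is simple enough to show directly. The sketch below is an assumption about how it could be done, not the project's code; the model identifiers are placeholders, not real OpenRouter model names.

```python
from itertools import cycle

# Placeholder identifiers (hypothetical); substitute real model names.
MODELS = [
    "provider-a/free-1",
    "provider-a/budget-1",
    "provider-b/mid-1",
    "provider-c/premium-1",
]


def assign_models(agent_ids, models):
    """Map each agent to a model, cycling through the model list in order."""
    if not models:
        raise ValueError("at least one model is required")
    return {agent_id: model for agent_id, model in zip(agent_ids, cycle(models))}
```

With this scheme the per-model agent counts never differ by more than one, so no single model dominates the population.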
Survey Pass
Every agent reads the brief — paragraph-numbered with [P1], [P2] tags so they can reference specific sections — and returns a structured JSON survey: 9 numeric ratings, 2 categorical, 8 free-text fields, and 4 section-reference fields. Robust JSON parser handles markdown fences and embedded JSON. Async, parallel, with cost tracking.
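Two pieces of this step lend themselves to a short sketch: tagging paragraphs, and recovering a JSON object from a reply that wraps it in markdown fences or surrounding prose. The function names `tag_paragraphs` and `extract_json` are hypothetical, and this is one plausible approach rather than the parser IQ actually uses.

```python
import json
import re


def tag_paragraphs(brief):
    """Prefix each blank-line-separated paragraph with [P1], [P2], ..."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", brief) if p.strip()]
    return "\n\n".join(f"[P{i}] {p}" for i, p in enumerate(paragraphs, start=1))


def extract_json(reply):
    """Return the first JSON object that decodes from the reply, else None.

    Scanning from every "{" skips markdown fences and leading prose;
    raw_decode ignores whatever text follows the decoded object.
    """
    decoder = json.JSONDecoder()
    for match in re.finditer(r"\{", reply):
        try:
            obj, _ = decoder.raw_decode(reply[match.start():])
        except json.JSONDecodeError:
            continue  # this brace did not start a valid object; try the next
        if isinstance(obj, dict):
            return obj
    return None
```

One limitation of this sketch: if the outer object is truncated but a nested object is complete, the nested object is returned, so a real pipeline would also want to validate the expected survey fields.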
Focus Group Deliberation
Subsets of the population get assembled into multi-turn moderated focus groups. Engagement tracking, post-discussion surveys, and a record of which agents persuaded which other agents. The deliberation often shifts the population's median view in ways the survey alone doesn't reveal.
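As a rough illustration of how a deliberation shift and a persuasion record might be summarized, the sketch below compares pre- and post-discussion ratings and tallies persuaders. The data shapes (a dict of agent id to rating, and a list of persuader/persuaded pairs) and both function names are assumptions for the example, not the project's schema.

```python
from collections import Counter
from statistics import median


def median_shift(pre, post):
    """Post-discussion median minus pre-discussion median.

    Only agents present in both mappings are compared, so an agent whose
    post-discussion survey came back empty is left out of both medians.
    """
    shared = sorted(set(pre) & set(post))
    if not shared:
        raise ValueError("no agents with both pre and post ratings")
    return median(post[a] for a in shared) - median(pre[a] for a in shared)


def persuasion_tally(events):
    """Count how often each agent appears as the persuader.

    `events` is an iterable of (persuader, persuaded) pairs.
    """
    return Counter(persuader for persuader, _ in events)
```

Restricting the comparison to agents with both ratings keeps a dropped survey from masquerading as a change of opinion.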
Report Compilation
Aggregation, terminal output, and an interactive single-file HTML report with embedded JS — no framework dependencies. Section-attention heatmaps showing which paragraphs were flagged most/least compelling, demographic breakdowns, A/B comparisons when two briefs are run through the same population.
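The data behind a section-attention heatmap can be as simple as counting how often each paragraph tag is named. This is a minimal sketch under assumed field names (`most_compelling` and `least_compelling`, each holding a tag such as "P2"); the real survey schema may differ.

```python
from collections import Counter


def section_attention(surveys, n_paragraphs):
    """Per-paragraph counts of 'most' and 'least' compelling flags.

    `surveys` is a list of dicts; a missing field is simply not counted
    toward any paragraph.
    """
    most = Counter(s.get("most_compelling") for s in surveys)
    least = Counter(s.get("least_compelling") for s in surveys)
    rows = []
    for i in range(1, n_paragraphs + 1):
        tag = f"P{i}"
        rows.append({
            "section": tag,
            "most": most[tag],
            "least": least[tag],
            "net": most[tag] - least[tag],  # positive means net compelling
        })
    return rows
```

The resulting rows can be serialized to JSON and embedded in the HTML report, where the `net` value would drive the cell color.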
What It's Useful For
Brief Optimization
Run the same brief against a representative population. Look at which sections lost which agents. Rewrite, run again. Cycle in minutes instead of weeks.
A/B Argument Testing
Run two versions of the same brief — or two opposing briefs — through the same seeded population. Read the difference, not the absolute numbers.
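Because both versions go through the same seeded population, each agent can serve as its own control. The sketch below shows one way to compute that paired difference; the `paired_difference` name and the agent-id-to-score dict shape are assumptions for illustration.

```python
from statistics import mean


def paired_difference(scores_a, scores_b):
    """Per-agent score difference (B minus A) over agents who rated both."""
    shared = sorted(set(scores_a) & set(scores_b))
    if not shared:
        raise ValueError("no agents rated both versions")
    diffs = [scores_b[agent] - scores_a[agent] for agent in shared]
    return {
        "n": len(diffs),
        "mean_diff": mean(diffs),
        "favor_b": sum(d > 0 for d in diffs),
        "favor_a": sum(d < 0 for d in diffs),
    }
```

Comparing the same agent across versions removes persona-to-persona variation from the result, which is why the difference is more trustworthy than either absolute score.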
Mock Jury
Assemble representative virtual jury panels to deliberate on a case presentation. Verdict predictions and persuasiveness scores by demographic.
Opening Statement Variants
Test a formal/analytical opening against a narrative/moral one through the same population. The report shows which one moves which segments, and where the deliberation pushes them.
What's Honest About It
This is a proof of concept. The pipeline works end to end — population generation, evaluation, focus-group deliberation, reporting — but I want to be honest about what it is and isn't. Agent panels are not a replacement for live focus groups. The agents read fast and never get tired, but they also don't get bored, distracted, or hungry, and their priors come from training data rather than lived experience. What IQ is good at is rapid, cheap iteration — testing the directional effect of changes, not predicting verdicts.
The real comparison is not IQ versus a real focus group. It's IQ versus the zero focus groups most matters actually get. Tested rough is better than untested polished.
The current honest list of caveats: persona quality is still being tuned and pressure-tested with a dedicated test harness; some cheap models truncate JSON at lower token limits than they advertise; free-tier models occasionally return empty post-discussion surveys; one specific agent named Kevin Rodriguez seems cursed and fails in every focus-group run for reasons I have not yet diagnosed.
Where It Stands
POC v0.2. Deployed prototype, persona-quality test harness in progress, prompt-hardening underway. Sibling project to BenchLab — IQ tests advocacy against juries and general audiences; BenchLab tests advocacy against judges.