Law School Tests Trial With Jury Made Up of ChatGPT, Grok, and Claude

The University of North Carolina School of Law held an unusual mock trial on Friday.

Looming over the proceedings even more prominently than the judge running the show were three tall digital displays, sticking out with their glossy finishes amid the courtroom’s sea of wood paneling. Each screen represented a different AI chatbot: OpenAI’s ChatGPT, xAI’s Grok, and Anthropic’s Claude. 

The AIs’ role? Serving as the “jurors” who would determine the fate of a man charged with juvenile robbery.

The case, thankfully, was fictional. But all three of the AI chatbots serving on the “jury” have been used by professional lawyers in real court cases — often resulting in embarrassing blunders — meaning that to some extent the technology is already affecting legal outcomes across the country. 

Organizers said that the stunt, called “The Trial of Henry Justus,” is meant to raise questions about AI’s role in the justice system.

“This exercise highlights critical issues of accuracy, efficiency, bias, and legitimacy raised by such use,” Joseph Kennedy, a UNC professor of law who designed the mock trial and served as judge, said in a statement before the event was held.

AI’s inroads into legal settings continue to be a contested subject, as lawyers leveraging AI tools keep getting blasted for committing egregious errors with the tech. Typically, an AI goes wrong by citing misquoted or outright fabricated case law, a symptom of the tech’s fundamental problem of “hallucinating” misinformation that it presents as fact, which the industry is still nowhere close to solving. Judges have handed down harsh punishments, including fines and sanctions, to attorneys who have turned in shoddy, AI-sabotaged work.

Such embarrassing debacles notwithstanding, AI tools are already gaining traction in legal fields, and with considerable enthusiasm. Nearly three-quarters of legal professionals in a Reuters survey this year said they believe AI is a force for good in their profession, and over half said their organizations were already seeing a return on their investment in the tech.

At the AI mock trial, the AI “jurors” were given a real-time transcript of the proceedings and then “deliberated” in front of the audience, according to Eric Muller, a UNC professor of law in jurisprudence and ethics who watched the trial.
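
For the technically curious, a pipeline like that is not hard to picture: feed each chatbot the running transcript and ask it for a verdict. Here’s a minimal sketch, purely illustrative, assuming each provider exposes an OpenAI-compatible chat endpoint (OpenAI and xAI do; Anthropic offers a compatibility mode). The model names, prompt, and file path are hypothetical; the organizers haven’t described their actual setup.

```python
# Hypothetical sketch: poll three chatbot "jurors" for a verdict on a
# shared trial transcript. Models, prompt, and endpoints are assumptions,
# not the event's actual configuration.
import os

from openai import OpenAI  # pip install openai

# Each juror: (label, base_url, model, env var holding the API key).
JURORS = [
    ("ChatGPT", "https://api.openai.com/v1", "gpt-4o", "OPENAI_API_KEY"),
    ("Grok", "https://api.x.ai/v1", "grok-2-latest", "XAI_API_KEY"),
    ("Claude", "https://api.anthropic.com/v1", "claude-3-5-sonnet-latest", "ANTHROPIC_API_KEY"),
]

INSTRUCTIONS = (
    "You are a juror in a criminal trial. Based solely on the transcript, "
    "deliberate briefly, then answer GUILTY or NOT GUILTY with reasoning."
)

def poll_jury(transcript: str) -> dict[str, str]:
    """Send the running transcript to every juror and collect responses."""
    verdicts = {}
    for label, base_url, model, key_env in JURORS:
        client = OpenAI(base_url=base_url, api_key=os.environ[key_env])
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": INSTRUCTIONS},
                {"role": "user", "content": transcript},
            ],
        )
        verdicts[label] = response.choices[0].message.content
    return verdicts

if __name__ == "__main__":
    results = poll_jury(open("trial_transcript.txt").read())
    for juror, verdict in results.items():
        print(f"--- {juror} ---\n{verdict}\n")
```

A real “deliberation” would presumably loop the jurors’ answers back to one another for further rounds, but even this toy version surfaces the obvious limitation: the models only ever see text.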

It did not make a great impression.

“Intense criticism came from members of a post-trial panel including a law professor and a philosopher with legal training,” Muller wrote in a Bluesky post. “I suspect most in the audience came away believing that trial-by-bot is not a good idea,” he added in a follow-up thread.

Attendees pointed out that the bots couldn’t see a witness’s body language or draw from human experience. We might also add AI’s well-documented tendency to drastically misinterpret information because of simple typos, and to exhibit racial bias, sometimes egregiously. Grok, the Elon Musk chatbot serving as one of the “jurors,” literally styled itself “MechaHitler” during a legendary meltdown that saw it spew racist rants and praise Nazis.

Clearly, there’s room for improvement. But according to Muller, we should be extremely wary of the AI industry’s “instinct to repair.”

“The bots were bad, but they are getting better. Every release is a beta for a better build,” Muller wrote. “Bots can’t read body language? We’ll give them a video feed. Bots can’t infuse their judgment with the wisdom of experience? We’ll give them backstories.”

“Technology will recursively repair its way into every human space if we let it,” Muller warned. “Including the jury box.”

More on AI: Woman Wins Court Case by Using ChatGPT as a Lawyer
