IHBench: What Voice Agents Say After You Interrupt
Jun. 29, 2026

A benchmark for post-interruption recovery in voice agents — whether what the model says after you cut in resumes the workflow correctly. 428 interruption samples, 6 types, 10 enterprise domains, closed- and open-weight models.

ProactBench: Beyond what the user asked for
May 28, 2026

A benchmark for conversational proactivity in LLMs — noticing and acting on what the user implied but never said. 198 curated dialogues, 624 trigger points, 16 models, and a leaderboard where Recovery proves dramatically hard.

Introducing RPBench-Auto
Aug. 5, 2024

Since the release of Higgs Llama 2, we have received much positive feedback from the community. While we are amazed by the community's creativity in using our model, we also recognize the importance of an automated benchmark for effectively evaluating the roleplaying capability of large language models (LLMs).

Announcing Higgs Llama 2
Jul. 15, 2024

At Boson AI, we are working on intelligent agents that can serve as human companions and helpers. Today we are excited to share Higgs Llama 2 70B, a new model that significantly improves upon its predecessor. It narrows the gap to the very best proprietary models on benchmarks relevant for dialog interaction and understanding.

Announcing the Higgs Family of LLMs
Jun. 5, 2024

Since founding Boson AI in 2023, we have dedicated ourselves to empowering enterprises with AI technologies, with a mission to transform how stories are told, knowledge is learned, and insights are gathered. We have helped customers build intelligent agents that interact with their users by playing various roles, including game characters, language tutors, insurance agents, and financial advisors.