
A Freeman School of Business professor is part of a research team behind a new large language model (LLM) that delivers comparable performance to leading proprietary AI systems while offering something they don’t: complete transparency.
Yumei He, assistant professor of management science, collaborated with researchers from institutions including Northeastern University, Harvard University, Cornell University and the University of Washington as well as three companies to create Moxin 7B, a new language model that’s fully open source. Moxin 7B’s entire design — not only model architecture and weights, but also pre-training code, training data, and all intermediate and final checkpoints — is freely available to the public.
“Many commercial AI models are like black boxes, incredibly powerful but impossible to examine,” explains He. “By making every aspect of Moxin 7B accessible, we’re inviting the entire scientific community to understand how it works, verify its safety, and build upon our research.”

Assistant Professor of Management Science Yumei He is part of the team of researchers responsible for Moxin 7B, a new AI language model that is completely open source.
This transparency addresses growing concerns about AI development. While companies often claim their models are “open,” many withhold critical components such as training code and data, creating barriers for researchers and businesses that want to understand or improve these systems and hindering their transparent, responsible use.
Moxin 7B achieves the highest classification level of “open science” under the Model Openness Framework (MOF), a system that rates AI models based on their completeness and openness, yet still performs impressively. In zero-shot evaluations, the base model achieved an average score of 75.44 across multiple benchmarks, outperforming other 7-billion-parameter models like Mistral-7B and Meta’s Llama 2-7B.
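For technically inclined readers, zero-shot benchmark averages like the one above are commonly computed with EleutherAI’s lm-evaluation-harness. The sketch below is a minimal illustration only: the article does not say which tool or tasks the team used, and the model identifier and task list here are assumptions.

```python
# Hedged sketch of a zero-shot evaluation using lm-evaluation-harness
# (pip install lm-eval). The repository ID and tasks are illustrative
# assumptions, not the research team's exact setup.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # evaluate a Hugging Face transformers model
    model_args="pretrained=moxin-org/moxin-llm-7b",  # hypothetical repo ID
    tasks=["arc_challenge", "hellaswag", "winogrande"],  # example benchmarks
    num_fewshot=0,  # zero-shot: no in-context examples
)

# Each task reports its own metrics; an "average score" is the mean
# of the per-task accuracies.
for task, metrics in results["results"].items():
    print(task, metrics)
```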
The research team also developed several specialized versions of Moxin 7B, including one for complex reasoning tasks, that match or exceed the performance of comparable proprietary models.
“This technology could democratize access to advanced AI,” He notes. “Smaller organizations and academic institutions can now leverage powerful specialized language models without the costs of commercial alternatives. These smaller, application-focused models will enable more organizations to deploy agentic AI.”
With the release of Moxin 7B, the researchers have published all components needed to reproduce and improve the model, including pre-training code, training data, and development checkpoints, a level of transparency rare in today’s AI landscape.
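Because the weights are fully released, a model like this can typically be run with standard open-source tooling. The sketch below uses the Hugging Face transformers library; the repository identifier is a placeholder assumption, since the article does not name where the model is hosted.

```python
# Minimal sketch of loading and prompting an openly released 7B model
# with Hugging Face transformers. The repository ID below is hypothetical;
# check the project's official release page for the actual model ID.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "moxin-org/moxin-llm-7b"  # hypothetical ID; substitute the real one

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain what it means for a language model to be fully open source."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a short continuation; the generation settings are illustrative.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```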
“We’re excited to see what innovations emerge when we remove barriers to understanding how these systems work,” says He. “I’m looking forward to seeing what users create with it.”
