That’s not the case with open source. “Model creators don’t often take on legal liability,” says Chandrasekaran. And yes, open source models can be more easily re-trained or customized. But this process is complex and expensive, he says. “And the underlying base models are changing rapidly,” he adds. “If you customize something and the base model changes, you have to re-customize it.”
Finally, there’s the question of long-term sustainability. “It’s one thing to build an open model, release it, and have millions of people use it, versus building a business model around it and monetizing it,” he says. “Monetizing is hard, so who’s going to continue bankrolling these models? It’s one thing to build version one, but it’s another thing to build version five.”
In the end, we’re likely headed for a hybrid future, says Sreekanth Menon, global head of AI at Genpact. “Both open and closed-source models have their place, despite the popular sentiment of open source takeover,” he says. “Enterprises are better off being model agnostic.”
Closed source models, backed by well-funded companies, can push the boundaries of what’s possible in AI. “They can provide highly refined, specialized solutions that benefit from significant investment in research and development,” he says.
Why the open source definition matters to business
Meta’s Llama comes up first in any conversation about open source gen AI. But it might not technically be open source, and the distinction matters. In late October, the Open Source Initiative released the first formal definition of open source AI.
It requires open source AI to share not just the source code and supporting libraries, but also the model parameters, and a full description of the model’s training data, its provenance, scope, characteristics, and labeling procedures. But, more importantly, users must be able to use open source AI for any purpose without having to ask for permission.
By that definition, Meta’s Llama models are open, but not technically open source, since there are limitations. For example, some Llama models can’t be used to train other models. And if a Llama model is used in an app or service with more than 700 million monthly users, a special license from Meta is required.
Meta itself refers to its license as a community license or a bespoke commercial license. It’s important that corporate users understand these nuances, says Mark Collier, COO at OpenInfra Foundation, who helped work on the new definition. “To me, what matters most is that people and companies have the ability and freedom to take this fundamental technology and remix it, use it, and modify it for different purposes without having to ask a gatekeeper to give them permission.” So a company needs to feel assured it can incorporate the AI into a product and not have someone come back and say it can’t be used that way.
Vendors will sometimes announce their AI is open source because it helps with marketing and recruitment, and lets customers feel they’re not locked in. “They have this halo effect, but they’re not really living up to that,” Collier says.
In the rush to adopt AI, companies might take a vendor’s description of their AI as open source at face value.
“The Meta example is a good one,” he says. “A lot of the mainstream tech coverage says this is open source AI, and that’s how Zuckerberg describes it, and it gets repeated that way. But when you get into the details, there are restrictions on the license.”
As companies get more serious about making big commercial bets on AI technology, they need to be careful with the license, he adds. There are also other benefits to using a model with a fully open source license. For example, having access to a model’s weights makes it easier to fine-tune and adapt. Another thing for companies to watch out for is open source licenses that require all derivative works to also be open source.
“If a company customized a model or fine-tuned it on their own proprietary data, they might not want to publish it,” he says. That’s because there are ways to get a model to expose its training data.
Staying on top of these issues is tricky, he admits, especially since the gen AI sector is evolving so quickly. It doesn’t help when model developers invent new licenses all the time.
“If your company is releasing something open source and your lawyers are attempting to create yet another license — please don’t do that,” he says. “There are plenty of good licenses out there. Just pick one that meets your goals.”