Tech companies are aiming for smaller and leaner AI models

NEW YORK

AI firms have long boasted about the enormous size and capabilities of their products, but they are increasingly looking at leaner, smaller models that they say will save on energy and cost.

Programs like ChatGPT are underpinned by algorithms known as "large language models", and the chatbot's creator bragged last year that its GPT-4 model had nearly two trillion "parameters", the building blocks of the models.

The vast size of GPT-4 allows ChatGPT to handle queries about anything from astrophysics to zoology.

But if a company needs a program with knowledge only of, say, tigers, the algorithm can be much smaller.

"You don't need to know the terms of the Treaty of Versailles to answer a question about a particular element of engineering," said Laurent Felix of Ekimetrics, a firm that advises companies on AI and sustainability.

Google, Microsoft, Meta and OpenAI have all started offering smaller models.

Amazon, too, offers models of all sizes on its cloud platform.

Smaller models are better for simple tasks like summarizing and indexing documents or searching an internal database.

U.S. pharmaceutical company Merck, for example, is developing a model with Boston Consulting Group (BCG) to understand the impact of certain diseases on genes.

"It will be a very small model, between a few hundred million and a few billion parameters," said Nicolas de Bellefonds, head of AI at BCG.

Laurent Daudet, head of French AI startup LightOn, which specialises in smaller models, said they had several advantages over their larger siblings.

They were often faster and able to "respond to more queries and more users simultaneously," he said.

He also pointed out that they were less energy-hungry, with the potential climate impact of AI being one of the major concerns about the technology.

Huge arrays of servers are needed to "train" the AI programs and then to process queries.

These servers, made up of highly advanced chips, require vast amounts of electricity both to fuel their operation and to cool them down.

Daudet explained that the smaller models needed far fewer chips, making them cheaper and more energy efficient.
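
The link between parameter count and hardware can be roughed out with simple arithmetic. The sketch below is only illustrative: it assumes each parameter is stored as a 16-bit number (2 bytes), counts only the memory needed to hold the model's weights, and ignores everything else a running model needs. The 300 million and 3 billion figures are hypothetical picks from the "few hundred million to a few billion" range quoted above, and the two trillion figure is the GPT-4 size cited earlier in this article.

```python
def weight_memory_gb(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Approximate memory needed just to hold a model's weights, in gigabytes.

    Assumption: 2 bytes per parameter (16-bit storage). Real deployments may
    use more or fewer bytes, and need extra memory on top of the weights.
    """
    return num_parameters * bytes_per_parameter / 1e9


# Illustrative sizes only; the first two are hypothetical examples.
EXAMPLE_SIZES = {
    "small model (300 million parameters)": 300e6,
    "small model (3 billion parameters)": 3e9,
    "model with two trillion parameters": 2e12,
}

if __name__ == "__main__":
    for label, params in EXAMPLE_SIZES.items():
        print(f"{label}: about {weight_memory_gb(params):,.1f} GB of weights")
```

Under these assumptions the smaller models fit in the memory of a single chip or even a laptop, while the largest has to be spread across many chips.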

Other proponents point out that smaller models can be installed directly on devices and run without data centers at all.

Laurent Felix pointed out that direct use on a device also meant more "security and confidentiality of data".

The programs could potentially be trained on proprietary data without fear of it being compromised.

The larger programs, though, still have the edge for solving complex problems and accessing wide ranges of data.

De Bellefonds said the future was likely to involve both kinds of models talking to each other.

"There will be a small model that will understand the question and send this information to several models of different sizes depending on the complexity of the question," he said.

"Otherwise, we will have solutions that are either too expensive, too slow, or both."
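
The arrangement de Bellefonds describes can be sketched in a few lines of code. This is a minimal illustration, not any company's actual system: the tier names, the complexity thresholds and the word-counting heuristic that stands in for the small "router" model are all hypothetical, and for simplicity the sketch sends each question to a single model rather than to several.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Model:
    name: str                     # hypothetical label, not a real product
    max_complexity: float         # hardest question this tier should take (0 to 1)
    answer: Callable[[str], str]  # stand-in for a call to a real model


def estimate_complexity(question: str) -> float:
    """Toy stand-in for the small router model: score a question from 0 to 1.

    Assumption: questions with more distinct words are harder. A real router
    would be a trained classifier or a small language model.
    """
    words = question.lower().split()
    if not words:
        return 0.0
    return min(1.0, len(set(words)) / 30)


def route(question: str, tiers: List[Model]) -> Model:
    """Pick the smallest tier whose ceiling covers the estimated complexity."""
    score = estimate_complexity(question)
    for model in sorted(tiers, key=lambda m: m.max_complexity):
        if score <= model.max_complexity:
            return model
    # No tier covers the score: fall back to the most capable one.
    return max(tiers, key=lambda m: m.max_complexity)


# Hypothetical tiers; each "answer" just tags the question with the tier name.
TIERS = [
    Model("small", 0.2, lambda q: f"[small] {q}"),
    Model("medium", 0.6, lambda q: f"[medium] {q}"),
    Model("large", 1.0, lambda q: f"[large] {q}"),
]

if __name__ == "__main__":
    question = "How big are tigers?"
    chosen = route(question, TIERS)
    print(chosen.name, "->", chosen.answer(question))
```

Cheap questions stay on the cheapest tier and only the hard ones reach the large model, which is how such a setup would avoid being "too expensive, too slow, or both".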
