
Large Language Models are dead, long live Small Language Models

The day when Artificial General Intelligence suddenly flickers on as the manager turns on her computer, ready to serve, allowing her to dispense with her minions and be left alone in silence with the coffee machine and the glorious view of an empty office in the low morning light, is probably far off. There are many reasons why that is not going to happen, but it is becoming increasingly clear that even generalist, know-it-all approaches are not optimal and are not paying off. The Large Language Models underpinning the likes of Claude, Copilot, ChatGPT and Gemini are built on the premise that they should be able to do it all.

They are expensive to develop and need significant hardware investment, and there is already a subtle move away from this approach, one that will probably accelerate.

The situation is similar to the relational database market in the 90s and 00s, where Oracle championed the all-in-one database. There is little doubt that the Oracle database was the most advanced database ever created, much as ChatGPT might be the most advanced LLM ever created. But today it is mostly legacy technology that runs on Oracle databases. What happened to the database market was the so-called NoSQL movement, in which databases optimised for one particular use case appeared. Most were entirely or primarily based on open-source projects and therefore cheap and easy to run.

There might be several reasons why that change happened, but the key one is that databases often have to do something very specific, and you might as well use an optimised, cheap (or free, open-source) alternative if you already know it is only ever going to do one thing. If you are just building a cache for your web application, or storing blobs of video, you don't need the world's most advanced database; you need one that is good at caching or at storing unstructured files.
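To make the point concrete, here is a minimal sketch of the caching case using a purpose-built key-value store rather than a general relational database. It assumes the redis-py client and a local Redis instance; the host, key names and expiry time are illustrative, not taken from the article.

    import redis

    # Connect to a local Redis instance (host and port are illustrative assumptions)
    cache = redis.Redis(host="localhost", port=6379)

    # Cache a rendered page for five minutes; a cache does this one job well
    cache.setex("page:/home", 300, "<html>rendered page</html>")

    # Later requests read the cached copy instead of querying a heavyweight database
    cached_page = cache.get("page:/home")

The specialised tool is simpler to run and tune precisely because it only has to do one thing.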

Something similar is happening in the language model space, as recently noted by The Economist: smaller, more nimble variants tailor-made for a specific purpose are becoming available. Small Language Models (SLMs) are compact, efficient artificial intelligence models designed for natural language processing tasks. They use the same architecture as Large Language Models, but where LLMs have hundreds of billions or even trillions of parameters, SLMs typically have from a few million to a few billion. As argued in a recent article on arXiv, they are already sufficiently powerful, and more suitable and more economical than their larger counterparts. The smaller size makes them cheaper to train and run, faster to respond, and easier to fine-tune, with a lower risk of hallucinations. What's not to like?
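To illustrate the scale difference, a task-specific small model can run on a laptop CPU rather than a GPU cluster. The sketch below assumes the Hugging Face transformers library and uses a publicly available sentiment model of roughly 66 million parameters as a stand-in for a purpose-built SLM; the example input is invented.

    from transformers import pipeline

    # A distilled, task-specific model with roughly 66 million parameters,
    # small enough to run on ordinary hardware without a GPU
    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )

    print(classifier("The new claims process is much faster than before."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.99}]

A model like this cannot write poetry or plan a holiday, but for the one task it was trained on it is fast, cheap and predictable.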

This converges with the realisation that successful implementations focus on very specific tasks rather than general-purpose ones. The MIT report, The State of AI in Business 2025, details how generic large language models do not provide any measurable value. Another observation from a recent conference was that almost all presentations mentioned that more focused agentic solutions are more successful. The most effective use cases thus do not need the general-purpose intelligence of the LLM. Using highly task-specific SLMs is both cheaper to implement and run and has a higher chance of success.

In PA's recent report, AI in the Nordics, we provided a framework for understanding different ways to use AI, captured in archetypes. The first and most basic is the Task Assistant, which is the generic use of LLMs for a wide range of ad hoc tasks. The second is the Knowledge Agent, which is focused on a specific area of application. Building the Knowledge Agent requires a bit more planning than just buying a round of ChatGPT licenses for all employees, but those that do see significantly greater returns.

As was the situation with databases a decade or two ago, organisations need to start thinking hard about what precisely they want to build and then find the best model for that. It is likely to be an SLM. In the future, managers will thus still have to be bothered with one-to-ones and queues at the coffee machine in the noisy office every morning, but the company and its employees may be much more productive.

Photo by Pauline Bernfeld on Unsplash

