Why small language models are the smarter choice

Instinct might tell you that models trained on more data, capable of answering virtually any question with linguistic flair, would be the best fit for complex agentic journeys.

As recent research in the field puts it: “Large language models (LLMs) are often praised for exhibiting near-human performance on a wide range of tasks and valued for their ability to hold a general conversation.”

However, that same paper, titled “Small Language Models are the Future of Agentic AI,” also presents a compelling argument that LLMs are far from ideal for agentic AI frameworks.

LLMs are designed primarily to extract patterns from large amounts of data and produce novel content. While this is useful when a user wants feedback on a well-documented topic, LLMs are inefficient at tasks that require a more specialised approach.

As our founder Jay van Zyl puts it:

“The more information a model possesses, the longer the thinking process will be, the more verbose the answers are, and the less successful the model is at detecting intent.”

SLMs, on the other hand, are trained on smaller, more focused datasets, which naturally narrows their scope. An SLM therefore holds highly specific knowledge of its domain, and will not return useful answers for anything that falls outside that deliberately narrow window of expertise.
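To make that narrowness concrete, here is a minimal sketch of how an agentic pipeline might exploit it: well-scoped subtasks are routed to a specialised SLM, and only genuinely open-ended requests fall through to a general-purpose LLM. The model names and the call_model() helper are hypothetical placeholders, not a real API; the point is the routing pattern, not any particular serving stack.

```python
# Sketch: route narrow, well-defined subtasks to a specialised SLM
# and reserve the general-purpose LLM for open-ended requests.
# call_model() and the model names below are hypothetical placeholders.

NARROW_TASKS = {"classify_intent", "extract_fields", "summarise_ticket"}

def call_model(model_name: str, prompt: str) -> str:
    # Placeholder for an inference call; swap in your own serving stack.
    return f"[{model_name}] response to: {prompt[:40]}"

def route(task: str, payload: str) -> str:
    """Dispatch a subtask to the cheapest model that can handle it."""
    if task in NARROW_TASKS:
        # Small, fast, domain-specific model for a well-scoped job.
        return call_model("slm-intent-3b", payload)
    # Large, slower generalist only when the request is open-ended.
    return call_model("llm-general-70b", payload)

if __name__ == "__main__":
    print(route("classify_intent", "I want to cancel my subscription"))
    print(route("open_question", "Compare SLMs and LLMs for agents"))
```

The design choice is simple economics: most agentic workloads are repetitive, well-scoped calls, so the specialist handles the bulk of the traffic while the generalist stays on the bench for the rare hard case.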

The potential of agentic AI will only be fully realised if businesses and tech professionals confront the weaknesses of generative AI as honestly as they harness its strengths.