AI is stepping out of the realm of the magical and mystical and into practical and powerful business use cases.
One such use case is using conversational AI to engage with customers. However, to get the most out of what it offers, an AI platform must be structured in a logical and efficient manner.
Qamir Hussain, Webio’s Head of AI & Machine Learning, explains:
“Webio’s AI architecture is based on several custom language models which vary in size. Each one is designed to perform a specific function, and they work together in harmony, along with Webio’s natural language understanding engine and API. Currently, Webio uses three small language models (SLMs), two medium sized language models and one large language model (LLM) to provide a full service for its conversational AI platform. You need to pick the right sized language model for the job at hand.”
While large language models are impressive at understanding human language and generating responses, they have weaknesses that prohibit their uncontrolled use, such as security concerns and hallucinations. That said, you do need a larger model at times; for example, you would use an LLM for generative AI to summarise a customer conversation with a human or AI agent.
On the other hand, SLMs are easier to control and customise than LLMs. So, although LLMs are an attractive and powerful technology, sometimes a small language model is more appropriate to get the job done.
Employing multiple language models, in different sizes, is much like choosing a vehicle. If you live in a dense city, it makes sense to buy a nippy run-around which slips into tight parking spaces and uses little fuel, whereas if you live out on a farm, a tractor fits the bill. It would be foolish to drive a large farm vehicle when taking the kids to school in a congested city. It’s much the same if you use a large language model when a small one would work better, and vice versa.
Mark Oppermann, Chief Revenue & Marketing Officer at Webio, adds:
“This Webio model of using a collection of language models underpins the three fundamental components that make AI-driven customer engagement so effective – entity extraction, intent recognition and propensity guidance. With these three pistons working together, the digital conversation engine performs effortlessly.”
Entities are the key bits of information that a customer provides, e.g. date of birth for ID&V, while intent recognition reveals what a customer is really saying or would like to do. Going a step further, propensities indicate the most likely outcome of a conversation and are used to guide the customer down the best route to resolve their query.
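As a rough, hypothetical sketch of how these three components could fit together (the regex pattern, intent keywords, and propensity scores below are illustrative placeholders, not Webio's actual models), a pipeline might look like this:

```python
import re
from dataclasses import dataclass, field

@dataclass
class Analysis:
    entities: dict = field(default_factory=dict)
    intent: str = "unknown"
    propensity: dict = field(default_factory=dict)

# Entity extraction: pull out key facts, e.g. a date of birth for ID&V.
DOB_PATTERN = re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b")

def extract_entities(message: str) -> dict:
    entities = {}
    match = DOB_PATTERN.search(message)
    if match:
        entities["date_of_birth"] = match.group(1)
    return entities

# Intent recognition: keyword rules stand in for a trained classifier.
INTENT_KEYWORDS = {
    "arrange_payment": ["pay", "payment", "instalment"],
    "dispute_balance": ["wrong", "dispute", "not mine"],
}

def recognise_intent(message: str) -> str:
    text = message.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "unknown"

def score_propensity(intent: str) -> dict:
    # Propensity guidance: a placeholder lookup of likely conversation outcomes.
    outcomes = {
        "arrange_payment": {"self_serve": 0.8, "agent_handoff": 0.2},
        "dispute_balance": {"self_serve": 0.3, "agent_handoff": 0.7},
    }
    return outcomes.get(intent, {"agent_handoff": 1.0})

def analyse(message: str) -> Analysis:
    intent = recognise_intent(message)
    return Analysis(extract_entities(message), intent, score_propensity(intent))
```

In practice each stage would be backed by its own language model; the point of the sketch is that extraction, intent recognition and propensity guidance are separate, composable steps.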
Data Security
In terms of security, AI technology should be hosted within a ‘demilitarised zone’ (DMZ) on cloud servers, with no external access permitted – all interactions should be restricted solely to the company’s application. Using SLMs makes securing this data a simpler and more reliable process.
Compliance, Regulations and Ethics
These are top concerns of businesses, especially those operating in highly regulated environments like financial services. It is easier to stay within these frameworks when working with SLMs. Furthermore, enterprises using AI require explainable AI, which is more attainable with smaller language models, as they are more transparent by nature.
As regulations in the credit and collections industry and AI become more stringent, maintaining compliance will be increasingly challenging. Large enterprises may have the resources to ensure compliance, but smaller companies could be at a disadvantage without the deep pockets of larger corporations.
Cost-effective and Gentler on the Environment
SLMs use fewer resources than LLMs, which are extremely resource-intensive and expensive in terms of processing power, electricity, and the water used for cooling their enormous data centres.
Consistency and Quality
A high standard is possible when using SLMs as they are easier to manage and build. Furthermore, this added control makes it possible to create a more uniform performance across all interactions.
Quicker and Cheaper to Train
SLMs require less training data and time to build. Often, a company will build their focused language models by fine-tuning open-source models, and train them on real, anonymised customer conversations and data.
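For illustration only, anonymising training conversations might start with simple pattern-based redaction like the sketch below. The patterns are assumptions, and a real pipeline would add named-entity recognition and human review on top:

```python
import re

# Hypothetical redaction rules: each pattern is replaced by a placeholder token.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\+?\d[\d\s-]{7,}\d\b"),
    "ACCOUNT": re.compile(r"\bACC-\d{6,}\b"),
}

def anonymise(text: str) -> str:
    """Replace personal identifiers with placeholder tokens before training."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text
```

The redacted text keeps the conversational structure a model needs to learn from, while stripping the identifiers that would make it personal data.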
More Accessible to Smaller Companies
Building a large language model is undoubtedly expensive and requires expertise, which can put it out of reach of smaller organisations. However, smaller companies can buy an out-of-the-box, fully developed AI tool, complete with pretrained entity and intent recognition.
Easier to Maintain and Improve
Smaller models can be individually adjusted without affecting the other models. They can be upgraded or fixed at the drop of a hat. This allows for a ‘future-proof’ AI solution and facilitates continuous improvement as these SLMs are designed to adapt and evolve with technological advancements.
Need Less Processing Power and Storage vs LLMs
SLMs are compact enough to run on a device as small as a personal computer. For example, small language models are approximately 100MB and medium models about 500MB, whereas large models reach about 30-40GB.
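These figures are roughly what simple arithmetic predicts: a model's memory footprint is approximately its parameter count multiplied by the bytes stored per weight. A back-of-the-envelope helper (the default of 2 bytes per parameter assumes fp16 weights):

```python
def model_size_mb(parameters: int, bytes_per_param: float = 2.0) -> float:
    """Rough memory footprint in MB: parameter count times bytes per weight
    (e.g. 2 bytes for fp16, 1 byte for int8 quantisation)."""
    return parameters * bytes_per_param / (1024 ** 2)
```

At fp16, a 50-million-parameter model lands near the ~100MB small-model figure above, while a 20-billion-parameter model comes out at roughly 37GB, in line with the large-model range.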
Multiple Languages
Since SLMs are easy to work with, adapting them to other languages is a fairly straightforward process, making the model accessible across countries and languages.
To wrap up, this table gives a summary of when and why you would choose a small or large language model depending on your needs.
Feature | Small Language Models | Large Language Models
Model Size | Few million to a few hundred million parameters | Billions to trillions of parameters |
Training Time | Faster to train | Requires extensive computational resources and time |
Cost | Lower cost for training and deployment | High cost due to large computational and storage requirements |
Performance | Adequate for simpler tasks, limited understanding of context | Superior performance, better understanding of context and nuances |
Energy Consumption | Low | High |
Inference Speed | Faster due to smaller size | Slower due to larger size |
Memory Requirements | Lower (100-500MB) | Higher (30-200GB) |
Flexibility | Less flexible, may require fine-tuning for specific tasks | More flexible, can handle a wide range of tasks without fine-tuning |
Accessibility | Easier to deploy on local devices and edge computing | Often requires powerful cloud infrastructure |
Use Cases | Simple text generation, chatbots, basic translations, summarisation | Complex text and response generation, advanced chatbots, nuanced translations, detailed summarisation |
Strengths | Cost-effective, quick deployment, low resource consumption | Deep contextual understanding, versatility, powerful text generation |
Weaknesses | Limited understanding and less accurate in complex tasks | High cost, resource-intensive, slower inference speed |
Best Used For | Applications where cost, speed and control are important | Scenarios requiring nuanced comprehension, detailed content creation, and advanced AI applications |
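The table above can be condensed into a simple routing rule: match the model tier to the task. The task labels and tiers below are illustrative, not a real product API:

```python
# Hypothetical task-to-model routing reflecting the trade-offs in the table.
SLM_TASKS = {"entity_extraction", "intent_recognition", "basic_translation"}
LLM_TASKS = {"conversation_summary", "nuanced_response_generation"}

def pick_model(task: str) -> str:
    """Route a task to the smallest model tier that can handle it well."""
    if task in SLM_TASKS:
        return "slm"
    if task in LLM_TASKS:
        return "llm"
    return "medium"  # fall back to a mid-sized model for everything else
```

The principle is the one the article argues throughout: default to the smallest model that does the job, and reach for the LLM only where its deeper contextual understanding earns its cost.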
Large language models might be the talk of the town at the moment, and they certainly have their place, but best practice is to employ a mix of language model sizes that covers the full end-to-end requirements in the most effective way. Each model is designed to perform a specific role, and working together, a collection of models covers all the bases.
If you need to improve your customer engagement, talk to us and we'll show you how AI and automation via digital messaging channels work.
You will love the Webio experience.
We promise.