Determining the right LLM for your organization


Today’s business leaders recognize that some application of generative AI has great potential to help their business perform better, although they may still be exploring exactly how to apply it and what the ROI may ultimately be. Indeed, as companies turn their gen-AI prototypes into scaled solutions, they must take into account such factors as the technology’s cost, accuracy, and latency to determine its long-term value.
The growing landscape of large language models (LLMs), combined with the fear of making the wrong decision, leaves some businesses in a quandary. LLMs come in all shapes and sizes and can serve different purposes, and the truth is, no single LLM will solve every problem. So, how can a business determine which one is the right one?
Here, we discuss how to make the best selection so your business can use generative AI with confidence.
Chief Design and Strategy Officer at New Relic.
Choose your level of LLM sophistication — the sooner, the better
Some businesses are conservative about adopting an LLM, launching pilot projects and then waiting to see how the next generation might change their application of generative AI. Their reluctance to commit may be warranted: diving in too early without testing properly could mean big losses. But generative AI is a rapidly evolving technology, with new foundation models introduced seemingly weekly, so being too conservative and continuing to wait for the technology to settle may mean you never actually move forward.
With that said, there are three levels of sophistication companies may consider when it comes to generative AI. The first is a simple wrapper application around GPT, designed to interact with OpenAI’s language models and provide an interface to guide text completions and conversation-based interactions. The next level of sophistication is using an LLM with retrieval-augmented generation (RAG). RAG allows businesses to enhance their LLM output with proprietary and/or private data. GPT-4, for example, is a powerful LLM that can understand nuanced language and even perform some reasoning.
However, it hasn’t been trained on the data of any specific company, which can lead to inaccuracies, inconsistencies, or irrelevancies (hallucinations). Companies can mitigate hallucinations by using implementations like RAG, which lets them merge the capabilities of a base-model LLM with data unique to their business. (It should be noted that large-context models like Claude 3 may eventually render RAG obsolete. And, while many such models are still in their infancy, we all know how fast technology moves, so obsolescence may come sooner rather than later.)
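The retrieve-then-augment pattern behind RAG can be sketched in a few lines. This is a toy illustration, not any vendor’s implementation: the word-overlap "relevance" score stands in for a real embedding model and vector database, and the assembled prompt would normally be sent to an LLM API rather than returned as a string.

```python
# Minimal sketch of RAG's retrieve-then-augment pattern.
# NOTE: score() is a toy relevance measure for illustration only;
# production systems use learned embeddings and a vector store.

def score(query: str, doc: str) -> float:
    """Toy relevance: fraction of query words that appear in the doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Augment the user's question with retrieved proprietary context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical proprietary knowledge the base model was never trained on.
company_docs = [
    "Refunds are processed within 5 business days.",
    "Premium support is available 24/7 for enterprise accounts.",
    "Our warehouse ships orders Monday through Friday.",
]
prompt = build_prompt("How long do refunds take?", company_docs)
```

The key design point is that the base model never changes: the company’s private data enters only through the prompt, which is what makes RAG cheaper and faster to deploy than fine-tuning.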
In the third level of generative AI sophistication, a company runs its own models. For example, a company may take an open-source model, fine-tune it with proprietary data, and run the model on its own IT infrastructure in place of any third-party offerings like OpenAI. It should be noted that this third-level LLM requires the oversight of engineers trained in machine learning.
Apply the right LLM to the right use case
Given the options above and the differences in cost and capability, companies must determine exactly what they plan to accomplish with their LLM. For example, an ecommerce company may train human support agents to intervene when a customer is at risk of abandoning their cart and help them decide to complete the purchase. A chat interface can produce the same result at a tenth of the cost. In this case, it may be worth it for the ecommerce company to invest in running its own LLM, with engineers to control it.
But bigger isn’t always cost-effective, or even needed. If you’re a banking application, you can’t afford to make transaction errors, so you’ll want tighter control. Developing your own model, or fine-tuning an open-source model, applying heavily engineered input and output filters, and hosting it yourself, gives you all the control you need. And for companies that simply want to optimize the quality of their customers’ experience, a well-performing LLM from a third-party vendor may work well.
A note about observability
Regardless of the chosen LLM, understanding how the model performs is key. As tech stacks become increasingly complex, homing in on performance issues that pop up in an LLM can prove challenging. Additionally, because LLM interactions behave so differently from traditional services, there are entirely new metrics to track, such as time-to-first-token, hallucination rate, bias, and drift. That’s where observability comes into play, providing end-to-end visibility across the stack to ensure uptime, reliability, and operational efficiency. In short, without that visibility, a company will struggle to measure the technology’s ROI at all.
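Of the new metrics above, time-to-first-token is the simplest to instrument by hand. The sketch below shows the idea under stated assumptions: `fake_stream` is a hypothetical stand-in for a real streaming client (such as an SSE iterator from a vendor SDK); the timing logic is the part that carries over.

```python
# Sketch of measuring time-to-first-token on a streaming LLM response.
# fake_stream is a hypothetical stand-in for a real streaming client.

import time

def fake_stream(tokens):
    """Simulate a streaming LLM response with per-token latency."""
    for t in tokens:
        time.sleep(0.01)  # simulate network/inference delay
        yield t

def measure_time_to_first_token(stream):
    """Return (first-token latency in seconds, full response text)."""
    start = time.monotonic()
    first_latency = None
    parts = []
    for token in stream:
        if first_latency is None:
            # Record how long the user waited before any output appeared.
            first_latency = time.monotonic() - start
        parts.append(token)
    return first_latency, "".join(parts)

latency, text = measure_time_to_first_token(fake_stream(["Hello", ", ", "world"]))
```

In production this measurement would be emitted to an observability backend alongside hallucination, bias, and drift signals, rather than returned to the caller.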
The generative AI journey is exciting and fast-paced — if not a bit daunting. Understanding your business’s needs and matching those to the right LLM will not only ensure short-term benefits but also lay the foundation for ideal future business outcomes.
This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro