Large language models (LLMs) are foundation models that use artificial intelligence (AI), deep learning and massive data sets, including websites, articles and books, to generate text, translate between languages and write many types of content. There are two types of these generative AI models: proprietary large language models and open source large language models.
Proprietary LLMs are owned by a company and can only be used by customers that purchase a license. The license may restrict how the LLM can be used. On the other hand, open source LLMs are free and available for anyone to access, use for any purpose, modify and distribute.
The term “open source” refers to the LLM code and underlying architecture being accessible to the public, meaning developers and researchers are free to use, improve or otherwise modify the model.
Previously, it seemed that the bigger an LLM was, the better, but enterprises are now realizing that the largest models can be prohibitively expensive in terms of research and innovation. In response, an open source model ecosystem began showing promise and challenging the LLM business model.
Enterprises that don’t have in-house machine learning talent can use open source LLMs, which provide transparency and flexibility, within their own infrastructure, whether in the cloud or on premises. That gives them full control over their data and means sensitive information stays within their network. All this reduces the risk of a data leak or unauthorized access.
An open source LLM offers transparency regarding how it works, its architecture, its training data and methodologies, and how it’s used. Being able to inspect the code and see into the algorithms gives an enterprise more trust, supports audits and helps ensure ethical and legal compliance. Additionally, efficiently optimizing an open source LLM can reduce latency and increase performance.
Open source LLMs are generally much less expensive in the long term than proprietary LLMs because no licensing fees are involved. However, the cost of operating an LLM does include cloud or on-premises infrastructure costs, and an open source deployment typically involves a significant initial rollout cost.
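The trade-off between a higher rollout cost and lower recurring cost can be sketched as a simple break-even calculation. The figures below are hypothetical placeholders chosen only to illustrate the arithmetic; they are not drawn from any vendor pricing, and the function names are illustrative.

```python
from typing import Optional


def cumulative_cost(upfront: float, monthly: float, months: int) -> float:
    """Total spend after `months` months: one-time cost plus recurring cost."""
    return upfront + monthly * months


def break_even_month(
    open_upfront: float,
    open_monthly: float,
    prop_upfront: float,
    prop_monthly: float,
    horizon: int = 120,
) -> Optional[int]:
    """Return the first month in which the open source option has cost
    strictly less in total than the proprietary one, or None if that
    never happens within `horizon` months."""
    for month in range(1, horizon + 1):
        open_total = cumulative_cost(open_upfront, open_monthly, month)
        prop_total = cumulative_cost(prop_upfront, prop_monthly, month)
        if open_total < prop_total:
            return month
    return None


# Hypothetical figures, for illustration only.
OPEN_UPFRONT = 60_000  # assumed initial rollout cost (hardware, setup)
OPEN_MONTHLY = 4_000   # assumed monthly infrastructure cost
PROP_UPFRONT = 5_000   # assumed onboarding cost
PROP_MONTHLY = 9_000   # assumed monthly licensing and usage fees

if __name__ == "__main__":
    month = break_even_month(OPEN_UPFRONT, OPEN_MONTHLY, PROP_UPFRONT, PROP_MONTHLY)
    print(f"Open source becomes cheaper overall in month: {month}")
```

With these assumed inputs, the two options cost the same after 11 months and the open source option pulls ahead in month 12. If the recurring cost of running the open source model is not lower than the proprietary fees, the function returns `None`, which reflects the point that savings depend on infrastructure costs.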
Pre-trained, open source LLMs allow fine-tuning. Enterprises can add features to the LLM that benefit their specific use, and the LLMs can also be trained on specific datasets. Making these changes or customizations on a proprietary LLM entails working with a vendor and costs time and money.
While a proprietary LLM means an enterprise must rely on a single provider, an open source LLM lets the enterprise take advantage of community contributions, multiple service providers and possibly internal teams to handle updates, development, maintenance and support. Open source allows enterprises to experiment and use contributions from people with varying perspectives. That can result in solutions that allow enterprises to stay at the cutting edge of technology. It also gives businesses using open source LLMs more control over their technology and decisions regarding how they use it.
Organizations can use open source LLMs to build virtually any project that is useful to their employees or, when the open source license allows, that can be offered as a commercial product. These include:
Open source LLMs allow you to create an app with language generation abilities, such as writing emails, blog posts or creative stories. An LLM like Falcon-40B, offered under an Apache 2.0 license, can respond to a prompt with high-quality text suggestions you can then refine and polish.
Open source LLMs trained on existing code and programming languages can assist developers in building applications and finding errors and security-related faults.
Open source LLMs let you create applications that offer personalized learning experiences, which can be customized and fine-tuned to particular learning styles.
An open source LLM tool that summarizes long articles, news stories, research reports and more can make it easy to extract key data.
Chatbots built on open source LLMs can understand and answer questions, offer suggestions and engage in natural language conversation.
Open source LLMs that train on multilingual datasets can provide accurate and fluent translations in many languages.
LLMs can analyze text to determine emotional or sentiment tone, which is valuable in brand reputation management and analysis of customer feedback.
LLMs can be valuable in identifying and filtering out inappropriate or harmful online content, which is a huge help in maintaining a safer online environment.
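As a rough illustration of the text generation use case above, the following sketch builds a prompt and passes it to an open source model through the Hugging Face `transformers` text-generation pipeline. It assumes `transformers` and a backend such as PyTorch are installed, and that the model identifier `tiiuae/falcon-40b` refers to the Falcon-40B checkpoint on the Hugging Face Hub. The helper names `build_prompt` and `generate_draft` are hypothetical, and the prompt layout is one arbitrary choice among many.

```python
def build_prompt(task: str, details: str) -> str:
    """Combine a task and supporting details into a single prompt string.

    The layout is an illustrative assumption, not a format that any
    particular model requires.
    """
    return f"{task.strip()}\n\nDetails: {details.strip()}\n\nDraft:"


def generate_draft(
    prompt: str,
    model_name: str = "tiiuae/falcon-40b",  # assumed Hub identifier
    max_new_tokens: int = 200,
) -> str:
    """Generate a draft with an open source model.

    The import is done lazily so that the prompt helper above can be
    used without the transformers library installed.
    """
    from transformers import pipeline

    # Loading a 40B-parameter model needs substantial GPU memory; a
    # smaller open source model can be substituted via `model_name`.
    generator = pipeline("text-generation", model=model_name)
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    # By default the returned text includes the prompt followed by
    # the generated continuation.
    return outputs[0]["generated_text"]


if __name__ == "__main__":
    prompt = build_prompt(
        "Write a short, friendly email.",
        "Invite the team to a lunch on Friday at noon.",
    )
    print(prompt)
    # Uncomment on a machine with enough resources to load the model:
    # print(generate_draft(prompt))
```

The generated text is a starting point that a person would then refine and polish, as described above.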
A wide range of organization types use open source LLMs. For example, IBM and NASA developed an open source LLM trained on geospatial data to help scientists and their organizations fight climate change.
Publishers and journalists use open source LLMs internally to analyze, identify and summarize information without sharing proprietary data outside the newsroom.
Some healthcare organizations use open source LLMs for healthcare software, including diagnosis tools, treatment optimizations and tools handling patient information, public health and more.
The open source LLM FinGPT was developed specifically for the financial industry.
The Open LLM Leaderboard aims to track, rank and evaluate open source LLMs and chatbots on different benchmarks.
Although LLM outputs sound fluent and authoritative, they carry risks, including information based on “hallucinations” as well as problems with bias, consent or security. Educating users about these risks is one way to address these data and AI issues.
AI models, particularly LLMs, will be one of the most transformative technologies of the next decade. As new AI regulations impose guidelines around the use of AI, it is critical to not just manage and govern AI models but, equally importantly, to govern the data put into the AI.
To help organizations address these needs and multiply the impact of AI, IBM offers watsonx, our enterprise-ready AI and data platform. Together, watsonx offers organizations the ability to:
The IBM watsonx Assistant conversational search functionality builds on the foundation of its prebuilt integrations, low-code integrations framework, and no-code authoring experience. Developers and business users alike can automate question-answering with conversational search, freeing themselves up to build higher-value transactional flows and integrated digital experiences with their virtual assistants.
Beyond conversational search, watsonx Assistant continues to collaborate with IBM Research and watsonx to develop customized watsonx LLMs that specialize in classification, reasoning, information extraction, summarization and other conversational use cases. Watsonx Assistant has already achieved major advancements in its ability to understand customers with less effort using large language models.