Modular reasoning, knowledge, and language (MRKL, pronounced "miracle") systems expand the utility of large language models (LLMs) by giving them access to current information and to proprietary information and systems, and by adding expertise for specific tasks or types of tasks, such as performing mathematical calculations.
MRKL systems consist of two major components:
- An extendable set of expert modules, working alongside an LLM, that are specialized for specific tasks such as performing math, retrieving current information through an API call, or accessing a proprietary system to retrieve, say, customer profile information.
- A router that directs each user query to the module best suited to answer the user's question. Module outputs can be returned directly to the user or used as inputs to other modules.
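To make these two components concrete, the sketch below shows how a module registry and router might fit together. Everything here is illustrative: the module functions, the keyword matching in select_module, and the route function are hypothetical stand-ins, and a real MRKL router would typically use an LLM to make the routing decision, as described later in this section.

```python
# Minimal sketch of a MRKL-style router and module registry.
# All names are illustrative, not part of any specific library.
from typing import Callable, Optional

def math_module(query: str) -> str:
    """Toy calculator standing in for a real math expert module."""
    return str(eval(query, {"__builtins__": {}}))  # never eval untrusted input in production

def weather_module(query: str) -> str:
    """Placeholder for a module that calls an external weather API."""
    return "weather lookup for: " + query

def llm_fallback(query: str) -> str:
    """Placeholder for a direct call to the underlying LLM."""
    return "LLM answer for: " + query

# Adding a capability is just registering another callable.
EXPERT_MODULES: dict[str, Callable[[str], str]] = {
    "math": math_module,
    "weather": weather_module,
}

def select_module(query: str) -> Optional[str]:
    """Naive keyword routing; a real MRKL router would ask an LLM to decide."""
    keywords = {"math": ("calculate", "+", "*"), "weather": ("temperature", "weather")}
    for name, words in keywords.items():
        if any(w in query.lower() for w in words):
            return name
    return None

def route(query: str) -> str:
    name = select_module(query)
    if name is None:
        return llm_fallback(query)  # no expert route matched; fall back to the LLM
    return EXPERT_MODULES[name](query)

print(route("2 + 2"))           # -> "4", via the math module
print(route("Tell me a joke"))  # -> handled by the LLM fallback
```

Note how adding a capability only requires registering another callable, and how an unmatched query falls through to the LLM; these correspond to the extensibility and safe-fallback benefits discussed next.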
MRKL systems provide a number of benefits over standalone fine-tuned, multi-task LLMs:
- Safe fallback. If a user query doesn't match any expert module route, the query can be handled directly by the LLM.
- Extensibility. The system can be easily and inexpensively extended to new tasks and capabilities by adding new modules and routes. Similarly, existing modules can be extended without impacting the overall system.
- Interpretability. Routing decisions and the operation of the modules produce events that can be logged and later used to show the system's rationale for an answer.
- Information currency. Modules that integrate with external APIs can provide the model with up-to-date information, enabling the system to answer questions (for example, "What is the weather in Paris?") that a static model cannot.
- Proprietary knowledge. Modules that integrate with internal systems and APIs can provide the model with proprietary information (for example, "What is the balance of the customer's credit account?") that an isolated model cannot.
- Composability. By composing modules in multi-input/output chains, the system can correctly respond to complex questions and inputs; see the sketch after this list.
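The composability benefit can be illustrated with a short sketch. The module functions below are hypothetical placeholders, but they show how the outputs of two retrieval modules can feed a third module in a multi-input chain; the numbers anticipate the Winnipeg weather example discussed later in this section.

```python
# Sketch of composing expert modules in a multi-input chain (illustrative names).

def current_temp(city: str) -> float:
    """Placeholder for a weather-API module; returns degrees Celsius."""
    return -1.0

def historical_avg(city: str) -> float:
    """Placeholder for a historical-climate module."""
    return 1.4

def compare(current: float, average: float) -> str:
    """Math module: the outputs of the two retrieval modules are its inputs."""
    delta = average - current
    direction = "cooler" if delta > 0 else "warmer"
    return f"{abs(delta):.1f}°C {direction} than the historical norm"

city = "Winnipeg"
print(compare(current_temp(city), historical_avg(city)))
# -> "2.4°C cooler than the historical norm"
```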
A more detailed architectural view of a MRKL system is shown in the diagram above. The following steps trace a query through the system:
1. A user submits a query to a generative AI application (for example, a chatbot or a query interface within an enterprise application).
2. The generative AI application passes the user's query to the MRKL router. The router is shown here as an independent component, invocable through an API, to promote reuse of the service across applications and to decouple the MRKL system from the consuming applications. However, the router could be embedded within the application for solutions such as prototypes, or for integrated chatbots that provide only a 'thin' user interface in front of the MRKL system.
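One way to realize the router-as-a-service pattern from Step 2 is to put the routing logic behind a small HTTP endpoint. The sketch below uses FastAPI purely as an example; the framework choice, the /route path, and the handle_query placeholder are assumptions for illustration, not part of the MRKL pattern itself.

```python
# Illustrative sketch of exposing the router as a standalone HTTP service
# so that multiple applications can share it. Framework, endpoint path,
# and handle_query are assumptions for the example.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Query(BaseModel):
    text: str

def handle_query(text: str) -> str:
    """Placeholder for the full router logic: plan, dispatch, synthesize."""
    return f"routed answer for: {text}"

@app.post("/route")
def route_query(query: Query) -> dict:
    # Consuming applications call this endpoint instead of embedding the
    # router, which decouples them from the MRKL system's internals.
    return {"answer": handle_query(query.text)}
```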
3. The router uses a tuned LLM to break the user query down into a series of actions, or steps, necessary to arrive at an answer. For example, to answer the query "What is the current temperature in Winnipeg, Manitoba, Canada? How does that compare to the historical average for this time of year?" the LLM may respond with a conceptual list of actions such as: retrieve the current temperature in Winnipeg from a weather service; retrieve the historical average temperature in Winnipeg for this time of year; calculate the difference between the two values.
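A plan-generation call of this kind might look like the following sketch. The prompt wording, the JSON plan format, and the ask_llm placeholder are assumptions for illustration; the essential idea is that the LLM returns structured actions the router can execute.

```python
# Illustrative sketch of the plan-generation step. The prompt wording, the
# JSON plan format, and ask_llm are assumptions, not a specific product API.
import json

PLANNER_PROMPT = """Break the user's question into a JSON list of actions.
Each action has a "module" (weather, history, or math) and an "input".

Question: {question}
Actions:"""

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to the tuned LLM; returns a canned plan here."""
    return json.dumps([
        {"module": "weather", "input": "current temperature in Winnipeg"},
        {"module": "history", "input": "average temperature in Winnipeg for this time of year"},
        {"module": "math", "input": "difference between the two temperatures"},
    ])

def plan(question: str) -> list[dict]:
    return json.loads(ask_llm(PLANNER_PROMPT.format(question=question)))

actions = plan("What is the current temperature in Winnipeg, Manitoba, Canada? "
               "How does that compare to the historical average for this time of year?")
```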
4. The router then invokes the appropriate expert module for each action in the list. Continuing with the example from Step 3, the router would call the weather module to retrieve the current temperature, the historical-data module to retrieve the seasonal average, and the math module to calculate the difference between the two.
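The dispatch step might look like the sketch below, which continues the hypothetical plan format from the previous example. The router walks the action list and hands each action's input, plus any earlier results, to the matching module; this loop is also a natural place to emit the log events that make the system interpretable.

```python
# Illustrative dispatch loop, continuing the hypothetical plan format above.
from typing import Callable

def weather_module(inp: str, ctx: list[str]) -> str:
    return "-1°C"   # placeholder for a live weather-API call

def history_module(inp: str, ctx: list[str]) -> str:
    return "1.4°C"  # placeholder for a historical-data lookup

def math_module(inp: str, ctx: list[str]) -> str:
    # Earlier results arrive as context, enabling multi-input chains.
    current, norm = (float(v.rstrip("°C")) for v in ctx)
    return f"{norm - current:.1f}°C"

MODULES: dict[str, Callable[[str, list[str]], str]] = {
    "weather": weather_module,
    "history": history_module,
    "math": math_module,
}

def dispatch(actions: list[dict]) -> list[str]:
    results: list[str] = []
    for action in actions:
        # Each routing decision and module call can be logged here, which
        # is what makes the system's rationale inspectable after the fact.
        results.append(MODULES[action["module"]](action["input"], results))
    return results

# With the canned plan from the previous sketch:
# dispatch(actions) -> ["-1°C", "1.4°C", "2.4°C"]
```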
5. The router uses the LLM to formulate a response; in our example, "The current temperature in Winnipeg is -1°C. That is 2.4°C cooler than the historical norm of 1.4°C."
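The final synthesis call could be as simple as the following sketch, in which the collected module outputs are passed back to the LLM as evidence for composing the user-facing answer. The prompt wording and the ask_llm placeholder are, again, assumptions for illustration.

```python
# Illustrative response-synthesis step; ask_llm is again a placeholder.
def ask_llm(prompt: str) -> str:
    """Placeholder for the LLM call that writes the final answer."""
    return ("The current temperature in Winnipeg is -1°C. That is 2.4°C "
            "cooler than the historical norm of 1.4°C.")

def synthesize(question: str, results: list[str]) -> str:
    prompt = (
        "Using only the evidence below, answer the question in one or two sentences.\n"
        f"Question: {question}\n"
        f"Evidence: current = {results[0]}, historical norm = {results[1]}, "
        f"difference = {results[2]}\n"
    )
    return ask_llm(prompt)
```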
6. The formulated response is passed back to the generative AI application, which returns it to the user.