GitHub Blog

The architecture of today’s LLM applications

  • Building an LLM app involves five major steps: selecting the right pre-trained model, evaluating its performance, customizing the model to specific needs, setting up user input tools, and deploying the app on an app hosting platform.

  • When selecting a pre-trained model, it is important to consider the number of parameters: a higher parameter count generally indicates stronger learning capabilities, though it also raises compute and hosting costs. Open-source LLMs like OpenLLaMA and the Falcon series can be good options.
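The parameter-count trade-off above can be sketched as a simple selection helper. This is a minimal, hypothetical sketch: the candidate list, the `Candidate` type, and the budget figure are illustrative, not official model data.

```python
from dataclasses import dataclass

# Hypothetical candidate list; sizes are illustrative, not official figures.
@dataclass
class Candidate:
    name: str
    params_billion: float  # parameter count, in billions
    open_source: bool

candidates = [
    Candidate("OpenLLaMA-7B", 7.0, True),
    Candidate("Falcon-40B", 40.0, True),
    Candidate("SmallLM-1B", 1.0, True),
]

def pick_model(candidates, max_params_billion):
    """Pick the largest open-source model that fits the compute budget,
    on the assumption that more parameters means stronger capabilities."""
    eligible = [c for c in candidates
                if c.open_source and c.params_billion <= max_params_billion]
    return max(eligible, key=lambda c: c.params_billion) if eligible else None

best = pick_model(candidates, max_params_billion=10.0)
print(best.name)  # the largest model under the 10B budget
```

In practice the budget would come from the memory and latency limits of the target hardware rather than a fixed number.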

  • Evaluating model performance can be done through offline evaluations: tests run against a fixed dataset before deployment that measure both the quality of the model's outputs and the speed at which it generates them.
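An offline evaluation of this kind can be sketched with a stubbed model and a tiny labeled dataset. Everything here is hypothetical: `stub_model` stands in for a real LLM call, and the evaluation set is illustrative.

```python
import time

# Stubbed "model" standing in for a real LLM call; purely illustrative.
def stub_model(prompt: str) -> str:
    return "paris" if "capital of france" in prompt.lower() else "unknown"

# Small labeled evaluation set (hypothetical).
eval_set = [
    ("What is the capital of France?", "paris"),
    ("What is the capital of Spain?", "madrid"),
]

def offline_eval(model, dataset):
    """Measure output quality (exact match) and speed (mean latency)."""
    correct, total_time = 0, 0.0
    for prompt, expected in dataset:
        start = time.perf_counter()
        output = model(prompt)
        total_time += time.perf_counter() - start
        correct += int(output.strip().lower() == expected)
    return correct / len(dataset), total_time / len(dataset)

accuracy, mean_latency = offline_eval(stub_model, eval_set)
print(f"accuracy={accuracy:.2f}, mean latency={mean_latency * 1000:.3f} ms")
```

Real offline evaluations would use larger datasets and richer metrics (e.g. semantic similarity rather than exact match), but the shape is the same: run the model over held-out inputs and score quality and latency.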

  • Customizing a pre-trained LLM can be achieved through techniques like in-context learning, reinforcement learning from human feedback, or fine-tuning. In-context learning involves providing specific instructions or examples at inference time so the model generates contextually relevant outputs, with no retraining required.
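In-context learning boils down to prompt assembly: the examples travel inside the prompt at inference time, and no weights are updated. A minimal sketch, with hypothetical example data:

```python
# Few-shot examples supplied at inference time (hypothetical content).
examples = [
    ("The movie was wonderful.", "positive"),
    ("I want a refund.", "negative"),
]

def build_few_shot_prompt(examples, query: str) -> str:
    """Assemble instructions plus labeled examples into one prompt;
    the model is steered at inference time, with no weight updates."""
    lines = ["Classify the sentiment of each message as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Message: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Message: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(examples, "My internet keeps dropping.")
print(prompt)
```

The resulting string is what gets sent to the LLM API; the trailing `Sentiment:` cues the model to complete the label for the new query.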

  • The components needed for an LLM app can be grouped into three categories: user input tools, LLM and AI components, and app hosting platforms. User input tools include a UI, LLM API, and input enrichment tools.

  • An example user flow for an LLM app could involve a customer calling their internet service provider for assistance. A router tool in the UI would direct the caller to the right option, and input enrichment tools would package the user's query, together with relevant context, before passing it to the LLM for a response.
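The router and input-enrichment steps in that flow can be sketched together. The keyword routes, the `enrich` payload shape, and the customer ID are all hypothetical, assumed only for illustration:

```python
# Hypothetical router: map a caller's utterance to a support topic
# before the query is packaged for the LLM.
ROUTES = {
    "bill": "billing",
    "slow": "connectivity",
    "outage": "connectivity",
    "password": "account",
}

def route(query: str) -> str:
    """Pick the first matching topic, falling back to 'general'."""
    for keyword, topic in ROUTES.items():
        if keyword in query.lower():
            return topic
    return "general"

def enrich(query: str, customer_id: str) -> dict:
    """Package the raw query with routing and account context for the LLM."""
    topic = route(query)
    return {
        "topic": topic,
        "customer_id": customer_id,
        "prompt": f"[topic: {topic}] Customer asks: {query}",
    }

payload = enrich("My connection is slow every evening.", customer_id="C-1024")
print(payload["topic"])  # routed to the connectivity topic
```

A production router would more likely be an intent classifier (possibly itself an LLM call) than a keyword table, but the role is the same: decide where the query goes and what context rides along with it.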

  • The app hosting platform is responsible for deploying and hosting the LLM app, whether it is run locally or in the cloud.
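For local hosting, the deployment surface can be as small as one HTTP endpoint wrapping the model. A minimal sketch using only the standard library; `llm_reply` is a stub standing in for the real LLM call, and the route and payload shape are assumptions:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stubbed model call; a real deployment would invoke the LLM here.
def llm_reply(prompt: str) -> str:
    return f"echo: {prompt}"

class CompletionHandler(BaseHTTPRequestHandler):
    """Accepts POST {"prompt": ...} and returns {"completion": ...}."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        body = json.loads(self.rfile.read(length))
        reply = json.dumps({"completion": llm_reply(body["prompt"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # keep the demo quiet
        pass

def make_app_server(host="127.0.0.1", port=8080):
    """Create (but do not start) the HTTP server hosting the app."""
    return HTTPServer((host, port), CompletionHandler)

# To run locally: make_app_server().serve_forever()
```

A cloud deployment swaps this loop for a managed platform (containers, serverless, or a model-hosting service), but the contract stays the same: the host receives the user's request, invokes the LLM, and returns the response.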