I have been reading about AI agents and the shift toward agentic approaches for almost all of 2024. This is a quick refresher for me on AI agents, and I hope you find it useful.
Why AI Agents?
Today, if you use any LLM, or simply ChatGPT, it acts like an isolated black box that provides a vanilla answer or solution to your task. But these responses often contain mistakes, are not tested thoroughly, or simply can't count the number of r's in "Strawberry".
AI agents give Large Language Models an opportunity to improve their responses with additional contextual information, feedback, examples, and so on, over multiple steps instead of just one single step.
Simply put, the industry is shifting from: user input → model → response, to: user input → model → get feedback → add context → check sources → iterate until the model is satisfied with its response → final response.
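To make this concrete, here is a minimal sketch of that loop in Python. The `llm()` and `gather_context()` helpers are hypothetical placeholders for a model call and a retrieval step, not any real API:

```python
# A minimal sketch of the agentic loop described above.
# llm() and gather_context() are hypothetical placeholders for a
# model call and a retrieval / source-checking step.

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

def gather_context(task: str, draft: str) -> str:
    raise NotImplementedError("plug in retrieval / source-checking here")

def agent_loop(task: str, max_iters: int = 3) -> str:
    draft = llm(task)                                  # the old single-shot answer
    for _ in range(max_iters):
        feedback = llm(f"Critique this answer to '{task}':\n{draft}")
        if "LGTM" in feedback:                         # model is happy: stop iterating
            break
        context = gather_context(task, draft)          # add context, check sources
        draft = llm(
            f"Task: {task}\nContext: {context}\n"
            f"Feedback: {feedback}\nRevise the answer:\n{draft}"
        )
    return draft
```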
The most interesting aspect for me is that it brings ideas from system design, such as modular functions that can do multiple things in parallel, into Artificial Intelligence.
What are Agentic Design Patterns?
Yes, we want AI agents, but how would they work logically with LLMs and external environments? Dr. Andrew Ng, a well-known name across AI workspaces, categorizes agentic design patterns as follows:
- Reflection
- Tool Use
- Planning
- Multi-Agent Collaboration
Understanding Agentic Design Pattern: Reflection
Instead of accepting the first response from the LLM, we can add a task that automatically lets the model check its own work, a.k.a. reflect, and improve its response as needed. Furthermore, the responses can build a repository of good and bad examples that the model can refer to in order to enhance outcomes.
Below are examples of research papers that have successfully implemented this pattern:
The CRITIC framework by Gou et al. (2024) verifies LLM responses against external tools and improves the response iteratively.
The Self-Refine approach by Madaan et al. (2023) asks the LLM to give feedback on its own output, then asks it to refine the output based on that feedback, iterating until a defined stop condition is met.
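A minimal sketch of such a reflection loop, in the spirit of Self-Refine, might look like the following; `llm` is a hypothetical callable wrapping your model, and the stop condition is purely illustrative:

```python
# Sketch of a Self-Refine style loop (Madaan et al., 2023): the same
# model generates an output, critiques it, and refines it until a
# defined stop condition is met. llm is a hypothetical model wrapper.

def self_refine(llm, task: str, max_rounds: int = 4) -> str:
    output = llm(f"Complete the task:\n{task}")
    for _ in range(max_rounds):
        feedback = llm(f"Give concrete feedback on this output:\n{output}")
        if "no further changes" in feedback.lower():   # illustrative stop condition
            break
        output = llm(
            f"Task: {task}\nPrevious output:\n{output}\n"
            f"Feedback:\n{feedback}\nProduce an improved output."
        )
    return output
```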
Use Cases for Reflection
Based on my understanding, the Reflection pattern can be applied to the following use cases familiar to me:
- Create Marketing Content for A/B/N Testing
- Improve marketing content in alignment with marketing rules & regulations
- Validate New Content based on local or regional factors
Understanding Agentic Design Pattern: Tool Use
Tools are simple functions or modules that an LLM can call, either to gather more information (such as a web search) or to execute an action (such as triggering an API via curl). Each tool carries a detailed description and its dependencies as parameters. When multiple tools are available, we can expect the LLM to automatically select the best one based on the task and context.
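As a sketch of how this might look, here is a tiny tool registry where each tool exposes a description the model can use to choose. The `web_search` stub, the `llm` callable, and the selection prompts are all hypothetical:

```python
# Sketch of a tiny tool registry: each tool carries a description that
# the model sees when choosing. web_search is left as a stub; the
# curl-based tool shells out to the real curl binary.
import subprocess

def web_search(query: str) -> str: ...

def http_get(url: str) -> str:
    # Trigger an API endpoint via curl, as mentioned above.
    return subprocess.run(["curl", "-s", url],
                          capture_output=True, text=True).stdout

TOOLS = {
    "web_search": {"fn": web_search,
                   "description": "Look up fresh information on the web."},
    "http_get":   {"fn": http_get,
                   "description": "Trigger an API endpoint via an HTTP GET."},
}

def choose_and_run(llm, task: str) -> str:
    menu = "\n".join(f"- {name}: {t['description']}" for name, t in TOOLS.items())
    name = llm(f"Task: {task}\nPick ONE tool by name:\n{menu}").strip()
    arg = llm(f"Task: {task}\nGive the single argument for tool '{name}'.").strip()
    return TOOLS[name]["fn"](arg)
```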
Here are some examples of implementing this design pattern:
The Chain-of-Abstraction method by Gao et al. (2024) decouples the general reasoning capability of LLMs from domain-specific knowledge available on external platforms, such as weather data. It trains the LLM to split the request into reasoning chains with abstract placeholders; the placeholders are then filled in by domain-specific tools external to the LLM.
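An illustrative sketch of the placeholder idea (not a faithful reproduction of the paper): the model emits a chain with abstract slots such as `[temp:CITY]`, and an external weather tool fills them in. Here `get_weather` and `llm` are hypothetical stubs:

```python
# Illustrative sketch of the Chain-of-Abstraction idea (Gao et al., 2024):
# the model first produces a reasoning chain with abstract placeholders,
# which external domain-specific tools then fill in.
import re

def get_weather(city: str) -> str: ...   # hypothetical weather-platform tool

def chain_of_abstraction(llm, question: str) -> str:
    # Step 1: abstract reasoning chain, e.g.
    # "It is [temp:Paris] in Paris today, so pack accordingly."
    chain = llm(f"Answer using placeholders like [temp:CITY]:\n{question}")

    # Step 2: replace each placeholder with a tool result.
    def fill(match: re.Match) -> str:
        return get_weather(match.group(1))

    return re.sub(r"\[temp:([^\]]+)\]", fill, chain)
```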
Gorilla by Patil et al. (2023) is a model fine-tuned specifically for AI/ML API calls. Trained on instruction-API pairs, it generates reliable API calls to machine learning models with substantially reduced hallucination, and it also accommodates API changes over time.
Use Cases for Tool Use
The above examples are impressive implementations of this pattern, but there is still potential to cover the following use cases:
- Triggering Actionable Insights : once actionable insights are created, let tools trigger them automatically without human intervention
- API to Tools Library : convert existing in-house enterprise APIs into tools that LLMs can choose from. This is where a hybrid RAG approach would need to be thought through
- Tools Manager : decide which tools to trigger based on cost, context, etc. (see the sketch below)
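For the Tools Manager idea, a naive cost-aware selector might look like this; the catalog, costs, and prompts are all made up for illustration:

```python
# Sketch of a "Tools Manager": try tools in ascending cost order and
# pick the cheapest one the model judges sufficient for the task.
# The catalog and costs are invented for illustration.

TOOL_CATALOG = [
    {"name": "cache_lookup", "cost": 0.001, "description": "Answer from cached results."},
    {"name": "internal_api", "cost": 0.01,  "description": "Query in-house enterprise data."},
    {"name": "web_search",   "cost": 0.05,  "description": "Search the public web."},
]

def pick_tool(llm, task: str) -> dict:
    for tool in sorted(TOOL_CATALOG, key=lambda t: t["cost"]):
        verdict = llm(
            f"Task: {task}\nTool: {tool['name']} - {tool['description']}\n"
            "Answer YES if this tool is sufficient, otherwise NO."
        )
        if verdict.strip().upper().startswith("YES"):
            return tool
    return TOOL_CATALOG[-1]        # fall back to the most capable tool
```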
Understanding Agentic Design Pattern: Planning
Here, instead of asking the model to accomplish a task directly, we ask it for the steps it would take to accomplish the task, and/or break the task down into smaller subtasks that build toward the final goal. It can then use tools to complete the subtasks and pass parameters between them.
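A bare-bones sketch of this plan-then-execute flow, again with a hypothetical `llm` callable; a real implementation would parse the plan more carefully and route individual steps to tools:

```python
# Sketch of the Planning pattern: ask the model for a step-by-step plan
# first, then execute each step, passing each result on to the next.
# llm is a hypothetical model wrapper.

def plan_and_execute(llm, task: str) -> str:
    plan = llm(f"Break this task into numbered steps:\n{task}")
    steps = [line for line in plan.splitlines() if line.strip()]
    result = ""
    for step in steps:
        # Each step sees the output of the previous one, which is how
        # parameters get passed along the chain (or into tools).
        result = llm(f"Step: {step}\nInput from previous step:\n{result}")
    return result
```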
Huang et al. (2024) have created a systemic view of how LLM-based agents plan, categorizing the work into Task Decomposition, Plan Selection, External Module, Reflection, and Memory. When enterprise AI agents are created, this systemic view would help define their interaction with various tools and resources.
Chain-of-Thought prompting by Wei et al. (2022) brought a fundamental shift in prompt engineering, derived from the human thought process of breaking a complicated reasoning task into a step-by-step problem. There is already so much context and detail out there that I will not cover it further here.
Use Cases for Planning
- Building Presentations : rather than starting from scratch, let the LLM plan to build a presentation step by step and then use tools to gather information from external sources
Understanding Agentic Design Pattern: Multi-Agent Collaboration
Similar to a manager hiring multiple specialists to complete a specialized task, multiple AI agents can be triggered to complete various aspects of the same task. It follows the same paradigm as when, dependent on a single CPU, we often break our programs into different threads or subroutines.
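A toy sketch of this idea: several role-prompted "agents" (really just differently primed calls to the same hypothetical `llm`) each tackle one aspect, and a manager call merges the results. The roles and prompts are invented for illustration:

```python
# Sketch of multi-agent collaboration: role-prompted agents each handle
# one aspect of the task, then a manager merges their drafts.
# llm is a hypothetical model wrapper; roles are illustrative.

ROLES = {
    "product_manager": "You write crisp requirements.",
    "engineer":        "You design the technical solution.",
    "qa_reviewer":     "You list risks and test cases.",
}

def collaborate(llm, task: str) -> str:
    drafts = {
        role: llm(f"{persona}\nTask: {task}")
        for role, persona in ROLES.items()
    }
    combined = "\n\n".join(f"[{role}]\n{text}" for role, text in drafts.items())
    return llm(f"As the manager, merge these into one deliverable:\n{combined}")
```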
ChatDev by Qian et al. (2023) is the most fun example of multi-agents: you are responsible for running a virtual software development company, a complex task, and you communicate with LLM agents assigned various roles in the organization to develop solutions.
AutoGen by Wu et al. (2023) is an open-source framework that combines human input, tools, and LLMs to build applications. Developers can assign agents defined behaviors and outcomes based on the input; these agents then collaborate and integrate with other tools or human-in-the-loop workflows depending on the scenario.
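As a concrete starting point, the snippet below follows my reading of the AutoGen (pyautogen ~0.2) quickstart; the API has evolved across versions, so treat the exact names as assumptions to verify against the current docs:

```python
# Based on the AutoGen (pyautogen ~0.2) quickstart; the API has changed
# across versions, so verify these names against the current docs.
import os
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "gpt-4",
                               "api_key": os.environ["OPENAI_API_KEY"]}]}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",       # set to "ALWAYS" for human-in-the-loop
    code_execution_config=False,    # disable local code execution for safety
)

# The two agents converse until the task is resolved or a limit is hit.
user_proxy.initiate_chat(assistant, message="Summarize agentic design patterns.")
```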