Agents Powered by Gemini Pro: Unlocking Agentic Workflows with LangTrace

Discover how Gemini Pro and LangTrace unlock agentic workflows for advanced LLM capabilities. Explore planning, external tool usage, and tracing for production-ready AI systems.

January 15, 2025


Unlock the power of AI-driven agents with the Gemini Pro Experimental model. Discover how this cutting-edge language model can seamlessly integrate external tools and APIs to tackle complex tasks, delivering comprehensive and tailored responses. Explore the benefits of this agent-based approach and unlock new possibilities for your content and workflows.

What is an Agent and How Does it Work?

An agent is essentially a large language model (LLM) with additional capabilities (see the sketch after this list), including:

  1. Planning: The agent can decompose the user query and come up with a comprehensive plan to address it.

  2. External Tools/APIs: The agent has access to external tools or APIs that it can use to execute its plan.

  3. Memory: The agent keeps track of where it is in the plan execution and what additional steps it needs to take.
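Conceptually, these three pieces combine into a simple loop: plan the next step, call a tool, record the result, and repeat until the model is ready to answer. Here is a minimal, illustrative sketch; the decide_next_step method is hypothetical shorthand for the LLM call, not a real API (the actual implementation later in this post uses LangChain's ReAct agent):

```python
# Minimal conceptual sketch of an agent loop (illustrative only; the
# `decide_next_step` method is hypothetical shorthand for an LLM call).
def run_agent(llm, tools, question, max_steps=5):
    history = []  # memory: thoughts, actions, and observations so far
    for _ in range(max_steps):
        # Planning: the LLM chooses the next step given the question and history
        step = llm.decide_next_step(question, history)
        if step.is_final_answer:
            return step.answer
        # Tool use: execute the chosen tool with the input the LLM produced
        observation = tools[step.tool_name](step.tool_input)
        # Memory: record the step and its result for the next iteration
        history.append((step, observation))
    return "No answer found within the step budget."
```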

In the example provided, the agent uses the Gemini 1.5 Pro experimental model as the LLM, and it has access to two tools:

  1. RAG (Retrieval-Augmented Generation): This acts as the agent's knowledge base, using the "Attention is All You Need" paper.

  2. Web Search: The agent can look up information on the web to supplement its knowledge.

The agent uses LangTrace to keep track of all the steps it takes in executing the plan. This observability is crucial for understanding the agent's decision-making process, especially in a production environment.

Setting up the Environment

To get started, we need to install the necessary packages and set up the required API keys. Here's how we'll do it:

  1. Install Packages: We'll install the Google Generative AI package, LangChain, Tavily Python (our search engine), FAISS (the vector store), and the LangTrace Python SDK.

  2. Import Packages: We'll import the recursive character text splitter, the PDF loader, the Tavily search results tool, and the Google Generative AI embedding model.

  3. Set up API Keys:

    • Tavily API Key: Sign up on the Tavily website, then click "get API key" to generate one.
    • Google API Key: You can get the Google API key from Google AI Studio.
    • LangTrace API Key: You'll need to create an account on LangTrace, an open-source observability platform for LLM applications. You can then click on "generate API key" to get your API key.
  4. Set Environment Variables: We'll set the Tavily API key and the LangTrace API key as environment variables (see the sketch after this list).

  5. Handle Errors: If you see any errors related to LangSmith integrations, don't worry about them. We're only interested in LangChain and the Gemini Pro integration at this point.
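For reference, here is what steps 3 and 4 look like in code; the key values are placeholders you'd replace with your own:

```python
import os

# Placeholder values; replace with the keys you generated above
os.environ["TAVILY_API_KEY"] = "tvly-..."
os.environ["LANGTRACE_API_KEY"] = "<your-langtrace-api-key>"
os.environ["GOOGLE_API_KEY"] = "<your-google-api-key>"
```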

Now that we have the environment set up, we can move on to the next steps of building the agent and integrating the necessary tools.

Creating the RAG Pipeline

To set up a basic RAG pipeline, we first need to load the PDF document that will serve as our knowledge base. In this case, we have the "transformers.pdf" file, which contains 12 pages of content.

Next, we use the RecursiveCharacterTextSplitter from the langchain.text_splitter module to chunk the document into smaller pieces, each up to 500 characters long with an overlap of 100 characters (the splitter measures length in characters, since we pass length_function=len). This produces a total of 24 chunks that can be used for retrieval.

We then load the Google Generative AI embedding model, which will be used to compute embeddings for each of the document chunks. The embeddings have a dimensionality of 768.

To store the embeddings, we use the FAISS vector store. This allows us to efficiently perform similarity searches on the document chunks when a query is provided.

Finally, we create a RetrievalQA tool that can be used by the agent to retrieve relevant information from the knowledge base. The tool is configured with the FAISS retriever and a description that indicates it is useful for retrieving information related to the "Attention is All You Need" paper.

With this setup, the agent will be able to use the RAG pipeline to find and retrieve relevant information when answering questions about transformers and related topics.
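As a rough sketch, the RetrievalQA tool can be assembled like this; vector_store and llm refer to the FAISS store and the Gemini chat model created in the code in the following sections, and the k=3 value is illustrative:

```python
from langchain.chains import RetrievalQA

# Wrap the FAISS retriever in a question-answering chain.
# `llm` and `vector_store` are defined in the following sections.
retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # stuff the retrieved chunks into a single prompt
    retriever=vector_store.as_retriever(search_kwargs={"k": 3}),
)

print(retrieval_qa.run("What is multi-head attention?"))
```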

Setting up the Tools

To set up the tools for our agent, we first need to install the necessary packages:

```python
# Install required packages
!pip install google-generativeai langchain langchain-google-genai tavily-python faiss-cpu pypdf langtrace-python-sdk
```

Next, we import the required modules and set up the API keys for the different services we'll be using:

```python
import os

from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain.vectorstores import FAISS
from langtrace_python_sdk import langtrace

# Read the API keys set earlier as environment variables
TAVILY_API_KEY = os.environ["TAVILY_API_KEY"]

# Initialize LangTrace so the agent's LLM and tool calls are traced
langtrace.init(api_key=os.environ["LANGTRACE_API_KEY"])
```

We then set up the RAG (Retrieval-Augmented Generation) pipeline by loading the PDF document, splitting it into chunks, and creating the embeddings and vector store:

```python
# Load the PDF document
loader = PyPDFLoader("transformers.pdf")
documents = loader.load()

# Split the document into overlapping chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=100, length_function=len
)
chunks = text_splitter.split_documents(documents)

# Create Google Generative AI embeddings (768-dimensional) and the FAISS vector store
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector_store = FAISS.from_documents(chunks, embeddings)
```

Finally, we create the two tools that our agent will have access to: the retrieval tool and the search tool:

```python
from langchain.agents import Tool
from tavily import TavilyClient

tavily_client = TavilyClient(api_key=TAVILY_API_KEY)

# Create the retrieval tool backed by the FAISS vector store
retriever_tool = Tool(
    name="retriever_tool",
    description=(
        "For any information related to the transformer architecture, use this tool. "
        "Useful for retrieving information related to the Attention is All You Need paper."
    ),
    func=lambda query: vector_store.similarity_search(query, k=3),
)

# Create the web search tool backed by Tavily
search_tool = Tool(
    name="search_tool",
    description=(
        "A search engine optimized for comprehensive, accurate, and trusted results. "
        "Useful for when you need to answer questions about current events. "
        "Input should be a search query."
    ),
    func=lambda query: tavily_client.search(query),
)

tools = [retriever_tool, search_tool]
```
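Before wiring these into the agent, you can sanity-check each tool by invoking it directly; the queries below are just examples:

```python
# Quick smoke test: call each tool directly with a sample query
print(retriever_tool.run("What is scaled dot-product attention?"))
print(search_tool.run("current weather in Los Angeles"))
```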

With the tools set up, we're now ready to create the agent and start using it to answer questions.

Creating the Agent

To create the agent, we will use the ReAct agent from LangChain. This is a special type of agent that can plan, keep track of the steps it has already taken, and call the tools we give it access to.

We first need to provide a detailed prompt, or instructions, that tells the agent how to use the tools and come up with a plan. LangChain provides a ready-made template for this via the LangChain Hub, which is similar to LlamaIndex's LlamaHub.

We will use the ReAct template and modify it as needed. The prompt includes the following:

  1. Answer the following questions as best you can.
  2. You have access to the following tools:
    • Retriever tool: For any information related to the "Attention is All You Need" paper.
    • Search tool: A search engine optimized for comprehensive, accurate, and trusted results. Useful for answering questions about current events.
  3. Use the following format:
    • The input question you must answer.
    • Your thought process.
    • The action to take (i.e., which tool to use).
    • The output of the tool.
    • The final answer.
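Rather than writing this prompt by hand, we can pull the stock ReAct template from the LangChain Hub; the comment below paraphrases its format from memory, so pull the canonical version rather than copying it:

```python
from langchain import hub

# Pull the stock ReAct prompt from the LangChain Hub. Its body follows the
# format described above, approximately:
#   Question: the input question you must answer
#   Thought: you should always think about what to do
#   Action: the action to take, one of [{tool_names}]
#   Action Input: the input to the action
#   Observation: the result of the action
#   ... (Thought/Action/Action Input/Observation can repeat)
#   Thought: I now know the final answer
#   Final Answer: the final answer to the original input question
prompt = hub.pull("hwchase17/react")
```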

With this prompt, we can create the ReAct agent:

```python
from langchain.agents import create_react_agent
from langchain_google_genai import ChatGoogleGenerativeAI

# Gemini 1.5 Pro experimental as the agent's LLM (model name at the time of
# writing; check Google AI Studio for the currently available identifier)
llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro-exp-0801")

agent = create_react_agent(llm=llm, tools=tools, prompt=prompt)
```

Here, tools is the list of tools we defined earlier, and prompt is the ReAct template we pulled from the LangChain Hub.

Now, we can use the AgentExecutor to execute the agent and provide it with queries:

```python
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
)

result = agent_executor.invoke({"input": "What is the current weather in LA?"})
print(result["output"])
```

The agent will go through its thought process, use the appropriate tools, and provide the final answer. You can also inspect the traces using the LangTrace library to understand the agent's decision-making process.

Interacting with the Agent

The agent we have set up is a ReAct (Reasoning and Acting) agent, which means it can plan, execute actions, and update its internal state based on the results of those actions. Let's see how we can interact with it.

First, we'll ask the agent a simple question about the current weather in Los Angeles:

```python
question = "What is the current weather in LA?"
result = agent_executor.invoke({"input": question})
print(result["output"])
```

The agent goes through a thought process, decides to use the search tool to look up the current weather, executes the search, and then provides the final answer.

Next, let's ask the agent for a list of gold medals per country in the current Olympics:

```python
question = "Can you give me a list of gold medals per country in the current Olympics?"
result = agent_executor.invoke({"input": question})
print(result["output"])
```

Here, the agent again decides to use the search tool to look up the relevant information, processes the results, and provides the answer.

Finally, let's ask the agent to explain the concept of attention in transformers:

```python
question = "Can you explain the concept of attention in transformers?"
result = agent_executor.invoke({"input": question})
print(result["output"])
```

In this case, the agent recognizes that the question is related to the "Attention is All You Need" paper, so it decides to use the retrieval tool to fetch relevant information from the paper. It then processes the information and provides a concise explanation of the core concept of scaled dot-product attention.
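For context, the scaled dot-product attention the agent summarizes is defined in the paper as

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right)V$$

where $Q$, $K$, and $V$ are the query, key, and value matrices and $d_k$ is the dimension of the keys; scaling by $\sqrt{d_k}$ keeps the dot products from growing so large that the softmax saturates.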

Throughout these interactions, you can see the agent's thought process and the steps it takes to arrive at the final answer. The use of tools like LangTrace helps us understand the agent's internal workings and the performance of the system.

Conclusion

The Gemini 1.5 Pro experimental model has demonstrated its capabilities as an effective agent, leveraging tools like RAG and web search to provide comprehensive and concise responses to various queries. The use of LangTrace has been instrumental in tracking the agent's thought process and execution steps, providing valuable insights for optimizing the system's performance.

While the model's weather information was slightly off, the agent's ability to rewrite queries and retrieve more relevant information showcases its adaptability and problem-solving skills. The detailed explanations on the core concepts of attention in transformers further highlight the model's depth of understanding and its potential to serve as a valuable tool for users seeking information and insights.

Overall, the Gemini 1.5 Pro experimental model has proven to be a promising agent, capable of integrating external tools, planning, and executing comprehensive responses. As the field of large language models and agent-based systems continues to evolve, this example serves as a testament to the advancements in the field and the potential for even more sophisticated and capable agents in the future.
