🚀 SAP AI Core Agent QuickLaunch Series 🚀 – Part 5 RAG Basics ②: Building Retriever Tool

Estimated read time: 13 minutes

Generative AI has leapt from research papers to daily business reality, and SAP is surfing that wave at full speed. In this hands-on series, I'll show you how to spin up a custom AI agent on SAP AI Core in minutes, then grow it into a production-ready asset without drowning in theory.

Notice
The Japanese version is available here.

 

📖 What You’ll Learn in This Series

- How to spin up a custom AI agent on SAP AI Core in minutes
- Hands-on with LangChain, Google Search Tool, RAG, and Streamlit
- Exposing the agent as a REST API and rebuilding the UI in SAPUI5/Fiori

Time Commitment
Each part is designed to be completed in 10–15 minutes.

 

🗺️ Series Roadmap

- Part 0 Prologue
- Part 1 Env Setup: SAP AI Core & AI Launchpad
- Part 2 Building a Chat Model with LangChain
- Part 3 Agent Tools: Integrating Google Search
- Part 4 RAG Basics ①: HANA Cloud Vector Engine & Embedding
- Part 5 RAG Basics ②: Building Retriever Tool [current blog]
- Part 6 Streamlit UI Prototype
- Part 7 Expose as a REST API
- Part 8 Rebuild the UI with SAPUI5

Note
Subsequent blogs will be published soon.

If you enjoyed this post, please give it kudos! Your support really motivates me. Also, if there's anything you'd like to know more about, feel free to leave a comment!

RAG Basics ②: Building Retriever Tool

1 | Introduction

In the previous chapter, we stored our internal documents in the SAP HANA Cloud Vector Engine and ran similarity searches between embedded user queries and those documents. In this chapter, we'll turn that search logic into a tool our AI agent can invoke (a Retriever Tool). With this in place, the agent will be able to build its responses using a two-stage "internal vector store → web search" process.

 

2 | Prerequisites

- BTP sub-account
- SAP AI Core instance
- SAP AI Launchpad subscription
- Python 3.13 and pip
- VS Code, BAS, or any IDE

Note for the Trial Environment
The HANA Cloud instance in the Trial environment automatically shuts down every night. If your work spans past midnight, please restart the instance the following day.

 

3 | Build the Retriever Tool

A retriever is an object that returns text chunks whose content closely matches a given query, and it’s the first component invoked in LangChain’s RAG pipeline. Because our DB instance from the previous chapter already contains both vectors and metadata, we can promote it directly to a retriever.

First, let’s inspect the data we embedded and stored in the last chapter via the HANA Cloud Explorer. If you open the TEST_TABLE table, you’ll see that it holds both the text column and its corresponding vector column.

 

In LangChain, you can turn your vector store into a Retriever in one line using the .as_retriever() method. Let’s reuse the db instance we created in the previous chapter:

# ▶ Notebook Cell 11
retriever = db.as_retriever()

# Send a test query and retrieve just one result
retriever.invoke("Tell me about SAP security")[0]
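By default, the retriever returns the top four most similar chunks (LangChain's built-in default of k = 4). If you want tighter or broader context, you can tune this via search_kwargs. A minimal optional sketch, not part of the original notebook:

# Optional sketch: tune how many chunks the retriever returns.
# k = 4 is LangChain's default; lower it for tighter context.
retriever = db.as_retriever(search_kwargs={"k": 3})

# Each result is a Document with page_content and metadata
for doc in retriever.invoke("Tell me about SAP security"):
    print(doc.page_content[:80])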

 

Next, we’ll package this Retriever as a tool. LangChain’s create_retriever_tool helper takes three arguments—the Retriever itself, a tool name, and a description. Because the description is what the LLM reads to decide when to invoke the tool, be sure to state clearly what this tool searches and when it should be used. Here’s an example:

# ▶ Notebook Cell 12
from langchain.tools.retriever import create_retriever_tool

retriever_tool = create_retriever_tool(
    retriever=retriever,
    name="hana_vectorengine",
    description=(
        "Use this tool to search internal SAP documents stored in "
        "HANA Cloud Vector Engine when the user asks about "
        "company-specific policies, security, or best practices."
    ),
)
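Before wiring the tool into an agent, a quick smoke test is worthwhile. A tool built by create_retriever_tool takes a single query string, so you can invoke it directly; the query below is just an illustrative example, not from the original notebook:

# Optional smoke test: call the tool directly, outside any agent.
# The tool's output is the retrieved chunks joined into a single string.
print(retriever_tool.invoke({"query": "SAP security best practices"}))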

 

4 | Integrate and Run in the AI Agent

Now that we have a two-tool setup—Google Search and our HANA VectorEngine retriever—we can wire them into an AI Agent. Pass both tools into initialize_agent() to create the agent instance, then fire off a complex query:

# ▶ Notebook Cell 13
from langchain.agents import initialize_agent, AgentType

agent = initialize_agent(
    tools=[google_tool, retriever_tool],
    llm=chat_llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True,
)

result = agent.invoke(
    "Please explain the services offered by RISE with SAP and, as of 2025, "
    "the cloud-migration market share compared to other vendors."
)

 

By setting verbose=True, you’ll see the agent’s full “Thought → Action → Observation” trace. For example, in the screenshot below you’ll spot:

Invoking: `google_search` with `cloud migration market share 2025 SAP compared to other vendors`

which shows the agent pulling the latest market-share data via Google Search, followed by:

Invoking: `hana_vectorengine` with `{'query': 'services offered by RISE with SAP'}`

indicating that it’s then querying your internal SAP documents in HANA Cloud VectorEngine. (The exact order or query phrasing may vary slightly by environment.)
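If you'd rather capture those tool calls programmatically than read the verbose console log, you can ask for the intermediate steps back. A hedged sketch, assuming initialize_agent forwards extra keyword arguments to the underlying AgentExecutor (which it does in classic LangChain); the query here is illustrative:

# Optional sketch: collect the tool-call trace as data.
# return_intermediate_steps is an AgentExecutor option; initialize_agent
# passes extra keyword arguments through to it.
agent_debug = initialize_agent(
    tools=[google_tool, retriever_tool],
    llm=chat_llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    return_intermediate_steps=True,
)

result = agent_debug.invoke("What services does RISE with SAP offer?")
for action, observation in result["intermediate_steps"]:
    print(action.tool, "->", str(observation)[:80])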

  

Finally, let’s render the agent’s final answer as Markdown. With this step, we’ve completed our chapter goal—an AI Agent that combines a web search tool with the HANA Cloud VectorEngine Retriever!

# ▶ Notebook Cell 14
from IPython.display import Markdown, display

display(Markdown(result["output"]))

 

5 | Challenge – Add Memory to the AI Agent

So far, the AI Agent we've built retains no conversation history. Without it, the agent can't handle natural dialogue features like "answer based on the previous question" or "remember the user's name." To fix that, we'll extend it to store history using LangChain's classic ConversationBufferMemory class.

Future migration note:
ConversationBufferMemory is deprecated as of LangChain 0.3, and memory implementations based on LangGraph are recommended going forward. In this series, we're prioritizing speed to implementation, so we'll stick with the classic class for now. In Part 5.5, we'll revisit this and build an AI Agent with LangGraph-based memory, covering the migration steps in detail.

# ▶ Notebook Cell 15
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain_core.prompts import MessagesPlaceholder
from langchain.agents import initialize_agent, AgentType

agent_kwargs = {
    "extra_prompt_messages": [MessagesPlaceholder(variable_name="memory")],
}

memory = ConversationBufferMemory(memory_key="memory", return_messages=True)

agent_with_memory = initialize_agent(
    tools=[google_tool, retriever_tool],
    llm=chat_llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True,
    agent_kwargs=agent_kwargs,
    memory=memory,
)

In the cell above, we did two things:

1. Created a ConversationBufferMemory instance to hold the conversation history.
2. Passed that memory into initialize_agent.

By including a MessagesPlaceholder, LangChain will now automatically inject the stored history into each prompt. 

Next, let’s verify that it’s working:

result1 = agent_with_memory.invoke("My name is Ryota")
result2 = agent_with_memory.invoke("Who am I")
print(result2["output"])

On the second call, you should see a reply like "Your name is Ryota," which shows that the agent remembered your earlier message. Check the output to confirm that the history is being threaded through.
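To see exactly what the agent has remembered, you can also inspect the buffer directly. A minimal sketch using ConversationBufferMemory's standard attributes:

# Optional: inspect the stored history directly.
# chat_memory.messages holds the raw HumanMessage/AIMessage objects.
for msg in memory.chat_memory.messages:
    print(f"{msg.type}: {msg.content}")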

With memory enabled, you can now naturally ask follow-up questions in the same session, or build on results from HANA Cloud searches and Google searches. Give it a try!

 

6 | Next Up

Part 6 Streamlit UI Prototype

In Part 6, we'll kick things off by getting hands-on with Streamlit in record time. We'll build only the bare-bones UI (chat input box, send button, and message history) and verify that a local web app spins up successfully. Stay tuned!

 

Disclaimer

All views and opinions in this blog are my own, expressed in my personal capacity; SAP shall not be responsible or liable for any of the contents published in this blog.

 

