🚀 SAP AI Core Agent QuickLaunch Series 🚀 – Part 4 RAG Basics ①: HANA Cloud VectorEngine&Embedding

Estimated read time: 16 minutes

Generative AI has leapt from research papers to daily business reality— and SAP is surfing that wave at full speed. In this hands‑on series, I’ll show you how to spin up a custom AI agent on SAP AI Core in minutes, then grow it into a production‑ready asset—without drowning in theory.

Notice
The Japanese version is available here.

 

📖 What You’ll Learn in This Series

How to spin up a custom AI agent on SAP AI Core in minutes
Hands-on with LangChain, Google Search Tool, RAG, and Streamlit
Exposing the agent as a REST API and rebuilding the UI in SAPUI5/Fiori

Time Commitment
Each part is designed to be completed in 10–15 minutes.

 

🗺️ Series Roadmap

Part 0 Prologue
Part 1 Env Setup: SAP AI Core & AI Launchpad
Part 2 Building a Chat Model with LangChain
Part 3 Agent Tools: Integrating Google Search
Part 4 RAG Basics ①: HANA Cloud VectorEngine & Embedding [current blog]
Part 5 RAG Basics ②: Building Retriever Tool
Part 6 Streamlit UI Prototype
Part 7 Expose as a REST API
Part 8 Rebuild the UI with SAPUI5

Note
Subsequent blogs will be published soon.

If you enjoyed this post, please give it a kudos! Your support really motivates me. Also, if there’s anything you’d like to know more about, feel free to leave a comment!

RAG Basics ①: HANA Cloud VectorEngine & Embedding

1 | Introduction

General chat models often miss the mark because they don’t fully grasp a company’s specific terms or workflows. In the last chapter, we gave the agent a Google search tool so it could pull in the latest information from the web—but that alone can’t handle internal documents or your own knowledge base.

In this chapter, we’ll preload your internal docs into SAP HANA Cloud VectorEngine and set up a similarity search—the core of RAG (Retrieval-Augmented Generation)—so the agent can look up and reference that information when crafting its answers.

 

2 | Prerequisites

BTP sub-account
SAP AI Core instance
SAP AI Launchpad subscription
Python 3.13 and pip
VSCode, BAS, or any IDE

Note for the Trial Environment
The HANA Cloud instance in the Trial environment automatically shuts down every night. If your work spans past midnight, please restart the instance the following day.

 

3 | Create a HANA Cloud Instance

First, let’s review RAG (Retrieval-Augmented Generation). RAG follows a two-step workflow of “Retrieve → Generate”:

Retrieve — Use vector similarity to fetch documents that closely match the query.
Generate — Append those retrieved documents to the LLM’s prompt and produce an answer.

By supplying reliable, in-house information during the retrieval phase, you can curb LLM hallucinations while still addressing your organization’s unique knowledge—one of RAG’s biggest benefits.
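This two-step flow can be sketched in a few lines of plain Python. Note this is a toy illustration only: the keyword-overlap retriever below stands in for the vector similarity search we build with HANA Cloud later in this part, and the prompt is simply printed rather than sent to an LLM.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase and split text into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy Retrieve step: rank documents by word overlap with the query."""
    query_words = tokenize(query)
    scored = sorted(
        documents,
        key=lambda d: len(query_words & tokenize(d)),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, context_docs: list[str]) -> str:
    """Generate step: append the retrieved documents to the LLM prompt."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Our security policy requires MFA for all admin accounts.",
    "Expense reports are approved by the cost-center owner.",
    "Access control lists are reviewed every quarter.",
]
top = retrieve("What is the security policy?", docs)
print(build_prompt("What is the security policy?", top))
```

In the real pipeline, `retrieve` is replaced by an embedding-based similarity search against HANA Cloud, and the assembled prompt is handed to the chat model from Part 2.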

To prepare the indispensable vector store for RAG, start by provisioning an SAP HANA Cloud instance in your BTP Trial environment. In the subsequent steps, you’ll save the embedding vectors you generate into this database, creating a solid foundation for your agent’s high-speed searches.

 

Follow the tutorial “Provision an Instance of SAP HANA Cloud, SAP HANA Database” to create your HANA Cloud instance.

When configuring your HANA Cloud instance, in addition to the tutorial steps, apply these settings:

Runtime Environment: change to Cloud Foundry
Additional Features: turn Natural Language Processing (NLP) ON
Connection: set Allowed Connections to Allow all IP addresses

Once provisioning completes in HANA Cloud Central, the instance will appear in your BTP Cockpit.
As with SAP AI Core, generate a service key to prepare for RAG connectivity.

 

4 | Interact with HANA Cloud via LangChain

In this section, you’ll use the official LangChain connector to hook into your HANA Cloud instance and prepare to invoke its built-in embedding model. The complete flow is covered in the official docs—give them a quick read to familiarize yourself with the overall steps.

Add your service-key details to a .env file. In keeping with our “learn in seconds” approach, we’ll reuse the existing DBADMIN user rather than creating a new one. (For production, you should provision a dedicated service account.)

| Service-Key Field | .env Variable | Example / Value |
| --- | --- | --- |
| host | HANA_DB_ADDRESS | e.g. XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXXX.hana.trial-us10.hanacloud.ondemand.com |
| port | HANA_DB_PORT | e.g. 443 |
| user | HANA_DB_USER | DBADMIN |
| password | HANA_DB_PASSWORD | (the password you set when you created the instance) |

With these environment variables in place, LangChain’s HANA Cloud connector will be able to authenticate and send your vectors to the database for RAG retrieval.
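Before connecting, it is worth verifying that all four variables are actually set, since a missing one produces an opaque connection error later. A small sanity-check sketch (the variable names follow the table above; this cell is not part of the official flow):

```python
import os

# The four variables the HANA Cloud connection cells rely on
REQUIRED_VARS = [
    "HANA_DB_ADDRESS",
    "HANA_DB_PORT",
    "HANA_DB_USER",
    "HANA_DB_PASSWORD",
]

def check_hana_env() -> list[str]:
    """Return the names of any required HANA variables that are missing or empty."""
    return [name for name in REQUIRED_VARS if not os.getenv(name)]

missing = check_hana_env()
print("Missing .env variables:", missing if missing else "none")
```

If you load your `.env` file with a tool such as python-dotenv, run that load first; otherwise `os.getenv` will report every variable as missing.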

 

Install the necessary libraries by running the following commands in your terminal:

pip install langchain-hana hdbcli --quiet

langchain-hana is the library that lets you access HANA Cloud’s vector store and built-in embedding models.
hdbcli is the SAP HANA Database Client for Python.

 

HANA Cloud has offered an internal embedding model since Q4 2024. If you enabled Natural Language Processing (NLP) under Additional Features when you provisioned your instance, you can call it directly.
Add the following cell to your notebook:

# ▶ Notebook Cell 6
from langchain_hana import HanaInternalEmbeddings

embeddings = HanaInternalEmbeddings(
    internal_embedding_model_id="SAP_NEB.20240715"
)

This sets up the embeddings object to use the SAP_NEB.20240715 internal model for generating vectors.

 

Finally, initialize the vector store (i.e., create the table). Name the table “TEST_TABLE” and add the following cell:

# ▶ Notebook Cell 7
from langchain_hana import HanaDB
from hdbcli import dbapi
import os

connection = dbapi.connect(
    address=os.getenv("HANA_DB_ADDRESS"),
    port=os.getenv("HANA_DB_PORT"),
    user=os.getenv("HANA_DB_USER"),
    password=os.getenv("HANA_DB_PASSWORD"),
    sslValidateCertificate=False,
    autocommit=True,
)

db = HanaDB(
    embedding=embeddings,
    connection=connection,
    table_name="TEST_TABLE"
)

print("✅ HANA Vector Store is ready!")

With this, you’ve set up LangChain to connect to HANA Cloud, configured the internal embedding model, and initialized your vector store. In the next step, you’ll chunk your text files, generate embeddings, store them in TEST_TABLE, and build the similarity search needed for RAG.

 

5 | Retrieve Relevant Documents via Similarity Search

In this chapter, we’ll build the Retrieve part by implementing a similarity search that returns two documents relevant to a query.

First, load the text file that will feed the similarity search, and split it into text chunks of about 100 characters—an optimal size for embedding. Download the file sap_en_corpus.txt from the link below and place it in the same folder as your notebook:
🔗 https://drive.google.com/drive/folders/1wYLPcKrsgUs6P7JqDgHS30qoGE-gt3Av?usp=sharing

Then add the following cell:

# ▶ Notebook Cell 8
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Load the entire text file as a single document
text_documents = TextLoader("sap_en_corpus.txt").load()

# Split into 100-character chunks, with no overlap
text_splitter = CharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=0,
)

# Perform the split and inspect the number of chunks
text_chunks = text_splitter.split_documents(text_documents)
print(f"Number of document chunks: {len(text_chunks)}")

What this cell does:

TextLoader: Reads the text file as one document.
CharacterTextSplitter: Breaks the document into 100-character chunks.
The resulting list of Document objects is stored in text_chunks.
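Conceptually, fixed-size splitting without overlap is very simple. A plain-Python sketch of the idea (the real CharacterTextSplitter is smarter: it prefers splitting on separators and carries document metadata along with each chunk):

```python
def split_into_chunks(text: str, chunk_size: int = 100) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

# Toy corpus standing in for the downloaded text file
sample = "SAP HANA Cloud stores embedding vectors alongside relational data. " * 5
chunks = split_into_chunks(sample, chunk_size=100)
print(f"Number of chunks: {len(chunks)}")
```

Every chunk except possibly the last has exactly `chunk_size` characters, and joining the chunks back together reproduces the original text.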

 

Next, store your text chunks in the HANA Cloud vector store by adding this cell:

# ▶ Notebook Cell 9
# Clear any existing rows to reset the demo table
db.delete(filter={})

# Embed each text chunk and write them into TEST_TABLE
db.add_documents(text_chunks)

What this cell does:

db.delete(filter={}): Deletes all existing rows to initialize the demo table.
db.add_documents(text_chunks): Embeds each text chunk and writes it into TEST_TABLE.

 

Now let’s implement similarity search. Add the following cell:

# ▶ Notebook Cell 10
# Define your query
query = "I want to know about the security"

# Retrieve the top 2 most similar chunks
docs = db.similarity_search(query, k=2)

# Print out the content of each retrieved chunk
for doc in docs:
    print("-" * 80)
    print(doc.page_content)

Here, the input query is vectorized using the same embedding model, and HANA Cloud returns the two text chunks with the highest similarity.

If the search results include related content—such as “security policies” or “access control”—then you’ve succeeded!
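Behind the scenes, the ranking is based on a vector distance measure; by default the LangChain HANA connector uses cosine similarity. As an illustration of what the database computes during retrieval, here is the metric in plain Python (a sketch with toy 3-dimensional vectors; real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query_vec = [0.9, 0.1, 0.0]  # toy embedding of the query
doc_vecs = {
    "security policy chunk": [0.8, 0.2, 0.1],
    "travel expense chunk":  [0.1, 0.1, 0.9],
}

# Rank documents by similarity to the query, highest first
ranked = sorted(
    doc_vecs,
    key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
    reverse=True,
)
print(ranked)  # the security chunk ranks first
```

`similarity_search(query, k=2)` performs exactly this kind of ranking, except the vectors come from the embedding model and the comparison runs inside HANA Cloud rather than in Python.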

 

6 | Challenge – Compare Accuracy with Your Custom Embedding Model

In Part 2 you deployed an OpenAI embedding model. Now let’s use that model to compare results on the same data and with the same query.

# ▶ Notebook Cell 11
from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings
from langchain_hana import HanaDB
from hdbcli import dbapi
import os

# Swap out the embedding
openai_embed = OpenAIEmbeddings(
    deployment_id=os.getenv("EMBED_DEPLOYMENT_ID")  # The ID you noted in Part 2
)

# Use a separate table for easy comparison
db_alt = HanaDB(
    embedding=openai_embed,
    connection=connection,
    table_name="TEST_TABLE_OAI"
)

# Initialize the alternate table
db_alt.delete(filter={})
db_alt.add_documents(text_chunks)

# Run the same query
alt_docs = db_alt.similarity_search(query, k=2)
print("\n=== Results with OpenAI Embeddings ===")
for doc in alt_docs:
    print("-" * 80)
    print(doc.page_content)

Points of Comparison

How do the top-hit chunks differ between the HANA built-in model and the OpenAI model?
Try both short and long queries (and other variations) to see where each model excels.
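To eyeball the two result sets side by side, a small helper like the following can be handy (a hypothetical helper, not part of any library; it only assumes each result object has a `page_content` attribute, as LangChain Document objects do — the SimpleNamespace stand-ins below exist only so the sketch runs without a database):

```python
from types import SimpleNamespace

def compare_results(results_a, results_b, label_a="HANA built-in", label_b="OpenAI"):
    """Print two retrieval result lists side by side and return their overlap."""
    texts_a = [d.page_content for d in results_a]
    texts_b = [d.page_content for d in results_b]
    overlap = set(texts_a) & set(texts_b)
    for label, texts in ((label_a, texts_a), (label_b, texts_b)):
        print(f"=== {label} ===")
        for t in texts:
            print("-", t[:80])  # show only the first 80 characters per chunk
    print(f"Chunks returned by both models: {len(overlap)}")
    return overlap

# Stand-ins for docs / alt_docs from Cells 10 and 11
docs_a = [SimpleNamespace(page_content="security policies chunk"),
          SimpleNamespace(page_content="access control chunk")]
docs_b = [SimpleNamespace(page_content="security policies chunk"),
          SimpleNamespace(page_content="MFA setup chunk")]
shared = compare_results(docs_a, docs_b)
```

In your notebook you would call `compare_results(docs, alt_docs)` with the real result lists from Cells 10 and 11.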

 

7 | Next Up

Part 5 RAG Basics ②: Building Retriever Tool

In Part 5, we’ll integrate the HANA Cloud vector store you’ve just created as a RAG tool into a LangChain Agent. You’ll see how to combine it with a Google Search tool to run a “two-pronged” agent. Stay tuned!

 

Disclaimer

All the views and opinions in this blog are my own, made in my personal capacity, and SAP shall not be responsible or liable for any of the contents published in this blog.

 


