Generative AI has leapt from research papers to daily business reality, and SAP is surfing that wave at full speed. In this hands-on series, I'll show you how to spin up a custom AI agent on SAP AI Core in minutes, then grow it into a production-ready asset, without drowning in theory.
Notice
The Japanese version is available here.
What You'll Learn in This Series
How to spin up a custom AI agent on SAP AI Core in minutes
Hands-on with LangChain, Google Search Tool, RAG, and Streamlit
Exposing the agent as a REST API and rebuilding the UI in SAPUI5/Fiori
Time Commitment
Each part is designed to be completed in 10–15 minutes.
Series Roadmap
Part 0 Prologue
Part 1 Env Setup: SAP AI Core & AI Launchpad
Part 2 Building a Chat Model with LangChain
Part 3 Agent Tools: Integrating Google Search
Part 4 RAG Basics ①: HANA Cloud VectorEngine & Embedding [current blog]
Part 5 RAG Basics ②: Building Retriever Tool
Part 6 Streamlit UI Prototype
Part 7 Expose as a REST API
Part 8 Rebuild the UI with SAPUI5
Note
Subsequent blogs will be published soon.
If you enjoyed this post, please give it a kudos! Your support really motivates me. Also, if there's anything you'd like to know more about, feel free to leave a comment!
RAG Basics ①: HANA Cloud VectorEngine & Embedding
1 | Introduction
General chat models often miss the mark because they don't fully grasp a company's specific terms or workflows. In the last chapter, we gave the agent a Google search tool so it could pull in the latest information from the web, but that alone can't handle internal documents or your own knowledge base.
In this chapter, we'll preload your internal docs into SAP HANA Cloud VectorEngine and set up a similarity search, the core of RAG (Retrieval-Augmented Generation), so the agent can look up and reference that information when crafting its answers.
2 | Prerequisites
BTP sub-account
SAP AI Core instance
SAP AI Launchpad subscription
Python 3.13 and pip
VSCode, BAS, or any IDE
Note for the Trial Environment
The HANA Cloud instance in the Trial environment automatically shuts down every night. If your work spans past midnight, please restart the instance the following day.
3 | Create a HANA Cloud Instance
First, let's review RAG (Retrieval-Augmented Generation). RAG follows a two-step workflow of "Retrieve → Generate":
Retrieve: use vector similarity to fetch documents that closely match the query.
Generate: append those retrieved documents to the LLM's prompt and produce an answer.
By supplying reliable, in-house information during the retrieval phase, you can curb LLM hallucinations while still covering your organization's unique knowledge, one of RAG's biggest benefits.
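To make the Generate step concrete: the retrieved chunks are simply stitched into the prompt as grounding context before the question. Below is a minimal stdlib-only sketch; the prompt wording and the build_rag_prompt helper are illustrative assumptions, not part of LangChain or any SAP API.

```python
# Minimal sketch of the "Generate" half of RAG: retrieved chunks are
# prepended to the user's question as grounding context.
# The prompt template is illustrative, not what LangChain produces internally.

def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    # Number each chunk so the model (and you) can see where facts came from
    context = "\n\n".join(f"[doc {i+1}] {c}" for i, c in enumerate(retrieved_chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "What is our password policy?",
    ["Passwords must be rotated every 90 days.", "MFA is mandatory for admins."],
)
print(prompt)
```

The augmented prompt, not the bare question, is what gets sent to the LLM; that is the whole trick that grounds the answer in your documents.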
To prepare the vector store that RAG depends on, start by provisioning an SAP HANA Cloud instance in your BTP Trial environment. In the subsequent steps, you'll save the embedding vectors you generate into this database, creating a solid foundation for your agent's high-speed searches.
Follow the tutorial "Provision an Instance of SAP HANA Cloud, SAP HANA Database" to create your HANA Cloud instance.
When configuring your HANA Cloud instance, in addition to the tutorial steps, apply these settings:
Runtime Environment: change to Cloud Foundry
Additional Features: turn Natural Language Processing (NLP) ON
Connection: set Allowed Connections to Allow all IP addresses
Once provisioning completes in HANA Cloud Central, the instance will appear in your BTP Cockpit.
As with SAP AI Core, generate a service key to prepare for RAG connectivity.
4 | Interact with HANA Cloud via LangChain
In this section, you'll use the official LangChain connector to hook into your HANA Cloud instance and prepare to invoke its built-in embedding model. The complete flow is covered in the official docs; give them a quick read to familiarize yourself with the overall steps.
Add your service-key details to a .env file. In keeping with our "learn in seconds" approach, we'll reuse the existing DBADMIN user rather than creating a new one. (For production, you should provision a dedicated service account.)
host → HANA_DB_ADDRESS (e.g. XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXXX.hana.trial-us10.hanacloud.ondemand.com)
port → HANA_DB_PORT (e.g. 443)
user → HANA_DB_USER (DBADMIN)
password → HANA_DB_PASSWORD (the password you set when you created the instance)
With these environment variables in place, LangChainâs HANA Cloud connector will be able to authenticate and send your vectors to the database for RAG retrieval.
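A quick sanity check that all four variables are actually set can save a confusing connection error later. This optional helper is my own addition, not part of the tutorial's required code:

```python
# Optional sanity check: report which HANA_DB_* variables are missing or empty.
import os

REQUIRED_VARS = ["HANA_DB_ADDRESS", "HANA_DB_PORT", "HANA_DB_USER", "HANA_DB_PASSWORD"]

def missing_vars(env=None) -> list[str]:
    # Accepts any mapping so it is easy to test; defaults to the real environment
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_vars()
if missing:
    print("Missing .env entries:", ", ".join(missing))
else:
    print("All HANA connection variables are set.")
```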
Install the necessary libraries by running the following commands in your terminal:
pip install langchain-hana hdbcli --quiet
langchain-hana is the library that lets you access HANA Cloud's vector store and built-in embedding models.
hdbcli is the SAP HANA Database Client for Python.
HANA Cloud has offered an internal embedding model since Q4 2024. If you enabled Natural Language Processing (NLP) under Additional Features when you provisioned your instance, you can call it directly.
Add the following cell to your notebook:
# Notebook Cell 6
from langchain_hana import HanaInternalEmbeddings

embeddings = HanaInternalEmbeddings(
    internal_embedding_model_id="SAP_NEB.20240715"
)
This sets up the embeddings object to use the SAP_NEB.20240715 internal model for generating vectors.
Finally, initialize the vector store (i.e., create the table). Name the table "TEST_TABLE" and add the following cell:
# Notebook Cell 7
from langchain_hana import HanaDB
from hdbcli import dbapi
from dotenv import load_dotenv
import os

load_dotenv()  # read the HANA_DB_* entries from your .env file (pip install python-dotenv if needed)

connection = dbapi.connect(
    address=os.getenv("HANA_DB_ADDRESS"),
    port=int(os.getenv("HANA_DB_PORT")),  # the port must be an integer
    user=os.getenv("HANA_DB_USER"),
    password=os.getenv("HANA_DB_PASSWORD"),
    sslValidateCertificate=False,
    autocommit=True,
)
db = HanaDB(
    embedding=embeddings,
    connection=connection,
    table_name="TEST_TABLE",
)
print("HANA Vector Store is ready!")
With this, you've set up LangChain to connect to HANA Cloud, configured the internal embedding model, and initialized your vector store. In the next step, you'll chunk your text files, generate embeddings, store them in TEST_TABLE, and build the similarity search needed for RAG.
5 | Retrieve Relevant Documents via Similarity Search
In this chapter, we'll build the Retrieve part by implementing a similarity search that returns two documents relevant to a query.
First, load the text file that will feed the similarity search, and split it into chunks of about 100 characters, a handy size for embedding. Download the file sap_en_corpus.txt (the file name referenced in the code below) from the link and place it in the same folder as your notebook: https://drive.google.com/drive/folders/1wYLPcKrsgUs6P7JqDgHS30qoGE-gt3Av?usp=sharing
Then add the following cell:
# Notebook Cell 8
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Load the entire text file as a single document
text_documents = TextLoader("sap_en_corpus.txt").load()

# Split into 100-character chunks, with no overlap
text_splitter = CharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=0,
)

# Perform the split and inspect the number of chunks
text_chunks = text_splitter.split_documents(text_documents)
print(f"Number of document chunks: {len(text_chunks)}")
What this cell does:
TextLoader: reads the text file as one document.
CharacterTextSplitter: breaks the document into 100-character chunks.
The resulting list of Document objects is stored in text_chunks.
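If you want an intuition for what fixed-size chunking does, here is a deliberately simplified stdlib sketch. It is an assumption-free toy, not the real CharacterTextSplitter, which first splits on separators such as blank lines before enforcing the size limit:

```python
# Simplified illustration of fixed-size character chunking with no overlap.
# The real CharacterTextSplitter is smarter (separator-aware); this sketch
# just slices the text every `size` characters.

def chunk_text(text: str, size: int = 100) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

sample = "A" * 250
chunks = chunk_text(sample, size=100)
print(len(chunks))               # 3
print([len(c) for c in chunks])  # [100, 100, 50]
```

Small chunks keep each embedding focused on one topic; the trade-off is that context spanning a chunk boundary is lost, which is why production setups often add some chunk_overlap.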
Next, store your text chunks in the HANA Cloud vector store by adding this cell:
# Notebook Cell 9
# Clear any existing rows to reset the demo table
db.delete(filter={})
# Embed each text chunk and write them into TEST_TABLE
db.add_documents(text_chunks)
What this cell does:
db.delete(filter={}): deletes all existing rows to initialize the demo table.
db.add_documents(text_chunks): embeds each text chunk and writes it into TEST_TABLE.
Now letâs implement similarity search. Add the following cell:
# Notebook Cell 10
# Define your query
query = "I want to know about the security"

# Retrieve the top 2 most similar chunks
docs = db.similarity_search(query, k=2)

# Print out the content of each retrieved chunk
for doc in docs:
    print("-" * 80)
    print(doc.page_content)
Here, the input query is vectorized using the same embedding model, and HANA Cloud returns the two text chunks with the highest similarity.
If the search results include related content, such as "security policies" or "access control", then you've succeeded!
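Conceptually, the database embeds the query with the same model and ranks the stored vectors by similarity. Here is a toy stdlib version of that ranking; the three documents and their three-dimensional "embeddings" are invented for illustration, whereas HANA Cloud does the equivalent in-database over real embedding vectors:

```python
# Toy top-k vector similarity search using cosine similarity.
# The documents and vectors below are made up for illustration only.
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product divided by the product of magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

store = {
    "Security policies require MFA.": [0.9, 0.1, 0.0],
    "Lunch menu for Friday.":         [0.0, 0.2, 0.9],
    "Access control is role-based.":  [0.8, 0.3, 0.1],
}

query_vec = [1.0, 0.2, 0.0]  # pretend embedding of the security query
top2 = sorted(store, key=lambda doc: cosine(query_vec, store[doc]), reverse=True)[:2]
print(top2)
```

The two security-related documents win because their vectors point in nearly the same direction as the query vector, which is exactly the effect you should see in your TEST_TABLE results.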
6 | Challenge: Compare Accuracy with Your Custom Embedding Model
In Part 2 you deployed an OpenAI embedding model. Now let's use that model to compare results on the same data with the same query.
# Notebook Cell 11
from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings
from langchain_hana import HanaDB
from hdbcli import dbapi
import os

# Swap out the embedding
openai_embed = OpenAIEmbeddings(
    deployment_id=os.getenv("EMBED_DEPLOYMENT_ID")  # The ID you noted in Part 2
)

# Use a separate table for easy comparison
db_alt = HanaDB(
    embedding=openai_embed,
    connection=connection,
    table_name="TEST_TABLE_OAI",
)

# Initialize the alternate table
db_alt.delete(filter={})
db_alt.add_documents(text_chunks)

# Run the same query
alt_docs = db_alt.similarity_search(query, k=2)
print("\n=== Results with OpenAI Embeddings ===")
for doc in alt_docs:
    print("-" * 80)
    print(doc.page_content)
Points of Comparison
How do the top-hit chunks differ between the HANA built-in model and the OpenAI model?
Try both short and long queries (and other variations) to see where each model excels.
7 | Next Up
Part 5 RAG Basics ②: Building Retriever Tool
In Part 5, we'll integrate the HANA Cloud vector store you've just created into a LangChain agent as a RAG tool. You'll see how to combine it with a Google Search tool to run a "two-pronged" agent. Stay tuned!
Disclaimer
All the views and opinions in this blog are my own, expressed in my personal capacity; SAP shall not be responsible or liable for any of the contents published in this blog.
#SAP
#SAPTechnologyblog