Generative AI has leapt from research papers to daily business reality, and SAP is surfing that wave at full speed. In this hands-on series, I'll show you how to spin up a custom AI agent on SAP AI Core in minutes, then grow it into a production-ready asset without drowning in theory.
Notice
The Japanese version is available here.
What You'll Learn in This Series
- How to spin up a custom AI agent on SAP AI Core in minutes
- Hands-on with LangChain, Google Search Tool, RAG, and Streamlit
- Exposing the agent as a REST API and rebuilding the UI in SAPUI5/Fiori
Time Commitment
Each part is designed to be completed in 10-15 minutes.
Series Roadmap
- Part 0 Prologue
- Part 1 Env Setup: SAP AI Core & AI Launchpad
- Part 2 Building a Chat Model with LangChain
- Part 3 Agent Tools: Integrating Google Search
- Part 4 RAG Basics ①: HANA Cloud Vector Engine & Embedding [current blog]
- Part 5 RAG Basics ②: Building Retriever Tool
- Part 6 Streamlit UI Prototype
- Part 7 Expose as a REST API
- Part 8 Rebuild the UI with SAPUI5
Note
Subsequent blogs will be published soon.
If you enjoyed this post, please give it a kudos! Your support really motivates me. Also, if there's anything you'd like to know more about, feel free to leave a comment!
RAG Basics ①: HANA Cloud Vector Engine & Embedding
1 | Introduction
General chat models often miss the mark because they don't fully grasp a company's specific terms or workflows. In the last chapter, we gave the agent a Google search tool so it could pull in the latest information from the web, but that alone can't handle internal documents or your own knowledge base.
In this chapter, we'll preload your internal docs into the SAP HANA Cloud Vector Engine and set up a similarity search, the core of RAG (Retrieval-Augmented Generation), so the agent can look up and reference that information when crafting its answers.
2 | Prerequisites
- BTP sub-account
- SAP AI Core instance
- SAP AI Launchpad subscription
- Python 3.13 and pip
- VS Code, BAS, or any IDE
Note for the Trial Environment
The HANA Cloud instance in the Trial environment automatically shuts down every night. If your work spans past midnight, restart the instance the following day.
3 | Create a HANA Cloud Instance
First, let's review RAG (Retrieval-Augmented Generation). RAG follows a two-step workflow of "Retrieve → Generate":
- Retrieve: Use vector similarity to fetch documents that closely match the query.
- Generate: Append those retrieved documents to the LLM's prompt and produce an answer.
By supplying reliable, in-house information during the retrieval phase, you can curb LLM hallucinations while still addressing your organization's unique knowledge, which is one of RAG's biggest benefits.
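To make the Generate step concrete, here is a minimal sketch of how retrieved chunks might be appended to a prompt. The function name build_rag_prompt, the prompt wording, and the sample chunks are all illustrative assumptions, not part of this series' code or of any library.

```python
def build_rag_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Assemble a prompt that asks the LLM to answer from the given context."""
    # Number each retrieved chunk so the answer can refer back to it.
    context = "\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(retrieved_docs, start=1)
    )
    return (
        "Answer the question using only the context below.\n"
        "If the context is not sufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical chunks standing in for the output of the Retrieve step.
docs = [
    "Access to production systems requires multi-factor authentication.",
    "Security incidents must be reported within 24 hours.",
]
prompt = build_rag_prompt("How do I report a security incident?", docs)
print(prompt)
```

The string returned here is what you would send to the chat model. Because the model is told to stay within the supplied context, the quality of the retrieved chunks largely decides the quality of the answer.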
To prepare the vector store that RAG depends on, start by provisioning an SAP HANA Cloud instance in your BTP Trial environment. In the subsequent steps, you'll save the embedding vectors you generate into this database, creating a solid foundation for your agent's high-speed searches.
Follow the tutorial "Provision an Instance of SAP HANA Cloud, SAP HANA Database" to create your HANA Cloud instance.
When configuring your HANA Cloud instance, in addition to the tutorial steps, apply these settings:
- Runtime Environment: change to Cloud Foundry
- Additional Features: turn Natural Language Processing (NLP) ON
- Connection: set Allowed Connections to Allow all IP addresses
Once provisioning completes in HANA Cloud Central, the instance will appear in your BTP Cockpit.
As with SAP AI Core, generate a service key to prepare for RAG connectivity.
4 | Interact with HANA Cloud via LangChain
In this section, you'll use the official LangChain connector to hook into your HANA Cloud instance and prepare to invoke its built-in embedding model. The complete flow is covered in the official docs; give them a quick read to familiarize yourself with the overall steps.
Add your service-key details to a .env file. In keeping with our "learn in seconds" approach, we'll reuse the existing DBADMIN user rather than creating a new one. (For production, you should provision a dedicated service account.)
| Service-Key Field | .env Variable | Example / Value |
| host | HANA_DB_ADDRESS | e.g. XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXXX.hana.trial-us10.hanacloud.ondemand.com |
| port | HANA_DB_PORT | e.g. 443 |
| user | HANA_DB_USER | DBADMIN |
| password | HANA_DB_PASSWORD | (the password you set when you created the instance) |
With these environment variables in place, LangChain's HANA Cloud connector will be able to authenticate and send your vectors to the database for RAG retrieval.
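Before connecting, it can help to confirm that all four settings are actually set, since a misspelled variable name otherwise only surfaces as a connection error. The sketch below uses only the standard library. The helper name missing_hana_settings is hypothetical, and it assumes your .env file has already been loaded into the process environment.

```python
import os

# Variable names used by the connection code in this post.
REQUIRED_VARS = ("HANA_DB_ADDRESS", "HANA_DB_PORT", "HANA_DB_USER", "HANA_DB_PASSWORD")


def missing_hana_settings(env: dict[str, str]) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Check the real process environment (after your .env file has been loaded).
missing = missing_hana_settings(dict(os.environ))
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All HANA connection settings are present.")
```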
Install the necessary libraries by running the following command in your terminal:
pip install langchain-hana hdbcli --quiet
- langchain-hana is the library that lets you access HANA Cloud's vector store and built-in embedding models.
- hdbcli is the SAP HANA Database Client for Python.
HANA Cloud has offered an internal embedding model since Q4 2024. If you enabled Natural Language Processing (NLP) under Additional Features when you provisioned your instance, you can call it directly.
Add the following cell to your notebook:
# ▶ Notebook Cell 6
from langchain_hana import HanaInternalEmbeddings

embeddings = HanaInternalEmbeddings(
    internal_embedding_model_id="SAP_NEB.20240715"
)
This sets up the embeddings object to use the SAP_NEB.20240715 internal model for generating vectors.
Finally, initialize the vector store (i.e., create the table). Name the table "TEST_TABLE" and add the following cell:
# ▶ Notebook Cell 7
from langchain_hana import HanaDB
from hdbcli import dbapi
import os

# Connect using the HANA_DB_* environment variables
connection = dbapi.connect(
    address=os.getenv("HANA_DB_ADDRESS"),
    port=os.getenv("HANA_DB_PORT"),
    user=os.getenv("HANA_DB_USER"),
    password=os.getenv("HANA_DB_PASSWORD"),
    sslValidateCertificate=False,  # skips TLS certificate validation
    autocommit=True,
)

# Bind the vector store to TEST_TABLE and the internal embedding model
db = HanaDB(
    embedding=embeddings,
    connection=connection,
    table_name="TEST_TABLE"
)
print("HANA Vector Store is ready!")
With this, you've set up LangChain to connect to HANA Cloud, configured the internal embedding model, and initialized your vector store. In the next step, you'll chunk your text files, generate embeddings, store them in TEST_TABLE, and build the similarity search needed for RAG.
5 | Retrieve Relevant Documents via Similarity Search
In this chapter, we'll build the Retrieve part by implementing a similarity search that returns two documents relevant to a query.
First, load the text file that will feed the similarity search, and split it into text chunks of about 100 characters, a small chunk size that suits this demo. Download the file sap_en_corpus.txt from the link below and place it in the same folder as your notebook:
https://drive.google.com/drive/folders/1wYLPcKrsgUs6P7JqDgHS30qoGE-gt3Av?usp=sharing
Then add the following cell:
# ▶ Notebook Cell 8
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Load the entire text file as a single document
text_documents = TextLoader("sap_en_corpus.txt").load()

# Split into chunks of about 100 characters, with no overlap
text_splitter = CharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=0,
)

# Perform the split and inspect the number of chunks
text_chunks = text_splitter.split_documents(text_documents)
print(f"Number of document chunks: {len(text_chunks)}")
What this cell does:
- TextLoader: Reads the text file as one document.
- CharacterTextSplitter: Breaks the document into chunks of about 100 characters. By default it splits on blank lines ("\n\n") and merges the pieces up to chunk_size, so a single paragraph longer than the limit stays in one chunk.
- The resulting list of Document objects is stored in text_chunks.
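To see why the chunks come out at "about" 100 characters rather than exactly 100, here is a deliberately simplified, pure-Python sketch of separator-based chunking. It is not LangChain's implementation. The function split_by_separator is a hypothetical stand-in that only mimics the general idea of splitting on a separator and greedily merging pieces up to a size limit.

```python
def split_by_separator(text: str, chunk_size: int, separator: str = "\n\n") -> list[str]:
    """Greedy, simplified chunker: split on separator, then merge pieces up to chunk_size."""
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        piece = piece.strip()
        if not piece:
            continue
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate          # piece still fits in the current chunk
        else:
            if current:
                chunks.append(current)   # close the current chunk
            current = piece              # an oversized piece is kept whole
    if current:
        chunks.append(current)
    return chunks


# Two short paragraphs plus one paragraph that exceeds the limit on its own.
sample = "Short note.\n\nAnother short note.\n\n" + "x" * 150
for chunk in split_by_separator(sample, chunk_size=100):
    print(len(chunk), repr(chunk[:30]))
```

In this sketch the two short paragraphs are merged into one chunk, while the 150-character paragraph is kept whole even though it exceeds the limit. If your corpus has long paragraphs without blank lines, expect similar oversized chunks and consider a different separator.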
Next, store your text chunks in the HANA Cloud vector store by adding this cell:
# ▶ Notebook Cell 9
# Clear any existing rows to reset the demo table
db.delete(filter={})

# Embed each text chunk and write it into TEST_TABLE
db.add_documents(text_chunks)
What this cell does:
- db.delete(filter={}): Deletes all existing rows to initialize the demo table.
- db.add_documents(text_chunks): Embeds each text chunk and writes it into "TEST_TABLE".
Now let's implement similarity search. Add the following cell:
# ▶ Notebook Cell 10
# Define your query
query = "I want to know about the security"

# Retrieve the top 2 most similar chunks
docs = db.similarity_search(query, k=2)

# Print out the content of each retrieved chunk
for doc in docs:
    print("-" * 80)
    print(doc.page_content)
Here, the input query is vectorized using the same embedding model, and HANA Cloud returns the two text chunks with the highest similarity.
If the search results include related content, such as "security policies" or "access control", then you've succeeded!
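Conceptually, a top-k similarity search scores every stored vector against the query vector and keeps the best k. The following sketch illustrates that idea in plain Python, assuming cosine similarity as the measure. The vectors are made up for illustration, and the helper names cosine_similarity and top_k are hypothetical. In the notebook the database does this work for you.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=2):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in stored]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up 3-dimensional vectors; real embeddings have hundreds of dimensions.
stored_chunks = [
    ("Access control policy", [0.9, 0.1, 0.0]),
    ("Quarterly sales report", [0.0, 0.2, 0.9]),
    ("Security incident process", [0.8, 0.3, 0.1]),
]
query_vector = [1.0, 0.2, 0.0]

for text, score in top_k(query_vector, stored_chunks, k=2):
    print(f"{score:.3f}  {text}")
```

The two security-related entries point in nearly the same direction as the query vector, so they rank ahead of the unrelated sales entry. This is also why the query must be embedded with the same model as the stored chunks: vectors from different models do not share a coordinate space.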
6 | Challenge: Compare Accuracy with Your Custom Embedding Model
In Part 2 you deployed an OpenAI embedding model. Now let's use that model to compare results on the same data and with the same query.
# ▶ Notebook Cell 11
from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings
from langchain_hana import HanaDB
from hdbcli import dbapi
import os

# Swap out the embedding
openai_embed = OpenAIEmbeddings(
    deployment_id=os.getenv("EMBED_DEPLOYMENT_ID")  # The ID you noted in Part 2
)

# Use a separate table for easy comparison
db_alt = HanaDB(
    embedding=openai_embed,
    connection=connection,
    table_name="TEST_TABLE_OAI"
)

# Initialize the alternate table
db_alt.delete(filter={})
db_alt.add_documents(text_chunks)

# Run the same query
alt_docs = db_alt.similarity_search(query, k=2)
print("\n=== Results with OpenAI Embeddings ===")
for doc in alt_docs:
    print("-" * 80)
    print(doc.page_content)
Points of Comparison
- How do the top-hit chunks differ between the HANA built-in model and the OpenAI model?
- Try both short and long queries (and other variations) to see where each model excels.
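If you want something more systematic than eyeballing two printouts, one option is to group the hits by which model returned them. The sketch below is only a suggestion. The helper compare_hits and the placeholder strings are hypothetical, and the sample data is not real output from either model.

```python
def compare_hits(hits_a: list[str], hits_b: list[str]) -> dict[str, list[str]]:
    """Group retrieved chunk texts by which model returned them (order preserved)."""
    set_a, set_b = set(hits_a), set(hits_b)
    return {
        "both": [text for text in hits_a if text in set_b],
        "only_a": [text for text in hits_a if text not in set_b],
        "only_b": [text for text in hits_b if text not in set_a],
    }


# Placeholder strings; in the notebook you could pass
# [d.page_content for d in docs] and [d.page_content for d in alt_docs].
hana_hits = ["chunk about access control", "chunk about security policy"]
openai_hits = ["chunk about security policy", "chunk about encryption"]

for group, texts in compare_hits(hana_hits, openai_hits).items():
    print(group, texts)
```

Running this over several queries gives a quick feel for how often the two models agree, which is a rough signal only: agreement does not by itself tell you which model's hits are more relevant.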
7 | Next Up
Part 5 RAG Basics ②: Building Retriever Tool
In Part 5, we'll integrate the HANA Cloud vector store you've just created as a RAG tool into a LangChain Agent. You'll see how to combine it with a Google Search tool to run a "two-pronged" agent. Stay tuned!
Disclaimer
All the views and opinions in this blog are my own and are made in my personal capacity. SAP shall not be responsible or liable for any of the contents published in this blog.