Generative AI has leapt from research papers to daily business reality, and SAP is surfing that wave at full speed. In this hands-on series, I'll show you how to spin up a custom AI agent on SAP AI Core in minutes, then grow it into a production-ready asset without drowning in theory.
Notice
The Japanese version of this article is available here.
What You'll Learn in This Series
How to Run a Custom AI Agent on SAP AI Core in Seconds
Implementation Using LangChain, Google Search Tool, and RAG
Steps to Convert the AI Agent into a REST API, Integrate It into an SAPUI5/Fiori UI, and Deploy to Cloud Foundry
Time Commitment
Each part is designed to be completed in 10-15 minutes.
Series Roadmap
Part 0: Prologue
Part 1: Env Setup: SAP AI Core & AI Launchpad
Part 2: Building a Chat Model with LangChain
Part 3: Agent Tools: Integrating Google Search
Part 4: RAG Basics ①: HANA Cloud Vector Engine & Embedding
Part 5: RAG Basics ②: Building Retriever Tool
Part 6: Converting the AI Agent into a REST API [current blog]
Part 7: Building the Chat UI with SAPUI5
Part 8: Deploying to Cloud Foundry
Note
Subsequent blogs will be published soon.
If you enjoyed this post, please give it kudos! Your support really motivates me. Also, if there's anything you'd like to know more about, feel free to leave a comment!
Converting the AI Agent into a REST API
1 | Introduction
In this chapter, we will take the AI agent we have been developing in the notebook and expose it as a REST API, packaging it so that it can later be deployed on Cloud Foundry (CF). By turning our agent into an API, we gain the following advantages:
Reusability: Any client, whether a CLI tool, another application, or an SAPUI5 frontend, can invoke the agent via HTTP requests.
Scalability: Running the agent as a dedicated web server allows it to handle multiple simultaneous requests.
Maintainability: On CF, we can leverage built-in cloud features such as automatic scaling, centralized logging, and authentication controls.
In this chapter, we will use FastAPI to implement a minimal web API and expose our LangChain-based AI agent as an endpoint.
2 | Prerequisites
BTP sub-account
SAP AI Core instance
SAP AI Launchpad subscription
Python 3.13 and pip
VS Code, BAS, or any IDE
Note for the Trial Environment
The HANA Cloud instance in the Trial environment automatically shuts down every night. If your work spans past midnight, please restart the instance the following day.
3 | Folder Structure and Library Preparation
With an eye toward deploying on CF, create a new folder for this project. All of the code and configuration files described below will be managed inside this folder.
# Folder Structure
my-ai-agent-api/
├── main.py
├── requirements.txt
├── .env
└── (Additional deployment-related files to be added)
Because we plan to deploy on CF, we explicitly list the required Python packages in requirements.txt. When the application starts on CF, it will automatically install libraries according to requirements.txt. In this file, we include fastapi, uvicorn, and gunicorn, all of which are essential for running a web server. Below is an example of how to enumerate these libraries in requirements.txt.
# Dependencies Specified in the Official generative-ai-hub-sdk Documentation
generative-ai-hub-sdk
ai_core_sdk>=2.5.7
pydantic==2.9.2
openai>=1.56.0
langchain~=0.3.0
langgraph==0.3.30
langchain-community~=0.3.0
langchain-openai>=0.2.14
langchain-google-vertexai==2.0.1
langchain-google-community==2.0.7
langchain-aws==0.2.9
google-cloud-aiplatform==1.61.0
boto3==1.35.76
# For SAP HANA Cloud VectorSearch
hdbcli==2.24.24
langchain-hana==0.1.0
# To Load the .env File
python-dotenv==1.1.0
# FastAPI / ASGI Server
fastapi==0.109.0
gunicorn
uvicorn[standard]==0.27.0
To verify everything works locally, let's create a new virtual environment (you can reuse the one from Part 5 if you prefer, but here we'll create a fresh one):
cd my-ai-agent-api
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
Next, prepare the environment-variable file (.env):
# SAP AI Core credentials
AICORE_CLIENT_ID="<YOUR_AICORE_CLIENT_ID>"
AICORE_CLIENT_SECRET="<YOUR_AICORE_CLIENT_SECRET>"
AICORE_AUTH_URL="https://<your-region>.authentication.<your-region>.hana.ondemand.com"
AICORE_BASE_URL="https://api.ai.prod.<your-region>.aws.ml.hana.ondemand.com"
AICORE_RESOURCE_GROUP="<YOUR_RESOURCE_GROUP>"
DEVELOPMENT_ID="<YOUR_OPENAI_DEPLOYMENT_ID>"
# HANA Cloud connection information
HANA_DB_ADDRESS="<YOUR_HANA_DB_HOST>.hana.trial-<region>.hanacloud.ondemand.com"
HANA_DB_PORT=<YOUR_HANA_DB_PORT> # e.g., 443
HANA_DB_USER="<YOUR_HANA_DB_USER>"
HANA_DB_PASSWORD="<YOUR_HANA_DB_PASSWORD>"
# Google Custom Search API
GOOGLE_CSE_ID="<YOUR_GOOGLE_CSE_ID>"
GOOGLE_API_KEY="<YOUR_GOOGLE_API_KEY>"
With the required packages installed and the .env file in place, you're all set to begin API development!
4 | Create main.py
From here, we will create the main.py file and implement an endpoint using FastAPI to invoke our AI agent. We'll largely reuse the notebook cells from Part 5, but we'll make some adjustments so that we can see, in the chat interface, how the AI agent arrives at its thoughts and what actions it takes. Specifically, we'll structure the code as follows:
Load environment variables (from .env)
Initialize the FastAPI application
Create the LangChain/Google Search and HANA Cloud retriever tools
Initialize the AI agent (using AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION)
Define a function that extracts Thought/Action/Observation entries from the agent's log
Transform the AI agent's raw response into our structured format
Implement the /agent/chat endpoint

import os
import re
import uvicorn
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from dotenv import load_dotenv

# LangChain-related imports
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from langchain.tools import Tool
from langchain.tools.retriever import create_retriever_tool
from langchain_google_community import GoogleSearchAPIWrapper
from langchain.agents import initialize_agent, AgentType

# HANA-related imports
from langchain_hana import HanaInternalEmbeddings
from langchain_hana import HanaDB
from hdbcli import dbapi

# Load environment variables
load_dotenv(verbose=True)

# Initialize the FastAPI application
app = FastAPI()

# CORS settings
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Request model
class QueryRequest(BaseModel):
    query: str

# Initialize ChatOpenAI
chat_llm = ChatOpenAI(deployment_id=os.getenv("DEVELOPMENT_ID"))

# Set up Google search tool
search = GoogleSearchAPIWrapper(k=5)
google_tool = Tool.from_function(
    name="google_search",
    description="Search Google and return the first results",
    func=search.run
)

# Initialize embeddings
embeddings = HanaInternalEmbeddings(
    internal_embedding_model_id="SAP_NEB.20240715"
)

# Connect to HANA Cloud
connection = dbapi.connect(
    address=os.getenv("HANA_DB_ADDRESS"),
    port=int(os.getenv("HANA_DB_PORT", "443")),  # hdbcli expects an integer port
    user=os.getenv("HANA_DB_USER"),
    password=os.getenv("HANA_DB_PASSWORD"),
    sslValidateCertificate=False,
    autocommit=True,
)

# Initialize the database
db = HanaDB(
    embedding=embeddings,
    connection=connection,
    table_name="TEST_TABLE_EN"
)

# Set up retriever tool
retriever_tool = create_retriever_tool(
    retriever=db.as_retriever(),
    name="hana_vectorengine",
    description=(
        "Use this tool to search internal SAP documents stored "
        "in HANA Cloud Vector Engine when the user asks company-specific questions."
    ),
)

# Initialize the agent
agent = initialize_agent(
    tools=[google_tool, retriever_tool],
    llm=chat_llm,
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    return_intermediate_steps=True,
)

def extract_thought_from_log(log):
    if not log or not isinstance(log, str):
        return ""
    # Use regex to extract the "Thought" portion
    patterns = [
        r"Thought:\s*(.*?)(?:\nAction:|$)",
        r"^(.*?)(?:\nAction:|$)"
    ]
    # Try multiple patterns
    for pattern in patterns:
        match = re.search(pattern, log, re.DOTALL | re.IGNORECASE)
        if match:
            thought = match.group(1).strip()
            # Skip if it's empty or only whitespace
            if thought and not thought.isspace():
                return thought
    # If no match, return the original log trimmed
    return log.strip()

def transform_response(raw_response):
    if not isinstance(raw_response, dict):
        return {"output": "", "intermediate_steps": []}
    output = raw_response.get("output", "")
    intermediate_steps = raw_response.get("intermediate_steps", [])
    if not isinstance(intermediate_steps, list):
        intermediate_steps = []
    structured_steps = []
    for idx, step in enumerate(intermediate_steps, 1):
        try:
            # If the step is already in dictionary format (already structured)
            if isinstance(step, dict):
                structured_steps.append(step)
                continue
            # If the step is a tuple
            if isinstance(step, tuple) and len(step) == 2:
                action, observation = step
                # Extract Thought from the log
                thought = ""
                if hasattr(action, "log") and action.log:
                    thought = extract_thought_from_log(action.log)
                # Create a structured step
                structured_step = {
                    "step_no": idx,
                    "thought": thought,
                    "action": getattr(action, "tool", "unknown"),
                    "action_input": getattr(action, "tool_input", {}),
                    "observation": observation if observation is not None else ""
                }
                structured_steps.append(structured_step)
        except Exception:
            # Skip if an error occurs
            continue
    return {
        "output": output,
        "intermediate_steps": structured_steps
    }

@app.post("/agent/chat")
async def chat(request: QueryRequest):
    """Run the agent on the given query and return a structured response."""
    # Run the agent
    raw_response = agent.invoke({"input": request.query})
    # Transform the response
    structured_response = transform_response(raw_response)
    return structured_response

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
What this code does:
AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION: By choosing this agent type, the AI agent will output its internal reasoning steps in the form of "Thought → Action → Observation." (Note that you must set return_intermediate_steps=True; otherwise, these intermediate steps will not be included in the response.)
extract_thought_from_log(): A custom regular-expression function that parses the agent's log output (in English or Japanese) to extract the "Thought" portion.
transform_response(): Converts each (action, observation) tuple into a dictionary of the form { step_no, thought, action, action_input, observation }.
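To sanity-check the Thought extraction without running the full agent, you can exercise the same kind of regex logic on a made-up ReAct log. The sketch below inlines the parsing function so it runs standalone; the sample log string is purely illustrative:

```python
import re

def extract_thought_from_log(log):
    # Pull the "Thought" portion out of a ReAct-style log line.
    if not log or not isinstance(log, str):
        return ""
    patterns = [
        r"Thought:\s*(.*?)(?:\nAction:|$)",
        r"^(.*?)(?:\nAction:|$)",
    ]
    for pattern in patterns:
        match = re.search(pattern, log, re.DOTALL | re.IGNORECASE)
        if match:
            thought = match.group(1).strip()
            if thought and not thought.isspace():
                return thought
    return log.strip()

# Hypothetical log as the agent might emit it
sample_log = (
    "Thought: I should search the web for this.\n"
    "Action: google_search\n"
    "Action Input: SAP AI Core"
)
print(extract_thought_from_log(sample_log))
# -> I should search the web for this.
```

Running this confirms that only the text between "Thought:" and the next "Action:" line is returned, which is exactly what ends up in the `thought` field of each structured step.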
Note on AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION
In this series, we use AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION so that the agentâs thought process is visible in the chat UI. However, with this agent type, you may occasionally encounter an error when asking questions that donât require any tools. We are currently investigating better workarounds, so please be aware of this limitation for now.
5 | Verify the Operation
Now let's run the Python file we created and confirm that the API behaves as expected. First, start the FastAPI application locally and, while viewing the API documentation page, send a request to the endpoint. Then, check that the returned JSON follows the structure with "output" and "intermediate_steps".
With your virtual environment activated, run the following command in the terminal to launch the FastAPI application as a local server:
gunicorn -w 1 -k uvicorn.workers.UvicornWorker main:app --bind 0.0.0.0:${PORT:-8000}
Once startup is complete, you'll see a log entry saying "Application startup complete." If you're running this inside Business Application Studio (BAS), you'll see a screen like this:
Click "Open in a New Tab" in the lower-right corner of the popup, then append /docs to the URL. This brings up the Swagger UI automatically generated by FastAPI, where you can interactively explore the requests the /agent/chat endpoint accepts.
Open the "POST /agent/chat" accordion and click "Try it out." In the Request Body, enter JSON of the form { "query": "any question you like" }, then click "Execute" and the API will respond.
At the bottom of the screen, you should see an HTTP status code of 200. If the Response Body includes both output and intermediate_steps in its JSON structure, then your test has succeeded!
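Besides Swagger UI, you can also call the endpoint from plain Python. Below is a minimal smoke-test sketch using only the standard library; it assumes the server from main.py is running on localhost:8000 (the query text is arbitrary, and the actual network call is left commented out so the snippet is safe to paste anywhere):

```python
import json
import urllib.request

# Build the request the same way the Swagger UI does
payload = json.dumps({"query": "What is SAP AI Core?"}).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:8000/agent/chat",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment once the server is running:
# with urllib.request.urlopen(req) as resp:
#     body = json.loads(resp.read())
#     print(body["output"])
#     for step in body["intermediate_steps"]:
#         print(step["step_no"], step["action"], step["thought"])
```

If the response JSON contains both `output` and `intermediate_steps`, the API is behaving exactly as in the Swagger UI test.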
6 | Challenge: Add a Text-File Upload Feature
Let's add an endpoint that allows users to upload a text file and have its contents stored in HANA Cloud Vector Engine. By providing this functionality, users can send documents directly to the AI agent from the chat interface.
Below is a snippet you can insert into main.py. Place each part at the appropriate location: group the imports with your other imports, put the response model next to your other request/response schemas, and define the endpoint alongside /agent/chat (or simply keep all file-upload code together in its own block).
# Imports for file upload
import shutil
import tempfile
from fastapi import File, UploadFile, HTTPException
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Response model
class UploadResponse(BaseModel):
    message: str
    filename: str
    chunks_created: int

@app.post("/agent/upload")
async def upload_file(file: UploadFile = File(...)):
    """Upload a text file and store it as embeddings in HANA."""
    # Check file extension
    if not file.filename.endswith(".txt"):
        raise HTTPException(
            status_code=400,
            detail="Only files with a .txt extension can be uploaded."
        )
    try:
        # Save the uploaded file to a temporary file
        with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp_file:
            shutil.copyfileobj(file.file, tmp_file)
            tmp_file_path = tmp_file.name
        # Load the text file
        text_documents = TextLoader(tmp_file_path).load()
        # Split the text into appropriate chunks
        text_splitter = CharacterTextSplitter(
            chunk_size=50,
            chunk_overlap=0,
        )
        text_chunks = text_splitter.split_documents(text_documents)
        # Store the chunks into the HANA vector database
        db.add_documents(text_chunks)
        # Delete the temporary file
        os.unlink(tmp_file_path)
        return UploadResponse(
            message="File uploaded successfully, and embeddings have been stored.",
            filename=file.filename,
            chunks_created=len(text_chunks)
        )
    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"An error occurred while processing the file: {str(e)}"
        )
The key point is that, instead of handing the UploadFile object directly to TextLoader, we first copy it into a tempfile.NamedTemporaryFile to obtain a filesystem path that TextLoader can read. This ensures the file's contents are safely written to local disk, after which we split them into chunks and send them to HANA Cloud for embedding.
Once you've added this code to your main.py, follow the same steps as before to verify that everything works as expected.
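Swagger UI's file picker is the easiest way to exercise /agent/upload, but you can also build the multipart request by hand with the standard library. This is a hedged sketch: the file name and content are made up, and the network call stays commented out until your server is running:

```python
import json
import urllib.request
import uuid

# Hypothetical document to upload
content = "SAP AI Core lets you deploy AI workloads on BTP."

# Assemble a multipart/form-data body manually (field name "file"
# must match the parameter name in the upload_file endpoint)
boundary = uuid.uuid4().hex
body = (
    f"--{boundary}\r\n"
    f'Content-Disposition: form-data; name="file"; filename="sample_doc.txt"\r\n'
    "Content-Type: text/plain\r\n"
    "\r\n"
    f"{content}\r\n"
    f"--{boundary}--\r\n"
).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:8000/agent/upload",
    data=body,
    headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    method="POST",
)

# Uncomment once the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))  # message, filename, chunks_created
```

On success you should get back the UploadResponse JSON with `chunks_created` telling you how many chunks were embedded.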
After uploading, open the HANA Cloud Database Explorer and check the table named "TEST_TABLE_EN" (the table_name used in main.py). Verify that the uploaded text has been split into chunks and inserted as new records, and that a vector has been generated for each record.
If everything is reflected correctly, then the file upload feature is complete!
7 | Next Up
Part 7: Building the Chat UI with SAPUI5
In Part 7, we will assemble a chat UI based on SAPUI5, which will act as the frontend that calls the AI Agent API we have created so far. Stay tuned!
Disclaimer
All the views and opinions in this blog are my own, are expressed in my personal capacity, and SAP shall not be responsible or liable for any of the contents published in this blog.