Introduction
The implementation of Retrieval-Augmented Generation (RAG) is becoming more popular as a technique to provide accurate and reliable responses from Large Language Models (LLMs). This time, we examine PandasAI as a method to enable LLMs to handle more advanced tasks such as data analysis. PandasAI allows data search and visualization by generating Python code for data analysis in response to natural language queries. In this article, we explore how PandasAI can be integrated with Generative AI Hub, enabling developers to quickly create a chat application to interact with their business data using only Python.
Prerequisites
When using Generative AI Hub for the first time to access Language Models (LLMs), refer to the following document for a quick start:
https://pypi.org/project/generative-ai-hub-sdk/
from dotenv import load_dotenv
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
load_dotenv(override=True)
proxy_client = get_proxy_client(‘gen-ai-hub’)
model_name = ‘gpt-4o’
llm = ChatOpenAI(proxy_model_name=model_name, proxy_client=proxy_client)
In the following document, you can find the models available in Generative AI Hub. Please note that you must deploy the model you want to use, such as GPT-4o, as described in the document.
More details on how to deploy them are covered in steps 2 to 4 of the following tutorial:
https://developers.sap.com/tutorials/ai-core-generative-ai.html
Smart Excel Chat
Preparation (Back-end)
import os
import pandas as pd
from typing import List, Unio
class PandasAI:
def __init__(self):
self.files = []
self.filenames = []
def upload_excel(self, file: List[str]) -> str:
try:
for filepath in file:
df = pd.read_excel(filepath)
self.files.append(df)
self.filenames.append(os.path.basename(filepath))
return “\n”.join(self.filenames)
except Exception as e:
return f”An error occurred: {str(e)}”
The chat_with_excel method allows users to interact with uploaded data through messages while maintaining a chat history. The Agent class from PandasAI facilitates interaction with dataframes for multi-turn conversations. This method is designed to handle both image and text outputs. Since the Agent outputs dataframes and numbers (not just strings), they are cast to strings in lines 29 and 31.
from pandasai import Agent
class PandasAI:
def __init__(self):
self.files = []
self.filenames = []
def upload_excel(self, file: List[str]) -> str:
# Code omitted for brevity
def chat_with_excel(self, message: str, chat_history: List) -> List[Union[str, List]]:
agent = Agent(self.files, config={“llm”: llm})
prompt = f”’
User Message:
{message}
Chat History:
{str(chat_history)}
”’
bot_message = agent.chat(prompt)
if isinstance(bot_message, str) and os.path.exists(bot_message):
img = gr.Image(bot_message)
chat_history.append((message, None))
chat_history.append((None, img))
return [“”, chat_history]
elif isinstance(bot_message, pd.DataFrame):
bot_message = bot_message.to_string(index=False)
else:
bot_message = str(bot_message)
chat_history.append((message, bot_message))
return [“”, chat_history]
Preparation (Front-end)
import gradio as gr
with gr.Blocks() as excel_chatbot:
gr.Markdown(“## Smart Excel Chat”)
with gr.Row():
with gr.Column():
excel_file = gr.File(label=”Upload Excel Sheet”, file_count=”multiple”, file_types=[“.xlsx”], type=”filepath”)
upload_button = gr.Button(“Upload”)
upload_status = gr.Textbox(label=”Uploaded Files”, interactive=False)
with gr.Column():
chatbot = gr.Chatbot()
msg = gr.Textbox()
clear = gr.ClearButton([msg, chatbot])
pandasai = PandasAI()
def reset_file():
return None
upload_button.click(pandasai.upload_excel, inputs=excel_file, outputs=upload_status)
upload_button.click(reset_file, outputs=excel_file)
msg.submit(
pandasai.chat_with_excel,
inputs=[msg, chatbot],
outputs=[msg, chatbot]
)
excel_chatbot.launch()
It includes components for uploading Excel sheets, a chat view, and a text input box.
Performance Verification
I will demonstrate our prototype using the following sample data related to HR. We prepared two simple Excel sheets for the demonstration, even though it worked well with larger datasets as well.
The app can correctly answer user queries by considering the chat history.
It can plot bar chart,
and pie chart as well.
Limitations
When we ask a query about something the LLMs shouldn’t know, they hallucinate.
However, it recognizes when it lacks the necessary information. By restructuring the prompt to ensure that the LLM has access to the relevant information and adjusting actions based on the output, we assume that this issue can be addressed.
Conclusion
Integrating PandasAI with Generative AI Hub enhances data analysis by enabling seamless interaction with business data through natural language queries. This solution simplifies tasks like data search and visualization, making them more accessible and intuitive. It demonstrates the powerful potential of RAG to improve productivity and streamline workflows in business applications.
Introduction The implementation of Retrieval-Augmented Generation (RAG) is becoming more popular as a technique to provide accurate and reliable responses from Large Language Models (LLMs). This time, we examine PandasAI as a method to enable LLMs to handle more advanced tasks such as data analysis. PandasAI allows data search and visualization by generating Python code for data analysis in response to natural language queries. In this article, we explore how PandasAI can be integrated with Generative AI Hub, enabling developers to quickly create a chat application to interact with their business data using only Python.PrerequisitesWhen using Generative AI Hub for the first time to access Language Models (LLMs), refer to the following document for a quick start:https://pypi.org/project/generative-ai-hub-sdk/ from dotenv import load_dotenv
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
load_dotenv(override=True)
proxy_client = get_proxy_client(‘gen-ai-hub’)
model_name = ‘gpt-4o’
llm = ChatOpenAI(proxy_model_name=model_name, proxy_client=proxy_client) In the following document, you can find the models available in Generative AI Hub. Please note that you must deploy the model you want to use, such as GPT-4o, as described in the document.https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/create-deployment-for-generative-ai-model-in-sap-ai-coreMore details on how to deploy them are covered in steps 2 to 4 of the following tutorial:https://developers.sap.com/tutorials/ai-core-generative-ai.htmlSmart Excel ChatPreparation (Back-end)PandasAI class manages Excel file uploads and facilitates chat interactions based on their contents. Upon initialization, it creates two empty lists: files for storing DataFrames and filenames for the corresponding file names. The upload_excel method takes a list of file paths, reads each Excel file into a DataFrame, and appends them to files while also storing the file names in filenames. It returns a string of the uploaded file names for display in the interface. import os
import pandas as pd
from typing import List, Unio
class PandasAI:
def __init__(self):
self.files = []
self.filenames = []
def upload_excel(self, file: List[str]) -> str:
try:
for filepath in file:
df = pd.read_excel(filepath)
self.files.append(df)
self.filenames.append(os.path.basename(filepath))
return “\n”.join(self.filenames)
except Exception as e:
return f”An error occurred: {str(e)}” The chat_with_excel method allows users to interact with uploaded data through messages while maintaining a chat history. The Agent class from PandasAI facilitates interaction with dataframes for multi-turn conversations. This method is designed to handle both image and text outputs. Since the Agent outputs dataframes and numbers (not just strings), they are cast to strings in lines 29 and 31. from pandasai import Agent
class PandasAI:
def __init__(self):
self.files = []
self.filenames = []
def upload_excel(self, file: List[str]) -> str:
# Code omitted for brevity
def chat_with_excel(self, message: str, chat_history: List) -> List[Union[str, List]]:
agent = Agent(self.files, config={“llm”: llm})
prompt = f”’
User Message:
{message}
Chat History:
{str(chat_history)}
”’
bot_message = agent.chat(prompt)
if isinstance(bot_message, str) and os.path.exists(bot_message):
img = gr.Image(bot_message)
chat_history.append((message, None))
chat_history.append((None, img))
return [“”, chat_history]
elif isinstance(bot_message, pd.DataFrame):
bot_message = bot_message.to_string(index=False)
else:
bot_message = str(bot_message)
chat_history.append((message, bot_message))
return [“”, chat_history] Preparation (Front-end)We prepared the UI using Gradio, a Python library that allows developers to quickly create ML-based applications. import gradio as gr
with gr.Blocks() as excel_chatbot:
gr.Markdown(“## Smart Excel Chat”)
with gr.Row():
with gr.Column():
excel_file = gr.File(label=”Upload Excel Sheet”, file_count=”multiple”, file_types=[“.xlsx”], type=”filepath”)
upload_button = gr.Button(“Upload”)
upload_status = gr.Textbox(label=”Uploaded Files”, interactive=False)
with gr.Column():
chatbot = gr.Chatbot()
msg = gr.Textbox()
clear = gr.ClearButton([msg, chatbot])
pandasai = PandasAI()
def reset_file():
return None
upload_button.click(pandasai.upload_excel, inputs=excel_file, outputs=upload_status)
upload_button.click(reset_file, outputs=excel_file)
msg.submit(
pandasai.chat_with_excel,
inputs=[msg, chatbot],
outputs=[msg, chatbot]
)
excel_chatbot.launch() It includes components for uploading Excel sheets, a chat view, and a text input box.Performance VerificationI will demonstrate our prototype using the following sample data related to HR. We prepared two simple Excel sheets for the demonstration, even though it worked well with larger datasets as well.The app can correctly answer user queries by considering the chat history.It can plot bar chart,and pie chart as well.LimitationsWhen we ask a query about something the LLMs shouldn’t know, they hallucinate.However, it recognizes when it lacks the necessary information. By restructuring the prompt to ensure that the LLM has access to the relevant information and adjusting actions based on the output, we assume that this issue can be addressed.ConclusionIntegrating PandasAI with Generative AI Hub enhances data analysis by enabling seamless interaction with business data through natural language queries. This solution simplifies tasks like data search and visualization, making them more accessible and intuitive. It demonstrates the powerful potential of RAG to improve productivity and streamline workflows in business applications. Read More Technology Blogs by SAP articles
#SAP
#SAPTechnologyblog