Consuming data from SAP Datasphere in SAP BTP CAP Applications


Premise:

Often, an application developed on SAP Business Technology Platform (BTP) requires data from various systems for different purposes. SAP Datasphere, which unifies and integrates SAP and non-SAP data from various sources, is an excellent resource for such business data. In a recent project with one of the leading banks in Europe, we observed that machine learning experts preferred working directly with the underlying data using Python for building and training models. However, since the SAP Cloud Application Programming Model (CAP) does not natively support Python, it is important to consider alternative approaches that integrate with Python-based workflows and thus bridge the gap.

In the context of an enterprise, imagine a large business with stringent internal and external requirements for data quality, ensuring the reliability of any data that is processed, particularly documents related to financial accounting, which are highly critical. Consider the table in S/4HANA that combines multiple financial journals into one consolidated table and holds all the financial data: its high volume and complexity can give rise to data quality issues. These issues can be costly when discovered at later stages of accounting-related processes, posing significant challenges to the enterprise. Given the high complexity and variety of accounting scenarios, along with significant data volume across multiple business divisions, user-driven data quality control measures must be complemented by machine-enabled solutions. Machine learning algorithms can play a crucial role in supporting the responsible organizational units by detecting potential data quality issues early and assisting in their resolution. This proactive approach helps maintain data quality throughout the entire accounting process.

The accounting ledger line-item data from the S/4HANA application can be sourced into SAP Datasphere to perform anomaly detection using machine learning, ensuring no additional load on the operational S/4 system. The anomaly detection algorithm identifies potential anomalies across document line items, which are then presented to end users, such as controllers, through SAP Analytics Cloud. To assist in investigating detected anomalies, business users like controllers can be provided with the reasoning behind why the machine learning algorithm flagged the relevant items as unusual.

Such an application can be easily realised by following the detailed steps outlined in the blog, Hands-on Tutorial: Machine Learning with SAP Datasphere. The blog also includes a hands-on code repository providing snippets to get you started.

Taking this concept further, an application built on SAP BTP can interact seamlessly with the anomaly detection outcomes in natural language. Additionally, such an application can integrate with the S/4HANA system and automate the creation of corrective actions, ensuring a complete and streamlined workflow.

Solution:

The application on BTP is built following the steps detailed in the reference architecture, Retrieval Augmented Generation and Generative AI on SAP BTP, and the associated mission, GenAI Mail Insights: Develop a CAP application using GenAI and Retrieval Augmented Generation (RAG). The mission includes the associated code on GitHub, btp-cap-genai-rag.

Although the provided code is not an exact match for this scenario, it offers all the necessary building blocks and can easily be adjusted to your needs, in this case to build an application that interacts with the anomaly detection outcomes. One key difference in our scenario is the need to reference and integrate the data source (tables) in SAP Datasphere from within the SAP Cloud Application Programming Model (CAP) framework. The following two options are available for this integration.

SAP HANA Cross Container Access:

SAP CAP, being an opinionated framework, provides clear guidance on how to use databases with CAP applications. SAP HANA Cloud is the recommended and standard database for productive use. Deploying to SAP HANA from SAP CAP is done using the SAP HANA Deployment Infrastructure (HDI), which uses containers to store design-time artifacts and the corresponding deployed run-time (catalog) objects.
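For orientation, the following package.json fragment is a minimal sketch of how a CAP project is typically configured for SAP HANA (for example after running cds add hana); the project name, package versions, and exact options are assumptions and may differ in your setup.

```json
{
  "name": "anomaly-insights",
  "dependencies": {
    "@sap/cds": "^7",
    "@cap-js/hana": "^1"
  },
  "cds": {
    "requires": {
      "db": {
        "kind": "hana",
        "deploy-format": "hdbtable"
      }
    }
  }
}
```

With such a configuration, cds build produces the HDI design-time artifacts (by default under gen/db), which are then deployed into an HDI container.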

Prepare the SAP Datasphere space by enabling access to HDI containers. This step creates a mapping between SAP Datasphere and the SAP BTP subaccount, thereby enabling bi-directional access. In one direction, SAP Datasphere spaces can access the artefacts from HDI containers associated with the SAP HANA Cloud instance attached to one of your SAP BTP subaccounts. In the other direction, the SAP HANA Cloud instance underneath SAP Datasphere can host HDI containers deployed by SAP CAP applications.
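On the CAP side, the HDI container that is deployed to this SAP HANA Cloud instance is usually declared in the project's mta.yaml. The fragment below is a sketch only; the module and resource names are assumptions.

```yaml
# Database-related parts of an mta.yaml (illustrative names only)
modules:
  - name: anomaly-db-deployer       # deploys the design-time artifacts into the HDI container
    type: hdb
    path: gen/db
    requires:
      - name: anomaly-db-hdi

resources:
  - name: anomaly-db-hdi            # HDI container hosted on the SAP HANA Cloud underneath SAP Datasphere
    type: com.sap.xs.hdi-container
    parameters:
      service: hana
      service-plan: hdi-shared
```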

In the SAP Datasphere space where the anomaly detection results are stored, create a technical database user to expose the artefacts within the space for consumption as a source for the CAP application. This database user must have the appropriate privileges and must be enabled for HDI consumption.

Now, SAP HANA cross-container access can be used to access data in one schema or HDI container from another. This blog details cross-container access scenarios using a user-provided service on Cloud Foundry.
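Such a user-provided service could be created roughly as follows; the service name, credential keys, and placeholder values are assumptions and depend on how the technical database user was set up in SAP Datasphere (additional keys such as a certificate may be required for SAP HANA Cloud).

```sh
# User-provided service carrying the credentials of the technical database user
# that exposes the SAP Datasphere space schema (all values are placeholders).
cf create-user-provided-service datasphere-grantor -p '{
  "host": "<hana-cloud-host-of-datasphere>",
  "port": "443",
  "user": "<technical-db-user>",
  "password": "<password>",
  "schema": "<datasphere-space-schema>",
  "tags": ["hana"]
}'
```

The database deployer of the CAP project then typically lists this service in its requires section and references it from an .hdbgrants file so that the container's object owner and application user receive the necessary SELECT privileges on the provider schema.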

The diagram below puts all these steps into perspective.

[Diagram: cross-container access between the consumer HDI container deployed by the SAP CAP application and the provider schema/HDI container of the SAP Datasphere space]

To summarise, the SAP CAP application deploys an HDI container (denoted as the consumer in the diagram above) on the SAP HANA Cloud instance underneath SAP Datasphere. Cross-container access to the provider schema or HDI container of the corresponding SAP Datasphere space (denoted as the provider in the diagram above) is then enabled by defining synonyms and a user-provided service. Once this is established, the data is brought into SAP CAP by annotating the necessary entities in the model with @cds.persistence.exists.
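As a minimal sketch (the entity name, fields, file paths, and synonym target are assumptions and must match the objects actually exposed in your SAP Datasphere space), the CDS model could look like this:

```cds
// db/external.cds -- maps a CAP entity onto an existing object in the provider schema,
// so no table is generated for it during deployment
@cds.persistence.exists
entity AnomalyResults {
  key DocumentNumber : String(10);
      CompanyCode    : String(4);
      FiscalYear     : String(4);
      AnomalyScore   : Decimal(9,4);
      Reasoning      : String(5000);
}
```

A matching synonym (for example db/src/anomaly.hdbsynonym, an assumed path) then points the database name the CDS compiler derives for this entity at the actual object in the SAP Datasphere space schema:

```json
{
  "ANOMALYRESULTS": {
    "target": {
      "object": "Anomaly_Results_View",
      "schema": "<DATASPHERE_SPACE_SCHEMA>"
    }
  }
}
```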

The primary strategy in this option is to use the SAP HANA Cloud instance within SAP Datasphere to deploy the HDI container of the SAP CAP application. Should this be unsuitable, for example because that SAP HANA Cloud instance is managed by SAP, the alternative approach detailed below may be considered.

SAP HANA Virtual Tables and Cross Container Access:

SAP HANA smart data access lets you access remote data as if it were stored in local tables in SAP HANA, without copying the data into SAP HANA. In SAP HANA, you either use linked databases or create virtual tables that point to remote tables in different remote databases; SQL queries in SAP HANA can then access these virtual tables. In this option, you will need a separate SAP HANA Cloud instance for the CAP application. The virtual tables using smart data access are created there, linking to the remote source tables of the SAP Datasphere space, which reside in the SAP HANA Cloud underneath SAP Datasphere.

As in option 1, in the SAP Datasphere space where the anomaly detection results are stored, create a technical database user to expose the artefacts within the space for consumption as a source for the CAP application. This database user must have the appropriate privileges and must be enabled for HDI consumption.

Once the access is enabled, create the necessary virtual tables as described in the initial part of the blog, Step by Step guide to creating remote data in HDI.
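In plain SQL, the pattern looks roughly as follows; the remote source name, schema, and object names are placeholders, the exact connection properties depend on your landscape, and in an HDI-based project the virtual table would normally be modelled as a design-time artifact rather than created ad hoc.

```sql
-- Remote source pointing to the SAP HANA Cloud underneath SAP Datasphere
-- (connection and credential details are placeholders)
CREATE REMOTE SOURCE "DSP_REMOTE" ADAPTER "hanaodbc"
  CONFIGURATION 'ServerNode=<datasphere-hana-host>:443;driver=libodbcHDB.so;encrypt=true;'
  WITH CREDENTIAL TYPE 'PASSWORD'
  USING 'user=<technical-db-user>;password=<password>';

-- Virtual table pointing to the anomaly results object exposed in the Datasphere space
CREATE VIRTUAL TABLE "VT_ANOMALY_RESULTS"
  AT "DSP_REMOTE"."<NULL>"."<DATASPHERE_SPACE_SCHEMA>"."Anomaly_Results_View";
```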

The diagram below puts all these steps into perspective.

Now, similar to option 1, the virtual tables can be consumed in the HDI container managed by the SAP CAP application using a user-provided service on Cloud Foundry. The second part of the blog referred to in the previous paragraph also details this.
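As in option 1, the CDS model then maps an entity onto the existing database object (here, the virtual table) and exposes it through a service so that the application and the natural-language layer can consume it. The following is a minimal sketch with assumed names, reusing the entity from the option 1 sketch:

```cds
// srv/anomaly-service.cds -- read-only exposure of the anomaly results
using { AnomalyResults } from '../db/external';

service AnomalyService {
  @readonly entity Anomalies as projection on AnomalyResults;
}
```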

Summary:

To summarise, SAP Datasphere is utilised to load business data, such as financial accounting information from S/4HANA, and to perform anomaly detection before writing the results back. Subsequently, a second application is developed using SAP CAP on SAP BTP to access these results and provide an interface for natural-language interaction using SAP AI Core. In effect, this approach bridges the gap between leveraging SAP Datasphere for centralised data processing and anomaly detection using Python, and integrating the results seamlessly into a user-facing application built with SAP CAP on BTP.

 

