Let me tell you 4 truths from the multiverse:
- Data Fabric / Data Lakes killed the Data Warehouse concept.
- Data Mesh comes as an evolution of Data Fabric.
- Dashboard generation is on demand since GPT; rigid no more.
- SAP gets the point and releases BDC with Fabric, Mesh, Dashboards and Gen AI.
Traditionally, analytics in SAP was built primarily around a back end and a presentation layer: Business Warehouse (BW) and Business Objects (BO) delivered enterprise reporting and data analysis for many organizations. SAP BW provided the data warehousing platform, centralizing data from mostly SAP sources into a structured environment optimized for reporting. On the presentation layer, Business Objects offered a suite of tools for creating reports, dashboards, and ad-hoc queries, giving business users self-service analytics.
These technologies brought significant benefits, but that analytical layer is long past its prime and no longer considered the technology of the future. Its on-premise nature, complexity, and rigidity struggle to keep pace with the demands of modern, agile businesses.
The rise of cloud computing, big data, and real-time analytics brought architectures that are more flexible and scalable. The focus shifted towards cloud-based platforms like SAP Analytics Cloud and embedded analytics within S/4HANA, while the rise of cloud hyperscalers and modern analytics platforms, first Snowflake and then Databricks, captured the idea perfectly, leveraging a combination of technological improvements and platform modernity. The medallion architecture was born.
In this blog, I will go through the new offering from SAP, Business Data Cloud (BDC from now on), and its vision to unify applications, data, and AI, with a core focus on Data Products.
The Evolution of Data Architectures
The Medallion Architecture
“Medallion Architecture, a multi-tiered approach in data lakehouses, providing a foundation for understanding data organization”
In the world of data management, the Medallion architecture, also known as multi-hop architecture, is an approach to data model design that encourages the logical organisation of data within a data lakehouse.
The Medallion architecture structures data in a multi-tier approach —bronze, silver and gold tier— taking into account and encouraging data quality as it moves through the transformation process (from raw data to valuable business insights). This architecture ensures data integrity by passing through several stages of validations and transformations that ensure data atomicity, consistency and durability. Once the data has passed through these validations and transformations, it is stored in an optimal layout for effective analysis, ready to be used for strategic decision making.
By Author: Medallion architecture diagram showing Bronze, Silver, Gold tiers with data sources
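To make the tiers concrete, here is a minimal sketch of a medallion pipeline using PySpark with Delta Lake. The paths, table names, and cleansing rules are illustrative assumptions, not a reference implementation.

```python
# Minimal medallion sketch (assumes a Spark cluster with Delta Lake enabled).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze: land raw source data as-is, adding ingestion metadata.
raw = spark.read.json("/landing/sales/")  # hypothetical landing zone
(raw.withColumn("_ingested_at", F.current_timestamp())
    .write.format("delta").mode("append").saveAsTable("bronze_sales"))

# Silver: validate and conform (deduplicate, enforce types, basic quality rules).
silver = (spark.table("bronze_sales")
               .dropDuplicates(["order_id"])
               .filter(F.col("amount").isNotNull())
               .withColumn("amount", F.col("amount").cast("decimal(15,2)")))
silver.write.format("delta").mode("overwrite").saveAsTable("silver_sales")

# Gold: aggregate into a business-ready, consumption-oriented table.
gold = (spark.table("silver_sales")
             .groupBy("region", "order_month")
             .agg(F.sum("amount").alias("revenue")))
gold.write.format("delta").mode("overwrite").saveAsTable("gold_revenue_by_region")
```

Each write hop is a separate, persisted table, which is exactly what the criticisms below take aim at.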
Initially, the Medallion Architecture emerged as a response to the growing complexity of data management, offering a seemingly manageable way to organize expanding business data by breaking the problem into smaller, quality-focused stages. For some, however, Medallion falls short.
A primary criticism is that it operates as a “pull mechanism,” which inadvertently shifts the burden of complex data transformations onto the data consumers. These consumers, often business analysts or downstream applications, are forced to handle intricate data manipulations and wait for data to be fully curated in the Gold layer before they can derive meaningful insights. This creates inefficiencies and delays, hindering agile decision-making.
Furthermore, the Medallion Architecture has been criticized for data quality issues. Because the layering is so rigid, errors and inconsistencies introduced in the initial Bronze layer can propagate through subsequent layers, becoming increasingly difficult to rectify downstream. This layered approach can lead to a fragile data foundation, where each layer’s integrity is heavily dependent on the preceding one.
In Medallion, data is repeatedly moved and transformed across layers, adding computational costs and processing backlogs without necessarily adding commensurate business value in the earlier stages.
A further issue is the Medallion Architecture’s lack of business context in its upstream tiers. This linear, assembly-line style of data transformation treats data as a technical artifact rather than a business product. Bronze and Silver layers often lack the business context needed to be readily usable for decision-making, relegating true business value realization to the final Gold layer and delaying access to actionable insights.
This also limits data consumption options and creates bottlenecks: downstream consumers must wait for the data to reach the Gold layer, facing long queues and restricted access to data in its raw or intermediate forms. It is not agile; it is still rigid, and the risk of errors is high.
The Evolution to Data Products and Data Mesh
Enter the Data Product Architecture. This architecture adopts a “push mechanism,” where data is proactively shaped and refined based on clearly defined analytical and operational use cases. It pushes business context to the forefront, right from the initial stages of data processing. This ensures that data is treated as software, as a product from the outset, designed and engineered to meet specific business needs and deliver value at every stage of its lifecycle.
This concept, called Data Mesh, is perfect for LLM consumption, and it predates LLMs. Introduced in 2020, it describes how a data monolith (be it a data lake or a warehouse) often becomes a bottleneck as organizations grow and data complexity increases. Data Lakes, coming from the old days of Big Data, lack agility and scale poorly, because data must be moved along and offered to the next phase of “curation”.
By Author: Data mesh conceptual diagram showing decentralized data ownership
The Data Mesh is founded on four key principles:
- Domain-oriented decentralized data ownership shifts responsibility to domain teams, who are closest to the data and its context.
- Data as a product emphasizes treating data as a valuable product, making it discoverable, understandable, trustworthy, and natively accessible. This requires domain teams to not just own the data but also serve it to consumers effectively.
- A self-serve data platform is essential, providing domain teams with the necessary infrastructure and tools to build, deploy, and operate their data products independently, without relying on a central data team as a bottleneck.
- Federated computational governance addresses the need for standardization and interoperability across decentralized domains. This involves establishing global standards and policies, while allowing domains autonomy within those boundaries.
Moving to a Data Mesh is presented as an evolutionary journey from the Data Fabric, not a revolutionary overhaul. Organizations should start with their existing data monolith and incrementally transition towards a mesh architecture. Key steps include identifying business domains and their corresponding data products, building self-serve data platform capabilities, and implementing federated governance.
What is a Data Product?
In Data Mesh speak, a Data Product is an architectural quantum, which is the “smallest unit of architecture that can be independently deployed with high functional cohesion and includes all the structural elements required for its function.”
More practically, a Data Product is a self-contained package encompassing not just data, but also metadata, code for transformation, and potentially infrastructure. It embodies the principle of “data as a product,” where data owners treat their data with a product mindset, focusing on consumer needs and usability.
By Author: Concentric circles diagram showing Data Product components – Data, Metadata, Infrastructure, Code
However, a more nuanced view defines a Data Product as an autonomous logical entity describing data meant for consumption, with relationships to the underlying technology. This logical entity includes a dataset name, description, owner, and references to physical data assets, making it technology-agnostic. Regardless of the definition, the goal is to create discoverable, addressable, understandable, trustworthy, and natively accessible data. Data Products are the fundamental building blocks of a Data Mesh, designed to serve analytical data and facilitate data-driven decision-making within specific business domains. They should be simple and cohesive, focusing on a single, well-defined function to maximize reusability.
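As a rough illustration of that logical-entity view, the sketch below models a Data Product with a name, description, owner, and references to physical assets. The field names are my own invention for illustration and do not follow any SAP or Data Mesh standard schema.

```python
# A minimal, technology-agnostic model of a Data Product as a logical entity.
from dataclasses import dataclass, field
from typing import List

@dataclass
class PhysicalAssetRef:
    system: str   # e.g. "S/4HANA", "HANA Cloud", "object store" (illustrative)
    locator: str  # technology-specific address of the underlying data set

@dataclass
class DataProduct:
    name: str
    description: str
    owner: str  # the accountable domain team
    assets: List[PhysicalAssetRef] = field(default_factory=list)

    def is_consumable(self) -> bool:
        # Without at least one physical asset, there is nothing to serve.
        return bool(self.assets)

cash_flow = DataProduct(
    name="CashFlow",
    description="Actual and forecasted cash positions, harmonized across sources.",
    owner="finance-domain-team",
    assets=[PhysicalAssetRef("S/4HANA", "Z_CDS_CASHFLOW")],  # hypothetical view
)
print(cash_flow.is_consumable())  # True
```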
Data becomes a product when it is effectively packaged, delivered, and consumed in a way that solves a specific user problem or fulfills a business need.
Data becomes a DaaS product at the point of consumption, not merely at the point of collection or storage. It’s like preparing a meal: the ingredients are not consumed until they become a final dish, and you collect the ingredients you want, not the other way around; being handed a pile of ingredients leaves you asking, “what can I do with all this?”
Several key factors are crucial in this data-to-product transformation. Firstly, understanding the user and their problem, just as we do when building DaaS. A data product must be designed with a specific user in mind and must address a tangible problem they face. Without this user-centric approach, data risks remaining an abstract asset with limited practical value. This necessitates close collaboration with potential users to deeply understand their workflows, pain points, and information requirements.
Secondly, packaging and presentation play a vital role. Transforming data into easily digestible formats, such as reports, dashboards, or APIs, is essential for making it accessible and actionable. This involves not only technical processing but also thoughtful design of interfaces and visualizations that facilitate intuitive interaction and interpretation. The form in which data is delivered is as important as the data itself in determining its product value.
Also, the lifecycle of the data is critical in its design, and it calls for a product-centric mindset within data teams. Data teams should possess a strong understanding of business needs and user workflows. This requires a shift in perspective, viewing data not just as a technical asset but as a product that needs to be carefully crafted, marketed, and supported to ensure user adoption and satisfaction.
SAP’s Journey in Data Management
SAP has long been a player in Data Management, since the introduction of BW in 1998. This was the dawn of an era in which organizations sought to consolidate and analyze their business data. SAP BW evolved through capabilities in ETL, data modeling, and integration, ultimately moving onto the in-memory HANA platform around 2015.
The introduction of SAP Data Warehouse Cloud around 2019, rebranded SAP Datasphere in 2023 and built on the BTP, was the first business data fabric architecture, because it aimed at consolidating critical data from various sources on HANA Cloud. This architecture, with its focus on integrating diverse data landscapes, aligns with the decentralized and interconnected nature of a Data Mesh. SAP Datasphere provides a centralized catalog for data discovery and governance across these interconnected sources. The Catalog is important, and I will discuss it later on.
In 2025, the introduction of SAP Business Data Cloud (BDC) marks the future of SAP’s data and analytics strategy. It integrates the strengths of SAP BW, SAP Datasphere, and SAP Analytics Cloud (SAC) on a single platform.
SAP BDC ecosystem diagram showing integration of components
What is really new is the concept of data products, central to SAP BDC, with SAP aiming to deliver out-of-the-box data products following a harmonized data model. This strong emphasis on data products as fundamental building blocks clearly echoes the core tenets of a Data Mesh.
LLMs and the Necessity of Data Products
Let’s first understand how LLMs function and why the issues described below occur so frequently.
LLMs are trained on vast amounts of unstructured data; they learn about the data and store this information as part of weights and biases in their neural network; this information includes the language understanding and the general knowledge provided in the training data.
To date, off-the-shelf LLMs are not equipped with the structured, relevant data that enterprise use cases require. The most popular enterprise use case is querying an extensive set of tables and data lakes using LLMs. Here are two broad ways in which LLMs are being used in enterprises today:
Scenario 1: Unorganized Data Pools
A common misconception is that LLMs can seamlessly process unorganized data to deliver accurate responses, leading organizations to point them at such sources. However, this approach is flawed. Without a structured data framework, LLMs struggle to create accurate and optimized queries, resulting in inefficient SQL, subpar performance, and elevated computational costs. Supplying just the database schema isn’t enough; LLMs require detailed contextual information about metrics, measures, dimensions, entities, and their relationships to generate effective SQL queries.
By Author; LLM with unorganized data diagram showing consumption layer, LLM, query engine, and data lakes
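To see what that missing context looks like in practice, compare a schema-only prompt with one enriched by semantic definitions. Both prompts below are hypothetical; the semantic block is the kind of metadata a data product would supply.

```python
# Schema-only context: the model must guess joins, filters, and metric logic.
SCHEMA_ONLY = """Tables: sales(order_id, cust, amt, dt), customers(id, nm, seg)
Question: What was revenue by customer segment last quarter?"""

# Semantically enriched context: business definitions are stated explicitly.
WITH_SEMANTICS = """Tables: sales(order_id, cust, amt, dt), customers(id, nm, seg)
Semantics:
- Measure 'revenue' = SUM(sales.amt), completed orders only.
- Dimension 'segment' = customers.seg; join on sales.cust = customers.id.
- 'last quarter' = the previous complete calendar quarter of sales.dt.
Question: What was revenue by customer segment last quarter?"""

def build_prompt(context: str) -> str:
    # The semantic block tells the model which joins, filters, and aggregations
    # define each business metric; the bare schema leaves all of that implicit.
    return f"Generate a SQL query.\n{context}\nSQL:"

print(build_prompt(WITH_SEMANTICS))
```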
Scenario 2: Organized Data Catalogues
Organizations may choose to organize their data with defined schemas and entities in catalogs before using LLMs, which helps the LLMs understand the data and improves accuracy and efficiency. However, this method requires ongoing updates, involves data movement, and has high upfront costs for organizing and cataloging large datasets. Additionally, even with this structure, LLMs may still not fully comprehend the data’s context and semantics, which results in inaccuracies.
By Author; LLM with data catalogs diagram showing the addition of data catalog between query engine and data lakes
The Solution: Building LLMs Powered by Data Products
Enter the data product era! A data product is a comprehensive solution that integrates Data, Metadata (including semantics), Code (transformations, workflows, policies, and SLAs), and Infrastructure (storage and compute). It is specifically designed to address various data and analytics (D&A) and AI scenarios, such as data sharing, LLM training, data monetization, analytics, and application integration. Across various industries, organizations are increasingly turning to sophisticated data products to transform raw data into actionable insights, enhancing decision-making and operational efficiency. Data products are not merely a solution; they are transformative tools that prepare and present data effectively for AI consumption: trusted by their users, up-to-date, and governed appropriately.
By Author; LLM with data product layer diagram showing how data products sit between LLM and data lakes
Integrating a Data Product Layer with your existing data infrastructure represents a significant advancement in leveraging Large Language Models (LLMs) for enterprise data management. This powerful combination enhances LLMs’ contextual understanding and query precision and ensures robust data governance, scalability, and operational efficiency. In navigating the complexities of big data and AI, organizations benefit from solutions like DataOS, which help in building and managing the data products in a decentralized way.
SAP BDC: Data Products in Action
At the heart of SAP BDC lies the concept of Data Products. SAP recognizes that data is only valuable if it’s accessible, understandable, and trustworthy. Data Products in BDC are not just raw data; they are curated, enriched, and contextualized data assets designed for specific business purposes. They are the core components of the Business Data Cloud.
Within SAP, a Data Product is a data set made available for use outside its original application through APIs. It comes with detailed, high-quality descriptions accessible via a Data Product Catalog. It’s important to note that “Data Product” doesn’t mean something you purchase; it simply refers to data that is “packaged” for straightforward use.
Cash Flow Data Product example showing interface elements and properties
There are two variants of Data Products available for flexible access:

- SAP Data Products, based on a canonical / standard SAP One Domain Model definition
- Customer Data Products, based on customer-individual configurations like S/4HANA Z-CDS views, BW DSO objects, IBP datasets, HANA Cloud, etc.
Features of Data Products:
- Business Data Sets: consisting of one or more business object entities, related objects, analytical data (measures, dimensions), documents, graph data, spatial data, ...
- Consumable: via APIs or via events. Supported API types are SQL (incl. SQL interface views), Delta Sharing, Event, REST, (OData). A consumption sketch follows this list.
- Described: with high-quality metadata provided via Open Resource Discovery (ORD), following the ORD schema for Data Products (ORD will be explained in a minute).
- Discoverable: via the Data Product Directory, a service of UCL that aggregates metadata of all Data Products to make them discoverable in a landscape.
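Since Delta Sharing is one of the supported consumption channels, here is a minimal sketch using the open-source delta-sharing Python client. The profile file and the share/schema/table coordinates are assumptions for illustration; a real BDC provider issues its own credentials.

```python
# pip install delta-sharing
import delta_sharing

# Hypothetical profile file issued by the data product provider.
profile = "bdc_share_profile.json"

# Discover which tables the provider exposes through the share.
client = delta_sharing.SharingClient(profile)
for table in client.list_all_tables():
    print(table.share, table.schema, table.name)

# Load one shared data product table directly into a pandas DataFrame.
df = delta_sharing.load_as_pandas(f"{profile}#finance.cashflow.positions")
print(df.head())
```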
Data catalog interface showing filtering capabilities and available data products
The data catalog plays a pivotal role in modern data management by serving as a centralized and organized inventory of metadata. It is crucial for data discovery, allowing users to easily find the data they need and understand its purpose and contents. A data catalog typically stores metadata such as business terms, owners, origins, lineage, labels, and classifications. This enables data analysts and other users to evaluate the fitness of data for intended use cases. Furthermore, the catalog is fundamental for data governance by providing a system to manage and oversee data assets, track ownership, understand data flows (lineage), and enforce policies.
Requirements for Utilizing Data Products
What is needed to use Data Products effectively:
- Discoverability: Users need a way to find and understand the available data products, often through a Data Mesh Marketplace or a data product catalog. This requires metadata and documentation to be readily accessible.
- Accessibility: Once a data product is found, users need to be able to access it through defined interfaces. This might involve APIs, SQL interfaces, file-based endpoints, or other methods depending on the data product’s design and intended use cases.
- Understanding Data Contracts and Policies: Users need to be aware of and adhere to the data contracts that define the structure, quality, service levels, security, and privacy policies associated with the data product. They need to understand the access rights and licenses governing the use of the artifacts within the data product.
- Authentication and Authorization: Secure access requires authentication to verify the user’s identity and authorization to ensure they have the necessary permissions to consume the data product and its artifacts.
- Appropriate Tools and Skills: Depending on the access method, users might need specific tools (e.g., SQL clients, API clients, data science workbenches) and the skills to interact with the data in the provided format. Self-service capabilities and user-friendly interfaces aim to minimize the need for complex IT skills.
- Network Connectivity and Endpoints: Users need to be able to connect to the addressable endpoints (URLs, URIs) of the data products. This technical foundation ensures that data consumers can reliably access the data products regardless of their location.
- Awareness of Service Level Objectives (SLOs): Consumers should be aware of the SLOs to understand the expected reliability and performance of the data product. This includes metrics like uptime, response time, and data freshness guarantees; a consumer-side freshness check is sketched after this list.
- Data Preparation (Potentially): While data products aim to provide processed and user-friendly data, consumers might still need to perform lightweight integration or transformation tasks based on mappings between similar data elements. Self-service data preparation tools can assist with this.
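As a small illustration of the SLO point above, the sketch below shows a consumer-side freshness check against a hypothetical data contract. The contract fields and the staleness threshold are assumptions, not a formal contract specification.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical contract excerpt for the data product.
contract = {
    "product": "CashFlow",
    "slo": {"max_staleness_hours": 24, "min_uptime_pct": 99.5},
}

def is_fresh(last_updated: datetime, max_staleness_hours: int) -> bool:
    # Data older than the contracted staleness budget should not feed decisions.
    age = datetime.now(timezone.utc) - last_updated
    return age <= timedelta(hours=max_staleness_hours)

# In practice this timestamp would come from the catalog's lineage metadata.
last_refresh = datetime.now(timezone.utc) - timedelta(hours=3)
print(is_fresh(last_refresh, contract["slo"]["max_staleness_hours"]))  # True
```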
SAP BDC Architecture: Enabling Data Products
SAP BDC architecture diagram showing layers and components
The architecture of SAP Business Data Cloud is designed to support the creation, management, and consumption of Data Products through a unified and coherent framework. This architecture forms the foundation that enables data to be transformed from raw information into valuable, business-ready data products.
The core components of SAP BDC architecture work harmoniously to support the Data Product lifecycle:
SAP Datasphere
SAP Datasphere serves as the central hub for integrating data from various sources, both SAP and non-SAP. It plays a crucial role in harmonizing data across different formats and structures, ensuring consistency and compatibility. Through its data modeling and transformation capabilities, Datasphere creates the foundation for high-quality Data Products by providing a unified semantic layer that bridges technical data storage and business meaning.
SAP Analytics Cloud
SAP Analytics Cloud functions as the primary consumption layer for Data Products. It leverages the well-structured Data Products to deliver analytics, reports, and dashboards that enable business users to gain insights without complex data manipulation. This component translates the technical excellence of Data Products into business value through visualization and analysis.
SAP Databricks
SAP Databricks provides advanced data processing, machine learning, and AI capabilities within the BDC ecosystem. It enables the creation of sophisticated Data Products that incorporate predictive analytics, complex transformations, and AI-driven insights. Through its serverless computing and unified analytics platform, Databricks helps extend the value of Data Products beyond traditional reporting.
Foundation Services
The underlying Foundation Services provide essential capabilities for data acquisition, transformation, and storage. These services ensure that Data Products have a reliable infrastructure foundation, addressing needs for performance, scalability, and security.
BW integration diagram showing how BW connects with BDC components
Data Product Creation and Consumption in SAP BDC
The journey of creating and consuming Data Products in SAP BDC follows a structured process that ensures quality, governance, and accessibility. This process encompasses multiple steps, from initial data package activation to the creation of business value through Data Product consumption.
SAP applications produce data products that can be consumed in SAP Datasphere via SAP data packages. Data products provide the data for consumption, while the Business Analyst reviews the information about the installed Data Products.
By Author; Data activation workflow diagram showing the 5-step process for data products
Integration with SAP Business Warehouse
SAP BDC offers seamless integration with SAP Business Warehouse, allowing organizations to leverage their existing BW investments while moving toward a modern data product approach. With the introduction of SAP BDC, customer-managed BW data products can be transitioned to SAP-managed data products.
Customers will first work with BW data products based on their BW data models from SAP BW, starting use cases in SAP Databricks and SAP Datasphere, with exposure via Delta Share. As they begin exploring what’s possible with SAP-managed data products and insight apps, customers can gradually replace BW Data Products with SAP-managed Data Products as they move into a clean data approach.
BW can publish “BW data products” into the SAP BDC object store. SAP Databricks Delta Sharing capabilities are used to expose those data products to third-party data lakes. We can also access third-party data lakes from BDC via Delta Sharing and, for example, join data from BW with data from Azure Data Lake in a Datasphere data layer.
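Here is a rough sketch of that cross-lake join, assuming the BW data product is exposed via Delta Sharing and the third-party data sits in an Azure Data Lake readable as Parquet. All profiles, paths, and column names are hypothetical, and reading abfss:// paths requires an fsspec backend such as adlfs plus credentials.

```python
# pip install delta-sharing pandas adlfs
import delta_sharing
import pandas as pd

# BW data product, published to the BDC object store and exposed via Delta Sharing.
bw_sales = delta_sharing.load_as_pandas("bdc_profile.json#bw.sales.orders")

# Third-party data already landed in an Azure Data Lake (read as Parquet).
azure_customers = pd.read_parquet(
    "abfss://lake@myaccount.dfs.core.windows.net/customers/"
)

# Join on a shared business key to build one harmonized view.
combined = bw_sales.merge(azure_customers, on="customer_id", how="left")
print(combined.head())
```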
The Role of Open Resource Discovery (ORD)
Open Resource Discovery (ORD) is a protocol that allows applications and services to self-describe their exposed resources and capabilities. It’s critical for enabling consistent technical documentation and facilitating the discovery of Data Products.
ORD provides several benefits in the context of Data Products:
- It enables automated discovery and aggregation of metadata.
- It ensures a high degree of automation and helps keep systems in sync with reality.
- It provides a bigger context with shared, high-level information, taxonomy, and relations between described resources.
- SAP Business Data Cloud uses ORD to provide high-quality metadata for its Data Products via the Data Product Directory; a discovery sketch follows this list.
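As a sketch of how ORD-based discovery can work, the snippet below fetches a provider’s ORD configuration from its well-known URL and lists the data products it describes. The host is hypothetical, real systems require authentication, and the document fields follow my reading of the public ORD specification.

```python
import requests

base = "https://provider.example.com"  # hypothetical ORD provider

# ORD providers advertise their documents via a well-known configuration URL.
config = requests.get(
    f"{base}/.well-known/open-resource-discovery", timeout=10
).json()

# Walk the referenced ORD documents and print the described data products.
for doc_ref in config.get("openResourceDiscoveryV1", {}).get("documents", []):
    doc = requests.get(base + doc_ref["url"], timeout=10).json()
    for product in doc.get("dataProducts", []):
        print(product.get("ordId"), "-", product.get("title"))
```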
Business Use Case: Data Products Driving Value
To illustrate the real-world impact of Data Products within SAP BDC, let’s consider a finance department struggling with cash flow visibility and forecasting accuracy.
Before implementing a Data Product approach, the finance team dealt with fragmented data sources: bank transaction data in SAP S/4HANA, accounts receivable in an Ariba system, and forecasting data in various Excel spreadsheets. Analysts spent days each month reconciling and consolidating this data, often discovering discrepancies too late to inform decision-making.
Old relationship between stakeholders
With SAP BDC’s Data Product approach, the organization implemented a “Cash Flow” Data Product that provides:
- A unified view of actual cash positions from banking systems
- Confirmed transactions from accounts receivable
- Forecasted cash flows from planning systems
Finance analysts now access this Data Product through SAP Analytics Cloud, where they can immediately visualize current cash positions, analyze trends, and generate accurate forecasts. The Data Product ensures that all data is current, consistent, and properly contextualized with business meaning.
By Author
This case demonstrates how Data Products streamline data access, improve data quality, and accelerate insights—transforming raw data into actionable business value.
Benefits of a Data Product-Centric Approach in SAP BDC
The Data Product approach within SAP Business Data Cloud delivers several key advantages that collectively transform how organizations leverage their data assets:
- Improved data accessibility and discoverability: Well-defined and cataloged Data Products make it significantly easier for business users to find relevant data. With rich metadata, clear ownership, and intuitive cataloging, users spend less time searching and more time analyzing.
- Enhanced data trustworthiness and reliability: The focus on quality and governance within Data Products builds confidence in the data. Users can trust that the data they access is accurate, current, and compliant with relevant policies and standards.
- Increased business agility: By empowering users to access and utilize Data Products without complex data wrangling, organizations can respond more quickly to changing business conditions. Business teams can independently explore data and generate insights without heavy IT involvement.
- Accelerated innovation: A solid foundation of Data Products provides the reliable data needed for advanced analytics, machine learning, and AI initiatives. This accelerates the path from data to innovation, allowing organizations to develop new capabilities and offerings faster.
- Reduced duplication and increased efficiency: By creating reusable Data Products instead of one-off data extracts or reports, organizations reduce redundant work and establish a shared understanding of key business data.
- Improved data literacy across the organization: The business context embedded in Data Products helps users understand the meaning and proper use of data, elevating overall data literacy.
TL;DR
In the age of AI, the ability to link business understanding with strategic choices is vital. As every company becomes an AI data creator, strong business context becomes necessary to get AI ready for real-world use. Building a solid base for Business AI means bringing together data systems and business operations.
The data product approach fundamentally shifts how we think about data, moving from a technical resource to be managed to a strategic asset to be leveraged. It brings business context to the forefront, ensuring that data is not just stored but is truly useful and actionable.
Within SAP, the new Business Data Cloud, with Data Products at its core, changes how organizations manage and leverage their data in this new paradigm by treating data as a product: discoverable, accessible, trustworthy, and natively usable.
By adopting a data product mindset, organizations can bridge the gap between data and business value and, at the same time, lay the foundation for making their data accessible to AI.