SAP Document AI accelerates document-centric business processes with Generative AI

Authors: @amagnani37 & @merza 

Thank you to everyone who joined our recent partner webinar, “Streamline document processing with SAP Document AI”. This session was part of our ongoing partner webinar series, “Talk to your business with Generative AI”, and the turnout and engagement were fantastic.

For those who couldn’t make it, or for anyone who wants a recap, we (Merza Klaghstan, Alice Magnani, and Jacob Tan from the BTP & AI Solution Architects team) wanted to share the key takeaways, demos, and technical insights from the session.

Our goal was to go beyond a simple feature list and show you how SAP Document AI can be the centerpiece of a complete, end-to-end business solution.

Here’s what we covered.

Part 1: What is SAP Document AI and Why Does it Matter?

We kicked off by tackling the core business problem: an estimated 70-80% of all enterprise information is unstructured. Manually processing this data from documents, emails, and scans is slow, expensive, and prone to errors.

SAP Document AI, formerly known as Document Information Extraction (DOX), is SAP’s AI-powered solution on the Business Technology Platform (BTP) designed to solve this. It goes far beyond simple Optical Character Recognition (OCR) to extract, classify, and validate data with high precision.

Key Capabilities

We highlighted several key features:

Multiple Editions: The service is available in Base AI, Premium AI (which includes Generative AI capabilities), and Embedded editions to fit different use cases. We also want to highlight for our SAP Partners that the Premium AI edition is part of the Test, Demo, and Development (TDD) license, which provides a much more convenient model for your non-productive and internal development purposes.

 

The New “Workspace” UI: We compared the classic “Basic” UI with the new, full-featured “Workspace,” which is the future of the service and includes capabilities such as email channel configuration and workflow management.

From now on, all new SAP Document AI features will be available in the Workspace UI only, so we strongly advise using it for all new projects.

 

Pre-Configured & Custom Content (Schemas): The service provides many pre-trained models (called “Schemas”) for standard documents like invoices and purchase orders. The real power, however, comes from creating custom schemas for any document type your business needs—from specialized licenses to internal forms.

 

Generative AI & Instant Learning: For custom schemas, SAP Document AI leverages Generative AI and Large Language Models (LLMs) to understand and extract information. We also demonstrated “Instant Learning,” where the model learns from your manual corrections immediately. After one correction, it knows how to handle that document layout correctly the next time.

In our demo, we showed how the model initially struggled with a name containing a middle name and extracted the wrong federation name (full name vs. acronym). After one manual correction and confirmation, it learned immediately. Technically, this feature uses few-shot prompting: your previously reviewed documents are passed to the underlying LLM as examples, which improves accuracy on the next run.
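To make the few-shot idea concrete, here is a minimal sketch of how reviewed documents could be turned into prompt examples. It is not SAP's implementation: the `ReviewedExample` shape, the `buildFewShotPrompt` function, the prompt wording, and the sample data are all assumptions made for illustration.

```typescript
// Hypothetical shape: none of these names come from the SAP Document AI API.
interface ReviewedExample {
  documentText: string;                    // text of a previously reviewed document
  confirmedFields: Record<string, string>; // field values after human correction
}

// Build a few-shot prompt: each reviewed document becomes a worked example
// placed before the new document the model should extract from.
function buildFewShotPrompt(
  fieldNames: string[],
  examples: ReviewedExample[],
  newDocumentText: string
): string {
  const header = `Extract these fields as JSON: ${fieldNames.join(", ")}.`;
  const shots = examples.map(
    (ex, i) =>
      `Example ${i + 1}\nDocument:\n${ex.documentText}\nAnswer:\n${JSON.stringify(ex.confirmedFields)}`
  );
  // The prompt ends with an open "Answer:" so the model completes the new case.
  const task = `Document:\n${newDocumentText}\nAnswer:`;
  return [header, ...shots, task].join("\n\n");
}

// Made-up sample data: one corrected license teaches the acronym convention.
const fewShotPrompt = buildFewShotPrompt(
  ["full_name", "federation"],
  [
    {
      documentText:
        "Name: Mary Ann Smith\nIssued by: Professional Association of Diving Instructors",
      confirmedFields: { full_name: "Mary Ann Smith", federation: "PADI" },
    },
  ],
  "Name: Luca Rossi\nIssued by: Professional Association of Diving Instructors"
);

console.log(fewShotPrompt);
```

Because the corrected answer travels inside the prompt, no model retraining is needed, which is consistent with a correction taking effect on the very next document.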

Part 2: A Real-World Use Case: “Vertigo Travels”

To make this real, we introduced “Vertigo Travels,” a fictional travel agency specializing in sports and adventure holidays. Their key challenge is managing the various kinds of documents required for subscriptions, such as ID cards, medical certificates, and diving licenses.

The solution architecture includes:

The Vertigo Travels website, a frontend application deployed in Cloud Foundry, serving as the entry point for travelers to select travel packages and upload the necessary documents, such as ID cards or medical certificates. The backend logic is a CAP (Node.js) application, also deployed in Cloud Foundry.

SAP Document AI, for all document intelligence operations, communicating via API with our backend application.

SAP HANA Cloud, as a persistence layer for our application, to store all the entities involved in the subscription process.

SAP Build Work Zone, standard edition, as the entry point for Vertigo Travels employees to supervise the process. From Work Zone, business users can access the Vertigo Travels website for admin operations, or the SAP Document AI Workspace, for instance to review documents. SAP Build Work Zone also allows business users to use SAP Document AI on mobile devices.

SAP Build Process Automation, to define business rules that automate some document validation checks. Although this could also be implemented as part of the Vertigo Travels backend application, SAP Build Process Automation enables citizen developers to easily modify the business logic when required.

SAP S/4HANA Cloud, as the core SAP Cloud solution. The Vertigo Travels backend is integrated with modules such as Business Partners, Products, Sales Orders and Invoices, and Customer Returns (for refunds).

The source code is available in the Vertigo Travels GitHub repository: https://url.sap/pi5vwn

P.S. If you’re planning to work on this prototype, please give us a shout-out, so at least we know somebody is trying out our repository!

We demonstrated three end-to-end scenarios:

The E2E Flow: A traveler, Mary, subscribes on the website and uploads her documents. A back-office employee, Barry, reviews and corrects one document in the Document AI Workspace. Once confirmed, a Sales Order is automatically created in S/4HANA Cloud.

Multi-Channel Ingestion: A traveler, Luca, visits an office. The employee uses SAP Mobile Start to scan his ID and medical certificate. The system then automatically emails Luca about a missing document. He simply replies to the email with his license attached, which is automatically ingested via the Inbound Channel feature.

Automation: A traveler, Claire, uploads an income statement to apply for a discount. SAP Build Process Automation runs a rule to check her eligibility. When she submits her final documents, they have a high confidence score and are auto-confirmed without any manual review.
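The auto-confirmation step in the third scenario can be pictured as a simple confidence gate. The sketch below is an assumption-laden illustration, not the service's actual logic: the `ExtractedField` shape, the `decideReview` function, and the 0.9 threshold are all hypothetical.

```typescript
// Hypothetical shape of one extracted field; the real API response may differ.
interface ExtractedField {
  name: string;
  value: string;
  confidence: number; // assumed to range from 0.0 to 1.0
}

type ReviewDecision = "auto-confirm" | "manual-review";

// Auto-confirm only when every field meets the threshold. A single
// low-confidence or empty field sends the whole document to a reviewer.
function decideReview(fields: ExtractedField[], threshold: number): ReviewDecision {
  if (fields.length === 0) return "manual-review";
  const allConfident = fields.every(
    (f) => f.value.trim() !== "" && f.confidence >= threshold
  );
  return allConfident ? "auto-confirm" : "manual-review";
}

const AUTO_CONFIRM_THRESHOLD = 0.9; // assumed value; tune per document type

// Made-up sample extractions.
const clearScan: ExtractedField[] = [
  { name: "full_name", value: "Claire Dubois", confidence: 0.98 },
  { name: "license_number", value: "D-48213", confidence: 0.95 },
];
const blurryScan: ExtractedField[] = [
  { name: "full_name", value: "Claire Dubois", confidence: 0.97 },
  { name: "license_number", value: "D-4B2l3", confidence: 0.62 },
];

console.log(decideReview(clearScan, AUTO_CONFIRM_THRESHOLD));
console.log(decideReview(blurryScan, AUTO_CONFIRM_THRESHOLD));
```

Gating on the weakest field rather than an average keeps one badly read value from slipping through on the strength of the others.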

Part 3: How We Built It (The Technical Deep Dive)

For our developer community, we walked through the “how-to” for the key components.

1. Setup and Schema Management: We started with the BTP setup, which involves managing entitlements, creating the service instance, and, critically, establishing trust with SAP Cloud Identity Services (IAS) for user authorization. We then showed how to create the custom schemas that power the application.

When you set up your entitlements, make sure to use the application subscription ending in -ias (Identity Authentication Service), as this is the one that provisions the new Workspace interface.

2. API-Driven Development: Our CAP application’s backend communicates directly with Document AI’s APIs. We showed code snippets for uploading a file (using the FileService) and processing it against a schema (using the SchemaService).

How to retrieve specific extracted values:
While our demo UI showed a simple text extraction, your application will likely need specific, structured data (like “gross_family_income” or “number_of_kids”). The API makes this straightforward:

1. After processing the document, you get a Document ID.
2. You can use this Document ID to retrieve the Document Version ID.
3. Using the Version ID, you can then query the Entities for that document.
4. Finally, you can retrieve the specific value by using the Entity ID (e.g., the ID for your “number_of_kids” field).

This is how our automation process, for example, was able to pull only the family income and number of kids to pass to the rules engine.
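The retrieval chain above can be sketched as a walk over a nested structure. This example assumes the responses of the individual calls have already been fetched and assembled into one object; the type names, the `createdAt` field, and the "latest version wins" rule are assumptions for illustration and not the documented API contract.

```typescript
// Hypothetical snapshot of what the chained calls return once assembled.
interface ExtractedEntity {
  id: string;
  name: string;
  value: string;
}
interface DocumentVersion {
  id: string;
  createdAt: string; // ISO 8601 timestamp (assumed field)
  entities: ExtractedEntity[];
}
interface ProcessedDocument {
  id: string;
  versions: DocumentVersion[];
}

// Document ID -> Version ID: pick the most recent version (assumed rule).
function latestVersion(doc: ProcessedDocument): DocumentVersion | undefined {
  let latest: DocumentVersion | undefined;
  for (const version of doc.versions) {
    if (latest === undefined || version.createdAt > latest.createdAt) {
      latest = version;
    }
  }
  return latest;
}

// Version ID -> Entities -> values: read only the fields the caller asks for.
function readFields(
  doc: ProcessedDocument,
  fieldNames: string[]
): Record<string, string | undefined> {
  const version = latestVersion(doc);
  const result: Record<string, string | undefined> = {};
  for (const fieldName of fieldNames) {
    result[fieldName] = undefined; // stays undefined when the field is absent
    for (const entity of version ? version.entities : []) {
      if (entity.name === fieldName) result[fieldName] = entity.value;
    }
  }
  return result;
}

// Made-up sample: version v2 holds the reviewer's corrected income.
const sampleDocument: ProcessedDocument = {
  id: "doc-001",
  versions: [
    {
      id: "v1",
      createdAt: "2025-01-10T09:00:00Z",
      entities: [
        { id: "e1", name: "gross_family_income", value: "52000" },
        { id: "e2", name: "number_of_kids", value: "3" },
      ],
    },
    {
      id: "v2",
      createdAt: "2025-01-10T09:05:00Z",
      entities: [
        { id: "e3", name: "gross_family_income", value: "25000" },
        { id: "e4", name: "number_of_kids", value: "3" },
      ],
    },
  ],
};

const ruleInput = readFields(sampleDocument, ["gross_family_income", "number_of_kids"]);
console.log(ruleInput);
```

Selecting fields by name keeps the caller independent of the rest of the schema, so adding new fields to the schema later does not affect the rules engine input.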

3. Configuring Email Ingestion: We configured the “Channels” feature to monitor an email inbox. This required registering an app in Microsoft Azure, granting Mail.ReadWrite permissions, and then using the Document AI Workspace to map a specific email folder to our target schema.

4. Enabling SAP Mobile Start: This was a key integration for the in-office scenario. It involved three steps:

Adding the Document AI dependency to SAP Build Work Zone within the IAS console.

Creating a specific BTP destination (sapdocumentaimobilestartintegration).

Configuring the app tile in the Work Zone Site Manager with the correct navigation and visualization parameters (e.g., mobilestart.type = documentAi).

5. Automating Logic with SAP Build Process Automation: To keep our application logic flexible, we externalized our validation rules. We created a process in SAP Build Process Automation with a Decision Table (Rule) to calculate the family discount based on income and number of kids. This process is exposed as an API and triggered by our CAP application.
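For readers unfamiliar with decision tables, the sketch below mimics a first-match table in plain code. It only illustrates the concept: in the demo the rule lives in SAP Build Process Automation, not in application code, and every threshold and percentage here is invented.

```typescript
// One row of a first-match decision table. All thresholds and percentages
// are invented for illustration; they are not the rule used in the demo.
interface DiscountRule {
  maxIncome: number;       // row applies when income <= maxIncome
  minKids: number;         // and number of kids >= minKids
  discountPercent: number;
}

const FAMILY_DISCOUNT_TABLE: DiscountRule[] = [
  { maxIncome: 30000, minKids: 2, discountPercent: 20 },
  { maxIncome: 30000, minKids: 1, discountPercent: 10 },
  { maxIncome: 50000, minKids: 3, discountPercent: 10 },
];

// Rows are evaluated top to bottom and the first match wins, so more
// specific rows must be listed before more general ones.
function familyDiscount(grossFamilyIncome: number, numberOfKids: number): number {
  for (const rule of FAMILY_DISCOUNT_TABLE) {
    if (grossFamilyIncome <= rule.maxIncome && numberOfKids >= rule.minKids) {
      return rule.discountPercent;
    }
  }
  return 0; // no row matched: no discount
}

// Extracted values arrive as strings, so validate and parse them first.
function discountFromExtractedValues(income: string, kids: string): number {
  const parsedIncome = Number(income);
  const parsedKids = Number(kids);
  const blank = income.trim() === "" || kids.trim() === "";
  if (blank || !isFinite(parsedIncome) || !isFinite(parsedKids)) {
    throw new Error("Extracted values are not numeric; route to manual review");
  }
  return familyDiscount(parsedIncome, parsedKids);
}

console.log(discountFromExtractedValues("25000", "3"));
```

Keeping the table as data rather than nested conditionals is what makes it easy for a non-developer to change, which is the same reason the demo externalizes the rule.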

Part 4: Conclusion, Best Practices, and What’s Next

We concluded with some key advice and a look ahead.

Best Practices & Responsible AI

Quality In, Quality Out: The accuracy of extraction heavily depends on the quality of your scans.

Human in the Loop: While automation is powerful, always design your processes with a “human in the loop” for validation, especially for critical data.

Responsible AI: We reiterated SAP’s commitment to AI ethics. External LLM providers are carefully selected and are not permitted to store or retrain their models on customer data. Features like continuous data sharing for model improvement are strictly opt-in and should not be used for sensitive personal data.

Get Started and Look Ahead

We are committed to helping our partners build on this platform. Here are the resources to get you started:

SAP Document AI Official Help Page: https://url.sap/imohkv

Vertigo Travels GitHub Repository: https://url.sap/pi5vwn (The code for our demo app will be available here!)

SAP Community for AI/ML: https://url.sap/ai-ml-grp

The roadmap for SAP Document AI is packed with features, including more pre-built schemas, document workflows, custom LLM prompting, and deeper integration with Joule.

Finally, don’t miss SAP TechEd Virtual! Be sure to check out sessions AI821v and AI102v for the latest roadmap and deep-dive innovations from the product team.

Please feel free to leave any questions or feedback in the comments below!