Consume SAP BTP Document Information Extraction service for custom documents in ABAP


Welcome folks!

This is the third and final blog in the series End-To-End : Consume SAP BTP AI Service (Document Information Extraction) from ABAP

As the first step, you must have completed Part 1 : Setup BTP Trial Account and subscribe to Document Information Extraction service

As the second step, you must have completed Part 2 : Setup and configure custom documents on Document Information Extraction service on SAP BTP

Now that everything is set up on the BTP service layer, we will proceed to consume the service (Document Information Extraction) from the ABAP layer.

Let us first outline the actions that are required to achieve this.

1. Note critical information from BTP Service layer (Service Key)
2. Set up connectivity between ABAP and BTP service layers through RFC (SM59)
3. Upload file on ABAP layer for Extraction
4. Authenticate from ABAP to BTP service layer
5. Send File from ABAP to BTP service layer
6. Receive Extraction Results from BTP service to ABAP layer

 

Note critical information from BTP Service layer (Service Key)

Log in to the SAP BTP Cockpit and navigate to your trial account.

- Go to Trial Home.
- Navigate to the Subaccount.
- Click Instances and Subscriptions.

 

 

- Go to the application Document Information Extraction from the Instances tab.
- Click the Credentials key.
- The Credentials popup will appear. Click the Form tab.

 

 

Important: Note down the following information from this screen. We will need it later when we call the service from the ABAP layer.

- Service url
- swagger endpoint url
- uaa : clientid
- uaa : clientsecret
- uaa : authentication url (note it without https://)

 

Set up connectivity between ABAP and BTP service layers through RFC (SM59)

 

- Call transaction SM59 to set up an RFC destination for the BTP service.
- Create a new connection of type G (HTTP connection to external server).
- Name: BTP_DIE_CONNECT (you may give any name; we will need it later).

 

- Enter a description.
- In the Technical Settings tab, enter values from the previously noted information:
  - Host = uaa : authentication url
  - Port = 443
- In the Logon & Security tab, enter values from the previously noted information:
  - Radio button: Basic Authentication
  - User = uaa : clientid
  - Password = uaa : clientsecret
  - SSL: set to Active
- Save the connection.
- Click Connection Test.

 

 

The successful Connection Test screen will appear. We have now established the connection between the ABAP and BTP service layers using our unique service key.

 

 

Now that the connectivity is set up, we will begin the real fun: coding our way into consuming the service from a custom ABAP program.

- We will create a custom program through SE38.
- We will use a local class and methods in the program.
- We will not go into the basics of ABAP programming, classes, methods etc. as part of this blog.
- We will look at the program in logical sections, following the flow of execution, to understand what each part of the program does and why.

Just as a reminder of what we are trying to achieve: we are going to upload the file below using a custom program and extract the information of the mapped fields using Document Information Extraction (BTP AI Service) on the trial account.

In case you have stumbled directly upon this blog and do not have the background of our end-to-end use case, I suggest you refer to the series End-To-End : Consume SAP BTP AI Service (Document Information Extraction) from ABAP, of which this is the third and final blog.

Upload File:

 

Program Result:

 

Upload file on ABAP layer for Extraction

- Set a selection screen parameter and use method cl_gui_frontend_services=>file_open_dialog to browse for the PDF file (marksheet) on the presentation layer (desktop).
- Use method cl_gui_frontend_services=>gui_upload with file type BIN (binary) to upload the file content into an internal table.
- Use function module SCMS_BINARY_TO_XSTRING to convert the uploaded binary content into an XSTRING variable, say l_file_content.
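The upload steps above could look roughly like the following. This is only a sketch, not the original program: the report name zdie_upload_demo, the parameter p_file and the helper variables lt_files, lt_bin and lv_length are names assumed for illustration, and the inline table expression needs ABAP 7.40 or later.

```abap
REPORT zdie_upload_demo. " hypothetical report name

PARAMETERS p_file TYPE string LOWER CASE OBLIGATORY.

DATA: lt_files       TYPE filetable,
      lv_rc          TYPE i,
      lt_bin         TYPE solix_tab,
      lv_length      TYPE i,
      l_file_content TYPE xstring.

" F4 help on the parameter: let the user pick the PDF on the desktop
AT SELECTION-SCREEN ON VALUE-REQUEST FOR p_file.
  cl_gui_frontend_services=>file_open_dialog(
    EXPORTING
      window_title = 'Select marksheet PDF'
      file_filter  = 'PDF (*.pdf)|*.pdf|'
    CHANGING
      file_table   = lt_files
      rc           = lv_rc
    EXCEPTIONS
      OTHERS       = 1 ).
  IF sy-subrc = 0 AND lv_rc > 0.
    p_file = lt_files[ 1 ]-filename.
  ENDIF.

START-OF-SELECTION.
  " Read the file as binary into an internal table
  cl_gui_frontend_services=>gui_upload(
    EXPORTING
      filename   = p_file
      filetype   = 'BIN'
    IMPORTING
      filelength = lv_length
    CHANGING
      data_tab   = lt_bin
    EXCEPTIONS
      OTHERS     = 1 ).
  IF sy-subrc <> 0.
    MESSAGE 'File upload failed' TYPE 'E'.
  ENDIF.

  " Convert the binary table into one XSTRING
  CALL FUNCTION 'SCMS_BINARY_TO_XSTRING'
    EXPORTING
      input_length = lv_length
    IMPORTING
      buffer       = l_file_content
    TABLES
      binary_tab   = lt_bin
    EXCEPTIONS
      failed       = 1
      OTHERS       = 2.
  IF sy-subrc <> 0.
    MESSAGE 'Conversion to XSTRING failed' TYPE 'E'.
  ENDIF.
```

Passing the real file length from gui_upload to the function module matters, because the last row of the binary table is usually only partly filled.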

Create a local class, say lcl_doc_extract and define methods to:

- Authenticate
- Send file to BTP
- Get Template Details
- Post Document
- Check Job Status

We will implement these methods subsequently.
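As a rough orientation, the local class could be laid out as below. The method and parameter names, the attribute list and the choice of which methods are private are assumptions made for this sketch; only the class name and the five responsibilities come from the text above.

```abap
CLASS lcl_doc_extract DEFINITION FINAL.
  PUBLIC SECTION.
    METHODS authenticate
      RETURNING VALUE(rv_ok) TYPE abap_bool.
    METHODS send_file_to_btp
      IMPORTING iv_file_content TYPE xstring.

  PRIVATE SECTION.
    " State shared between the steps (names follow the text of this blog)
    DATA: lv_oauth    TYPE string,
          template_id TYPE string,
          schema_id   TYPE string,
          lv_job      TYPE string.

    METHODS get_template_details.
    METHODS post_document
      IMPORTING iv_file_content TYPE xstring.
    METHODS check_job_status.
ENDCLASS.

CLASS lcl_doc_extract IMPLEMENTATION.
  METHOD authenticate.
    " Fetch the OAuth token through the SM59 destination (see next section)
  ENDMETHOD.

  METHOD send_file_to_btp.
    " Orchestrates the three service calls in order
    get_template_details( ).
    post_document( iv_file_content ).
    check_job_status( ).
  ENDMETHOD.

  METHOD get_template_details.
    " Read template_id and schema_id from the service
  ENDMETHOD.

  METHOD post_document.
    " Send the multipart request and keep the job ID in lv_job
  ENDMETHOD.

  METHOD check_job_status.
    " Poll the job until it is DONE or FAILED
  ENDMETHOD.
ENDCLASS.
```

In the report, the class would then be used by creating an instance, calling authenticate( ) and, if it succeeds, calling send_file_to_btp( ) with the uploaded file content.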

Authenticate from ABAP to BTP service layer

- Call the authenticate method of the local class.
- Call method cl_http_client=>create_by_destination to connect through the RFC destination created previously, i.e. BTP_DIE_CONNECT, and receive the HTTP client in a local variable of type reference to if_http_client, say lo_client.
- Set up the 'POST' call with grant_type as 'client_credentials' and request_uri as '/oauth/token?grant_type=client_credentials'.
- Call methods send( ) and receive( ).
- Check the response status with lo_client->response->get_status( ).
- In case of a successful response, status 200, get the value of "access_token" from the response into a local variable, say lv_oauth.
- If authentication is successful, proceed to the next steps; otherwise give an appropriate error message based on the HTTP response.
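A minimal sketch of the authentication call is shown here. It assumes that class /ui2/cl_json is available in your system for reading the token; the structure ls_token and the variables lv_status, lv_reason and lv_msg are illustrative names. The grant type is passed in the request URI, as described above.

```abap
DATA: lo_client TYPE REF TO if_http_client,
      lv_status TYPE i,
      lv_reason TYPE string,
      lv_msg    TYPE string,
      lv_oauth  TYPE string.

" Only the field we need from the token response
DATA: BEGIN OF ls_token,
        access_token TYPE string,
      END OF ls_token.

" The destination carries host, port, SSL and client ID / secret
cl_http_client=>create_by_destination(
  EXPORTING
    destination = 'BTP_DIE_CONNECT'
  IMPORTING
    client      = lo_client
  EXCEPTIONS
    OTHERS      = 1 ).
IF sy-subrc <> 0.
  MESSAGE 'RFC destination BTP_DIE_CONNECT could not be opened' TYPE 'E'.
ENDIF.

lo_client->request->set_method( if_http_request=>co_request_method_post ).
cl_http_utility=>set_request_uri(
  request = lo_client->request
  uri     = '/oauth/token?grant_type=client_credentials' ).

lo_client->send( EXCEPTIONS OTHERS = 1 ).
IF sy-subrc = 0.
  lo_client->receive( EXCEPTIONS OTHERS = 1 ).
ENDIF.
IF sy-subrc <> 0.
  lo_client->get_last_error( IMPORTING message = lv_msg ).
  lo_client->close( ).
  MESSAGE lv_msg TYPE 'E'.
ENDIF.

lo_client->response->get_status( IMPORTING code   = lv_status
                                           reason = lv_reason ).
IF lv_status = 200.
  " Pick access_token out of the JSON response
  /ui2/cl_json=>deserialize(
    EXPORTING json = lo_client->response->get_cdata( )
    CHANGING  data = ls_token ).
  lv_oauth = ls_token-access_token.
  lo_client->close( ).
ELSE.
  lv_msg = |Authentication failed: { lv_status } { lv_reason }|.
  lo_client->close( ).
  MESSAGE lv_msg TYPE 'E'.
ENDIF.
```

Because the credentials live in the SM59 destination, the program itself never handles the client secret.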

Send File from ABAP to BTP service layer

- Call the Send file to BTP method of the local class.
- We have created the Template and Schema in the previous parts of our blog series. Programmatically, we first need to get the Template ID and Schema ID for our use case, which will be used later when calling the API to get the extraction details.
- Call the Get Template Details method of the local class.
- Use the cl_http_client=>create_by_url method to create an HTTP client to call the API endpoint Get Template (refer to the API documentation).
- Set header fields for the 'GET' call, with request_uri as the concatenation of the swagger url from the BTP service key, i.e. '/document-information-extraction/v1', and the URL endpoint path '/templates'.
- Get the record from the response JSON whose name (Template Name) is 'Template_Marksheet' (created previously for our use case in this blog series). Save the corresponding id as template_id and schemaId as schema_id in local variables. We will need these in the next steps.
- Now that we have the uploaded PDF file content (l_file_content), the authentication token (lv_oauth), the Template ID (template_id) and the Schema ID (schema_id), we are ready to send the document to BTP for OCR extraction.
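The template lookup could be sketched as follows. Several details here are assumptions to check against the API documentation: the clientId query parameter, the "results" array wrapping the templates in the response, and the Bearer prefix on the Authorization header. The service URL is a placeholder for the value from your service key, and /ui2/cl_json is assumed to be available.

```abap
CONSTANTS: lc_service_url TYPE string VALUE 'https://<service-url-from-service-key>',
           lc_swagger     TYPE string VALUE '/document-information-extraction/v1'.

TYPES: BEGIN OF ty_template,
         id        TYPE string,
         name      TYPE string,
         schema_id TYPE string, " JSON key schemaId
       END OF ty_template.

DATA: lo_client   TYPE REF TO if_http_client,
      lv_status   TYPE i,
      lv_oauth    TYPE string, " filled by the authenticate step
      template_id TYPE string,
      schema_id   TYPE string.

" Assumed response shape: { "results": [ { "id": ..., "name": ..., "schemaId": ... } ] }
DATA: BEGIN OF ls_templates,
        results TYPE STANDARD TABLE OF ty_template WITH DEFAULT KEY,
      END OF ls_templates.

cl_http_client=>create_by_url(
  EXPORTING
    url    = lc_service_url
  IMPORTING
    client = lo_client
  EXCEPTIONS
    OTHERS = 1 ).
IF sy-subrc <> 0.
  MESSAGE 'HTTP client could not be created' TYPE 'E'.
ENDIF.

lo_client->request->set_method( if_http_request=>co_request_method_get ).
cl_http_utility=>set_request_uri(
  request = lo_client->request
  uri     = lc_swagger && '/templates?clientId=default' ). " clientId is an assumption
lo_client->request->set_header_field(
  name  = 'Authorization'
  value = |Bearer { lv_oauth }| ).

lo_client->send( EXCEPTIONS OTHERS = 1 ).
IF sy-subrc = 0.
  lo_client->receive( EXCEPTIONS OTHERS = 1 ).
ENDIF.
IF sy-subrc = 0.
  lo_client->response->get_status( IMPORTING code = lv_status ).
ENDIF.

IF lv_status = 200.
  " camel_case maps schemaId to the ABAP component SCHEMA_ID
  /ui2/cl_json=>deserialize(
    EXPORTING
      json        = lo_client->response->get_cdata( )
      pretty_name = /ui2/cl_json=>pretty_mode-camel_case
    CHANGING
      data        = ls_templates ).

  READ TABLE ls_templates-results INTO DATA(ls_template)
       WITH KEY name = 'Template_Marksheet'.
  IF sy-subrc = 0.
    template_id = ls_template-id.
    schema_id   = ls_template-schema_id.
  ENDIF.
ENDIF.
lo_client->close( ).
```

An HTTPS call via create_by_url also needs the server certificate to be trusted in the system's SSL client configuration.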

- Use the cl_http_client=>create_by_url method to create an HTTP client to call the API endpoint Upload Document (refer to the API documentation).
- This API expects a multipart payload, as you can see in the documentation, so we will send the information as a multipart HTTP request.
- Set header fields for the 'POST' call: request_uri as the concatenation of the swagger url from the BTP service key, i.e. '/document-information-extraction/v1', and the URL endpoint path '/document/jobs'; Authorization as lv_oauth (the access token, passed as a Bearer token).
- Set the request content type as if_rest_media_type=>gc_multipart_form_data and the form field encoding as cl_http_request=>if_http_entity~co_encoding_raw.
- First, pass the options content to the request using method lo_client->request->add_multipart( ).
- Fill the options section of the payload with clientId as default, documentType as custom, templateId as template_id and schemaId as schema_id. Note: as we are going to extract the header and item fields from the document based on a specific template and schema, we don't need to pass the headerFields and lineItemFields parameters; otherwise they are mandatory.
- Set header field Content-Disposition with value |form-data; name="options"; type=application/json|.
- Use method set_cdata( ) to set the options on this part of the request.
- Next, pass the file content to the request using method lo_client->request->add_multipart( ).
- Pass the file content l_file_content together with its length (xstrlen) using method set_data( ).
- Set header field Content-Disposition with value |form-data; name="file"; filename=yourfilename.pdf; type='application/pdf'|.
- Call methods send( ) and receive( ).
- Check the response status with lo_client->response->get_status( ).
- In case of a successful response, status 201, get the value of id (Job ID) from the response into a local variable, say lv_job.
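Put together, the multipart upload could be sketched like this. It is an illustration under assumptions rather than the original program: the service URL is a placeholder, the Bearer prefix and the additional set_content_type( ) calls on the parts are choices made for this sketch, and /ui2/cl_json is assumed to be available for reading the job ID.

```abap
CONSTANTS: lc_service_url TYPE string VALUE 'https://<service-url-from-service-key>',
           lc_swagger     TYPE string VALUE '/document-information-extraction/v1'.

DATA: lo_client      TYPE REF TO if_http_client,
      lo_part        TYPE REF TO if_http_entity,
      l_file_content TYPE xstring, " filled by the upload step
      lv_oauth       TYPE string,  " filled by the authenticate step
      template_id    TYPE string,  " filled by Get Template Details
      schema_id      TYPE string,
      lv_options     TYPE string,
      lv_status      TYPE i,
      lv_job         TYPE string.

DATA: BEGIN OF ls_job,
        id TYPE string,
      END OF ls_job.

cl_http_client=>create_by_url(
  EXPORTING url = lc_service_url
  IMPORTING client = lo_client
  EXCEPTIONS OTHERS = 1 ).
IF sy-subrc <> 0.
  MESSAGE 'HTTP client could not be created' TYPE 'E'.
ENDIF.

lo_client->request->set_method( if_http_request=>co_request_method_post ).
cl_http_utility=>set_request_uri(
  request = lo_client->request
  uri     = lc_swagger && '/document/jobs' ).
lo_client->request->set_header_field(
  name  = 'Authorization'
  value = |Bearer { lv_oauth }| ).
lo_client->request->set_content_type( if_rest_media_type=>gc_multipart_form_data ).
lo_client->request->set_formfield_encoding(
  cl_http_request=>if_http_entity~co_encoding_raw ).

" Part 1: options as JSON text
lv_options = |\{"clientId":"default","documentType":"custom",| &&
             |"templateId":"{ template_id }","schemaId":"{ schema_id }"\}|.
lo_part = lo_client->request->add_multipart( ).
lo_part->set_header_field(
  name  = 'Content-Disposition'
  value = 'form-data; name="options"; type=application/json' ).
lo_part->set_content_type( 'application/json' ).
lo_part->set_cdata( lv_options ).

" Part 2: the PDF as binary data
lo_part = lo_client->request->add_multipart( ).
lo_part->set_header_field(
  name  = 'Content-Disposition'
  value = 'form-data; name="file"; filename=yourfilename.pdf; type=application/pdf' ).
lo_part->set_content_type( 'application/pdf' ).
lo_part->set_data( data   = l_file_content
                   offset = 0
                   length = xstrlen( l_file_content ) ).

lo_client->send( EXCEPTIONS OTHERS = 1 ).
IF sy-subrc = 0.
  lo_client->receive( EXCEPTIONS OTHERS = 1 ).
ENDIF.
IF sy-subrc = 0.
  lo_client->response->get_status( IMPORTING code = lv_status ).
ENDIF.

IF lv_status = 201.
  " The job ID is needed to poll for the result
  /ui2/cl_json=>deserialize(
    EXPORTING json = lo_client->response->get_cdata( )
    CHANGING  data = ls_job ).
  lv_job = ls_job-id.
ENDIF.
lo_client->close( ).
```

Each call to add_multipart( ) returns its own entity, so the header and body of the options part and the file part are set independently.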

 

- Call the Check Job Status method of the local class.
- Use lv_job (Job ID) received in the previous step to fetch the job status.
- Use the cl_http_client=>create_by_url method to create an HTTP client to call the API endpoint that returns the document job (refer to the API documentation).
- Set header fields for the 'GET' call: Authorization as lv_oauth, and request_uri as the concatenation of the swagger url from the BTP service key, i.e. '/document-information-extraction/v1', the URL endpoint path '/document/jobs/' and the Job ID lv_job.
- Call methods send( ) and receive( ).
- Receive the JSON object as, say, lo_json_response.
- In case of a successful response, status 200, get the value of status (the job status) from the response into a local variable, say lv_job_status.
- If lv_job_status = 'PENDING', the document information extraction for the document we sent is not yet complete. In this case, wait for 3 seconds and try again to get the job status.
- If lv_job_status = 'FAILED', the document information extraction for the document we sent has failed. Read the error message from the response and display it to the user.
- If lv_job_status = 'DONE', the document information extraction for the document we sent has completed.
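The polling logic above might be sketched as follows. The upper bound of 20 attempts is an assumption added so the loop cannot run forever, and any status other than DONE or FAILED is treated as "still in progress". The raw JSON text is kept in lv_json_response, which plays the role of lo_json_response in the description; /ui2/cl_json is again assumed to be available.

```abap
CONSTANTS: lc_service_url TYPE string VALUE 'https://<service-url-from-service-key>',
           lc_swagger     TYPE string VALUE '/document-information-extraction/v1'.

DATA: lo_client        TYPE REF TO if_http_client,
      lv_oauth         TYPE string, " filled by the authenticate step
      lv_job           TYPE string, " filled by the Post Document step
      lv_status        TYPE i,
      lv_json_response TYPE string,
      lv_job_status    TYPE string.

DATA: BEGIN OF ls_job,
        status TYPE string,
      END OF ls_job.

DO 20 TIMES. " assumed limit, about one minute in total
  " A fresh client per attempt keeps the request state simple
  cl_http_client=>create_by_url(
    EXPORTING url = lc_service_url
    IMPORTING client = lo_client
    EXCEPTIONS OTHERS = 1 ).
  IF sy-subrc <> 0.
    EXIT.
  ENDIF.

  lo_client->request->set_method( if_http_request=>co_request_method_get ).
  cl_http_utility=>set_request_uri(
    request = lo_client->request
    uri     = lc_swagger && '/document/jobs/' && lv_job ).
  lo_client->request->set_header_field(
    name  = 'Authorization'
    value = |Bearer { lv_oauth }| ).

  lo_client->send( EXCEPTIONS OTHERS = 1 ).
  IF sy-subrc = 0.
    lo_client->receive( EXCEPTIONS OTHERS = 1 ).
  ENDIF.
  IF sy-subrc <> 0.
    lo_client->close( ).
    EXIT.
  ENDIF.

  lo_client->response->get_status( IMPORTING code = lv_status ).
  lv_json_response = lo_client->response->get_cdata( ).
  lo_client->close( ).

  IF lv_status <> 200.
    EXIT.
  ENDIF.

  CLEAR ls_job.
  /ui2/cl_json=>deserialize(
    EXPORTING json = lv_json_response
    CHANGING  data = ls_job ).
  lv_job_status = ls_job-status.

  IF lv_job_status = 'DONE' OR lv_job_status = 'FAILED'.
    EXIT. " final state reached, lv_json_response holds the full result
  ENDIF.

  WAIT UP TO 3 SECONDS. " still in progress, try again
ENDDO.
```

After the loop, lv_job_status tells the caller whether to parse the result, show the error from the response, or report a timeout.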

Receive Extraction Results from BTP service to ABAP layer

Once the job status is DONE in the previous step, the details of the extracted fields are available in the response object lo_json_response.

As the extracted data is received in a local object in JSON format, it can be handled like any other JSON to ABAP structure/variable conversion, through deserialization and parsing. This data can now be used for business purposes as required.

We will not go into the details here, as there is already ample documentation and guidance available on this in our community.
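For orientation only, here is one way the result could be deserialized with /ui2/cl_json. The response shape used below (an "extraction" object with "headerFields" and "lineItems") is an assumption to verify against the API documentation. The sample JSON, including the field names studentName, subject and marks, is made up for this sketch and is not real service output.

```abap
TYPES: BEGIN OF ty_field,
         name  TYPE string,
         value TYPE string,
       END OF ty_field,
       ty_fields TYPE STANDARD TABLE OF ty_field WITH DEFAULT KEY,
       BEGIN OF ty_extraction,
         header_fields TYPE ty_fields,                                 " JSON key headerFields
         line_items    TYPE STANDARD TABLE OF ty_fields WITH DEFAULT KEY, " JSON key lineItems
       END OF ty_extraction,
       BEGIN OF ty_result,
         status     TYPE string,
         extraction TYPE ty_extraction,
       END OF ty_result.

DATA ls_result TYPE ty_result.

" Made-up sample standing in for the JSON text received from the service
DATA(lv_json_response) =
  `{"status":"DONE","extraction":{"headerFields":[` &&
  `{"name":"studentName","value":"Jane Doe"}],` &&
  `"lineItems":[[{"name":"subject","value":"Maths"},` &&
  `{"name":"marks","value":"88"}]]}}`.

" camel_case maps headerFields / lineItems to HEADER_FIELDS / LINE_ITEMS;
" JSON keys without a matching component are ignored
/ui2/cl_json=>deserialize(
  EXPORTING
    json        = lv_json_response
    pretty_name = /ui2/cl_json=>pretty_mode-camel_case
  CHANGING
    data        = ls_result ).

" Header fields: one name/value pair per extracted field
LOOP AT ls_result-extraction-header_fields INTO DATA(ls_field).
  WRITE: / ls_field-name, ls_field-value.
ENDLOOP.

" Line items: one inner table per document line
LOOP AT ls_result-extraction-line_items INTO DATA(lt_row).
  LOOP AT lt_row INTO ls_field.
    WRITE: / ls_field-name, ls_field-value.
  ENDLOOP.
ENDLOOP.
```

Declaring only the components you need keeps the ABAP types small while the rest of the response is skipped.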

Congratulations!!!

With this, we have completed the blog series End-To-End : Consume SAP BTP AI Service (Document Information Extraction) from ABAP

I hope you have enjoyed this series. Feel free to share your feedback/suggestions in the comments to make this a better place for learning and growth.

Happy coding!

Tejas.