How to handle UUID in APL

Estimated read time 4 min read

As the adoption of UUIDs (Universally Unique Identifier) and GUIDs (Globally Unique Identifier) becomes prevalent, APL users may face the case where their training dataset contains such IDs. These unique identifiers act as a primary key to differentiate each row in your invoices table for example, or your customers table. This blog shows how APL can work with input data containing a UUID. We will cover the two phases of machine learning: Train and Predict.

The UUID in SAP HANA

If you run the following SQL in SAP HANA:

 

select SYSUUID as “MY_UUID” from dummy

 

you obtain a big alpha-numeric value:

 

 

Now, create a view from it just to check the datatype:

 

create view MY_APL_VIEW as
select SYSUUID as “MY_UUID” from dummy;

select view_name, column_name, data_type_name
from view_columns where view_name = ‘MY_APL_VIEW’;

 

 

 

The UUID is stored as a VARBINARY, but this datatype is not supported in APL, therefore we will convert the UUID into a hexadecimal string. This conversion can be done with the function: bintohex().

 

Training dataset with a UUID

We need to train a classification model against this dataset:

 

 

 

 

 

Because the table contains a UUID, we must convert that column using the script below:

 

select
BINTOHEX(“UUID”) as “Id”,
“Days_to_Report”,
“Bodily_Injury”,
“Property_Damage”,
“Age”,
“Gender”,
“Income”,
“Fraud”
from
“APL_SAMPLES”.”ACCIDENT_CLAIMS”

 

Make the conversion either via a SQL view if you are working with APL SQL, or via a dataframe if you are working with APL Python. An example of dataframe based on a select statement is shown here:

https://community.sap.com/t5/technology-blogs-by-sap/multiclass-classification-with-apl-automated-predictive-library/ba-p/13545072

The SQL view for training the APL model looks like this:

 

 

 

 

 

Inference dataset with a UUID

The inference dataset requires the same conversion before you can run the APL predictions:

 

select
BINTOHEX(“UUID”) as “Id”,
“Days_to_Report”,
“Bodily_Injury”,
“Property_Damage”,
“Age”,
“Gender”,
“Income”
from
“APL_SAMPLES”.”ACCIDENT_NEW_CLAIMS”

 

The APL output from predicting claims fraud contains Id values that are in hexadecimal:

 

 

 

We must convert back the Id so that the APL output table can be joined to the input table:

 

select N.*, “gb_decision_Fraud” as “Prediction”
from APPLY_OUT O, APL_SAMPLES.ACCIDENT_NEW_CLAIMS N
where HEXTOBIN(O.”Id”) = N.”UUID”

 

 

To know more about APL

 

​ As the adoption of UUIDs (Universally Unique Identifier) and GUIDs (Globally Unique Identifier) becomes prevalent, APL users may face the case where their training dataset contains such IDs. These unique identifiers act as a primary key to differentiate each row in your invoices table for example, or your customers table. This blog shows how APL can work with input data containing a UUID. We will cover the two phases of machine learning: Train and Predict.The UUID in SAP HANAIf you run the following SQL in SAP HANA: select SYSUUID as “MY_UUID” from dummy you obtain a big alpha-numeric value:  Now, create a view from it just to check the datatype: create view MY_APL_VIEW as
select SYSUUID as “MY_UUID” from dummy;

select view_name, column_name, data_type_name
from view_columns where view_name = ‘MY_APL_VIEW’;   The UUID is stored as a VARBINARY, but this datatype is not supported in APL, therefore we will convert the UUID into a hexadecimal string. This conversion can be done with the function: bintohex(). Training dataset with a UUIDWe need to train a classification model against this dataset:     Because the table contains a UUID, we must convert that column using the script below: select
BINTOHEX(“UUID”) as “Id”,
“Days_to_Report”,
“Bodily_Injury”,
“Property_Damage”,
“Age”,
“Gender”,
“Income”,
“Fraud”
from
“APL_SAMPLES”.”ACCIDENT_CLAIMS” Make the conversion either via a SQL view if you are working with APL SQL, or via a dataframe if you are working with APL Python. An example of dataframe based on a select statement is shown here:https://community.sap.com/t5/technology-blogs-by-sap/multiclass-classification-with-apl-automated-predictive-library/ba-p/13545072The SQL view for training the APL model looks like this:     Inference dataset with a UUIDThe inference dataset requires the same conversion before you can run the APL predictions: select
BINTOHEX(“UUID”) as “Id”,
“Days_to_Report”,
“Bodily_Injury”,
“Property_Damage”,
“Age”,
“Gender”,
“Income”
from
“APL_SAMPLES”.”ACCIDENT_NEW_CLAIMS” The APL output from predicting claims fraud contains Id values that are in hexadecimal:   We must convert back the Id so that the APL output table can be joined to the input table: select N.*, “gb_decision_Fraud” as “Prediction”
from APPLY_OUT O, APL_SAMPLES.ACCIDENT_NEW_CLAIMS N
where HEXTOBIN(O.”Id”) = N.”UUID”   To know more about APL   Read More Technology Blogs by SAP articles 

#SAP

#SAPTechnologyblog

You May Also Like

More From Author