As the adoption of UUIDs (Universally Unique Identifier) and GUIDs (Globally Unique Identifier) becomes prevalent, APL users may face the case where their training dataset contains such IDs. These unique identifiers act as a primary key to differentiate each row in your invoices table for example, or your customers table. This blog shows how APL can work with input data containing a UUID. We will cover the two phases of machine learning: Train and Predict.
The UUID in SAP HANA
If you run the following SQL in SAP HANA:
select SYSUUID as “MY_UUID” from dummy
you obtain a big alpha-numeric value:
Now, create a view from it just to check the datatype:
create view MY_APL_VIEW as
select SYSUUID as “MY_UUID” from dummy;
select view_name, column_name, data_type_name
from view_columns where view_name = ‘MY_APL_VIEW’;
The UUID is stored as a VARBINARY, but this datatype is not supported in APL, therefore we will convert the UUID into a hexadecimal string. This conversion can be done with the function: bintohex().
Training dataset with a UUID
We need to train a classification model against this dataset:
Because the table contains a UUID, we must convert that column using the script below:
select
BINTOHEX(“UUID”) as “Id”,
“Days_to_Report”,
“Bodily_Injury”,
“Property_Damage”,
“Age”,
“Gender”,
“Income”,
“Fraud”
from
“APL_SAMPLES”.”ACCIDENT_CLAIMS”
Make the conversion either via a SQL view if you are working with APL SQL, or via a dataframe if you are working with APL Python. An example of dataframe based on a select statement is shown here:
The SQL view for training the APL model looks like this:
Inference dataset with a UUID
The inference dataset requires the same conversion before you can run the APL predictions:
select
BINTOHEX(“UUID”) as “Id”,
“Days_to_Report”,
“Bodily_Injury”,
“Property_Damage”,
“Age”,
“Gender”,
“Income”
from
“APL_SAMPLES”.”ACCIDENT_NEW_CLAIMS”
The APL output from predicting claims fraud contains Id values that are in hexadecimal:
We must convert back the Id so that the APL output table can be joined to the input table:
select N.*, “gb_decision_Fraud” as “Prediction”
from APPLY_OUT O, APL_SAMPLES.ACCIDENT_NEW_CLAIMS N
where HEXTOBIN(O.”Id”) = N.”UUID”
As the adoption of UUIDs (Universally Unique Identifier) and GUIDs (Globally Unique Identifier) becomes prevalent, APL users may face the case where their training dataset contains such IDs. These unique identifiers act as a primary key to differentiate each row in your invoices table for example, or your customers table. This blog shows how APL can work with input data containing a UUID. We will cover the two phases of machine learning: Train and Predict.The UUID in SAP HANAIf you run the following SQL in SAP HANA: select SYSUUID as “MY_UUID” from dummy you obtain a big alpha-numeric value: Now, create a view from it just to check the datatype: create view MY_APL_VIEW as
select SYSUUID as “MY_UUID” from dummy;
select view_name, column_name, data_type_name
from view_columns where view_name = ‘MY_APL_VIEW’; The UUID is stored as a VARBINARY, but this datatype is not supported in APL, therefore we will convert the UUID into a hexadecimal string. This conversion can be done with the function: bintohex(). Training dataset with a UUIDWe need to train a classification model against this dataset: Because the table contains a UUID, we must convert that column using the script below: select
BINTOHEX(“UUID”) as “Id”,
“Days_to_Report”,
“Bodily_Injury”,
“Property_Damage”,
“Age”,
“Gender”,
“Income”,
“Fraud”
from
“APL_SAMPLES”.”ACCIDENT_CLAIMS” Make the conversion either via a SQL view if you are working with APL SQL, or via a dataframe if you are working with APL Python. An example of dataframe based on a select statement is shown here:https://community.sap.com/t5/technology-blogs-by-sap/multiclass-classification-with-apl-automated-predictive-library/ba-p/13545072The SQL view for training the APL model looks like this: Inference dataset with a UUIDThe inference dataset requires the same conversion before you can run the APL predictions: select
BINTOHEX(“UUID”) as “Id”,
“Days_to_Report”,
“Bodily_Injury”,
“Property_Damage”,
“Age”,
“Gender”,
“Income”
from
“APL_SAMPLES”.”ACCIDENT_NEW_CLAIMS” The APL output from predicting claims fraud contains Id values that are in hexadecimal: We must convert back the Id so that the APL output table can be joined to the input table: select N.*, “gb_decision_Fraud” as “Prediction”
from APPLY_OUT O, APL_SAMPLES.ACCIDENT_NEW_CLAIMS N
where HEXTOBIN(O.”Id”) = N.”UUID” To know more about APL Read More Technology Blogs by SAP articles
#SAP
#SAPTechnologyblog