HANA Cloud Data Lake – SQL on Files (CSV, PARQUET) changes in QRC 4/2024

As per SAP Note 3508244, the Data Lake Relational Engine native SQL on Files feature is deprecated as of QRC 4/2024. After this change, database users must access the Data Lake Files container through SAP HANA Cloud, SAP HANA Database.

This post explains the change and walks through using SAP HANA Cloud, SAP HANA Database native SQL on Files in place of the Data Lake Relational Engine native SQL on Files.

The SAP HANA native SQL on Files feature allows SAP HANA Database users to run queries directly on open-format files (CSV, PARQUET) stored in Data Lake Files. There is no need to import the files into the SAP HANA Database, and no need to configure or use the Data Lake Relational Engine. The files can remain in the existing Data Lake Files container, but starting with QRC 4/2024, queries must be executed from SAP HANA Cloud, SAP HANA Database rather than from the Data Lake Relational Engine. These queries use SAP HANA native SQL, replacing Data Lake Relational Engine native SQL.

Perform the following steps to use SAP HANA native SQL on Files:

Step 1: Log in to the Database Explorer of SAP HANA Cloud, SAP HANA Database and open the SQL Console. Make sure the source data file is present in the SAP HANA Cloud, Data Lake Files container.

Step 2: Create a PSE. Use the following SQL statement to create a PSE named SOF_TEST.

 

CREATE PSE SOF_TEST;

 

Note that SOF_TEST is an example name. Replace it with the required PSE name.
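To confirm the PSE was created, you can query the PSES system view. This is a sketch, assuming the default SAP HANA Cloud system views are accessible to your user:

```sql
-- List the PSE just created; an empty result means creation failed
SELECT * FROM PSES WHERE NAME = 'SOF_TEST';
```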

Step 3: Add the SAP HANA Cloud, Data Lake Files certificate chain using the statement below.

 

ALTER PSE SOF_TEST SET OWN CERTIFICATE
'-----BEGIN PRIVATE KEY-----
ClientKey
-----END PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
MainCertificate
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
IntermediateCertificate
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
RootCertificate
-----END CERTIFICATE-----';

 

Provide the entire content of each certificate between -----BEGIN CERTIFICATE----- and -----END CERTIFICATE-----, and the private key between -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY-----.

Step 4: Create the remote source. Run the following SQL statement to create a remote source in the SAP HANA Cloud database for the Data Lake Files container.

 

CREATE REMOTE SOURCE REMOTE_SOF ADAPTER "file" CONFIGURATION '
provider=hdlf;
endpoint=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.files.hdl.prod-eu10.hanacloud.ondemand.com;'
WITH CREDENTIAL TYPE 'X509' PSE SOF_TEST;

 

Replace the SAP HANA Cloud, Data Lake Files REST API endpoint with the endpoint of your own instance.
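Once created, the remote source should be visible in the REMOTE_SOURCES system view. A quick check, assuming the standard SAP HANA Cloud system views:

```sql
-- Verify the remote source exists and uses the "file" adapter
SELECT REMOTE_SOURCE_NAME, ADAPTER_NAME
FROM REMOTE_SOURCES
WHERE REMOTE_SOURCE_NAME = 'REMOTE_SOF';
```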

Step 5: Create a virtual table. After the remote source is set up, create a virtual table in SAP HANA Cloud, SAP HANA Database that points to the file path in the remote source.

If a CSV file is present in the SAP HANA Cloud, Data Lake Files root path (/), run the query below.

 

CREATE VIRTUAL TABLE VT_EMPLOYEE (EMPLOYEEID INT PRIMARY KEY, FIRSTNAME VARCHAR(50), LASTNAME VARCHAR(50)) AT REMOTE_SOF."/EMPLOYEE.csv" AS CSV FIELD DELIMITED BY ',';

 

If a PARQUET file is present in the SAP HANA Cloud, Data Lake Files root path (/), run the query below.

 

CREATE VIRTUAL TABLE VT_EMPLOYEE (EMPLOYEEID INT PRIMARY KEY, FIRSTNAME VARCHAR(50), LASTNAME VARCHAR(50)) AT REMOTE_SOF."/EMPLOYEE.parquet" AS PARQUET;

 

Note the following:

- The table name VT_EMPLOYEE and the columns EMPLOYEEID, FIRSTNAME, and LASTNAME are examples. Replace them with the required table name and columns.
- The path and file names /EMPLOYEE.csv and /EMPLOYEE.parquet are examples. Replace them with the complete SAP HANA Cloud, Data Lake path and a valid file name.
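Files do not have to sit in the root path. As a sketch, if a PARQUET file were stored under a hypothetical /hr/2024/ directory in the Data Lake Files container, the virtual table would reference the full path:

```sql
-- Hypothetical example: file stored under /hr/2024/ instead of the root path
CREATE VIRTUAL TABLE VT_EMPLOYEE_2024 (EMPLOYEEID INT PRIMARY KEY, FIRSTNAME VARCHAR(50), LASTNAME VARCHAR(50))
AT REMOTE_SOF."/hr/2024/EMPLOYEE.parquet" AS PARQUET;
```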

Step 6: Query the virtual table from the SAP HANA Cloud, SAP HANA Database SQL Console.

 

SELECT * FROM VT_EMPLOYEE;

 

The data present in the SAP HANA Cloud, Data Lake files (CSV or PARQUET) is displayed as the content of the virtual table VT_EMPLOYEE.
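Since the virtual table behaves like a regular table in queries, standard SQL clauses such as filters and ordering apply directly; the file is read at query time. A sketch, reusing the example columns from Step 5:

```sql
-- Filter and sort the file-backed data without importing it into SAP HANA
SELECT EMPLOYEEID, LASTNAME
FROM VT_EMPLOYEE
WHERE LASTNAME LIKE 'S%'
ORDER BY EMPLOYEEID;
```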

 
