Loop control using SQLScript view, Data Flow script operator & Open SQL procedure in SAP Datasphere

Estimated read time: 18 min

Loop control is an effective programming concept for repeatedly executing a piece of logic over individual segments of a full or partial dataset. When working with SQL, set-based operations and window functions can often avoid executing code blocks in loops. However, there are still use cases where loops are the practical way to achieve the desired outcome. The objective of this blog post is to explore a less discussed topic so far – using conditional loops in Datasphere – based on a hypothetical use case.
I have taken a scenario with 2 measures, namely closing stock & inbound business units, per article & store combination over a period of time, i.e. 16.02.2025 – 26.02.2025. Now I want to derive a calculated measure called interim new stock, defined as the present day's inbound quantity added to the previous day's closing stock quantity for the same article-site combination. (For the sake of simplification, I have not taken any outbound movement into consideration :)).
Let's say I have a table with the below inbound & closing stock information within my space in Datasphere.

 

Calendar_Day | Site | Article    | Unit | Purchase_Unit | Inbound_BUn | Closing_Stock
2025-02-16   | S567 | 3475896901 | EA   | CA1           | 7           | 20
2025-02-16   | S546 | 10464863   | EA   | CA1           | 3           | 24
2025-02-16   | S546 | 92105721   | EA   | CA2           | 1           | 12
2025-02-17   | S567 | 3475896901 | EA   | CA1           | 1           | 23
2025-02-17   | S546 | 10464863   | EA   | CA1           | 6           | 27
2025-02-17   | S546 | 92105721   | EA   | CA2           | 12          | 19
2025-02-18   | S567 | 3475896901 | EA   | CA1           | 2           | 19
2025-02-18   | S546 | 10464863   | EA   | CA1           | 1           | 23
2025-02-18   | S546 | 92105721   | EA   | CA2           | 7           | 21
2025-02-19   | S567 | 3475896901 | EA   | CA1           | 3           | 12
2025-02-19   | S546 | 10464863   | EA   | CA1           | 1           | 18
2025-02-19   | S546 | 92105721   | EA   | CA2           | 24          | 36
2025-02-20   | S567 | 3475896901 | EA   | CA1           | 60          | 40
2025-02-20   | S546 | 10464863   | EA   | CA1           | 9           | 20
2025-02-20   | S546 | 92105721   | EA   | CA2           | 12          | 40
2025-02-21   | S567 | 3475896901 | EA   | CA1           | 1           | 34
2025-02-21   | S546 | 10464863   | EA   | CA1           | 1           | 17
2025-02-21   | S546 | 92105721   | EA   | CA2           | 3           | 41
2025-02-22   | S567 | 3475896901 | EA   | CA1           | 6           | 20
2025-02-22   | S546 | 10464863   | EA   | CA1           | 6           | 18
2025-02-22   | S546 | 92105721   | EA   | CA2           | 10          | 32
2025-02-23   | S567 | 3475896901 | EA   | CA1           | 48          | 60
2025-02-23   | S546 | 10464863   | EA   | CA1           | 12          | 21
2025-02-23   | S546 | 92105721   | EA   | CA2           | 3           | 27
2025-02-24   | S567 | 3475896901 | EA   | CA1           | 2           | 48
2025-02-24   | S546 | 10464863   | EA   | CA1           | 5           | 20
2025-02-24   | S546 | 92105721   | EA   | CA2           | 7           | 20
2025-02-25   | S567 | 3475896901 | EA   | CA1           | 10          | 43
2025-02-25   | S546 | 10464863   | EA   | CA1           | 9           | 24
2025-02-25   | S546 | 92105721   | EA   | CA2           | 3           | 21
2025-02-26   | S567 | 3475896901 | EA   | CA1           | 4           | 32
2025-02-26   | S546 | 10464863   | EA   | CA1           | 15          | 27
2025-02-26   | S546 | 92105721   | EA   | CA2           | 12          | 30

I need a new target table which will hold the last 4 days' information from the above table & a new calculated field called New_Interim_Stock, derived as the previous day's closing stock + the present day's inbound quantity. For example, on 2025-02-26, for article 10464863 in site S546, the new interim stock would be 24 + 15 = 39. Below, I will demonstrate how we can do this by applying loops in Datasphere through different approaches.
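As mentioned in the introduction, window functions can often replace explicit loops for this kind of derivation. As a minimal, hedged sketch (using only the 2025-02-25/26 rows from the sample table above), the same calculation can be expressed in pandas with a per-group shift, the analogue of SQL's LAG() window function:

```python
import pandas as pd

# Minimal set-based sketch: the window-function alternative to a loop.
# The numbers below are copied from the sample table for 2025-02-25/26 only.
df = pd.DataFrame({
    'Calendar_Day': pd.to_datetime(['2025-02-25', '2025-02-26',
                                    '2025-02-25', '2025-02-26']),
    'Site': ['S546', 'S546', 'S567', 'S567'],
    'Article': ['10464863', '10464863', '3475896901', '3475896901'],
    'Inbound_BUn': [9, 15, 10, 4],
    'Closing_Stock': [24, 27, 43, 32],
})

df = df.sort_values(['Site', 'Article', 'Calendar_Day'])

# Per-group shift plays the role of SQL's LAG(): previous day's
# Closing_Stock plus the current day's Inbound_BUn.
df['New_Interim_Stock'] = (
    df.groupby(['Site', 'Article'])['Closing_Stock'].shift(1)
    + df['Inbound_BUn']
)

print(df)
```

For the 2025-02-26 row of article 10464863 in site S546, this yields the 24 + 15 = 39 from the worked example; the oldest day in each group gets no value because it has no predecessor in the frame. The approaches below intentionally use loops instead, to explore that option in Datasphere.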

 

Approach #1 – using SQLScript view

SAP Datasphere supports SQLScript syntax when the view editor language is set to SQLScript (Table Function). Below I have attached the view definition script with a conditional for loop which iterates over the last 4 days of the dataset & derives the calculated field New_Interim_Stock. I have created two date variable pointers, one for the current day (for the inbound quantity) & one for the previous day (for the closing stock quantity), & a self join of the same dataset restricted by these date variables individually. The for loop iterates backwards over the dataset, starting from the present day, until it completes the calculation over the last four days.

 

declare control_date date;
declare today date default current_date;
declare date_var_inbound nvarchar(8);
declare date_var_stock nvarchar(8);
declare counter int default 1;
declare result_table table (
    "Calendar_Day" date,
    "Site" nvarchar(100),
    "Article" nvarchar(100),
    "Unit" nvarchar(100),
    "Purchase_Unit" nvarchar(100),
    "Inbound_BUn" int,
    "Closing_Stock" int,
    "New_Interim_Stock" int
);

for counter in 1..4 do
    control_date := add_days(:today, -:counter);
    date_var_inbound := to_nvarchar(:control_date, 'YYYYMMDD');
    date_var_stock := to_nvarchar(add_days(:control_date, -1), 'YYYYMMDD');

    insert into :result_table
    select
        inb."Calendar_Day",
        inb."Site",
        inb."Article",
        inb."Unit",
        inb."Purchase_Unit",
        inb."Inbound_BUn",
        inb."Closing_Stock",
        sto."Closing_Stock" + inb."Inbound_BUn" as "New_Interim_Stock"
    from "inbound_stock_data" inb
    left join "inbound_stock_data" sto
        on inb."Site" = sto."Site"
        and inb."Article" = sto."Article"
        -- keep the previous-day filter in the join condition, so the
        -- left join is not silently turned into an inner join
        and sto."Calendar_Day" = :date_var_stock
    where inb."Calendar_Day" = :date_var_inbound;
end for;

return
select * from :result_table order by "Calendar_Day";

 

When the view is deployed, we can preview the data & validate the output against the data in the source table.
Data preview of SQLScript view

Note: the data preview of the view executes the conditional loop behind the scenes every time we preview data. For larger datasets, an ideal solution would be to create a data flow with this view as a source & a target table with the output structure in 'Append' data write mode. This ensures that the output data is persisted & querying it is free from the calculation latency.
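For illustration, the control flow of the SQLScript loop above can be sketched in plain Python; the fixed run date below is a hypothetical stand-in for current_date:

```python
from datetime import date, timedelta

# Hypothetical run date standing in for SQLScript's current_date
today = date(2025, 2, 27)

pairs = []
for counter in range(1, 5):  # mirrors: for counter in 1..4 do
    control_date = today - timedelta(days=counter)
    # mirrors: to_nvarchar(:control_date, 'YYYYMMDD')
    date_var_inbound = control_date.strftime('%Y%m%d')
    date_var_stock = (control_date - timedelta(days=1)).strftime('%Y%m%d')
    pairs.append((date_var_inbound, date_var_stock))

print(pairs)
```

Each iteration produces one (inbound date, stock date) pair, & each pair drives one self-joined insert into the result table.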

 

Approach #2 – using Data Flow Script operator

In this approach, I created a data flow with the source data table as a source, a target table for persisting the calculated result data & in between, the Script operator. The scripting language used here is Python.

I have attached the Python script below. The logic is that I slice the dataset to the last 4 days, counting backwards from the maximum date available in the dataset, i.e. today. This time I work with two Pandas dataframes. One of them is df_first, which holds the full input & is used to handle the oldest day of the sliced dataset: since the sliced dataset starts at 23.02.2025, the closing stocks for 22.02.2025 need to be fetched from the original dataset held in df_first. The other dataframe, df, is sliced & then grouped by Article & Site; each sorted group is held in grouped_df, which is consumed in the for loop alongside two date variables pointing to the present day's & the previous day's data rows. The derivation of New_Interim_Stock happens in the if-else block, where the first day of the data slice is handled via df_first based on an article-site combination check. Finally, New_Interim_Stock is updated in df & a result dataset sorted by Calendar_Day is returned.

 

def transform(data):

    # pandas is available as pd by default in the Script operator
    df_first = pd.DataFrame(data)
    df_first['Calendar_Day'] = pd.to_datetime(df_first['Calendar_Day'], format='%Y-%m-%d')
    df_first['Calendar_Day'] = df_first['Calendar_Day'].dt.date
    df = df_first.copy()

    # slice to the last 4 days, counting back from the newest date
    today = df['Calendar_Day'].max()
    relevant_days = [today - pd.Timedelta(days=date_diff) for date_diff in range(0, 4)]
    df = df[df['Calendar_Day'].isin(relevant_days)]

    for (article, site), group in df.groupby(['Article', 'Site']):
        grouped_df = group.sort_values(by='Calendar_Day', ascending=False)

        for date_diff in range(0, 4):
            current_day = today - pd.Timedelta(days=date_diff)
            current_day_row = grouped_df[grouped_df['Calendar_Day'] == current_day]

            prev_day = current_day - pd.Timedelta(days=1)
            prev_day_row = grouped_df[grouped_df['Calendar_Day'] == prev_day]

            if not prev_day_row.empty:
                new_stock = prev_day_row['Closing_Stock'].values[0] + current_day_row['Inbound_BUn'].values[0]
            else:
                # oldest day of the slice: fetch the previous day's closing
                # stock from the unsliced dataset
                prev_day_row_from_first = df_first[
                    (df_first['Calendar_Day'] == prev_day) &
                    (df_first['Article'] == article) &
                    (df_first['Site'] == site)
                ]
                new_stock = prev_day_row_from_first['Closing_Stock'].values[0] + current_day_row['Inbound_BUn'].values[0]

            df.loc[(df['Article'] == article) & (df['Site'] == site) & (df['Calendar_Day'] == current_day), 'New_Interim_Stock'] = new_stock

    data = df.sort_values(by='Calendar_Day')

    return data
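The same loop pattern can be rehearsed outside Datasphere with a small, entirely hypothetical sample (one article-site combination over three days, slicing only the last two), which also exercises the df_first fallback for the oldest sliced day:

```python
import pandas as pd

# Entirely hypothetical sample standing in for the Script operator input:
# one article-site combination over three days.
df_first = pd.DataFrame({
    'Calendar_Day': pd.to_datetime(['2025-02-24', '2025-02-25', '2025-02-26']).date,
    'Site': ['S546'] * 3,
    'Article': ['10464863'] * 3,
    'Inbound_BUn': [5, 9, 15],
    'Closing_Stock': [20, 24, 27],
})
df = df_first.copy()

today = df['Calendar_Day'].max()
# slice to the last 2 days (the blog's transform uses 4)
relevant_days = [today - pd.Timedelta(days=d) for d in range(0, 2)]
df = df[df['Calendar_Day'].isin(relevant_days)]

for (article, site), group in df.groupby(['Article', 'Site']):
    grouped_df = group.sort_values(by='Calendar_Day', ascending=False)
    for d in range(0, 2):
        current_day = today - pd.Timedelta(days=d)
        cur_row = grouped_df[grouped_df['Calendar_Day'] == current_day]
        prev_day = current_day - pd.Timedelta(days=1)
        prev_row = grouped_df[grouped_df['Calendar_Day'] == prev_day]
        if prev_row.empty:
            # oldest sliced day: fall back to the unsliced dataset
            prev_row = df_first[(df_first['Calendar_Day'] == prev_day) &
                                (df_first['Article'] == article) &
                                (df_first['Site'] == site)]
        new_stock = prev_row['Closing_Stock'].values[0] + cur_row['Inbound_BUn'].values[0]
        df.loc[(df['Article'] == article) & (df['Site'] == site) &
               (df['Calendar_Day'] == current_day), 'New_Interim_Stock'] = new_stock

print(df.sort_values(by='Calendar_Day'))
```

On the newest day this gives 24 + 15 = 39, matching the worked example; on the oldest sliced day the fallback picks up the prior day's closing stock from the full dataset.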

 

After the Data Flow is saved & deployed, execute it & preview the persisted output data in the target table.

Output of Script operator

 

Approach #3 – using Open SQL procedures

The last option basically utilises the conventional studio experience. In Datasphere we can create procedures in our open SQL schema if we have a DB user for the space in context or a DB user group user. Using the DB user credentials we can launch the database explorer or open the integrated HANA database explorer console in VS Code. After that we can create procedures just as in HANA. The logic here follows the same structure as in the first approach with the SQLScript view.

Remember that in this case the output table has to be created manually within the same open SQL schema attached to the space, where the procedure has write access. The open SQL schemas are access restricted, so for others to view the content of the output table, the table needs to be imported & deployed in the space, by dragging it into a view editor console for the first-time use. Also for the same reason, a local table in a space might not be readable inside the procedure from your open SQL schema unless select rights are explicitly granted. In that case, the easiest alternative is to create a consumption-enabled view on the local table; it will then be accessible to the open SQL schemas attached to the space.

Below is the script for the procedure. After the procedure is created, I simply called it in the console to update the output in the target table. For everyday usage, the procedure can be included in a task chain.

 

create procedure
"PROC_FILL_NEW_INTERIM_STOCK"
language sqlscript as
begin

    declare control_date date;
    declare today date default current_date;
    declare date_var_inbound nvarchar(8);
    declare date_var_stock nvarchar(8);
    declare counter int default 1;

    for counter in 1..4 do
        control_date := add_days(:today, -:counter);
        date_var_inbound := to_nvarchar(:control_date, 'YYYYMMDD');
        date_var_stock := to_nvarchar(add_days(:control_date, -1), 'YYYYMMDD');

        insert into table_proc_new_interim_stock
        select
            inb."Calendar_Day",
            inb."Site",
            inb."Article",
            inb."Unit",
            inb."Purchase_Unit",
            inb."Inbound_BUn",
            inb."Closing_Stock",
            sto."Closing_Stock" + inb."Inbound_BUn" as "New_Interim_Stock"
        from "Inbound_Stock_View" inb
        left join "Inbound_Stock_View" sto
            on inb."Site" = sto."Site"
            and inb."Article" = sto."Article"
            -- previous-day filter belongs in the join condition, so the
            -- left join is not silently turned into an inner join
            and sto."Calendar_Day" = :date_var_stock
        where inb."Calendar_Day" = :date_var_inbound;
    end for;

end;

 

Simply call & execute the procedure to insert the result dataset into the target table.

 

call "PROC_FILL_NEW_INTERIM_STOCK";

 

To consume the table back in the Datasphere builder application, import & deploy it & create a view on top of it.

To wind up this blog post, let me add a final comment based on my experience.

When the source data is not persisted & is accessed from a remote source, the dynamic filter push-down control of the Script operator is not as strong as that of a SQLScript view or DB procedure, where you can control the push-down by scripting in a specific manner. This observation can vary, however, based on the connection used between Datasphere & the remote source system.

