Agent Metering using GEN AI Hub

Introduction

In this blog post, we cover how to report raw metrics to generative AI hub (Gen AI Hub) for agentic AI solutions so that AI units can be calculated. Generative AI hub then forwards this data to Unified Metering, which handles the further processing and calculation of AI units.
Your application includes AI functionality that is commercialized using AI units. It must therefore report a business metric (e.g., processed documents) that reflects the business value of this AI functionality and accounts for the costs incurred by GenAI/LLM services. The AI-unit consumption per customer tenant, along with the corresponding business metric, must be represented in billing and displayed for customer transparency in the SAP for Me dashboard.
You therefore need to implement a process that transforms raw metrics into business metrics, and subsequently into AI units.

The conversion from business metrics to AI units is done by the billing engine, and we don’t cover it here.

Metering for Generative AI
Use of large language models (LLMs) is metered in tokens. Tokens are the fundamental building blocks of text that a language model processes. They represent single characters, parts of words, entire words, or even punctuation marks. They’re the unit into which the LLM breaks down text input and output. It’s important to distinguish between input and output tokens. Input tokens refer to the text provided in the prompt, while output tokens represent the response generated by the LLM. The ratio between input and output tokens varies depending on the prompt and the use case.

AI Units

SAP’s portfolio of AI offerings and products can meter usage in AI units using the Unified Metering service. An AI unit is a commercial construct that covers the costs for all AI-enabled features and offerings and that helps to provide a unified customer experience.

Business Metric Calculation

For billable metering, all of the headers below are mandatory. Requests with missing headers are not sent to Unified Metering.

Field: AI-Resource-Group
Description: Resource group identifier for the AI metering call
Allowed values / example: default

Field: X-USECASE-ID
Description: Use case / feature identifier
Allowed values / example: AI-LPRXXX-REQUEST-XXXX-25Q4

Field: X-LOCALTENANT-ID
Description: User (local) tenant ID; empty string if not available
Allowed values / example: can be fetched from cds.context?.tenant

Field: X-PRODUCT-TYPE
Description: Product identifier for metering context
Allowed values / example: Product Name

Field: X-BUSINESS-METRIC-PATTERN
Description: Metric pattern. COUNT counts business objects; PAGES counts token blocks
Allowed values / example: COUNT

Field: X-BUSINESS-METRIC-PARAM
Description: For COUNT, a time window in minutes. If it is set to 3, all calls within a 3-minute window are counted as a single call. For PAGES, the length of a block of tokens, in the range 1-100,000
Allowed values / example: 1

Field: X-BUSINESS-CONTEXT
Description: Unique identifier of the counted business object per call (generated UUID, e.g. 550e8400-e29b-41d4-a716-446655440000)
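
To make the two metric patterns more concrete, here is a small TypeScript sketch of how X-BUSINESS-METRIC-PARAM could be read in each case. It is not how generative AI hub or Unified Metering is implemented. The function names pagesForTokens and countedCalls are hypothetical, and two details are assumptions made only for illustration: a COUNT window opens at the first call that is not covered by the previous window, and PAGES is applied to one total token count.

```typescript
// Illustrative only: the real aggregation happens inside generative AI hub and
// Unified Metering. Both function names are hypothetical.

/** PAGES: number of full or partial blocks needed to cover `tokens` tokens. */
function pagesForTokens(tokens: number, blockLength: number): number {
  if (!Number.isInteger(blockLength) || blockLength < 1 || blockLength > 100000) {
    throw new RangeError('blockLength must be an integer in the range 1-100,000')
  }
  if (tokens <= 0) return 0
  // A partially filled block still counts as one block.
  return Math.ceil(tokens / blockLength)
}

/**
 * COUNT: number of counted calls when all calls inside one window of
 * `windowMinutes` collapse into a single call.
 * Assumption: a window opens at the first call not covered by the previous window.
 */
function countedCalls(callTimesMs: number[], windowMinutes: number): number {
  const windowMs = windowMinutes * 60000
  const sorted = [...callTimesMs].sort((a, b) => a - b)
  let counted = 0
  let windowEnd = -Infinity
  for (const t of sorted) {
    if (t >= windowEnd) {
      counted += 1
      windowEnd = t + windowMs
    }
  }
  return counted
}
```

Under these assumptions, 2,500 tokens with a block length of 1,000 give 3 pages, and with a window of 3 minutes, calls at minutes 0, 1 and 2 are counted once while a call at minute 4 opens a new window.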

Code Snippets

When calling the chatCompletion method, all of these header values must be sent to Gen AI Hub.

The following example shows how the headers can be filled.

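// Method of the service class that calls the LLM. Assumes `cds` is imported
// from '@sap/cds' and `this._logger` is the logger of that class.
private _getHeaders(): Record<string, string> {
    const headers: Record<string, string> = {}
    headers['AI-Resource-Group'] = 'default'

    // Use case / feature identifier; the fallback value is a placeholder
    headers['X-USECASE-ID'] = process.env.AI_CORE_FEATURE_ID ?? 'AI-XXXXXX-REQUEST-XXXX-25Q4'

    // cds.context?.tenant contains the user tenant; in case of hybrid testing, the provider tenant
    headers['X-LOCALTENANT-ID'] = cds.context?.tenant ?? ''
    headers['X-PRODUCT-TYPE'] = 'UXXX'

    // COUNT: for counting business objects in the context of the LLM call
    // PAGES: for counting the blocks or partial blocks of input and output tokens
    headers['X-BUSINESS-METRIC-PATTERN'] = 'COUNT'

    // For COUNT: time window in minutes; for PAGES: block length in tokens
    headers['X-BUSINESS-METRIC-PARAM'] = '1'

    // Identifier of the counted business object for this call
    headers['X-BUSINESS-CONTEXT'] = cds.context?.id?.toString() ?? ''

    // Optional debug output of the headers, switched on via environment variable
    if (process.env.LOG_LLM_CALL_HEADERS === 'true') {
        this._logger.debug(`LLM call headers: ${JSON.stringify(headers)}`)
    }

    return headers
}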
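
Because requests with missing headers are not sent to Unified Metering, it can help to check the header object before making the LLM call. The following is a minimal sketch of such a check, not part of any SAP library. The helper findMissingMeteringHeaders and the constants are hypothetical names, and treating an empty string as acceptable only for X-LOCALTENANT-ID is an assumption based on the "empty string if not available" note for that field.

```typescript
// Header names are taken from the field list in this post; the helper itself is hypothetical.
const MANDATORY_METERING_HEADERS = [
  'AI-Resource-Group',
  'X-USECASE-ID',
  'X-LOCALTENANT-ID',
  'X-PRODUCT-TYPE',
  'X-BUSINESS-METRIC-PATTERN',
  'X-BUSINESS-METRIC-PARAM',
  'X-BUSINESS-CONTEXT'
] as const

// Assumption: only the tenant ID may legitimately be an empty string.
const MAY_BE_EMPTY = new Set<string>(['X-LOCALTENANT-ID'])

/** Returns the names of mandatory headers that are absent or unexpectedly empty. */
function findMissingMeteringHeaders(headers: Record<string, string>): string[] {
  return MANDATORY_METERING_HEADERS.filter((name) => {
    const value = headers[name]
    if (value === undefined) return true
    return value === '' && !MAY_BE_EMPTY.has(name)
  })
}
```

A caller could log or throw when the returned list is not empty, so that a call that would silently go unmetered is noticed during development.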

Conclusion
For generative AI based metering, we simply have to fill all of these parameters when making the LLM calls. Generative AI hub then sends this data to Unified Metering on behalf of the application for further processing and calculation of AI units. If you do not want to use generative AI based metering, the alternative is to integrate directly with Unified Metering and report the data there yourself.

