Your AI Models Just Got a Hardware Upgrade

Estimated read time: 11 min

Co-Author: Lukas Pätz

TL;DR

SAP AI Core (the engine) now supports flexible instance types — pick the exact CPU or GPU your workload needs
New instance types deliver up to 4.5–5x the compute performance of old resource plans
Access the latest GPUs (L4, L40S, H100) without waiting for SAP to package them into plans
Your existing resource plans still work — instance types are opt-in where they help

Start Here: SAP Note 3660109

Why This Matters

If you’re running your custom AI models on SAP’s Generative AI Hub (technical service SAP AI Core), this one’s for you.

We’ve heard the feedback: newer GPUs are available from hyperscalers, but you can’t use them yet because SAP hasn’t released a matching resource plan. That lag is frustrating — especially when the difference between a V100 and an H100 isn’t incremental. It’s generational. We’re talking 4.5–5x the compute performance for the right workloads. An inference call that takes 8 seconds on older hardware might finish in under 1 second on newer chips. That’s the gap between “possible in theory” and “shipping in production.”

Starting now, that gap is closing.
 

What Changed: Resource Plans → Instance Types

Think of the old model like a set menu at a restaurant. Curated, predictable, but limited to what the chef decided to offer.

Instance types are the shift to à la carte. You see the full range of CPUs and GPUs available across cloud providers, and you pick exactly what your workload needs — from a lightweight CPU for preprocessing to an H100 for training large models.

Resource plans aren’t going away. They still work. But instance types are now the primary path for getting the newest, most capable hardware into your deployments — faster, and with finer control than the old model allowed. We strongly recommend moving to instance types going forward. Note that they are only available in the “extended” service plan, so now may be the right time to migrate from standard to extended.

Here’s the point: you tell us what your model needs; we make sure the right compute is there. Bon appétit.

What You Actually Get

A quick clarification: If you’re using pre-integrated foundation models through SAP’s Generative AI Hub — GPT-5.2, Claude 4.6 Opus or Gemini — this update doesn’t directly affect you. Those models are already managed and optimized; you just call the API.

Instance types matter when you’re running your own models: custom fine-tuned models, proprietary models built by your own teams, embedding models for RAG pipelines, or specialized computer vision and industry models. That’s when hardware choice becomes your concern — and when better hardware translates directly into better outcomes.

Faster inference, faster training

This isn’t about spec sheets. It’s about what becomes possible with the models you’ve built.

Custom models that were borderline viable for real-time use — domain-specific classifiers, fine-tuned embedding models, proprietary prediction models — become clearly viable when you can run them on L40S instead of T4. Training jobs that took days can finish in hours on H100. Features your team considered out of reach might now fit inside your existing budget.

L4 and L40S chips deliver substantially higher throughput than T4, at lower cost per inference call. H100 provides order-of-magnitude gains over V100 for training workloads. If you’re pushing the limits of what’s feasible with custom models, this matters.

Better cost efficiency — without re-architecting

Faster hardware means less compute time per job, which means lower cost. Simple as that.

Instance types also let you right-size. You stop paying for a resource plan heavier than you need. The result: better cost-per-output without touching your model or application logic.

 

New hardware when it ships — not months later

When hyperscalers release the next GPU generation, you won’t be waiting for SAP to define a new resource plan. Instance types let us surface new hardware significantly faster. Your infrastructure stays current.

Multi-cloud, multi-region resilience

Instance types are available across cloud providers and regions. Deploy closer to your users or data. Meet compliance requirements. Reduce single-vendor dependency. Keep two or three validated alternatives per region and you’re protected against capacity constraints.

Which Instance Should You Use?

 

| Workload | Recommended Instance | Examples |
| --- | --- | --- |
| Orchestration, preprocessing, CPU-based ML | CPU (AWS m7i, Azure Dv6, GCP n4) | Data pipelines, API routing, lightweight model serving |
| Custom model inference (high throughput) | L4 / L40S GPU (AWS g6, g6e) | Fine-tuned classifiers, embedding models, custom vision models |
| Fine-tuning and training medium-to-large models | A100 (GCP A2 Ultra) | Domain-specific model fine-tuning, complex custom inference |
| Training large language models | H100 (Azure NC H100 v5) | Custom LLM training, proprietary foundation model development |

Match the instance to your workload. Account for regional availability. Keep at least two alternatives validated per region.

Use the compute factors in SAP Note 3660109 or the SAP Cost Calculator to compare options on a consistent basis — each instance type maps to a compute factor in node-hours, which makes cost planning straightforward.
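The comparison itself is simple arithmetic. As a minimal sketch: cost in node-hours is wall-clock runtime multiplied by the instance's compute factor. The factor values and runtimes below are hypothetical placeholders for illustration — take the real per-instance factors from SAP Note 3660109.

```python
def estimated_cost(runtime_hours: float, compute_factor: float) -> float:
    """Cost proxy in node-hours: wall-clock runtime x compute factor."""
    return runtime_hours * compute_factor


# Hypothetical scenario: a training job takes 10 h on an older GPU
# (assumed factor 4.0) but only 2.5 h on a newer, faster one
# (assumed factor 12.0). The pricier instance can still be cheaper
# overall because it finishes so much sooner.
old_hw = estimated_cost(10.0, 4.0)   # 40.0 node-hours
new_hw = estimated_cost(2.5, 12.0)   # 30.0 node-hours

print(f"older instance: {old_hw} node-hours, newer instance: {new_hw} node-hours")
```

The point of the sketch: a higher compute factor does not automatically mean a higher bill — runtime matters just as much, so compare total node-hours per job, not hourly rates.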

Migration: No Big Bang Required

Both options work today. You don’t have to migrate everything at once. More details:

# This still works
ai.sap.com/resourcePlan: "infer.s"

# This is the new option — use where it helps
ai.sap.com/instanceType: "g6.4xlarge"

Start with the workloads where newer hardware will have the clearest impact — typically your highest-volume inference jobs or your most compute-heavy training runs. Everything else can stay on resource plans, but newer hardware might be more efficient.
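To see where that label lives in context, here is a simplified serving-template fragment. Treat it as a sketch: the template name, scenario annotation, and surrounding fields are illustrative placeholders rather than a complete, validated AI Core template — the only point is the one-line `ai.sap.com/instanceType` label.

```yaml
# Illustrative fragment — names and surrounding fields are placeholders.
apiVersion: ai.sap.com/v1alpha1
kind: ServingTemplate
metadata:
  name: my-custom-model                        # placeholder name
  annotations:
    scenarios.ai.sap.com/name: "custom-inference"  # placeholder scenario
  labels:
    # Swapping hardware is a one-line change here:
    ai.sap.com/instanceType: "g6.4xlarge"      # was: ai.sap.com/resourcePlan: "infer.s"
```

Because the change is confined to a label, you can migrate deployment by deployment and roll back just as easily.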

Full specs, regional availability, and compute factors: SAP Note 3660109.

Staying Current

Hardware in AI is moving fast — faster than enterprise software typically keeps up. Instance types are how we keep you on the right side of that curve, without adding operational overhead to your team.

To stay informed:

Subscribe to SAP Note 3660109 — it’s the live catalogue; new instance types show up there as they become available
Watch the What’s New page for implementation guidance and new instance families

The Bottom Line

This isn’t a small tweak. It’s a deliberate shift to put more control in your hands — and to get it there faster.

We handle the complexity: evaluating new hardware, qualifying it, making it available at scale. You focus on what you’re actually here to do: build AI solutions that matter.

The menu just expanded. Pick your instance — and dig in.

 


