feat: add multi-layer custom_pricing for Anthropic models #720

Draft
bhishmendramahala-crypto wants to merge 1 commit into main from pricing-update/anthropic-multi-layer

bhishmendramahala-crypto commented Apr 16, 2026

Summary

  • Add custom_pricing with nested regions and execution_modes for claude-opus-4-6 and claude-sonnet-4-6
  • claude-opus-4-6: standard / batch / fast modes, global + us regions
    • standard: pay_as_you_go with full token + cache pricing
    • batch: pay_as_you_go + batch_config (50% of standard rates)
    • fast: pay_as_you_go (6x standard — Anthropic fast mode, Opus only)
    • us region: 1.1x uplift on all prices
  • claude-sonnet-4-6: standard / batch modes, global + us regions
    • standard: pay_as_you_go with full token + cache pricing
    • batch: pay_as_you_go + batch_config (50% of standard rates)
    • us region: 1.1x uplift on all prices
  • Existing top-level pricing_config (with pay_as_you_go + batch_config) preserved as fallback for older gateway versions
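
The multipliers listed above (batch at 50% of standard, fast at 6x standard, us at 1.1x) can be sketched as a small derivation. This is an illustration under stated assumptions, not the script used to generate the file: `derive_price` and its parameters are hypothetical, and the base prices in the usage lines are back-derived from US values quoted in the review (0.00275 / 1.1 = 0.0025 and 0.0033 / (6 x 1.1) = 0.0005). Decimal arithmetic is used so that the 1.1x uplift does not introduce binary floating-point residue.

```python
from decimal import Decimal

# Multipliers taken from the PR summary.
BATCH_FACTOR = Decimal("0.5")   # batch: 50% of standard
FAST_FACTOR = Decimal("6")      # fast: 6x standard (Opus only)
US_UPLIFT = Decimal("1.1")      # us region: 1.1x on all prices


def derive_price(base, *, mode="standard", region="global"):
    """Hypothetical helper: derive a price (cents per token) from a standard base.

    `base` is passed as a string so it is parsed exactly as written.
    """
    price = Decimal(base)
    if mode == "batch":
        price *= BATCH_FACTOR
    elif mode == "fast":
        price *= FAST_FACTOR
    elif mode != "standard":
        raise ValueError(f"unknown execution mode: {mode}")

    if region == "us":
        price *= US_UPLIFT
    elif region != "global":
        raise ValueError(f"unknown region: {region}")

    # Converting an exact Decimal to float yields the nearest double,
    # which serializes as the short decimal form.
    return float(price)


us_standard = derive_price("0.0025", region="us")
us_batch = derive_price("0.0025", mode="batch", region="us")
us_fast = derive_price("0.0005", mode="fast", region="us")
```

Multiplying the same values as plain floats (for example `0.003 * 1.1`) is the kind of operation that can leave a trailing `...0004` on the serialized number.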

Source Verification

Source Links:

Checklist

  • I have validated the JSON using jq or an online validator
  • I have verified that prices are in cents per token (not dollars)
  • I have included the source link above
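
The unit check in the checklist is plain arithmetic. A minimal sketch, assuming the provider quotes prices in US dollars per million tokens; `usd_per_mtok_to_cents_per_token` is a hypothetical helper and the inputs are illustrative figures, not quoted prices.

```python
from decimal import Decimal


def usd_per_mtok_to_cents_per_token(usd_per_million):
    """Convert dollars per million tokens to cents per token."""
    # dollars -> cents is x100; per-million -> per-token is /1_000_000
    cents_per_million = Decimal(str(usd_per_million)) * 100
    return float(cents_per_million / Decimal(1_000_000))


# Illustrative inputs only.
five_dollars = usd_per_mtok_to_cents_per_token("5.00")
twenty_five_dollars = usd_per_mtok_to_cents_per_token("25")
```

A value that is 100 times smaller than expected usually means the price was entered in dollars per token rather than cents per token.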

Related

  • Gateway PR: Portkey-AI/gateway-enterprise-node#1387 (Anthropic pricing resolver)

Copilot AI left a comment

Pull request overview

Adds a new custom_pricing structure to Anthropic model pricing to support multi-layer pricing by region and execution mode (while keeping pricing_config as a fallback for older gateway versions).

Changes:

  • Introduces custom_pricing.regions.{global,us}.execution_modes for claude-opus-4-6 (standard/batch/fast) and claude-sonnet-4-6 (standard/batch).
  • Adds cache token pricing entries (cache_write_input_token, cache_read_input_token) to the top-level batch_config for both models.
  • Adds US-region uplifted pricing (10%) and fast-mode pricing for Opus.
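
The nesting described above, with `pricing_config` as the fallback, could be read roughly as follows. This is a sketch only: `resolve_price` is a hypothetical function (the actual resolver lives in the gateway PR and may differ), the entry is trimmed to a single token type, and the 0.0005 fallback value is an assumption; only 0.0033 appears in this PR's review thread.

```python
def resolve_price(model_entry, token_type, region="global", mode="standard"):
    """Hypothetical lookup: prefer custom_pricing, fall back to pricing_config."""
    modes = (
        model_entry.get("custom_pricing", {})
        .get("regions", {})
        .get(region, {})
        .get("execution_modes", {})
    )
    pay_as_you_go = modes.get(mode, {}).get("pay_as_you_go")
    if pay_as_you_go is None:
        # Unknown region or mode: use the top-level config, which is what
        # an older gateway would read anyway.
        pay_as_you_go = model_entry["pricing_config"]["pay_as_you_go"]
    return pay_as_you_go[token_type]["price"]


# Trimmed, illustrative entry following custom_pricing.regions.<region>.execution_modes.<mode>
entry = {
    "pricing_config": {
        "pay_as_you_go": {"request_token": {"price": 0.0005}}  # assumed value
    },
    "custom_pricing": {
        "regions": {
            "us": {
                "execution_modes": {
                    "fast": {
                        "pay_as_you_go": {"request_token": {"price": 0.0033}}
                    }
                }
            }
        }
    },
}

us_fast = resolve_price(entry, "request_token", region="us", mode="fast")
fallback = resolve_price(entry, "request_token", region="eu", mode="standard")
```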

Comment thread pricing/anthropic.json Outdated
"price": 0.0008250000000000001
},
"cache_write_input_token": {
"price": 0.00020625000000000003

Copilot AI Apr 16, 2026

US-region Sonnet batch cache_write_input_token.price is 0.00020625000000000003 (float artifact). Please simplify to 0.00020625 for consistency and readability.

Suggested change
- "price": 0.00020625000000000003
+ "price": 0.00020625

Comment thread pricing/anthropic.json
Comment on lines +752 to +754
"request_token": {
"price": 0.0033000000000000004
},

Copilot AI Apr 16, 2026

The US fast-mode request_token price is written as 0.0033000000000000004 (float precision artifact). Please normalize this to 0.0033 (and similarly for any other uplift-derived values) for consistency and to avoid unnecessary diffs.
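
One way to apply this normalization across the whole file, rather than value by value, is to round every float to a fixed number of significant digits before re-serializing. A minimal sketch, assuming 12 significant digits is more than any intended price needs; `normalize_prices` is a hypothetical helper, not part of this repository.

```python
import json


def normalize_prices(node, digits=12):
    """Recursively round floats to `digits` significant digits.

    Removes residue such as 0.0033000000000000004 while leaving
    intended values like 0.00020625 unchanged.
    """
    if isinstance(node, float):
        return float(f"{node:.{digits}g}")
    if isinstance(node, dict):
        return {key: normalize_prices(value, digits) for key, value in node.items()}
    if isinstance(node, list):
        return [normalize_prices(item, digits) for item in node]
    return node


raw = json.loads(
    '{"request_token": {"price": 0.0033000000000000004},'
    ' "cache_write_input_token": {"price": 0.00020625000000000003}}'
)
clean = normalize_prices(raw)
```

Note that Python's `json.dumps` writes very small floats in exponent notation (for example `2.75e-05`), which is valid JSON but may not match the decimal style used elsewhere in the file.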

Comment thread pricing/anthropic.json Outdated
Comment on lines +738 to +742
"price": 0.0013750000000000001
},
"cache_write_input_token": {
"price": 0.00034375000000000003
},

Copilot AI Apr 16, 2026

More US-region batch prices show float precision artifacts (e.g., response_token.price is 0.0013750000000000001). Please normalize these values (e.g., 0.001375) to avoid noisy diffs and improve readability.

Comment thread pricing/anthropic.json Outdated
"price": 0.00034375000000000003
},
"cache_read_input_token": {
"price": 0.000027500000000000004

Copilot AI Apr 16, 2026

cache_read_input_token.price under US batch pricing is 0.000027500000000000004 (float artifact). Please simplify to 0.0000275 (or the intended exact decimal) for consistency.

Suggested change
- "price": 0.000027500000000000004
+ "price": 0.0000275

Comment thread pricing/anthropic.json Outdated
"price": 0.0016500000000000002
},
"cache_write_input_token": {
"price": 0.00041250000000000005

Copilot AI Apr 16, 2026

US-region Sonnet cache_write_input_token.price is 0.00041250000000000005 (float artifact). Please simplify to 0.0004125 (or the intended exact decimal) for consistency.

Suggested change
- "price": 0.00041250000000000005
+ "price": 0.0004125

Comment thread pricing/anthropic.json Outdated
Comment on lines +720 to +724
"price": 0.0027500000000000003
},
"cache_write_input_token": {
"price": 0.0006875000000000001
},

Copilot AI Apr 16, 2026

US-region prices contain floating-point precision artifacts (e.g., response_token.price is 0.0027500000000000003). Please normalize these numeric literals to their intended decimal forms (e.g., 0.00275) to keep the JSON readable and reduce downstream stringify/diff churn.

Comment thread pricing/anthropic.json Outdated
"price": 0.0006875000000000001
},
"cache_read_input_token": {
"price": 0.00005500000000000001

Copilot AI Apr 16, 2026

cache_read_input_token.price is written as 0.00005500000000000001 (float artifact). Please simplify this to the exact decimal value (e.g., 0.000055) for consistency with the rest of the file.

Suggested change
- "price": 0.00005500000000000001
+ "price": 0.000055

Comment thread pricing/anthropic.json Outdated
"price": 0.000165
},
"response_token": {
"price": 0.0008250000000000001

Copilot AI Apr 16, 2026

US-region Sonnet batch response_token.price is 0.0008250000000000001 (float artifact). Please normalize this to 0.000825 to keep numeric formatting consistent.

Suggested change
- "price": 0.0008250000000000001
+ "price": 0.000825

…d claude-sonnet-4-6

- claude-opus-4-6: standard + fast execution modes, global + us regions
  - standard: pay_as_you_go + batch_config (50% rate) with cache entries
  - fast: pay_as_you_go only (6x standard rates, no batch)
  - us region: 10% uplift on all standard and fast prices
- claude-sonnet-4-6: standard execution mode, global + us regions
  - standard: pay_as_you_go + batch_config (50% rate) with cache entries
  - us region: 10% uplift on standard prices
- Existing pricing_config preserved as fallback for older gateway versions
  (batch_config removed from pricing_config — now lives in custom_pricing)
bhishmendramahala-crypto force-pushed the pricing-update/anthropic-multi-layer branch from 8bce4f4 to 2f9ccfc on April 16, 2026 12:29
narengogi marked this pull request as draft on April 16, 2026 19:46