
chore(cognitive): update AI model catalog #6

Open
github-actions[bot] wants to merge 1 commit into master from chore/update-models-7

@github-actions

Model Update Summary

Updated 38 models across 7 providers: anthropic, google-ai, groq, cerebras, xai, openrouter, fireworks-ai (openai: no changes)


Anthropic

defaultModel: claude-sonnet-4-5-20250929 → claude-sonnet-4-6

New Models Added

| Model ID | Change | Notes |
| --- | --- | --- |
| claude-opus-4-6 | NEW | Production; $5/$25/1M tokens; 1M context / 128k output |
| claude-sonnet-4-6 | NEW | Production; $3/$15/1M tokens; 1M context / 64k output |
| claude-opus-4-5-20251101 | NEW | Production; $5/$25/1M tokens; 200k context / 64k output |
| claude-opus-4-1-20250805 | NEW | Production; $15/$75/1M tokens; 200k context / 32k output |
| claude-opus-4-20250514 | NEW | Production; $15/$75/1M tokens; 200k context / 32k output |

Updated Models

| Model ID | Field | Old → New |
| --- | --- | --- |
| claude-3-haiku-20240307 | lifecycle | production → deprecated |
| claude-3-haiku-20240307 | deprecationDate | (none) → 2026-04-19 |
| claude-3-haiku-20240307 | replacementModels | (none) → ['claude-haiku-4-5-20251001'] |
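
The deprecation fields above could drive an automatic fallback at request time. The following is a minimal sketch, assuming catalog entries are plain dictionaries keyed by model ID. The `resolve_model` helper, the `CATALOG` layout, and the lifecycle shown for the replacement model are assumptions for illustration; only the field names and the claude-3-haiku-20240307 values come from this PR.

```python
# Hypothetical in-memory catalog. Field names and the first entry's values
# mirror the table above; the second entry's lifecycle is an assumption.
CATALOG = {
    "claude-3-haiku-20240307": {
        "lifecycle": "deprecated",
        "deprecationDate": "2026-04-19",
        "replacementModels": ["claude-haiku-4-5-20251001"],
    },
    "claude-haiku-4-5-20251001": {"lifecycle": "production"},
}


def resolve_model(model_id, catalog=CATALOG):
    """Follow replacementModels until reaching a model that is not retired."""
    seen = set()
    while catalog[model_id].get("lifecycle") in ("deprecated", "discontinued"):
        replacements = catalog[model_id].get("replacementModels") or []
        # Stop if there is no known replacement or we are going in circles.
        if not replacements or replacements[0] not in catalog or model_id in seen:
            break
        seen.add(model_id)
        model_id = replacements[0]
    return model_id
```

Taking the first entry of `replacementModels` is a design choice of this sketch; a real resolver might prefer a replacement matching the caller's required capabilities.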

Google AI

New Models Added

| Model ID | Change | Notes |
| --- | --- | --- |
| gemini-3.1-pro | NEW | Preview; internalModelId gemini-3.1-pro-preview; $2/$12/1M tokens; 1M context / 65k output |
| gemini-3.1-flash-lite | NEW | Preview; internalModelId gemini-3.1-flash-lite-preview; $0.25/$1.50/1M tokens; 1M context / 65k output |
| gemini-2.5-flash-lite | NEW | Production (stable); $0.10/$0.40/1M tokens; 1M context / 65k output |

Updated Models

| Model ID | Field | Old → New |
| --- | --- | --- |
| gemini-3-pro | lifecycle | preview → discontinued |
| gemini-3-pro | discontinuedDate | (none) → 2026-03-09 (shut down by Google) |
| gemini-3-pro | replacementModels | (none) → ['gemini-3.1-pro'] |
| gemini-3-pro | tags | ['reasoning', ...] → ['deprecated', 'reasoning', ...] |
| gemini-2.0-flash | lifecycle | production → deprecated |
| gemini-2.0-flash | deprecationDate | (none) → 2026-01-01 |
| gemini-2.0-flash | replacementModels | (none) → ['gemini-2.5-flash'] |
| gemini-2.0-flash | tags | ['low-cost', ...] → ['deprecated', 'low-cost', ...] |

Groq

New Models Added

| Model ID | Change | Notes |
| --- | --- | --- |
| llama-4-scout-17b-16e-instruct | NEW | Production/Preview; internalModelId meta-llama/llama-4-scout-17b-16e-instruct; $0.11/$0.34/1M tokens; vision support |
| kimi-k2-instruct-0905 | NEW | Production/Preview; internalModelId moonshotai/kimi-k2-instruct-0905; $1.00/$3.00/1M tokens |
| qwen3-32b | NEW | Production/Preview; internalModelId qwen/qwen3-32b; $0.29/$0.59/1M tokens |
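
The internalModelId values above suggest the catalog ID and the ID sent to the provider can differ. A minimal sketch of that mapping follows, assuming internalModelId is the string passed to Groq's API; the `to_provider_id` helper and the dictionary name are hypothetical, and the ID pairs are copied from the table.

```python
# Catalog ID -> internalModelId, copied from the Groq table above.
GROQ_INTERNAL_IDS = {
    "llama-4-scout-17b-16e-instruct": "meta-llama/llama-4-scout-17b-16e-instruct",
    "kimi-k2-instruct-0905": "moonshotai/kimi-k2-instruct-0905",
    "qwen3-32b": "qwen/qwen3-32b",
}


def to_provider_id(model_id, internal_ids=GROQ_INTERNAL_IDS):
    """Return the internalModelId if one is set, else the catalog ID itself."""
    return internal_ids.get(model_id, model_id)
```

Falling back to the catalog ID covers models such as gpt-oss-20b, for which this PR lists no internalModelId.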

Updated Models

| Model ID | Field | Old → New |
| --- | --- | --- |
| gpt-oss-20b | inputCostPer1mTokens | 0.1 → 0.075 |
| gpt-oss-20b | outputCostPer1mTokens | 0.5 → 0.3 |
| gpt-oss-20b | maxOutputTokens | 32_000 → 65_536 |
| gpt-oss-120b | outputCostPer1mTokens | 0.75 → 0.6 |
| gpt-oss-120b | maxOutputTokens | 32_000 → 65_536 |
| llama-3.3-70b-versatile | maxInputTokens | 128_000 → 131_000 |
| llama-3.1-8b-instant | maxInputTokens | 128_000 → 131_000 |
| llama-3.1-8b-instant | maxOutputTokens | 8192 → 131_072 |
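
To see what the gpt-oss-20b price change means per request, the per-1M-token prices can be applied to a token count. This is a rough sketch: the `request_cost` helper and the 10,000-input / 2,000-output workload are made up for illustration, and the prices are the old and new values from the table, taken to be in USD.

```python
def request_cost(input_tokens, output_tokens, input_per_1m, output_per_1m):
    """Cost of one request, given prices per 1M input and output tokens."""
    return (input_tokens * input_per_1m + output_tokens * output_per_1m) / 1_000_000


# gpt-oss-20b on Groq: old vs. new prices for a hypothetical request.
old = request_cost(10_000, 2_000, input_per_1m=0.1, output_per_1m=0.5)
new = request_cost(10_000, 2_000, input_per_1m=0.075, output_per_1m=0.3)
print(f"old=${old:.5f} new=${new:.5f}")  # old=$0.00200 new=$0.00135
```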

Cerebras

New Models Added

| Model ID | Change | Notes |
| --- | --- | --- |
| qwen-3-235b-a22b-instruct-2507 | NEW | Preview; Qwen3 235B MoE; $0.80/$1.60/1M tokens (est.); 131k context |
| zai-glm-4.7 | NEW | Preview; Z.ai GLM 4.7 355B; $0.39/$1.75/1M tokens; 131k context |

No other changes


xAI

defaultModel: grok-4-fast-non-reasoning → grok-4.20-0309-non-reasoning

New Models Added

| Model ID | Change | Notes |
| --- | --- | --- |
| grok-4.20-0309-reasoning | NEW | Production; $2/$6/1M tokens; 2M context / 128k output |
| grok-4.20-0309-non-reasoning | NEW | Production; $2/$6/1M tokens; 2M context / 128k output |
| grok-4.20-multi-agent-0309 | NEW | Production; $2/$6/1M tokens; 2M context / 128k output |
| grok-4-1-fast-reasoning | NEW | Production; $0.20/$0.50/1M tokens; 2M context / 128k output |
| grok-4-1-fast-non-reasoning | NEW | Production; $0.20/$0.50/1M tokens; 2M context / 128k output |

No existing model changes
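
Because this section changes defaultModel and adds the model it points to in the same commit, a consistency check may be worth running on the result. The sketch below assumes a provider config shaped as a dictionary with `defaultModel` and a `models` mapping; that layout and the `check_default_model` helper are hypothetical, while the IDs and lifecycle values come from the xAI section above.

```python
def check_default_model(provider):
    """Return a list of problems with a provider's defaultModel (empty if fine)."""
    problems = []
    default = provider["defaultModel"]
    model = provider["models"].get(default)
    if model is None:
        problems.append(f"defaultModel {default!r} is not in the model list")
    elif model.get("lifecycle") != "production":
        problems.append(
            f"defaultModel {default!r} has lifecycle {model.get('lifecycle')!r}"
        )
    return problems


# Hypothetical provider layout; only two of the new xAI models are shown.
XAI = {
    "defaultModel": "grok-4.20-0309-non-reasoning",
    "models": {
        "grok-4.20-0309-reasoning": {"lifecycle": "production"},
        "grok-4.20-0309-non-reasoning": {"lifecycle": "production"},
    },
}

print(check_default_model(XAI))  # []
```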


OpenRouter

New Models Added

| Model ID | Change | Notes |
| --- | --- | --- |
| gpt-oss-20b | NEW | Production; $0.075/$0.30/1M tokens; 131k context / 65k output |

Updated Models

| Model ID | Field | Old → New |
| --- | --- | --- |
| gpt-oss-120b | outputCostPer1mTokens | 0.75 → 0.6 |
| gpt-oss-120b | maxOutputTokens | 32_000 → 65_536 |

Fireworks AI

New Models Added

| Model ID | Change | Notes |
| --- | --- | --- |
| deepseek-v3p2 | NEW | Production; DeepSeek V3.2; $0.56/$1.68/1M tokens; 163k context |
| deepseek-v3p1 | NEW | Production; DeepSeek V3.1; $0.56/$1.68/1M tokens; 163k context |
| kimi-k2-instruct-0905 | NEW | Production; Kimi K2 0905; $0.60/$2.50/1M tokens; 262k context |

Updated Models

| Model ID | Field | Old → New |
| --- | --- | --- |
| deepseek-v3-0324 | inputCostPer1mTokens | 0.9 → 0.56 |
| deepseek-v3-0324 | outputCostPer1mTokens | 0.9 → 1.68 |
| deepseek-v3-0324 | maxInputTokens | 160_000 → 163_840 |
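
For deepseek-v3-0324 the input price drops while the output price rises, so whether a request gets cheaper depends on its output-to-input token ratio. The sketch below solves for the break-even ratio; the `break_even_output_ratio` helper is illustrative only, and the four prices are the old and new values from the table.

```python
def break_even_output_ratio(old_in, old_out, new_in, new_out):
    """Output/input token ratio at which old and new pricing cost the same.

    Solves old_in*i + old_out*o == new_in*i + new_out*o for o/i.
    Only meaningful when the output price actually changed (new_out != old_out).
    """
    return (old_in - new_in) / (new_out - old_out)


ratio = break_even_output_ratio(old_in=0.9, old_out=0.9, new_in=0.56, new_out=1.68)
print(f"break-even output/input ratio: {ratio:.3f}")  # 0.436
```

Under these prices, requests that produce more than roughly 0.44 output tokens per input token cost more than before, and requests below that ratio cost less.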

OpenAI

OpenAI's documentation page (platform.openai.com/docs/models) returned 403 Forbidden and could not be fetched programmatically. No changes were made to the OpenAI config. Manual verification recommended.
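
One way an update job could handle this case is to treat a 403 as "skip this provider" and not as a fatal error. The sketch below is an assumption about how such a step might look, not the workflow's actual code: `fetch_provider_docs` and the injected `fetch` callable are hypothetical, while `urllib.error.HTTPError` and its `code` attribute are from the Python standard library.

```python
from urllib.error import HTTPError


def fetch_provider_docs(provider, fetch):
    """Call fetch(); on HTTP 403 return None so the provider config stays untouched."""
    try:
        return fetch()
    except HTTPError as err:
        if err.code == 403:
            print(f"{provider}: docs returned 403 Forbidden; skipping, verify manually")
            return None
        raise  # other HTTP errors are not silently ignored
```

Passing the fetcher in as a callable keeps the skip logic testable without network access.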


Notes

  • Cerebras pricing for new preview models (qwen-3-235b-a22b-instruct-2507, zai-glm-4.7) is estimated based on comparable model sizes from other providers, as Cerebras does not publish pricing publicly (requires dedicated endpoint agreement).
  • gemini-3-pro was already shut down by Google as of March 9, 2026; updated to discontinued.
  • gemini-2.0-flash is deprecated upstream by Google; updated lifecycle accordingly.
  • claude-3-haiku-20240307 is deprecated by Anthropic with retirement date April 19, 2026.
  • xAI introduced a new model naming scheme (grok-4.20-*) alongside the older grok-4-* naming; both coexist.
