Educational Purpose Only - Legal Disclaimer

This tool is provided strictly for educational and research purposes. By using this information, you acknowledge responsibility for compliance with all applicable terms of service, licenses, and laws.

Compatible API Providers

Popular OpenAI-compatible API providers you can use with Claude Code CLI

OpenAI (Popular)
https://api.openai.com/v1

Official OpenAI API with GPT-4, GPT-4 Turbo, and GPT-4o models

Models

gpt-4o, gpt-4o-mini, +2 more

Features

Tool Use, Vision, JSON Mode

OpenRouter (Popular)
https://openrouter.ai/api/v1

Access multiple AI providers through a single API, including Claude, GPT-4, Llama, and more

Models

anthropic/claude-3.5..., openai/gpt-4o, +1 more

Features

Multi-Provider, Fallbacks, Cost Optimization

Together AI
https://api.together.xyz/v1

Fast inference for open-source models with competitive pricing

Models

meta-llama/Llama-3.1..., mistralai/Mixtral-8x...

Features

Open Source, Fast Inference, Fine-tuning

Groq (Popular)
https://api.groq.com/openai/v1

Ultra-fast inference with LPU technology for supported models

Models

llama-3.1-70b-versat..., llama-3.1-8b-instant, +1 more

Features

Ultra Fast, Free Tier, Low Latency

Fireworks AI
https://api.fireworks.ai/inference/v1

Optimized inference for open-source and custom models

Models

accounts/fireworks/m..., accounts/fireworks/m...

Features

Fast, Custom Models, Function Calling

Ollama
http://localhost:11434/v1

Run models locally on your machine with complete privacy

Models

llama3.1, codellama, +2 more

Features

Local, Private, No API Key
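Because Ollama serves its OpenAI-compatible API on localhost with no authentication, requests need no Authorization header at all. A minimal sketch of listing your locally pulled models (the request is only constructed here, not sent; assumes Ollama's default port 11434):

```python
import urllib.request

# Ollama exposes an OpenAI-compatible model list at /v1/models.
# Note: no Authorization header is set anywhere.
req = urllib.request.Request("http://localhost:11434/v1/models")

# With a running Ollama server, this would print your local models:
# import json
# with urllib.request.urlopen(req) as resp:
#     for m in json.load(resp)["data"]:
#         print(m["id"])
```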

LM Studio
http://localhost:1234/v1

Local inference with a user-friendly GUI for model management

Models

Various GGUF models

Features

Local, GUI, No API Key

vLLM
http://localhost:8000/v1

High-throughput inference server for self-hosted deployments

Models

Self-hosted models

Features

Self-Hosted, High Throughput, Paged Attention

Privacy Note

When using local providers like Ollama or LM Studio, all inference happens on your machine. Your code and conversations never leave your device. For cloud providers, review their privacy policies and data retention practices.

OpenAI-Compatible API Providers

Many AI providers offer OpenAI-compatible APIs, allowing you to use Claude Code CLI with various models and services. Here's an overview of popular options:
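What "OpenAI-compatible" means in practice is that every provider above accepts the same `POST {base_url}/chat/completions` request shape, so switching providers is usually just a base-URL (and API-key) change. A minimal stdlib sketch, assuming the base URLs listed above (the helper name and key are illustrative; the request is only built, not sent):

```python
import json
import urllib.request

# Base URLs taken from the provider list above.
PROVIDERS = {
    "openai": "https://api.openai.com/v1",
    "openrouter": "https://openrouter.ai/api/v1",
    "groq": "https://api.groq.com/openai/v1",
    "ollama": "http://localhost:11434/v1",
}

def build_chat_request(provider: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completions request; only the base URL and key differ per provider."""
    headers = {"Content-Type": "application/json"}
    if api_key:  # local servers like Ollama accept requests without a key
        headers["Authorization"] = f"Bearer {api_key}"
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{PROVIDERS[provider]}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )

# Same client code, different provider: only the lookup key changes.
req = build_chat_request("groq", "gsk-placeholder", "llama-3.1-8b-instant", "Hello")
```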

Cloud Providers

  • OpenRouter: Unified API for 100+ models including Claude, GPT-4, and open-source models
  • Groq: Ultra-fast inference with Llama and Mixtral models on custom LPU hardware
  • Together AI: Wide selection of open-source models with competitive pricing
  • Fireworks AI: Fast inference optimized for production workloads
  • Deep Infra: Serverless inference for popular open-source models

Local Options

  • Ollama: Run LLMs locally on your machine with easy model management
  • LM Studio: Desktop app for running local LLMs with a GUI interface

Choosing a Provider

Consider factors like model availability, pricing, speed, and whether you need local or cloud-based inference. OpenRouter is often recommended for its wide model selection and unified billing.
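Once you have chosen a provider, switching is often just two environment variables: the official OpenAI SDKs (and many compatible clients) read `OPENAI_BASE_URL` and `OPENAI_API_KEY`. A configuration sketch with placeholder keys; check your specific tool's documentation for the exact variable names it honors:

```shell
# Point an OPENAI_BASE_URL-aware client at OpenRouter (placeholder key)
export OPENAI_BASE_URL="https://openrouter.ai/api/v1"
export OPENAI_API_KEY="sk-or-..."

# Or at a local Ollama server; the key is unused locally,
# but some clients require it to be non-empty
export OPENAI_BASE_URL="http://localhost:11434/v1"
export OPENAI_API_KEY="ollama"
```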