Embeddings

Generate vector embeddings for search, clustering, and semantic retrieval.

POST /v1/embeddings

Name	Type	Required	Description
model	string	Required	Embedding model ID
input	string | string[]	Required	Text or list of text inputs
encoding_format	string	Optional	Encoding of the returned vectors: float (default) or base64
dimensions	integer	Optional	Number of output dimensions; reduces vector size when the model supports it
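
When encoding_format is base64, each embedding arrives as a string rather than a JSON array. The sketch below shows one way to turn such a string back into floats. It assumes the payload is a packed array of little-endian 32-bit floats, which is the convention used by OpenAI-style embedding APIs; this page does not confirm the byte layout, so verify it against a real response. The function name decode_base64_embedding is hypothetical.

```python
import base64
import struct


def decode_base64_embedding(b64: str) -> list[float]:
    """Decode a base64 embedding string into a list of floats.

    ASSUMPTION: the bytes are packed little-endian float32 values.
    """
    raw = base64.b64decode(b64)
    if len(raw) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4 (float32 size)")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip demo with values that float32 represents exactly.
sample = [0.5, -0.25, 1.0]
encoded = base64.b64encode(struct.pack("<3f", *sample)).decode("ascii")
decoded = decode_base64_embedding(encoded)
```

The base64 form is typically smaller on the wire than a JSON array of decimal numbers, which is the usual reason to request it for large batches.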

Examples

cURL

curl https://api.therouter.ai/v1/embeddings \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/text-embedding-3-large",
    "input": ["therouter docs", "routing and fallbacks"]
  }'
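
For callers who prefer Python, the same request can be sketched with the standard library alone. The URL, headers, and body fields mirror the cURL call above; the function names build_embeddings_request and embed are hypothetical, and error handling and retries are left out.

```python
import json
import os
import urllib.request

API_URL = "https://api.therouter.ai/v1/embeddings"


def build_embeddings_request(inputs, model="openai/text-embedding-3-large", api_key=None):
    """Build (but do not send) a POST request equivalent to the cURL example."""
    # Fall back to the same environment variable the cURL example uses.
    key = api_key or os.environ.get("THEROUTER_API_KEY", "")
    body = json.dumps({"model": model, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def embed(inputs, **kwargs):
    """Send the request and return the parsed JSON response."""
    req = build_embeddings_request(inputs, **kwargs)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling embed(["therouter docs", "routing and fallbacks"]) would send the same payload as the cURL command. Building the request separately from sending it keeps the payload easy to inspect in tests.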

embeddings-response.json

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0488, 0.1932, -0.0157]
    },
    {
      "object": "embedding",
      "index": 1,
      "embedding": [0.0099, -0.0311, 0.1641, -0.0214]
    }
  ],
  "model": "openai/text-embedding-3-large",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}

Batching

Pass an array as input to embed several related records in one request. Each item in data carries an index field giving its position in the input array, so output order matches input order.
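
A minimal sketch of matching a batch response back to its inputs, assuming the response shape shown in embeddings-response.json. Sorting by index is a defensive step in case items are ever processed out of order on the client side; the helper name pair_inputs_with_embeddings is hypothetical.

```python
def pair_inputs_with_embeddings(inputs, response):
    """Return (text, embedding) pairs, aligned using each item's index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    if len(items) != len(inputs):
        raise ValueError(
            f"expected {len(inputs)} embeddings, got {len(items)}"
        )
    return [(text, item["embedding"]) for text, item in zip(inputs, items)]


# Demo using the sample values from the response above, deliberately shuffled.
inputs = ["therouter docs", "routing and fallbacks"]
response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.0099, -0.0311, 0.1641, -0.0214]},
        {"object": "embedding", "index": 0, "embedding": [0.0123, -0.0488, 0.1932, -0.0157]},
    ],
}
pairs = pair_inputs_with_embeddings(inputs, response)
```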

Provider-specific limitations

Cohere input_type

Cohere Embed models (cohere/embed-english-v3, cohere/embed-multilingual-v3) require an input_type parameter that has no equivalent in the OpenAI embeddings format. TheRouter hardcodes this to search_document, which is optimal for indexing documents. If you are embedding search queries, retrieval quality may be lower than when calling Cohere directly with input_type: "search_query".

Cohere token estimation

Cohere Embed does not return token counts in its response. TheRouter estimates prompt tokens using a character-based heuristic (characters / 4), which can deviate from actual token counts by up to ~30% depending on language and content. This affects billing accuracy for Cohere embedding requests. For precise billing, consider using an OpenAI embedding model instead.
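
To get a feel for the heuristic, the sketch below reproduces the characters / 4 estimate and a rough band of 30% on either side of it. Several details are assumptions not stated on this page: that characters means Unicode code points (Python's len), that the result is rounded up, and that the estimate is computed over the combined length of all inputs. The band is only an illustration of the stated ~30% deviation, not a billing guarantee, and both function names are hypothetical.

```python
import math


def estimate_prompt_tokens(texts):
    """Estimate tokens as total characters / 4.

    ASSUMPTIONS: len() code points count as characters; result is rounded up.
    """
    total_chars = sum(len(text) for text in texts)
    return math.ceil(total_chars / 4)


def rough_token_range(estimate, deviation_pct=30):
    """Return (low, high) integer bounds around an estimate, widened outward."""
    low = estimate * (100 - deviation_pct) // 100          # rounds down
    high = (estimate * (100 + deviation_pct) + 99) // 100  # rounds up
    return low, high


estimate = estimate_prompt_tokens(["therouter docs", "routing and fallbacks"])
low, high = rough_token_range(estimate)
```

Comparing such an estimate against a tokenizer-based count for a sample of your own data is a quick way to see how far the heuristic drifts for your language and content.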