Image Inputs
Analyze images with TheRouter.ai vision models
Send images through /chat/completions using image_url content parts. TheRouter.ai supports both URL-based images and base64 data URLs.
Image URL input
Use public URLs when possible for faster requests and smaller payloads.
TypeScript
// Send a text instruction and a public image URL in one user message.
const response = await fetch("https://api.therouter.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: "Bearer <THEROUTER_API_KEY>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "google/gemini-2.5-flash",
    messages: [
      {
        role: "user",
        content: [
          // Text instruction first, then the image part.
          { type: "text", text: "What is shown in this image?" },
          {
            type: "image_url",
            image_url: {
              url: "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/1280px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
            },
          },
        ],
      },
    ],
  }),
});

const result = await response.json();
console.log(result.choices[0].message.content);

Base64 image input
Use base64 data URLs for local files and private assets that are not publicly reachable.
Node.js
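import fs from "node:fs/promises";

// Read the local file and encode it as a base64 data URL.
// The MIME type in the prefix must match the actual file format.
const image = await fs.readFile("./receipt.jpg");
const dataUrl = "data:image/jpeg;base64," + image.toString("base64");

const response = await fetch("https://api.therouter.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: "Bearer <THEROUTER_API_KEY>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "google/gemini-2.5-flash",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Extract all totals from this receipt" },
          // The data URL goes in the same image_url field as a public URL.
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
  }),
});

// Read the reply the same way as in the URL-based example.
const result = await response.json();
console.log(result.choices[0].message.content);

supported-content-types.txt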
image/png
image/jpeg
image/webp
image/gif

Best practices
- Put the text instruction first, then attach one or more images.
- Use separate content entries for multiple images in the same message.
- Check model capability metadata before sending image payloads.
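The first two practices above can be sketched as a small helper. This is a minimal illustration, not part of any TheRouter.ai SDK: the name buildImageMessage, the type names, and the example.com URLs are hypothetical. It places the text instruction first and gives each image its own image_url content entry.

```typescript
// Content part shapes as they appear in the request examples above.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type ContentPart = TextPart | ImagePart;

interface UserMessage {
  role: "user";
  content: ContentPart[];
}

// Hypothetical helper: text instruction first, then one entry per image.
function buildImageMessage(instruction: string, imageUrls: string[]): UserMessage {
  const images = imageUrls.map((url): ImagePart => ({
    type: "image_url",
    image_url: { url },
  }));
  return {
    role: "user",
    content: [{ type: "text", text: instruction }, ...images],
  };
}

// Placeholder URLs for illustration only.
const demoMessage = buildImageMessage("Compare these two photos.", [
  "https://example.com/before.jpg",
  "https://example.com/after.jpg",
]);
console.log(demoMessage.content.map((part) => part.type).join(", "));
// prints "text, image_url, image_url"
```

The returned object can be placed directly in the messages array of the request body. Public URLs and base64 data URLs can be mixed in the same list, since both go in the url field.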
Prompt ordering
If images must appear first, place your task instruction in a system message so the model gets clear guidance before processing the visuals.
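As a rough sketch of that ordering, the helper below puts the task instruction in a system message and lets the images lead the user message. The function name buildImageFirstMessages, the type names, and the optional trailingText parameter are hypothetical, and the sketch assumes the system message accepts a plain string as its content.

```typescript
type VisionPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

type VisionMessage =
  | { role: "system"; content: string }
  | { role: "user"; content: VisionPart[] };

// Hypothetical helper: the instruction goes in a system message,
// and the images come first in the user message.
function buildImageFirstMessages(
  instruction: string,
  imageUrls: string[],
  trailingText?: string,
): VisionMessage[] {
  const parts: VisionPart[] = imageUrls.map((url): VisionPart => ({
    type: "image_url",
    image_url: { url },
  }));
  // Optional text that should follow the images, such as a short question.
  if (trailingText) {
    parts.push({ type: "text", text: trailingText });
  }
  return [
    { role: "system", content: instruction },
    { role: "user", content: parts },
  ];
}

// Placeholder URL for illustration only.
const imageFirst = buildImageFirstMessages(
  "Describe each chart and note any anomalies.",
  ["https://example.com/chart.png"],
);
console.log(JSON.stringify(imageFirst, null, 2));
```

The resulting array can be passed as the messages field of the request body in place of the single user message shown earlier.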