
Chat API

Base URL: https://colibri-api-643619291153.me-west1.run.app

All chat endpoints require a valid Firebase ID token:

Authorization: Bearer <Firebase ID Token>

POST /chat

Sends the full message history and returns the AI response as a single JSON object.

Request

{
  "messages": [
    { "role": "user", "content": "Hello!", "author": "Alice" },
    { "role": "assistant", "content": "Hi! How can I help?" },
    { "role": "user", "content": "What should we eat in Paris?", "author": "Bob" }
  ]
}

Field    Type    Required  Description
role     string  Yes       One of user, assistant, or system
content  string  Yes       Message text. Max 10 000 characters
author   string  No        Display name of the sender. When multiple distinct authors are present, the AI receives a group-chat context and addresses users by name

Limits: max 50 messages per request.
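
The field rules and limits above can be checked on the client before sending, which avoids a round trip that would end in a 422. The following is a minimal Python sketch; `validate_messages` is a hypothetical helper and not part of the API. It treats `role` and `content` as mandatory, following the examples on this page, and it counts characters with Python's `len`, which may differ from how the server counts them.

```python
ALLOWED_ROLES = {"user", "assistant", "system"}
MAX_MESSAGES = 50            # per-request limit stated on this page
MAX_CONTENT_CHARS = 10_000   # per-message limit stated on this page


def validate_messages(messages):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(messages) > MAX_MESSAGES:
        problems.append(f"too many messages: {len(messages)} > {MAX_MESSAGES}")
    for i, msg in enumerate(messages):
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"messages[{i}]: invalid role {msg.get('role')!r}")
        content = msg.get("content")
        if not isinstance(content, str):
            problems.append(f"messages[{i}]: content must be a string")
        elif len(content) > MAX_CONTENT_CHARS:
            problems.append(
                f"messages[{i}]: content exceeds {MAX_CONTENT_CHARS} characters"
            )
        # author is optional; requiring a string when present is an assumption
        if "author" in msg and not isinstance(msg["author"], str):
            problems.append(f"messages[{i}]: author must be a string")
    return problems
```

Returning a list of problems rather than raising on the first one lets a UI show every issue at once.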

Response 200

{
  "reply": "Alice and Bob, here are the top dishes to try in Paris...",
  "message": {
    "id": "1740000000000",
    "role": "assistant",
    "content": "Alice and Bob, here are the top dishes to try in Paris...",
    "timestamp": "2026-03-03T12:00:00.000Z"
  }
}

curl example

curl -X POST https://colibri-api-643619291153.me-west1.run.app/chat \
  -H "Authorization: Bearer <ID_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "Tell me a fun fact about Paris.", "author": "Alice" }
    ]
  }'
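
The same request can be made from Python's standard library. This is a sketch under the assumption that a Firebase ID token has already been obtained elsewhere; `build_chat_request` and `send_chat` are hypothetical helper names, and the 30-second timeout simply mirrors the response timeout listed under "Limits & timeouts".

```python
import json
import urllib.request

BASE_URL = "https://colibri-api-643619291153.me-west1.run.app"


def build_chat_request(id_token, messages, path="/chat"):
    """Build (but do not send) a POST request with the documented headers."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {id_token}",
            "Content-Type": "application/json",
        },
    )


def send_chat(id_token, messages):
    """Send the request and return the assistant text from the "reply" field."""
    req = build_chat_request(id_token, messages)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["reply"]
```

Keeping request construction separate from sending makes the headers and body easy to inspect or unit-test without network access. Note that `urlopen` raises `urllib.error.HTTPError` for non-2xx responses, so the error codes described later on this page surface as exceptions.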

POST /chat/stream

Sends the full message history and streams the AI response chunk-by-chunk via Server-Sent Events (SSE).

Use this endpoint to show a "typing" effect in the UI: text appears chunk by chunk as Gemini generates it, rather than all at once after the full response is ready.

Request

Same JSON body as POST /chat, including the optional author field.

Response headers

Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive

Response body — SSE event stream

data: Alice and Bob,

data: here are the top dishes

data: to try in Paris...

data: [DONE]

  • Each data: <chunk> line is a text fragment delivered as soon as Gemini produces it.
  • data: [DONE] signals the end of the stream; the connection closes after it is sent.
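
A client consumes the stream by reading it line by line, stripping the `data:` prefix, and stopping at `[DONE]`. The sketch below shows that loop in Python. `iter_sse_chunks` is a hypothetical helper that works on any iterable of already-decoded lines. It assumes every chunk arrives as a single `data:` line, as in the example above, and does not handle multi-line SSE events. How chunks should be joined for display (with or without a separator) is not specified on this page, so the helper leaves that to the caller.

```python
def iter_sse_chunks(lines):
    """Yield text chunks from an iterable of SSE lines until [DONE] is seen."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separator lines, comments, and other fields
        payload = line[len("data:"):]
        if payload.startswith(" "):
            payload = payload[1:]  # SSE drops one leading space after the colon
        if payload == "[DONE]":
            return  # end-of-stream sentinel described above
        yield payload
```

With a live response from `urllib.request.urlopen`, the lines could be supplied as `(line.decode("utf-8") for line in resp)`, since the response object iterates over raw byte lines.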

curl example

# -N disables output buffering so chunks print as they arrive
curl -N -X POST https://colibri-api-643619291153.me-west1.run.app/chat/stream \
  -H "Authorization: Bearer <ID_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "Tell me a fun fact about Paris.", "author": "Alice" }
    ]
  }'

Multi-user chat

When messages from multiple authors are included in a single request, the server automatically:

  1. Tells the AI, through systemInstruction, that it is in a group chat with those users.
  2. Prefixes each message with the sender's name so the AI can track who said what.

{
  "messages": [
    { "role": "user", "content": "Hey Alice, have you been to France?", "author": "Bob" },
    { "role": "user", "content": "Not yet! AI, what should we try there?", "author": "Alice" }
  ]
}

The AI addresses both users by name in its response.
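
To make the two server-side steps concrete, here is an illustrative Python sketch of such a transformation. It is not the server's actual code: the wording of the instruction, the "Name: text" prefix format, and the helper names `distinct_authors` and `prepare_group_chat` are all assumptions made for the example.

```python
def distinct_authors(messages):
    """Return author names in first-seen order, ignoring messages without one."""
    seen = []
    for msg in messages:
        author = msg.get("author")
        if author and author not in seen:
            seen.append(author)
    return seen


def prepare_group_chat(messages):
    """Return (instruction, messages) after applying the two steps above."""
    authors = distinct_authors(messages)
    if len(authors) < 2:
        return None, messages  # fewer than two authors: nothing to change
    # Step 1: hypothetical instruction text naming the participants
    instruction = "You are in a group chat with: " + ", ".join(authors) + "."
    # Step 2: prefix each authored message with the sender's name
    prefixed = [
        {**m, "content": f"{m['author']}: {m['content']}"} if m.get("author") else m
        for m in messages
    ]
    return instruction, prefixed
```

For the request above, this sketch would produce an instruction naming Bob and Alice, and the first message content would become "Bob: Hey Alice, have you been to France?".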


Limits & timeouts

Parameter                 Value
Max messages per request  50
Max message length        10 000 characters
Rate limit                10 requests / 60 seconds per user
Connect timeout           30 s
Response timeout          30 s
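
A client can stay under the rate limit by tracking its own recent requests. The sketch below uses a sliding window of 10 requests per 60 seconds. Whether the server uses a sliding or a fixed window is not stated here, so this is a conservative client-side assumption; `SlidingWindowLimiter` is a hypothetical class, and the injectable `clock` parameter exists only to make it testable.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window_seconds interval."""

    def __init__(self, max_requests=10, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self):
        """Record a request and return True, or return False if over the limit."""
        now = self.clock()
        # Drop timestamps that have aged out of the window
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_requests:
            return False
        self._sent.append(now)
        return True
```

Call `try_acquire()` before each request and delay or queue the request when it returns False.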

Error codes

Code  Meaning
400   Invalid JSON body
401   Missing or invalid Firebase token
405   Wrong HTTP method (only POST allowed)
422   Validation error; check the message field in the response body
429   Rate limit exceeded; see the Retry-After header
500   Unexpected server error
502   Cloud Run / upstream error (retry)
504   Gemini API timeout

Error response body:

{
  "error": "Too Many Requests",
  "message": "Rate limit exceeded. Maximum 10 requests per 60 seconds."
}

Rate-limit response headers:

Retry-After: 60
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 0
X-RateLimit-Reset: <unix ms>
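
The error codes and headers above suggest a simple retry policy: wait for `Retry-After` on a 429, back off and retry on a 502, and give up on everything else. The following Python sketch implements that policy. `retry_delay` is a hypothetical helper; the exponential backoff parameters and the 60-second fallback are assumptions, and `headers` is taken to be a plain dict keyed exactly as shown above. It also assumes `Retry-After` carries a number of seconds, as in the example, rather than an HTTP date.

```python
def retry_delay(status, headers, attempt, base=1.0, cap=30.0):
    """Return seconds to wait before retrying, or None if no retry is advised."""
    if status == 429:
        try:
            return max(0.0, float(headers.get("Retry-After")))
        except (TypeError, ValueError):
            return 60.0  # header missing or unparsable: wait one full window
    if status == 502:
        # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`
        return min(cap, base * 2 ** attempt)
    return None  # 4xx validation/auth errors and other codes: do not retry
```

Here `attempt` is zero for the first retry. Whether 504 responses are worth retrying is not stated on this page, so the sketch leaves them out.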