Chat API
Base URL: https://colibri-api-643619291153.me-west1.run.app
All chat endpoints require a valid Firebase ID token:
Authorization: Bearer <Firebase ID Token>
POST /chat
Sends the full message history and returns the AI response as a single JSON object.
Request
{
"messages": [
{ "role": "user", "content": "Hello!", "author": "Alice" },
{ "role": "assistant", "content": "Hi! How can I help?" },
{ "role": "user", "content": "What should we eat in Paris?", "author": "Bob" }
]
}
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | ✅ | user, assistant, or system |
| content | string | ✅ | Message text. Max 10 000 characters |
| author | string | ❌ | Display name of the sender. When multiple distinct authors are present the AI receives a group-chat context and addresses users by name |
Limits: max 50 messages per request.
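The field rules and limits above can be checked on the client before a request is sent, which avoids spending a rate-limited call on a payload that would be rejected. The following is a minimal sketch, not the server's actual validation logic: the function name validate_messages and the wording of the problem strings are hypothetical, and only the constraints listed in the table are checked.

```python
from __future__ import annotations

from typing import Any

MAX_MESSAGES = 50            # documented: max 50 messages per request
MAX_CONTENT_LENGTH = 10_000  # documented: max 10 000 characters per message
ALLOWED_ROLES = {"user", "assistant", "system"}


def validate_messages(messages: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems; an empty list means the payload passes these checks.

    Hypothetical client-side helper. The server may apply further rules
    that are not documented here.
    """
    problems: list[str] = []
    if len(messages) > MAX_MESSAGES:
        problems.append(f"too many messages: {len(messages)} > {MAX_MESSAGES}")
    for index, message in enumerate(messages):
        role = message.get("role")
        content = message.get("content")
        author = message.get("author")
        if not isinstance(role, str) or role not in ALLOWED_ROLES:
            problems.append(f"messages[{index}].role is invalid: {role!r}")
        if not isinstance(content, str):
            problems.append(f"messages[{index}].content must be a string")
        elif len(content) > MAX_CONTENT_LENGTH:
            problems.append(
                f"messages[{index}].content exceeds {MAX_CONTENT_LENGTH} characters"
            )
        # author is optional, but must be a string when present
        if author is not None and not isinstance(author, str):
            problems.append(f"messages[{index}].author must be a string when present")
    return problems
```

A caller would typically send the request only when the returned list is empty and show the problems to the user otherwise.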
Response 200
{
"reply": "Alice and Bob, here are the top dishes to try in Paris...",
"message": {
"id": "1740000000000",
"role": "assistant",
"content": "Alice and Bob, here are the top dishes to try in Paris...",
"timestamp": "2026-03-03T12:00:00.000Z"
}
}
curl example
curl -X POST https://colibri-api-643619291153.me-west1.run.app/chat \
-H "Authorization: Bearer <ID_TOKEN>" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "user", "content": "Tell me a fun fact about Paris.", "author": "Alice" }
]
}'
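The same call can be made from Python using only the standard library. This is a sketch under a few assumptions: the helper names build_chat_request and send_chat are hypothetical, the 30-second client timeout simply mirrors the documented response timeout, and the caller is expected to obtain the Firebase ID token elsewhere.

```python
from __future__ import annotations

import json
import urllib.request

BASE_URL = "https://colibri-api-643619291153.me-west1.run.app"


def build_chat_request(id_token: str, messages: list[dict]) -> urllib.request.Request:
    """Build (but do not send) a POST /chat request with the documented headers."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {id_token}",
            "Content-Type": "application/json",
        },
    )


def send_chat(id_token: str, messages: list[dict], timeout: float = 30.0) -> str:
    """Send the request and return the "reply" field of the JSON response.

    Non-2xx responses raise urllib.error.HTTPError; the caller decides
    how to handle them.
    """
    request = build_chat_request(id_token, messages)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["reply"]
```

Splitting request construction from sending keeps the header and body logic testable without network access.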
POST /chat/stream
Sends the full message history and streams the AI response chunk-by-chunk via Server-Sent Events (SSE).
Use this endpoint to show a "typing" effect in the UI: text appears chunk by chunk as Gemini generates it, instead of all at once after the full response is ready.
Request
Same JSON body as POST /chat, including the optional author field.
Response headers
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
Response body — SSE event stream
data: Alice and Bob,
data: here are the top dishes
data: to try in Paris...
data: [DONE]
- Each data: <chunk> line is a text fragment delivered as soon as Gemini produces it.
- data: [DONE] signals the end of the stream; the connection closes after this.
curl example
# -N disables output buffering so chunks print as they arrive
curl -N -X POST https://colibri-api-643619291153.me-west1.run.app/chat/stream \
-H "Authorization: Bearer <ID_TOKEN>" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "user", "content": "Tell me a fun fact about Paris.", "author": "Alice" }
]
}'
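On the client, the event stream described above can be consumed line by line. The sketch below assumes the format shown in the example (one data: line per chunk, terminated by data: [DONE]); the function name iter_sse_chunks is hypothetical, and whether chunks carry their own whitespace for joining is not documented, so the code returns them unmodified.

```python
from typing import Iterable, Iterator

DONE_MARKER = "[DONE]"  # documented end-of-stream sentinel


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each "data:" line until the [DONE] marker is seen.

    `lines` can be any iterable of decoded text lines, for example the
    lines of an HTTP response body decoded as UTF-8.
    """
    for raw_line in lines:
        line = raw_line.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separator lines and any other SSE fields
        payload = line[len("data:"):]
        if payload.startswith(" "):
            payload = payload[1:]  # SSE drops a single space after the colon
        if payload == DONE_MARKER:
            return  # end of stream; anything after this is ignored
        yield payload
```

A UI would append each yielded chunk to the visible message as it arrives, which produces the typing effect.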
Multi-user chat
When messages from multiple authors are included in a single request, the server automatically:
- Tells the AI it is in a group chat with those users (systemInstruction).
- Prefixes each message with the sender's name so the AI can track who said what.
{
"messages": [
{ "role": "user", "content": "Hey Alice, have you been to France?", "author": "Bob" },
{ "role": "user", "content": "Not yet! AI, what should we try there?", "author": "Alice" }
]
}
The AI addresses both users by name in its response.
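To make the two server-side steps concrete, here is an illustrative sketch of how such preprocessing could work. It is not the server's actual code: the function names, the wording of the system instruction, and the "Name: text" prefix format are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Optional


def distinct_authors(messages: list[dict]) -> list[str]:
    """Return author names in order of first appearance, without duplicates."""
    seen: list[str] = []
    for message in messages:
        author = message.get("author")
        if author and author not in seen:
            seen.append(author)
    return seen


def prepare_group_chat(messages: list[dict]) -> tuple[Optional[str], list[str]]:
    """Return (system_instruction, message_texts).

    With fewer than two distinct authors, no instruction is produced and
    the message texts are left unchanged.
    """
    authors = distinct_authors(messages)
    if len(authors) < 2:
        return None, [message["content"] for message in messages]
    # Assumed wording; the real systemInstruction text is not documented.
    instruction = (
        "You are in a group chat with " + ", ".join(authors) + ". "
        "Address users by name."
    )
    # Assumed prefix format "Name: text"; messages without an author stay as-is.
    texts = [
        f"{message['author']}: {message['content']}"
        if message.get("author")
        else message["content"]
        for message in messages
    ]
    return instruction, texts
```

For the request above, this sketch would produce an instruction naming Bob and Alice and the texts "Bob: Hey Alice, have you been to France?" and "Alice: Not yet! AI, what should we try there?".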
Limits & timeouts
| Parameter | Value |
|---|---|
| Max messages per request | 50 |
| Max message length | 10 000 characters |
| Rate limit | 10 requests / 60 seconds per user |
| Connect timeout | 30 s |
| Response timeout | 30 s |
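A client can stay under the documented rate limit by throttling itself. The sketch below uses a sliding window of 10 requests per 60 seconds; the class name SlidingWindowThrottle is hypothetical, and the server's own windowing algorithm is not documented, so it may count requests differently.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowThrottle:
    """Track send times and report how long to wait before the next request."""

    def __init__(
        self,
        max_requests: int = 10,         # documented limit
        window_seconds: float = 60.0,   # documented window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sent: Deque[float] = deque()

    def seconds_until_allowed(self) -> float:
        """Return 0.0 if a request may be sent now, else the wait in seconds."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            return 0.0
        return self._window - (now - self._sent[0])

    def record(self) -> None:
        """Call once for every request actually sent."""
        self._sent.append(self._clock())
```

The injectable clock exists only to make the class testable; production code can rely on the default time.monotonic.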
Error codes
| Code | Meaning |
|---|---|
| 400 | Invalid JSON body |
| 401 | Missing or invalid Firebase token |
| 405 | Wrong HTTP method (only POST allowed) |
| 422 | Validation error; check message field in response body |
| 429 | Rate limit exceeded; see Retry-After header |
| 500 | Unexpected server error |
| 502 | Cloud Run / upstream error (retry) |
| 504 | Gemini API timeout |
Error response body:
{
"error": "Too Many Requests",
"message": "Rate limit exceeded. Maximum 10 requests per 60 seconds."
}
Rate-limit response headers:
Retry-After: 60
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 0
X-RateLimit-Reset: <unix ms>
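The error codes and rate-limit headers above suggest a simple retry policy. The following sketch is one possible reading, not a prescribed client behavior: the function name retry_delay_seconds, the 1-second delay for 502, and the 60-second fallback are assumptions. Only 429 and 502 are treated as retryable because those are the two codes the table associates with retrying.

```python
from typing import Mapping, Optional

RETRYABLE_STATUSES = {429, 502}
DEFAULT_502_DELAY = 1.0   # assumed backoff; not specified by the API
DEFAULT_429_DELAY = 60.0  # assumed fallback, equal to the documented window


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], now_ms: float
) -> Optional[float]:
    """Return how long to wait before retrying, or None if a retry is pointless.

    `headers` is looked up with the exact names shown in the docs; pass a
    case-insensitive mapping if your HTTP client does not normalize names.
    `now_ms` is the current Unix time in milliseconds.
    """
    if status not in RETRYABLE_STATUSES:
        return None
    if status == 502:
        return DEFAULT_502_DELAY
    # 429: prefer Retry-After (seconds), then X-RateLimit-Reset (Unix ms).
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None and reset.strip().isdigit():
        return max(0.0, (int(reset.strip()) - now_ms) / 1000.0)
    return DEFAULT_429_DELAY
```

A caller would sleep for the returned number of seconds before resending, and surface the error to the user when the function returns None.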