Streaming Responses

Stream responses in real time to improve user experience and reduce perceived latency.

With streaming enabled, tokens are delivered as they are generated, so users see output immediately instead of waiting for the full response.

Overview

Requesty supports Server-Sent Events (SSE) streaming across all major providers (OpenAI, Anthropic, Google, Mistral). Your application can render content progressively while it is being generated without waiting for the full response.
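To make the SSE mechanics concrete, here is a minimal sketch of how a client might extract text from a raw stream. It assumes the OpenAI-compatible wire format (lines beginning with `data:` that carry JSON chunks, ending with a `data: [DONE]` sentinel); the function name `iter_sse_content` and the sample lines are illustrative, not part of any Requesty SDK.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines (assumed wire format)."""
    for raw in lines:
        line = raw.strip()
        # Skip blank event separators and SSE comments such as ": keep-alive"
        if not line or line.startswith(":"):
            continue
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # "[DONE]" is the conventional end-of-stream sentinel (assumption)
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        for choice in event.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines standing in for a real response body
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    ": keep-alive",
    'data: {"choices": [{"delta": {"content": ", stars"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_sse_content(sample)))
```

In practice the OpenAI client libraries do this parsing for you; the sketch only illustrates what travels over the connection.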

Why Stream?

Better UX: Users see output immediately; perceived wait time can drop by up to 80%.
Higher engagement: Real-time delivery keeps users engaged during long responses.
Fewer timeouts: Data starts flowing early, so slow or complex requests are less likely to hit idle timeouts.
Progressive display: Enable incremental UI updates as chunks arrive.

Implementation

Basic Streaming Setup

Enable streaming by setting the stream parameter to true in your request. The examples below show the same request in Python, JavaScript, and Bash (curl), in that order:

import openai

client = openai.OpenAI(
    api_key="your_requesty_api_key",
    base_url="https://gw.1route.ai/v1",
)

response = client.chat.completions.create(
    model="openai/gpt-4",
    messages=[{"role": "user", "content": "Write a poem about the stars."}],
    stream=True
)

# Handle streamed chunks as they arrive
for chunk in response:
    # Some chunks (e.g. a trailing usage chunk) may carry no choices
    if not chunk.choices:
        continue
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end="", flush=True)

const { OpenAI } = require('openai');

const client = new OpenAI({
    apiKey: "your_requesty_api_key",
    baseURL: "https://gw.1route.ai/v1",
});

async function streamResponse() {
    const stream = await client.chat.completions.create({
        model: "openai/gpt-4",
        messages: [{ "role": "user", "content": "Write a poem about the stars." }],
        stream: true
    });

    for await (const chunk of stream) {
        // Optional chaining guards against chunks that carry no choices
        const content = chunk.choices[0]?.delta?.content;
        if (content) {
            process.stdout.write(content);
        }
    }
}

streamResponse();

# Invoke a model with streaming enabled
API_KEY="your_requesty_api_key"

curl -N -X POST "https://gw.1route.ai/v1/chat/completions" \
 -H "Authorization: Bearer $API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"model": "openai/gpt-4", "messages": [{"role": "user", "content": "Write a poem about the stars."}], "stream": true}'
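Applications often need the complete text as well as the incremental updates, for example to store the reply after rendering it. The following Python sketch forwards each delta to a callback and returns the assembled response. `collect_stream` and `fake_chunk` are hypothetical helpers introduced here for illustration; `fake_chunk` only mimics the chunk shape used in the Python example above so the code runs without a network call.

```python
from types import SimpleNamespace
from typing import Callable, Iterable


def collect_stream(chunks: Iterable, on_text: Callable[[str], None]) -> str:
    """Forward each text delta to on_text and return the full response text."""
    parts = []
    for chunk in chunks:
        # Defensive: skip chunks that carry no choices
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            on_text(text)       # progressive display hook
            parts.append(text)  # keep for the final result
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in for a streamed chunk, for offline testing only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [
    fake_chunk("Twinkle"),
    SimpleNamespace(choices=[]),  # chunk without choices
    fake_chunk(None),             # chunk without text content
    fake_chunk(", twinkle"),
]
seen = []
full_text = collect_stream(demo, seen.append)
print(full_text)
```

With a real request, you would pass the `response` object from the Python example in place of `demo`, and a callback such as `lambda t: print(t, end="", flush=True)` in place of `seen.append`.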
