Deploy an AI Chatbot with Streaming Responses
This guide covers deploying a chatbot on Railway that streams responses from an LLM API (OpenAI or Anthropic) to the browser using Server-Sent Events. The chatbot uses Next.js with the Vercel AI SDK, which handles the streaming protocol on both the server and client.
Railway is a CPU-based platform. The chatbot calls external LLM APIs over HTTP; it does not run models locally.
What you will set up
- A Next.js app with a streaming chat endpoint using the AI SDK
- An SSE connection that delivers tokens to the browser as they are generated
- Environment variables for your LLM API key
- A public domain for accessing the chatbot
- Optional: Postgres for persisting conversation history
Prerequisites
- A Railway account
- An API key from OpenAI or Anthropic
- A Next.js app with the AI SDK installed, or a willingness to start from the AI SDK's chat template
1. Create the project
If you do not already have a Next.js app, create one and install the AI SDK:
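A minimal sketch of the setup commands, assuming the AI SDK 4.x package layout (`ai` plus a provider package; check the AI SDK docs for your version):

```shell
# Scaffold a Next.js app, then install the AI SDK core and the OpenAI provider.
npx create-next-app@latest my-chatbot
cd my-chatbot
npm install ai @ai-sdk/openai
```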
To use Anthropic instead of OpenAI, install @ai-sdk/anthropic in place of @ai-sdk/openai.
2. Set up the chat API route
The AI SDK provides a streamText function that calls the LLM and returns a streaming response. Create an API route that uses it:
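A minimal route handler might look like the following. This is a sketch based on the AI SDK 4.x API; the model id `gpt-4o` and the `toDataStreamResponse` helper are assumptions to adjust for your SDK version:

```typescript
// app/api/chat/route.ts
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';

export async function POST(req: Request) {
  // useChat sends the full message history with each request.
  const { messages } = await req.json();

  const result = streamText({
    model: openai('gpt-4o'), // reads OPENAI_API_KEY from the environment
    messages,
  });

  // Stream tokens back to the browser as they are generated.
  return result.toDataStreamResponse();
}
```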
To use Anthropic instead, swap the provider:
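Only the provider import and the model id change; the Claude model id below is illustrative:

```typescript
// app/api/chat/route.ts
import { anthropic } from '@ai-sdk/anthropic';
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: anthropic('claude-3-5-sonnet-latest'), // reads ANTHROPIC_API_KEY
    messages,
  });

  return result.toDataStreamResponse();
}
```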
3. Set up the chat UI
The AI SDK's useChat hook manages the message list, input state, and SSE connection:
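A minimal chat page as a sketch, assuming the AI SDK 4.x import path `ai/react` (newer versions move the hook to `@ai-sdk/react`):

```typescript
// app/page.tsx
'use client';

import { useChat } from 'ai/react';

export default function Chat() {
  // useChat wires up message state, the input field, and the SSE stream.
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <main>
      {messages.map((message) => (
        <div key={message.id}>
          <strong>{message.role === 'user' ? 'You' : 'AI'}:</strong>{' '}
          {message.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Say something..."
        />
      </form>
    </main>
  );
}
```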
useChat sends a POST request to /api/chat and reads the streamed response. Tokens appear in the UI as they arrive.
4. Deploy to Railway
- Push your code to a GitHub repository.
- Create a new project on Railway.
- Click + New and select GitHub Repo, then choose your repository.
- Add your LLM API key as an environment variable:
  - For OpenAI: set `OPENAI_API_KEY`
  - For Anthropic: set `ANTHROPIC_API_KEY`
- Generate a public domain under Settings > Networking > Public Networking.
Railway auto-detects Next.js via Railpack and configures the build. The service will be live at your generated domain once the first deploy completes.
Streaming and Railway's request duration limit
Railway has a maximum HTTP request duration of 15 minutes. SSE connections that stay open longer than 15 minutes are terminated.
For a chatbot, this is rarely a problem: most LLM responses complete in seconds. If your chatbot runs multi-step agent tasks that could take longer, consider the async workers pattern instead of holding an SSE connection open.
The AI SDK's useChat hook handles reconnection automatically. If a connection drops, the client re-sends the message history on the next request, so the user does not lose context.
Optional: persist conversations with Postgres
Without a database, conversation history only exists in the browser's memory. To persist conversations across sessions:
- Add PostgreSQL to your project: click + New > Database > PostgreSQL.
- Reference the `DATABASE_URL` variable in your Next.js service.
- Create a table for messages:
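A minimal schema sketch; the table and column names are illustrative, not prescribed by the AI SDK:

```sql
CREATE TABLE messages (
  id              BIGSERIAL PRIMARY KEY,
  conversation_id UUID        NOT NULL,
  role            TEXT        NOT NULL CHECK (role IN ('user', 'assistant', 'system')),
  content         TEXT        NOT NULL,
  created_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Speeds up loading a conversation's history in order.
CREATE INDEX messages_conversation_idx ON messages (conversation_id, created_at);
```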
- In your API route, load prior messages from Postgres before calling `streamText`, and save the assistant's response after the stream completes.
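A sketch of the persistence flow, assuming the `postgres` (postgres.js) client, the schema above, and a `conversationId` field sent by the client; `onFinish` is the AI SDK's stream-completion callback:

```typescript
// app/api/chat/route.ts (with persistence)
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import postgres from 'postgres';

const sql = postgres(process.env.DATABASE_URL!);

export async function POST(req: Request) {
  const { messages, conversationId } = await req.json();

  // Load prior turns for this conversation before calling the model.
  const history = await sql`
    SELECT role, content FROM messages
    WHERE conversation_id = ${conversationId}
    ORDER BY created_at`;

  const result = streamText({
    model: openai('gpt-4o'),
    messages: [...history, ...messages],
    // Save the assistant's response once the stream completes.
    onFinish: async ({ text }) => {
      await sql`
        INSERT INTO messages (conversation_id, role, content)
        VALUES (${conversationId}, 'assistant', ${text})`;
    },
  });

  return result.toDataStreamResponse();
}
```

Saving the user's message on arrival (before `streamText`) follows the same `INSERT` pattern.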
Next steps
- Choose Between SSE and WebSockets: Protocol tradeoffs for real-time applications.
- Deploy an AI Agent with Async Workers: For tasks that take minutes, not seconds.
- Deploy a RAG Pipeline with pgvector: Add knowledge retrieval to your chatbot.
- PostgreSQL on Railway: Connection pooling, backups, and configuration.