
Deploy a Multi-Agent System on Railway

Tags: agents, multi-agent, python, workers

A multi-agent system uses multiple specialized AI agents that collaborate on tasks. Each agent has a specific role (researcher, writer, reviewer) and calls an LLM API to reason through its part of the work. Agents communicate by reading and writing shared state in a database and message queue.

This guide covers deploying a multi-agent system on Railway where each agent runs as a separate service. This lets you scale agents independently, use different LLM providers per agent, and isolate failures.

Railway is a CPU-based platform. All agents call external LLM APIs (OpenAI, Anthropic, etc.) over HTTP. No models run locally.

Architecture overview

The system uses four types of components:

  • Orchestrator service receives tasks via HTTP, creates subtasks, and dispatches them to agent-specific Redis queues.
  • Agent services (one per role) pull tasks from their queue, call their assigned LLM, write results to Postgres, and push downstream tasks to the next agent's queue.
  • Postgres stores task state, agent outputs, and final results.
  • Redis serves as the message queue between services.

This extends the single-agent async workers pattern to multiple specialized agents.

Prerequisites

  • A Railway account
  • API keys for one or more LLM providers (OpenAI, Anthropic, etc.)
  • Python 3.11+

Project structure

Create a repository with the following structure:
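A layout like the following works with the start commands used later in this guide (the empty `agents/__init__.py` makes `python -m agents.researcher` resolvable as a module):

```text
.
├── requirements.txt
├── shared.py
├── orchestrator.py
└── agents/
    ├── __init__.py
    ├── researcher.py
    └── writer.py
```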

requirements.txt
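A minimal dependency set for the sketches in this guide; swap in `anthropic` or other provider SDKs per agent as needed:

```text
fastapi
uvicorn
redis
psycopg2-binary
openai
```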

shared.py

Shared utilities for database and Redis connections used by all services:
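A minimal sketch of the shared module. It assumes Railway injects `DATABASE_URL` and `REDIS_URL`; the `queue:<role>` naming convention is illustrative and just needs to match across services:

```python
# shared.py -- connection helpers used by every service.
import os


def queue_for(role: str) -> str:
    """Redis list name for a given agent role, e.g. queue:researcher."""
    return f"queue:{role}"


def get_redis():
    """Create a Redis client from REDIS_URL (imported lazily)."""
    import redis
    return redis.from_url(os.environ["REDIS_URL"])


def get_db():
    """Open a Postgres connection from DATABASE_URL (imported lazily)."""
    import psycopg2
    return psycopg2.connect(os.environ["DATABASE_URL"])
```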

orchestrator.py

The orchestrator receives HTTP requests, creates tasks, and dispatches them to agent queues:

agents/researcher.py

The researcher agent pulls tasks from its queue, calls the LLM to research a topic, then creates a follow-up task for the writer:
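A sketch of the researcher worker loop. The prompt, model name (`gpt-4o-mini`), and column names are assumptions to adapt; the queue names must match the orchestrator's:

```python
# agents/researcher.py -- pulls tasks, calls the LLM, hands off to the writer.
import json
import os
import time

ROLE = "researcher"
NEXT_ROLE = "writer"


def research_prompt(topic: str) -> str:
    """Prompt sent to the LLM for the research step."""
    return f"Research the topic below and return key facts as bullet points.\n\nTopic: {topic}"


def next_task(task: dict, research: str) -> dict:
    """Follow-up task pushed onto the writer's queue."""
    return {"id": task["id"], "topic": task["topic"], "research": research, "role": NEXT_ROLE}


def main():
    import psycopg2
    import redis
    from openai import OpenAI

    time.sleep(int(os.environ.get("STARTUP_DELAY_SECONDS", "0")))
    r = redis.from_url(os.environ["REDIS_URL"])
    client = OpenAI()  # reads OPENAI_API_KEY
    while True:
        _, raw = r.blpop(f"queue:{ROLE}")  # block until a task arrives
        task = json.loads(raw)
        research = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": research_prompt(task["topic"])}],
        ).choices[0].message.content
        with psycopg2.connect(os.environ["DATABASE_URL"]) as conn, conn.cursor() as cur:
            cur.execute("UPDATE tasks SET status = 'researched' WHERE id = %s", (task["id"],))
        r.rpush(f"queue:{NEXT_ROLE}", json.dumps(next_task(task, research)))


if __name__ == "__main__":
    main()
```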

agents/writer.py

The writer agent takes research output and produces a finished article:
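A sketch of the writer worker loop, mirroring the researcher; the prompt and model name are again assumptions:

```python
# agents/writer.py -- consumes research output and writes the final article.
import json
import os
import time

ROLE = "writer"


def writing_prompt(topic: str, research: str) -> str:
    """Prompt sent to the LLM for the writing step."""
    return (
        "Write a polished article on the topic below, using the research notes.\n\n"
        f"Topic: {topic}\n\nResearch:\n{research}"
    )


def main():
    import psycopg2
    import redis
    from openai import OpenAI

    time.sleep(int(os.environ.get("STARTUP_DELAY_SECONDS", "0")))
    r = redis.from_url(os.environ["REDIS_URL"])
    client = OpenAI()
    while True:
        _, raw = r.blpop(f"queue:{ROLE}")
        task = json.loads(raw)
        article = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": writing_prompt(task["topic"], task["research"])}],
        ).choices[0].message.content
        with psycopg2.connect(os.environ["DATABASE_URL"]) as conn, conn.cursor() as cur:
            cur.execute(
                "UPDATE tasks SET status = 'done', result = %s WHERE id = %s",
                (article, task["id"]),
            )


if __name__ == "__main__":
    main()
```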

1. Create the project and shared infrastructure

  1. Create a new project on Railway.
  2. Add PostgreSQL: click + New > Database > PostgreSQL.
  3. Add Redis: click + New > Database > Redis.

Create the tasks table by connecting to Postgres and running:
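A minimal schema sketch; the column names match the code examples in this guide and can be extended as needed:

```sql
CREATE TABLE tasks (
    id UUID PRIMARY KEY,
    topic TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'queued',
    result TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```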

2. Deploy the orchestrator

  1. Push the code above to a GitHub repository.
  2. In your project, click + New > GitHub Repo and select your repository.
  3. Set the start command to: uvicorn orchestrator:app --host 0.0.0.0 --port $PORT
  4. Set environment variables: reference DATABASE_URL and REDIS_URL from the Postgres and Redis services.
  5. Generate a public domain for receiving task requests.

3. Deploy agent services

Each agent is a separate Railway service pointing at the same repository with a different start command:

  1. Click + New > GitHub Repo and select the same repository again.
  2. Name the service researcher.
  3. Set the start command to: python -m agents.researcher
  4. Set environment variables:
    • Reference DATABASE_URL and REDIS_URL (same as the orchestrator).
    • Set OPENAI_API_KEY to your API key.
    • Set STARTUP_DELAY_SECONDS to 0.
  5. No public domain is needed. The agent communicates via Redis and Postgres over private networking.

Repeat for the writer agent:

  1. Click + New > GitHub Repo and select the same repository.
  2. Name the service writer.
  3. Set the start command to: python -m agents.writer
  4. Set the same environment variables, but set STARTUP_DELAY_SECONDS to 5 to stagger startup and avoid hitting LLM API rate limits.

4. Test the system

Send a task to the orchestrator:
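For example, with the standard library (replace the base URL with your orchestrator's generated domain; the `/tasks` route and `topic` field are assumptions matching the sketches in this guide):

```python
# Submit a task to the orchestrator over HTTP.
import json
from urllib import request


def task_payload(topic: str) -> bytes:
    """JSON body for a task submission."""
    return json.dumps({"topic": topic}).encode()


def submit_task(base_url: str, topic: str) -> dict:
    req = request.Request(
        f"{base_url}/tasks",
        data=task_payload(topic),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Replace with your orchestrator's public domain.
    print(submit_task("https://your-orchestrator.up.railway.app", "solar power trends"))
```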

This returns a task ID. The researcher picks it up, calls the LLM, writes the research to Postgres, and dispatches a writing task. The writer picks that up and produces an article.

Check the result:
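Assuming the orchestrator exposes GET /tasks/&lt;id&gt; (adjust to your route), polling for the result looks like:

```python
# Fetch the status and final article for a task.
import json
from urllib import request


def task_url(base_url: str, task_id: str) -> str:
    return f"{base_url}/tasks/{task_id}"


def get_result(base_url: str, task_id: str) -> dict:
    with request.urlopen(task_url(base_url, task_id)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Replace with your orchestrator's domain and the returned task ID.
    print(get_result("https://your-orchestrator.up.railway.app", "YOUR-TASK-ID"))
```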

5. Scale agents independently

Each agent is a separate Railway service with its own scaling configuration:

  • Add horizontal replicas to high-throughput agents (e.g., 3 researcher replicas, 1 writer replica). All replicas pull from the same Redis queue.
  • Adjust CPU and memory per agent under Settings > Resources.
  • Monitor queue depth in Redis to identify bottlenecks.
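A small script along these lines can report queue depth per role (the `queue:<role>` naming is an assumption matching the rest of this guide):

```python
# Report pending tasks per agent queue to spot bottlenecks.
import os

AGENT_ROLES = ["researcher", "writer"]


def depth_report(depths: dict) -> str:
    """Format {role: depth} as one line per queue."""
    return "\n".join(f"queue:{role} -> {n} pending" for role, n in depths.items())


def main():
    import redis
    r = redis.from_url(os.environ["REDIS_URL"])
    depths = {role: r.llen(f"queue:{role}") for role in AGENT_ROLES}
    print(depth_report(depths))


if __name__ == "__main__":
    main()
```

A persistently deep queue is the signal to add replicas to that agent; a queue that is always empty while its upstream queue grows points at the upstream agent instead.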

Using a framework

If you prefer a framework for agent orchestration, CrewAI and AutoGen handle agent definitions, task routing, and conversation management. Deploy the framework as your orchestrator service and let it manage agent interactions internally, or split agents into separate services for independent scaling.

Next steps