Deploy an AI-Powered Bot for Discord or Telegram
This guide covers deploying a chat bot for Discord or Telegram on Railway that uses an LLM API (OpenAI, Anthropic, etc.) to generate responses. The bot runs as an always-on service, listens for messages, and replies with AI-generated text.
Railway is a CPU-based platform. The bot calls external LLM APIs over HTTP. No models run locally.
Architecture
The bot consists of two components:
- Bot service runs continuously, connects to the Discord or Telegram API, listens for messages, and calls the LLM API to generate responses.
- Postgres (optional) stores conversation history per user or channel so the bot can maintain context across messages.
Bot services do not need a public domain. They connect outbound to the messaging platform's API and to the LLM API.
Prerequisites
- A Railway account.
- A bot token, from the Discord Developer Portal or from Telegram's @BotFather.
- An API key for your LLM provider (OpenAI, Anthropic, etc.).
1. Set up the project
Create a project directory with a requirements.txt:
For a Discord bot:
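A minimal dependency set might look like this (version pins are illustrative):

```txt
discord.py>=2.3
openai>=1.0
```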
For a Telegram bot:
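A minimal dependency set might look like this (version pins are illustrative):

```txt
python-telegram-bot>=21.0
openai>=1.0
```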
Install dependencies locally with `pip install -r requirements.txt`.
2. Write the bot code
Here is a minimal Discord bot using discord.py and OpenAI. Save this as bot.py:
For Telegram, use python-telegram-bot instead. Save this as bot.py:
3. Deploy to Railway
- Push your bot code to a GitHub repository.
- Create a new project on Railway.
- Click + New > GitHub Repo and select your repository.
- Set the start command to `python bot.py`.
- Set environment variables under the Variables tab:
  - `DISCORD_TOKEN` or `TELEGRAM_TOKEN`: your bot token.
  - `OPENAI_API_KEY` or `ANTHROPIC_API_KEY`: your LLM API key.
- The bot does not need a public domain. It connects outbound to Discord/Telegram servers.
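Alternatively, the start command can live in the repository as config-as-code. A minimal railway.json at the repo root might look like:

```json
{
  "deploy": {
    "startCommand": "python bot.py"
  }
}
```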
Deploy the bot as an always-on service, not a cron job. Bots must stay connected to receive messages in real time.
4. Add conversation memory with Postgres
Without a database, the bot treats every message independently. To maintain conversation context:
- Add PostgreSQL to your project.
- Reference `DATABASE_URL` in the bot service.
- Store recent messages per user or channel:
Before calling the LLM, load the last N messages for the channel and include them in the messages array. This gives the bot conversational context.
5. Handle rate limits
LLM APIs enforce rate limits on requests per minute and tokens per minute. When your bot is active in multiple channels simultaneously, you can hit these limits. Mitigations:
- Use a smaller, faster model (GPT-4o mini, Claude Haiku) for most responses. Reserve larger models for complex queries.
- Add a per-channel cooldown to avoid rapid-fire LLM calls.
- Implement retry logic with exponential backoff for 429 responses.
- Truncate conversation history to limit token usage per request.
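The cooldown and backoff mitigations can be sketched as small helpers (function names, the 3-second cooldown, and the retry parameters are illustrative; the 429 check assumes the SDK's rate-limit exception exposes a status_code attribute):

```python
import random
import time

_last_call = {}  # channel_id -> timestamp of the last LLM call

def cooldown_ok(channel_id, cooldown=3.0, now=None):
    """Return True (and record the call) if the channel's cooldown has expired."""
    current = time.monotonic() if now is None else now
    last = _last_call.get(channel_id)
    if last is not None and current - last < cooldown:
        return False
    _last_call[channel_id] = current
    return True

def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(); on a 429-style error, retry with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as exc:
            if getattr(exc, "status_code", None) != 429 or attempt == max_retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

In the message handler, skip the LLM call when `cooldown_ok(channel_id)` returns False, and wrap the completion request in `call_with_backoff`.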
Next steps
- Deploy an AI Agent with Async Workers: For bots that need to run longer tasks.
- PostgreSQL on Railway: Connection pooling, backups, and configuration.
- Monitor your app: View logs and metrics for your bot service.