Scaling to handle more traffic

Performance Tuning Overview

RhodeCode 5.x provides multiple scaling dimensions to optimize performance for your workload. The key areas to tune are:

  • Gunicorn workers and threads - Controls concurrent request handling per replica

  • Docker replicas - Horizontal scaling of services

For production deployments, especially after migrating from 4.x, review and tune these settings comprehensively to achieve optimal performance.

Scaling RhodeCode App HTTP Traffic

Adjust the .custom/docker-compose-apps.override.yaml file to scale services horizontally:

services:
  rhodecode:
    deploy:
      replicas: 4

  vcsserver:
    deploy:
      replicas: 4

Scaling SSH Traffic

SSH workers can scale independently from HTTP traffic. For high SSH workloads, increase replicas for the sshd service.

Edit .custom/docker-compose-apps.override.yaml:

services:
  sshd:
    deploy:
      replicas: 4

Note

For production deployments with heavy SSH usage (e.g., many developers using git/hg over SSH), 4-7 replicas are recommended depending on load.

Scaling SVN Traffic

SVN operations can be resource-intensive. For deployments with many SVN repositories, scale the SVN service independently.

Edit .custom/docker-compose-apps.override.yaml:

services:
  svn:
    deploy:
      replicas: 4

Note

SVN operations can be particularly demanding. For large SVN deployments, 4-5 replicas are recommended depending on workload.

Scaling Celery Workers

Celery workers handle background tasks and can scale independently from HTTP traffic.

Edit .custom/docker-compose-apps.override.yaml:

services:
  celery:
    deploy:
      replicas: 4

  celery-beat:
    deploy:
      # Must be 1 to avoid duplicate scheduled tasks
      replicas: 1

Note

  • Celery has an autoscaler defined by --autoscale=20,2

  • This means 2 processes minimum, scaling up to 20 per replica

  • Total capacity = replicas × maximum autoscale number

  • For production: 4 worker replicas provide good balance

  • Important: celery-beat must always be 1 replica to avoid duplicate scheduled tasks
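The capacity arithmetic above can be sketched in Python. This is only an illustration of the formula; the bounds mirror the `--autoscale=20,2` setting (maximum, minimum) mentioned in the note:

```python
# Sketch of the Celery capacity formula described above.
# autoscale_max / autoscale_min mirror --autoscale=20,2 (max, min).
def celery_capacity(replicas: int, autoscale_max: int = 20, autoscale_min: int = 2) -> dict:
    """Return the minimum and maximum concurrent Celery worker processes."""
    return {
        "min_processes": replicas * autoscale_min,
        "max_processes": replicas * autoscale_max,
    }

# Four worker replicas, as recommended for production:
print(celery_capacity(4))  # {'min_processes': 8, 'max_processes': 80}
```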

Apply replica changes without downtime:

./rcstack stack rhodecode up --detach --no-recreate --scale rhodecode=4 --scale vcsserver=4

Gunicorn Configuration

Edit config/_shared/gunicorn_conf_rc.py for the rhodecode service and config/_shared/gunicorn_conf_vcs.py for the vcsserver service:

# Enable port reuse for better load balancing
reuse_port = True

# Use threaded worker class
worker_class = "gthread"

# Workers handle concurrent requests
# VCS server typically needs around 50% more workers
workers = 4

# Threads per worker for concurrent request handling
threads = 5

# Maximum concurrent connections per worker
worker_connections = 30
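The "around 50% more workers" rule of thumb from the comment above can be expressed as a small sketch. The 1.5 factor is the ratio implied by the 4-worker rhodecode / 6-worker vcsserver recommendation, not a value enforced anywhere:

```python
import math

def vcs_workers(rhodecode_workers: int, factor: float = 1.5) -> int:
    # VCS server typically needs ~50% more workers than rhodecode.
    return math.ceil(rhodecode_workers * factor)

print(vcs_workers(4))  # 6 workers for vcsserver
```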

Note

Recommended Configuration:

  • For production: 4 replicas with 4 workers and 5 threads for rhodecode

  • VCS server typically needs more workers (6 workers recommended)

  • Replicas of vcsserver and rhodecode should typically be equal

  • Total capacity = replicas × workers × threads

  • Set worker_class = "gthread" and reuse_port = True for both rhodecode and vcsserver

  • Monitor CPU and memory usage to fine-tune these values
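As a quick check of the "Total capacity = replicas × workers × threads" formula, a minimal Python sketch using the recommended production values:

```python
# Total concurrent HTTP requests the stack can serve with gthread workers.
def http_capacity(replicas: int, workers: int, threads: int) -> int:
    return replicas * workers * threads

# Recommended production values for rhodecode: 4 replicas x 4 workers x 5 threads
print(http_capacity(4, 4, 5))  # 80 concurrent requests
```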

Additional Performance Parameters for Gunicorn

For production deployments, consider tuning these additional gunicorn parameters in both config/_shared/gunicorn_conf_rc.py and config/_shared/gunicorn_conf_vcs.py:

# Worker restart settings (prevents memory leaks)
max_requests = 2000
max_requests_jitter = int(max_requests * 0.2)  # Adds randomness to prevent simultaneous restarts

# Connection backlog (pending connections queue)
backlog = 64

# Worker timeout (default: 30s, RhodeCode default: 6hrs for long operations)
timeout = 21600

# Graceful restart timeout
graceful_timeout = 21600

# Keep-alive for client connections
keepalive = 2

Note

Key Parameters:

  • max_requests - Worker restarts after handling N requests (0 = disabled; gunicorn defaults to 0)

  • max_requests_jitter - Randomizes restart timing using randint(0, jitter_value)

  • backlog - Max pending connections (default: 2048). Generally 64-2048 range

  • timeout - Worker silent timeout in seconds (default: 30s, 0 = disabled)

  • graceful_timeout - Time for workers to finish requests during restart (default: 30s)

  • keepalive - Keep-alive timeout for client connections (default: 2s, range: 1-5s)
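The effect of max_requests_jitter can be illustrated with a short sketch. It follows the randint(0, jitter_value) formula noted above, with each worker drawing its own restart threshold once at startup so workers do not all recycle at the same request count; the numbers are the example values from the config snippet:

```python
import random

max_requests = 2000
max_requests_jitter = int(max_requests * 0.2)  # 400

def restart_threshold(seed: int) -> int:
    # Each worker adds a random jitter to max_requests at boot,
    # spreading restarts out instead of synchronizing them.
    rng = random.Random(seed)
    return max_requests + rng.randint(0, max_requests_jitter)

thresholds = [restart_threshold(s) for s in range(4)]
# Every threshold falls in [2000, 2400]
assert all(2000 <= t <= 2400 for t in thresholds)
print(thresholds)
```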