Scaling to handle more traffic

Performance Tuning Overview

RhodeCode 5.x provides multiple scaling dimensions to optimize performance for your workload. The key areas to tune are:

  • Gunicorn workers and threads - Controls concurrent request handling per replica

  • Docker replicas - Horizontal scaling of services

For production deployments, especially after migrating from 4.x, review and tune these settings comprehensively to achieve optimal performance.

Scaling RhodeCode App HTTP Traffic

Adjust the .custom/docker-compose-apps.override.yaml file to scale services horizontally:

services:
  rhodecode:
    deploy:
      replicas: 4

  vcsserver:
    deploy:
      replicas: 4

Scaling SSH Traffic

SSH workers can scale independently from HTTP traffic. For high SSH workloads, increase replicas for the sshd service.

Edit .custom/docker-compose-apps.override.yaml:

services:
  sshd:
    deploy:
      replicas: 4

Note

For production deployments with heavy SSH usage (e.g., many developers using git/hg over SSH), 4-7 replicas are recommended depending on load.

Scaling SVN Traffic

SVN operations can be resource-intensive. For deployments with many SVN repositories, scale the SVN service independently.

Edit .custom/docker-compose-apps.override.yaml:

services:
  svn:
    deploy:
      replicas: 4

Note

SVN operations can be particularly demanding. For large SVN deployments, 4-5 replicas are recommended depending on workload.

Scaling Celery Workers

Celery workers handle background tasks and can scale independently from HTTP traffic.

Edit .custom/docker-compose-apps.override.yaml:

services:
  celery:
    deploy:
      replicas: 4

  celery-beat:
    deploy:
      # Must be 1 to avoid duplicate scheduled tasks
      replicas: 1

Note

  • Celery has an autoscaler defined by --autoscale=20,2

  • This means 2 processes minimum, scaling up to 20 per replica

  • Total capacity = replicas × maximum autoscale number

  • For production: 4 worker replicas provide good balance

  • Important: celery-beat must always be 1 replica to avoid duplicate scheduled tasks
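The capacity arithmetic above can be sketched in Python. This is only an illustration of the formula; the bounds mirror the `--autoscale=20,2` setting (maximum, minimum) mentioned in the note:

```python
# Sketch of the Celery capacity formula described above.
# autoscale_max / autoscale_min mirror --autoscale=20,2 (max, min).
def celery_capacity(replicas: int, autoscale_max: int = 20, autoscale_min: int = 2) -> dict:
    """Return the minimum and maximum concurrent Celery worker processes."""
    return {
        "min_processes": replicas * autoscale_min,
        "max_processes": replicas * autoscale_max,
    }

# Four worker replicas, as recommended for production:
print(celery_capacity(4))  # {'min_processes': 8, 'max_processes': 80}
```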

Apply replica changes without downtime:

./rcstack stack rhodecode up --detach --no-recreate --scale rhodecode=4 --scale vcsserver=4

Gunicorn Configuration

Edit config/_shared/gunicorn_conf_rc.py for the rhodecode service and config/_shared/gunicorn_conf_vcs.py for the vcsserver service:

# Enable port reuse for better load balancing
reuse_port = True

# Use threaded worker class
worker_class = "gthread"

# Workers handle concurrent requests
# VCS server typically needs around 50% more workers
workers = 4

# Threads per worker for concurrent request handling
threads = 5

# Maximum concurrent connections per worker
worker_connections = 30
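The "around 50% more workers" rule of thumb from the comment above can be expressed as a small sketch. The 1.5 factor is the ratio implied by the 4-worker rhodecode / 6-worker vcsserver recommendation, not a value enforced anywhere:

```python
import math

def vcs_workers(rhodecode_workers: int, factor: float = 1.5) -> int:
    # VCS server typically needs ~50% more workers than rhodecode.
    return math.ceil(rhodecode_workers * factor)

print(vcs_workers(4))  # 6 workers for vcsserver
```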

Note

Recommended Configuration:

  • For production: 4 replicas with 4 workers and 5 threads for rhodecode

  • VCS server typically needs more workers (6 workers recommended)

  • Replicas of vcsserver and rhodecode should typically be equal

  • Total capacity = replicas × workers × threads

  • Set worker_class = "gthread" and reuse_port = True for both rhodecode and vcsserver

  • Monitor CPU and memory usage to fine-tune these values
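As a quick check of the "Total capacity = replicas × workers × threads" formula, a minimal Python sketch using the recommended production values:

```python
# Total concurrent HTTP requests the stack can serve with gthread workers.
def http_capacity(replicas: int, workers: int, threads: int) -> int:
    return replicas * workers * threads

# Recommended production values for rhodecode: 4 replicas x 4 workers x 5 threads
print(http_capacity(4, 4, 5))  # 80 concurrent requests
```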

Additional Performance Parameters for Gunicorn

For production deployments, consider tuning these additional gunicorn parameters in both config/_shared/gunicorn_conf_rc.py and config/_shared/gunicorn_conf_vcs.py:

# Worker restart settings (prevents memory leaks)
max_requests = 2000
max_requests_jitter = int(max_requests * 0.2)  # Adds randomness to prevent simultaneous restarts

# Connection backlog (pending connections queue)
backlog = 64

# Worker timeout (default: 30s, RhodeCode default: 6hrs for long operations)
timeout = 21600

# Graceful restart timeout
graceful_timeout = 21600

# Keep-alive for client connections
keepalive = 2

Note

Key Parameters:

  • max_requests - Worker restarts after handling N requests (0 = disabled; gunicorn defaults to 0)

  • max_requests_jitter - Randomizes restart timing using randint(0, jitter_value)

  • backlog - Max pending connections (default: 2048). Generally 64-2048 range

  • timeout - Worker silent timeout in seconds (default: 30s, 0 = disabled)

  • graceful_timeout - Time for workers to finish requests during restart (default: 30s)

  • keepalive - Keep-alive timeout for client connections (default: 2s, range: 1-5s)
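The effect of max_requests_jitter can be illustrated with a short sketch. It follows the randint(0, jitter_value) formula noted above, with each worker drawing its own restart threshold once at startup so workers do not all recycle at the same request count; the numbers are the example values from the config snippet:

```python
import random

max_requests = 2000
max_requests_jitter = int(max_requests * 0.2)  # 400

def restart_threshold(seed: int) -> int:
    # Each worker adds a random jitter to max_requests at boot,
    # spreading restarts out instead of synchronizing them.
    rng = random.Random(seed)
    return max_requests + rng.randint(0, max_requests_jitter)

thresholds = [restart_threshold(s) for s in range(4)]
# Every threshold falls in [2000, 2400]
assert all(2000 <= t <= 2400 for t in thresholds)
print(thresholds)
```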