BioCortex Web Scaling And Workers

Source aligned with docs/web_scaling_and_workers.md in the main repo.

What was changed

This update hardens the web/API layer for multi-user deployment on a login node:
  1. biocortex.web.app now exposes an app-factory path.
  2. python biocortex_web_app.py ... --frontend next --workers N can start multiple Uvicorn workers.
  3. Session auth now uses a stable secret shared across workers.
  4. Analysis admission is now gated by a shared SQLite-backed queue, so expensive login-node work does not fan out without control.
  5. Web logs now include a run-scoped identifier around queue admission/release events.

Why this matters

Before this change, the Python API server behind the Next.js frontend ran as a single in-process app. That meant:
  • one worker handled all WebSocket and API traffic
  • session cookies were backed by a random secret generated on each start
  • no shared concurrency control existed for agent setup/planning on the login node
With many simultaneous users, the login node could become overloaded even if heavy compute was offloaded to Slurm.

New environment variables

Recommended .env values:
BIOCORTEX_WEB_WORKERS=4
BIOCORTEX_WEB_CONCURRENCY_LIMIT=6
BIOCORTEX_WEB_QUEUE_POLL_SECONDS=2
Optional session settings:
# Optional explicit secret
# BIOCORTEX_SESSION_SECRET=replace_with_a_long_random_string

# Optional alternate path for the generated secret file
# BIOCORTEX_SESSION_SECRET_FILE=/shared/path/.biocortex_session_secret
Notes:
  • BIOCORTEX_WEB_WORKERS controls API worker processes.
  • BIOCORTEX_WEB_CONCURRENCY_LIMIT controls how many expensive login-node analysis setups can run at once across workers.
  • BIOCORTEX_WEB_QUEUE_POLL_SECONDS controls how often queued requests re-check for a free slot.
  • If BIOCORTEX_SESSION_SECRET is not set, BioCortex generates a .biocortex_session_secret file automatically on first start and reuses it on every subsequent start, so sessions survive restarts and stay valid across workers.
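The secret-resolution order described above (explicit env var first, then a persisted file) can be sketched roughly as follows. This is an illustrative sketch, not the actual BioCortex implementation; the function name `load_session_secret` is hypothetical, while the env var and file names come from this doc:

```python
import os
import secrets
from pathlib import Path

def load_session_secret() -> str:
    # 1. An explicit secret always wins.
    explicit = os.environ.get("BIOCORTEX_SESSION_SECRET")
    if explicit:
        return explicit
    # 2. Otherwise fall back to a generated secret persisted on disk, so
    #    every worker process (and every restart) sees the same value.
    path = Path(os.environ.get(
        "BIOCORTEX_SESSION_SECRET_FILE", ".biocortex_session_secret"))
    if path.exists():
        return path.read_text().strip()
    secret = secrets.token_hex(32)
    path.write_text(secret)
    path.chmod(0o600)  # keep the secret readable only by the owner
    return secret
```

The key property is that all Uvicorn workers resolve the same secret, so a session cookie signed by one worker validates on any other.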

API / WebSocket layer

python biocortex_web_app.py \
  --auth-users "admin:pass1" \
  --allow-register \
  --admin-user admin \
  --frontend next \
  --port 7860 \
  --workers 4
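The --workers flag is why the app-factory path matters: Uvicorn can only fork multiple worker processes when given an import string (or factory reference), not an already-constructed app object. A rough sketch of the wiring, where `biocortex.web.app:create_app` and the helper names are assumptions for illustration:

```python
import os

def worker_count() -> int:
    # Clamp to at least one worker; the default mirrors the recommended
    # .env value BIOCORTEX_WEB_WORKERS=4.
    return max(1, int(os.environ.get("BIOCORTEX_WEB_WORKERS", "4")))

def serve() -> None:
    import uvicorn  # imported lazily so the sketch stays importable without uvicorn
    uvicorn.run(
        "biocortex.web.app:create_app",  # assumed factory import string
        factory=True,                    # multi-worker mode needs a string, not an app object
        host="0.0.0.0",
        port=7860,
        workers=worker_count(),
    )
```

Passing an app object with workers > 1 makes Uvicorn fall back to a single process, which is exactly the bottleneck this change removes.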

Frontend layer

Prefer production mode over npm run dev:
cd frontend
npm run build
npm run start -- -p 3001
Then access:
  • frontend: http://YOUR_IP:3001/chat
  • API: http://YOUR_IP:7860/docs

Queue behavior

When the login node is busy:
  • new analysis requests enter a shared queue
  • the UI receives a queue/status message instead of appearing frozen
  • users can still abort while waiting
  • once admitted, the request continues through agent setup, planning, and execution
This queue is shared across Uvicorn workers through a runtime SQLite database under the results runtime directory.
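Cross-worker admission via a shared SQLite file can be sketched as below. The table name, schema, and `BIOCORTEX_QUEUE_DB` path variable are illustrative, not the actual BioCortex runtime schema; the concurrency limit and poll interval reuse the env vars documented above:

```python
import os
import sqlite3
import time

DB_PATH = os.environ.get("BIOCORTEX_QUEUE_DB", "runtime_queue.sqlite3")  # assumed
LIMIT = int(os.environ.get("BIOCORTEX_WEB_CONCURRENCY_LIMIT", "6"))
POLL = float(os.environ.get("BIOCORTEX_WEB_QUEUE_POLL_SECONDS", "2"))

def _connect() -> sqlite3.Connection:
    conn = sqlite3.connect(DB_PATH, timeout=10)
    conn.execute("CREATE TABLE IF NOT EXISTS active_runs (run_id TEXT PRIMARY KEY)")
    return conn

def try_admit(run_id: str) -> bool:
    """Atomically claim a slot if fewer than LIMIT runs are active."""
    with _connect() as conn:
        conn.execute("BEGIN IMMEDIATE")  # write lock serializes admission across processes
        (count,) = conn.execute("SELECT COUNT(*) FROM active_runs").fetchone()
        if count >= LIMIT:
            return False
        conn.execute("INSERT INTO active_runs (run_id) VALUES (?)", (run_id,))
        return True

def release(run_id: str) -> None:
    with _connect() as conn:
        conn.execute("DELETE FROM active_runs WHERE run_id = ?", (run_id,))

def wait_for_slot(run_id: str) -> None:
    # Queued requests re-check every POLL seconds until a slot frees up,
    # matching the BIOCORTEX_WEB_QUEUE_POLL_SECONDS behavior above.
    while not try_admit(run_id):
        time.sleep(POLL)
```

BEGIN IMMEDIATE makes the count-then-insert atomic across worker processes, which is what keeps N Uvicorn workers from collectively exceeding the shared limit.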

Important architecture note

This is a strong foundation for dozens to low hundreds of active users on a scheduler/login node architecture, but it is not by itself enough for a true “7000 simultaneously active analyses” target. For that later stage, you should additionally plan for:
  • an external reverse proxy / load balancer
  • multiple API instances on separate login/service nodes
  • shared session/state infrastructure where needed
  • a dedicated job/status store (PostgreSQL or Redis-backed queue)
  • object storage or shared filesystem for results
  • observability for queue depth, worker latency, and Slurm submission rate
The code added here is the correct first step because it removes the single-process bottleneck and adds controlled admission on the login node.