BioCortex Web Scaling And Workers

Source aligned with docs/web_scaling_and_workers.md in the main repo.

What was changed

This update hardens the web/API layer for multi-user deployment on a login node:
  1. biocortex.web.app now exposes an app-factory path.
  2. python biocortex_web_app.py ... --frontend next --workers N can start multiple Uvicorn workers.
  3. Session auth now uses a stable secret shared across workers.
  4. Analysis admission is now gated by a shared SQLite-backed queue, so expensive login-node work does not fan out without control.
  5. Web logs now include a run-scoped identifier around queue admission/release events.

Why this matters

Before this change, the Python API server behind the Next.js frontend ran as a single in-process app. That meant:
  • one worker handled all WebSocket and API traffic
  • session cookies were backed by a random secret generated on each start
  • no shared concurrency control existed for agent setup/planning on the login node
With many simultaneous users, the login node could become overloaded even if heavy compute was offloaded to Slurm.

New environment variables

Recommended .env values:
BIOCORTEX_WEB_WORKERS=4
BIOCORTEX_WEB_CONCURRENCY_LIMIT=6
BIOCORTEX_WEB_QUEUE_POLL_SECONDS=2
Optional session settings:
# Optional explicit secret
# BIOCORTEX_SESSION_SECRET=replace_with_a_long_random_string

# Optional alternate path for the generated secret file
# BIOCORTEX_SESSION_SECRET_FILE=/shared/path/.biocortex_session_secret
Notes:
  • BIOCORTEX_WEB_WORKERS controls API worker processes.
  • BIOCORTEX_WEB_CONCURRENCY_LIMIT controls how many expensive login-node analysis setups can run at once across workers.
  • BIOCORTEX_WEB_QUEUE_POLL_SECONDS controls how often queued requests re-check for a free slot.
  • If BIOCORTEX_SESSION_SECRET is not set, BioCortex generates a .biocortex_session_secret file automatically on first start and reuses it on every subsequent start, so sessions survive restarts and stay valid across workers.
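The secret-resolution order described above (explicit env var first, then a persisted file) can be sketched roughly as follows. This is an illustrative sketch, not the actual BioCortex implementation; the function name `load_session_secret` is hypothetical, while the env var and file names come from this doc:

```python
import os
import secrets
from pathlib import Path

def load_session_secret() -> str:
    # 1. An explicit secret always wins.
    explicit = os.environ.get("BIOCORTEX_SESSION_SECRET")
    if explicit:
        return explicit
    # 2. Otherwise fall back to a generated secret persisted on disk, so
    #    every worker process (and every restart) sees the same value.
    path = Path(os.environ.get(
        "BIOCORTEX_SESSION_SECRET_FILE", ".biocortex_session_secret"))
    if path.exists():
        return path.read_text().strip()
    secret = secrets.token_hex(32)
    path.write_text(secret)
    path.chmod(0o600)  # keep the secret readable only by the owner
    return secret
```

The key property is that all Uvicorn workers resolve the same secret, so a session cookie signed by one worker validates on any other.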

API / WebSocket layer

python biocortex_web_app.py \
  --auth-users "admin:pass1" \
  --allow-register \
  --admin-user admin \
  --frontend next \
  --port 7860 \
  --workers 4
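The --workers flag is why the app-factory path matters: Uvicorn can only fork multiple worker processes when given an import string (or factory reference), not an already-constructed app object. A rough sketch of the wiring, where `biocortex.web.app:create_app` and the helper names are assumptions for illustration:

```python
import os

def worker_count() -> int:
    # Clamp to at least one worker; the default mirrors the recommended
    # .env value BIOCORTEX_WEB_WORKERS=4.
    return max(1, int(os.environ.get("BIOCORTEX_WEB_WORKERS", "4")))

def serve() -> None:
    import uvicorn  # imported lazily so the sketch stays importable without uvicorn
    uvicorn.run(
        "biocortex.web.app:create_app",  # assumed factory import string
        factory=True,                    # multi-worker mode needs a string, not an app object
        host="0.0.0.0",
        port=7860,
        workers=worker_count(),
    )
```

Passing an app object with workers > 1 makes Uvicorn fall back to a single process, which is exactly the bottleneck this change removes.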

Frontend layer

Prefer production mode over npm run dev:
cd frontend
npm run build
npm run start -- -p 3001
Then access:
  • frontend: http://YOUR_IP:3001/chat
  • API: http://YOUR_IP:7860/docs

Queue behavior

When the login node is busy:
  • new analysis requests enter a shared queue
  • the UI receives a queue/status message instead of appearing frozen
  • users can still abort while waiting
  • once admitted, the request continues through agent setup, planning, and execution
This queue is shared across Uvicorn workers through a runtime SQLite database under the results runtime directory.
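Cross-worker admission via a shared SQLite file can be sketched as below. The table name, schema, and `BIOCORTEX_QUEUE_DB` path variable are illustrative, not the actual BioCortex runtime schema; the concurrency limit and poll interval reuse the env vars documented above:

```python
import os
import sqlite3
import time

DB_PATH = os.environ.get("BIOCORTEX_QUEUE_DB", "runtime_queue.sqlite3")  # assumed
LIMIT = int(os.environ.get("BIOCORTEX_WEB_CONCURRENCY_LIMIT", "6"))
POLL = float(os.environ.get("BIOCORTEX_WEB_QUEUE_POLL_SECONDS", "2"))

def _connect() -> sqlite3.Connection:
    conn = sqlite3.connect(DB_PATH, timeout=10)
    conn.execute("CREATE TABLE IF NOT EXISTS active_runs (run_id TEXT PRIMARY KEY)")
    return conn

def try_admit(run_id: str) -> bool:
    """Atomically claim a slot if fewer than LIMIT runs are active."""
    with _connect() as conn:
        conn.execute("BEGIN IMMEDIATE")  # write lock serializes admission across processes
        (count,) = conn.execute("SELECT COUNT(*) FROM active_runs").fetchone()
        if count >= LIMIT:
            return False
        conn.execute("INSERT INTO active_runs (run_id) VALUES (?)", (run_id,))
        return True

def release(run_id: str) -> None:
    with _connect() as conn:
        conn.execute("DELETE FROM active_runs WHERE run_id = ?", (run_id,))

def wait_for_slot(run_id: str) -> None:
    # Queued requests re-check every POLL seconds until a slot frees up,
    # matching the BIOCORTEX_WEB_QUEUE_POLL_SECONDS behavior above.
    while not try_admit(run_id):
        time.sleep(POLL)
```

BEGIN IMMEDIATE makes the count-then-insert atomic across worker processes, which is what keeps N Uvicorn workers from collectively exceeding the shared limit.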

Important architecture note

This is a strong foundation for dozens to low hundreds of active users on a scheduler/login node architecture, but it is not by itself enough for a true “7000 simultaneously active analyses” target. For that later stage, you should additionally plan for:
  • an external reverse proxy / load balancer
  • multiple API instances on separate login/service nodes
  • shared session/state infrastructure where needed
  • a dedicated job/status store (PostgreSQL or Redis-backed queue)
  • object storage or shared filesystem for results
  • observability for queue depth, worker latency, and Slurm submission rate
The code added here is the correct first step because it removes the single-process bottleneck and adds controlled admission on the login node.