# BioCortex Web Scaling And Workers
Source aligned with docs/web_scaling_and_workers.md in the main repo.
## What was changed
This update hardens the web/API layer for multi-user deployment on a login node:

- `biocortex.web.app` now exposes an app-factory path.
- `python biocortex_web_app.py ... --frontend next --workers N` can start multiple Uvicorn workers.
- Session auth now uses a stable secret shared across workers.
- Analysis admission is now gated by a shared SQLite-backed queue, so expensive login-node work does not fan out without control.
- Web logs now include a run-scoped identifier around queue admission/release events.
## Why this matters
Before this change, the Next.js API server ran as a single in-process app. That meant:

- one worker handled all WebSocket and API traffic
- session cookies were backed by a random secret generated on each start
- no shared concurrency control existed for agent setup/planning on the login node
## New environment variables
Recommended `.env` values:
- `BIOCORTEX_WEB_WORKERS` controls API worker processes.
- `BIOCORTEX_WEB_CONCURRENCY_LIMIT` controls how many expensive login-node analysis setups can run at once across workers.
- `BIOCORTEX_WEB_QUEUE_POLL_SECONDS` controls how often queued requests re-check for a free slot.
- If `BIOCORTEX_SESSION_SECRET` is not set, BioCortex creates `.biocortex_session_secret` automatically and reuses it.
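Taken together, a minimal `.env` might look like this; the variable names are those documented above, but the values are illustrative and should be tuned for your node:

```ini
# Illustrative values only - tune for your login node
BIOCORTEX_WEB_WORKERS=4
BIOCORTEX_WEB_CONCURRENCY_LIMIT=2
BIOCORTEX_WEB_QUEUE_POLL_SECONDS=2
# Optional: pin the session secret explicitly instead of relying on the
# auto-generated .biocortex_session_secret file
BIOCORTEX_SESSION_SECRET=change-me
```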
## Recommended production startup
### API / WebSocket layer
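A minimal multi-worker invocation, using the flags named above; the arguments elided as `...` earlier in this doc depend on your deployment and are omitted here as well:

```bash
# Start the API/WebSocket layer with 4 Uvicorn worker processes.
# Additional arguments depend on your deployment.
python biocortex_web_app.py --frontend next --workers 4
```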
### Frontend layer
Prefer production mode over `npm run dev`.
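For example, a standard Next.js production startup; the `frontend/` directory name, the port, and the `start` script wiring are assumptions about this repo's layout, so adjust them to your checkout:

```bash
cd frontend
npm run build              # compile an optimized production bundle
npm run start -- -p 3001   # serve the built app instead of the dev server
```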
Endpoints:

- frontend: `http://YOUR_IP:3001/chat`
- API: `http://YOUR_IP:7860/docs`
## Queue behavior
When the login node is busy:

- new analysis requests enter a shared queue
- the UI receives a queue/status message instead of appearing frozen
- users can still abort while waiting
- once admitted, the request continues through agent setup, planning, and execution
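The admission flow above can be sketched as a SQLite-backed gate shared across worker processes. This is a hypothetical illustration, not BioCortex's actual implementation: the `QUEUE_DB` path, `active_runs` table, and helper names are invented, and only the two `BIOCORTEX_WEB_*` environment variables come from this doc. It uses `BEGIN IMMEDIATE` so concurrent workers cannot oversubscribe the limit:

```python
import os
import sqlite3
import time

DB_PATH = os.environ.get("QUEUE_DB", "admission_queue.db")  # assumed name
LIMIT = int(os.environ.get("BIOCORTEX_WEB_CONCURRENCY_LIMIT", "2"))
POLL = float(os.environ.get("BIOCORTEX_WEB_QUEUE_POLL_SECONDS", "1"))


def _connect():
    # Autocommit mode so we can manage the transaction explicitly.
    conn = sqlite3.connect(DB_PATH, timeout=30, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS active_runs "
        "(run_id TEXT PRIMARY KEY, started REAL)"
    )
    return conn


def try_admit(run_id: str) -> bool:
    """Atomically claim a slot; True if admitted, False if the node is full."""
    conn = _connect()
    try:
        # BEGIN IMMEDIATE takes the write lock up front, so two workers
        # cannot both pass the count check for the last free slot.
        conn.execute("BEGIN IMMEDIATE")
        (active,) = conn.execute("SELECT COUNT(*) FROM active_runs").fetchone()
        if active >= LIMIT:
            conn.execute("ROLLBACK")
            return False
        conn.execute("INSERT INTO active_runs VALUES (?, ?)",
                     (run_id, time.time()))
        conn.execute("COMMIT")
        return True
    finally:
        conn.close()


def release(run_id: str) -> None:
    """Free the slot once the analysis finishes or is aborted."""
    conn = _connect()
    try:
        conn.execute("DELETE FROM active_runs WHERE run_id = ?", (run_id,))
    finally:
        conn.close()


def admit_blocking(run_id: str) -> None:
    """Wait in the queue, re-checking every POLL seconds until admitted."""
    while not try_admit(run_id):
        time.sleep(POLL)
```

Because the gate lives in a SQLite file rather than process memory, every Uvicorn worker sees the same count, which is what keeps expensive setups from fanning out uncontrolled.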
## Important architecture note
This is a strong foundation for dozens to low hundreds of active users on a scheduler/login-node architecture, but it is not by itself enough for a true “7000 simultaneously active analyses” target. For that later stage, you should additionally plan for:

- an external reverse proxy / load balancer
- multiple API instances on separate login/service nodes
- shared session/state infrastructure where needed
- a dedicated job/status store (PostgreSQL or Redis-backed queue)
- object storage or shared filesystem for results
- observability for queue depth, worker latency, and Slurm submission rate
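For the first two bullets, a reverse proxy fanning requests across several API instances might look like the following nginx fragment. It is a hypothetical sketch: the hostnames are placeholders and only the port comes from this doc. The `Upgrade`/`Connection` headers matter here because the API layer also carries WebSocket traffic:

```nginx
upstream biocortex_api {
    least_conn;                      # prefer the least-busy instance
    server login1.cluster:7860;     # placeholder hostnames
    server login2.cluster:7860;
}

server {
    listen 80;
    location / {
        proxy_pass http://biocortex_api;
        # Required for WebSocket upgrade through the proxy
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```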