# Configuration
BioCortex uses a hierarchical configuration system: built-in defaults, environment variables, per-user `.env` files, and optional runtime/API overrides.
## Where Config Lives
- Environment variables — e.g. `OPENAI_API_KEY`, `REASONING_MODEL`, `BIOCORTEX_UPLOAD_DIR`.
- `.env` file — project root, or `~/.biocortex/.env` for per-user overrides.
- `build_config_for_user(username)` — builds the effective config for the Web UI; can pull from user-specific settings (stored in the DB or a file) for models and strategy.
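The layering above can be sketched as a simple merge. This is a hypothetical illustration of the precedence (defaults, then `.env` files, then process environment variables winning last); the function names, default values, and merge order here are assumptions, not the actual BioCortex implementation:

```python
import os

# Illustrative defaults only -- not the real BioCortex defaults.
DEFAULTS = {"REASONING_MODEL": "gpt-4o", "BIOCORTEX_UPLOAD_DIR": "./uploads"}


def load_dotenv_file(path):
    """Parse simple KEY=VALUE lines; return {} if the file is missing."""
    values = {}
    try:
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, val = line.partition("=")
                    values[key.strip()] = val.strip()
    except FileNotFoundError:
        pass
    return values


def build_effective_config(env_path=".env"):
    """Merge layers; later layers override earlier ones."""
    config = dict(DEFAULTS)
    config.update(load_dotenv_file(env_path))  # project-root .env
    config.update(load_dotenv_file(os.path.expanduser("~/.biocortex/.env")))  # per-user
    for key in config:  # process environment wins
        if key in os.environ:
            config[key] = os.environ[key]
    return config
```

The real merge order may differ; treat this as a mental model of "more specific layers override more general ones."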
## Key Options (Overview)
| Category | Examples |
|---|---|
| LLM | REASONING_MODEL, CODER_MODEL, FAST_MODEL, OPENAI_API_KEY, ANTHROPIC_API_KEY, AZURE_OPENAI_*, custom base URL and API key. |
| Context / memory | Model context window table in code (MODEL_CONTEXT_WINDOWS); working_memory_max_tokens (-1 = auto-calibrate from reasoning model). |
| Paths | BIOCORTEX_UPLOAD_DIR, BIOCORTEX_RESULTS_DIR, AI_RESEARCHER_PATH, AutoFigure paths. |
| Strategy | Strategy router confidence threshold; overrides for testing (e.g. force DAG or ReAct). |
| Retrieval | Vector search top-k, KG expansion depth, LLM rerank top-k; disable vector for LLM-only. |
| Memory | Episodic: top_k, min_relevance, token budget, multi-signal weights. |
| Self-correction | Max retries, max deep retries, report quality threshold. |
| Template evolution | BIOCORTEX_TEMPLATE_EVOLUTION, BIOCORTEX_TEMPLATE_THRESHOLD (default 0.9), BIOCORTEX_TEMPLATE_STORAGE_DIR. See Template Evolution. |
| Pipeline | BIOCORTEX_ENABLE_FINAL_VALIDATION — Phase 3 (final validation) is off by default to save time and tokens; set to true to run final validation after execution and allow plan revision. BIOCORTEX_REQUIRE_PLAN_CONFIRMATION — wait for user to confirm plan before execution. BIOCORTEX_CONTINUE_AFTER_STEP_FAILURE — when true (DAG only), do not skip downstream steps when one step fails; run them with partial state (best-effort). |
| Execution | Sandbox mode (subprocess vs Docker), timeouts, resource limits. |
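Several of the options above are plain environment variables and can be grouped in a project-root `.env` file. A minimal illustrative example (variable names are taken from the table above; the values and model IDs are placeholders, not documented defaults):

```
# LLM models and credentials
OPENAI_API_KEY=sk-...
REASONING_MODEL=gpt-4o
CODER_MODEL=gpt-4o
FAST_MODEL=gpt-4o-mini

# Paths
BIOCORTEX_UPLOAD_DIR=/data/biocortex/uploads
BIOCORTEX_RESULTS_DIR=/data/biocortex/results

# Pipeline toggles
BIOCORTEX_ENABLE_FINAL_VALIDATION=true
BIOCORTEX_REQUIRE_PLAN_CONFIRMATION=false
BIOCORTEX_CONTINUE_AFTER_STEP_FAILURE=false

# Template evolution
BIOCORTEX_TEMPLATE_EVOLUTION=true
BIOCORTEX_TEMPLATE_THRESHOLD=0.9
```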
## Model Context Window Table
Defined in `biocortex.config`: `MODEL_CONTEXT_WINDOWS` maps model IDs to `(max_input_tokens, max_output_tokens)`. It is used for:
- `get_model_context_window(model_id)` — resolution order: exact → prefix → substring; defaults to 128K input / 4K output.
- Token budgets — `BaseAgent` and the Synthesizer use this table for input/output reserves and truncation.
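The resolution order can be sketched as follows. This is a hedged reconstruction of the lookup described above, not the actual source: the table entries and window values are illustrative, and only the exact → prefix → substring → default order is taken from the text:

```python
# Illustrative entries; the real table lives in biocortex.config.
MODEL_CONTEXT_WINDOWS = {
    "gpt-4o": (128_000, 16_384),
    "claude-3-5-sonnet": (200_000, 8_192),
}

# Documented fallback: 128K input / 4K output.
DEFAULT_WINDOW = (128_000, 4_096)


def get_model_context_window(model_id):
    """Resolve a model ID: exact -> prefix -> substring -> default."""
    if model_id in MODEL_CONTEXT_WINDOWS:  # exact match
        return MODEL_CONTEXT_WINDOWS[model_id]
    for known, window in MODEL_CONTEXT_WINDOWS.items():  # prefix match
        if model_id.startswith(known):
            return window
    for known, window in MODEL_CONTEXT_WINDOWS.items():  # substring match
        if known in model_id:
            return window
    return DEFAULT_WINDOW
```

Prefix matching lets dated variants such as `gpt-4o-2024-08-06` resolve to their base entry; substring matching catches provider-prefixed IDs like `azure/claude-3-5-sonnet-v2`.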
## Working Memory Auto-Calibration
- `memory.working_memory_max_tokens = -1` (sentinel) — at config build time the framework sets it to 60% of the reasoning model’s max input, clamped between 16K and 600K.
- Set it to a positive integer to fix the working-memory size regardless of model.
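The sentinel behaviour amounts to a clamp. A minimal sketch, assuming the function name is hypothetical and the 16K/600K bounds mean 16,000/600,000 tokens (the text does not specify whether these are powers of two):

```python
def calibrate_working_memory(max_input_tokens, configured=-1):
    """-1 sentinel: 60% of the reasoning model's max input, clamped to [16K, 600K]."""
    if configured != -1:
        return configured  # explicit positive value is used as-is
    return max(16_000, min(600_000, int(max_input_tokens * 0.6)))
```

For a 128K-input reasoning model this yields 76,800 tokens of working memory; very small or very large models hit the lower and upper clamps respectively.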
## Per-User Settings (Web UI)
Stored and retrieved via `GET`/`PUT /api/v1/me/settings`. Typical fields:
- `reasoning_model`, `coder_model`, `fast_model`
- `custom_base_url`, `custom_api_key` (masked in `GET` responses)
- `strategy` (e.g. `auto`, `simple`, `pipeline`, `mcts`)
- usage/limits (`api_calls`, `tokens`, `storage`)
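An illustrative settings payload might look like the following. The field names come from the list above; the nesting, the `limits` key, and all values are assumptions for illustration only:

```json
{
  "reasoning_model": "gpt-4o",
  "coder_model": "gpt-4o",
  "fast_model": "gpt-4o-mini",
  "custom_base_url": "https://llm.example.com/v1",
  "custom_api_key": "sk-****",
  "strategy": "auto",
  "limits": {"api_calls": 1000, "tokens": 2000000, "storage": 1073741824}
}
```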
## Optional Integrations
- AutoFigure — `AUTOFIGURE_API_KEY`, or fall back to the BioCortex `.env` (e.g. `BIOCORTEX_CUSTOM_API_KEY`).
- AI-Researcher — `AI_RESEARCHER_PATH` (repo root); the pipeline’s own `.env` for `OPENROUTER_API_KEY`, `COMPLETION_MODEL`, `CATEGORY`, `INSTANCE_ID`.
## Full Reference
For an exhaustive list of parameters and defaults, see the source: `biocortex.config` (and related dataclasses). The Nature Methods-style paper in `docs/BioCortex_Nature_Methods_Paper.md` also documents context-window, memory, and retrieval parameters in its Methods section.