Configuration

BioCortex uses a hierarchical configuration system: defaults, environment variables, per-user .env files, and optional runtime/API overrides.

Where Config Lives

  • Environment variables — e.g. OPENAI_API_KEY, REASONING_MODEL, BIOCORTEX_UPLOAD_DIR.
  • .env file — in the project root, or ~/.biocortex/.env for per-user overrides.
  • build_config_for_user(username) — Builds the effective config for the Web UI; can pull from user-specific settings (stored in DB or file) for models and strategy.
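The layering above (defaults, then environment, then .env, then per-user settings) amounts to a merge in which later sources win. The helper below is an illustrative sketch of that precedence, not BioCortex's actual `build_config_for_user` implementation:

```python
def build_effective_config(*layers: dict) -> dict:
    """Merge configuration layers; later layers take precedence.

    The layer order is an assumption about BioCortex's precedence:
    defaults < environment variables < .env file < per-user settings.
    None values are treated as "not set" and do not override.
    """
    effective: dict = {}
    for layer in layers:
        effective.update({k: v for k, v in layer.items() if v is not None})
    return effective

# Example: a per-user model choice overrides the default.
config = build_effective_config(
    {"REASONING_MODEL": "gpt-4o", "FAST_MODEL": "gpt-4o-mini"},  # defaults
    {"REASONING_MODEL": "claude-sonnet"},                        # per-user
)
# config["REASONING_MODEL"] == "claude-sonnet"
```

The real framework may merge structured dataclasses rather than flat dicts, but the precedence idea is the same.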

Key Options (Overview)

  • LLM — REASONING_MODEL, CODER_MODEL, FAST_MODEL, OPENAI_API_KEY, ANTHROPIC_API_KEY, AZURE_OPENAI_*, custom base URL and API key.
  • Context / memory — model context window table in code (MODEL_CONTEXT_WINDOWS); working_memory_max_tokens (-1 = auto-calibrate from the reasoning model).
  • Paths — BIOCORTEX_UPLOAD_DIR, BIOCORTEX_RESULTS_DIR, AI_RESEARCHER_PATH, AutoFigure paths.
  • Strategy — strategy-router confidence threshold; overrides for testing (e.g. force DAG or ReAct).
  • Retrieval — vector-search top-k, KG expansion depth, LLM rerank top-k; disable vector search for LLM-only retrieval.
  • Memory — episodic memory: top_k, min_relevance, token budget, multi-signal weights.
  • Self-correction — max retries, max deep retries, report quality threshold.
  • Template evolution — BIOCORTEX_TEMPLATE_EVOLUTION, BIOCORTEX_TEMPLATE_THRESHOLD (default 0.9), BIOCORTEX_TEMPLATE_STORAGE_DIR. See Template Evolution.
  • Pipeline — BIOCORTEX_ENABLE_FINAL_VALIDATION: Phase 3 (final validation) is off by default to save time and tokens; set to true to run final validation after execution and allow plan revision. BIOCORTEX_REQUIRE_PLAN_CONFIRMATION: wait for the user to confirm the plan before execution. BIOCORTEX_CONTINUE_AFTER_STEP_FAILURE: when true (DAG only), do not skip downstream steps when one step fails; run them with partial state (best-effort).
  • Execution — sandbox mode (subprocess vs Docker), timeouts, resource limits.
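A per-user ~/.biocortex/.env that collects several of the options above might look like the following. Variable names come from this page; the values are illustrative placeholders:

```shell
# LLM selection and credentials
REASONING_MODEL=gpt-4o
CODER_MODEL=gpt-4o
FAST_MODEL=gpt-4o-mini
OPENAI_API_KEY=sk-...

# Paths
BIOCORTEX_UPLOAD_DIR=/data/biocortex/uploads
BIOCORTEX_RESULTS_DIR=/data/biocortex/results

# Pipeline behavior
BIOCORTEX_ENABLE_FINAL_VALIDATION=true
BIOCORTEX_REQUIRE_PLAN_CONFIRMATION=false

# Template evolution
BIOCORTEX_TEMPLATE_EVOLUTION=true
BIOCORTEX_TEMPLATE_THRESHOLD=0.9
```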

Model Context Window Table

Defined in biocortex.config: MODEL_CONTEXT_WINDOWS maps model ids to (max_input_tokens, max_output_tokens). Used for:
  • get_model_context_window(model_id) — Resolution: exact → prefix → substring; default 128K / 4K.
  • Token budgets — BaseAgent and Synthesizer use this for input/output reserves and truncation.
To add a new model, add an entry (or a prefix entry) to the table so the correct window is used.
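The exact → prefix → substring resolution can be sketched as below. The table entries and window sizes here are made-up stand-ins, not BioCortex's actual table:

```python
# (max_input_tokens, max_output_tokens) per model id -- illustrative entries.
MODEL_CONTEXT_WINDOWS = {
    "gpt-4o": (128_000, 16_384),
    "claude-3-5-": (200_000, 8_192),  # prefix entry covering a model family
}

DEFAULT_WINDOW = (128_000, 4_096)  # 128K-in / 4K-out fallback

def get_model_context_window(model_id: str) -> tuple[int, int]:
    """Resolve a model's window: exact match, then prefix, then substring."""
    if model_id in MODEL_CONTEXT_WINDOWS:
        return MODEL_CONTEXT_WINDOWS[model_id]
    for key, window in MODEL_CONTEXT_WINDOWS.items():
        if model_id.startswith(key):
            return window
    for key, window in MODEL_CONTEXT_WINDOWS.items():
        if key in model_id:
            return window
    return DEFAULT_WINDOW
```

Under these sample entries, "claude-3-5-sonnet" resolves via the prefix entry, while an unknown id falls back to the 128K / 4K default.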

Working Memory Auto-Calibration

  • memory.working_memory_max_tokens = -1 (sentinel) → at config build time the framework sets it to 60% of the reasoning model’s max input, clamped between 16K and 600K.
  • Set to a positive integer to fix working memory size regardless of model.
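Under those rules, calibration reduces to a clamp. This sketch assumes the 60% factor and the 16K / 600K bounds stated above (with 16K taken as 16,384 tokens; the framework's exact constants may differ):

```python
def calibrate_working_memory(max_input_tokens: int, configured: int = -1) -> int:
    """Return the working-memory budget in tokens.

    A positive `configured` value is used as-is; the -1 sentinel triggers
    auto-calibration to 60% of the reasoning model's max input, clamped
    to [16_384, 600_000].
    """
    if configured > 0:
        return configured
    return max(16_384, min(600_000, int(max_input_tokens * 0.6)))

# A 200K-input model gets 120K of working memory; very small or very large
# models hit the lower or upper clamp respectively.
```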

Per-User Settings (Web UI)

Stored and retrieved via GET/PUT /api/v1/me/settings. Typically includes:
  • reasoning_model, coder_model, fast_model
  • custom_base_url, custom_api_key (masked in GET)
  • strategy (e.g. auto, simple, pipeline, mcts)
  • usage/limits (api_calls, tokens, storage)
These override defaults when building config for that user’s agent.
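A PUT /api/v1/me/settings body covering the fields above might look like this (field names follow the list above; values are illustrative, and GET responses return custom_api_key masked):

```json
{
  "reasoning_model": "gpt-4o",
  "coder_model": "gpt-4o",
  "fast_model": "gpt-4o-mini",
  "custom_base_url": "https://llm.example.com/v1",
  "custom_api_key": "sk-...",
  "strategy": "auto"
}
```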

Optional Integrations

  • AutoFigure: AUTOFIGURE_API_KEY, with fallback to the BioCortex .env (e.g. BIOCORTEX_CUSTOM_API_KEY).
  • AI-Researcher: AI_RESEARCHER_PATH (repo root); the pipeline’s own .env for OPENROUTER_API_KEY, COMPLETION_MODEL, CATEGORY, INSTANCE_ID.
See AutoFigure & AI-Researcher and Adding Tools and Agents.

Full Reference

For an exhaustive list of parameters and defaults, see the source: biocortex.config (and related dataclasses). The Nature Methods-style paper in docs/BioCortex_Nature_Methods_Paper.md also documents context-window, memory, and retrieval parameters in the Methods section.