Introduction

BioCortex (Biological Cortical Task Executor) is a multi-agent AI framework for automated biological research. It plans, executes, and synthesizes complex analytical workflows across all biological domains — from single-cell transcriptomics and protein structure to ecology, pharmacology, and genome engineering.

Why BioCortex?

Biological research today involves many data types and tools. Existing LLM-based agents often use a single, sequential reasoning loop and scale poorly as tool sets grow. BioCortex is built around three ideas:
  1. Adaptive strategy routing — Not every task needs the same pipeline. Simple lookups use a fast ReAct loop; multi-step workflows use a DAG-parallel Planner–Executor–Critic–Synthesizer pipeline; exploratory research uses Monte Carlo Tree Search (MCTS).
  2. Hybrid retrieval at scale — Tool selection uses a three-stage pipeline: vector semantic search (fast recall), knowledge-graph expansion (dependency discovery), and LLM reranking (precision). This scales to 10,000+ tools and is 3–5× faster than LLM-only selection.
  3. Persistent intelligence — A three-tier memory system (working, episodic, semantic) and a biological knowledge graph let BioCortex learn from past analyses and ground claims for anti-hallucination verification.
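The first idea above, adaptive strategy routing, can be sketched as a small dispatch function. This is an illustrative sketch only; the names (`TaskProfile`, `route_task`) and the numeric thresholds are assumptions, not BioCortex's actual API.

```python
from dataclasses import dataclass

# Hypothetical sketch of adaptive strategy routing. TaskProfile,
# route_task, and the thresholds below are illustrative, not
# BioCortex's real interface.

@dataclass
class TaskProfile:
    complexity: float   # 0.0 (simple lookup) .. 1.0 (many dependent steps)
    uncertainty: float  # 0.0 (known recipe) .. 1.0 (open-ended exploration)

def route_task(profile: TaskProfile) -> str:
    """Pick an execution strategy from coarse task features."""
    if profile.uncertainty > 0.7:
        return "mcts"          # exploratory research: search over analysis paths
    if profile.complexity > 0.4:
        return "dag_parallel"  # multi-step workflow: Planner-Executor-Critic-Synthesizer
    return "simple_react"      # fast lookup: single ReAct loop
```

A simple lookup routes to the ReAct loop, while high-uncertainty tasks fall through to MCTS before complexity is even considered.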

Core Capabilities

  • Strategy Router: Classifies tasks by complexity and uncertainty; selects SimpleReAct, DAG Parallel, or MCTS.
  • Multi-Agent Pipeline: Planner (DAG decomposition), Executor (code + tools), Critic (validation), Synthesizer (report).
  • Hybrid Retrieval: Vector search → KG expansion → LLM rerank for 10K+ tools.
  • Self-Correction: Four levels (reflection-guided code repair, output refinement, plan revision, report self-improvement).
  • Memory: Working memory (session), episodic (past analyses, Engram-style recall), semantic (facts + KG).
  • Context-Window Awareness: Model-specific token budgets, auto-calibrated working memory, per-call truncation guards.
  • Resources Interface: Web UI views for Databases, Tools, and Packages; the same registry drives planning and execution.
  • Optional Integrations: AutoFigure (figures from text/papers), AI-Researcher (autonomous research pipeline).
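The three-stage hybrid retrieval described above (vector recall → KG expansion → rerank) can be sketched over toy data. Everything here is illustrative: the function names, the toy vectors, and the stand-in rerank callable (which would be an LLM call in practice) are assumptions.

```python
import math

# Illustrative three-stage tool retrieval over toy data; names and
# structure are assumptions, not BioCortex internals.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_retrieve(query_vec, tool_vecs, kg_edges, rerank_fn,
                    k_recall=3, k_final=2):
    # Stage 1: vector recall (fast, approximate)
    recalled = sorted(tool_vecs,
                      key=lambda t: cosine(query_vec, tool_vecs[t]),
                      reverse=True)[:k_recall]
    # Stage 2: knowledge-graph expansion (pull in tool dependencies)
    expanded = set(recalled)
    for tool in recalled:
        expanded.update(kg_edges.get(tool, ()))
    # Stage 3: rerank for precision (stand-in for an LLM reranker)
    return sorted(expanded, key=rerank_fn, reverse=True)[:k_final]
```

The KG stage can surface a dependency (e.g. an I/O tool) that vector search alone would miss, which is why it sits between recall and rerank.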

Recent Optimizations

The framework has been extended with production-oriented and usability features:
  • Context window & token budget — A model context window table (Qwen3-max, Claude, GPT-4, etc.), token estimation (CJK-aware), and auto-calibration of working memory from the active model. Every LLM call is guarded; the Synthesizer uses a dynamic per-step character budget so long pipelines stay within context.
  • Engram-inspired episodic memory — N-gram fingerprint index for O(1) candidate retrieval, multi-signal index (tools, domains, content, bio-entities), and a relevance gate so only high-relevance past analyses are injected into the Planner.
  • Resources page — A dedicated Resources view in the web UI (Databases, Tools, Packages) backed by GET /api/v1/resources. Tools are classified by domain; users can search, browse, and reference them in chat (e.g. with @).
  • Project–Chat linkage — The project panel (Tasks/Files) is shown only on the Chat page; Resources, Automation, Files, and Settings show a streamlined sidebar without project context.
  • AutoFigure & AI-Researcher — Optional tools for methodology/result figures from text or papers, and for running an external autonomous research pipeline (reference-based or idea-driven), registered in the same tool registry and visible on the Resources page.
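The Engram-inspired episodic memory above can be illustrated with a minimal n-gram fingerprint index: each n-gram lookup is O(1), and a relevance gate (minimum n-gram overlap) filters candidates before anything reaches the Planner. The class name, n-gram size, and threshold are illustrative assumptions.

```python
from collections import defaultdict

# Minimal sketch of an n-gram fingerprint index in the spirit of the
# Engram-inspired memory; names and thresholds are assumptions.

def ngrams(text: str, n: int = 3) -> set:
    t = text.lower()
    return {t[i:i + n] for i in range(len(t) - n + 1)}

class FingerprintIndex:
    def __init__(self, n: int = 3):
        self.n = n
        self.index = defaultdict(set)  # n-gram -> episode ids

    def add(self, episode_id: str, text: str) -> None:
        for g in ngrams(text, self.n):
            self.index[g].add(episode_id)

    def candidates(self, query: str, min_overlap: int = 2) -> set:
        # Each n-gram lookup is O(1); only episodes sharing at least
        # min_overlap n-grams with the query pass the relevance gate.
        hits = defaultdict(int)
        for g in ngrams(query, self.n):
            for eid in self.index.get(g, ()):
                hits[eid] += 1
        return {eid for eid, c in hits.items() if c >= min_overlap}
```

An off-topic past analysis shares at most a stray n-gram with the query and is gated out, so the Planner only sees high-relevance episodes.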

Recent Updates (Reports, UI & Logic)

The following changes improve report quality, Web UI behavior, and analysis feedback:

Reports and presentations (Nature-style figures)

  • Numbered figures — Every figure in the generated report is labeled Figure 1, Figure 2, … Figure N in order of appearance. The same numbering is used in the downloadable PowerPoint so report and presentation stay aligned.
  • Captions and explanations — Each figure has a short subtitle and a paragraph-style explanation (what the figure shows and how to interpret it). When an LLM is available, captions are generated automatically; otherwise a template caption is used.
  • Report structure — Reports follow an academic structure (Executive Summary, Dataset Overview, Methods, Results with figures, Discussion, Output Files, References). The Results section embeds figures with the format: Figure N. Subtitle → explanation → image.
  • PPT generation — The same ordered figure list and optional LLM captions are used for slide decks. Slide captions show Figure N. Caption for consistency. See Reports and Presentations.
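The ordering and numbering rule above (step order, then file name, with a template caption when no LLM is available) can be sketched as follows. The function name, tuple layout, and template wording are hypothetical, not BioCortex's internal API.

```python
# Illustrative figure numbering: sort by (step index, file name), then
# assign "Figure N." labels. caption_fn stands in for an LLM call; the
# template fallback wording is an assumption.

def number_figures(figures, caption_fn=None):
    """figures: list of (step_index, file_name) tuples."""
    ordered = sorted(figures, key=lambda f: (f[0], f[1]))
    labeled = []
    for n, (step, fname) in enumerate(ordered, start=1):
        if caption_fn is not None:
            caption = caption_fn(fname)         # e.g. an LLM-generated subtitle
        else:
            caption = f"Output of step {step}"  # template fallback
        labeled.append(f"Figure {n}. {caption} ({fname})")
    return labeled
```

Because both the report and the PPT generator consume the same ordered list, Figure N refers to the same image in both outputs.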

Web UI: usage and logic

  • Upload progress — File uploads show a progress bar with percentage, bytes transferred, and upload speed (e.g. KB/s or MB/s). When the total size is unknown, an indeterminate progress indicator is shown.
  • Immediate feedback — As soon as you send a message, the UI shows a thinking state (e.g. “Preparing your analysis…”). This avoids a blank wait before the first plan or reply.
  • Project and task history — Clicking a project in the sidebar loads that project’s conversation. If no messages were saved, a placeholder message is shown so you can continue in chat. Network or “project not found” errors are shown in-chat with clear guidance instead of a blank page.
  • Files page — Dedicated Upload dropdown (upload files or folder), New folder entry point, and an empty state with direct upload actions. File list supports list/grid view and clearer layout.
  • Sidebar – Files — The Files tab in the project panel shows file count, a View all link to the full File Manager, and a compact list of recent files with icons and sizes.
  • Analysis status — When an analysis fails or completes, all step spinners stop immediately. Failed steps show an error state; remaining steps are marked skipped so the UI reflects the final state without lingering loaders.
  • Error handling — Step errors (e.g. 504 Gateway Timeout from an external API) are sanitized into short, readable messages. WebSocket disconnects are handled so the UI does not get stuck.
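The upload-speed display described above (bytes per second rendered as KB/s or MB/s) comes down to a small formatting helper like this. The thresholds and one-decimal formatting are illustrative choices, not necessarily what the UI uses.

```python
# Sketch of human-readable upload-speed formatting; binary (1024-based)
# units and one-decimal precision are assumptions.

def format_speed(bytes_per_sec: float) -> str:
    if bytes_per_sec >= 1024 ** 2:
        return f"{bytes_per_sec / 1024 ** 2:.1f} MB/s"
    if bytes_per_sec >= 1024:
        return f"{bytes_per_sec / 1024:.1f} KB/s"
    return f"{bytes_per_sec:.0f} B/s"
```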

Configuration and pipeline

  • Phase 3 (final validation) — Off by default to save time and tokens. Set BIOCORTEX_ENABLE_FINAL_VALIDATION=true in .env to run final validation after execution and optionally revise the plan if issues are found.
  • AnnData / h5ad — The Executor prompt includes guidelines for safe h5ad serialization and for validating QC/annotation columns before use, reducing runtime errors in single-cell workflows.
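Reading the Phase 3 flag reduces to an environment lookup like the one below. The set of accepted truthy spellings ("true", "1", "yes") is an assumption here; the documented setting is `BIOCORTEX_ENABLE_FINAL_VALIDATION=true`.

```python
import os

# Minimal sketch of reading the Phase 3 flag from the environment.
# The extra truthy spellings ("1", "yes") are an assumption, not the
# documented set.

def final_validation_enabled(env=os.environ) -> bool:
    value = env.get("BIOCORTEX_ENABLE_FINAL_VALIDATION", "false")
    return value.strip().lower() in {"true", "1", "yes"}
```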

Usage Flow and Logic (Overview)

Typical usage flow (Web UI): Log in → create or select a project → (optional) upload files → type a task in chat → see immediate “thinking” feedback → backend selects strategy and generates a plan → you confirm the plan (if required) → steps execute and stream progress → report appears with Figure 1..N and captions → download report or generate PPT. Details: Web UI – Usage flow and Getting Started – Typical usage flow.

Logic at a glance: The Strategy Router classifies your task (simple vs pipeline vs exploratory) and chooses ReAct, DAG, or MCTS. The Planner decomposes the task into steps; the Executor runs code and tools; the Critic validates; the Synthesizer produces the report. Figure order is fixed by analysis step order plus file name; captions come from an LLM call (when available) or a template. Report and PPT share the same figure list and captions so numbering stays consistent. See Web UI – Logic, Reports and Presentations – Report generation flow, and Logic: figure ordering and captions for more.

When to Use BioCortex

  • Structured multi-step analyses — e.g. scRNA-seq: QC → normalization → clustering → annotation → differential expression.
  • Cross-domain workflows — Combine genomics, literature, and pathway tools in one plan.
  • Exploratory research — Use MCTS when the best analytical path is unknown.
  • Reproducibility — Full provenance and Jupyter notebook export.
  • Lab automation — Webhooks, cron jobs, and browser tools for literature and genomics databases.

Next Steps