Introduction

BioCortex (Biological Cortical Task Executor) is a multi-agent AI framework for automated biological research. It plans, executes, and synthesizes complex analytical workflows across all biological domains — from single-cell transcriptomics and protein structure to ecology, pharmacology, and genome engineering.

Why BioCortex?

Biological research today involves many data types and tools. Existing LLM-based agents often use a single, sequential reasoning loop and scale poorly as tool sets grow. BioCortex is built around three ideas:
  1. Adaptive strategy routing — Not every task needs the same pipeline. Simple lookups use a fast ReAct loop; multi-step workflows use a DAG-parallel Planner–Executor–Critic–Synthesizer pipeline; exploratory research uses Monte Carlo Tree Search (MCTS).
  2. Hybrid retrieval at scale — Tool selection uses a three-stage pipeline: vector semantic search (fast recall), knowledge-graph expansion (dependency discovery), and LLM reranking (precision). This scales to 10,000+ tools and is 3–5× faster than LLM-only selection.
  3. Persistent intelligence — A three-tier memory system (working, episodic, semantic) and a biological knowledge graph let BioCortex learn from past analyses and ground claims for anti-hallucination verification.
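The first idea above, adaptive strategy routing, can be sketched as a small dispatch function. This is an illustrative sketch only; the names (`TaskProfile`, `route_task`) and the numeric thresholds are assumptions, not BioCortex's actual API.

```python
from dataclasses import dataclass

# Hypothetical sketch of adaptive strategy routing. TaskProfile,
# route_task, and the thresholds below are illustrative, not
# BioCortex's real interface.

@dataclass
class TaskProfile:
    complexity: float   # 0.0 (simple lookup) .. 1.0 (many dependent steps)
    uncertainty: float  # 0.0 (known recipe) .. 1.0 (open-ended exploration)

def route_task(profile: TaskProfile) -> str:
    """Pick an execution strategy from coarse task features."""
    if profile.uncertainty > 0.7:
        return "mcts"          # exploratory research: search over analysis paths
    if profile.complexity > 0.4:
        return "dag_parallel"  # multi-step workflow: Planner-Executor-Critic-Synthesizer
    return "simple_react"      # fast lookup: single ReAct loop
```

A simple lookup routes to the ReAct loop, while high-uncertainty tasks fall through to MCTS before complexity is even considered.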

Core Capabilities

  • Strategy Router: Classifies tasks by complexity and uncertainty; selects SimpleReAct, DAG Parallel, or MCTS.
  • Multi-Agent Pipeline: Planner (DAG decomposition), Executor (code + tools), Critic (validation), Synthesizer (report).
  • Hybrid Retrieval: Vector search → KG expansion → LLM rerank for 10K+ tools.
  • Self-Correction: Four levels (reflection-guided code repair, output refinement, plan revision, report self-improvement).
  • Memory: Working memory (session), episodic (past analyses, Engram-style recall), semantic (facts + KG).
  • Context-Window Awareness: Model-specific token budgets, auto-calibrated working memory, per-call truncation guards.
  • Resources Interface: Web UI views for Databases, Tools, and Packages; the same registry drives planning and execution.
  • Optional Integrations: AutoFigure (figures from text/papers), AI-Researcher (autonomous research pipeline).
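The three-stage hybrid retrieval described above (vector recall → KG expansion → rerank) can be sketched over toy data. Everything here is illustrative: the function names, the toy vectors, and the stand-in rerank callable (which would be an LLM call in practice) are assumptions.

```python
import math

# Illustrative three-stage tool retrieval over toy data; names and
# structure are assumptions, not BioCortex internals.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_retrieve(query_vec, tool_vecs, kg_edges, rerank_fn,
                    k_recall=3, k_final=2):
    # Stage 1: vector recall (fast, approximate)
    recalled = sorted(tool_vecs,
                      key=lambda t: cosine(query_vec, tool_vecs[t]),
                      reverse=True)[:k_recall]
    # Stage 2: knowledge-graph expansion (pull in tool dependencies)
    expanded = set(recalled)
    for tool in recalled:
        expanded.update(kg_edges.get(tool, ()))
    # Stage 3: rerank for precision (stand-in for an LLM reranker)
    return sorted(expanded, key=rerank_fn, reverse=True)[:k_final]
```

The KG stage can surface a dependency (e.g. an I/O tool) that vector search alone would miss, which is why it sits between recall and rerank.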

Recent Optimizations

The framework has been extended with production-oriented and usability features:
  • Context window & token budget — A model context window table (Qwen3-max, Claude, GPT-4, etc.), token estimation (CJK-aware), and auto-calibration of working memory from the active model. Every LLM call is guarded; the Synthesizer uses a dynamic per-step character budget so long pipelines stay within context.
  • Engram-inspired episodic memory — N-gram fingerprint index for O(1) candidate retrieval, multi-signal index (tools, domains, content, bio-entities), and a relevance gate so only high-relevance past analyses are injected into the Planner.
  • Resources page — A dedicated Resources view in the web UI (Databases, Tools, Packages) backed by GET /api/v1/resources. Tools are classified by domain; users can search, browse, and reference them in chat (e.g. with @).
  • Project–Chat linkage — The project panel (Tasks/Files) is shown only on the Chat page; Resources, Automation, Files, and Settings show a streamlined sidebar without project context.
  • AutoFigure & AI-Researcher — Optional tools for methodology/result figures from text or papers, and for running an external autonomous research pipeline (reference-based or idea-driven), registered in the same tool registry and visible on the Resources page.
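The Engram-inspired episodic memory above can be illustrated with a minimal n-gram fingerprint index: each n-gram lookup is O(1), and a relevance gate (minimum n-gram overlap) filters candidates before anything reaches the Planner. The class name, n-gram size, and threshold are illustrative assumptions.

```python
from collections import defaultdict

# Minimal sketch of an n-gram fingerprint index in the spirit of the
# Engram-inspired memory; names and thresholds are assumptions.

def ngrams(text: str, n: int = 3) -> set:
    t = text.lower()
    return {t[i:i + n] for i in range(len(t) - n + 1)}

class FingerprintIndex:
    def __init__(self, n: int = 3):
        self.n = n
        self.index = defaultdict(set)  # n-gram -> episode ids

    def add(self, episode_id: str, text: str) -> None:
        for g in ngrams(text, self.n):
            self.index[g].add(episode_id)

    def candidates(self, query: str, min_overlap: int = 2) -> set:
        # Each n-gram lookup is O(1); only episodes sharing at least
        # min_overlap n-grams with the query pass the relevance gate.
        hits = defaultdict(int)
        for g in ngrams(query, self.n):
            for eid in self.index.get(g, ()):
                hits[eid] += 1
        return {eid for eid, c in hits.items() if c >= min_overlap}
```

An off-topic past analysis shares at most a stray n-gram with the query and is gated out, so the Planner only sees high-relevance episodes.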

Recent Updates (Reports, UI & Logic)

The following changes improve report quality, Web UI behavior, and analysis feedback:

Reports and presentations (Nature-style figures)

  • Numbered figures — Every figure in the generated report is labeled Figure 1, Figure 2, … Figure N in order of appearance. The same numbering is used in the downloadable PowerPoint so report and presentation stay aligned.
  • Captions and explanations — Each figure has a short subtitle and a paragraph-style explanation (what the figure shows and how to interpret it). When an LLM is available, captions are generated automatically; otherwise a template caption is used.
  • Report structure — Reports follow an academic structure (Executive Summary, Dataset Overview, Methods, Results with figures, Discussion, Output Files, References). The Results section embeds figures with the format: Figure N. Subtitle → explanation → image.
  • PPT generation — The same ordered figure list and optional LLM captions are used for slide decks. Slide captions show Figure N. Caption for consistency. See Reports and Presentations.
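The ordering and numbering rule above (step order, then file name, with a template caption when no LLM is available) can be sketched as follows. The function name, tuple layout, and template wording are hypothetical, not BioCortex's internal API.

```python
# Illustrative figure numbering: sort by (step index, file name), then
# assign "Figure N." labels. caption_fn stands in for an LLM call; the
# template fallback wording is an assumption.

def number_figures(figures, caption_fn=None):
    """figures: list of (step_index, file_name) tuples."""
    ordered = sorted(figures, key=lambda f: (f[0], f[1]))
    labeled = []
    for n, (step, fname) in enumerate(ordered, start=1):
        if caption_fn is not None:
            caption = caption_fn(fname)         # e.g. an LLM-generated subtitle
        else:
            caption = f"Output of step {step}"  # template fallback
        labeled.append(f"Figure {n}. {caption} ({fname})")
    return labeled
```

Because both the report and the PPT generator consume the same ordered list, Figure N refers to the same image in both outputs.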

Web UI: usage and logic

  • Upload progress — File uploads show a progress bar with percentage, bytes transferred, and upload speed (e.g. KB/s or MB/s). When the total size is unknown, an indeterminate progress indicator is shown.
  • Immediate feedback — As soon as you send a message, the UI shows a thinking state (e.g. “Preparing your analysis…”). This avoids a blank wait before the first plan or reply.
  • Project and task history — Clicking a project in the sidebar loads that project’s conversation. If no messages were saved, a placeholder message is shown so you can continue in chat. Network or “project not found” errors are shown in-chat with clear guidance instead of a blank page.
  • Files page — Dedicated Upload dropdown (upload files or folder), New folder entry point, and an empty state with direct upload actions. File list supports list/grid view and clearer layout.
  • Sidebar – Files — The Files tab in the project panel shows file count, a View all link to the full File Manager, and a compact list of recent files with icons and sizes.
  • Analysis status — When an analysis fails or completes, all step spinners stop immediately. Failed steps show an error state; remaining steps are marked skipped so the UI reflects the final state without lingering loaders.
  • Error handling — Step errors (e.g. 504 Gateway Timeout from an external API) are sanitized into short, readable messages. WebSocket disconnects are handled so the UI does not get stuck.
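The upload-speed display described above (bytes per second rendered as KB/s or MB/s) comes down to a small formatting helper like this. The thresholds and one-decimal formatting are illustrative choices, not necessarily what the UI uses.

```python
# Sketch of human-readable upload-speed formatting; binary (1024-based)
# units and one-decimal precision are assumptions.

def format_speed(bytes_per_sec: float) -> str:
    if bytes_per_sec >= 1024 ** 2:
        return f"{bytes_per_sec / 1024 ** 2:.1f} MB/s"
    if bytes_per_sec >= 1024:
        return f"{bytes_per_sec / 1024:.1f} KB/s"
    return f"{bytes_per_sec:.0f} B/s"
```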

Configuration and pipeline

  • Phase 3 (final validation) — Off by default to save time and tokens. Set BIOCORTEX_ENABLE_FINAL_VALIDATION=true in .env to run final validation after execution and optionally revise the plan if issues are found.
  • AnnData / h5ad — The Executor prompt includes guidelines for safe h5ad serialization and for validating QC/annotation columns before use, reducing runtime errors in single-cell workflows.
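Reading the Phase 3 flag reduces to an environment lookup like the one below. The set of accepted truthy spellings ("true", "1", "yes") is an assumption here; the documented setting is `BIOCORTEX_ENABLE_FINAL_VALIDATION=true`.

```python
import os

# Minimal sketch of reading the Phase 3 flag from the environment.
# The extra truthy spellings ("1", "yes") are an assumption, not the
# documented set.

def final_validation_enabled(env=os.environ) -> bool:
    value = env.get("BIOCORTEX_ENABLE_FINAL_VALIDATION", "false")
    return value.strip().lower() in {"true", "1", "yes"}
```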

Usage Flow and Logic (Overview)

Typical usage flow (Web UI): Log in → create or select a project → (optional) upload files → type a task in chat → see immediate “thinking” feedback → backend selects strategy and generates a plan → you confirm the plan (if required) → steps execute and stream progress → report appears with Figure 1..N and captions → download report or generate PPT. Details: Web UI – Usage flow and Getting Started – Typical usage flow.

Logic at a glance: The Strategy Router classifies your task (simple vs pipeline vs exploratory) and chooses ReAct, DAG, or MCTS. The Planner decomposes the task into steps; the Executor runs code and tools; the Critic validates; the Synthesizer produces the report. Figure order is fixed by analysis step order plus file name; captions come from an LLM call (when available) or a template. Report and PPT share the same figure list and captions so numbering stays consistent. See Web UI – Logic, Reports and Presentations – Report generation flow, and Logic: figure ordering and captions for more.

When to Use BioCortex

  • Structured multi-step analyses — e.g. scRNA-seq: QC → normalization → clustering → annotation → differential expression.
  • Cross-domain workflows — Combine genomics, literature, and pathway tools in one plan.
  • Exploratory research — Use MCTS when the best analytical path is unknown.
  • Reproducibility — Full provenance and Jupyter notebook export.
  • Lab automation — Webhooks, cron jobs, and browser tools for literature and genomics databases.

Next Steps