Introduction

Hallucinator is a tool that validates academic citations against multiple databases to detect fabricated references, retracted papers, and author mismatches.

See the project README, MANIFESTO, and Rust README on GitHub for project context.

Use the sidebar to navigate the documentation.

System Architecture Overview

Hallucinator is a multi-crate Rust workspace that validates academic references extracted from PDFs against 10+ academic databases. This document covers the high-level architecture, key design decisions, and how the pieces fit together.

Workspace Structure

The workspace lives in hallucinator-rs/ and contains 10 member crates plus 2 excluded crates:

hallucinator-rs/
├── crates/
│   ├── hallucinator-core       # Validation engine, DB backends, caching, rate limiting
│   ├── hallucinator-parsing    # Reference parsing pipeline (backend-agnostic)
│   ├── hallucinator-pdf-mupdf  # MuPDF backend (AGPL-licensed)
│   ├── hallucinator-bbl        # BibTeX .bbl/.bib file parsing
│   ├── hallucinator-ingest     # Unified file dispatch + archive handling
│   ├── hallucinator-dblp       # DBLP offline database (RDF → SQLite FTS5)
│   ├── hallucinator-acl        # ACL Anthology offline database
│   ├── hallucinator-reporting  # Export formats (JSON, CSV, Markdown, HTML, Text)
│   ├── hallucinator-cli        # CLI binary
│   ├── hallucinator-tui        # TUI binary (Ratatui)
│   ├── hallucinator-python     # PyO3 Python bindings (excluded from workspace)
│   └── hallucinator-web        # Axum web server (excluded from workspace)

Only hallucinator-cli and hallucinator-tui are distributed as release binaries. The Python and web crates are excluded from the workspace to avoid CI complications (pyo3 version conflicts, unnecessary axum compilation during dist builds).

Crate Dependency Graph

                    ┌──────────────┐
                    │  CLI / TUI   │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │    ingest    │──────────────┐
                    └──────┬───────┘              │
                           │                      │
              ┌────────────▼────────────┐   ┌─────▼───────┐
              │          core           │   │  parsing    │
              │  (validation, DB, cache)│   │  (extract)  │
              └────┬───────┬───────┬────┘   └─────┬───────┘
                   │       │       │              │
              ┌────▼──┐ ┌──▼───┐ ┌▼────────┐ ┌───▼────────┐
              │ dblp  │ │ acl  │ │reporting│ │ pdf-mupdf  │
              └───────┘ └──────┘ └─────────┘ │  (AGPL)    │
                                             └────────────┘

AGPL Isolation

The MuPDF library is licensed under AGPL. To keep the rest of the codebase under a permissive license:

  • hallucinator-core defines the PdfBackend trait (permissive license)
  • hallucinator-pdf-mupdf implements PdfBackend using MuPDF (AGPL)
  • Only the final binaries (CLI/TUI) link to the AGPL crate
  • Library consumers (hallucinator-core, hallucinator-python) never depend on hallucinator-pdf-mupdf directly

This means the core validation logic remains AGPL-free. Alternative PDF backends (e.g., pdf-extract, pdfium) can be added by implementing the PdfBackend trait.

Key Traits

DatabaseBackend

Defined in hallucinator-core/src/db/mod.rs. Every academic database implements this trait:

pub trait DatabaseBackend: Send + Sync {
    fn name(&self) -> &str;
    fn is_local(&self) -> bool { false }
    fn requires_doi(&self) -> bool { false }
    fn query(
        &self,
        title: &str,
        client: &reqwest::Client,
        timeout: Duration,
    ) -> Pin<Box<dyn Future<Output = Result<DbQueryResult, DbQueryError>> + Send>>;
    fn query_doi(
        &self,
        doi: &str,
        title: &str,
        authors: &[String],
        client: &reqwest::Client,
        timeout: Duration,
    ) -> DoiQueryResult;
}

Local backends (is_local() = true) are queried inline by the coordinator before fanning out to remote drainers. See Concurrency Model for details.

PdfBackend

Defined in hallucinator-core. Abstracts PDF text extraction:

pub trait PdfBackend {
    fn extract_text(&self, path: &Path) -> Result<String, String>;
}
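As an illustration of the backend abstraction, an alternative extractor only has to implement this one trait. The sketch below reproduces the trait so it compiles standalone and plugs in a hypothetical NullBackend that returns canned text; a real pdf-extract or pdfium backend would follow the same shape.

```rust
use std::path::Path;

// PdfBackend trait, reproduced from the docs above so the sketch is self-contained.
pub trait PdfBackend {
    fn extract_text(&self, path: &Path) -> Result<String, String>;
}

// Hypothetical alternative backend: returns canned text instead of calling
// into a PDF library.
struct NullBackend;

impl PdfBackend for NullBackend {
    fn extract_text(&self, _path: &Path) -> Result<String, String> {
        Ok(String::from("References\n[1] A. Author, \"A Title\", 2023."))
    }
}

fn main() {
    // Callers hold a trait object, so backends are swappable at runtime.
    let backend: Box<dyn PdfBackend> = Box::new(NullBackend);
    let text = backend.extract_text(Path::new("paper.pdf")).unwrap();
    assert!(text.contains("References"));
}
```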

Configuration Layering

Configuration is resolved with the following precedence (highest wins):

  1. CLI flags — --num-workers 8, --dblp-offline /path, etc.
  2. Environment variables — OPENALEX_KEY, DB_TIMEOUT, SEARXNG_URL, etc.
  3. CWD config — .hallucinator.toml in the current directory
  4. Platform config — ~/.config/hallucinator/config.toml (Linux/macOS) or %APPDATA%\hallucinator\config.toml (Windows)
  5. Defaults — Hardcoded defaults (4 workers, 10s timeout, etc.)

CWD config overlays platform config field-by-field, so you can keep API keys in the global config and override concurrency settings per-project.
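The field-by-field overlay can be sketched with Option fields, where the higher-precedence layer wins only for fields it actually sets. FileConfig here is a hypothetical stand-in, not the real config struct:

```rust
// Hypothetical, minimal version of the field-by-field overlay: a field from
// the higher-precedence layer wins only when it is set.
#[derive(Clone, Debug, PartialEq)]
struct FileConfig {
    openalex_key: Option<String>,
    num_workers: Option<usize>,
}

impl FileConfig {
    // `self` is the CWD config; `base` is the platform config.
    fn overlay(self, base: FileConfig) -> FileConfig {
        FileConfig {
            openalex_key: self.openalex_key.or(base.openalex_key),
            num_workers: self.num_workers.or(base.num_workers),
        }
    }
}

fn main() {
    let platform = FileConfig {
        openalex_key: Some("key-from-global".into()),
        num_workers: Some(4),
    };
    let cwd = FileConfig { openalex_key: None, num_workers: Some(8) };
    let merged = cwd.overlay(platform);
    // API key kept from the global config, worker count overridden per-project.
    assert_eq!(merged.openalex_key.as_deref(), Some("key-from-global"));
    assert_eq!(merged.num_workers, Some(8));
}
```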

See Configuration for the full reference.

Caching

A two-tier cache prevents redundant API calls:

  • L1 (in-memory): DashMap — lock-free concurrent reads, sub-microsecond lookups
  • L2 (optional SQLite): WAL-mode database for persistence across runs

Cache keys use aggressive title normalization (Unicode NFKD, Greek letter transliteration, math symbol replacement, ASCII-only lowercasing) to maximize hit rates across PDF extraction artifacts.

TTLs: 7 days for positive (found) entries, 24 hours for negative (not-found) entries. Both are configurable.
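The TTL logic can be modeled with a single map, assuming a `Cache` type invented for illustration (the real implementation uses DashMap for L1 and SQLite for L2):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Illustrative single-tier stand-in for the cache; the TTL check is the
// same shape the two-tier cache applies on lookup.
struct Cache {
    map: HashMap<String, (bool, Instant)>, // normalized title -> (found?, inserted_at)
    positive_ttl: Duration,                // default: 7 days
    negative_ttl: Duration,                // default: 24 hours
}

impl Cache {
    fn get(&self, key: &str) -> Option<bool> {
        let (found, at) = self.map.get(key)?;
        // Positive and negative entries expire on different schedules.
        let ttl = if *found { self.positive_ttl } else { self.negative_ttl };
        (at.elapsed() < ttl).then_some(*found)
    }

    fn insert(&mut self, key: String, found: bool) {
        self.map.insert(key, (found, Instant::now()));
    }
}

fn main() {
    let mut cache = Cache {
        map: HashMap::new(),
        positive_ttl: Duration::from_secs(7 * 24 * 3600),
        negative_ttl: Duration::from_secs(24 * 3600),
    };
    cache.insert("attentionisallyouneed".into(), true);
    assert_eq!(cache.get("attentionisallyouneed"), Some(true));
    assert_eq!(cache.get("unknowntitle"), None);
}
```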

See Concurrency Model for how the cache interacts with the drainer pool.

Rate Limiting

Each remote database has its own AdaptiveDbLimiter using the governor crate for token-bucket rate limiting:

  • Per-DB drainer task — Each drainer is the sole consumer of its DB’s rate limiter, eliminating governor contention
  • Adaptive backoff — On HTTP 429: doubles the slowdown factor (1x → 2x → 4x → … → 16x max), atomically swaps the governor via ArcSwap
  • Recovery — After 30 seconds without a 429, the original rate is restored
  • Default rates — CrossRef 1/s (3/s with crossref_mailto), arXiv 3/s, DBLP 1/s, Semantic Scholar varies by API key presence
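The backoff arithmetic above can be sketched as a small state machine. This is illustrative only: the real code swaps governor rate limiters via ArcSwap rather than tracking a bare factor.

```rust
// Sketch of the slowdown-factor arithmetic: each HTTP 429 doubles the factor
// up to 16x; a quiet period restores the original rate.
struct Backoff {
    factor: u32, // 1, 2, 4, 8, or 16
}

impl Backoff {
    fn on_rate_limited(&mut self) {
        self.factor = (self.factor * 2).min(16);
    }

    fn on_recovered(&mut self) {
        self.factor = 1;
    }

    // Effective interval between requests, given the base interval in seconds.
    fn delay_secs(&self, base_interval_secs: u64) -> u64 {
        base_interval_secs * self.factor as u64
    }
}

fn main() {
    let mut b = Backoff { factor: 1 };
    for _ in 0..5 {
        b.on_rate_limited();
    }
    assert_eq!(b.factor, 16); // capped: 1 -> 2 -> 4 -> 8 -> 16
    assert_eq!(b.delay_secs(1), 16);
    b.on_recovered();
    assert_eq!(b.factor, 1);
}
```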

Title Matching

References are matched using fuzzy string comparison with a 95% similarity threshold (via rapidfuzz). Before comparison, titles are normalized:

  1. HTML entity unescaping
  2. Separated diacritic fixing (e.g., B ¨UNZ → BÜNZ)
  3. Greek letter transliteration (α → alpha, β → beta)
  4. Math symbol replacement (√ → sqrt, ∞ → infinity)
  5. Unicode NFKD decomposition
  6. Strip to [a-z0-9] only
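A sketch of the final steps, transliteration plus the strip to [a-z0-9]. The real pipeline also performs HTML unescaping, NFKD decomposition, and fuller symbol tables; the mapping below covers only a couple of characters for illustration.

```rust
// Sketch of the tail of the normalization pipeline: transliterate a few
// Greek letters and math symbols, lowercase, then keep only [a-z0-9].
fn normalize_title(title: &str) -> String {
    let transliterated: String = title
        .chars()
        .flat_map(|c| match c {
            'α' => "alpha".chars().collect::<Vec<_>>(),
            'β' => "beta".chars().collect(),
            '√' => "sqrt".chars().collect(),
            other => vec![other],
        })
        .collect();
    transliterated
        .to_lowercase()
        .chars()
        .filter(|c| c.is_ascii_alphanumeric())
        .collect()
}

fn main() {
    assert_eq!(
        normalize_title("β-VAE: Learning Basic Concepts"),
        "betavaelearningbasicconcepts"
    );
    assert_eq!(normalize_title("Attention Is All You Need"), "attentionisallyouneed");
}
```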

Author Validation

Two modes based on the quality of extracted author names:

  • Full mode — Normalizes each author to FirstInitial Surname, checks set intersection between PDF authors and DB authors
  • Last-name-only mode — Used when >50% of reference authors lack first names/initials; compares surnames only with partial suffix matching for multi-word surnames
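Full mode can be sketched as set intersection over normalized names. These are hypothetical helpers; the real matcher handles more edge cases (and the last-name-only fallback).

```rust
use std::collections::HashSet;

// Normalize "Ashish Vaswani" or "A. Vaswani" to "a vaswani"
// (first initial + lowercased surname).
fn normalize_author(name: &str) -> Option<String> {
    let parts: Vec<&str> = name.split_whitespace().collect();
    let first = parts.first()?.chars().next()?.to_lowercase().next()?;
    let last = parts.last()?.to_lowercase();
    Some(format!("{first} {last}"))
}

// Full mode: do the normalized PDF and DB author sets intersect?
fn authors_match(pdf: &[&str], db: &[&str]) -> bool {
    let norm = |names: &[&str]| -> HashSet<String> {
        names.iter().filter_map(|n| normalize_author(n)).collect()
    };
    !norm(pdf).is_disjoint(&norm(db))
}

fn main() {
    assert!(authors_match(&["Ashish Vaswani", "Noam Shazeer"], &["A. Vaswani"]));
    assert!(!authors_match(&["John Smith"], &["Jane Doe"]));
}
```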

Entry Points

All interfaces consume the same hallucinator-core library:

  • CLI (hallucinator-cli) — Single-file checking with colored terminal output
  • TUI (hallucinator-tui) — Batch processing with Ratatui, result navigation, false-positive overrides
  • Web (hallucinator-web) — Axum HTTP server with SSE streaming (excluded from workspace)
  • Python (hallucinator-python) — PyO3 bindings with pre-compiled wheels (excluded from workspace)
  • Library (hallucinator-core) — Direct Rust API via check_references()

The core check_references() function signature:

pub async fn check_references(
    refs: Vec<Reference>,
    config: Config,
    progress: impl Fn(ProgressEvent) + Send + Sync + 'static,
    cancel: CancellationToken,
) -> Vec<ValidationResult>

Data Flow: PDF to Results

This document traces a reference’s journey from PDF file to final validation result.

Pipeline Overview

PDF file
  │
  ▼
┌──────────────────┐
│  File Dispatch   │  hallucinator-ingest
│  (PDF/BBL/BIB/   │  Detects file type, extracts from archives
│   archive)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Text Extraction  │  hallucinator-parsing + hallucinator-pdf-mupdf
│  (PdfBackend)    │  MuPDF extracts raw text with ligature expansion
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Section Detection│  hallucinator-parsing/src/section.rs
│                  │  Locates "References" / "Bibliography" header
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Segmentation    │  hallucinator-parsing/src/section.rs
│                  │  Splits section into individual references
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Title/Author     │  hallucinator-parsing/src/title.rs, authors.rs
│  Extraction      │  Parses title, authors, DOI, arXiv ID per ref
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Skip Filtering   │  hallucinator-parsing/src/extractor.rs
│                  │  Removes URL-only and short-title refs
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│   Validation     │  hallucinator-core (pool, orchestrator, db/*)
│   Pool           │  Concurrent DB queries with early exit
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Result Assembly  │  hallucinator-core/src/pool.rs
│                  │  Merge local+remote results, retraction check
└────────┬─────────┘
         │
         ▼
  Vec<ValidationResult>

Stage 1: File Dispatch

Crate: hallucinator-ingest

The ingest crate handles file type detection and archive extraction:

  • PDF files — Passed to the PDF extraction pipeline
  • BBL/BIB files — Parsed by hallucinator-bbl (LaTeX bibliography entries)
  • Archives (.tar.gz, .zip) — Extracted streaming via ArchiveIterator, each contained PDF processed independently
  • Size limits — Configurable max_archive_size_mb to prevent resource exhaustion

Stage 2: Text Extraction

Crate: hallucinator-parsing + hallucinator-pdf-mupdf

The PdfBackend trait abstracts text extraction. The MuPDF backend:

  1. Opens the PDF and iterates page-by-page
  2. Extracts raw text blocks
  3. Expands ligatures (fi, fl, ff, etc.)
  4. Fixes hyphenation — distinguishes syllable breaks from compound words using a suffix heuristic

Stage 3: Section Detection

File: hallucinator-parsing/src/section.rs

Locates the references section by scanning for header patterns:

  • Primary: References, Bibliography, REFERENCES, BIBLIOGRAPHY
  • End markers: Appendix, Acknowledgments, Supplementary, Author Contributions

If no header is found, falls back to using the last 30% of the document text.

The section text between the header and the first end-marker (or EOF) is extracted.
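The detection logic can be sketched as follows, with a hypothetical references_section helper; end-marker handling is omitted for brevity.

```rust
// Sketch of header detection with the last-30% fallback described above
// (hypothetical simplification of section.rs).
fn references_section(text: &str) -> &str {
    for header in ["References", "Bibliography", "REFERENCES", "BIBLIOGRAPHY"] {
        if let Some(pos) = text.find(header) {
            return &text[pos + header.len()..];
        }
    }
    // No header found: fall back to the last 30% of the document.
    let start = text.len() * 7 / 10;
    // Snap to a char boundary so slicing is safe with non-ASCII text.
    let start = (start..text.len())
        .find(|&i| text.is_char_boundary(i))
        .unwrap_or(text.len());
    &text[start..]
}

fn main() {
    let doc = "Intro text.\nReferences\n[1] A. Author.";
    assert_eq!(references_section(doc).trim(), "[1] A. Author.");
}
```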

Stage 4: Reference Segmentation

File: hallucinator-parsing/src/section.rs

Individual references are split using priority-ordered strategies:

  1. IEEE — [1], [2], … — e.g., [1] A. Author, "Title..."
  2. Numbered — 1., 2., … — e.g., 1. Author, Title...
  3. ML author-based — Full names / initials — e.g., Author, A. B. (2023). Title...
  4. Springer/Nature — Uppercase + (YYYY) — e.g., AUTHOR, A. Title. J. (2023)
  5. Fallback — Double newline — two blank lines between refs

The system tries each strategy and picks the one that produces the most valid segments. For IEEE and numbered styles, a sequential check ensures numbering is contiguous.
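A minimal sketch of that sequential check for the IEEE style, assuming segments arrive as raw strings (hypothetical helper, not the actual section.rs code):

```rust
// The [n] markers must count up contiguously from 1 for the IEEE strategy
// to be accepted; a gap suggests a bad split.
fn ieee_numbering_is_contiguous(segments: &[&str]) -> bool {
    segments.iter().enumerate().all(|(i, seg)| {
        seg.trim_start()
            .strip_prefix('[')
            .and_then(|rest| rest.split(']').next())
            .and_then(|n| n.parse::<usize>().ok())
            == Some(i + 1)
    })
}

fn main() {
    assert!(ieee_numbering_is_contiguous(&[
        "[1] A. Author, \"First\"",
        "[2] B. Author, \"Second\"",
    ]));
    // A gap in the numbering rejects the segmentation.
    assert!(!ieee_numbering_is_contiguous(&["[1] ...", "[3] ..."]));
}
```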

Stage 5: Title and Author Extraction

Files: hallucinator-parsing/src/title.rs, authors.rs, identifiers.rs

For each segmented reference:

  1. DOI extraction — Regex: /10\.\d+/[^\s]+/
  2. arXiv ID extraction — Regex for arXiv:YYMM.NNNNN patterns
  3. Title extraction — Two strategies tried in order:
    • Quoted strings (e.g., "Title Here")
    • Capitalized word sequences between author and venue patterns
  4. Author extraction — Format-specific parsing for IEEE, ACM, USENIX, AAAI, NeurIPS styles
  5. Em-dash handling — “———” means “same authors as previous reference”

Stage 6: Skip Filtering

File: hallucinator-parsing/src/extractor.rs

References are skipped (not validated) if:

  • URL-only — The reference is just a URL to a non-academic site (GitHub, docs, etc.)
  • Short title — Title has fewer than 5 words (prone to false matches), unless a DOI or arXiv ID is present
  • No title — No title could be extracted

Skip statistics are tracked and reported: total_raw, url_only, short_title, no_title.
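The skip rules can be sketched as a predicate returning the skip reason. ParsedRef is a hypothetical stand-in for the real Reference type, and the URL-only check is omitted here.

```rust
// Illustrative skip predicate: None means "validate this reference",
// Some(reason) means it is skipped and counted in the statistics.
struct ParsedRef<'a> {
    title: Option<&'a str>,
    doi: Option<&'a str>,
    arxiv_id: Option<&'a str>,
}

fn should_skip(r: &ParsedRef) -> Option<&'static str> {
    let title = match r.title {
        None => return Some("no_title"),
        Some(t) => t,
    };
    // A DOI or arXiv ID rescues an otherwise-too-short title.
    let has_identifier = r.doi.is_some() || r.arxiv_id.is_some();
    if title.split_whitespace().count() < 5 && !has_identifier {
        return Some("short_title");
    }
    None
}

fn main() {
    let short = ParsedRef { title: Some("Deep Learning"), doi: None, arxiv_id: None };
    assert_eq!(should_skip(&short), Some("short_title"));

    let with_doi = ParsedRef {
        title: Some("Deep Learning"),
        doi: Some("10.1038/nature14539"),
        arxiv_id: None,
    };
    assert_eq!(should_skip(&with_doi), None);
}
```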

Stage 7: Validation

Crate: hallucinator-core (see Concurrency Model for the full deep dive)

Each reference goes through:

  1. Coordinator picks up reference from job queue
  2. Local DB query (DBLP offline, ACL offline) — inline, < 1ms
  3. If verified locally → skip all remote DBs, emit result immediately
  4. Cache pre-check — synchronously check cache for all remote DBs
  5. If verified from cache → skip all drainers
  6. Fan out cache-miss DBs to per-DB drainer queues
  7. Drainer queries DB — rate-limited HTTP call
  8. Author validation — compare PDF authors against DB authors
  9. Early exit — if any drainer verifies, others skip remaining work

Database Query Flow (per reference, per DB)

Drainer receives job
  │
  ├─ Already verified? → skip
  ├─ Cancelled? → skip
  ├─ Requires DOI but ref has none? → skip
  │
  ▼
Rate limit acquire (governor token)
  │
  ▼
Cache check
  ├─ Cache hit → return cached result
  │
  ▼
HTTP request (with timeout)
  │
  ├─ Success + title found → author validation
  │     ├─ Authors match → set verified flag
  │     └─ Authors don't match → record mismatch
  ├─ Success + title not found → NoMatch
  ├─ 429 Rate Limited → adaptive backoff + retry
  └─ Error/Timeout → record failure
  │
  ▼
Cache insert (if successful)
  │
  ▼
Decrement remaining counter
  ├─ Not last → done
  └─ Last drainer → finalize result

Stage 8: Result Assembly

File: hallucinator-core/src/pool.rs (finalize_collector)

When the last drainer for a reference completes:

  1. Merge local and remote DbResult lists
  2. Determine status — Verified (any DB matched) > AuthorMismatch (title found, wrong authors) > NotFound
  3. SearxNG fallback — If still NotFound and SearxNG is configured, try web search as last resort
  4. DOI info — Mark DOI as valid/invalid based on DOI backend result
  5. Retraction info — Use inline retraction data extracted from CrossRef response (no extra API call)
  6. Emit events — ProgressEvent::Warning (if DBs timed out) + ProgressEvent::Result
  7. Send result via oneshot channel back to the caller

Output Types

The final Vec<ValidationResult> can be:

  • Displayed in the CLI with colored output
  • Navigated in the TUI with sorting/filtering
  • Streamed via SSE in the web interface
  • Exported to JSON/CSV/Markdown/Text/HTML via hallucinator-reporting
  • Returned as Python objects via hallucinator-python

See Export Formats for output schema details.

Concurrency Model

Hallucinator’s validation engine is designed around a per-DB drainer pool architecture that maximizes throughput while respecting per-database rate limits. This document explains the concurrency primitives, task structure, and how they interact.

Design Goals

  1. Maximize parallelism — Check multiple references simultaneously
  2. Respect rate limits — Each database has its own rate limit; never exceed it
  3. Minimize latency — Return results as soon as a verified match is found
  4. Avoid contention — No shared rate limiter governor across tasks

Architecture Diagram

                         ┌──────────────────┐
                         │   Job Queue      │
                         │ (async_channel)  │
                         └────────┬─────────┘
                                  │
             ┌────────────────────┼────────────────────┐
             │                    │                    │
      ┌──────▼──────┐     ┌──────▼──────┐     ┌──────▼──────┐
      │ Coordinator │     │ Coordinator │     │ Coordinator │
      │   Task 1    │     │   Task 2    │     │   Task N    │
      └──────┬──────┘     └──────┬──────┘     └──────┬──────┘
             │                    │                    │
             │  Local DBs inline  │                    │
             │  Cache pre-check   │                    │
             │                    │                    │
             └──────────┬─────────┘                    │
                        │ Fan out (cache misses only)  │
         ┌──────────────┼──────────────────────────┐   │
         │              │              │           │   │
   ┌─────▼─────┐ ┌──────▼─────┐ ┌─────▼────┐ ┌───▼───▼──┐
   │ CrossRef  │ │   arXiv    │ │   DBLP   │ │   ...    │
   │ Drainer   │ │  Drainer   │ │ Drainer  │ │ Drainers │
   │           │ │            │ │ (online) │ │          │
   │ Rate: 1/s │ │ Rate: 3/s  │ │ Rate:1/s │ │          │
   └───────────┘ └────────────┘ └──────────┘ └──────────┘

Task Types

Coordinator Tasks

  • Count: Configurable via num_workers (default: 4)
  • Role: Pick references from the shared job queue, run local DBs inline, pre-check cache, fan out to drainers
  • Concurrency: Multiple coordinators run in parallel, each pulling from the same async_channel

A coordinator’s lifecycle for each reference:

  1. Receive RefJob from job queue
  2. Emit ProgressEvent::Checking
  3. Query local databases inline (DBLP offline, ACL offline) — sub-millisecond
  4. If verified locally → emit result, skip remote phase
  5. Pre-check cache for all remote DBs (synchronous, prevents race condition)
  6. If verified from cache → emit result, skip drainers
  7. Create RefCollector (shared aggregation hub)
  8. Send DrainerJob to each cache-miss DB’s drainer queue

Drainer Tasks

  • Count: One per enabled remote database
  • Role: Process DB queries sequentially at the database’s natural rate
  • Rate limiting: Each drainer is the sole consumer of its DB’s AdaptiveDbLimiter

A drainer’s lifecycle for each job:

  1. Check early-exit conditions (cancelled, already verified, no DOI for DOI-requiring backend)
  2. Acquire rate limiter token
  3. Check cache (within the rate-limited query path)
  4. Execute HTTP query with timeout
  5. Validate authors if title found
  6. Update RefCollector state
  7. Decrement remaining counter; if last, finalize the result

RefCollector

A per-reference aggregation hub, shared (via Arc) by all drainers working on that reference:

RefCollector
├── remaining: AtomicUsize    # Drainers left to report
├── verified: AtomicBool      # Early-exit flag
├── state: Mutex<AggState>    # Aggregation (held briefly)
│   ├── verified_info
│   ├── first_mismatch
│   ├── failed_dbs
│   ├── db_results
│   └── retraction
└── result_tx: Mutex<Option<oneshot::Sender>>

The last drainer to decrement remaining to zero calls finalize_collector(), which builds the final ValidationResult and sends it on the oneshot channel.
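The last-drainer handshake can be sketched with std threads and atomics. The real code uses tokio tasks, but the fetch_sub pattern is the same: the call returns the previous value, so exactly one drainer observes 1 and finalizes.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Spawn `n` mock drainers sharing a `remaining` counter; returns whether
// exactly the last one ran the finalize step.
fn run_drainers(n: usize) -> bool {
    let remaining = Arc::new(AtomicUsize::new(n));
    let finalized = Arc::new(AtomicBool::new(false));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let remaining = Arc::clone(&remaining);
            let finalized = Arc::clone(&finalized);
            thread::spawn(move || {
                // ... drainer does its query, updates shared state ...
                if remaining.fetch_sub(1, Ordering::AcqRel) == 1 {
                    // Last drainer: build the ValidationResult and send it.
                    finalized.store(true, Ordering::Release);
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    finalized.load(Ordering::Acquire)
}

fn main() {
    assert!(run_drainers(8));
}
```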

Concurrency Primitives

  • async_channel::unbounded — Job queue (coordinators) and per-DB drainer queues
  • AtomicUsize + Ordering::AcqRel — remaining counter for lock-free drainer coordination
  • AtomicBool + Ordering::Release/Acquire — verified flag for early exit
  • Mutex<AggState> — Per-reference aggregation state (single mutex, held briefly)
  • tokio::sync::oneshot — Return channel for each reference’s result
  • CancellationToken — Graceful shutdown (Ctrl+C handler)
  • ArcSwap — Atomic governor swapping during adaptive rate limit backoff
  • DashMap — Lock-free concurrent L1 cache reads

Cache Pre-Check: Preventing Race Conditions

A subtle race condition exists without the cache pre-check:

  1. Reference R is dispatched to CrossRef (drainer A) and arXiv (drainer B)
  2. Drainer A finishes first: CrossRef has a match → sets verified = true
  3. Drainer B sees verified = true → skips arXiv query entirely
  4. arXiv’s result is never cached for reference R

This means future runs will always miss the arXiv cache for this title.

Solution: Before dispatching to any drainer, the coordinator synchronously checks the cache for all remote DBs. Cache hits are recorded in AggState.db_results, and only cache-miss DBs are dispatched to drainers. This ensures every DB’s cached result is always captured regardless of verification order.
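The partition can be sketched as follows, with a plain HashMap standing in for QueryCache:

```rust
use std::collections::HashMap;

// Cache hits are recorded up front (into AggState.db_results in the real
// code); only cache misses are dispatched to drainers.
fn partition_dbs<'a>(
    dbs: &[&'a str],
    cache: &HashMap<(&'a str, &'a str), bool>,
    title: &'a str,
) -> (Vec<(&'a str, bool)>, Vec<&'a str>) {
    let mut hits = Vec::new();
    let mut misses = Vec::new();
    for &db in dbs {
        match cache.get(&(db, title)) {
            Some(&found) => hits.push((db, found)), // recorded immediately
            None => misses.push(db),                // sent to this DB's drainer
        }
    }
    (hits, misses)
}

fn main() {
    let mut cache = HashMap::new();
    cache.insert(("arxiv", "some title"), true);
    let (hits, misses) = partition_dbs(&["crossref", "arxiv"], &cache, "some title");
    assert_eq!(hits, vec![("arxiv", true)]);
    assert_eq!(misses, vec!["crossref"]);
}
```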

Early Exit

When a drainer verifies a reference:

  1. Sets collector.verified to true (atomic store with Release ordering)
  2. Other drainers check this flag before querying (Acquire ordering)
  3. Drainers that see verified = true emit a Skipped status and decrement remaining

This avoids unnecessary API calls once a match is found.
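The flag handshake in miniature, where Collector is a cut-down stand-in for RefCollector:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// The verifying drainer publishes with Release; later drainers observe
// with Acquire and skip their queries.
struct Collector {
    verified: AtomicBool,
}

impl Collector {
    fn mark_verified(&self) {
        self.verified.store(true, Ordering::Release);
    }

    fn should_skip(&self) -> bool {
        self.verified.load(Ordering::Acquire)
    }
}

fn main() {
    let c = Collector { verified: AtomicBool::new(false) };
    assert!(!c.should_skip()); // first drainer proceeds with its query
    c.mark_verified();         // it found a match
    assert!(c.should_skip());  // remaining drainers skip their API calls
}
```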

SearxNG Fallback

If a reference is NotFound after all remote DBs have been checked and SearxNG is configured:

  1. finalize_collector() runs a SearxNG web search as a last-resort fallback
  2. If SearxNG finds the title, the status upgrades from NotFound to Verified (source: “Web Search”)
  3. SearxNG results don’t undergo author validation (web search doesn’t return structured author data)

Shutdown Sequence

  1. User presses Ctrl+C → CancellationToken is cancelled
  2. job_tx (the job queue sender) is closed
  3. Coordinators drain remaining jobs, checking cancel.is_cancelled() at each iteration
  4. Drainers skip remaining jobs when cancelled
  5. When all coordinators finish, they drop their Arc<drainer_txs> clones
  6. Drainer channels close → drainers drain and exit
  7. Pool handle completes

Performance Characteristics

  • Local DB queries: < 1ms (SQLite FTS5 lookups)
  • Cache hits: Sub-microsecond (DashMap L1) to ~1ms (SQLite L2)
  • Remote DB queries: 100ms–10s depending on database and network
  • Throughput: Scales linearly with num_workers for CPU-bound coordination; drainer throughput is rate-limit-bound per DB
  • Memory: One RefCollector per in-flight reference (small: a few KB each)

Crate Map

Quick reference for each crate in the workspace: its responsibility, key types, and dependencies.

hallucinator-core

Responsibility: The validation engine — database backends, caching, rate limiting, author matching, title normalization, and the main check_references() entry point.

Key types:

  • Reference — Parsed reference (title, authors, DOI, arXiv ID, raw citation)
  • ExtractionResult — References extracted from a document plus skip statistics
  • ValidationResult — Complete result for one reference (status, source, per-DB results, retraction info)
  • Config — Runtime configuration (API keys, timeouts, offline DBs, disabled DBs, rate limiters, cache)
  • ProgressEvent — Events emitted during validation (Checking, Result, Warning, RetryPass, DatabaseQueryComplete, RateLimitWait)
  • Status — Verified, NotFound, AuthorMismatch
  • DbStatus — Match, NoMatch, AuthorMismatch, Timeout, RateLimited, Error, Skipped
  • CheckStats — Summary counts (total, verified, not_found, author_mismatch, retracted, skipped)
  • PdfBackend trait — Abstraction for PDF text extraction (moved here from hallucinator-parsing)
  • DatabaseBackend trait — Interface for all database backends
  • ValidationPool — Per-DB drainer pool for concurrent validation
  • QueryCache — Two-tier (DashMap + optional SQLite) cache
  • RateLimiters — Per-DB adaptive rate limiters

Key files:

  • src/lib.rs — Public API and type exports
  • src/pool.rs — ValidationPool, coordinators, drainers, RefCollector
  • src/orchestrator.rs — Database query orchestration (local then remote)
  • src/checker.rs — check_references() entry point
  • src/db/mod.rs — DatabaseBackend trait and DbQueryResult
  • src/db/*.rs — Individual database implementations
  • src/cache.rs — Two-tier caching system
  • src/rate_limit.rs — Adaptive per-DB rate limiting
  • src/matching.rs — Title normalization and fuzzy matching
  • src/authors.rs — Author name validation
  • src/retraction.rs — Retraction checking
  • src/config_file.rs — TOML configuration file loading and merging

Dependencies: reqwest, tokio, async-channel, governor, dashmap, arc-swap, rapidfuzz, serde, rusqlite


hallucinator-parsing

Responsibility: Reference parsing pipeline — backend-agnostic text extraction, reference section detection, segmentation into individual references, title/author/identifier extraction. (Renamed from hallucinator-pdf to better reflect its scope.)

Key types:

  • ReferenceExtractor — Configurable extraction pipeline (formerly PdfExtractor)
  • ParsingConfig — Custom regex patterns, thresholds, segment strategies (formerly PdfParsingConfig)
  • ParsingError — Error type for parsing failures (formerly PdfError)

Key files:

  • src/lib.rs — Public API and type exports
  • src/extractor.rs — ReferenceExtractor pipeline orchestration
  • src/section.rs — find_references_section(), segment_references()
  • src/title.rs — extract_title_from_reference(), clean_title()
  • src/authors.rs — extract_authors_from_reference()
  • src/identifiers.rs — extract_doi(), extract_arxiv_id()
  • src/hyphenation.rs — Hyphenation fixing

Dependencies: regex

Note: The PdfBackend trait now lives in hallucinator-core, not here.


hallucinator-pdf-mupdf

Responsibility: MuPDF implementation of the PdfBackend trait (defined in hallucinator-core). AGPL-licensed — isolated to keep other crates permissive.

Key types:

  • MupdfBackend — Implements PdfBackend using the mupdf crate

Dependencies: mupdf, hallucinator-core


hallucinator-bbl

Responsibility: Parse BibTeX .bbl and .bib files into Reference structs.

Key functions:

  • extract_references_from_bbl(path) — Parse .bbl files
  • extract_references_from_bib(path) — Parse .bib files

Dependencies: hallucinator-core (for Reference, ExtractionResult)


hallucinator-ingest

Responsibility: Unified file dispatch — detects file type (PDF, BBL, BIB, archive) and routes to the appropriate extractor. Handles archive streaming with size limits.

Key functions:

  • extract_references(path) — Dispatch to PDF or BBL/BIB extractor
  • is_archive_path(path) — Check if path is a .tar.gz or .zip

Key types:

  • ArchiveItem — Streaming archive extraction results (Pdf, Warning, Done)

Dependencies: hallucinator-parsing, hallucinator-pdf-mupdf, hallucinator-bbl, hallucinator-core, zip, tar, flate2, tempfile


hallucinator-dblp

Responsibility: Build and query an offline DBLP database. Downloads DBLP’s XML dump (~4.6GB compressed), parses it, and creates a SQLite database with FTS5 full-text search.

Key types:

  • DblpDatabase — SQLite database handle with FTS5 search
  • BuildProgress — Progress events during database building (Downloading, Parsing, RebuildingIndex, Compacting, Complete)

Key functions:

  • DblpDatabase::open(path) — Open existing database
  • DblpDatabase::search(title) — FTS5 title search
  • build_database(path, callback) — Download, parse, and build database

Dependencies: rusqlite, reqwest, quick-xml, flate2


hallucinator-acl

Responsibility: Build and query an offline ACL Anthology database. Downloads ACL XML data from GitHub, parses it, and creates a SQLite FTS5 database.

Key types:

  • AclDatabase — SQLite database handle
  • BuildProgress — Progress events during building

Key functions:

  • AclDatabase::open(path) — Open existing database
  • AclDatabase::search(title) — FTS5 title search
  • build_database(path, callback) — Download and build database

Dependencies: rusqlite, reqwest, quick-xml, tar, flate2


hallucinator-reporting

Responsibility: Export validation results to various formats.

Key types:

  • ExportFormat — Json, Csv, Markdown, Text, Html
  • ReportPaper — Per-paper metadata for export (filename, stats, results, verdict)
  • ReportRef — Per-reference state for export (index, title, skip info, false-positive reason)
  • FpReason — False-positive override reasons (BrokenParse, ExistsElsewhere, AllTimedOut, KnownGood, NonAcademic)
  • PaperVerdict — Overall paper judgment (Safe, Questionable)

Key functions:

  • export_results(papers, ref_states, format, path) — Write results to file in specified format

Dependencies: hallucinator-core


hallucinator-cli

Responsibility: Command-line binary for single-file reference checking.

Commands:

  • check <file> — Check PDF/BBL/BIB file (or archive)
  • update-dblp <path> — Build/update offline DBLP database
  • update-acl <path> — Build/update offline ACL database

Dependencies: hallucinator-core, hallucinator-ingest, hallucinator-dblp, hallucinator-acl, clap, owo-colors, indicatif, tokio


hallucinator-tui

Responsibility: Terminal UI for batch processing. Built with Ratatui. Supports multiple PDFs, result navigation, sorting/filtering, false-positive overrides, result persistence (JSON), and configurable themes.

Screens: Queue → Paper → Reference Detail → Config

Dependencies: hallucinator-core, hallucinator-ingest, hallucinator-reporting, ratatui, crossterm, tokio

See TUI Design Document for design details.


hallucinator-python (excluded)

Responsibility: PyO3 Python bindings providing PdfExtractor, Validator, ValidatorConfig, and result types. Pre-compiled wheels available for major platforms.

Excluded from workspace to avoid pyo3/Python version conflicts in CI.

See the Python Bindings page for an overview, or the full PYTHON_BINDINGS.md on GitHub.


hallucinator-web (excluded)

Responsibility: Axum web server with HTML UI and SSE streaming.

Endpoints:

  • GET / — HTML interface
  • POST /analyze/stream — SSE-streaming reference validation (multipart PDF upload)
  • POST /retry — Recheck specific references

Excluded from workspace to avoid compiling axum/tower during dist builds (not distributed as a binary).

Getting Started

This guide covers installation and your first reference check across all available interfaces.

Choose Your Interface

  • TUI — Batch processing, exploring results interactively — pre-built binary or cargo install
  • CLI — Single-file checks, scripting, CI pipelines — pre-built binary or cargo install
  • Python — Integration into existing Python workflows — pip install hallucinator
  • From source — Development, customization — cargo build --release

Install Pre-built Binaries

Download the latest release for your platform from GitHub Releases. Both hallucinator-cli and hallucinator-tui binaries are included.

macOS / Linux

# Example: download and extract
tar xzf hallucinator-*-x86_64-unknown-linux-gnu.tar.gz
sudo mv hallucinator-cli hallucinator-tui /usr/local/bin/

Build from Source

cd hallucinator-rs
cargo build --release
# Binaries are in target/release/hallucinator-cli and target/release/hallucinator-tui

Install Python Bindings

Pre-compiled wheels are available for major platforms:

pip install hallucinator

Or build from source (requires Rust toolchain):

cd hallucinator-rs/crates/hallucinator-python
pip install maturin
maturin develop --release

See Python Bindings for the full API.

First Run: CLI

Check a single PDF:

hallucinator-cli check paper.pdf

The CLI will extract references, query databases, and print results with colored output. Each reference gets a verdict: Verified, Not Found, or Author Mismatch.

Useful Options

# Dry run — extract references without querying databases
hallucinator-cli check --dry-run paper.pdf

# Use offline DBLP for faster local lookups
hallucinator-cli check --dblp-offline dblp.db paper.pdf

# Save output to a file
hallucinator-cli check -o results.txt paper.pdf

# Check a .bbl or .bib file (LaTeX bibliography)
hallucinator-cli check references.bbl

First Run: TUI

Process multiple PDFs interactively:

hallucinator-tui paper1.pdf paper2.pdf *.pdf

The TUI opens with a queue of papers. Navigate with arrow keys:

  • Enter — Open paper results
  • Tab — Switch between panels
  • q — Quit
  • ? — Show help

See the Rust README for full key bindings.

First Run: Python

from hallucinator import PdfExtractor, Validator, ValidatorConfig

# Extract references from a PDF
extractor = PdfExtractor()
result = extractor.extract("paper.pdf")
print(f"Found {len(result.references)} references")

# Validate references
config = ValidatorConfig()
validator = Validator(config)
results = validator.check(result.references)

for r in results:
    print(f"  [{r.status}] {r.title}")

See PYTHON_BINDINGS.md for the complete API.

Optional: API Keys

Some databases offer higher rate limits or additional features with API keys:

Key              | Environment Variable | Effect
OpenAlex         | OPENALEX_KEY         | Enables OpenAlex database (disabled without key)
Semantic Scholar | S2_API_KEY           | Higher rate limit (100/s vs 1/s)
CrossRef mailto  | CROSSREF_MAILTO      | Polite pool: 3/s instead of 1/s

Set them as environment variables or in your config file.

Optional: Offline Databases

For faster local lookups and reduced API dependence, build offline databases:

# DBLP (~4.6GB download, 20–30 minutes)
hallucinator-cli update-dblp dblp.db

# ACL Anthology (smaller, a few minutes)
hallucinator-cli update-acl acl.db

Then use them:

hallucinator-cli check --dblp-offline dblp.db --acl-offline acl.db paper.pdf

Or set paths in your config file for automatic detection. See Offline Databases for details.

Next Steps

Configuration Reference

Hallucinator can be configured via CLI flags, environment variables, and TOML config files. This page documents all options.

Precedence

Configuration is resolved in this order (highest wins):

  1. CLI flags — --num-workers 8, --openalex-key KEY
  2. Environment variables — OPENALEX_KEY, DB_TIMEOUT
  3. CWD config — .hallucinator.toml in the current working directory
  4. Platform config — ~/.config/hallucinator/config.toml (Linux/macOS) or %APPDATA%\hallucinator\config.toml (Windows)
  5. Defaults

CWD config overlays platform config field-by-field. This lets you keep API keys in the global config and override settings per-project.
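
The field-by-field overlay can be sketched as follows. This is an illustrative sketch, not the actual implementation; the section and field names mirror the TOML format shown below.

```python
def overlay(platform_cfg: dict, cwd_cfg: dict) -> dict:
    """Merge two config layers field-by-field: a field set in the CWD
    config wins; fields it omits fall through to the platform config."""
    merged = dict(platform_cfg)
    for section, fields in cwd_cfg.items():
        if isinstance(fields, dict):
            # Overlay individual fields within a section, not whole sections.
            merged[section] = {**merged.get(section, {}), **fields}
        else:
            merged[section] = fields
    return merged

# Global config holds the API key; the project config only overrides workers.
platform = {"api_keys": {"crossref_mailto": "you@example.com"},
            "concurrency": {"num_workers": 4}}
project = {"concurrency": {"num_workers": 8}}
merged = overlay(platform, project)
```

Because the merge is per-field, the project config does not wipe out the `[api_keys]` section it never mentions.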

Config File Format

Both config file locations use the same TOML format:

[api_keys]
openalex_key = "your-openalex-key"
s2_api_key = "your-semantic-scholar-key"
crossref_mailto = "you@example.com"

[databases]
dblp_offline_path = "/path/to/dblp.db"
acl_offline_path = "/path/to/acl.db"
cache_path = "/path/to/cache.db"
searxng_url = "http://localhost:8080"
disabled = ["NeurIPS", "SSRN"]

[concurrency]
num_workers = 4
db_timeout_secs = 10
db_timeout_short_secs = 5
max_rate_limit_retries = 3
max_archive_size_mb = 500

[display]
theme = "hacker"
fps = 30

All fields are optional. Omitted fields use defaults.

Full Option Reference

API Keys

Option               | CLI Flag           | Env Var         | TOML Key                 | Description
OpenAlex key         | --openalex-key KEY | OPENALEX_KEY    | api_keys.openalex_key    | Enables OpenAlex database queries
Semantic Scholar key | --s2-api-key KEY   | S2_API_KEY      | api_keys.s2_api_key      | Higher S2 rate limit (100/s vs 1/s)
CrossRef mailto      | —                  | CROSSREF_MAILTO | api_keys.crossref_mailto | CrossRef polite pool (3/s vs 1/s)

Databases

Option            | CLI Flag            | Env Var                 | TOML Key                    | Default
DBLP offline path | --dblp-offline PATH | DBLP_OFFLINE_PATH       | databases.dblp_offline_path | None
ACL offline path  | --acl-offline PATH  | ACL_OFFLINE_PATH        | databases.acl_offline_path  | None
Cache path        | --cache-path PATH   | HALLUCINATOR_CACHE_PATH | databases.cache_path        | None
SearxNG URL       | --searxng (flag)    | SEARXNG_URL             | databases.searxng_url       | http://localhost:8080
Disabled DBs      | --disable-dbs A,B   | —                       | databases.disabled          | []

Notes:

  • --searxng is a boolean flag on the CLI. The actual URL comes from the env var or config file, defaulting to http://localhost:8080.
  • --disable-dbs accepts a comma-separated list. Database names are case-sensitive: CrossRef, arXiv, DBLP, Semantic Scholar, OpenAlex, Europe PMC, PubMed, ACL Anthology, NeurIPS, DOI, SSRN, Web Search.
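
The comma-separated, case-sensitive filtering behaves roughly like this sketch (illustrative only; the real parsing lives in the CLI crate):

```python
ALL_DBS = ["CrossRef", "arXiv", "DBLP", "Semantic Scholar", "OpenAlex",
           "Europe PMC", "PubMed", "ACL Anthology", "NeurIPS", "DOI",
           "SSRN", "Web Search"]

def enabled_dbs(disable_flag: str) -> list:
    # Split the comma-separated flag value. Matching is case-sensitive,
    # so "crossref" would NOT disable "CrossRef".
    disabled = {name.strip() for name in disable_flag.split(",") if name.strip()}
    return [db for db in ALL_DBS if db not in disabled]

remaining = enabled_dbs("NeurIPS,SSRN")
```

A misspelled or wrongly cased name silently disables nothing, so copy the names exactly as listed above.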

Concurrency

Option           | CLI Flag                   | Env Var          | TOML Key                           | Default
Worker count     | --num-workers N            | —                | concurrency.num_workers            | 4
DB timeout       | —                          | DB_TIMEOUT       | concurrency.db_timeout_secs        | 10
Short timeout    | —                          | DB_TIMEOUT_SHORT | concurrency.db_timeout_short_secs  | 5
Max 429 retries  | --max-rate-limit-retries N | —                | concurrency.max_rate_limit_retries | 3
Max archive size | —                          | —                | concurrency.max_archive_size_mb    | 500

Display (TUI only)

Option | TOML Key      | Default | Values
Theme  | display.theme | hacker  | hacker, modern, gnr
FPS    | display.fps   | 30      | 1–120

Other CLI Flags

Flag                     | Description
--no-color               | Disable colored output
-o, --output PATH        | Write results to file
--dry-run                | Extract and print references without querying databases
--check-openalex-authors | Flag author mismatches from OpenAlex (skipped by default)
--clear-cache            | Clear the entire query cache and exit
--clear-not-found        | Clear only not-found entries from cache and exit
--config PATH            | Path to config file (overrides auto-detection)
--log PATH               | Write tracing/debug logs to file

CLI Commands

hallucinator-cli check <file>         # Check a PDF, BBL, or BIB file
hallucinator-cli update-dblp <path>   # Download and build offline DBLP database
hallucinator-cli update-acl <path>    # Download and build offline ACL database

Cache Configuration

The query cache stores database responses to avoid redundant API calls across runs.

  • Positive TTL (found entries): 7 days
  • Negative TTL (not-found entries): 24 hours
  • Storage: SQLite with WAL mode + in-memory DashMap
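
The two TTLs amount to a simple expiry check, sketched here with hypothetical field names:

```python
POSITIVE_TTL = 7 * 24 * 3600   # found entries: 7 days, in seconds
NEGATIVE_TTL = 24 * 3600       # not-found entries: 24 hours

def is_expired(stored_at: float, found: bool, now: float) -> bool:
    # Not-found entries expire much sooner, so transient misses
    # (outages, unindexed new papers) get rechecked within a day.
    ttl = POSITIVE_TTL if found else NEGATIVE_TTL
    return now - stored_at > ttl
```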

To enable caching, set cache_path in your config or use --cache-path:

hallucinator-cli check --cache-path ~/.hallucinator/cache.db paper.pdf

Cache maintenance:

# Clear everything
hallucinator-cli check --cache-path ~/.hallucinator/cache.db --clear-cache

# Clear only not-found entries (useful after DB outages)
hallucinator-cli check --cache-path ~/.hallucinator/cache.db --clear-not-found

Auto-detection

The TUI and CLI auto-detect offline database paths from well-known locations on your system. If you place dblp.db or acl.db in your platform config directory (~/.config/hallucinator/ on Linux/macOS), they may be found automatically. Explicit paths in the config file or CLI flags always take precedence.

Example Configurations

Minimal (API keys only)

[api_keys]
crossref_mailto = "researcher@university.edu"

Full Setup

[api_keys]
openalex_key = "your-key"
s2_api_key = "your-key"
crossref_mailto = "researcher@university.edu"

[databases]
dblp_offline_path = "~/.hallucinator/dblp.db"
acl_offline_path = "~/.hallucinator/acl.db"
cache_path = "~/.hallucinator/cache.db"

[concurrency]
num_workers = 8
db_timeout_secs = 15

[display]
theme = "modern"

CI / Scripting

[databases]
cache_path = "/tmp/hallucinator-cache.db"
disabled = ["OpenAlex", "NeurIPS", "SSRN"]

[concurrency]
num_workers = 2
db_timeout_secs = 5
max_rate_limit_retries = 1

Offline Databases

Hallucinator supports offline copies of DBLP and ACL Anthology for faster lookups, reduced API dependence, and better reliability. Offline databases are queried inline by the coordinator task (< 1ms), before any remote API calls.

Why Use Offline Databases?

  • Speed — SQLite FTS5 lookups complete in under 1ms vs. 100ms–5s for HTTP APIs
  • Reliability — No network dependency, no rate limiting, no timeouts
  • Early exit — If a reference is found locally, all remote DB queries are skipped
  • API savings — Fewer remote calls means you stay within rate limits and API quotas

The tradeoff is disk space and a one-time build step.

DBLP Offline

What It Contains

The DBLP database indexes publications, authors, and URLs from dblp.org. This covers computer science conferences and journals comprehensively — over 7 million publications.

Building

hallucinator-cli update-dblp /path/to/dblp.db

This will:

  1. Download dblp.xml.gz from dblp.org (~4.6GB compressed, ~16GB uncompressed)
  2. Parse the XML to extract publications, authors, and URLs
  3. Build a SQLite database with FTS5 full-text search index
  4. Compact the database with VACUUM

Time: 20–30 minutes on a modern machine (mostly download + parse time)

Disk space: ~2–3GB for the final SQLite database

The build process supports conditional download — if the database already exists and the server reports the file hasn’t changed (304 Not Modified), the download is skipped.

Using

# CLI flag
hallucinator-cli check --dblp-offline /path/to/dblp.db paper.pdf

# Or set in config file
# [databases]
# dblp_offline_path = "/path/to/dblp.db"

# Or environment variable
DBLP_OFFLINE_PATH=/path/to/dblp.db hallucinator-cli check paper.pdf

Staleness Warning

If the database is older than 30 days, a warning is printed. To refresh:

hallucinator-cli update-dblp /path/to/dblp.db

The update is incremental via conditional HTTP (ETag/If-Modified-Since), so if the upstream data hasn’t changed, it completes instantly.

ACL Anthology Offline

What It Contains

The ACL Anthology database indexes papers from computational linguistics and NLP venues (ACL, EMNLP, NAACL, EACL, CoNLL, etc.) — tens of thousands of publications.

Building

hallucinator-cli update-acl /path/to/acl.db

This will:

  1. Download the ACL Anthology XML data from GitHub
  2. Extract and parse XML files
  3. Build a SQLite database with FTS5 full-text search index

Time: A few minutes (much smaller than DBLP)

Disk space: ~50–100MB for the final database

The build process tracks the GitHub commit SHA and skips the download if nothing has changed.

Using

# CLI flag
hallucinator-cli check --acl-offline /path/to/acl.db paper.pdf

# Or set in config file
# [databases]
# acl_offline_path = "/path/to/acl.db"

# Or environment variable
ACL_OFFLINE_PATH=/path/to/acl.db hallucinator-cli check paper.pdf

Store both databases in your platform config directory for automatic detection:

mkdir -p ~/.config/hallucinator

# Build databases
hallucinator-cli update-dblp ~/.config/hallucinator/dblp.db
hallucinator-cli update-acl ~/.config/hallucinator/acl.db

# Configure paths
cat > ~/.config/hallucinator/config.toml << 'EOF'
[databases]
dblp_offline_path = "~/.config/hallucinator/dblp.db"
acl_offline_path = "~/.config/hallucinator/acl.db"
cache_path = "~/.config/hallucinator/cache.db"
EOF

Maintenance Schedule

Database | Recommended refresh         | Why
DBLP     | Monthly                     | New publications indexed regularly
ACL      | Before conference deadlines | New proceedings added after each conference

Both update commands are safe to run against existing databases — they rebuild in-place.

Combining with Online Databases

Offline and online databases complement each other:

  1. Local databases are queried first (< 1ms)
  2. If verified locally, remote queries are skipped entirely
  3. If not found locally, remote databases are queried in parallel
  4. Having both reduces total validation time and improves coverage

This means you get the speed of local lookups for common CS and NLP papers, with full coverage from 10+ remote databases for everything else.
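
In pseudocode terms, the lookup order above looks like this (an illustrative sketch; the real engine queries remote databases concurrently with rate limiting, and the stub classes here are hypothetical):

```python
class StubDb:
    """Hypothetical stand-in for a database backend."""
    def __init__(self, name, known_titles):
        self.name, self.known = name, set(known_titles)
    def lookup(self, title):
        return title in self.known

def validate(title, local_dbs, remote_dbs):
    # 1. Local databases first: sub-millisecond SQLite FTS5 lookups.
    for db in local_dbs:
        if db.lookup(title):
            return ("verified", db.name)   # early exit: no remote calls
    # 2. Only unresolved references reach the remote databases
    #    (queried in parallel in the real engine).
    for db in remote_dbs:
        if db.lookup(title):
            return ("verified", db.name)
    return ("not_found", None)

local = [StubDb("DBLP Offline", {"Attention Is All You Need"})]
remote = [StubDb("CrossRef", {"Attention Is All You Need", "Some Bio Paper"})]
```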

Understanding Results

This guide explains how to interpret Hallucinator’s output, what each verdict means, and how to handle edge cases.

Verdict Types

Each validated reference receives one of these statuses:

Verified

The reference was found in at least one academic database with matching authors.

  • Source is reported (e.g., “CrossRef”, “DBLP Offline”, “arXiv”)
  • Found authors are listed for comparison
  • Paper URL links to the database entry when available

A verified reference is almost certainly real. The 95% fuzzy title matching threshold accommodates minor PDF extraction artifacts while remaining strict enough to avoid false matches.
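
As an illustration of what a 95% threshold tolerates, here is a sketch using difflib; the actual matcher in hallucinator-core may normalize titles differently and use a different similarity measure.

```python
from difflib import SequenceMatcher

def titles_match(extracted: str, found: str, threshold: float = 0.95) -> bool:
    # Lowercase both sides so casing differences from PDF extraction
    # don't count against the similarity score.
    ratio = SequenceMatcher(None, extracted.lower(), found.lower()).ratio()
    return ratio >= threshold
```

A single garbled character in a long title stays above the threshold, while a genuinely different title falls well below it.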

Not Found

The reference was not found in any queried database.

This does not necessarily mean the reference is fabricated. Common legitimate reasons:

  • Very recent publication — Not yet indexed by databases
  • Book chapters or dissertations — Less coverage in article-focused databases
  • Workshop or regional conference papers — May not be in major indices
  • PDF extraction error — Title was mangled during extraction (ligatures, hyphenation, encoding issues)
  • Database outage — Temporary API issues (check “Failed DBs” in the output)

What to do: Check the “Failed DBs” list. If multiple databases timed out, the reference may simply need rechecking. Use Google Scholar or the paper URL (if available) for manual verification.

Author Mismatch

The title was found in a database, but the authors don’t match.

Possible explanations:

  • Different paper with similar title — The database returned a different paper
  • Author name variants — Different transliterations, maiden/married names, inconsistent initials
  • Preprint vs. published version — Author list changed between versions
  • PDF extraction error — Authors were incorrectly parsed from the PDF

What to do: Compare the “PDF authors” and “DB authors” in the output. If they’re clearly the same people with different name formats, this is a false positive. If the authors are completely different, it’s worth investigating.

Retracted

The reference was found but has been retracted. This information comes from CrossRef’s retraction metadata.

  • Retraction DOI links to the retraction notice
  • Retraction source indicates the type (e.g., retraction, removal, expression of concern)

Citing retracted papers is a serious concern in academic integrity. However, some retractions are for reasons unrelated to the paper’s scientific content (e.g., copyright disputes). Always check the retraction notice.

Skipped References

Some references are excluded from validation:

Reason      | Explanation
URL-only    | Reference is just a URL to a non-academic site (GitHub, documentation)
Short title | Title has fewer than 5 words (too short for reliable matching)
No title    | No title could be extracted from the reference text

Skipped references are not counted in the “problematic” percentage.

Exception: References with a DOI or arXiv ID are never skipped for short title, since the identifier provides a reliable lookup path.
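
The skip rules above can be sketched as follows (illustrative and simplified relative to the real parser):

```python
def skip_reason(title, is_url_only, has_identifier=False):
    """Return a skip reason, or None if the reference should be validated.
    has_identifier is True when the reference carries a DOI or arXiv ID."""
    if is_url_only:
        return "url_only"
    if not title:
        return "no_title"
    # Exception: a DOI or arXiv ID gives a reliable lookup path,
    # so short titles are not skipped in that case.
    if len(title.split()) < 5 and not has_identifier:
        return "short_title"
    return None
```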

Paper Verdicts (TUI)

In the TUI, entire papers can be marked with a verdict:

  • Safe — All references verified, or issues have been manually reviewed
  • Questionable — Contains concerning unverified references

These are user-assigned labels for batch triage, not automated judgments.

Per-Database Results

Each reference includes per-database query results showing:

  • Database name — Which DB was queried
  • Status — match, no_match, author_mismatch, timeout, rate_limited, error, skipped
  • Elapsed time — How long the query took
  • Found authors — Authors returned by the database (if found)
  • Paper URL — Direct link to the database entry (if found)

Use this to understand why a reference got its verdict. If several databases timed out, the “Not Found” verdict may be unreliable.

DOI and arXiv Validation

When a reference includes a DOI or arXiv ID:

  • Valid — The identifier resolves to a real paper
  • Invalid — The identifier doesn’t resolve (possible fabrication signal)

A verified reference with an invalid DOI is flagged separately — the paper exists in some database, but the DOI in the citation is wrong or fabricated.

False Positive Overrides (TUI)

In the TUI, you can mark results as false positives with a reason:

Reason           | Use when
Broken Parse     | PDF extraction mangled the title/authors
Exists Elsewhere | You verified the paper exists outside indexed databases
All Timed Out    | All databases timed out; the result is inconclusive
Known Good       | You personally know this reference is legitimate
Non-Academic     | The reference is to a non-academic resource (software, standard, etc.)

FP overrides are reflected in exported results: the effective_status changes to verified while the original status is preserved for transparency.

Confidence Signals

Higher confidence in a “Not Found” verdict:

  • Multiple databases returned no_match (not just timeouts)
  • No DOI or arXiv ID was present in the reference
  • Title was cleanly extracted (no obvious parsing artifacts)
  • Paper claims to be from a well-indexed venue (top conferences, major journals)

Lower confidence (consider manual verification):

  • Several databases timed out or returned errors
  • Title contains unusual characters or formatting
  • Reference is to a workshop paper, technical report, or dissertation
  • The title is very short (close to the 5-word minimum)

The Problematic Percentage

The summary reports a “problematic %” calculated as:

(not_found + author_mismatch + retracted) / (total - skipped) * 100

This gives a quick signal for triage. A high percentage doesn’t prove misconduct — it means the paper warrants closer human review. Even legitimate papers checking niche or very recent literature can have a notable percentage of unverified references.
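
For example, a paper with 42 references of which 5 were skipped, 3 not found, 1 an author mismatch, and 0 retracted works out like this:

```python
def problematic_pct(not_found, author_mismatch, retracted, total, skipped):
    # Skipped references are excluded from the denominator because
    # they were never validated.
    return (not_found + author_mismatch + retracted) / (total - skipped) * 100

pct = problematic_pct(3, 1, 0, total=42, skipped=5)  # 4 / 37 * 100
```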

Manual Verification Workflow

When Hallucinator flags a reference as Not Found:

  1. Check failed databases — Were most DBs queried, or did many time out?
  2. Search Google Scholar — The output includes a Google Scholar link for each reference
  3. Check the paper URL — If available, visit the link directly
  4. Verify the venue — Is the claimed venue real? Was the paper published there?
  5. Check authors — Do the listed authors exist and publish in this field?
  6. Look for the DOI — If a DOI is listed, try resolving it at doi.org

Export Formats

Hallucinator can export validation results in five formats. The TUI supports all formats via its export dialog; the CLI writes text output by default (use --output to save to a file).

Formats

Format   | Extension | Best for
JSON     | .json     | Programmatic processing, data pipelines
CSV      | .csv      | Spreadsheets, bulk analysis
Markdown | .md       | Reports, GitHub issues, documentation
Text     | .txt      | Plain-text records, email
HTML     | .html     | Standalone visual reports

Sorting Order

All formats use the same reference ordering within each paper:

  1. Retracted — Highest priority (most critical)
  2. Not Found — Potential hallucinations
  3. Author Mismatch — Title found, wrong authors
  4. DOI/arXiv Issues — Verified but with invalid identifiers
  5. FP-overridden — User-verified false positives
  6. Clean Verified — Confirmed references
  7. Skipped — References excluded from validation

Within each category, references are ordered by their original reference number.
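
The ordering amounts to a two-part sort key, sketched here (the category labels are hypothetical stand-ins for the exporter's internal buckets):

```python
CATEGORY_ORDER = {
    "retracted": 0, "not_found": 1, "author_mismatch": 2,
    "doi_arxiv_issue": 3, "fp_overridden": 4, "verified": 5, "skipped": 6,
}

def sort_key(ref):
    # Category priority first, then original reference number within it.
    return (CATEGORY_ORDER[ref["category"]], ref["original_number"])

refs = [
    {"category": "verified", "original_number": 1},
    {"category": "not_found", "original_number": 7},
    {"category": "not_found", "original_number": 3},
]
ordered = sorted(refs, key=sort_key)
```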

False Positive Handling

When a reference has a false-positive override (from the TUI):

  • Original status is preserved (e.g., not_found)
  • Effective status becomes verified
  • FP reason is included (e.g., broken_parse, exists_elsewhere)
  • Adjusted statistics move FP-overridden references from their original bucket into verified

JSON Schema

The JSON export produces an array of paper objects:

[
  {
    "filename": "paper.pdf",
    "verdict": "safe",
    "stats": {
      "total": 42,
      "verified": 38,
      "not_found": 3,
      "author_mismatch": 1,
      "retracted": 0,
      "skipped": 5,
      "problematic_pct": 10.8
    },
    "references": [
      {
        "index": 0,
        "original_number": 1,
        "title": "Attention Is All You Need",
        "raw_citation": "[1] A. Vaswani et al., ...",
        "status": "verified",
        "effective_status": "verified",
        "fp_reason": null,
        "source": "CrossRef",
        "ref_authors": ["A. Vaswani", "N. Shazeer"],
        "found_authors": ["Ashish Vaswani", "Noam Shazeer"],
        "paper_url": "https://doi.org/10.5555/3295222.3295349",
        "failed_dbs": [],
        "doi_info": {
          "doi": "10.5555/3295222.3295349",
          "valid": true,
          "title": null
        },
        "arxiv_info": null,
        "retraction_info": null,
        "db_results": [
          {
            "db": "CrossRef",
            "status": "match",
            "elapsed_ms": 234,
            "authors": ["Ashish Vaswani", "Noam Shazeer"],
            "url": "https://doi.org/10.5555/3295222.3295349"
          },
          {
            "db": "arXiv",
            "status": "skipped",
            "elapsed_ms": 0,
            "authors": [],
            "url": null
          }
        ]
      }
    ]
  }
]

Per-Reference Fields

Field            | Type     | Description
index            | number   | Zero-based index in the results array
original_number  | number   | Original reference number from the paper (1-based)
title            | string   | Extracted reference title
raw_citation     | string   | Full raw citation text from PDF
status           | string   | Original status: verified, not_found, author_mismatch
effective_status | string   | Status after FP overrides
fp_reason        | string?  | FP reason if overridden: broken_parse, exists_elsewhere, all_timed_out, known_good, non_academic
source           | string?  | Database that verified the reference
ref_authors      | string[] | Authors extracted from the PDF
found_authors    | string[] | Authors returned by the verifying database
paper_url        | string?  | URL to the paper in the source database
failed_dbs       | string[] | Databases that timed out or errored
doi_info         | object?  | DOI validation: {doi, valid, title}
arxiv_info       | object?  | arXiv validation: {arxiv_id, valid, title}
retraction_info  | object?  | Retraction data: {is_retracted, retraction_doi, retraction_source}
db_results       | object[] | Per-database query results

Skipped Reference Fields

Skipped references include a skip_reason field instead of validation data:

{
  "index": 5,
  "original_number": 6,
  "title": "GitHub repo",
  "status": "skipped",
  "effective_status": "skipped",
  "skip_reason": "url_only",
  ...
}

Per-DB Result Fields

Field      | Type     | Description
db         | string   | Database name
status     | string   | match, no_match, author_mismatch, timeout, rate_limited, error, skipped
elapsed_ms | number   | Query time in milliseconds
authors    | string[] | Authors returned (if found)
url        | string?  | Paper URL in this database

CSV Schema

One row per reference, with these columns:

Filename,Verdict,Ref#,Title,Status,EffectiveStatus,FpReason,Source,Retracted,Authors,FoundAuthors,PaperURL,DOI,ArxivID,FailedDBs

Multi-value fields (Authors, FoundAuthors, FailedDBs) use semicolons as separators within the CSV field.
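
A sketch of how such a row is built with Python's csv module (illustrative, showing only a subset of the columns):

```python
import csv
import io

def csv_row(filename, authors, failed_dbs):
    # Join multi-value fields with ';' so each reference stays one row;
    # the csv writer handles any quoting of commas inside fields.
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow([filename, ";".join(authors), ";".join(failed_dbs)])
    return buf.getvalue().strip()

row = csv_row("paper.pdf", ["A. Vaswani", "N. Shazeer"], ["PubMed"])
```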

Markdown Structure

# Hallucinator Results

## paper.pdf [SAFE]

**42** references | **38** verified | **3** not found | ...

### Problematic References

**[7]** Suspicious Paper Title — ✗ Not Found
- [Google Scholar](...)

### Verified References

| # | Title | Source | URL |
|---|-------|--------|-----|
| 1 | Attention Is All You Need | CrossRef | [link](...) |

### Skipped References

| # | Title | Reason |
|---|-------|--------|
| 6 | GitHub repo | URL-only |

Sections are only included if they contain references (no empty “Problematic References” heading when everything is verified).

Text Format

Plain-text with fixed-width formatting:

Hallucinator Results
============================================================

paper.pdf [SAFE]
-----------------
  42 total | 38 verified | 3 not found | 1 mismatch | 0 retracted | 5 skipped | 10.8% problematic

  [1] Attention Is All You Need - Verified (CrossRef)
       Authors (PDF): A. Vaswani, N. Shazeer
       DOI: 10.5555/3295222.3295349 (valid)
       URL: https://doi.org/...
  [7] Suspicious Paper Title - NOT FOUND
       Authors (PDF): J. Doe, A. Smith
       Timed out: Semantic Scholar, Europe PMC

HTML Format

A self-contained HTML file with:

  • Dark theme with CSS variables
  • Stat cards showing totals across all papers
  • Collapsible per-paper sections
  • Color-coded badges (green: verified, red: not found, yellow: mismatch, dark red: retracted)
  • Author comparison grid for mismatches
  • Retraction warning boxes
  • Google Scholar and paper URL links
  • Raw citation in expandable details blocks
  • Timestamp in footer

The HTML requires no external dependencies — all CSS is inlined.

Using Hallucinator as a Rust Library

This guide covers how to use hallucinator crates as dependencies in your own Rust project.

Which Crate to Depend On

Use case                              | Crate                                        | What you get
Validate references programmatically | hallucinator-core                            | check_references(), all DB backends, caching, rate limiting
Extract references from PDFs          | hallucinator-parsing + hallucinator-pdf-mupdf | ReferenceExtractor, section detection, title/author extraction
Parse BBL/BIB files                   | hallucinator-bbl                             | extract_references_from_bbl(), extract_references_from_bib()
Unified file dispatch                 | hallucinator-ingest                          | Auto-detection (PDF/BBL/BIB/archive), streaming archive extraction
Export results                        | hallucinator-reporting                       | JSON, CSV, Markdown, Text, HTML export
Build offline DBLP                    | hallucinator-dblp                            | build_database(), DblpDatabase::search()
Build offline ACL                     | hallucinator-acl                             | build_database(), AclDatabase::search()

Most users will want hallucinator-core for validation and hallucinator-ingest for file handling.

Minimal Example: Validate References

use hallucinator_core::{Config, ProgressEvent, RateLimiters, check_references};
use hallucinator_ingest::extract_references;
use std::sync::Arc;
use tokio_util::sync::CancellationToken;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let path = std::path::Path::new("paper.pdf");

    // Extract references
    let extraction = extract_references(path)
        .map_err(|e| anyhow::anyhow!("{}", e))?;

    println!("Found {} references", extraction.references.len());

    // Build config with defaults
    let config = Config {
        rate_limiters: Arc::new(RateLimiters::new(false, false)),
        ..Default::default()
    };

    // Validate
    let cancel = CancellationToken::new();
    let results = check_references(
        extraction.references,
        config,
        |event| {
            if let ProgressEvent::Result { result, .. } = &event {
                println!("[{:?}] {}", result.status, result.title);
            }
        },
        cancel,
    ).await;

    println!("{} total, {} verified, {} not found",
        results.len(),
        results.iter().filter(|r| r.status == hallucinator_core::Status::Verified).count(),
        results.iter().filter(|r| r.status == hallucinator_core::Status::NotFound).count(),
    );

    Ok(())
}

Config Construction

The Config struct controls all runtime behavior:

#![allow(unused)]
fn main() {
use hallucinator_core::{Config, RateLimiters, QueryCache, build_query_cache};
use std::sync::Arc;

let rate_limiters = Arc::new(RateLimiters::new(
    true,  // has_crossref_mailto (enables 3/s instead of 1/s)
    true,  // has_s2_api_key (enables higher S2 rate)
));

let cache = build_query_cache(
    Some(std::path::Path::new("/tmp/cache.db")),
    604800,  // positive TTL: 7 days in seconds
    86400,   // negative TTL: 24 hours in seconds
);

let config = Config {
    openalex_key: Some("your-key".to_string()),
    s2_api_key: Some("your-key".to_string()),
    num_workers: 4,
    db_timeout_secs: 10,
    db_timeout_short_secs: 5,
    max_rate_limit_retries: 3,
    rate_limiters,
    query_cache: Some(cache),
    ..Default::default()
};
}

ProgressEvent Variants

The progress callback receives these events during validation:

Event                 | When                          | Key fields
Checking              | Starting a reference          | index, total, title
DatabaseQueryComplete | A single DB query finished    | db_name, status, elapsed
RateLimitWait         | Waiting for rate limiter      | db_name, wait_time
RateLimitRetry        | Retrying after 429            | db_name, attempt
Warning               | DB timeouts for a reference   | title, failed_dbs, message
Result                | Reference validation complete | index, total, result: Box<ValidationResult>
RetryPass             | Starting retry pass           | —
Retrying              | Retrying a reference          | index, title

PDF Extraction

Extract and parse references without validating:

#![allow(unused)]
fn main() {
use hallucinator_core::PdfBackend;
use hallucinator_parsing::ReferenceExtractor;
use hallucinator_pdf_mupdf::MupdfBackend;

let text = MupdfBackend.extract_text(std::path::Path::new("paper.pdf"))?;

// Use ReferenceExtractor for the full pipeline
let extractor = ReferenceExtractor::new(MupdfBackend);
let result = extractor.extract(std::path::Path::new("paper.pdf"))?;

for reference in &result.references {
    println!("Title: {:?}", reference.title);
    println!("Authors: {:?}", reference.authors);
    println!("DOI: {:?}", reference.doi);
}
}

Adding a Custom PDF Backend

Implement PdfBackend (defined in hallucinator-core) to use a different PDF library:

#![allow(unused)]
fn main() {
use hallucinator_core::PdfBackend;

struct MyPdfBackend;

impl PdfBackend for MyPdfBackend {
    fn extract_text(&self, path: &std::path::Path) -> Result<String, String> {
        // Your PDF text extraction logic here
        let text = my_pdf_library::extract(path)
            .map_err(|e| format!("extraction failed: {}", e))?;
        Ok(text)
    }
}
}

Adding a Custom Database Backend

See Database Backends for the DatabaseBackend trait reference and a step-by-step guide.

Database Backends

This document covers the DatabaseBackend trait, the existing database implementations, and how to add a new backend.

The DatabaseBackend Trait

Defined in hallucinator-core/src/db/mod.rs:

#![allow(unused)]
fn main() {
pub trait DatabaseBackend: Send + Sync {
    /// Human-readable name (e.g., "CrossRef", "arXiv")
    fn name(&self) -> &str;

    /// Whether this is a local (offline) database.
    /// Local backends are queried inline by the coordinator (not via drainer tasks).
    fn is_local(&self) -> bool { false }

    /// Whether this backend requires a DOI to query (e.g., DOI resolver).
    /// References without a DOI are skipped for these backends.
    fn requires_doi(&self) -> bool { false }

    /// Query by title. Returns found title, authors, paper URL, and optional retraction info.
    fn query<'a>(
        &'a self,
        title: &'a str,
        client: &'a reqwest::Client,
        timeout: Duration,
    ) -> Pin<Box<dyn Future<Output = Result<DbQueryResult, DbQueryError>> + Send + 'a>>;

    /// Query by DOI. Default implementation returns empty/not-found.
    fn query_doi<'a>(
        &'a self,
        doi: &'a str,
        title: &'a str,
        authors: &'a [String],
        client: &'a reqwest::Client,
        timeout: Duration,
    ) -> DoiQueryResult<'a> { ... }
}
}

Return Types

#![allow(unused)]
fn main() {
pub struct DbQueryResult {
    pub found_title: Option<String>,   // Title as found in the database
    pub authors: Vec<String>,          // Author names
    pub paper_url: Option<String>,     // Direct link to the paper
    pub retraction: Option<RetractionResult>,  // Only CrossRef populates this
}

pub enum DbQueryError {
    RateLimited { retry_after: Option<Duration> },
    Other(String),
}
}

A DbQueryResult with found_title = Some(...) indicates the title was found. The validation engine then compares authors (if provided) to determine Verified vs. AuthorMismatch.
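
That decision can be sketched as follows (in Python for brevity; the real engine's author comparison is more sophisticated than this naive last-name overlap, and the helper names are hypothetical):

```python
def last_names(names):
    """Naive normalization: compare authors by lowercased last name."""
    return {n.split()[-1].lower() for n in names}

def verdict(found_title, ref_authors, found_authors):
    if found_title is None:
        return "no_match"
    if not found_authors:
        return "verified"          # title-only match: nothing to compare
    if last_names(ref_authors) & last_names(found_authors):
        return "verified"
    return "author_mismatch"
```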

Existing Backends

Remote (HTTP-based)

Backend         | Name               | Rate Limit            | Auth              | Notes
CrossRef        | "CrossRef"         | 1/s (3/s with mailto) | Optional mailto   | Extracts retraction info inline
Arxiv           | "arXiv"            | 3/s                   | None              | Searches arXiv API
DblpOnline      | "DBLP"             | 1/s                   | None              | DBLP search API
SemanticScholar | "Semantic Scholar" | 1/s (100/s with key)  | Optional API key  | Searches papers by title
EuropePmc       | "Europe PMC"       | 3/s                   | None              | Biomedical/life science literature
PubMed          | "PubMed"           | 3/s                   | None              | Biomedical literature via NCBI
OpenAlex        | "OpenAlex"         | 10/s                  | Required API key  | Inserted first in DB list when enabled
DoiResolver     | "DOI"              | 5/s                   | None              | Resolves DOI via doi.org (requires_doi = true)
AclAnthology    | "ACL Anthology"    | 2/s                   | None              | ACL Anthology online scraping
NeurIPS         | "NeurIPS"          | —                     | None              | Currently disabled
Ssrn            | "SSRN"             | —                     | None              | Currently disabled
Searxng         | "Web Search"       | 1/s                   | None              | Meta-search fallback (requires self-hosted SearxNG)

Local (Offline)

Backend     | Name            | Storage     | Notes
DblpOffline | "DBLP"          | SQLite FTS5 | ~2–3GB, built from DBLP XML dump
AclOffline  | "ACL Anthology" | SQLite FTS5 | ~50–100MB, built from ACL Anthology XML

Note: offline and online backends for the same database share the same name(). The system avoids running both simultaneously — if an offline DB is available, the online API is skipped.

Local backends return is_local() = true and are queried inline by the coordinator task before dispatching to remote drainers. If a local backend verifies a reference, all remote queries are skipped.

How Backends Are Selected

The build_database_list() function in hallucinator-core/src/orchestrator.rs assembles the list of enabled backends at startup:

  1. OpenAlex — Added first if API key is provided
  2. CrossRef — Always enabled (with optional mailto for higher rate)
  3. arXiv — Always enabled
  4. DBLP Online — Always enabled
  5. Semantic Scholar — Always enabled (rate depends on API key)
  6. Europe PMC — Always enabled
  7. PubMed — Always enabled
  8. ACL Anthology (online) — Always enabled
  9. DOI Resolver — Always enabled (only queries refs with DOIs)
  10. DBLP Offline — Added if dblp_offline_path is configured
  11. ACL Offline — Added if acl_offline_path is configured
  12. SearxNG — Used as last-resort fallback for NotFound refs (not in the main drainer pool)

Backends listed in Config.disabled_dbs are excluded. Database names are matched case-sensitively.

Adding a New Backend

Step 1: Create the Module

Create hallucinator-core/src/db/my_backend.rs:

use std::time::Duration;

use crate::db::{DatabaseBackend, DbQueryError, DbQueryResult};

pub struct MyBackend {
    // Configuration fields (API keys, base URL, ...)
}

impl MyBackend {
    pub fn new() -> Self {
        Self {}
    }
}

impl DatabaseBackend for MyBackend {
    fn name(&self) -> &str {
        "My Backend"
    }

    fn query<'a>(
        &'a self,
        title: &'a str,
        client: &'a reqwest::Client,
        timeout: Duration,
    ) -> std::pin::Pin<Box<dyn std::future::Future<
        Output = Result<DbQueryResult, DbQueryError>
    > + Send + 'a>> {
        Box::pin(async move {
            // 1. Build your API request
            let url = format!("https://api.example.com/search?q={}",
                              urlencoding::encode(title));

            // 2. Execute with timeout
            let response = client
                .get(&url)
                .timeout(timeout)
                .send()
                .await
                .map_err(|e| {
                    if e.is_timeout() {
                        DbQueryError::Other("timeout".into())
                    } else {
                        DbQueryError::Other(e.to_string())
                    }
                })?;

            // reqwest does not treat HTTP 429 as a send() error, so check
            // the status explicitly to feed the adaptive rate limiter.
            if response.status() == reqwest::StatusCode::TOO_MANY_REQUESTS {
                return Err(DbQueryError::RateLimited { retry_after: None });
            }

            // 3. Parse response
            let body: serde_json::Value = response
                .json()
                .await
                .map_err(|e| DbQueryError::Other(e.to_string()))?;

            // 4. Extract result
            if let Some(found_title) = body.get("title").and_then(|t| t.as_str()) {
                Ok(DbQueryResult {
                    found_title: Some(found_title.to_string()),
                    authors: vec![],  // Extract authors if available
                    paper_url: body.get("url")
                        .and_then(|u| u.as_str())
                        .map(|s| s.to_string()),
                    retraction: None,
                })
            } else {
                // No match: return an empty result rather than an error.
                Ok(DbQueryResult {
                    found_title: None,
                    authors: vec![],
                    paper_url: None,
                    retraction: None,
                })
            }
        })
    }
}

Step 2: Register the Module

In hallucinator-core/src/db/mod.rs, add:

pub mod my_backend;

Step 3: Add to Database List

In hallucinator-core/src/orchestrator.rs, add the backend to build_database_list():

dbs.push(Box::new(my_backend::MyBackend::new()));

Step 4: Configure Rate Limiting

In hallucinator-core/src/rate_limit.rs, add a rate limiter for your backend in RateLimiters::new():

// Example: 5 requests per second
let my_backend = AdaptiveDbLimiter::new(
    governor::Quota::per_second(std::num::NonZeroU32::new(5).unwrap()),
);

Key Implementation Notes

  • Title matching: You don’t need to do fuzzy matching yourself. Return the title as found in your database; the validation engine handles comparison via normalize_title() and rapidfuzz.
  • Authors: Return author names as provided by your API. The validation engine normalizes them before comparison.
  • Rate limiting: Return DbQueryError::RateLimited on HTTP 429 responses. The adaptive rate limiter will back off automatically.
  • Caching: Results are cached automatically by the validation engine. You don’t need to implement caching in your backend.
  • Timeout: Always use the provided timeout parameter with your HTTP requests.
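
To illustrate the first point: the engine's normalize_title() canonicalizes titles before fuzzy comparison. Its exact behavior isn't shown here; the following sketch (lowercase, strip punctuation, collapse whitespace) is purely for intuition about why backends can return titles verbatim.

```rust
// Purely illustrative; the real normalize_title() lives in the
// validation engine and may differ. Lowercases, replaces punctuation
// with spaces, and collapses runs of whitespace.
fn normalize_title_sketch(title: &str) -> String {
    title
        .to_lowercase()
        .chars()
        .map(|c| if c.is_alphanumeric() { c } else { ' ' })
        .collect::<String>()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
}
```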

Python Bindings

Hallucinator provides Python bindings via PyO3, offering the full validation pipeline as a native Python package with pre-compiled wheels.

Installation

pip install hallucinator

Pre-compiled wheels are available for major platforms (Linux x86_64, macOS x86_64/ARM64, Windows x86_64). To build from source:

cd hallucinator-rs/crates/hallucinator-python
pip install maturin
maturin develop --release

What’s Available

The Python bindings expose:

  • PdfExtractor — Extract references from PDFs with configurable parsing
  • Validator + ValidatorConfig — Validate references against academic databases
  • ValidationResult — Per-reference results with status, source, authors, per-DB details
  • ProgressEvent — Real-time progress callbacks
  • ArchiveIterator — Stream PDFs from tar.gz/zip archives
  • Custom segmentation strategies — Pass Python callables for reference segmentation

Quick Example

from hallucinator import PdfExtractor, Validator, ValidatorConfig

# Extract
extractor = PdfExtractor()
result = extractor.extract("paper.pdf")

# Validate
config = ValidatorConfig()
config.crossref_mailto = "you@example.com"
config.num_workers = 4
config.db_timeout_secs = 10
validator = Validator(config)

def on_progress(event):
    if event.type == "result":
        print(f"  [{event.status}] {event.title}")

results = validator.check(result.references, progress=on_progress)

Full Documentation

The complete Python API reference — including all configuration options, custom extraction strategies, progress event types, and result inspection — is in:

PYTHON_BINDINGS.md

This covers:

  • Installation (wheels vs. source build)
  • PDF extraction API and configuration
  • Custom segmentation strategies (Python callables)
  • Validator configuration options with defaults
  • Progress callbacks and event types
  • Per-database result inspection
  • DOI, arXiv, and retraction information
  • Archive processing
  • Complete API reference tables
  • End-to-end examples

Hallucinator TUI Design Document

Design only. No implementation decisions about backend capabilities — if the backend can’t support something yet, that’s a separate problem.

Who uses this and why

Area chairs / senior PC members reviewing 20-100 submissions for a venue. They need to triage: which papers have suspicious references, and how suspicious? They don’t read every result — they scan for red flags, drill in when something looks off, then move on.

Reviewers checking a handful of assigned papers (3-8 typically). More likely to read results carefully. May re-run individual papers after authors revise.

Demo / conference hallway context. Someone shows this to a colleague. First impression matters — it should look like a tool built by someone who gives a shit. But the flash has to be load-bearing: every visual element should communicate something useful.

Design principles

  1. Information density over decoration. CS people read dense UIs comfortably. Don’t waste space on padding when you could show data. Think Bloomberg terminal, not macOS settings.

  2. Glanceable status. At any point you should be able to look at the screen for <1 second and know: how far along are we, is anything wrong, what needs my attention.

  3. Progressive disclosure. Summary first, details on demand. The batch view shows paper-level status. Drill into a paper to see references. Drill into a reference to see per-database results.

  4. Don’t block the user. Analysis takes time. The user should be able to browse already-completed results while new papers are still running.

  5. Adaptive layout. Must be usable at 80x24 (cramped but functional) and take advantage of 200+ column modern terminals. Not two separate layouts — one layout that flexes.


Screens

There are four screens. You’re always on exactly one.

Screen 1: Queue

The entry point. Shows all papers (1 or 50) and their status.

 HALLUCINATOR                                          ██████░░░░ 12/50
─────────────────────────────────────────────────────────────────────────
 #   Paper                          Refs  ✓   ⚠   ✗   ☠   Status
─────────────────────────────────────────────────────────────────────────
  1  arxiv_2024_llm_survey.pdf       38  34   2   1   1   DONE
  2  neurips_submission_042.pdf       27  18   1   0   0   ████░░ 19/27
  3  review_response_v2.pdf          15   —   —   —   —   QUEUED
  4  transformer_scaling.pdf          —   —   —   —   —   QUEUED
  5  rlhf_safety_paper.pdf            —   —   —   —   —   QUEUED
 ...
 48  federated_privacy.pdf            —   —   —   —   —   QUEUED
 49  code_generation_bench.pdf        —   —   —   —   —   QUEUED
 50  multimodal_reasoning.pdf         —   —   —   —   —   QUEUED
─────────────────────────────────────────────────────────────────────────
 ✓ 34 verified  ⚠ 3 mismatch  ✗ 1 not found  ☠ 1 retracted     3:42 elapsed

 [enter] open  [r] retry failed  [d] delete  [a] add files  [q] quit

Columns:

  • # — sequential, stable. Not the filename, because filenames are long.
  • Paper — truncated filename. Full path shown on hover/focus.
  • Refs — total reference count (blank until PDF is parsed).
  • ✓ ⚠ ✗ ☠ — counts by verdict. These are the triage signal.
  • Status — QUEUED, inline progress bar while running, DONE when finished. If errors occurred during parsing/extraction: ERROR.

Behavior:

  • Papers are listed in queue order. Currently-running papers float to the top of the “not done” section (done papers above, then active, then queued).
  • The cursor (highlighted row) selects a paper. Press Enter to go to Screen 2.
  • While papers are running, counts update live. The overall progress bar in the header updates.
  • Bottom row shows aggregate totals across all completed papers.

Sorting: Default is queue order. Allow re-sort by column (keybind or click header). Most useful sort: by ✗ count, descending — puts the most suspicious papers at top. A reviewer running 50 papers wants to see “which 5 papers have the most not-found references” immediately.

Why this works for 1 paper: If you pass a single PDF, this screen still appears but with one row. It shows the progress bar filling up, counts incrementing. When done, it auto-focuses that row so pressing Enter takes you straight to the results. Feels natural, not like a degenerate case of a batch view.

Filtering: Simple text filter on filename. Type / to start filtering (vim convention). Also filter by status: e.g., f cycles through “all → has problems → done → running → queued”. The most common filter is “show me only the papers that have problems.”


Screen 2: Paper

Shows all references for one paper and their verdicts.

 HALLUCINATOR > arxiv_2024_llm_survey.pdf               ██████████ 38/38
─────────────────────────────────────────────────────────────────────────
 #   Reference                                     Verdict     Source
─────────────────────────────────────────────────────────────────────────
  1  Vaswani et al. "Attention Is All You Need"    ✓ verified  arXiv
  2  Brown et al. "Language Models are Few-Shot..." ✓ verified  S2
  3  Wei et al. "Chain-of-Thought Prompting..."    ✓ verified  CrossRef
  4  Smith & Jones "Recursive Self-Improvement..." ✗ not found  —
  5  Zhang et al. "Emergent Abilities of..."       ⚠ mismatch  DBLP
  6  Chen et al. "Evaluating Large Language..."    ✓ verified  arXiv
 ...
 37  Wang et al. "Constitutional AI..."            ✓ verified  S2
 38  Davis "On the Retraction of..."               ☠ retracted CrossRef
─────────────────────────────────────────────────────────────────────────
 ✓ 34 verified  ⚠ 2 mismatch  ✗ 1 not found  ☠ 1 retracted

 [enter] details  [r] retry  [esc] back  [e] export  [s] sort

Columns:

  • # — reference number as it appears in the paper.
  • Reference — authors + truncated title. Quoted title portion to visually separate it from authors.
  • Verdict — icon + word. Color-coded: green/yellow/red/magenta.
  • Source — which database confirmed it (for verified); shows — for not found. If multiple DBs confirmed, show the fastest one (the one that actually ended the search via early exit).

Behavior:

  • If analysis is still running, references appear as they’re processed. Unprocessed references show as dim/grey with ⏳ pending or ⟳ checking status.
  • Enter on a reference opens Screen 3 (detail view).
  • r on a specific reference retries just that one.
  • R (shift) retries all failed/not-found references for this paper.

Active reference animation: The reference currently being checked gets a subtle indicator — a spinner or a cycling set of dots. Nothing aggressive. Just enough to show “this one is live.” If multiple references are being checked concurrently (which they are — 4 at a time), all active ones show the indicator.

Problem-first ordering: Default sort is by reference number (paper order). But s cycles through sort modes, and sort-by-verdict puts not-found and retracted at the top. This is the thing the user actually cares about — “show me the problems.”

Export: e opens a small modal/prompt: export format (json / csv / markdown / plain text) and destination (file path, clipboard). Exports the results for this paper only. From the Queue screen, e exports all papers.


Screen 3: Reference Detail

Full detail on one reference. This is the “prove it” screen — when you see a suspicious result you drill in here to understand why.

 HALLUCINATOR > arxiv_2024_llm_survey.pdf > [4]
─────────────────────────────────────────────────────────────────────────

 REFERENCE [4]
 Smith, J. and Jones, A. (2024)
 "Recursive Self-Improvement in Large Language Models:
  A Theoretical Framework"
 Proceedings of ICML 2024, pp. 1234-1248

 Verdict: ✗ NOT FOUND

 Extracted title:  "Recursive Self-Improvement in Large Language
                    Models: A Theoretical Framework"
 Extracted authors: J. Smith, A. Jones
 Extracted DOI:     none
 Extracted arXiv:   none

 DATABASE RESULTS
─────────────────────────────────────────────────────────────────────────
  Database       Result          Time     Notes
─────────────────────────────────────────────────────────────────────────
  CrossRef       no match        1.2s
  arXiv          no match        0.8s
  DBLP           no match        0.3s     (offline)
  Sem. Scholar   timeout         10.0s    retried: no match (12.4s)
  OpenAlex       no match        2.1s
  ACL            no match        0.4s
  NeurIPS        no match        0.6s
  Europe PMC     no match        1.8s
  PubMed         no match        0.9s
─────────────────────────────────────────────────────────────────────────

 No close matches found in any database.

 [r] retry  [c] copy ref text  [esc] back

For a verified reference, this screen would instead show:

 Verdict: ✓ VERIFIED (arXiv)

 Matched title:  "Attention Is All You Need"
 Match score:     98.2%
 Matched authors: Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez,
                  Kaiser, Polosukhin
 Author overlap:  8/8

 DATABASE RESULTS
─────────────────────────────────────────────────────────────────────────
  Database       Result          Time     Notes
─────────────────────────────────────────────────────────────────────────
  arXiv          ✓ match         0.3s     ← verified (early exit)
  CrossRef       (skipped)        —       early exit
  DBLP           (skipped)        —       early exit
  ...

For an author mismatch:

 Verdict: ⚠ AUTHOR MISMATCH (DBLP)

 Matched title:   "Emergent Abilities of Large Language Models"
 Match score:      96.1%
 Expected authors: Zhang, Wei, Chen
 Found authors:    Wei, Tay, Bommasani, Raffel, Zoph, Borgeaud,
                   Yogatama, Bosma, Zhou, Metzler, Chi, Hashimoto,
                   Vinyals, Liang, Dean, Fedus
 Author overlap:   1/3 (Wei)

Why this screen matters: “Not found” doesn’t always mean hallucinated. Maybe the title extraction mangled something. Maybe the paper is too new. This screen lets a human make the judgment call by seeing exactly what was searched for, what came back, and how long each database took. The timing information helps distinguish “no match” from “everything timed out” — very different confidence levels.


Screen 4: Config

Accessible from any screen via , (comma). Not a modal — a full screen you navigate to and back from, same as the others. Esc returns to wherever you were.

The point: you’re mid-run, you realize you forgot to set your Semantic Scholar API key, or you want to bump concurrency, or you want to disable a database that’s down. You shouldn’t have to quit and relaunch. That’s hostile UX when someone has 30 papers already processed.

 HALLUCINATOR > Config
─────────────────────────────────────────────────────────────────────────

 API Keys
─────────────────────────────────────────────────────────────────────────
  Semantic Scholar    sk-••••••••••••7f2a               [enter] edit
  OpenAlex            (not set)                         [enter] set
─────────────────────────────────────────────────────────────────────────

 Databases
─────────────────────────────────────────────────────────────────────────
  CrossRef            ✓ enabled
  arXiv               ✓ enabled
  DBLP                ✓ enabled  (offline: ~/dblp.db)
  Sem. Scholar        ✓ enabled
  OpenAlex            ○ disabled  (no API key)
  ACL                 ✓ enabled
  NeurIPS             ✓ enabled
  Europe PMC          ✓ enabled
  PubMed              ✓ enabled
─────────────────────────────────────────────────────────────────────────

 Concurrency & Timeouts
─────────────────────────────────────────────────────────────────────────
  Parallel references         4          [enter] edit
  DB query timeout           10s         [enter] edit
  Retry timeout              45s         [enter] edit
  Request delay              1.0s        [enter] edit
─────────────────────────────────────────────────────────────────────────

 Display
─────────────────────────────────────────────────────────────────────────
  Theme                      green       [enter] toggle
  Notifications              bell        [enter] cycle
─────────────────────────────────────────────────────────────────────────

 [enter] edit  [space] toggle  [esc] back

Sections:

API Keys. Shows masked keys (last 4 chars visible) for any keys already set. Press Enter to edit — opens an inline text input. Keys entered here take effect immediately for subsequent queries. They override env vars / CLI flags for this session.

Databases. Toggle individual databases on/off with Space. If a database requires an API key that isn’t set, it shows as ○ disabled with the reason. If DBLP is in offline mode, show the DB path. Toggling a database off mid-run means it won’t be queried for remaining references (already-completed results are unaffected). Useful when a database is down and you don’t want to waste timeout budget on it.

Concurrency & Timeouts. Edit numeric values inline. Changing parallel references mid-run adjusts the worker pool for subsequent references. Changing timeouts affects subsequent queries. These are the knobs you reach for when the tool is going too slow (bump concurrency) or when a database is flaky (bump timeout).

Display. Theme toggle (green/modern) applies immediately — the screen redraws in the new palette. Notification mode cycles through off → bell → desktop → bell+desktop.

Behavior notes:

  • Changes take effect immediately for new work. They don’t retroactively affect completed results or in-flight queries.
  • Changes are session-scoped by default. They don’t persist to disk unless the user explicitly saves.
  • S (shift-s) on the config screen saves current settings to ~/.config/hallucinator/config.toml. This becomes the new default for future runs. A small confirmation appears: “Saved to ~/.config/hallucinator/config.toml”.
  • The config file is loaded on startup if it exists; CLI flags and env vars override the config file, and the TUI config screen overrides everything. Precedence: TUI edits > CLI flags > env vars > config file > defaults.
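
The precedence chain reduces to a simple Option fallback. effective() is a hypothetical helper for illustration, not actual TUI code; each layer is Some only when the user set a value at that layer.

```rust
// Hypothetical helper illustrating the documented precedence:
// TUI edits > CLI flags > env vars > config file > defaults.
fn effective<T>(tui: Option<T>, cli: Option<T>, env: Option<T>, file: Option<T>, default: T) -> T {
    tui.or(cli).or(env).or(file).unwrap_or(default)
}
```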

Why , as the keybind: It’s unused, easy to reach, and has precedent in tools like Neovim/Helix where , is a common leader key for settings. It doesn’t conflict with any navigation or action key.

Why a full screen and not a modal: The config has too many sections and options to fit comfortably in a modal overlay. A full screen gives room for the settings to breathe and be scannable. Also, settings aren’t something you adjust while simultaneously reading results — you go in, tweak, go back. Full screen matches that flow.


Adaptive layout

Narrow terminals (< 100 columns)

  • Queue screen: hide Refs column, truncate filenames more aggressively, collapse ✓ ⚠ ✗ ☠ into a single “problems” count.
  • Paper screen: hide Source column, truncate titles earlier.
  • Detail screen: wraps naturally since it’s mostly prose.

Wide terminals (140+ columns)

  • Queue screen: show full filenames, add an “elapsed time” column, add a “problems” column that sums ✗ + ☠ for quick scanning.
  • Paper screen: show full titles without truncation, add a “time” column showing how long validation took per reference.
  • Detail screen: split into two panes — reference info on the left, database results on the right (side-by-side instead of stacked).

Very wide terminals (200+ columns)

  • Queue screen: could show a mini-sparkline per paper showing distribution of verdicts as a tiny bar chart inline. Pure gravy.
  • Paper screen: show the raw reference text in a right-side pane alongside the parsed/structured view. Useful for debugging extraction issues.

Short terminals (< 30 rows)

  • Collapse the header to a single line.
  • Collapse the footer/keybinds bar to a single line.
  • Use available rows for data.
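
The width breakpoints above could reduce to a single classification step. The enum names and the treatment of 100–139 columns as the default layout are illustrative assumptions, not implementation facts (and row count is a separate, orthogonal axis).

```rust
#[derive(Debug, PartialEq)]
enum Density {
    Narrow,    // < 100 cols: collapse columns, truncate harder
    Standard,  // default layout
    Wide,      // 140+ cols: extra columns, side-by-side detail panes
    UltraWide, // 200+ cols: sparklines, raw-reference pane
}

// Classify terminal width once per resize; every screen then renders
// against the resulting density.
fn density(cols: u16) -> Density {
    match cols {
        0..=99 => Density::Narrow,
        100..=139 => Density::Standard,
        140..=199 => Density::Wide,
        _ => Density::UltraWide,
    }
}
```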

Live activity panel (overlay, not a screen)

Toggleable with Tab. This is the “flashy” part, but it earns its space.

When active, it takes the right 40-50 columns (or bottom third on narrow terminals) and shows:

 ACTIVITY
────────────────────────────────
 Database Health
  CrossRef    ●  142ms avg
  arXiv       ●  89ms avg
  DBLP        ●  12ms avg  (offline)
  Sem.Scholar ◐  1.2s avg  throttled
  OpenAlex    ●  203ms avg
  ACL         ●  67ms avg
  NeurIPS     ●  94ms avg
  Europe PMC  ●  312ms avg
  PubMed      ○  down

 Rate Limits
  CrossRef   ░░░░▓▓░░░░  12/50
  S2         ░▓░░░░░░░░   3/100

 Throughput
  refs/min  ▁▂▃▅▇█▇▅▃▄▆█  avg: 8.2

 Active Queries
  → CrossRef: "Recursive Self-Imp..."
  → arXiv: "Recursive Self-Imp..."
  → DBLP: "Recursive Self-Imp..."
  ← S2: 429 Too Many Requests

Database health indicators:

  • ● — healthy (responding, <500ms average)
  • ◐ — degraded (slow, rate limited, intermittent errors)
  • ○ — down (repeated failures, all timeouts)
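
A sketch of how the three states might be derived from observed stats. Everything beyond the stated <500ms-average threshold (the failure cutoff, the exact inputs) is an assumption for illustration, not the actual TUI logic.

```rust
#[derive(Debug, PartialEq)]
enum Health { Healthy, Degraded, Down }

// Illustrative only: maps per-database observations to the ● ◐ ○
// indicators. The "3 consecutive failures" cutoff is assumed.
fn health(avg_ms: Option<u64>, consecutive_failures: u32, rate_limited: bool) -> Health {
    if consecutive_failures >= 3 {
        return Health::Down;
    }
    match avg_ms {
        Some(ms) if ms < 500 && !rate_limited => Health::Healthy,
        Some(_) => Health::Degraded,
        None => Health::Down, // no successful responses yet
    }
}
```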

Why this panel exists: When you’re running 50 papers, this answers “why is it going slow” without you having to guess. If Semantic Scholar is throttling you, you can see it. If PubMed is down, you know those “not found” results are lower confidence. It converts backend infrastructure state into visible, actionable information.

Why it’s an overlay and not a screen: You want to see this while browsing results. It’s context, not content.


Keyboard model

Global (work on any screen)

 Key       Action
 q         Quit (confirms if analysis still running)
 ,         Open config screen
 Tab       Toggle activity panel
 ?         Toggle keybind help overlay
 Ctrl+C    Cancel current analysis / force quit

Navigation

 Key        Action
 ↑/↓ j/k    Move cursor
 Enter      Drill in (queue→paper→reference)
 Esc        Back up one level
 g / G      Jump to top / bottom of list
 Ctrl+D/U   Page down / page up
 /          Start text filter
 n / N      Next / previous filter match

Actions

 Key     Context         Action
 r       Queue           Retry all failed refs in paper
 r       Paper           Retry selected reference
 R       Paper           Retry all failed refs in paper
 r       Detail          Retry this reference
 e       Queue           Export all results
 e       Paper           Export this paper’s results
 s       Queue / Paper   Cycle sort mode
 f       Queue           Cycle status filter
 a       Queue           Add more files
 d       Queue           Remove paper from queue
 y       Detail          Copy reference text to clipboard
 S       Config          Save settings to config file
 Space   Config          Toggle selected setting

Mouse

  • Click row to select.
  • Double-click row to drill in.
  • Click column header to sort.
  • Scroll wheel scrolls the list.
  • Click the Tab activity panel area to toggle it.

Not every action needs a mouse equivalent. Keyboard is the primary interface. Mouse is a convenience for people who reach for it instinctively.


Color

The palette should work on both dark and light terminal backgrounds but optimize for dark (that’s what the target audience uses).

Verdict colors

 Verdict           Color             Rationale
 Verified          Green (bold)      Universal “good”
 Author mismatch   Yellow            Warning, needs human judgment
 Not found         Red               Danger / suspicious
 Retracted         Magenta (bold)    Alarming, distinct from not-found
 Pending           Dim / grey        Not yet actionable
 Checking          Cyan              Active, in-progress

UI chrome

  • Borders and separators: dim grey. Should recede.
  • Headers/labels: white, bold.
  • Selected row: reverse video (swap fg/bg). High contrast, works on any color scheme.
  • Active database queries in activity panel: cyan.
  • Rate limit bars: green → yellow → red gradient as capacity fills.

Emphasis principle

Color is never the only signal. Every verdict also has a text label and a distinct icon character (✓ ⚠ ✗ ☠). This matters for:

  • Accessibility (color vision deficiency).
  • Monochrome terminals / piped output.
  • Screenshots in papers or blog posts that may be printed B&W.

Startup sequence

$ hallucinator ~/papers/*.pdf

 ░█░█░█▀█░█░░░█░░░█░█░█▀▀░▀█▀░█▀█░█▀█░▀█▀░█▀█░█▀▄
 ░█▀█░█▀█░█░░░█░░░█░█░█░░░░█░░█░█░█▀█░░█░░█░█░█▀▄
 ░▀░▀░▀░▀░▀▀▀░▀▀▀░▀▀▀░▀▀▀░▀▀▀░▀░▀░▀░▀░░▀░░▀▀▀░▀░▀

 Loading 50 PDFs...
 Databases: CrossRef arXiv DBLP(offline) S2 OpenAlex ACL NeurIPS PMC PubMed

Brief. The banner renders instantly (no animation — animation on startup is a delay). The database line confirms which sources are enabled and shows if DBLP is running in offline mode. Then it transitions to the Queue screen within ~1 second as PDFs start parsing.

If the banner won’t fit (terminal < 70 columns), skip it and go straight to the Queue screen.


The “50 papers at 2am” workflow

This is the scenario that matters most. An area chair has a deadline. They run:

$ hallucinator ~/openreview-downloads/*.pdf

First 10 seconds: Queue screen populates with 50 filenames. First paper starts processing. Activity panel shows databases warming up.

Next few minutes: Papers process. The user watches for a bit, sees the system is working, then does something else in another terminal tab.

They come back: 35 papers done. They press s to sort by problems. The 3 papers with not-found references float to the top. They press Enter on the worst one, see 4 not-found references, drill into each to see what the tool searched for. Two look like genuine hallucinations (zero matches across all 9 databases). Two look like very recent preprints that just aren’t indexed yet (only in arXiv, which timed out).

They press Esc back to Queue, check the next problem paper. After 5 minutes of triage they have a clear picture: 2 submissions with probable fabricated references, 1 with a retracted citation the authors should have caught.

They press e, export a JSON report of all results, and attach it to their AC notes.

What mattered: Sort by problems. Fast drill-in/drill-out. Export. Not the sparklines or the database race visualization — those were nice for the first 30 seconds but the actual utility is in triage speed.


Non-goals for TUI

  • PDF viewing. Don’t try to render the paper. Users have their own PDF viewer open alongside.
  • Editing results. The TUI is read-only for results. No “mark as false positive” or annotation features. That’s a different tool.
  • Log viewer. The activity panel is not a log. Don’t show every HTTP request. Show state (database health, rate limits, throughput) not events.

Visual mockups: states and scenarios

Queue screen — mid-run, sorted by problems

The area chair has been running for a few minutes and just hit s to sort by descending problem count.

 HALLUCINATOR                                          ████████░░ 41/50
─────────────────────────────────────────────────────────────────────────
 #   Paper                          Refs  ✓   ⚠   ✗   ☠   Status
─────────────────────────────────────────────────────────────────────────
  7  sketchy_submission_v3.pdf        22  14   1   4   1   DONE
 31  workshop_paper_draft.pdf         18  11   2   3   0   DONE
 12  llm_alignment_study.pdf          35  28   3   2   0   DONE
  1  arxiv_2024_llm_survey.pdf        38  34   2   1   1   DONE
 19  multiagent_reasoning.pdf         29  27   1   1   0   DONE
  3  safety_evaluation.pdf            41  39   2   0   0   DONE
  5  federated_learning.pdf           33  33   0   0   0   DONE
 ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
 42  code_gen_benchmark.pdf           27  18   —   —   —   █████░ 18/27
 43  vision_transformer.pdf           19   —   —   —   —   PARSING
 ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
 44  diffusion_models.pdf              —   —   —   —   —   QUEUED
 45  robustness_theory.pdf             —   —   —   —   —   QUEUED
 ...8 more
─────────────────────────────────────────────────────────────────────────
 ✓ 412 verified  ⚠ 18 mismatch  ✗ 11 not found  ☠ 2 retracted  12:34

 sorted: problems ↓          [enter] open  [s] sort  [f] filter  [q] quit

Note the three visual zones separated by dashed rules: done (sorted), active (running now), and queued. The user’s eye goes straight to the top — paper #7 with 4 not-found and 1 retracted is the one to investigate.

Queue screen — narrow terminal (80 columns)

Same data, collapsed for a small terminal:

 HALLUCINATOR                        ████████░░ 41/50
────────────────────────────────────────────────────────
  #  Paper                     Probs  Status
────────────────────────────────────────────────────────
   7 sketchy_submission_v3…     5     DONE
  31 workshop_paper_draft…      3     DONE
  12 llm_alignment_study…       2     DONE
   1 arxiv_2024_llm_surv…       2     DONE
  19 multiagent_reasoning…      1     DONE
   3 safety_evaluation…         0     DONE
   5 federated_learning…        0     DONE
  ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
  42 code_gen_benchmark…        —     █████░ 18/27
  43 vision_transformer…        —     PARSING
  ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
  44 diffusion_models…          —     QUEUED
  ...
────────────────────────────────────────────────────────
 [enter] open  [s] sort  [f] filter  [q] quit

Individual verdict columns collapse into a single “Probs” count (✗ + ☠ + ⚠). Still scannable. The user loses per-type breakdown at a glance but gains it back by drilling in.

Queue screen — wide terminal with activity panel (180+ columns)

 HALLUCINATOR                                          ████████░░ 41/50         │ ACTIVITY
──────────────────────────────────────────────────────────────────────────────────┤────────────────────────────────────
 #   Paper                              Refs  ✓   ⚠   ✗   ☠   Time   Status   │ Database Health
──────────────────────────────────────────────────────────────────────────────────│  CrossRef    ●  142ms
  7  sketchy_submission_v3.pdf            22  14   1   4   1   0:48   DONE     │  arXiv       ●   89ms
 31  workshop_paper_draft.pdf             18  11   2   3   0   0:35   DONE     │  DBLP        ●   12ms  offline
 12  llm_alignment_study.pdf              35  28   3   2   0   1:12   DONE     │  Sem.Scholar ◐  1.2s   throttled
  1  arxiv_2024_llm_survey.pdf            38  34   2   1   1   1:31   DONE     │  OpenAlex    ●  203ms
 19  multiagent_reasoning.pdf             29  27   1   1   0   0:55   DONE     │  ACL         ●   67ms
  3  safety_evaluation.pdf                41  39   2   0   0   1:44   DONE     │  NeurIPS     ●   94ms
  5  federated_learning.pdf               33  33   0   0   0   1:22   DONE     │  Europe PMC  ●  312ms
 ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─│  PubMed      ○  down
 42  code_gen_benchmark.pdf               27  18   —   —   —   0:22   ████░    │
 43  vision_transformer.pdf               19   —   —   —   —    —     PARSING  │ Rate Limits
 ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─│  CrossRef  ░░░▓▓▓░░░░  18/50
 44  diffusion_models.pdf                  —   —   —   —   —    —     QUEUED   │  S2        ░▓▓▓▓▓▓▓░░  71/100
 45  robustness_theory.pdf                 —   —   —   —   —    —     QUEUED   │
 ...8 more                                                                      │ Throughput (refs/min)
──────────────────────────────────────────────────────────────────────────────────│  ▁▂▃▅▇█▇▅▃▁▂▅▇█▆▅  avg: 8.2
 ✓ 412  ⚠ 18  ✗ 11  ☠ 2                                           12:34       │
 sorted: problems ↓    [enter] open  [s] sort  [f] filter  [Tab] panel  [q] quit│

The activity panel earns its space here. S2 is close to its rate limit (71/100), which is why it shows as “throttled.” PubMed is down outright. The throughput sparkline shows a dip about two minutes ago (probably when S2 started throttling) followed by a recovery.

Paper screen — actively checking, with activity panel

Drilled into paper #42 which is still running:

 HALLUCINATOR > code_gen_benchmark.pdf                   █████░░░░ 18/27        │ ACTIVITY
──────────────────────────────────────────────────────────────────────────────────┤──────────────────────────────
 #   Reference                                     Verdict      Source          │ Database Health
──────────────────────────────────────────────────────────────────────────────────│  CrossRef    ●  142ms
  1  Chen et al. "Evaluating Large Language..."    ✓ verified   arXiv           │  arXiv       ●   89ms
  2  Austin et al. "Program Synthesis with..."     ✓ verified   S2              │  DBLP        ●   12ms  offline
  3  Li et al. "Competition-Level Code..."         ✓ verified   CrossRef        │  Sem.Scholar ◐  1.2s
  4  Hendrycks et al. "Measuring Coding..."        ✓ verified   CrossRef        │
  5  Nijkamp et al. "CodeGen: An Open..."          ✓ verified   S2              │ Active Now
 ...                                                                            │  [19] → CrossRef  ⟳
 17  Fried et al. "InCoder: A Generative..."       ✓ verified   DBLP            │  [19] → arXiv     ⟳
 18  Allal et al. "SantaCoder: Don't..."           ✓ verified   S2              │  [19] → DBLP      ✓ 12ms
────────────────────────────────────────────────────────────────────────────────  │  [19] → S2        ⟳
 19  Wang et al. "Execution-Based Code..."         ⟳ checking   3/9             │  [20] → CrossRef  ⟳
 20  Fake et al. "An Invented Paper..."            ⟳ checking   1/9             │  [20] → arXiv     waiting
 21  Zhang et al. "RepoCoder: Repository..."       ⟳ checking   0/9             │  [21] → queued
 22  Liu et al. "Is Your Code Generated..."        ⏳ pending                    │  [22] → queued
 ...                                                                            │
 27  Peng et al. "The Impact of AI on..."          ⏳ pending                    │
──────────────────────────────────────────────────────────────────────────────────│
 ✓ 18 verified  so far                                                          │
 [enter] details  [r] retry  [esc] back  [s] sort                              │

The activity panel here shows per-query granularity: reference [19] has 3 of 9 databases done (DBLP already returned a match at 12ms, but the reference is still waiting on the others, unless early exit kicks in and cancels the remaining queries). Reference [20] has just started. [21] and [22] are queued, waiting for a slot in the concurrency pool.

This is the “database race” made visible. You’re watching 4 references being checked concurrently, each with up to 9 databases racing. When a database returns a match, the whole reference can resolve instantly via early exit.
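The race-with-early-exit shape can be sketched with plain threads and a channel (a simplification, assuming hypothetical `DbResult` and `race_databases` names; the real crate presumably uses an async runtime):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical result of querying one database for a reference.
#[derive(Debug)]
struct DbResult {
    db: &'static str,
    matched: bool,
}

/// Race several database lookups; return as soon as one reports a match
/// (early exit), without waiting for the slower databases to finish.
/// Each tuple is (name, simulated latency in ms, whether it matches).
fn race_databases(dbs: &[(&'static str, u64, bool)]) -> Option<DbResult> {
    let (tx, rx) = mpsc::channel();
    for &(db, latency_ms, matched) in dbs {
        let tx = tx.clone();
        thread::spawn(move || {
            // Simulated network latency for this backend.
            thread::sleep(Duration::from_millis(latency_ms));
            // Ignore send errors: the receiver may be gone after early exit.
            let _ = tx.send(DbResult { db, matched });
        });
    }
    drop(tx); // so the iterator below ends once all workers finish
    for result in rx {
        if result.matched {
            return Some(result); // early exit: first match wins
        }
    }
    None // every database finished without a match
}

fn main() {
    // DBLP is fastest and has the match, mirroring the mockup above.
    let dbs = [("CrossRef", 140, false), ("DBLP", 10, true), ("arXiv", 90, false)];
    let winner = race_databases(&dbs).expect("one db matched");
    println!("resolved via {}", winner.db);
}
```

Dropping the receiver after the first match is what makes the remaining sends no-ops; an async version would cancel the in-flight futures instead.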

Paper screen — done, filtered to problems only

After analysis completes, the reviewer presses f to filter to problems:

 HALLUCINATOR > sketchy_submission_v3.pdf                  DONE ✓14 ⚠1 ✗4 ☠1
─────────────────────────────────────────────────────────────────────────
 #   Reference                                     Verdict      Source
─────────────────────────────────────────────────────────────────────────
  3  Zhang & Li "Self-Aware Neural..."             ✗ not found   —
  8  Johnson et al. "Recursive Prompt..."          ✗ not found   —
 11  Chen "Advanced Reasoning in..."               ✗ not found   —
 15  Park et al. "Constitutional Self-..."         ✗ not found   —
 19  Davis et al. "On the Emergent..."             ☠ retracted  CrossRef
  6  Williams et al. "Scaling Laws for..."         ⚠ mismatch   DBLP
─────────────────────────────────────────────────────────────────────────
 showing: problems only (6/22)

 [enter] details  [f] show all  [r] retry  [e] export  [esc] back

Six references instead of 22. The reviewer only needs to look at these. Four not-found references in a 22-reference paper is a strong signal.

Reference detail — close match found but authors wrong

 HALLUCINATOR > sketchy_submission_v3.pdf > [6]
─────────────────────────────────────────────────────────────────────────

 REFERENCE [6]
 Williams, R., Thompson, K., and Garcia, M. (2023)
 "Scaling Laws for Neural Language Models"
 In Proceedings of NeurIPS 2023

 Verdict: ⚠ AUTHOR MISMATCH

 Extracted title:   "Scaling Laws for Neural Language Models"
 Extracted authors:  R. Williams, K. Thompson, M. Garcia

 BEST MATCH (CrossRef)
─────────────────────────────────────────────────────────────────────────
  Matched title:    "Scaling Laws for Neural Language Models"
  Title score:       100.0%
  Found authors:     Jared Kaplan, Sam McCandlish, Tom Henighan,
                     Tom B. Brown, Benjamin Chess, Rewon Child,
                     Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei
  Author overlap:    0/3 — no matching authors

  DOI:               10.48550/arXiv.2001.08361
  Source:            CrossRef (0.8s)
─────────────────────────────────────────────────────────────────────────

 This paper exists but the cited authors (Williams, Thompson, Garcia)
 don't match the actual authors (Kaplan, McCandlish, et al.).

 ALL DATABASE RESULTS
─────────────────────────────────────────────────────────────────────────
  CrossRef       ⚠ author mismatch  0.8s   (match shown above)
  arXiv          ⚠ author mismatch  0.4s
  DBLP           ⚠ author mismatch  0.1s   (offline)
  Sem. Scholar   timeout            10.0s
  OpenAlex       ⚠ author mismatch  1.2s
  ACL            no match           0.3s
  NeurIPS        no match           0.5s
  Europe PMC     no match           1.1s
  PubMed         no match           0.7s
─────────────────────────────────────────────────────────────────────────

 [r] retry  [y] copy ref text  [esc] back

This is a telltale sign: the paper “Scaling Laws for Neural Language Models” is real (Kaplan et al., 2020), but the submission attributed it to completely fabricated authors. Four databases independently confirmed the title exists with different authors. The detail screen makes this case unambiguous.

Reference detail — retracted paper

 HALLUCINATOR > sketchy_submission_v3.pdf > [19]
─────────────────────────────────────────────────────────────────────────

 REFERENCE [19]
 Davis, P., Reeves, L., and Kang, S. (2021)
 "On the Emergent Properties of Transformer Architectures
  in Low-Resource Settings"
 Journal of Machine Learning Research, 22(1), pp. 1-34

 Verdict: ☠ RETRACTED

 Extracted title:   "On the Emergent Properties of Transformer
                     Architectures in Low-Resource Settings"
 Extracted authors:  P. Davis, L. Reeves, S. Kang

 MATCH (CrossRef)
─────────────────────────────────────────────────────────────────────────
  Matched title:    "On the Emergent Properties of Transformer
                     Architectures in Low-Resource Settings"
  Title score:       100.0%
  Found authors:     P. Davis, L. Reeves, S. Kang
  Author overlap:    3/3 ✓

  DOI:               10.xxxx/jmlr.2021.xxxxx
  Source:            CrossRef (1.1s)

  ╔══════════════════════════════════════════════════════════════════╗
  ║  ☠ RETRACTION NOTICE                                           ║
  ║                                                                 ║
  ║  This paper was retracted on 2022-03-15.                       ║
  ║  Retraction DOI: 10.xxxx/jmlr.2022.retract.xxxxx              ║
  ║  Reason: "Results could not be reproduced; data fabrication     ║
  ║  suspected."                                                    ║
  ╚══════════════════════════════════════════════════════════════════╝
─────────────────────────────────────────────────────────────────────────

 [r] retry  [y] copy ref text  [esc] back

The retraction notice gets a heavy box border — it’s the most important piece of information on this screen and should be impossible to miss.

Single-paper mode — just started

When invoked with a single PDF:

 HALLUCINATOR
─────────────────────────────────────────────────────────────────────────
 arxiv_2024_llm_survey.pdf                         38 references found
─────────────────────────────────────────────────────────────────────────
  1  Vaswani et al. "Attention Is All You Need"    ✓ verified   arXiv
  2  Brown et al. "Language Models are Few-..."    ✓ verified   S2
  3  Wei et al. "Chain-of-Thought Prompting..."    ✓ verified   CrossRef
  4  Bubeck et al. "Sparks of Artificial..."       ✓ verified   S2
  5  Touvron et al. "LLaMA: Open and..."           ⟳ checking   4/9
  6  Chowdhery et al. "PaLM: Scaling..."           ⟳ checking   2/9
  7  Hoffmann et al. "Training Compute-..."        ⟳ checking   0/9
  8  Ouyang et al. "Training language..."          ⟳ checking   0/9
  9  Bai et al. "Constitutional AI:..."            ⏳ pending
 10  Raffel et al. "Exploring the Limits..."       ⏳ pending
 ...
 38  Kojima et al. "Large Language Models..."      ⏳ pending
─────────────────────────────────────────────────────────────────────────
 ████░░░░░░ 4/38    ✓ 4 verified                            0:12

 [enter] details  [Tab] activity  [q] quit

No queue screen — it drops you directly into the paper view. The progress bar and running counts update live. This feels immediate and purposeful. When it finishes, the status line updates and you can browse results or export.

Export modal

Pressing e on any screen:

         ┌─ Export ──────────────────────────────┐
         │                                       │
         │  Format:  [JSON]  CSV  Markdown  Text │
         │  Scope:   This paper / All papers     │
         │  Output:  ~/hallucinator-results.json │
         │                                       │
         │          [Export]    [Cancel]          │
         └───────────────────────────────────────┘

Minimal modal. Arrow keys or tab to move between options. Enter to confirm. Esc to cancel. The output path has a sensible default and is editable.

Help overlay

Pressing ? on any screen:

 ┌─ Keybindings ─────────────────────────────────────────────────┐
 │                                                               │
 │  Navigation                     Actions                      │
 │  ↑↓ j/k    move cursor          r    retry reference/paper   │
 │  Enter      drill in             R    retry all failed        │
 │  Esc        back                 e    export results          │
 │  g/G        top/bottom           s    cycle sort mode         │
 │  Ctrl+D/U   page down/up         f    cycle filter            │
 │  /          search/filter        a    add files (queue)       │
 │  n/N        next/prev match      d    remove paper (queue)    │
 │                                  y    copy to clipboard       │
 │  Global                                                       │
 │  Tab        toggle activity      ?    this help               │
 │  ,          config               Ctrl+C  cancel/force quit    │
 │  q          quit                                              │
 │                                                               │
 │                                             [?/Esc] close     │
 └───────────────────────────────────────────────────────────────┘

Semi-transparent overlay on top of whatever screen is active. The underlying screen is still visible (dimmed) so you maintain spatial context.

Decisions (resolved)

1. Notification on completion

Terminal bell by default. Works everywhere, zero config. Desktop notification via notify-send / platform equivalent available as opt-in flag (--notify). Don’t overthink this.
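A minimal sketch of that policy (the `notify_done` helper is illustrative, not the actual crate API):

```rust
use std::io::{self, Write};
use std::process::Command;

/// Completion notification: terminal bell always, desktop notification
/// only when the user opted in (e.g. via a --notify flag).
fn notify_done(desktop: bool, summary: &str) -> io::Result<()> {
    // BEL works in effectively every terminal, zero configuration.
    let mut out = io::stdout();
    out.write_all(b"\x07")?;
    out.flush()?;
    if desktop {
        // Best-effort: silently ignore failure if notify-send isn't installed.
        let _ = Command::new("notify-send").arg("hallucinator").arg(summary).spawn();
    }
    Ok(())
}

fn main() {
    notify_done(false, "queue finished: 50/50 papers").unwrap();
}
```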

2. Results persistence

Two distinct mechanisms:

Temp state (invisible infrastructure). In-progress and completed results write to ~/.cache/hallucinator/runs/<timestamp>/. This is crash safety — if the terminal dies, SSH drops, or the user hits Ctrl+C, the work isn’t lost. The TUI doesn’t expose this to the user. It just exists.
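The per-run directory layout might look like this (a sketch; the real code would respect `XDG_CACHE_HOME` and likely use a human-readable timestamp):

```rust
use std::fs;
use std::path::PathBuf;
use std::time::{SystemTime, UNIX_EPOCH};

/// Build and create the per-run temp state directory, e.g.
/// <cache>/hallucinator/runs/<timestamp>/.
fn run_state_dir(cache_root: PathBuf) -> std::io::Result<PathBuf> {
    let ts = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs();
    let dir = cache_root.join("hallucinator").join("runs").join(ts.to_string());
    fs::create_dir_all(&dir)?;
    Ok(dir)
}

fn main() {
    let dir = run_state_dir(std::env::temp_dir()).unwrap();
    // Each paper's results would be written here incrementally (e.g. one
    // JSON file per paper), so a crash loses at most the in-flight paper.
    println!("writing state under {}", dir.display());
}
```

Writing one file per paper as it completes is what makes the crash-safety property cheap: no global state file to corrupt mid-write.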

Export (deliberate user action). e key opens the export modal. User picks format (JSON, CSV, Markdown, plain text), scope (one paper or all), and destination. This produces the actual deliverable — the report they attach to AC notes or share with co-reviewers.

Resume (future). Not in v1. Eventually: hallucinator --resume reads from the temp state dir and picks up where it left off. The temp state format should be designed with this in mind even if we don’t build the resume path yet — don’t paint ourselves into a corner.

3. Reference text preview pane

Yes. Shown on the Paper screen (Screen 2) when terminal height >= 40 rows. Located below the reference list, separated by a horizontal rule. Shows the raw reference text as extracted from the PDF for the currently-selected reference.

Updates in real time as the cursor moves through the reference list (file-manager-style preview). This is the expected behavior, and the rendering cost is trivial: it's just text reflow.

On terminals shorter than 40 rows, the preview is hidden. The user can still see the raw text by drilling into Screen 3.

 ...
  4  Smith & Jones "Recursive Self-Imp..."  ✗ not found   —
> 5  Zhang et al. "Emergent Abilities..."   ⚠ mismatch   DBLP
  6  Chen et al. "Evaluating Large..."      ✓ verified   arXiv
 ...
─────────────────────────────────────────────────────────────────
 [5] Zhang, W., Wei, J., and Chen, L., "Emergent Abilities of
 Large Language Models," in Proceedings of the International
 Conference on Machine Learning (ICML), 2023, pp. 4812-4830.
─────────────────────────────────────────────────────────────────

This earns its space. When the extracted title looks wrong (mangled by hyphenation, ligature issues, or a bad parse), you see it instantly without an extra keypress.
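The height gate is plain arithmetic (a sketch; the actual layout code would go through ratatui's `Layout`, and the constants here are assumptions):

```rust
/// Split the paper screen vertically into (list_rows, preview_rows).
/// The preview pane is hidden entirely below 40 terminal rows.
fn split_paper_screen(term_rows: u16) -> (u16, u16) {
    const MIN_ROWS_FOR_PREVIEW: u16 = 40;
    const PREVIEW_ROWS: u16 = 5; // rule + ~3 lines of wrapped raw text + rule
    if term_rows >= MIN_ROWS_FOR_PREVIEW {
        (term_rows - PREVIEW_ROWS, PREVIEW_ROWS)
    } else {
        (term_rows, 0)
    }
}

fn main() {
    assert_eq!(split_paper_screen(50), (45, 5));
    assert_eq!(split_paper_screen(39), (39, 0)); // short terminal: no preview
}
```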

4. Color themes

Two themes, toggled via --theme=green or --theme=modern. No theming framework, no theme.toml. Just two palette structs.

green (default): Dark background, green/cyan primary text. Terminal hacker aesthetic. The one that makes people at poster sessions say “what is that.” Verdict colors as specified in the Color section above.

modern: Dark background, white primary text, electric blue accents. Cleaner, more subdued. For people who think the green is too much, or for screenshots in formal reports where neon green looks unserious.

Both palettes follow the same rules: verdict colors stay semantically consistent (green=verified, red=not found, etc.), only the chrome and accent colors differ.
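"Just two palette structs" might look like this (field and variant names are illustrative, not the crate's actual types):

```rust
/// Minimal color model for the sketch; the real code would use
/// ratatui's Color type.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Color { Green, Cyan, White, Blue, Red, Yellow }

struct Palette {
    primary: Color,   // chrome / body text — differs per theme
    accent: Color,    // highlights, selected row — differs per theme
    verified: Color,  // verdict colors: identical across themes
    not_found: Color,
    mismatch: Color,
}

const GREEN: Palette = Palette {
    primary: Color::Green,
    accent: Color::Cyan,
    verified: Color::Green,
    not_found: Color::Red,
    mismatch: Color::Yellow,
};

const MODERN: Palette = Palette {
    primary: Color::White,
    accent: Color::Blue,
    verified: Color::Green, // verdict semantics shared with the green theme
    not_found: Color::Red,
    mismatch: Color::Yellow,
};

fn main() {
    // Verdict colors must agree between themes; only the chrome differs.
    assert_eq!(GREEN.verified, MODERN.verified);
    assert_ne!(GREEN.primary, MODERN.primary);
}
```

Keeping the verdict fields byte-identical across both constants is what enforces the "only chrome differs" rule without any framework.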

5. Inline retry feedback

Both spinner and text. The verdict cell shows an animated spinner character cycling through frames (◜ ◝ ◞ ◟) followed by static “retrying” text:

  4  Smith & Jones "Recursive Self-..."    ◝ retrying    —

The spinner provides motion (“something is happening”) while the text provides meaning (“what is happening”). Consistent with how the ⟳ checking state already works during initial analysis — just a different animation to distinguish retry from first pass.

When the retry completes, the cell snaps to the new verdict. No transition animation — just the immediate update. The change in color (from cyan retrying to green/red/yellow result) is transition enough.
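Frame selection can be a pure function of the UI tick counter, so every retrying row animates in sync (a sketch; `retry_cell` is a hypothetical name):

```rust
/// Spinner frames for the "retrying" verdict cell.
const RETRY_FRAMES: [char; 4] = ['◜', '◝', '◞', '◟'];

/// Render the retrying cell for a given global tick count. Deriving the
/// frame from the tick (rather than per-row state) keeps all spinners
/// in lockstep and needs no per-reference bookkeeping.
fn retry_cell(tick: u64) -> String {
    format!("{} retrying", RETRY_FRAMES[(tick % 4) as usize])
}

fn main() {
    assert_eq!(retry_cell(0), "◜ retrying");
    assert_eq!(retry_cell(5), "◝ retrying"); // wraps around every 4 ticks
}
```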

6. Phase 4 decisions

File picker screen. When launched with no PDF arguments, the banner dismisses into an interactive file picker instead of an empty queue. The file picker is a custom implementation using std::fs::read_dir — no external dependency (ratatui-explorer was considered but rejected to keep the dependency tree small). Directories are navigable, .pdf files are togglable with Space, Enter confirms selection and returns to the queue. The a key on the queue screen reopens the picker to add more files.
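The core of a `read_dir`-based picker is small (a sketch under assumed behavior: directories first, then `.pdf` files, everything else hidden; `picker_entries` is a hypothetical name):

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// List one directory for the file picker: navigable subdirectories and
/// togglable .pdf files, both sorted by name. Other files are filtered out.
fn picker_entries(dir: &Path) -> std::io::Result<(Vec<PathBuf>, Vec<PathBuf>)> {
    let mut dirs = Vec::new();
    let mut pdfs = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            dirs.push(path);
        } else if path.extension().map_or(false, |e| e.eq_ignore_ascii_case("pdf")) {
            pdfs.push(path);
        } // anything else is invisible to the picker
    }
    dirs.sort();
    pdfs.sort();
    Ok((dirs, pdfs))
}

fn main() {
    let tmp = std::env::temp_dir().join("hallucinator-picker-demo");
    fs::create_dir_all(tmp.join("subdir")).unwrap();
    fs::write(tmp.join("a.pdf"), b"").unwrap();
    fs::write(tmp.join("notes.txt"), b"").unwrap();
    let (dirs, pdfs) = picker_entries(&tmp).unwrap();
    assert_eq!(dirs.len(), 1);
    assert_eq!(pdfs.len(), 1); // notes.txt filtered out
}
```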

Manual start. PDFs load into the queue in Queued state but do not begin processing automatically. The user reviews the queue, adjusts configuration, then presses Space to start. A prominent [Space] Start indicator in the footer makes this discoverable. Processing is deferred via a BackendCommand channel — the backend listener receives a ProcessFiles command containing the file list, starting index, and a Config struct rebuilt from the current ConfigState. This means config edits made before pressing Space take effect.
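The deferred-start channel can be sketched like this (the `Config` fields are illustrative; the variant shape mirrors the description above):

```rust
use std::path::PathBuf;
use std::sync::mpsc;

/// Snapshot of the config screen state at the moment Space is pressed.
#[derive(Debug)]
struct Config { concurrency: usize }

/// Commands flowing from the UI to the backend listener.
#[derive(Debug)]
enum BackendCommand {
    ProcessFiles { files: Vec<PathBuf>, start_index: usize, config: Config },
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // User presses Space: rebuild Config from the current ConfigState,
    // then hand the whole batch to the backend in one command.
    tx.send(BackendCommand::ProcessFiles {
        files: vec![PathBuf::from("paper.pdf")],
        start_index: 0,
        config: Config { concurrency: 4 },
    }).unwrap();
    // Backend listener side: processing starts only on receipt.
    match rx.recv().unwrap() {
        BackendCommand::ProcessFiles { files, config, .. } => {
            println!("processing {} files at concurrency {}", files.len(), config.concurrency);
        }
    }
}
```

Because the `Config` is rebuilt and sent with the command, edits made on the config screen before pressing Space are picked up for free.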

Concurrency configurable from config screen. The config screen (, key) is now fully interactive. Tab cycles between sections (API Keys, Databases, Concurrency, Display). j/k navigates items within a section. Enter opens a text-editing mode for numeric and string fields. Space toggles database checkboxes. Cursor is clamped to valid bounds per section. Config values are populated from CLI flags, environment variables, and defaults — not hardcoded.

Activity panel shown by default. The activity panel (right sidebar) is now visible on launch instead of hidden. It shows database health with query counts and average response times, a throughput sparkline with refs/sec rate, active query list (which references are currently being checked), and a summary of total completed references. Throughput data is fed by a tick-based bucketing system (every ~1 second).
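The tick-based bucketing behind the sparkline can be sketched with a fixed-capacity deque (names are illustrative):

```rust
use std::collections::VecDeque;

/// Each ~1s tick pushes one bucket holding the number of references
/// completed during that second; the sparkline renders the last N buckets.
struct Throughput {
    buckets: VecDeque<u32>,
    cap: usize,
}

impl Throughput {
    fn new(cap: usize) -> Self {
        Throughput { buckets: VecDeque::new(), cap }
    }
    /// Called once per tick with the count completed since the last tick.
    fn push(&mut self, completed: u32) {
        if self.buckets.len() == self.cap {
            self.buckets.pop_front(); // sliding window
        }
        self.buckets.push_back(completed);
    }
    /// Average refs/min over the window (buckets are per-second).
    fn refs_per_min(&self) -> f64 {
        if self.buckets.is_empty() { return 0.0; }
        let total: u32 = self.buckets.iter().sum();
        total as f64 * 60.0 / self.buckets.len() as f64
    }
    /// Scale each bucket against the window max onto the glyphs ▁..█.
    fn sparkline(&self) -> String {
        const BARS: [char; 8] = ['▁', '▂', '▃', '▄', '▅', '▆', '▇', '█'];
        let max = self.buckets.iter().copied().max().unwrap_or(1).max(1);
        self.buckets.iter()
            .map(|&v| BARS[(v as usize * 7) / max as usize])
            .collect()
    }
}

fn main() {
    let mut t = Throughput::new(16);
    for v in [0, 1, 3, 7] { t.push(v); }
    println!("{}  avg: {:.1}/min", t.sparkline(), t.refs_per_min());
}
```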

Mouse click support. Clicking a row in the queue or paper table selects it. The rendered table area is stored after each draw, and click coordinates are mapped to table rows accounting for borders and headers. Double-click (same row within 500ms) drills in.
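The click-to-row mapping is offset arithmetic over the stored table area (a sketch; border and header row counts are assumptions, and the real code would use ratatui's `Rect`):

```rust
/// Rectangle of the rendered table, stored after each draw.
struct Rect { x: u16, y: u16, width: u16, height: u16 }

/// Map a terminal-cell click to an index into the paper/queue list,
/// accounting for the top border, header rows, and scroll offset.
fn clicked_row(area: &Rect, click_x: u16, click_y: u16, scroll_offset: usize) -> Option<usize> {
    const BORDER: u16 = 1;      // top border of the table block
    const HEADER_ROWS: u16 = 2; // column titles + separator rule
    let first_data_row = area.y + BORDER + HEADER_ROWS;
    let last_row = area.y + area.height.saturating_sub(1);
    if click_x < area.x || click_x >= area.x + area.width {
        return None; // outside the table horizontally
    }
    if click_y < first_data_row || click_y > last_row {
        return None; // on the border/header, or below the table
    }
    Some(scroll_offset + (click_y - first_data_row) as usize)
}

fn main() {
    let area = Rect { x: 0, y: 2, width: 80, height: 20 };
    assert_eq!(clicked_row(&area, 10, 5, 0), Some(0)); // first visible row
    assert_eq!(clicked_row(&area, 10, 8, 3), Some(6)); // scrolled by 3
    assert_eq!(clicked_row(&area, 10, 3, 0), None);    // header, not a row
}
```

Double-click detection then only needs to compare the returned index and a timestamp against the previous click (same row within 500ms drills in).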

False positive marking remains a non-goal. Per the original design (section “Explicit non-goals”), false-positive toggling is deferred. The TUI is an analysis tool, not an annotation tool.