Introduction
Hallucinator is a tool that validates academic citations against multiple databases to detect fabricated references, retracted papers, and author mismatches.
See the project README, MANIFESTO, and Rust README on GitHub for project context.
Use the sidebar to navigate the documentation.
System Architecture Overview
Hallucinator is a multi-crate Rust workspace that validates academic references extracted from PDFs against 10+ academic databases. This document covers the high-level architecture, key design decisions, and how the pieces fit together.
Workspace Structure
The workspace lives in hallucinator-rs/ and contains 10 member crates plus 2 excluded crates:
hallucinator-rs/
├── crates/
│ ├── hallucinator-core # Validation engine, DB backends, caching, rate limiting
│ ├── hallucinator-parsing # Reference parsing pipeline (backend-agnostic)
│ ├── hallucinator-pdf-mupdf # MuPDF backend (AGPL-licensed)
│ ├── hallucinator-bbl # BibTeX .bbl/.bib file parsing
│ ├── hallucinator-ingest # Unified file dispatch + archive handling
│ ├── hallucinator-dblp # DBLP offline database (RDF → SQLite FTS5)
│ ├── hallucinator-acl # ACL Anthology offline database
│ ├── hallucinator-reporting # Export formats (JSON, CSV, Markdown, HTML, Text)
│ ├── hallucinator-cli # CLI binary
│ ├── hallucinator-tui # TUI binary (Ratatui)
│ ├── hallucinator-python # PyO3 Python bindings (excluded from workspace)
│ └── hallucinator-web # Axum web server (excluded from workspace)
Only hallucinator-cli and hallucinator-tui are distributed as release binaries. The Python and web crates are excluded from the workspace to avoid CI complications (pyo3 version conflicts, unnecessary axum compilation during dist builds).
Crate Dependency Graph
┌──────────────┐
│ CLI / TUI │
└──────┬───────┘
│
┌──────▼───────┐
│ ingest │──────────────┐
└──────┬───────┘ │
│ │
┌────────────▼────────────┐ ┌─────▼──────┐
│ core │ │ parsing │
│ (validation, DB, cache)│ │ (extract) │
└────┬───────┬───────┬────┘ └─────┬───────┘
│ │ │ │
┌────▼──┐ ┌──▼───┐ ┌▼────────┐ ┌───▼────────┐
│ dblp │ │ acl │ │reporting│ │ pdf-mupdf │
└───────┘ └──────┘ └─────────┘ │ (AGPL) │
└────────────┘
AGPL Isolation
The MuPDF library is licensed under AGPL. To keep the rest of the codebase under a permissive license:
- hallucinator-core defines the PdfBackend trait (permissive license)
- hallucinator-pdf-mupdf implements PdfBackend using MuPDF (AGPL)
- Only the final binaries (CLI/TUI) link to the AGPL crate
- Library consumers (hallucinator-core, hallucinator-python) never depend on hallucinator-pdf-mupdf directly

This means the core validation logic remains AGPL-free. Alternative PDF backends (e.g., pdf-extract, pdfium) can be added by implementing the PdfBackend trait.
Key Traits
DatabaseBackend
Defined in hallucinator-core/src/db/mod.rs. Every academic database implements this trait:
```rust
pub trait DatabaseBackend: Send + Sync {
    fn name(&self) -> &str;
    fn is_local(&self) -> bool { false }
    fn requires_doi(&self) -> bool { false }
    fn query(&self, title: &str, client: &reqwest::Client, timeout: Duration)
        -> Pin<Box<dyn Future<Output = Result<DbQueryResult, DbQueryError>> + Send>>;
    fn query_doi(&self, doi: &str, title: &str, authors: &[String],
        client: &reqwest::Client, timeout: Duration) -> DoiQueryResult;
}
```
Local backends (is_local() = true) are queried inline by the coordinator before fanning out to remote drainers. See Concurrency Model for details.
PdfBackend
Defined in hallucinator-core. Abstracts PDF text extraction:
```rust
pub trait PdfBackend {
    fn extract_text(&self, path: &Path) -> Result<String, String>;
}
```
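As a sketch of how an alternative backend could plug in, here is a hypothetical PlainTextBackend that implements the trait by treating the input file as plain UTF-8 text. The trait is copied locally so the snippet compiles on its own; a real alternative backend (pdf-extract, pdfium) would parse actual PDF structure instead.

```rust
use std::path::Path;

// The PdfBackend trait as shown above, copied locally so this sketch
// compiles standalone.
pub trait PdfBackend {
    fn extract_text(&self, path: &Path) -> Result<String, String>;
}

// Hypothetical backend: reads the file as plain UTF-8 text.
struct PlainTextBackend;

impl PdfBackend for PlainTextBackend {
    fn extract_text(&self, path: &Path) -> Result<String, String> {
        std::fs::read_to_string(path).map_err(|e| e.to_string())
    }
}

fn main() {
    let backend = PlainTextBackend;
    match backend.extract_text(Path::new("notes.txt")) {
        Ok(text) => println!("extracted {} bytes", text.len()),
        Err(e) => eprintln!("extraction failed: {e}"),
    }
}
```

Because the binaries only see `dyn PdfBackend` (or a generic bound), swapping the extraction engine never touches the validation logic.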
Configuration Layering
Configuration is resolved with the following precedence (highest wins):
- CLI flags — --num-workers 8, --dblp-offline /path, etc.
- Environment variables — OPENALEX_KEY, DB_TIMEOUT, SEARXNG_URL, etc.
- CWD config — .hallucinator.toml in the current directory
- Platform config — ~/.config/hallucinator/config.toml (Linux/macOS) or %APPDATA%\hallucinator\config.toml (Windows)
- Defaults — Hardcoded defaults (4 workers, 10s timeout, etc.)
CWD config overlays platform config field-by-field, so you can keep API keys in the global config and override concurrency settings per-project.
See Configuration for the full reference.
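The field-by-field overlay can be sketched as follows. The struct and field names here are illustrative, not the actual Config type: each field falls back to the lower-precedence layer only when the higher layer leaves it unset.

```rust
// Illustrative two-field config; the real Config has many more fields.
#[derive(Clone, Debug, Default, PartialEq)]
struct FileConfig {
    num_workers: Option<usize>,
    openalex_key: Option<String>,
}

// Overlay `over` (higher precedence) on top of `base`, field by field.
fn overlay(base: FileConfig, over: FileConfig) -> FileConfig {
    FileConfig {
        num_workers: over.num_workers.or(base.num_workers),
        openalex_key: over.openalex_key.or(base.openalex_key),
    }
}

fn main() {
    let platform = FileConfig { num_workers: Some(4), openalex_key: Some("key-from-global".into()) };
    let cwd = FileConfig { num_workers: Some(8), openalex_key: None };
    let merged = overlay(platform, cwd);
    assert_eq!(merged.num_workers, Some(8)); // per-project override wins
    assert_eq!(merged.openalex_key.as_deref(), Some("key-from-global")); // key kept from global
}
```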
Caching
A two-tier cache prevents redundant API calls:
- L1 (in-memory): DashMap — lock-free concurrent reads, sub-microsecond lookups
- L2 (optional SQLite): WAL-mode database for persistence across runs
Cache keys use aggressive title normalization (Unicode NFKD, Greek letter transliteration, math symbol replacement, ASCII-only lowercasing) to maximize hit rates across PDF extraction artifacts.
TTLs: 7 days for positive (found) entries, 24 hours for negative (not-found) entries. Both are configurable.
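The TTL rule can be sketched as a small freshness check. The types here are illustrative, not the actual cache internals:

```rust
use std::time::{Duration, Instant};

// Illustrative cache entry: `found` distinguishes positive from negative results.
struct CacheEntry {
    found: bool,
    inserted: Instant,
}

// Positive entries live 7 days, negative entries 24 hours (both configurable
// in the real cache; the defaults are hardcoded here for the sketch).
fn is_fresh(entry: &CacheEntry, now: Instant) -> bool {
    let ttl = if entry.found {
        Duration::from_secs(7 * 24 * 3600)
    } else {
        Duration::from_secs(24 * 3600)
    };
    now.duration_since(entry.inserted) < ttl
}

fn main() {
    let t0 = Instant::now();
    let hit = CacheEntry { found: true, inserted: t0 };
    let miss = CacheEntry { found: false, inserted: t0 };
    let two_days_later = t0 + Duration::from_secs(2 * 24 * 3600);
    assert!(is_fresh(&hit, two_days_later));   // positive entry still fresh
    assert!(!is_fresh(&miss, two_days_later)); // negative entry expired
}
```

The shorter negative TTL means a paper that was indexed after the last run gets rechecked within a day.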
See Concurrency Model for how the cache interacts with the drainer pool.
Rate Limiting
Each remote database has its own AdaptiveDbLimiter using the governor crate for token-bucket rate limiting:
- Per-DB drainer task — Each drainer is the sole consumer of its DB’s rate limiter, eliminating governor contention
- Adaptive backoff — On HTTP 429: doubles the slowdown factor (1x → 2x → 4x → … → 16x max), atomically swaps the governor via ArcSwap
- Recovery — After 30 seconds without a 429, the original rate is restored
- Default rates — CrossRef 1/s (3/s with crossref_mailto), arXiv 3/s, DBLP 1/s, Semantic Scholar varies by API key presence
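A minimal sketch of the backoff policy, using a simplified state struct rather than the real ArcSwap-swapped governor:

```rust
use std::time::{Duration, Instant};

// Hypothetical simplified backoff state; the real AdaptiveDbLimiter swaps a
// `governor` rate limiter atomically instead of tracking a bare factor.
struct Backoff {
    factor: u32,
    last_429: Option<Instant>,
}

impl Backoff {
    fn new() -> Self {
        Backoff { factor: 1, last_429: None }
    }

    // On HTTP 429: double the slowdown factor, capped at 16x.
    fn on_rate_limited(&mut self, now: Instant) {
        self.factor = (self.factor * 2).min(16);
        self.last_429 = Some(now);
    }

    // After 30 seconds without a 429, restore the original rate.
    fn effective_factor(&mut self, now: Instant) -> u32 {
        if let Some(t) = self.last_429 {
            if now.duration_since(t) >= Duration::from_secs(30) {
                self.factor = 1;
                self.last_429 = None;
            }
        }
        self.factor
    }
}

fn main() {
    let mut b = Backoff::new();
    let t0 = Instant::now();
    b.on_rate_limited(t0); // 1x -> 2x
    b.on_rate_limited(t0); // 2x -> 4x
    assert_eq!(b.effective_factor(t0 + Duration::from_secs(1)), 4);
    assert_eq!(b.effective_factor(t0 + Duration::from_secs(31)), 1); // recovered
}
```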
Title Matching
References are matched using fuzzy string comparison with a 95% similarity threshold (via rapidfuzz). Before comparison, titles are normalized:
- HTML entity unescaping
- Separated diacritic fixing (e.g., B ¨UNZ → BÜNZ)
- Greek letter transliteration (α → alpha, β → beta)
- Math symbol replacement (√ → sqrt, ∞ → infinity)
- Unicode NFKD decomposition
- Strip to [a-z0-9] only
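A simplified sketch of the tail end of this pipeline. The real code also unescapes HTML entities, repairs separated diacritics, applies full NFKD, and covers many more Greek and math symbols; only a few substitutions are shown here:

```rust
// Illustrative normalization: a handful of symbol substitutions followed by
// lowercasing and stripping to [a-z0-9].
fn normalize_title(title: &str) -> String {
    let transliterated = title
        .replace('α', "alpha")
        .replace('β', "beta")
        .replace('√', "sqrt")
        .replace('∞', "infinity");
    transliterated
        .chars()
        .map(|c| c.to_ascii_lowercase())
        .filter(|c| c.is_ascii_alphanumeric())
        .collect()
}

fn main() {
    assert_eq!(normalize_title("The α-Method: A Survey!"), "thealphamethodasurvey");
    assert_eq!(normalize_title("O(√n) Algorithms"), "osqrtnalgorithms");
}
```

Aggressive normalization like this lets a title mangled by PDF extraction still match the same cache key and fuzzy-compare cleanly.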
Author Validation
Two modes based on the quality of extracted author names:
- Full mode — Normalizes each author to FirstInitial Surname, then checks set intersection between PDF authors and DB authors
- Last-name-only mode — Used when >50% of reference authors lack first names/initials; compares surnames only, with partial suffix matching for multi-word surnames
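Full mode can be sketched roughly as follows. The name parsing here is deliberately simplistic compared to the real implementation, which handles many more name formats:

```rust
use std::collections::HashSet;

// Normalize a name to "first-initial surname", lowercased.
// e.g. "Ada Lovelace" and "A. Lovelace" both become "a lovelace".
fn normalize_author(name: &str) -> Option<String> {
    let parts: Vec<&str> = name.split_whitespace().collect();
    let first = parts.first()?;
    let last = parts.last()?;
    if parts.len() == 1 {
        return Some(last.to_lowercase());
    }
    let initial = first.chars().next()?.to_lowercase().to_string();
    Some(format!("{initial} {}", last.to_lowercase()))
}

// Full-mode check: any overlap between the normalized author sets counts.
fn authors_overlap(pdf: &[&str], db: &[&str]) -> bool {
    let a: HashSet<String> = pdf.iter().filter_map(|n| normalize_author(n)).collect();
    let b: HashSet<String> = db.iter().filter_map(|n| normalize_author(n)).collect();
    !a.is_disjoint(&b)
}

fn main() {
    assert!(authors_overlap(&["Ada Lovelace"], &["A. Lovelace"]));
    assert!(!authors_overlap(&["Ada Lovelace"], &["C. Babbage"]));
}
```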
Entry Points
All interfaces consume the same hallucinator-core library:
| Interface | Crate | Description |
|---|---|---|
| CLI | hallucinator-cli | Single-file checking with colored terminal output |
| TUI | hallucinator-tui | Batch processing with Ratatui, result navigation, false-positive overrides |
| Web | hallucinator-web | Axum HTTP server with SSE streaming (excluded from workspace) |
| Python | hallucinator-python | PyO3 bindings with pre-compiled wheels (excluded from workspace) |
| Library | hallucinator-core | Direct Rust API via check_references() |
The core check_references() function signature:
```rust
pub async fn check_references(
    refs: Vec<Reference>,
    config: Config,
    progress: impl Fn(ProgressEvent) + Send + Sync + 'static,
    cancel: CancellationToken,
) -> Vec<ValidationResult>
```
Data Flow: PDF to Results
This document traces a reference’s journey from PDF file to final validation result.
Pipeline Overview
PDF file
│
▼
┌─────────────────┐
│ File Dispatch │ hallucinator-ingest
│ (PDF/BBL/BIB/ │ Detects file type, extracts from archives
│ archive) │
└────────┬────────┘
│
▼
┌─────────────────┐
│ Text Extraction │ hallucinator-parsing + hallucinator-pdf-mupdf
│ (PdfBackend) │ MuPDF extracts raw text with ligature expansion
└────────┬────────┘
│
▼
┌─────────────────┐
│ Section Detection│ hallucinator-parsing/src/section.rs
│ │ Locates "References" / "Bibliography" header
└────────┬────────┘
│
▼
┌─────────────────┐
│ Segmentation │ hallucinator-parsing/src/section.rs
│ │ Splits section into individual references
└────────┬────────┘
│
▼
┌─────────────────┐
│ Title/Author │ hallucinator-parsing/src/title.rs, authors.rs
│ Extraction │ Parses title, authors, DOI, arXiv ID per ref
└────────┬────────┘
│
▼
┌─────────────────┐
│ Skip Filtering │ hallucinator-parsing/src/extractor.rs
│ │ Removes URL-only and short-title refs
└────────┬────────┘
│
▼
┌─────────────────┐
│ Validation │ hallucinator-core (pool, orchestrator, db/*)
│ Pool │ Concurrent DB queries with early exit
└────────┬────────┘
│
▼
┌─────────────────┐
│ Result Assembly │ hallucinator-core/src/pool.rs
│ │ Merge local+remote results, retraction check
└────────┬────────┘
│
▼
Vec<ValidationResult>
Stage 1: File Dispatch
Crate: hallucinator-ingest
The ingest crate handles file type detection and archive extraction:
- PDF files — Passed to the PDF extraction pipeline
- BBL/BIB files — Parsed by hallucinator-bbl (LaTeX bibliography entries)
- Archives (.tar.gz, .zip) — Extracted streaming via ArchiveIterator; each contained PDF is processed independently
- Size limits — Configurable max_archive_size_mb to prevent resource exhaustion
Stage 2: Text Extraction
Crate: hallucinator-parsing + hallucinator-pdf-mupdf
The PdfBackend trait abstracts text extraction. The MuPDF backend:
- Opens the PDF and iterates page-by-page
- Extracts raw text blocks
- Expands ligatures (ﬁ → fi, ﬂ → fl, ﬀ → ff, etc.)
- Fixes hyphenation — distinguishes syllable breaks from compound words using a suffix heuristic
Stage 3: Section Detection
File: hallucinator-parsing/src/section.rs
Locates the references section by scanning for header patterns:
- Primary: References, Bibliography, REFERENCES, BIBLIOGRAPHY
- End markers: Appendix, Acknowledgments, Supplementary, Author Contributions
If no header is found, the parser falls back to the last 30% of the document text.
The section text between the header and the first end-marker (or EOF) is extracted.
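A simplified sketch of the header search with the 30% fallback. The real code matches headers at line starts and also honors the end markers listed above; this version only shows the lookup order:

```rust
// Find the references section: search for the last occurrence of a known
// header, otherwise fall back to the last 30% of the text.
fn find_references_section(text: &str) -> &str {
    for header in ["References", "Bibliography", "REFERENCES", "BIBLIOGRAPHY"] {
        if let Some(pos) = text.rfind(header) {
            return &text[pos + header.len()..];
        }
    }
    // Fallback (byte offset; the real implementation is char-boundary aware).
    let start = text.len() - text.len() * 3 / 10;
    &text[start..]
}

fn main() {
    let doc = "Intro text...\nReferences\n[1] Some paper.";
    assert!(find_references_section(doc).contains("[1] Some paper."));
}
```

Using the last occurrence (`rfind`) matters because a paper's body often mentions the word "References" long before the actual section.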
Stage 4: Reference Segmentation
File: hallucinator-parsing/src/section.rs
Individual references are split using priority-ordered strategies:
| Priority | Strategy | Pattern | Example |
|---|---|---|---|
| 1 | IEEE | [1], [2], … | [1] A. Author, "Title..." |
| 2 | Numbered | 1., 2., … | 1. Author, Title... |
| 3 | ML author-based | Full names / initials | Author, A. B. (2023). Title... |
| 4 | Springer/Nature | Uppercase + (YYYY) | AUTHOR, A. Title. J. (2023) |
| 5 | Fallback | Double newline | Two blank lines between refs |
The system tries each strategy and picks the one that produces the most valid segments. For IEEE and numbered styles, a sequential check ensures numbering is contiguous.
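The IEEE strategy with its contiguity check can be sketched like this. It is a simplification: it assumes markers appear at line starts followed by a space, whereas the real segmenter is more tolerant:

```rust
// Split an IEEE-style references section on "[1] ", "[2] ", ... markers.
// Accepting only the next expected marker enforces contiguous numbering.
fn segment_ieee(section: &str) -> Option<Vec<String>> {
    let mut refs: Vec<String> = Vec::new();
    let mut expected = 1usize;
    let mut current = String::new();
    for line in section.lines() {
        let trimmed = line.trim();
        if trimmed.starts_with(&format!("[{expected}] ")) {
            if !current.is_empty() {
                refs.push(current.clone());
            }
            current = trimmed.to_string();
            expected += 1;
        } else if !current.is_empty() {
            // Continuation line of the current reference.
            current.push(' ');
            current.push_str(trimmed);
        }
    }
    if !current.is_empty() {
        refs.push(current);
    }
    // Require at least two segments before declaring the strategy a success.
    if refs.len() >= 2 { Some(refs) } else { None }
}

fn main() {
    let s = "[1] A. Author, Title One.\ncontinued line\n[2] B. Author, Title Two.";
    let refs = segment_ieee(s).unwrap();
    assert_eq!(refs.len(), 2);
}
```

Returning `None` when a strategy produces too few segments is what lets the caller fall through to the next strategy in the priority table.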
Stage 5: Title and Author Extraction
Files: hallucinator-parsing/src/title.rs, authors.rs, identifiers.rs
For each segmented reference:
- DOI extraction — Regex: /10\.\d+/[^\s]+/
- arXiv ID extraction — Regex for arXiv:YYMM.NNNNN patterns
- Title extraction — Two strategies tried in order:
  - Quoted strings (e.g., "Title Here")
  - Capitalized word sequences between author and venue patterns
- Author extraction — Format-specific parsing for IEEE, ACM, USENIX, AAAI, NeurIPS styles
- Em-dash handling — ——— means “same authors as previous reference”
Stage 6: Skip Filtering
File: hallucinator-parsing/src/extractor.rs
References are skipped (not validated) if:
- URL-only — The reference is just a URL to a non-academic site (GitHub, docs, etc.)
- Short title — Title has fewer than 5 words (prone to false matches), unless a DOI or arXiv ID is present
- No title — No title could be extracted
Skip statistics are tracked and reported: total_raw, url_only, short_title, no_title.
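The skip rules can be sketched as a small decision function; the parameter names are illustrative:

```rust
#[derive(Debug, PartialEq)]
enum SkipReason {
    UrlOnly,
    ShortTitle,
    NoTitle,
}

// Returns Some(reason) if the reference should be skipped, None if it
// should proceed to validation.
fn skip_reason(title: Option<&str>, has_doi_or_arxiv: bool, url_only: bool) -> Option<SkipReason> {
    if url_only {
        return Some(SkipReason::UrlOnly);
    }
    let title = match title {
        Some(t) => t,
        None => return Some(SkipReason::NoTitle),
    };
    // Short titles are prone to false matches, unless an identifier pins them down.
    if title.split_whitespace().count() < 5 && !has_doi_or_arxiv {
        return Some(SkipReason::ShortTitle);
    }
    None
}

fn main() {
    assert_eq!(skip_reason(Some("Attention Is All You Need"), false, false), None);
    assert_eq!(skip_reason(Some("Deep Learning"), false, false), Some(SkipReason::ShortTitle));
    assert_eq!(skip_reason(Some("Deep Learning"), true, false), None); // DOI rescues it
    assert_eq!(skip_reason(None, false, false), Some(SkipReason::NoTitle));
}
```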
Stage 7: Validation
Crate: hallucinator-core (see Concurrency Model for the full deep dive)
Each reference goes through:
- Coordinator picks up reference from job queue
- Local DB query (DBLP offline, ACL offline) — inline, < 1ms
- If verified locally → skip all remote DBs, emit result immediately
- Cache pre-check — synchronously check cache for all remote DBs
- If verified from cache → skip all drainers
- Fan out cache-miss DBs to per-DB drainer queues
- Drainer queries DB — rate-limited HTTP call
- Author validation — compare PDF authors against DB authors
- Early exit — if any drainer verifies, others skip remaining work
Database Query Flow (per reference, per DB)
Drainer receives job
│
├─ Already verified? → skip
├─ Cancelled? → skip
├─ Requires DOI but ref has none? → skip
│
▼
Rate limit acquire (governor token)
│
▼
Cache check
├─ Cache hit → return cached result
│
▼
HTTP request (with timeout)
│
├─ Success + title found → author validation
│ ├─ Authors match → set verified flag
│ └─ Authors don't match → record mismatch
├─ Success + title not found → NoMatch
├─ 429 Rate Limited → adaptive backoff + retry
└─ Error/Timeout → record failure
│
▼
Cache insert (if successful)
│
▼
Decrement remaining counter
├─ Not last → done
└─ Last drainer → finalize result
Stage 8: Result Assembly
File: hallucinator-core/src/pool.rs (finalize_collector)
When the last drainer for a reference completes:
- Merge local and remote DbResult lists
- Determine status — Verified (any DB matched) > AuthorMismatch (title found, wrong authors) > NotFound
- SearxNG fallback — If still NotFound and SearxNG is configured, try web search as a last resort
- DOI info — Mark the DOI as valid/invalid based on the DOI backend result
- Retraction info — Use inline retraction data extracted from the CrossRef response (no extra API call)
- Emit events — ProgressEvent::Warning (if DBs timed out) + ProgressEvent::Result
- Send the result via a oneshot channel back to the caller
Output Types
The final Vec<ValidationResult> can be:
- Displayed in the CLI with colored output
- Navigated in the TUI with sorting/filtering
- Streamed via SSE in the web interface
- Exported to JSON/CSV/Markdown/Text/HTML via hallucinator-reporting
- Returned as Python objects via hallucinator-python
See Export Formats for output schema details.
Concurrency Model
Hallucinator’s validation engine is designed around a per-DB drainer pool architecture that maximizes throughput while respecting per-database rate limits. This document explains the concurrency primitives, task structure, and how they interact.
Design Goals
- Maximize parallelism — Check multiple references simultaneously
- Respect rate limits — Each database has its own rate limit; never exceed it
- Minimize latency — Return results as soon as a verified match is found
- Avoid contention — No shared rate limiter governor across tasks
Architecture Diagram
┌──────────────────┐
│ Job Queue │
│ (async_channel) │
└────────┬─────────┘
│
┌────────────────────┼────────────────────┐
│ │ │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ Coordinator │ │ Coordinator │ │ Coordinator │
│ Task 1 │ │ Task 2 │ │ Task N │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
│ Local DBs inline │ │
│ Cache pre-check │ │
│ │ │
└──────────┬─────────┘ │
│ Fan out (cache misses only) │
┌──────────────┼──────────────────────────┐ │
│ │ │ │ │
┌─────▼─────┐ ┌──────▼─────┐ ┌─────▼────┐ ┌───▼───▼──┐
│ CrossRef │ │ arXiv │ │ DBLP │ │ ... │
│ Drainer │ │ Drainer │ │ Drainer │ │ Drainers │
│ │ │ │ │ (online) │ │ │
│ Rate: 1/s │ │ Rate: 3/s │ │ Rate:1/s │ │ │
└───────────┘ └────────────┘ └──────────┘ └──────────┘
Task Types
Coordinator Tasks
- Count: Configurable via num_workers (default: 4)
- Role: Pick references from the shared job queue, run local DBs inline, pre-check cache, fan out to drainers
- Concurrency: Multiple coordinators run in parallel, each pulling from the same async_channel
A coordinator’s lifecycle for each reference:
- Receive RefJob from the job queue
- Emit ProgressEvent::Checking
- Query local databases inline (DBLP offline, ACL offline) — sub-millisecond
- If verified locally → emit result, skip remote phase
- Pre-check cache for all remote DBs (synchronous, prevents a race condition)
- If verified from cache → emit result, skip drainers
- Create RefCollector (the shared aggregation hub)
- Send a DrainerJob to each cache-miss DB’s drainer queue
Drainer Tasks
- Count: One per enabled remote database
- Role: Process DB queries sequentially at the database’s natural rate
- Rate limiting: Each drainer is the sole consumer of its DB’s AdaptiveDbLimiter
A drainer’s lifecycle for each job:
- Check early-exit conditions (cancelled, already verified, no DOI for DOI-requiring backend)
- Acquire rate limiter token
- Check cache (within the rate-limited query path)
- Execute HTTP query with timeout
- Validate authors if title found
- Update RefCollector state
- Decrement the remaining counter; if last, finalize the result
RefCollector
A per-reference aggregation hub, shared (via Arc) by all drainers working on that reference:
RefCollector
├── remaining: AtomicUsize # Drainers left to report
├── verified: AtomicBool # Early-exit flag
├── state: Mutex<AggState> # Aggregation (held briefly)
│ ├── verified_info
│ ├── first_mismatch
│ ├── failed_dbs
│ ├── db_results
│ └── retraction
└── result_tx: Mutex<Option<oneshot::Sender>>
The last drainer to decrement remaining to zero calls finalize_collector(), which builds the final ValidationResult and sends it on the oneshot channel.
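The countdown can be sketched with plain threads instead of async drainers. Whichever worker's `fetch_sub` observes 1 (i.e., brings the counter to zero) runs finalization exactly once:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Minimal sketch of the "last drainer finalizes" pattern.
fn main() {
    let remaining = Arc::new(AtomicUsize::new(4));
    let mut handles = Vec::new();
    for id in 0..4 {
        let remaining = Arc::clone(&remaining);
        handles.push(std::thread::spawn(move || {
            // ... per-drainer work for `id` would happen here ...
            // fetch_sub returns the previous value: exactly one thread sees 1.
            if remaining.fetch_sub(1, Ordering::AcqRel) == 1 {
                println!("drainer {id} was last: finalizing result");
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
}
```

AcqRel ordering makes each drainer's writes to the shared state visible to whichever drainer ends up finalizing.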
Concurrency Primitives
| Primitive | Purpose |
|---|---|
| async_channel::unbounded | Job queue (coordinators) and per-DB drainer queues |
| AtomicUsize + Ordering::AcqRel | remaining counter for lock-free drainer coordination |
| AtomicBool + Ordering::Release/Acquire | verified flag for early exit |
| Mutex<AggState> | Per-reference aggregation state (single mutex, held briefly) |
| tokio::sync::oneshot | Return channel for each reference’s result |
| CancellationToken | Graceful shutdown (Ctrl+C handler) |
| ArcSwap | Atomic governor swapping during adaptive rate limit backoff |
| DashMap | Lock-free concurrent L1 cache reads |
Cache Pre-Check: Preventing Race Conditions
A subtle race condition exists without the cache pre-check:
- Reference R is dispatched to CrossRef (drainer A) and arXiv (drainer B)
- Drainer A finishes first: CrossRef has a match → sets verified = true
- Drainer B sees verified = true → skips the arXiv query entirely
- arXiv’s result is never cached for reference R
This means future runs will always miss the arXiv cache for this title.
Solution: Before dispatching to any drainer, the coordinator synchronously checks the cache for all remote DBs. Cache hits are recorded in AggState.db_results, and only cache-miss DBs are dispatched to drainers. This ensures every DB’s cached result is always captured regardless of verification order.
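The partition step can be sketched like this, using a plain HashMap in place of the real two-tier cache:

```rust
use std::collections::HashMap;

// Split databases into cache hits (recorded up front) and cache misses
// (the only ones dispatched to drainers). `true` means the title was found.
fn partition<'a>(
    dbs: &[&'a str],
    cache: &HashMap<(&'a str, &'a str), bool>,
    title: &'a str,
) -> (Vec<(&'a str, bool)>, Vec<&'a str>) {
    let mut hits = Vec::new();
    let mut misses = Vec::new();
    for db in dbs {
        match cache.get(&(*db, title)) {
            Some(&found) => hits.push((*db, found)), // goes into AggState.db_results
            None => misses.push(*db),                // dispatched to that DB's drainer
        }
    }
    (hits, misses)
}

fn main() {
    let mut cache = HashMap::new();
    cache.insert(("CrossRef", "some title"), true);
    let (hits, misses) = partition(&["CrossRef", "arXiv"], &cache, "some title");
    assert_eq!(hits, vec![("CrossRef", true)]);
    assert_eq!(misses, vec!["arXiv"]);
}
```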
Early Exit
When a drainer verifies a reference:
- Sets collector.verified to true (atomic store with Release ordering)
- Other drainers check this flag before querying (Acquire ordering)
- Drainers that see verified = true emit a Skipped status and decrement remaining
This avoids unnecessary API calls once a match is found.
SearxNG Fallback
If a reference is NotFound after all remote DBs have been checked and SearxNG is configured:
- finalize_collector() runs a SearxNG web search as a last-resort fallback
- If SearxNG finds the title, the status upgrades from NotFound to Verified (source: “Web Search”)
- SearxNG results don’t undergo author validation (web search doesn’t return structured author data)
Shutdown Sequence
- User presses Ctrl+C → CancellationToken is cancelled
- job_tx (the job queue sender) is closed
- Coordinators drain remaining jobs, checking cancel.is_cancelled() at each iteration
- Drainers skip remaining jobs when cancelled
- When all coordinators finish, they drop their Arc<drainer_txs> clones
- Drainer channels close → drainers drain and exit
- Pool handle completes
Performance Characteristics
- Local DB queries: < 1ms (SQLite FTS5 lookups)
- Cache hits: Sub-microsecond (DashMap L1) to ~1ms (SQLite L2)
- Remote DB queries: 100ms–10s depending on database and network
- Throughput: Scales linearly with num_workers for CPU-bound coordination; drainer throughput is rate-limit-bound per DB
- Memory: One RefCollector per in-flight reference (small: a few KB each)
Crate Map
Quick reference for each crate in the workspace: its responsibility, key types, and dependencies.
hallucinator-core
Responsibility: The validation engine — database backends, caching, rate limiting, author matching, title normalization, and the main check_references() entry point.
Key types:
- Reference — Parsed reference (title, authors, DOI, arXiv ID, raw citation)
- ExtractionResult — References extracted from a document plus skip statistics
- ValidationResult — Complete result for one reference (status, source, per-DB results, retraction info)
- Config — Runtime configuration (API keys, timeouts, offline DBs, disabled DBs, rate limiters, cache)
- ProgressEvent — Events emitted during validation (Checking, Result, Warning, RetryPass, DatabaseQueryComplete, RateLimitWait)
- Status — Verified, NotFound, AuthorMismatch
- DbStatus — Match, NoMatch, AuthorMismatch, Timeout, RateLimited, Error, Skipped
- CheckStats — Summary counts (total, verified, not_found, author_mismatch, retracted, skipped)
- PdfBackend trait — Abstraction for PDF text extraction (moved here from hallucinator-parsing)
- DatabaseBackend trait — Interface for all database backends
- ValidationPool — Per-DB drainer pool for concurrent validation
- QueryCache — Two-tier (DashMap + optional SQLite) cache
- RateLimiters — Per-DB adaptive rate limiters
Key files:
- src/lib.rs — Public API and type exports
- src/pool.rs — ValidationPool, coordinators, drainers, RefCollector
- src/orchestrator.rs — Database query orchestration (local then remote)
- src/checker.rs — check_references() entry point
- src/db/mod.rs — DatabaseBackend trait and DbQueryResult
- src/db/*.rs — Individual database implementations
- src/cache.rs — Two-tier caching system
- src/rate_limit.rs — Adaptive per-DB rate limiting
- src/matching.rs — Title normalization and fuzzy matching
- src/authors.rs — Author name validation
- src/retraction.rs — Retraction checking
- src/config_file.rs — TOML configuration file loading and merging
Dependencies: reqwest, tokio, async-channel, governor, dashmap, arc-swap, rapidfuzz, serde, rusqlite
hallucinator-parsing
Responsibility: Reference parsing pipeline — backend-agnostic text extraction, reference section detection, segmentation into individual references, title/author/identifier extraction. (Renamed from hallucinator-pdf to better reflect its scope.)
Key types:
- ReferenceExtractor — Configurable extraction pipeline (formerly PdfExtractor)
- ParsingConfig — Custom regex patterns, thresholds, segment strategies (formerly PdfParsingConfig)
- ParsingError — Error type for parsing failures (formerly PdfError)
Key files:
- src/lib.rs — Public API and type exports
- src/extractor.rs — ReferenceExtractor pipeline orchestration
- src/section.rs — find_references_section(), segment_references()
- src/title.rs — extract_title_from_reference(), clean_title()
- src/authors.rs — extract_authors_from_reference()
- src/identifiers.rs — extract_doi(), extract_arxiv_id()
- src/hyphenation.rs — Hyphenation fixing
Dependencies: regex
Note: The PdfBackend trait now lives in hallucinator-core, not here.
hallucinator-pdf-mupdf
Responsibility: MuPDF implementation of the PdfBackend trait (defined in hallucinator-core). AGPL-licensed — isolated to keep other crates permissive.
Key types:
- MupdfBackend — Implements PdfBackend using the mupdf crate
Dependencies: mupdf, hallucinator-core
hallucinator-bbl
Responsibility: Parse BibTeX .bbl and .bib files into Reference structs.
Key functions:
- extract_references_from_bbl(path) — Parse .bbl files
- extract_references_from_bib(path) — Parse .bib files
Dependencies: hallucinator-core (for Reference, ExtractionResult)
hallucinator-ingest
Responsibility: Unified file dispatch — detects file type (PDF, BBL, BIB, archive) and routes to the appropriate extractor. Handles archive streaming with size limits.
Key functions:
- extract_references(path) — Dispatch to PDF or BBL/BIB extractor
- is_archive_path(path) — Check if path is a .tar.gz or .zip
Key types:
- ArchiveItem — Streaming archive extraction results (Pdf, Warning, Done)
Dependencies: hallucinator-parsing, hallucinator-pdf-mupdf, hallucinator-bbl, hallucinator-core, zip, tar, flate2, tempfile
hallucinator-dblp
Responsibility: Build and query an offline DBLP database. Downloads DBLP’s XML dump (~4.6GB compressed), parses it, and creates a SQLite database with FTS5 full-text search.
Key types:
- DblpDatabase — SQLite database handle with FTS5 search
- BuildProgress — Progress events during database building (Downloading, Parsing, RebuildingIndex, Compacting, Complete)
Key functions:
- DblpDatabase::open(path) — Open an existing database
- DblpDatabase::search(title) — FTS5 title search
- build_database(path, callback) — Download, parse, and build the database
Dependencies: rusqlite, reqwest, quick-xml, flate2
hallucinator-acl
Responsibility: Build and query an offline ACL Anthology database. Downloads ACL XML data from GitHub, parses it, and creates a SQLite FTS5 database.
Key types:
- AclDatabase — SQLite database handle
- BuildProgress — Progress events during building
Key functions:
- AclDatabase::open(path) — Open an existing database
- AclDatabase::search(title) — FTS5 title search
- build_database(path, callback) — Download and build the database
Dependencies: rusqlite, reqwest, quick-xml, tar, flate2
hallucinator-reporting
Responsibility: Export validation results to various formats.
Key types:
- ExportFormat — Json, Csv, Markdown, Text, Html
- ReportPaper — Per-paper metadata for export (filename, stats, results, verdict)
- ReportRef — Per-reference state for export (index, title, skip info, false-positive reason)
- FpReason — False-positive override reasons (BrokenParse, ExistsElsewhere, AllTimedOut, KnownGood, NonAcademic)
- PaperVerdict — Overall paper judgment (Safe, Questionable)
Key functions:
- export_results(papers, ref_states, format, path) — Write results to a file in the specified format
Dependencies: hallucinator-core
hallucinator-cli
Responsibility: Command-line binary for single-file reference checking.
Commands:
- check <file> — Check a PDF/BBL/BIB file (or archive)
- update-dblp <path> — Build/update the offline DBLP database
- update-acl <path> — Build/update the offline ACL database
Dependencies: hallucinator-core, hallucinator-ingest, hallucinator-dblp, hallucinator-acl, clap, owo-colors, indicatif, tokio
hallucinator-tui
Responsibility: Terminal UI for batch processing. Built with Ratatui. Supports multiple PDFs, result navigation, sorting/filtering, false-positive overrides, result persistence (JSON), and configurable themes.
Screens: Queue → Paper → Reference Detail → Config
Dependencies: hallucinator-core, hallucinator-ingest, hallucinator-reporting, ratatui, crossterm, tokio
See TUI Design Document for design details.
hallucinator-python (excluded)
Responsibility: PyO3 Python bindings providing PdfExtractor, Validator, ValidatorConfig, and result types. Pre-compiled wheels available for major platforms.
Excluded from workspace to avoid pyo3/Python version conflicts in CI.
See the Python Bindings page for an overview, or the full PYTHON_BINDINGS.md on GitHub.
hallucinator-web (excluded)
Responsibility: Axum web server with HTML UI and SSE streaming.
Endpoints:
- GET / — HTML interface
- POST /analyze/stream — SSE-streaming reference validation (multipart PDF upload)
- POST /retry — Recheck specific references
Excluded from workspace to avoid compiling axum/tower during dist builds (not distributed as a binary).
Getting Started
This guide covers installation and your first reference check across all available interfaces.
Choose Your Interface
| Interface | Best for | Install method |
|---|---|---|
| TUI | Batch processing, exploring results interactively | Pre-built binary or cargo install |
| CLI | Single-file checks, scripting, CI pipelines | Pre-built binary or cargo install |
| Python | Integration into existing Python workflows | pip install hallucinator |
| From source | Development, customization | cargo build --release |
Install Pre-built Binaries
Download the latest release for your platform from GitHub Releases. Both hallucinator-cli and hallucinator-tui binaries are included.
macOS / Linux
```sh
# Example: download and extract
tar xzf hallucinator-*-x86_64-unknown-linux-gnu.tar.gz
sudo mv hallucinator-cli hallucinator-tui /usr/local/bin/
```
Build from Source
```sh
cd hallucinator-rs
cargo build --release
# Binaries are in target/release/hallucinator-cli and target/release/hallucinator-tui
```
Install Python Bindings
Pre-compiled wheels are available for major platforms:
```sh
pip install hallucinator
```
Or build from source (requires Rust toolchain):
```sh
cd hallucinator-rs/crates/hallucinator-python
pip install maturin
maturin develop --release
```
See Python Bindings for the full API.
First Run: CLI
Check a single PDF:
```sh
hallucinator-cli check paper.pdf
```
The CLI will extract references, query databases, and print results with colored output. Each reference gets a verdict: Verified, Not Found, or Author Mismatch.
Useful Options
```sh
# Dry run — extract references without querying databases
hallucinator-cli check --dry-run paper.pdf

# Use offline DBLP for faster local lookups
hallucinator-cli check --dblp-offline dblp.db paper.pdf

# Save output to a file
hallucinator-cli check -o results.txt paper.pdf

# Check a .bbl or .bib file (LaTeX bibliography)
hallucinator-cli check references.bbl
```
First Run: TUI
Process multiple PDFs interactively:
```sh
hallucinator-tui paper1.pdf paper2.pdf *.pdf
```
The TUI opens with a queue of papers. Navigate with arrow keys:
- Enter — Open paper results
- Tab — Switch between panels
- q — Quit
- ? — Show help
See the Rust README for full key bindings.
First Run: Python
```python
from hallucinator import PdfExtractor, Validator, ValidatorConfig

# Extract references from a PDF
extractor = PdfExtractor()
result = extractor.extract("paper.pdf")
print(f"Found {len(result.references)} references")

# Validate references
config = ValidatorConfig()
validator = Validator(config)
results = validator.check(result.references)
for r in results:
    print(f"  [{r.status}] {r.title}")
```
See PYTHON_BINDINGS.md for the complete API.
Optional: API Keys
Some databases offer higher rate limits or additional features with API keys:
| Key | Environment Variable | Effect |
|---|---|---|
| OpenAlex | OPENALEX_KEY | Enables OpenAlex database (disabled without key) |
| Semantic Scholar | S2_API_KEY | Higher rate limit (100/s vs 1/s) |
| CrossRef mailto | CROSSREF_MAILTO | Polite pool: 3/s instead of 1/s |
Set them as environment variables or in your config file.
Optional: Offline Databases
For faster local lookups and reduced API dependence, build offline databases:
```sh
# DBLP (~4.6GB download, 20–30 minutes)
hallucinator-cli update-dblp dblp.db

# ACL Anthology (smaller, a few minutes)
hallucinator-cli update-acl acl.db
```
Then use them:
```sh
hallucinator-cli check --dblp-offline dblp.db --acl-offline acl.db paper.pdf
```
Or set paths in your config file for automatic detection. See Offline Databases for details.
Next Steps
- Configuration — All config options (CLI, env vars, TOML)
- Understanding Results — Interpreting what the output means
- Offline Databases — Setup and maintenance
- Export Formats — Saving results as JSON, CSV, Markdown, etc.
Configuration Reference
Hallucinator can be configured via CLI flags, environment variables, and TOML config files. This page documents all options.
Precedence
Configuration is resolved in this order (highest wins):
- CLI flags — --num-workers 8, --openalex-key KEY
- Environment variables — OPENALEX_KEY, DB_TIMEOUT
- CWD config — .hallucinator.toml in the current working directory
- Platform config — ~/.config/hallucinator/config.toml (Linux/macOS) or %APPDATA%\hallucinator\config.toml (Windows)
- Defaults
CWD config overlays platform config field-by-field. This lets you keep API keys in the global config and override settings per-project.
Config File Format
Both config file locations use the same TOML format:
```toml
[api_keys]
openalex_key = "your-openalex-key"
s2_api_key = "your-semantic-scholar-key"
crossref_mailto = "you@example.com"

[databases]
dblp_offline_path = "/path/to/dblp.db"
acl_offline_path = "/path/to/acl.db"
cache_path = "/path/to/cache.db"
searxng_url = "http://localhost:8080"
disabled = ["NeurIPS", "SSRN"]

[concurrency]
num_workers = 4
db_timeout_secs = 10
db_timeout_short_secs = 5
max_rate_limit_retries = 3
max_archive_size_mb = 500

[display]
theme = "hacker"
fps = 30
```
All fields are optional. Omitted fields use defaults.
Full Option Reference
API Keys
| Option | CLI Flag | Env Var | TOML Key | Description |
|---|---|---|---|---|
| OpenAlex key | --openalex-key KEY | OPENALEX_KEY | api_keys.openalex_key | Enables OpenAlex database queries |
| Semantic Scholar key | --s2-api-key KEY | S2_API_KEY | api_keys.s2_api_key | Higher S2 rate limit (100/s vs 1/s) |
| CrossRef mailto | — | CROSSREF_MAILTO | api_keys.crossref_mailto | CrossRef polite pool (3/s vs 1/s) |
Databases
| Option | CLI Flag | Env Var | TOML Key | Default |
|---|---|---|---|---|
| DBLP offline path | --dblp-offline PATH | DBLP_OFFLINE_PATH | databases.dblp_offline_path | None |
| ACL offline path | --acl-offline PATH | ACL_OFFLINE_PATH | databases.acl_offline_path | None |
| Cache path | --cache-path PATH | HALLUCINATOR_CACHE_PATH | databases.cache_path | None |
| SearxNG URL | --searxng (flag) | SEARXNG_URL | databases.searxng_url | http://localhost:8080 |
| Disabled DBs | --disable-dbs A,B | — | databases.disabled | [] |
Notes:
- `--searxng` is a boolean flag on the CLI. The actual URL comes from the env var or config file, defaulting to `http://localhost:8080`.
- `--disable-dbs` accepts a comma-separated list. Database names are case-sensitive: `CrossRef`, `arXiv`, `DBLP`, `Semantic Scholar`, `OpenAlex`, `Europe PMC`, `PubMed`, `ACL Anthology`, `NeurIPS`, `DOI`, `SSRN`, `Web Search`.
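Because names are matched case-sensitively, a typo like `crossref` silently disables nothing. A sketch of the kind of pre-flight check a wrapper script could do (the function is hypothetical; only the name list comes from the docs above):

```python
# Validate a user-supplied --disable-dbs value against the canonical,
# case-sensitive database names listed above.

CANONICAL_DBS = {
    "CrossRef", "arXiv", "DBLP", "Semantic Scholar", "OpenAlex",
    "Europe PMC", "PubMed", "ACL Anthology", "NeurIPS", "DOI",
    "SSRN", "Web Search",
}

def parse_disabled(arg: str) -> list[str]:
    """Split a comma-separated list and reject names that would be ignored."""
    names = [n.strip() for n in arg.split(",") if n.strip()]
    unknown = [n for n in names if n not in CANONICAL_DBS]
    if unknown:
        raise ValueError(f"unknown database name(s): {unknown}")
    return names

print(parse_disabled("NeurIPS,SSRN"))
```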
Concurrency
| Option | CLI Flag | Env Var | TOML Key | Default |
|---|---|---|---|---|
| Worker count | --num-workers N | — | concurrency.num_workers | 4 |
| DB timeout | — | DB_TIMEOUT | concurrency.db_timeout_secs | 10 |
| Short timeout | — | DB_TIMEOUT_SHORT | concurrency.db_timeout_short_secs | 5 |
| Max 429 retries | --max-rate-limit-retries N | — | concurrency.max_rate_limit_retries | 3 |
| Max archive size | — | — | concurrency.max_archive_size_mb | 500 |
Display (TUI only)
| Option | TOML Key | Default | Values |
|---|---|---|---|
| Theme | display.theme | hacker | hacker, modern, gnr |
| FPS | display.fps | 30 | 1–120 |
Other CLI Flags
| Flag | Description |
|---|---|
| --no-color | Disable colored output |
| -o, --output PATH | Write results to file |
| --dry-run | Extract and print references without querying databases |
| --check-openalex-authors | Flag author mismatches from OpenAlex (skipped by default) |
| --clear-cache | Clear the entire query cache and exit |
| --clear-not-found | Clear only not-found entries from cache and exit |
| --config PATH | Path to config file (overrides auto-detection) |
| --log PATH | Write tracing/debug logs to file |
CLI Commands
hallucinator-cli check <file> # Check a PDF, BBL, or BIB file
hallucinator-cli update-dblp <path> # Download and build offline DBLP database
hallucinator-cli update-acl <path> # Download and build offline ACL database
Cache Configuration
The query cache stores database responses to avoid redundant API calls across runs.
- Positive TTL (found entries): 7 days
- Negative TTL (not-found entries): 24 hours
- Storage: SQLite with WAL mode + in-memory DashMap
To enable caching, set cache_path in your config or use --cache-path:
hallucinator-cli check --cache-path ~/.hallucinator/cache.db paper.pdf
Cache maintenance:
# Clear everything
hallucinator-cli check --cache-path ~/.hallucinator/cache.db --clear-cache
# Clear only not-found entries (useful after DB outages)
hallucinator-cli check --cache-path ~/.hallucinator/cache.db --clear-not-found
Auto-detection
The TUI and CLI auto-detect offline database paths from well-known locations on your system. If you place dblp.db or acl.db in your platform config directory (~/.config/hallucinator/ on Linux/macOS), they may be found automatically. Explicit paths in the config file or CLI flags always take precedence.
Example Configurations
Minimal (API keys only)
[api_keys]
crossref_mailto = "researcher@university.edu"
Full Setup
[api_keys]
openalex_key = "your-key"
s2_api_key = "your-key"
crossref_mailto = "researcher@university.edu"
[databases]
dblp_offline_path = "~/.hallucinator/dblp.db"
acl_offline_path = "~/.hallucinator/acl.db"
cache_path = "~/.hallucinator/cache.db"
[concurrency]
num_workers = 8
db_timeout_secs = 15
[display]
theme = "modern"
CI / Scripting
[databases]
cache_path = "/tmp/hallucinator-cache.db"
disabled = ["OpenAlex", "NeurIPS", "SSRN"]
[concurrency]
num_workers = 2
db_timeout_secs = 5
max_rate_limit_retries = 1
Offline Databases
Hallucinator supports offline copies of DBLP and ACL Anthology for faster lookups, reduced API dependence, and better reliability. Offline databases are queried inline by the coordinator task (< 1ms), before any remote API calls.
Why Use Offline Databases?
- Speed — SQLite FTS5 lookups complete in under 1ms vs. 100ms–5s for HTTP APIs
- Reliability — No network dependency, no rate limiting, no timeouts
- Early exit — If a reference is found locally, all remote DB queries are skipped
- API savings — Fewer remote calls means you stay within rate limits and API quotas
The tradeoff is disk space and a one-time build step.
DBLP Offline
What It Contains
The DBLP database indexes publications, authors, and URLs from dblp.org. This covers computer science conferences and journals comprehensively — over 7 million publications.
Building
hallucinator-cli update-dblp /path/to/dblp.db
This will:
- Download `dblp.xml.gz` from dblp.org (~4.6GB compressed, ~16GB uncompressed)
- Parse the XML to extract publications, authors, and URLs
- Build a SQLite database with FTS5 full-text search index
- Compact the database with VACUUM
Time: 20–30 minutes on a modern machine (mostly download + parse time)
Disk space: ~2–3GB for the final SQLite database
The build process supports conditional download — if the database already exists and the server reports the file hasn’t changed (304 Not Modified), the download is skipped.
Using
# CLI flag
hallucinator-cli check --dblp-offline /path/to/dblp.db paper.pdf
# Or set in config file
# [databases]
# dblp_offline_path = "/path/to/dblp.db"
# Or environment variable
DBLP_OFFLINE_PATH=/path/to/dblp.db hallucinator-cli check paper.pdf
Staleness Warning
If the database is older than 30 days, a warning is printed. To refresh:
hallucinator-cli update-dblp /path/to/dblp.db
The update is incremental via conditional HTTP (ETag/If-Modified-Since), so if the upstream data hasn’t changed, it completes instantly.
ACL Anthology Offline
What It Contains
The ACL Anthology database indexes papers from computational linguistics and NLP venues (ACL, EMNLP, NAACL, EACL, CoNLL, etc.) — tens of thousands of publications.
Building
hallucinator-cli update-acl /path/to/acl.db
This will:
- Download the ACL Anthology XML data from GitHub
- Extract and parse XML files
- Build a SQLite database with FTS5 full-text search index
Time: A few minutes (much smaller than DBLP)
Disk space: ~50–100MB for the final database
The build process tracks the GitHub commit SHA and skips the download if nothing has changed.
Using
# CLI flag
hallucinator-cli check --acl-offline /path/to/acl.db paper.pdf
# Or set in config file
# [databases]
# acl_offline_path = "/path/to/acl.db"
# Or environment variable
ACL_OFFLINE_PATH=/path/to/acl.db hallucinator-cli check paper.pdf
Recommended Setup
Store both databases in your platform config directory for automatic detection:
mkdir -p ~/.config/hallucinator
# Build databases
hallucinator-cli update-dblp ~/.config/hallucinator/dblp.db
hallucinator-cli update-acl ~/.config/hallucinator/acl.db
# Configure paths
cat > ~/.config/hallucinator/config.toml << 'EOF'
[databases]
dblp_offline_path = "~/.config/hallucinator/dblp.db"
acl_offline_path = "~/.config/hallucinator/acl.db"
cache_path = "~/.config/hallucinator/cache.db"
EOF
Maintenance Schedule
| Database | Recommended refresh | Why |
|---|---|---|
| DBLP | Monthly | New publications indexed regularly |
| ACL | Before conference deadlines | New proceedings added after each conference |
Both update commands are safe to run against existing databases — they rebuild in-place.
Combining with Online Databases
Offline and online databases complement each other:
- Local databases are queried first (< 1ms)
- If verified locally, remote queries are skipped entirely
- If not found locally, remote databases are queried in parallel
- Having both reduces total validation time and improves coverage
This means you get the speed of local lookups for common CS and NLP papers, with full coverage from 10+ remote databases for everything else.
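The early-exit flow above can be sketched as follows (a simplified illustration with stand-in lookup functions; the real engine queries remote databases concurrently via drainer tasks, not sequentially):

```python
# Sketch of the early-exit lookup order: local databases first, remote
# databases only when the local lookup misses.

def validate(title, local_dbs, remote_dbs):
    for db in local_dbs:          # sub-millisecond SQLite FTS5 lookups
        hit = db(title)
        if hit:
            return ("verified", hit)  # remote queries skipped entirely
    for db in remote_dbs:         # real engine runs these in parallel
        hit = db(title)
        if hit:
            return ("verified", hit)
    return ("not_found", None)

# Stand-in backends: the "offline DBLP" only knows neural-net papers.
local = [lambda t: "DBLP Offline" if "neural" in t.lower() else None]
remote = [lambda t: "CrossRef"]

print(validate("Neural Machine Translation", local, remote))
print(validate("Protein Folding Survey", local, remote))
```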
Understanding Results
This guide explains how to interpret Hallucinator’s output, what each verdict means, and how to handle edge cases.
Verdict Types
Each validated reference receives one of these statuses:
Verified
The reference was found in at least one academic database with matching authors.
- Source is reported (e.g., “CrossRef”, “DBLP Offline”, “arXiv”)
- Found authors are listed for comparison
- Paper URL links to the database entry when available
A verified reference is almost certainly real. The 95% fuzzy title matching threshold accommodates minor PDF extraction artifacts while remaining strict enough to avoid false matches.
Not Found
The reference was not found in any queried database.
This does not necessarily mean the reference is fabricated. Common legitimate reasons:
- Very recent publication — Not yet indexed by databases
- Book chapters or dissertations — Less coverage in article-focused databases
- Workshop or regional conference papers — May not be in major indices
- PDF extraction error — Title was mangled during extraction (ligatures, hyphenation, encoding issues)
- Database outage — Temporary API issues (check “Failed DBs” in the output)
What to do: Check the “Failed DBs” list. If multiple databases timed out, the reference may simply need rechecking. Use Google Scholar or the paper URL (if available) for manual verification.
Author Mismatch
The title was found in a database, but the authors don’t match.
Possible explanations:
- Different paper with similar title — The database returned a different paper
- Author name variants — Different transliterations, maiden/married names, inconsistent initials
- Preprint vs. published version — Author list changed between versions
- PDF extraction error — Authors were incorrectly parsed from the PDF
What to do: Compare the “PDF authors” and “DB authors” in the output. If they’re clearly the same people with different name formats, this is a false positive. If the authors are completely different, it’s worth investigating.
Retracted
The reference was found but has been retracted. This information comes from CrossRef’s retraction metadata.
- Retraction DOI links to the retraction notice
- Retraction source indicates the type (e.g., retraction, removal, expression of concern)
Citing retracted papers is a serious concern in academic integrity. However, some retractions are for reasons unrelated to the paper’s scientific content (e.g., copyright disputes). Always check the retraction notice.
Skipped References
Some references are excluded from validation:
| Reason | Explanation |
|---|---|
| URL-only | Reference is just a URL to a non-academic site (GitHub, documentation) |
| Short title | Title has fewer than 5 words (too short for reliable matching) |
| No title | No title could be extracted from the reference text |
Skipped references are not counted in the “problematic” percentage.
Exception: References with a DOI or arXiv ID are never skipped for short title, since the identifier provides a reliable lookup path.
Paper Verdicts (TUI)
In the TUI, entire papers can be marked with a verdict:
- Safe — All references verified, or issues have been manually reviewed
- Questionable — Contains concerning unverified references
These are user-assigned labels for batch triage, not automated judgments.
Per-Database Results
Each reference includes per-database query results showing:
- Database name — Which DB was queried
- Status — `match`, `no_match`, `author_mismatch`, `timeout`, `rate_limited`, `error`, `skipped`
- Elapsed time — How long the query took
- Found authors — Authors returned by the database (if found)
- Paper URL — Direct link to the database entry (if found)
Use this to understand why a reference got its verdict. If several databases timed out, the “Not Found” verdict may be unreliable.
DOI and arXiv Validation
When a reference includes a DOI or arXiv ID:
- Valid — The identifier resolves to a real paper
- Invalid — The identifier doesn’t resolve (possible fabrication signal)
A verified reference with an invalid DOI is flagged separately — the paper exists in some database, but the DOI in the citation is wrong or fabricated.
False Positive Overrides (TUI)
In the TUI, you can mark results as false positives with a reason:
| Reason | Use when |
|---|---|
| Broken Parse | PDF extraction mangled the title/authors |
| Exists Elsewhere | You verified the paper exists outside indexed databases |
| All Timed Out | All databases timed out; the result is inconclusive |
| Known Good | You personally know this reference is legitimate |
| Non-Academic | The reference is to a non-academic resource (software, standard, etc.) |
FP overrides are reflected in exported results: the effective_status changes to verified while the original status is preserved for transparency.
Confidence Signals
Higher confidence in a “Not Found” verdict:
- Multiple databases returned `no_match` (not just timeouts)
- No DOI or arXiv ID was present in the reference
- Title was cleanly extracted (no obvious parsing artifacts)
- Paper claims to be from a well-indexed venue (top conferences, major journals)
Lower confidence (consider manual verification):
- Several databases timed out or returned errors
- Title contains unusual characters or formatting
- Reference is to a workshop paper, technical report, or dissertation
- The title is very short (close to the 5-word minimum)
The Problematic Percentage
The summary reports a “problematic %” calculated as:
(not_found + author_mismatch + retracted) / (total - skipped) * 100
This gives a quick signal for triage. A high percentage doesn’t prove misconduct — it means the paper warrants closer human review. Even legitimate papers checking niche or very recent literature can have a notable percentage of unverified references.
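As a worked example of the formula: a paper with 42 references, 5 skipped, 3 not found, 1 author mismatch, and 0 retracted works out to about 10.8%:

```python
# Worked example of the problematic-percentage formula.

def problematic_pct(not_found: int, author_mismatch: int, retracted: int,
                    total: int, skipped: int) -> float:
    """(not_found + author_mismatch + retracted) / (total - skipped) * 100"""
    return (not_found + author_mismatch + retracted) / (total - skipped) * 100

pct = problematic_pct(not_found=3, author_mismatch=1, retracted=0,
                      total=42, skipped=5)
print(round(pct, 1))  # 4 problematic out of 37 validated -> 10.8
```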
Manual Verification Workflow
When Hallucinator flags a reference as Not Found:
- Check failed databases — Were most DBs queried, or did many time out?
- Search Google Scholar — The output includes a Google Scholar link for each reference
- Check the paper URL — If available, visit the link directly
- Verify the venue — Is the claimed venue real? Was the paper published there?
- Check authors — Do the listed authors exist and publish in this field?
- Look for the DOI — If a DOI is listed, try resolving it at doi.org
Export Formats
Hallucinator can export validation results in five formats. The TUI supports all formats via its export dialog; the CLI writes text output by default (use --output to save to a file).
Formats
| Format | Extension | Best for |
|---|---|---|
| JSON | .json | Programmatic processing, data pipelines |
| CSV | .csv | Spreadsheets, bulk analysis |
| Markdown | .md | Reports, GitHub issues, documentation |
| Text | .txt | Plain-text records, email |
| HTML | .html | Standalone visual reports |
Sorting Order
All formats use the same reference ordering within each paper:
- Retracted — Highest priority (most critical)
- Not Found — Potential hallucinations
- Author Mismatch — Title found, wrong authors
- DOI/arXiv Issues — Verified but with invalid identifiers
- FP-overridden — User-verified false positives
- Clean Verified — Confirmed references
- Skipped — References excluded from validation
Within each category, references are ordered by their original reference number.
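The ordering amounts to a two-part sort key — category priority first, original reference number second (a sketch; the category labels are illustrative, not the exporter's internal names):

```python
# Sketch of the export ordering: category priority, then original number.

PRIORITY = {
    "retracted": 0,
    "not_found": 1,
    "author_mismatch": 2,
    "id_issue": 3,        # verified but invalid DOI/arXiv ID
    "fp_overridden": 4,
    "verified": 5,
    "skipped": 6,
}

refs = [
    {"category": "verified", "original_number": 2},
    {"category": "not_found", "original_number": 7},
    {"category": "retracted", "original_number": 12},
    {"category": "not_found", "original_number": 3},
]

ordered = sorted(refs, key=lambda r: (PRIORITY[r["category"]], r["original_number"]))
print([r["original_number"] for r in ordered])  # [12, 3, 7, 2]
```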
False Positive Handling
When a reference has a false-positive override (from the TUI):
- Original status is preserved (e.g., `not_found`)
- Effective status becomes `verified`
- FP reason is included (e.g., `broken_parse`, `exists_elsewhere`)
- Adjusted statistics move FP-overridden references from their original bucket into `verified`
JSON Schema
The JSON export produces an array of paper objects:
[
{
"filename": "paper.pdf",
"verdict": "safe",
"stats": {
"total": 42,
"verified": 38,
"not_found": 3,
"author_mismatch": 1,
"retracted": 0,
"skipped": 5,
"problematic_pct": 10.8
},
"references": [
{
"index": 0,
"original_number": 1,
"title": "Attention Is All You Need",
"raw_citation": "[1] A. Vaswani et al., ...",
"status": "verified",
"effective_status": "verified",
"fp_reason": null,
"source": "CrossRef",
"ref_authors": ["A. Vaswani", "N. Shazeer"],
"found_authors": ["Ashish Vaswani", "Noam Shazeer"],
"paper_url": "https://doi.org/10.5555/3295222.3295349",
"failed_dbs": [],
"doi_info": {
"doi": "10.5555/3295222.3295349",
"valid": true,
"title": null
},
"arxiv_info": null,
"retraction_info": null,
"db_results": [
{
"db": "CrossRef",
"status": "match",
"elapsed_ms": 234,
"authors": ["Ashish Vaswani", "Noam Shazeer"],
"url": "https://doi.org/10.5555/3295222.3295349"
},
{
"db": "arXiv",
"status": "skipped",
"elapsed_ms": 0,
"authors": [],
"url": null
}
]
}
]
}
]
Per-Reference Fields
| Field | Type | Description |
|---|---|---|
| index | number | Zero-based index in the results array |
| original_number | number | Original reference number from the paper (1-based) |
| title | string | Extracted reference title |
| raw_citation | string | Full raw citation text from PDF |
| status | string | Original status: verified, not_found, author_mismatch, retracted, skipped |
| effective_status | string | Status after FP overrides |
| fp_reason | string? | FP reason if overridden: broken_parse, exists_elsewhere, all_timed_out, known_good, non_academic |
| source | string? | Database that verified the reference |
| ref_authors | string[] | Authors extracted from the PDF |
| found_authors | string[] | Authors returned by the verifying database |
| paper_url | string? | URL to the paper in the source database |
| failed_dbs | string[] | Databases that timed out or errored |
| doi_info | object? | DOI validation: {doi, valid, title} |
| arxiv_info | object? | arXiv validation: {arxiv_id, valid, title} |
| retraction_info | object? | Retraction data: {is_retracted, retraction_doi, retraction_source} |
| db_results | object[] | Per-database query results |
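A minimal sketch of consuming the export with the standard library, filtering on `effective_status` so FP-overridden entries drop out (the inline sample below is trimmed to a few fields for illustration):

```python
import json

# Filter a JSON export down to references that still need attention.
# effective_status already folds in false-positive overrides, so an
# FP-overridden not_found entry is excluded here.

export = json.loads("""
[
  {
    "filename": "paper.pdf",
    "references": [
      {"original_number": 1, "title": "Real Paper", "status": "verified", "effective_status": "verified"},
      {"original_number": 7, "title": "Suspicious Paper", "status": "not_found", "effective_status": "not_found"},
      {"original_number": 9, "title": "Mangled Title", "status": "not_found", "effective_status": "verified"}
    ]
  }
]
""")

problematic = [
    (paper["filename"], ref["original_number"], ref["title"])
    for paper in export
    for ref in paper["references"]
    if ref["effective_status"] in {"not_found", "author_mismatch", "retracted"}
]
print(problematic)  # [('paper.pdf', 7, 'Suspicious Paper')]
```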
Skipped Reference Fields
Skipped references include a skip_reason field instead of validation data:
{
"index": 5,
"original_number": 6,
"title": "GitHub repo",
"status": "skipped",
"effective_status": "skipped",
"skip_reason": "url_only",
...
}
Per-DB Result Fields
| Field | Type | Description |
|---|---|---|
| db | string | Database name |
| status | string | match, no_match, author_mismatch, timeout, rate_limited, error, skipped |
| elapsed_ms | number | Query time in milliseconds |
| authors | string[] | Authors returned (if found) |
| url | string? | Paper URL in this database |
CSV Schema
One row per reference, with these columns:
Filename,Verdict,Ref#,Title,Status,EffectiveStatus,FpReason,Source,Retracted,Authors,FoundAuthors,PaperURL,DOI,ArxivID,FailedDBs
Multi-value fields (Authors, FoundAuthors, FailedDBs) use semicolons as separators within the CSV field.
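A sketch of parsing one export row with the standard `csv` module, splitting the semicolon-separated fields afterward (the sample row below is illustrative, not actual tool output):

```python
import csv
import io

# Parse a CSV export row, then split the semicolon-separated multi-value
# fields (Authors, FoundAuthors, FailedDBs) into lists.

header = ("Filename,Verdict,Ref#,Title,Status,EffectiveStatus,FpReason,"
          "Source,Retracted,Authors,FoundAuthors,PaperURL,DOI,ArxivID,FailedDBs")
row = ("paper.pdf,safe,1,Attention Is All You Need,verified,verified,,"
       "CrossRef,false,A. Vaswani;N. Shazeer,Ashish Vaswani;Noam Shazeer,"
       "https://doi.org/10.5555/3295222.3295349,10.5555/3295222.3295349,,")

reader = csv.DictReader(io.StringIO(header + "\n" + row))
record = next(reader)
for field in ("Authors", "FoundAuthors", "FailedDBs"):
    record[field] = record[field].split(";") if record[field] else []

print(record["Authors"])    # ['A. Vaswani', 'N. Shazeer']
print(record["FailedDBs"])  # []
```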
Markdown Structure
# Hallucinator Results
## paper.pdf [SAFE]
**42** references | **38** verified | **3** not found | ...
### Problematic References
**[7]** Suspicious Paper Title — ✗ Not Found
- [Google Scholar](...)
### Verified References
| # | Title | Source | URL |
|---|-------|--------|-----|
| 1 | Attention Is All You Need | CrossRef | [link](...) |
### Skipped References
| # | Title | Reason |
|---|-------|--------|
| 6 | GitHub repo | URL-only |
Sections are only included if they contain references (no empty “Problematic References” heading when everything is verified).
Text Format
Plain-text with fixed-width formatting:
Hallucinator Results
============================================================
paper.pdf [SAFE]
-----------------
42 total | 38 verified | 3 not found | 1 mismatch | 0 retracted | 5 skipped | 10.8% problematic
[1] Attention Is All You Need - Verified (CrossRef)
Authors (PDF): A. Vaswani, N. Shazeer
DOI: 10.5555/3295222.3295349 (valid)
URL: https://doi.org/...
[7] Suspicious Paper Title - NOT FOUND
Authors (PDF): J. Doe, A. Smith
Timed out: Semantic Scholar, Europe PMC
HTML Format
A self-contained HTML file with:
- Dark theme with CSS variables
- Stat cards showing totals across all papers
- Collapsible per-paper sections
- Color-coded badges (green: verified, red: not found, yellow: mismatch, dark red: retracted)
- Author comparison grid for mismatches
- Retraction warning boxes
- Google Scholar and paper URL links
- Raw citation in expandable details blocks
- Timestamp in footer
The HTML requires no external dependencies — all CSS is inlined.
Using Hallucinator as a Rust Library
This guide covers how to use hallucinator crates as dependencies in your own Rust project.
Which Crate to Depend On
| Use case | Crate | What you get |
|---|---|---|
| Validate references programmatically | hallucinator-core | check_references(), all DB backends, caching, rate limiting |
| Extract references from PDFs | hallucinator-parsing + hallucinator-pdf-mupdf | ReferenceExtractor, section detection, title/author extraction |
| Parse BBL/BIB files | hallucinator-bbl | extract_references_from_bbl(), extract_references_from_bib() |
| Unified file dispatch | hallucinator-ingest | Auto-detection (PDF/BBL/BIB/archive), streaming archive extraction |
| Export results | hallucinator-reporting | JSON, CSV, Markdown, Text, HTML export |
| Build offline DBLP | hallucinator-dblp | build_database(), DblpDatabase::search() |
| Build offline ACL | hallucinator-acl | build_database(), AclDatabase::search() |
Most users will want hallucinator-core for validation and hallucinator-ingest for file handling.
Minimal Example: Validate References
use hallucinator_core::{Config, ProgressEvent, RateLimiters, check_references};
use hallucinator_ingest::extract_references;
use std::sync::Arc;
use tokio_util::sync::CancellationToken;
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let path = std::path::Path::new("paper.pdf");
// Extract references
let extraction = extract_references(path)
.map_err(|e| anyhow::anyhow!("{}", e))?;
println!("Found {} references", extraction.references.len());
// Build config with defaults
let config = Config {
rate_limiters: Arc::new(RateLimiters::new(false, false)),
..Default::default()
};
// Validate
let cancel = CancellationToken::new();
let results = check_references(
extraction.references,
config,
|event| {
if let ProgressEvent::Result { result, .. } = &event {
println!("[{:?}] {}", result.status, result.title);
}
},
cancel,
).await;
println!("{} total, {} verified, {} not found",
results.len(),
results.iter().filter(|r| r.status == hallucinator_core::Status::Verified).count(),
results.iter().filter(|r| r.status == hallucinator_core::Status::NotFound).count(),
);
Ok(())
}
Config Construction
The Config struct controls all runtime behavior:
#![allow(unused)]
fn main() {
use hallucinator_core::{Config, RateLimiters, QueryCache, build_query_cache};
use std::sync::Arc;
let rate_limiters = Arc::new(RateLimiters::new(
true, // has_crossref_mailto (enables 3/s instead of 1/s)
true, // has_s2_api_key (enables higher S2 rate)
));
let cache = build_query_cache(
Some(std::path::Path::new("/tmp/cache.db")),
604800, // positive TTL: 7 days in seconds
86400, // negative TTL: 24 hours in seconds
);
let config = Config {
openalex_key: Some("your-key".to_string()),
s2_api_key: Some("your-key".to_string()),
num_workers: 4,
db_timeout_secs: 10,
db_timeout_short_secs: 5,
max_rate_limit_retries: 3,
rate_limiters,
query_cache: Some(cache),
..Default::default()
};
}
ProgressEvent Variants
The progress callback receives these events during validation:
| Event | When | Key fields |
|---|---|---|
| Checking | Starting a reference | index, total, title |
| DatabaseQueryComplete | A single DB query finished | db_name, status, elapsed |
| RateLimitWait | Waiting for rate limiter | db_name, wait_time |
| RateLimitRetry | Retrying after 429 | db_name, attempt |
| Warning | DB timeouts for a reference | title, failed_dbs, message |
| Result | Reference validation complete | index, total, result: Box<ValidationResult> |
| RetryPass | Starting retry pass | — |
| Retrying | Retrying a reference | index, title |
PDF Extraction
Extract and parse references without validating:
#![allow(unused)]
fn main() {
use hallucinator_core::PdfBackend;
use hallucinator_parsing::ReferenceExtractor;
use hallucinator_pdf_mupdf::MupdfBackend;
let text = MupdfBackend.extract_text(std::path::Path::new("paper.pdf"))?;
// Use ReferenceExtractor for the full pipeline
let extractor = ReferenceExtractor::new(MupdfBackend);
let result = extractor.extract(std::path::Path::new("paper.pdf"))?;
for reference in &result.references {
println!("Title: {:?}", reference.title);
println!("Authors: {:?}", reference.authors);
println!("DOI: {:?}", reference.doi);
}
}
Adding a Custom PDF Backend
Implement PdfBackend (defined in hallucinator-core) to use a different PDF library:
#![allow(unused)]
fn main() {
use hallucinator_core::PdfBackend;
struct MyPdfBackend;
impl PdfBackend for MyPdfBackend {
fn extract_text(&self, path: &std::path::Path) -> Result<String, String> {
// Your PDF text extraction logic here
let text = my_pdf_library::extract(path)
.map_err(|e| format!("extraction failed: {}", e))?;
Ok(text)
}
}
}
Adding a Custom Database Backend
See Database Backends for the DatabaseBackend trait reference and a step-by-step guide.
Database Backends
This document covers the DatabaseBackend trait, the existing database implementations, and how to add a new backend.
The DatabaseBackend Trait
Defined in hallucinator-core/src/db/mod.rs:
#![allow(unused)]
fn main() {
pub trait DatabaseBackend: Send + Sync {
/// Human-readable name (e.g., "CrossRef", "arXiv")
fn name(&self) -> &str;
/// Whether this is a local (offline) database.
/// Local backends are queried inline by the coordinator (not via drainer tasks).
fn is_local(&self) -> bool { false }
/// Whether this backend requires a DOI to query (e.g., DOI resolver).
/// References without a DOI are skipped for these backends.
fn requires_doi(&self) -> bool { false }
/// Query by title. Returns found title, authors, paper URL, and optional retraction info.
fn query<'a>(
&'a self,
title: &'a str,
client: &'a reqwest::Client,
timeout: Duration,
) -> Pin<Box<dyn Future<Output = Result<DbQueryResult, DbQueryError>> + Send + 'a>>;
/// Query by DOI. Default implementation returns empty/not-found.
fn query_doi<'a>(
&'a self,
doi: &'a str,
title: &'a str,
authors: &'a [String],
client: &'a reqwest::Client,
timeout: Duration,
) -> DoiQueryResult<'a> { ... }
}
}
Return Types
#![allow(unused)]
fn main() {
pub struct DbQueryResult {
pub found_title: Option<String>, // Title as found in the database
pub authors: Vec<String>, // Author names
pub paper_url: Option<String>, // Direct link to the paper
pub retraction: Option<RetractionResult>, // Only CrossRef populates this
}
pub enum DbQueryError {
RateLimited { retry_after: Option<Duration> },
Other(String),
}
}
A DbQueryResult with found_title = Some(...) indicates the title was found. The validation engine then compares authors (if provided) to determine Verified vs. AuthorMismatch.
Existing Backends
Remote (HTTP-based)
| Backend | Name | Rate Limit | Auth | Notes |
|---|---|---|---|---|
| CrossRef | "CrossRef" | 1/s (3/s with mailto) | Optional mailto | Extracts retraction info inline |
| Arxiv | "arXiv" | 3/s | None | Searches arXiv API |
| DblpOnline | "DBLP" | 1/s | None | DBLP search API |
| SemanticScholar | "Semantic Scholar" | 1/s (100/s with key) | Optional API key | Searches papers by title |
| EuropePmc | "Europe PMC" | 3/s | None | Biomedical/life science literature |
| PubMed | "PubMed" | 3/s | None | Biomedical literature via NCBI |
| OpenAlex | "OpenAlex" | 10/s | Required API key | Inserted first in DB list when enabled |
| DoiResolver | "DOI" | 5/s | None | Resolves DOI via doi.org (requires_doi = true) |
| AclAnthology | "ACL Anthology" | 2/s | None | ACL Anthology online scraping |
| NeurIPS | "NeurIPS" | — | None | Currently disabled |
| Ssrn | "SSRN" | — | None | Currently disabled |
| Searxng | "Web Search" | 1/s | None | Meta-search fallback (requires self-hosted SearxNG) |
Local (Offline)
| Backend | Name | Storage | Notes |
|---|---|---|---|
| DblpOffline | "DBLP" | SQLite FTS5 | ~2–3GB, built from DBLP XML dump |
| AclOffline | "ACL Anthology" | SQLite FTS5 | ~50–100MB, built from ACL Anthology XML |
Note: offline and online backends for the same database share the same name(). The system avoids running both simultaneously — if an offline DB is available, the online API is skipped.
Local backends return is_local() = true and are queried inline by the coordinator task before dispatching to remote drainers. If a local backend verifies a reference, all remote queries are skipped.
How Backends Are Selected
The build_database_list() function in hallucinator-core/src/orchestrator.rs assembles the list of enabled backends at startup:
- OpenAlex — Added first if API key is provided
- CrossRef — Always enabled (with optional mailto for higher rate)
- arXiv — Always enabled
- DBLP Online — Always enabled
- Semantic Scholar — Always enabled (rate depends on API key)
- Europe PMC — Always enabled
- PubMed — Always enabled
- ACL Anthology (online) — Always enabled
- DOI Resolver — Always enabled (only queries refs with DOIs)
- DBLP Offline — Added if `dblp_offline_path` is configured
- ACL Offline — Added if `acl_offline_path` is configured
- SearxNG — Used as last-resort fallback for NotFound refs (not in the main drainer pool)
Backends listed in Config.disabled_dbs are excluded. Database names are matched case-sensitively.
Adding a New Backend
Step 1: Create the Module
Create hallucinator-core/src/db/my_backend.rs:
#![allow(unused)]
fn main() {
use std::time::Duration;
use crate::db::{DatabaseBackend, DbQueryError, DbQueryResult};
pub struct MyBackend {
// Configuration fields
}
impl MyBackend {
pub fn new() -> Self {
Self { }
}
}
impl DatabaseBackend for MyBackend {
fn name(&self) -> &str {
"My Backend"
}
fn query<'a>(
&'a self,
title: &'a str,
client: &'a reqwest::Client,
timeout: Duration,
) -> std::pin::Pin<Box<dyn std::future::Future<
Output = Result<DbQueryResult, DbQueryError>
> + Send + 'a>> {
Box::pin(async move {
// 1. Build your API request
let url = format!("https://api.example.com/search?q={}",
urlencoding::encode(title));
// 2. Execute with timeout
let response = client
.get(&url)
.timeout(timeout)
.send()
.await
.map_err(|e| {
if e.is_timeout() {
DbQueryError::Other("timeout".into())
} else if e.status().map_or(false, |s| s == 429) {
DbQueryError::RateLimited { retry_after: None }
} else {
DbQueryError::Other(e.to_string())
}
})?;
// 3. Parse response
let body: serde_json::Value = response
.json()
.await
.map_err(|e| DbQueryError::Other(e.to_string()))?;
// 4. Extract result
if let Some(found_title) = body.get("title").and_then(|t| t.as_str()) {
Ok(DbQueryResult {
found_title: Some(found_title.to_string()),
authors: vec![], // Extract authors if available
paper_url: body.get("url")
.and_then(|u| u.as_str())
.map(|s| s.to_string()),
retraction: None,
})
} else {
Ok(DbQueryResult {
found_title: None,
authors: vec![],
paper_url: None,
retraction: None,
})
}
})
}
}
}
Step 2: Register the Module
In hallucinator-core/src/db/mod.rs, add:
#![allow(unused)]
fn main() {
pub mod my_backend;
}
Step 3: Add to Database List
In hallucinator-core/src/orchestrator.rs, add the backend to build_database_list():
#![allow(unused)]
fn main() {
dbs.push(Box::new(my_backend::MyBackend::new()));
}
Step 4: Configure Rate Limiting
In hallucinator-core/src/rate_limit.rs, add a rate limiter for your backend in RateLimiters::new():
#![allow(unused)]
fn main() {
// Example: 5 requests per second
let my_backend = AdaptiveDbLimiter::new(
governor::Quota::per_second(std::num::NonZeroU32::new(5).unwrap()),
);
}
Key Implementation Notes
- Title matching: You don’t need to do fuzzy matching yourself. Return the title as found in your database; the validation engine handles comparison via normalize_title() and rapidfuzz.
- Authors: Return author names as provided by your API. The validation engine normalizes them before comparison.
- Rate limiting: Return DbQueryError::RateLimited on HTTP 429 responses. The adaptive rate limiter will back off automatically.
- Caching: Results are cached automatically by the validation engine. You don’t need to implement caching in your backend.
- Timeout: Always use the provided timeout parameter with your HTTP requests.
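To see why backends can return titles verbatim, here is a minimal sketch of the kind of normalization the engine applies before fuzzy comparison. The real pipeline uses normalize_title() and the rapidfuzz crate; the normalize() helper below is hypothetical and stdlib-only, for illustration.

```rust
// Hypothetical sketch: strip punctuation, collapse whitespace, lowercase.
// The real engine does this via normalize_title() plus rapidfuzz scoring;
// backends never need to normalize themselves.
fn normalize(title: &str) -> String {
    title
        .chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .collect::<String>()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_lowercase()
}

fn main() {
    // A cited title and a database title that differ only in punctuation
    // and casing normalize to the same string.
    let cited = "Attention Is All You Need.";
    let found = "attention is all you need";
    assert_eq!(normalize(cited), normalize(found));
    println!("normalized: {}", normalize(cited));
}
```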
Python Bindings
Hallucinator provides Python bindings via PyO3, offering the full validation pipeline as a native Python package with pre-compiled wheels.
Installation
pip install hallucinator
Pre-compiled wheels are available for major platforms (Linux x86_64, macOS x86_64/ARM64, Windows x86_64). To build from source:
cd hallucinator-rs/crates/hallucinator-python
pip install maturin
maturin develop --release
What’s Available
The Python bindings expose:
- PdfExtractor — Extract references from PDFs with configurable parsing
- Validator + ValidatorConfig — Validate references against academic databases
- ValidationResult — Per-reference results with status, source, authors, per-DB details
- ProgressEvent — Real-time progress callbacks
- ArchiveIterator — Stream PDFs from tar.gz/zip archives
- Custom segmentation strategies — Pass Python callables for reference segmentation
Quick Example
from hallucinator import PdfExtractor, Validator, ValidatorConfig
# Extract
extractor = PdfExtractor()
result = extractor.extract("paper.pdf")
# Validate
config = ValidatorConfig()
config.crossref_mailto = "you@example.com"
config.num_workers = 4
config.db_timeout_secs = 10
validator = Validator(config)
def on_progress(event):
if event.type == "result":
print(f" [{event.status}] {event.title}")
results = validator.check(result.references, progress=on_progress)
Full Documentation
The complete Python API reference — including all configuration options, custom extraction strategies, progress event types, and result inspection — is in:
This covers:
- Installation (wheels vs. source build)
- PDF extraction API and configuration
- Custom segmentation strategies (Python callables)
- Validator configuration options with defaults
- Progress callbacks and event types
- Per-database result inspection
- DOI, arXiv, and retraction information
- Archive processing
- Complete API reference tables
- End-to-end examples
Hallucinator TUI Design Document
Design only. No implementation decisions about backend capabilities — if the backend can’t support something yet, that’s a separate problem.
Who uses this and why
Area chairs / senior PC members reviewing 20-100 submissions for a venue. They need to triage: which papers have suspicious references, and how suspicious? They don’t read every result — they scan for red flags, drill in when something looks off, then move on.
Reviewers checking a handful of assigned papers (3-8 typically). More likely to read results carefully. May re-run individual papers after authors revise.
Demo / conference hallway context. Someone shows this to a colleague. First impression matters — it should look like a tool built by someone who gives a shit. But the flash has to be load-bearing: every visual element should communicate something useful.
Design principles
-
Information density over decoration. CS people read dense UIs comfortably. Don’t waste space on padding when you could show data. Think Bloomberg terminal, not macOS settings.
-
Glanceable status. At any point you should be able to look at the screen for <1 second and know: how far along are we, is anything wrong, what needs my attention.
-
Progressive disclosure. Summary first, details on demand. The batch view shows paper-level status. Drill into a paper to see references. Drill into a reference to see per-database results.
-
Don’t block the user. Analysis takes time. The user should be able to browse already-completed results while new papers are still running.
-
Adaptive layout. Must be usable at 80x24 (cramped but functional) and take advantage of 200+ column modern terminals. Not two separate layouts — one layout that flexes.
Screens
There are four screens. You’re always on exactly one.
Screen 1: Queue
The entry point. Shows all papers (1 or 50) and their status.
HALLUCINATOR ██████░░░░ 12/50
─────────────────────────────────────────────────────────────────────────
# Paper Refs ✓ ⚠ ✗ ☠ Status
─────────────────────────────────────────────────────────────────────────
1 arxiv_2024_llm_survey.pdf 38 34 2 1 1 DONE
2 neurips_submission_042.pdf 27 18 1 0 0 ████░░ 19/27
3 review_response_v2.pdf 15 — — — — QUEUED
4 transformer_scaling.pdf — — — — — QUEUED
5 rlhf_safety_paper.pdf — — — — — QUEUED
...
48 federated_privacy.pdf — — — — — QUEUED
49 code_generation_bench.pdf — — — — — QUEUED
50 multimodal_reasoning.pdf — — — — — QUEUED
─────────────────────────────────────────────────────────────────────────
✓ 34 verified ⚠ 3 mismatch ✗ 1 not found ☠ 1 retracted 3:42 elapsed
[enter] open [r] retry failed [d] delete [a] add files [q] quit
Columns:
- # — sequential, stable. Not the filename, because filenames are long.
- Paper — truncated filename. Full path shown on hover/focus.
- Refs — total reference count (blank until PDF is parsed).
- ✓ ⚠ ✗ ☠ — counts by verdict. These are the triage signal.
- Status — QUEUED, inline progress bar while running, DONE when finished. If errors occurred during parsing/extraction: ERROR.
Behavior:
- Papers are listed in queue order. Currently-running papers float to the top of the “not done” section (done papers above, then active, then queued).
- The cursor (highlighted row) selects a paper. Press Enter to go to Screen 2.
- While papers are running, counts update live. The overall progress bar in the header updates.
- Bottom row shows aggregate totals across all completed papers.
Sorting: Default is queue order. Allow re-sort by column (keybind or
click header). Most useful sort: by ✗ descending — puts the most
suspicious papers at top. A reviewer running 50 papers wants to see “which
5 papers have the most not-found references” immediately.
Why this works for 1 paper: If you pass a single PDF, this screen
still appears but with one row. It shows the progress bar filling up,
counts incrementing. When done, it auto-focuses that row so pressing
Enter takes you straight to the results. Feels natural, not like a
degenerate case of a batch view.
Filtering: Simple text filter on filename. Type / to start filtering
(vim convention). Also filter by status: e.g., f cycles through
“all → has problems → done → running → queued”. The most common filter
is “show me only the papers that have problems.”
Screen 2: Paper
Shows all references for one paper and their verdicts.
HALLUCINATOR > arxiv_2024_llm_survey.pdf ██████████ 38/38
─────────────────────────────────────────────────────────────────────────
# Reference Verdict Source
─────────────────────────────────────────────────────────────────────────
1 Vaswani et al. "Attention Is All You Need" ✓ verified arXiv
2 Brown et al. "Language Models are Few-Shot..." ✓ verified S2
3 Wei et al. "Chain-of-Thought Prompting..." ✓ verified CrossRef
4 Smith & Jones "Recursive Self-Improvement..." ✗ not found —
5 Zhang et al. "Emergent Abilities of..." ⚠ mismatch DBLP
6 Chen et al. "Evaluating Large Language..." ✓ verified arXiv
...
37 Wang et al. "Constitutional AI..." ✓ verified S2
38 Davis "On the Retraction of..." ☠ retracted CrossRef
─────────────────────────────────────────────────────────────────────────
✓ 34 verified ⚠ 2 mismatch ✗ 1 not found ☠ 1 retracted
[enter] details [r] retry [esc] back [e] export [s] sort
Columns:
- # — reference number as it appears in the paper.
- Reference — authors + truncated title. Quoted title portion to visually separate it from authors.
- Verdict — icon + word. Color-coded: green/yellow/red/magenta.
- Source — which database confirmed it (for verified) or — for not found. If multiple DBs confirmed, show the fastest one (the one that actually ended the search via early exit).
Behavior:
- If analysis is still running, references appear as they’re processed. Unprocessed references show as dim/grey with ⏳ pending or ⟳ checking status.
- Enter on a reference opens Screen 3 (detail view).
- r on a specific reference retries just that one.
- R (shift) retries all failed/not-found references for this paper.
Active reference animation: The reference currently being checked gets a subtle indicator — a spinner or a cycling set of dots. Nothing aggressive. Just enough to show “this one is live.” If multiple references are being checked concurrently (which they are — 4 at a time), all active ones show the indicator.
Problem-first ordering: Default sort is by reference number (paper
order). But s cycles through sort modes, and sort-by-verdict puts
not-found and retracted at the top. This is the thing the user actually
cares about — “show me the problems.”
Export: e opens a small modal/prompt: export format (json / csv /
markdown / plain text) and destination (file path, clipboard). Exports
the results for this paper only. From the Queue screen, e exports all
papers.
Screen 3: Reference Detail
Full detail on one reference. This is the “prove it” screen — when you see a suspicious result you drill in here to understand why.
HALLUCINATOR > arxiv_2024_llm_survey.pdf > [4]
─────────────────────────────────────────────────────────────────────────
REFERENCE [4]
Smith, J. and Jones, A. (2024)
"Recursive Self-Improvement in Large Language Models:
A Theoretical Framework"
Proceedings of ICML 2024, pp. 1234-1248
Verdict: ✗ NOT FOUND
Extracted title: "Recursive Self-Improvement in Large Language
Models: A Theoretical Framework"
Extracted authors: J. Smith, A. Jones
Extracted DOI: none
Extracted arXiv: none
DATABASE RESULTS
─────────────────────────────────────────────────────────────────────────
Database Result Time Notes
─────────────────────────────────────────────────────────────────────────
CrossRef no match 1.2s
arXiv no match 0.8s
DBLP no match 0.3s (offline)
Sem. Scholar timeout 10.0s retried: no match (12.4s)
OpenAlex no match 2.1s
ACL no match 0.4s
NeurIPS no match 0.6s
Europe PMC no match 1.8s
PubMed no match 0.9s
─────────────────────────────────────────────────────────────────────────
No close matches found in any database.
[r] retry [y] copy ref text [esc] back
For a verified reference, this screen would instead show:
Verdict: ✓ VERIFIED (arXiv)
Matched title: "Attention Is All You Need"
Match score: 98.2%
Matched authors: Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez,
Kaiser, Polosukhin
Author overlap: 8/8
DATABASE RESULTS
─────────────────────────────────────────────────────────────────────────
Database Result Time Notes
─────────────────────────────────────────────────────────────────────────
arXiv ✓ match 0.3s ← verified (early exit)
CrossRef (skipped) — early exit
DBLP (skipped) — early exit
...
For an author mismatch:
Verdict: ⚠ AUTHOR MISMATCH (DBLP)
Matched title: "Emergent Abilities of Large Language Models"
Match score: 96.1%
Expected authors: Zhang, Wei, Chen
Found authors: Wei, Tay, Bommasani, Raffel, Zoph, Borgeaud,
Yogatama, Bosma, Zhou, Metzler, Chi, Hashimoto,
Vinyals, Liang, Dean, Fedus
Author overlap: 1/3 (Wei)
Why this screen matters: “Not found” doesn’t always mean hallucinated. Maybe the title extraction mangled something. Maybe the paper is too new. This screen lets a human make the judgment call by seeing exactly what was searched for, what came back, and how long each database took. The timing information helps distinguish “no match” from “everything timed out” — very different confidence levels.
Screen 4: Config
Accessible from any screen via , (comma). Not a modal — a full screen
you navigate to and back from, same as the others. Esc returns to
wherever you were.
The point: you’re mid-run, you realize you forgot to set your Semantic Scholar API key, or you want to bump concurrency, or you want to disable a database that’s down. You shouldn’t have to quit and relaunch. That’s hostile UX when someone has 30 papers already processed.
HALLUCINATOR > Config
─────────────────────────────────────────────────────────────────────────
API Keys
─────────────────────────────────────────────────────────────────────────
Semantic Scholar sk-••••••••••••7f2a [enter] edit
OpenAlex (not set) [enter] set
─────────────────────────────────────────────────────────────────────────
Databases
─────────────────────────────────────────────────────────────────────────
CrossRef ✓ enabled
arXiv ✓ enabled
DBLP ✓ enabled (offline: ~/dblp.db)
Sem. Scholar ✓ enabled
OpenAlex ○ disabled (no API key)
ACL ✓ enabled
NeurIPS ✓ enabled
Europe PMC ✓ enabled
PubMed ✓ enabled
─────────────────────────────────────────────────────────────────────────
Concurrency & Timeouts
─────────────────────────────────────────────────────────────────────────
Parallel references 4 [enter] edit
DB query timeout 10s [enter] edit
Retry timeout 45s [enter] edit
Request delay 1.0s [enter] edit
─────────────────────────────────────────────────────────────────────────
Display
─────────────────────────────────────────────────────────────────────────
Theme green [enter] toggle
Notifications bell [enter] cycle
─────────────────────────────────────────────────────────────────────────
[enter] edit [space] toggle [esc] back
Sections:
API Keys. Shows masked keys (last 4 chars visible) for any keys already set. Press Enter to edit — opens an inline text input. Keys entered here take effect immediately for subsequent queries. They override env vars / CLI flags for this session.
Databases. Toggle individual databases on/off with Space. If a
database requires an API key that isn’t set, it shows as ○ disabled
with the reason. If DBLP is in offline mode, show the DB path. Toggling
a database off mid-run means it won’t be queried for remaining
references (already-completed results are unaffected). Useful when a
database is down and you don’t want to waste timeout budget on it.
Concurrency & Timeouts. Edit numeric values inline. Changing parallel references mid-run adjusts the worker pool for subsequent references. Changing timeouts affects subsequent queries. These are the knobs you reach for when the tool is going too slow (bump concurrency) or when a database is flaky (bump timeout).
Display. Theme toggle (green/modern) applies immediately — the screen redraws in the new palette. Notification mode cycles through off → bell → desktop → bell+desktop.
Behavior notes:
- Changes take effect immediately for new work. They don’t retroactively affect completed results or in-flight queries.
- Changes are session-scoped by default. They don’t persist to disk unless the user explicitly saves.
- S (shift-s) on the config screen saves current settings to ~/.config/hallucinator/config.toml. This becomes the new default for future runs. A small confirmation appears: “Saved to ~/.config/hallucinator/config.toml”.
- The config file is loaded on startup if it exists; CLI flags and env vars override it, and the TUI config screen overrides everything. Precedence: TUI edits > CLI flags > env vars > config file > defaults.
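The precedence chain can be sketched as a fold over optional layers, where the first set value wins (a sketch only; the function and its signature are illustrative, not the actual config code):

```rust
// Hypothetical sketch of the precedence chain:
// TUI edits > CLI flags > env vars > config file > defaults.
// Each layer is an Option; the first Some wins, falling back to the default.
fn resolve<T>(tui: Option<T>, cli: Option<T>, env: Option<T>, file: Option<T>, default: T) -> T {
    tui.or(cli).or(env).or(file).unwrap_or(default)
}

fn main() {
    // No TUI edit or CLI flag; an env var and a config-file value both exist.
    let timeout = resolve(None, None, Some(15u64), Some(20), 10);
    assert_eq!(timeout, 15); // env var beats config file
    println!("db_timeout_secs = {timeout}");
}
```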
Why , as the keybind: It’s unused, easy to reach, and has
precedent in tools like Neovim/Helix where , is a common leader key
for settings. It doesn’t conflict with any navigation or action key.
Why a full screen and not a modal: The config has too many sections and options to fit comfortably in a modal overlay. A full screen gives room for the settings to breathe and be scannable. Also, settings aren’t something you adjust while simultaneously reading results — you go in, tweak, go back. Full screen matches that flow.
Adaptive layout
Narrow terminals (< 100 columns)
- Queue screen: hide Refs column, truncate filenames more aggressively, collapse ✓ ⚠ ✗ ☠ into a single “problems” count.
- Paper screen: hide Source column, truncate titles earlier.
- Detail screen: wraps naturally since it’s mostly prose.
Wide terminals (140+ columns)
- Queue screen: show full filenames, add an “elapsed time” column, add a “problems” column that sums ✗ + ☠ for quick scanning.
- Paper screen: show full titles without truncation, add a “time” column showing how long validation took per reference.
- Detail screen: split into two panes — reference info on the left, database results on the right (side-by-side instead of stacked).
Very wide terminals (200+ columns)
- Queue screen: could show a mini-sparkline per paper showing distribution of verdicts as a tiny bar chart inline. Pure gravy.
- Paper screen: show the raw reference text in a right-side pane alongside the parsed/structured view. Useful for debugging extraction issues.
Short terminals (< 30 rows)
- Collapse the header to a single line.
- Collapse the footer/keybinds bar to a single line.
- Use available rows for data.
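The breakpoints above could be expressed as a single classification over terminal size (a sketch under the thresholds stated in this section; the enum and function names are illustrative, since this document deliberately avoids implementation decisions):

```rust
// Sketch of the adaptive-layout breakpoints described above.
// Thresholds mirror the prose: <100 cols narrow, 140+ wide, 200+ very wide,
// <30 rows collapses header/footer chrome.
#[derive(Debug, PartialEq)]
enum Width { Narrow, Normal, Wide, VeryWide }

fn width_class(cols: u16) -> Width {
    match cols {
        0..=99 => Width::Narrow,   // collapse columns, single "problems" count
        100..=139 => Width::Normal,
        140..=199 => Width::Wide,  // extra columns, side-by-side detail panes
        _ => Width::VeryWide,      // sparklines, raw-reference pane
    }
}

fn compact_chrome(rows: u16) -> bool {
    rows < 30 // short terminals: one-line header and keybind bar
}

fn main() {
    assert_eq!(width_class(80), Width::Narrow);
    assert_eq!(width_class(180), Width::Wide);
    assert!(compact_chrome(24));
    println!("80x24 → {:?}, compact chrome: {}", width_class(80), compact_chrome(24));
}
```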
Live activity panel (overlay, not a screen)
Toggleable with Tab. This is the “flashy” part, but it earns its space.
When active, it takes the right 40-50 columns (or bottom third on narrow terminals) and shows:
ACTIVITY
────────────────────────────────
Database Health
CrossRef ● 142ms avg
arXiv ● 89ms avg
DBLP ● 12ms avg (offline)
Sem.Scholar ◐ 1.2s avg throttled
OpenAlex ● 203ms avg
ACL ● 67ms avg
NeurIPS ● 94ms avg
Europe PMC ● 312ms avg
PubMed ○ down
Rate Limits
CrossRef ░░░░▓▓░░░░ 12/50
S2 ░▓░░░░░░░░ 3/100
Throughput
refs/min ▁▂▃▅▇█▇▅▃▄▆█ avg: 8.2
Active Queries
→ CrossRef: "Recursive Self-Imp..."
→ arXiv: "Recursive Self-Imp..."
→ DBLP: "Recursive Self-Imp..."
← S2: 429 Too Many Requests
Database health indicators:
- ● — healthy (responding, <500ms average)
- ◐ — degraded (slow, rate limited, intermittent errors)
- ○ — down (repeated failures, all timeouts)
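One plausible mapping from per-database stats to these glyphs, following the thresholds in the prose (the struct, field names, and the failure threshold of 3 are hypothetical):

```rust
// Sketch of classifying a database's health from rolling stats.
// <500ms avg and no trouble = healthy; throttled or slow = degraded;
// repeated consecutive failures = down. All names are illustrative.
struct DbStats {
    avg_latency_ms: u64,
    consecutive_failures: u32,
    throttled: bool,
}

fn health_glyph(s: &DbStats) -> char {
    if s.consecutive_failures >= 3 {
        '○' // down: repeated failures / all timeouts
    } else if s.throttled || s.avg_latency_ms >= 500 {
        '◐' // degraded: slow, rate limited, intermittent errors
    } else {
        '●' // healthy
    }
}

fn main() {
    let crossref = DbStats { avg_latency_ms: 142, consecutive_failures: 0, throttled: false };
    let s2 = DbStats { avg_latency_ms: 1200, consecutive_failures: 0, throttled: true };
    let pubmed = DbStats { avg_latency_ms: 0, consecutive_failures: 5, throttled: false };
    assert_eq!(health_glyph(&crossref), '●');
    assert_eq!(health_glyph(&s2), '◐');
    assert_eq!(health_glyph(&pubmed), '○');
    println!("CrossRef {} S2 {} PubMed {}",
        health_glyph(&crossref), health_glyph(&s2), health_glyph(&pubmed));
}
```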
Why this panel exists: When you’re running 50 papers, this answers “why is it going slow” without you having to guess. If Semantic Scholar is throttling you, you can see it. If PubMed is down, you know those “not found” results are lower confidence. It converts backend infrastructure state into visible, actionable information.
Why it’s an overlay and not a screen: You want to see this while browsing results. It’s context, not content.
Keyboard model
Global (work on any screen)
| Key | Action |
|---|---|
| q | Quit (confirms if analysis still running) |
| , | Open config screen |
| Tab | Toggle activity panel |
| ? | Toggle keybind help overlay |
| Ctrl+C | Cancel current analysis / force quit |
Navigation
| Key | Action |
|---|---|
| ↑/↓ j/k | Move cursor |
| Enter | Drill in (queue→paper→reference) |
| Esc | Back up one level |
| g / G | Jump to top / bottom of list |
| Ctrl+D/U | Page down / page up |
| / | Start text filter |
| n / N | Next / previous filter match |
Actions
| Key | Context | Action |
|---|---|---|
| r | Queue | Retry all failed refs in paper |
| r | Paper | Retry selected reference |
| R | Paper | Retry all failed refs in paper |
| r | Detail | Retry this reference |
| e | Queue | Export all results |
| e | Paper | Export this paper’s results |
| s | Queue / Paper | Cycle sort mode |
| f | Queue | Cycle status filter |
| a | Queue | Add more files |
| d | Queue | Remove paper from queue |
| y | Detail | Copy reference text to clipboard |
| S | Config | Save settings to config file |
| Space | Config | Toggle selected setting |
Mouse
- Click row to select.
- Double-click row to drill in.
- Click column header to sort.
- Scroll wheel scrolls the list.
- Click the
Tabactivity panel area to toggle it.
Not every action needs a mouse equivalent. Keyboard is the primary interface. Mouse is a convenience for people who reach for it instinctively.
Color
The palette should work on both dark and light terminal backgrounds but optimize for dark (that’s what the target audience uses).
Verdict colors
| Verdict | Color | Rationale |
|---|---|---|
| Verified | Green (bold) | Universal “good” |
| Author mismatch | Yellow | Warning, needs human judgment |
| Not found | Red | Danger / suspicious |
| Retracted | Magenta (bold) | Alarming, distinct from not-found |
| Pending | Dim / grey | Not yet actionable |
| Checking | Cyan | Active, in-progress |
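Because color is never the only signal (see the emphasis principle below), each verdict pairs an icon, a text label, and a color. That triple could live in one place (a sketch; the enum and strings are illustrative, matching the table above):

```rust
// Sketch pairing each verdict with its icon, label, and color, per the
// table above. Every verdict carries a non-color signal by construction.
enum Verdict { Verified, AuthorMismatch, NotFound, Retracted, Pending, Checking }

fn presentation(v: &Verdict) -> (&'static str, &'static str, &'static str) {
    match v {
        Verdict::Verified => ("✓", "verified", "green bold"),
        Verdict::AuthorMismatch => ("⚠", "mismatch", "yellow"),
        Verdict::NotFound => ("✗", "not found", "red"),
        Verdict::Retracted => ("☠", "retracted", "magenta bold"),
        Verdict::Pending => ("⏳", "pending", "dim"),
        Verdict::Checking => ("⟳", "checking", "cyan"),
    }
}

fn main() {
    let (icon, label, color) = presentation(&Verdict::Retracted);
    assert_eq!((icon, label, color), ("☠", "retracted", "magenta bold"));
    println!("{icon} {label} ({color})");
}
```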
UI chrome
- Borders and separators: dim grey. Should recede.
- Headers/labels: white, bold.
- Selected row: reverse video (swap fg/bg). High contrast, works on any color scheme.
- Active database queries in activity panel: cyan.
- Rate limit bars: green → yellow → red gradient as capacity fills.
Emphasis principle
Color is never the only signal. Every verdict also has a text label and a distinct icon character (✓ ⚠ ✗ ☠). This matters for:
- Accessibility (color vision deficiency).
- Monochrome terminals / piped output.
- Screenshots in papers or blog posts that may be printed B&W.
Startup sequence
$ hallucinator ~/papers/*.pdf
░█░█░█▀█░█░░░█░░░█░█░█▀▀░▀█▀░█▀█░█▀█░▀█▀░█▀█░█▀▄
░█▀█░█▀█░█░░░█░░░█░█░█░░░░█░░█░█░█▀█░░█░░█░█░█▀▄
░▀░▀░▀░▀░▀▀▀░▀▀▀░▀▀▀░▀▀▀░▀▀▀░▀░▀░▀░▀░░▀░░▀▀▀░▀░▀
Loading 50 PDFs...
Databases: CrossRef arXiv DBLP(offline) S2 OpenAlex ACL NeurIPS PMC PubMed
Brief. The banner renders instantly (no animation — animation on startup is a delay). The database line confirms which sources are enabled and shows if DBLP is running in offline mode. Then it transitions to the Queue screen within ~1 second as PDFs start parsing.
If the banner won’t fit (terminal < 70 columns), skip it and go straight to the Queue screen.
The “50 papers at 2am” workflow
This is the scenario that matters most. An area chair has a deadline. They run:
$ hallucinator ~/openreview-downloads/*.pdf
First 10 seconds: Queue screen populates with 50 filenames. First paper starts processing. Activity panel shows databases warming up.
Next few minutes: Papers process. The user watches for a bit, sees the system is working, then does something else in another terminal tab.
They come back: 35 papers done. They press s to sort by problems.
The 3 papers with not-found references float to the top. They press
Enter on the worst one, see 4 not-found references, drill into each
to see what the tool searched for. Two look like genuine hallucinations
(zero matches across all 9 databases). Two look like very recent
preprints that just aren’t indexed yet (only in arXiv, which timed out).
They press Esc back to Queue, check the next problem paper. After
5 minutes of triage they have a clear picture: 2 submissions with
probable fabricated references, 1 with a retracted citation the authors
should have caught.
They press e, export a JSON report of all results, and attach it to
their AC notes.
What mattered: Sort by problems. Fast drill-in/drill-out. Export. Not the sparklines or the database race visualization — those were nice for the first 30 seconds but the actual utility is in triage speed.
Non-goals for TUI
- PDF viewing. Don’t try to render the paper. Users have their own PDF viewer open alongside.
- Editing results. The TUI is read-only for results. No “mark as false positive” or annotation features. That’s a different tool.
- Log viewer. The activity panel is not a log. Don’t show every HTTP request. Show state (database health, rate limits, throughput) not events.
Visual mockups: states and scenarios
Queue screen — mid-run, sorted by problems
The area chair has been running for a few minutes and just hit s to
sort by descending problem count.
HALLUCINATOR ████████░░ 41/50
─────────────────────────────────────────────────────────────────────────
# Paper Refs ✓ ⚠ ✗ ☠ Status
─────────────────────────────────────────────────────────────────────────
7 sketchy_submission_v3.pdf 22 14 1 4 1 DONE
31 workshop_paper_draft.pdf 18 11 2 3 0 DONE
12 llm_alignment_study.pdf 35 28 3 2 0 DONE
1 arxiv_2024_llm_survey.pdf 38 34 2 1 1 DONE
19 multiagent_reasoning.pdf 29 27 1 1 0 DONE
3 safety_evaluation.pdf 41 39 2 0 0 DONE
5 federated_learning.pdf 33 33 0 0 0 DONE
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
42 code_gen_benchmark.pdf 27 18 — — — █████░ 18/27
43 vision_transformer.pdf 19 — — — — PARSING
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
44 diffusion_models.pdf — — — — — QUEUED
45 robustness_theory.pdf — — — — — QUEUED
...8 more
─────────────────────────────────────────────────────────────────────────
✓ 412 verified ⚠ 18 mismatch ✗ 11 not found ☠ 2 retracted 12:34
sorted: problems ↓ [enter] open [s] sort [f] filter [q] quit
Note the three visual zones separated by dashed rules: done (sorted), active (running now), and queued. The user’s eye goes straight to the top — paper #7 with 4 not-found and 1 retracted is the one to investigate.
Queue screen — narrow terminal (80 columns)
Same data, collapsed for a small terminal:
HALLUCINATOR ████████░░ 41/50
────────────────────────────────────────────────────────
# Paper Probs Status
────────────────────────────────────────────────────────
7 sketchy_submission_v3… 5 DONE
31 workshop_paper_draft… 3 DONE
12 llm_alignment_study… 2 DONE
1 arxiv_2024_llm_surv… 2 DONE
19 multiagent_reasoning… 1 DONE
3 safety_evaluation… 0 DONE
5 federated_learning… 0 DONE
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
42 code_gen_benchmark… — █████░ 18/27
43 vision_transformer… — PARSING
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
44 diffusion_models… — QUEUED
...
────────────────────────────────────────────────────────
[enter] open [s] sort [f] filter [q] quit
Individual verdict columns collapse into a single “Probs” count (✗ + ☠ + ⚠). Still scannable. The user loses per-type breakdown at a glance but gains it back by drilling in.
Queue screen — wide terminal with activity panel (180+ columns)
HALLUCINATOR ████████░░ 41/50 │ ACTIVITY
──────────────────────────────────────────────────────────────────────────────────┤────────────────────────────────────
# Paper Refs ✓ ⚠ ✗ ☠ Time Status │ Database Health
──────────────────────────────────────────────────────────────────────────────────│ CrossRef ● 142ms
7 sketchy_submission_v3.pdf 22 14 1 4 1 0:48 DONE │ arXiv ● 89ms
31 workshop_paper_draft.pdf 18 11 2 3 0 0:35 DONE │ DBLP ● 12ms offline
12 llm_alignment_study.pdf 35 28 3 2 0 1:12 DONE │ Sem.Scholar ◐ 1.2s throttled
1 arxiv_2024_llm_survey.pdf 38 34 2 1 1 1:31 DONE │ OpenAlex ● 203ms
19 multiagent_reasoning.pdf 29 27 1 1 0 0:55 DONE │ ACL ● 67ms
3 safety_evaluation.pdf 41 39 2 0 0 1:44 DONE │ NeurIPS ● 94ms
5 federated_learning.pdf 33 33 0 0 0 1:22 DONE │ Europe PMC ● 312ms
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─│ PubMed ○ down
42 code_gen_benchmark.pdf 27 18 — — — 0:22 ████░ │
43 vision_transformer.pdf 19 — — — — — PARSING │ Rate Limits
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─│ CrossRef ░░░▓▓▓░░░░ 18/50
44 diffusion_models.pdf — — — — — — QUEUED │ S2 ░▓▓▓▓▓▓▓░░ 71/100
45 robustness_theory.pdf — — — — — — QUEUED │
...8 more │ Throughput (refs/min)
──────────────────────────────────────────────────────────────────────────────────│ ▁▂▃▅▇█▇▅▃▁▂▅▇█▆▅ avg: 8.2
✓ 412 ⚠ 18 ✗ 11 ☠ 2 12:34 │
sorted: problems ↓ [enter] open [s] sort [f] filter [Tab] panel [q] quit│
The activity panel earns its space here. You can see S2 is almost at its rate limit (71/100) — that’s why it shows as “throttled.” PubMed is straight up down. The throughput sparkline shows a dip about 2 minutes ago (probably when S2 started throttling) and a recovery.
Paper screen — actively checking, with activity panel
Drilled into paper #42 which is still running:
HALLUCINATOR > code_gen_benchmark.pdf █████░░░░ 18/27 │ ACTIVITY
──────────────────────────────────────────────────────────────────────────────────┤──────────────────────────────
# Reference Verdict Source │ Database Health
──────────────────────────────────────────────────────────────────────────────────│ CrossRef ● 142ms
1 Chen et al. "Evaluating Large Language..." ✓ verified arXiv │ arXiv ● 89ms
2 Austin et al. "Program Synthesis with..." ✓ verified S2 │ DBLP ● 12ms offline
3 Li et al. "Competition-Level Code..." ✓ verified CrossRef │ Sem.Scholar ◐ 1.2s
4 Hendrycks et al. "Measuring Coding..." ✓ verified CrossRef │
5 Nijkamp et al. "CodeGen: An Open..." ✓ verified S2 │ Active Now
... │ [19] → CrossRef ⟳
17 Fried et al. "InCoder: A Generative..." ✓ verified DBLP │ [19] → arXiv ⟳
18 Allal et al. "SantaCoder: Don't..." ✓ verified S2 │ [19] → DBLP ✓ 12ms
──────────────────────────────────────────────────────────────────────────────── │ [19] → S2 ⟳
19 Wang et al. "Execution-Based Code..." ⟳ checking 3/9 │ [20] → CrossRef ⟳
20 Fake et al. "An Invented Paper..." ⟳ checking 1/9 │ [20] → arXiv waiting
21 Zhang et al. "RepoCoder: Repository..." ⟳ checking 0/9 │ [21] → queued
22 Liu et al. "Is Your Code Generated..." ⏳ pending │ [22] → queued
... │
27 Peng et al. "The Impact of AI on..." ⏳ pending │
──────────────────────────────────────────────────────────────────────────────────│
✓ 18 verified so far │
[enter] details [r] retry [esc] back [s] sort │
The activity panel here shows per-query granularity: reference [19] has 3 databases done (DBLP already returned a match at 12ms but it’s still waiting on others — or maybe early exit will kick in and cancel the remaining). Reference [20] just started. [21] and [22] are queued waiting for a slot in the concurrency pool.
This is the “database race” made visible. You’re watching 4 references being checked concurrently, each with up to 9 databases racing. When a database returns a match, the whole reference can resolve instantly via early exit.
Paper screen — done, filtered to problems only
After analysis completes, the reviewer presses f to filter to problems:
HALLUCINATOR > sketchy_submission_v3.pdf DONE ✓14 ⚠1 ✗4 ☠1
─────────────────────────────────────────────────────────────────────────
# Reference Verdict Source
─────────────────────────────────────────────────────────────────────────
3 Zhang & Li "Self-Aware Neural..." ✗ not found —
8 Johnson et al. "Recursive Prompt..." ✗ not found —
11 Chen "Advanced Reasoning in..." ✗ not found —
15 Park et al. "Constitutional Self-..." ✗ not found —
19 Davis et al. "On the Emergent..." ☠ retracted CrossRef
6 Williams et al. "Scaling Laws for..." ⚠ mismatch DBLP
─────────────────────────────────────────────────────────────────────────
showing: problems only (6/22)
[enter] details [f] show all [r] retry [e] export [esc] back
Six references instead of 22. The reviewer only needs to look at these. Four not-found references in a 22-reference paper is a strong signal.
Reference detail — close match found but authors wrong
HALLUCINATOR > sketchy_submission_v3.pdf > [6]
─────────────────────────────────────────────────────────────────────────
REFERENCE [6]
Williams, R., Thompson, K., and Garcia, M. (2023)
"Scaling Laws for Neural Language Models"
In Proceedings of NeurIPS 2023
Verdict: ⚠ AUTHOR MISMATCH
Extracted title: "Scaling Laws for Neural Language Models"
Extracted authors: R. Williams, K. Thompson, M. Garcia
BEST MATCH (CrossRef)
─────────────────────────────────────────────────────────────────────────
Matched title: "Scaling Laws for Neural Language Models"
Title score: 100.0%
Found authors: Jared Kaplan, Sam McCandlish, Tom Henighan,
Tom B. Brown, Benjamin Chess, Rewon Child,
Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei
Author overlap: 0/3 — no matching authors
DOI: 10.48550/arXiv.2001.08361
Source: CrossRef (0.8s)
─────────────────────────────────────────────────────────────────────────
This paper exists but the cited authors (Williams, Thompson, Garcia)
don't match the actual authors (Kaplan, McCandlish, et al.).
ALL DATABASE RESULTS
─────────────────────────────────────────────────────────────────────────
CrossRef ⚠ author mismatch 0.8s (match shown above)
arXiv ⚠ author mismatch 0.4s
DBLP ⚠ author mismatch 0.1s (offline)
Sem. Scholar timeout 10.0s
OpenAlex ⚠ author mismatch 1.2s
ACL no match 0.3s
NeurIPS no match 0.5s
Europe PMC no match 1.1s
PubMed no match 0.7s
─────────────────────────────────────────────────────────────────────────
[r] retry [y] copy ref text [esc] back
This is a telltale sign: the paper “Scaling Laws for Neural Language Models” is real (Kaplan et al., 2020), but the submission attributed it to completely fabricated authors. Four databases independently confirmed the title exists with different authors. The detail screen makes this case unambiguous.
Reference detail — retracted paper
HALLUCINATOR > sketchy_submission_v3.pdf > [19]
─────────────────────────────────────────────────────────────────────────
REFERENCE [19]
Davis, P., Reeves, L., and Kang, S. (2021)
"On the Emergent Properties of Transformer Architectures
in Low-Resource Settings"
Journal of Machine Learning Research, 22(1), pp. 1-34
Verdict: ☠ RETRACTED
Extracted title: "On the Emergent Properties of Transformer
Architectures in Low-Resource Settings"
Extracted authors: P. Davis, L. Reeves, S. Kang
MATCH (CrossRef)
─────────────────────────────────────────────────────────────────────────
Matched title: "On the Emergent Properties of Transformer
Architectures in Low-Resource Settings"
Title score: 100.0%
Found authors: P. Davis, L. Reeves, S. Kang
Author overlap: 3/3 ✓
DOI: 10.xxxx/jmlr.2021.xxxxx
Source: CrossRef (1.1s)
╔══════════════════════════════════════════════════════════════════╗
║ ☠ RETRACTION NOTICE ║
║ ║
║ This paper was retracted on 2022-03-15. ║
║ Retraction DOI: 10.xxxx/jmlr.2022.retract.xxxxx ║
║ Reason: "Results could not be reproduced; data fabrication ║
║ suspected." ║
╚══════════════════════════════════════════════════════════════════╝
─────────────────────────────────────────────────────────────────────────
[r] retry [y] copy ref text [esc] back
The retraction notice gets a heavy box border — it’s the most important piece of information on this screen and should be impossible to miss.
Single-paper mode — just started
When invoked with a single PDF:
HALLUCINATOR
─────────────────────────────────────────────────────────────────────────
arxiv_2024_llm_survey.pdf 38 references found
─────────────────────────────────────────────────────────────────────────
1 Vaswani et al. "Attention Is All You Need" ✓ verified arXiv
2 Brown et al. "Language Models are Few-..." ✓ verified S2
3 Wei et al. "Chain-of-Thought Prompting..." ✓ verified CrossRef
4 Bubeck et al. "Sparks of Artificial..." ✓ verified S2
5 Touvron et al. "LLaMA: Open and..." ⟳ checking 4/9
6 Chowdhery et al. "PaLM: Scaling..." ⟳ checking 2/9
7 Hoffmann et al. "Training Compute-..." ⟳ checking 0/9
8 Ouyang et al. "Training language..." ⟳ checking 0/9
9 Bai et al. "Constitutional AI:..." ⏳ pending
10 Raffel et al. "Exploring the Limits..." ⏳ pending
...
38 Kojima et al. "Large Language Models..." ⏳ pending
─────────────────────────────────────────────────────────────────────────
████░░░░░░ 4/38 ✓ 4 verified 0:12
[enter] details [Tab] activity [q] quit
No queue screen — it drops you directly into the paper view. The progress bar and running counts update live. This feels immediate and purposeful. When it finishes, the status line updates and you can browse results or export.
Export modal
Pressing e on any screen:
┌─ Export ──────────────────────────────┐
│ │
│ Format: [JSON] CSV Markdown Text │
│ Scope: This paper / All papers │
│ Output: ~/hallucinator-results.json │
│ │
│ [Export] [Cancel] │
└───────────────────────────────────────┘
Minimal modal. Arrow keys or tab to move between options. Enter to confirm. Esc to cancel. The output path has a sensible default and is editable.
Help overlay
Pressing ? on any screen:
┌─ Keybindings ─────────────────────────────────────────────────┐
│ │
│ Navigation Actions │
│ ↑↓ j/k move cursor r retry reference/paper │
│ Enter drill in R retry all failed │
│ Esc back e export results │
│ g/G top/bottom s cycle sort mode │
│ Ctrl+D/U page down/up f cycle filter │
│ / search/filter a add files (queue) │
│ n/N next/prev match d remove paper (queue) │
│ y copy to clipboard │
│ Global │
│ Tab toggle activity ? this help │
│ , config Ctrl+C cancel/force quit │
│ q quit │
│ │
│ [?/Esc] close │
└───────────────────────────────────────────────────────────────┘
Semi-transparent overlay on top of whatever screen is active. The underlying screen is still visible (dimmed) so you maintain spatial context.
Decisions (resolved)
1. Notification on completion
Terminal bell by default. Works everywhere, zero config. Desktop
notification via notify-send / platform equivalent available as
opt-in flag (--notify). Don’t overthink this.
2. Results persistence
Two distinct mechanisms:
Temp state (invisible infrastructure). In-progress and completed
results write to ~/.cache/hallucinator/runs/<timestamp>/. This is
crash safety — if the terminal dies, SSH drops, or the user hits
Ctrl+C, the work isn’t lost. The TUI doesn’t expose this to the
user. It just exists.
Export (deliberate user action). e key opens the export modal.
User picks format (JSON, CSV, Markdown, plain text), scope (one paper
or all), and destination. This produces the actual deliverable — the
report they attach to AC notes or share with co-reviewers.
Resume (future). Not in v1. Eventually: hallucinator --resume
reads from the temp state dir and picks up where it left off. The temp
state format should be designed with this in mind even if we don’t
build the resume path yet — don’t paint ourselves into a corner.
3. Reference text preview pane
Yes. Shown on the Paper screen (Screen 2) when terminal height >= 40 rows. Located below the reference list, separated by a horizontal rule. Shows the raw reference text as extracted from the PDF for the currently-selected reference.
Updates in real-time as the cursor moves through the reference list (file-manager-style preview). This is the expected behavior and the rendering cost is trivial — it’s just text reflow.
On terminals shorter than 40 rows, the preview is hidden. The user can still see the raw text by drilling into Screen 3.
...
4 Smith & Jones "Recursive Self-Imp..." ✗ not found —
> 5 Zhang et al. "Emergent Abilities..." ⚠ mismatch DBLP
6 Chen et al. "Evaluating Large..." ✓ verified arXiv
...
─────────────────────────────────────────────────────────────────
[5] Zhang, W., Wei, J., and Chen, L., "Emergent Abilities of
Large Language Models," in Proceedings of the International
Conference on Machine Learning (ICML), 2023, pp. 4812-4830.
─────────────────────────────────────────────────────────────────
This earns its space. When the extracted title looks wrong (mangled by hyphenation, ligature issues, or a bad parse), you see it instantly without an extra keypress.
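The height rule above can be sketched as a small layout helper (the 8-row preview height is an assumed value, not from the spec; the real code would split a `ratatui` layout rather than return raw numbers):

```rust
/// Hypothetical layout helper mirroring the ">= 40 rows" rule: returns
/// (list_rows, preview_rows) for a given terminal height. The preview
/// is hidden entirely on short terminals.
fn split_heights(terminal_rows: u16) -> (u16, Option<u16>) {
    const PREVIEW_ROWS: u16 = 8; // assumed height for the preview pane

    if terminal_rows >= 40 {
        (terminal_rows - PREVIEW_ROWS, Some(PREVIEW_ROWS))
    } else {
        (terminal_rows, None)
    }
}
```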
4. Color themes
Two themes, toggled via --theme=green or --theme=modern. No
theming framework, no theme.toml. Just two palette structs.
green (default): Dark background, green/cyan primary text. Terminal hacker aesthetic. The one that makes people at poster sessions say “what is that.” Verdict colors as specified in the Color section above.
modern: Dark background, white primary text, electric blue accents. Cleaner, more subdued. For people who think the green is too much, or for screenshots in formal reports where neon green looks unserious.
Both palettes follow the same rules: verdict colors stay semantically consistent (green=verified, red=not found, etc.), only the chrome and accent colors differ.
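The "two palette structs" approach might look like this (color values as strings for illustration; the real code would use `ratatui::style::Color`, and the field names here are assumptions):

```rust
/// Which of the two built-in themes is active (--theme=green / --theme=modern).
#[derive(Clone, Copy, PartialEq, Debug)]
enum Theme {
    Green,
    Modern,
}

/// One flat palette struct per theme; no theming framework, no theme.toml.
struct Palette {
    primary: &'static str,   // chrome/text color: differs per theme
    accent: &'static str,    // accent color: differs per theme
    verified: &'static str,  // verdict colors: identical in both themes
    not_found: &'static str,
    mismatch: &'static str,
}

fn palette(theme: Theme) -> Palette {
    match theme {
        Theme::Green => Palette {
            primary: "green",
            accent: "cyan",
            verified: "green",
            not_found: "red",
            mismatch: "yellow",
        },
        Theme::Modern => Palette {
            primary: "white",
            accent: "blue",
            verified: "green",
            not_found: "red",
            mismatch: "yellow",
        },
    }
}
```

Keeping the verdict colors in the struct (rather than hardcoded at render sites) is what makes the "semantically consistent across themes" rule enforceable at a glance.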
5. Inline retry feedback
Both spinner and text. The verdict cell shows an animated spinner
character cycling through frames (◜ ◝ ◞ ◟) followed by static
“retrying” text:
4 Smith & Jones "Recursive Self-..." ◝ retrying —
The spinner provides motion (“something is happening”) while the text
provides meaning (“what is happening”). Consistent with how the
⟳ checking state already works during initial analysis — just a
different animation to distinguish retry from first pass.
When the retry completes, the cell snaps to the new verdict. No transition animation — just the immediate update. The change in color (from cyan retrying to green/red/yellow result) is transition enough.
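The spinner can be driven from the app's existing tick counter, so every retrying cell animates in lockstep across redraws. A minimal sketch (function name is illustrative):

```rust
/// The four retry spinner frames, cycled once per UI tick.
const RETRY_FRAMES: [char; 4] = ['◜', '◝', '◞', '◟'];

/// Render the verdict cell for a reference that is being retried:
/// spinner frame for motion, static text for meaning.
fn retry_cell(tick: usize) -> String {
    format!("{} retrying", RETRY_FRAMES[tick % RETRY_FRAMES.len()])
}
```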
6. Phase 4 decisions
File picker screen. When launched with no PDF arguments, the banner
dismisses into an interactive file picker instead of an empty queue.
The file picker is a custom implementation using std::fs::read_dir —
no external dependency (ratatui-explorer was considered but rejected to
keep the dependency tree small). Directories are navigable, .pdf
files are togglable with Space, Enter confirms selection and returns to
the queue. The a key on the queue screen reopens the picker to add
more files.
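The core of such a picker is a single `std::fs::read_dir` pass that separates navigable directories from togglable `.pdf` files. A sketch under those assumptions (helper name is hypothetical):

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// List one directory for the picker: directories first (navigable),
/// then .pdf files (togglable with Space), each sorted by name.
fn list_entries(dir: &Path) -> std::io::Result<(Vec<PathBuf>, Vec<PathBuf>)> {
    let mut dirs = Vec::new();
    let mut pdfs = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            dirs.push(path);
        } else if path
            .extension()
            .is_some_and(|e| e.eq_ignore_ascii_case("pdf"))
        {
            pdfs.push(path);
        }
    }
    dirs.sort();
    pdfs.sort();
    Ok((dirs, pdfs))
}
```

Everything else (cursor state, Space toggling, Enter confirming back to the queue) is ordinary TUI state on top of this listing.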
Manual start. PDFs load into the queue in Queued state but do
not begin processing automatically. The user reviews the queue, adjusts
configuration, then presses Space to start. A prominent [Space] Start
indicator in the footer makes this discoverable. Processing is deferred
via a BackendCommand channel — the backend listener receives a
ProcessFiles command containing the file list, starting index, and a
Config struct rebuilt from the current ConfigState. This means
config edits made before pressing Space take effect.
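The deferred-start channel can be sketched with `std::sync::mpsc` (the variant fields are inferred from the description above, not copied from the real enum, and a single `concurrency` field stands in for the full rebuilt `Config`):

```rust
use std::path::PathBuf;
use std::sync::mpsc;

/// Command sent from the UI to the backend listener when the user
/// presses Space on the queue screen.
enum BackendCommand {
    ProcessFiles {
        files: Vec<PathBuf>,
        start_index: usize,
        concurrency: usize, // simplified stand-in for the rebuilt Config
    },
}

/// Backend listener: blocks until the UI sends a command, then returns
/// how many files it was asked to process.
fn listen(rx: mpsc::Receiver<BackendCommand>) -> usize {
    match rx.recv().expect("UI side hung up") {
        BackendCommand::ProcessFiles { files, start_index, .. } => {
            files.len() - start_index
        }
    }
}
```

Because the command carries a snapshot of the config, the backend never has to read mutable UI state: whatever the user edited before pressing Space is what gets sent.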
Concurrency configurable from config screen. The config screen
(, key) is now fully interactive. Tab cycles between sections (API
Keys, Databases, Concurrency, Display). j/k navigates items within a
section. Enter opens a text-editing mode for numeric and string fields.
Space toggles database checkboxes. Cursor is clamped to valid bounds
per section. Config values are populated from CLI flags, environment
variables, and defaults — not hardcoded.
Activity panel shown by default. The activity panel (right sidebar) is now visible on launch instead of hidden. It shows database health with query counts and average response times, a throughput sparkline with refs/sec rate, active query list (which references are currently being checked), and a summary of total completed references. Throughput data is fed by a tick-based bucketing system (every ~1 second).
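The tick-based bucketing might be structured like this (a minimal sketch; the struct and its 60-bucket window are assumptions, not the real implementation):

```rust
/// Throughput tracker: each ~1 s tick closes the current bucket, and the
/// sparkline renders the last N closed bucket counts.
struct Throughput {
    buckets: Vec<u64>, // completed refs per closed tick
    current: u64,      // refs completed in the still-open tick
}

impl Throughput {
    fn new() -> Self {
        Self { buckets: Vec::new(), current: 0 }
    }

    /// Called once per completed reference.
    fn record(&mut self) {
        self.current += 1;
    }

    /// Called once per ~1 s tick: close the bucket, keep ~1 min of history.
    fn tick(&mut self) {
        self.buckets.push(self.current);
        self.current = 0;
        if self.buckets.len() > 60 {
            self.buckets.remove(0);
        }
    }

    /// Average refs/sec over the closed buckets (fed to the sidebar).
    fn rate(&self) -> f64 {
        if self.buckets.is_empty() {
            return 0.0;
        }
        self.buckets.iter().sum::<u64>() as f64 / self.buckets.len() as f64
    }
}
```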
Mouse click support. Clicking a row in the queue or paper table selects it. The rendered table area is stored after each draw, and click coordinates are mapped to table rows accounting for borders and headers. Double-click (same row within 500ms) drills in.
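The coordinate mapping reduces to offsetting the click by the table's top border and header row, then bounds-checking against the row count. A sketch (names and the one-cell border/header offsets are assumptions about the rendered layout):

```rust
/// Map a click's terminal row to a table row index, or None if the click
/// landed on the border, the header, or past the last row. `table_top`
/// and `table_height` come from the Rect stored after the last draw.
fn click_to_row(
    click_y: u16,
    table_top: u16,
    table_height: u16,
    row_count: usize,
) -> Option<usize> {
    let first_row_y = table_top + 2; // top border (1) + header row (1)
    let bottom_border_y = table_top + table_height - 1;
    if click_y < first_row_y || click_y >= bottom_border_y {
        return None;
    }
    let idx = (click_y - first_row_y) as usize;
    (idx < row_count).then_some(idx)
}
```

Double-click detection then only needs the previous click's (row, timestamp) pair: same row within 500 ms drills in.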
False positive marking remains a non-goal. Per the original design (section “Explicit non-goals”), false-positive toggling is deferred. The TUI is an analysis tool, not an annotation tool.