# Concurrency Model
Hallucinator’s validation engine is designed around a per-DB drainer pool architecture that maximizes throughput while respecting per-database rate limits. This document explains the concurrency primitives, task structure, and how they interact.
## Design Goals
- Maximize parallelism — Check multiple references simultaneously
- Respect rate limits — Each database has its own rate limit; never exceed it
- Minimize latency — Return results as soon as a verified match is found
- Avoid contention — No shared rate limiter governor across tasks
## Architecture Diagram

```
                   ┌──────────────────┐
                   │    Job Queue     │
                   │ (async_channel)  │
                   └────────┬─────────┘
                            │
       ┌────────────────────┼─────────────────────┐
       │                    │                     │
┌──────▼──────┐      ┌──────▼──────┐       ┌──────▼──────┐
│ Coordinator │      │ Coordinator │       │ Coordinator │
│   Task 1    │      │   Task 2    │       │   Task N    │
└──────┬──────┘      └──────┬──────┘       └──────┬──────┘
       │                    │                     │
       │  Local DBs inline  │                     │
       │  Cache pre-check   │                     │
       │                    │                     │
       └──────────┬─────────┘                     │
                  │  Fan out (cache misses only)  │
      ┌───────────┼──┬─────────────┬──────────┐   │
      │              │             │          │   │
┌─────▼─────┐ ┌──────▼─────┐ ┌─────▼────┐ ┌───▼───▼──┐
│ CrossRef  │ │   arXiv    │ │   DBLP   │ │   ...    │
│  Drainer  │ │  Drainer   │ │ (online) │ │ Drainers │
│ Rate: 1/s │ │ Rate: 3/s  │ │ Rate:1/s │ │          │
└───────────┘ └────────────┘ └──────────┘ └──────────┘
```
## Task Types

### Coordinator Tasks

- Count: Configurable via `num_workers` (default: 4)
- Role: Pick references from the shared job queue, run local DBs inline, pre-check the cache, fan out to drainers
- Concurrency: Multiple coordinators run in parallel, each pulling from the same `async_channel`
A coordinator’s lifecycle for each reference:
1. Receive `RefJob` from the job queue
2. Emit `ProgressEvent::Checking`
3. Query local databases inline (DBLP offline, ACL offline) — sub-millisecond
4. If verified locally → emit result, skip the remote phase
5. Pre-check the cache for all remote DBs (synchronous, prevents a race condition)
6. If verified from cache → emit result, skip the drainers
7. Create `RefCollector` (shared aggregation hub)
8. Send `DrainerJob` to each cache-miss DB's drainer queue
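The lifecycle above can be sketched as a loop. This is a simplified, synchronous sketch: it uses `std::sync::mpsc` in place of the real `async_channel`, and the `RefJob` shape and `check_local` function are hypothetical stand-ins, not the actual types.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for the real job type.
struct RefJob {
    id: usize,
    title: String,
}

// Stand-in for the inline offline-DB lookup (DBLP/ACL).
fn check_local(job: &RefJob) -> bool {
    job.title.contains("known")
}

// One coordinator: pull jobs until the queue closes, verify locally,
// and only fall through to the remote phase on a local miss.
fn coordinator(jobs: mpsc::Receiver<RefJob>, results: mpsc::Sender<(usize, bool)>) {
    for job in jobs {
        if check_local(&job) {
            // Verified locally: emit the result, skip the remote phase.
            results.send((job.id, true)).unwrap();
            continue;
        }
        // Cache pre-check, RefCollector creation, and drainer fan-out
        // would happen here; this sketch just reports "not verified".
        results.send((job.id, false)).unwrap();
    }
}

fn main() {
    let (job_tx, job_rx) = mpsc::channel();
    let (res_tx, res_rx) = mpsc::channel();
    let handle = thread::spawn(move || coordinator(job_rx, res_tx));

    job_tx.send(RefJob { id: 0, title: "a known paper".into() }).unwrap();
    job_tx.send(RefJob { id: 1, title: "an unindexed paper".into() }).unwrap();
    drop(job_tx); // closing the sender ends the coordinator loop

    let results: Vec<_> = res_rx.iter().collect();
    assert_eq!(results, vec![(0, true), (1, false)]);
    handle.join().unwrap();
}
```

One thing the std sketch cannot express: `async_channel` is multi-producer *multi-consumer*, so several coordinators can pull from the same queue, whereas `mpsc::Receiver` allows only one consumer.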
### Drainer Tasks

- Count: One per enabled remote database
- Role: Process DB queries sequentially at the database's natural rate
- Rate limiting: Each drainer is the sole consumer of its DB's `AdaptiveDbLimiter`
A drainer’s lifecycle for each job:
1. Check early-exit conditions (cancelled, already verified, no DOI for a DOI-requiring backend)
2. Acquire a rate limiter token
3. Check the cache (within the rate-limited query path)
4. Execute the HTTP query with a timeout
5. Validate authors if the title is found
6. Update `RefCollector` state
7. Decrement the `remaining` counter; if last, finalize the result
## RefCollector
A per-reference aggregation hub, shared (via Arc) by all drainers working on that reference:
```
RefCollector
├── remaining: AtomicUsize               # Drainers left to report
├── verified: AtomicBool                 # Early-exit flag
├── state: Mutex<AggState>               # Aggregation (held briefly)
│   ├── verified_info
│   ├── first_mismatch
│   ├── failed_dbs
│   ├── db_results
│   └── retraction
└── result_tx: Mutex<Option<oneshot::Sender>>
```
The last drainer to decrement `remaining` to zero calls `finalize_collector()`, which builds the final `ValidationResult` and sends it on the oneshot channel.
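The decrement-and-finalize handoff can be illustrated with std atomics. This is a minimal sketch of the pattern only; the real `RefCollector` also carries the `AggState` mutex and the oneshot sender, and the `Collector`/`report_done` names here are illustrative.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Minimal stand-in for the RefCollector's remaining counter.
struct Collector {
    remaining: AtomicUsize,
}

// Returns true for exactly one caller: fetch_sub returns the *previous*
// value, so the drainer that observes 1 knows it is the last to report
// and must run finalization.
fn report_done(collector: &Collector) -> bool {
    collector.remaining.fetch_sub(1, Ordering::AcqRel) == 1
}

fn main() {
    let collector = Arc::new(Collector { remaining: AtomicUsize::new(4) });
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let c = Arc::clone(&collector);
            thread::spawn(move || report_done(&c))
        })
        .collect();

    // Regardless of scheduling, exactly one "drainer" wins finalization.
    let finalizers = handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|&last| last)
        .count();
    assert_eq!(finalizers, 1);
}
```

`AcqRel` ordering makes each drainer's prior writes visible to the finalizer, which is why no extra lock is needed around the counter itself.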
## Concurrency Primitives
| Primitive | Purpose |
|---|---|
| `async_channel::unbounded` | Job queue (coordinators) and per-DB drainer queues |
| `AtomicUsize` + `Ordering::AcqRel` | `remaining` counter for lock-free drainer coordination |
| `AtomicBool` + `Ordering::Release`/`Acquire` | `verified` flag for early exit |
| `Mutex<AggState>` | Per-reference aggregation state (single mutex, held briefly) |
| `tokio::sync::oneshot` | Return channel for each reference's result |
| `CancellationToken` | Graceful shutdown (Ctrl+C handler) |
| `ArcSwap` | Atomic governor swapping during adaptive rate limit backoff |
| `DashMap` | Lock-free concurrent L1 cache reads |
## Cache Pre-Check: Preventing Race Conditions
A subtle race condition exists without the cache pre-check:
1. Reference R is dispatched to CrossRef (drainer A) and arXiv (drainer B)
2. Drainer A finishes first: CrossRef has a match → sets `verified = true`
3. Drainer B sees `verified = true` → skips the arXiv query entirely
4. arXiv's result is never cached for reference R

This means future runs will always miss the arXiv cache for this title.
Solution: Before dispatching to any drainer, the coordinator synchronously checks the cache for all remote DBs. Cache hits are recorded in `AggState.db_results`, and only cache-miss DBs are dispatched to drainers. This ensures every DB's cached result is always captured regardless of verification order.
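The pre-check is essentially a partition of the enabled DBs into cached hits and misses. A minimal sketch, assuming a hypothetical `(db, title) → verified` cache shape (the real cache API differs):

```rust
use std::collections::HashMap;

// Partition remote DBs by cache status. `cache` maps (db, title) to a
// cached verification result; the names here are illustrative only.
fn partition_dbs<'a>(
    dbs: &[&'a str],
    title: &str,
    cache: &HashMap<(String, String), bool>,
) -> (Vec<(&'a str, bool)>, Vec<&'a str>) {
    let mut hits = Vec::new();   // recorded in aggregation state up front
    let mut misses = Vec::new(); // the only DBs dispatched to drainers
    for &db in dbs {
        match cache.get(&(db.to_string(), title.to_string())) {
            Some(&verified) => hits.push((db, verified)),
            None => misses.push(db),
        }
    }
    (hits, misses)
}

fn main() {
    let mut cache = HashMap::new();
    // A previous run cached a CrossRef match for this title; the other
    // DBs were never cached. The pre-check records the hit immediately
    // and dispatches only the misses.
    cache.insert(("crossref".to_string(), "Attention Is All You Need".to_string()), true);

    let (hits, misses) = partition_dbs(
        &["crossref", "arxiv", "semantic_scholar"],
        "Attention Is All You Need",
        &cache,
    );
    assert_eq!(hits, vec![("crossref", true)]);
    assert_eq!(misses, vec!["arxiv", "semantic_scholar"]);
}
```

Because the partition happens before any drainer runs, no early-exit ordering can prevent a cached result from being recorded.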
## Early Exit
When a drainer verifies a reference:
- Sets `collector.verified` to `true` (atomic store with Release ordering)
- Other drainers check this flag before querying (Acquire ordering)
- Drainers that see `verified = true` emit a `Skipped` status and decrement `remaining`
This avoids unnecessary API calls once a match is found.
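The flag itself is a plain Release/Acquire pair. A self-contained sketch (function names are illustrative, not the real API):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// The verifying drainer publishes the flag with Release ordering, so its
// earlier writes (e.g. the verified match data) are visible to readers.
fn mark_verified(verified: &AtomicBool) {
    verified.store(true, Ordering::Release);
}

// Other drainers read with Acquire ordering before spending a rate-limit
// token on an HTTP query.
fn should_skip(verified: &AtomicBool) -> bool {
    verified.load(Ordering::Acquire)
}

fn main() {
    let verified = Arc::new(AtomicBool::new(false));
    assert!(!should_skip(&verified)); // no match yet: the query proceeds

    let v = Arc::clone(&verified);
    thread::spawn(move || mark_verified(&v)).join().unwrap();

    assert!(should_skip(&verified)); // later drainers skip their API call
}
```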
## SearxNG Fallback
If a reference is `NotFound` after all remote DBs have been checked and SearxNG is configured:

- `finalize_collector()` runs a SearxNG web search as a last-resort fallback
- If SearxNG finds the title, the status upgrades from `NotFound` to `Verified` (source: "Web Search")
- SearxNG results don't undergo author validation (web search doesn't return structured author data)
## Shutdown Sequence
1. User presses Ctrl+C → `CancellationToken` is cancelled
2. `job_tx` (the job queue sender) is closed
3. Coordinators drain remaining jobs, checking `cancel.is_cancelled()` at each iteration
4. Drainers skip remaining jobs when cancelled
5. When all coordinators finish, they drop their `Arc<drainer_txs>` clones
6. Drainer channels close → drainers drain and exit
7. The pool handle completes
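The close-then-drain behavior relies on a general property of channels: dropping the last sender closes the channel, the receiver's iteration ends, and the worker exits cleanly after processing whatever was still queued. A minimal `std::sync::mpsc` sketch of that cascade (the real pool uses `async_channel` and a `CancellationToken` on top):

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (job_tx, job_rx) = mpsc::channel::<u32>();

    // Worker (coordinator or drainer): drain the queue until it closes.
    let worker = thread::spawn(move || {
        let mut drained = 0;
        for _job in job_rx {
            // The iterator yields until every sender has been dropped,
            // so in-flight jobs are drained before the worker exits.
            drained += 1;
        }
        drained
    });

    job_tx.send(1).unwrap();
    job_tx.send(2).unwrap();
    drop(job_tx); // "close the queue": no new jobs can arrive

    // The worker drains both queued jobs, then exits cleanly.
    assert_eq!(worker.join().unwrap(), 2);
}
```

This is why the shutdown sequence needs no explicit "stop" message per worker: closing the senders is the stop signal.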
## Performance Characteristics
- Local DB queries: < 1ms (SQLite FTS5 lookups)
- Cache hits: Sub-microsecond (DashMap L1) to ~1ms (SQLite L2)
- Remote DB queries: 100ms–10s depending on database and network
- Throughput: Scales linearly with `num_workers` for CPU-bound coordination; drainer throughput is rate-limit-bound per DB
- Memory: One `RefCollector` per in-flight reference (small: a few KB each)