-
Notifications
You must be signed in to change notification settings - Fork 510
Labels
Description
Problem: At ~2,000 LOC, it handles config‐loading, model‐initialization, KB bootstrapping, event translation v1/v2, sync/async bridging, Colang runtime orchestration, streaming post-processing, response packaging, and public API registration.
Goal: Break into:
- ConfigLoader (load/validate/merge flows, messages, imports)
- ModelFactory (instantiate LLMs + streaming flags)
- KnowledgeBaseBuilder
- EventTranslator (messages ↔ events)
- RuntimeOrchestrator (invoke Colang runtime, retries, semaphores)
- ResponseAssembler (assemble GenerationResponse or streaming SSE)
- RailsAPI (facade wiring above components)
Then inject to LLMRails.
Current issues:
- high cyclomatic complexity: dozens of nested if/else on config flags, colang versions, streaming vs non-streaming, sync vs async.
- tangled control flow: the same operation (e.g. generate) has different code paths for v1 vs. v2, for prompt vs. messages, for threads vs. loop, for streaming vs. batch.
- hidden dead code:
if True or check_sync_call_from_async_loop()
means we always go one branch, so half the init logic is effectively never exercised, yet still maintained. - global mutable state (process_events_semaphore, events_history_cache) risking cross request bleed.
- responsibility explosion: loading files, parsing DSL, building indices, spinning threads, instantiating models, translating messages, orchestrating runtime, chunking streams, packaging responses, exposing public APIs all in one 1,500 line monolith.
- testing difficulty: to unittest anything we need to mock out the threading, the asyncio loops, the file system, the runtime, the LLM providers, the knowledge base, etc.