# SURCHI Sentinel AI Inference: Models and Architecture

> SURCHI deploys ensemble AI models per sentinel domain for market intelligence, liquidity risk, and execution optimization with continuous model updates.

AI inference is the cognitive core of the SURCHI protocol, the layer where raw data becomes actionable intelligence. The inference architecture is designed around three requirements: accuracy, speed, and verifiability. Accuracy demands specialized models trained on domain-specific on-chain signals. Speed demands low-latency inference pipelines capable of keeping pace with Solana's block cadence. Verifiability demands that inference results be auditable, with confidence scores and input features recorded alongside outputs.

## Model Architecture

SURCHI uses an **ensemble approach**: rather than a single general-purpose model attempting to handle all intelligence tasks, the protocol deploys specialized models for each sentinel domain. This decision reflects a well-established principle in applied machine learning, namely that narrow models trained on domain-specific tasks often outperform general-purpose models on those tasks.

The three model domains correspond directly to the three sentinels:

**Alpha Sentinel models.** Specialized for market intelligence tasks: price movement prediction, regime classification (trending vs. mean-reverting), whale flow analysis, and opportunity scoring. Trained primarily on DEX flow features, wallet movement data, and cross-market correlation signals.

**Liquidity Sentinel models.** Specialized for liquidity risk assessment: pool health scoring, impermanent loss projection, liquidity depth forecasting, and early warning signal generation for liquidity crises. Trained on pool-level time series data and historical liquidity events.

**Execution Sentinel models.** Specialized for execution optimization: entry and exit timing, slippage minimization, route selection across DEX venues, and transaction sequencing. Trained on historical execution data, price impact measurements, and MEV-resistant routing outcomes.

Ensemble outputs from the three model families are aggregated by the Sentinel Core's cross-signal correlation layer, which weights individual model outputs by their recent accuracy metrics and generates a combined protocol-level confidence score for high-stakes decisions.

***

## Inference Pipeline

Feature vectors from the Data Pipeline enter a five-stage inference process:

**Stage 1:** Raw feature vectors from the Data Pipeline are validated for completeness and freshness. Missing features are imputed using short-window historical averages where permissible, or flagged as incomplete for conservative handling. Vectors are then standardized against per-feature normalization parameters derived from training distributions.

**Stage 2:** Validated feature vectors are dispatched to the relevant sentinel models in parallel. Each model runs inference independently, producing a structured output object containing the primary prediction, supporting feature attributions (which input features most influenced the output), and a raw confidence score.

**Stage 3:** Each model's raw output is passed through a calibrated confidence scoring layer. Confidence scores are calibrated against held-out validation data so that they reflect true predictive accuracy rather than model overconfidence. Signals with confidence scores below configurable thresholds are suppressed or downweighted before routing to the Core.

**Stage 4:** The Sentinel Core's aggregation layer combines outputs from multiple active models, applying correlation weighting to account for signal overlap. When Alpha and Liquidity Sentinel signals point in conflicting directions, the aggregation layer surfaces the conflict rather than silently averaging it away. Conflicts above a severity threshold are escalated to the user.

**Stage 5:** Aggregated, validated intelligence outputs are delivered simultaneously to the Natural Language Interface (for user-facing presentation) and the Execution Sentinel (for strategy evaluation). Outputs carry both an inference completion timestamp and a data freshness timestamp, enabling downstream consumers to apply freshness-based weighting.

***

## Continuous Learning

AI models in a live financial environment degrade without retraining. Market microstructure changes. New token categories emerge. Liquidity patterns shift as protocols evolve. SURCHI's continuous learning architecture is designed to keep models current:

**Rolling retraining:** Each sentinel's models are retrained on a rolling basis using a sliding window of recent on-chain data. Retraining jobs run on a scheduled cadence, with frequency calibrated to the rate of regime change in each domain. Execution models, which are most sensitive to short-term microstructure shifts, retrain most frequently.

**Model versioning:** Every model deployment is versioned. The protocol maintains a version registry that records training data windows, validation metrics, and deployment timestamps for all active and archived model versions. Rollback to a prior version is possible without protocol downtime.

**Performance monitoring:** Live model performance is monitored continuously against realized outcomes. When a model's accuracy metrics, measured over a rolling evaluation window, fall below threshold, automated alerts trigger an out-of-cycle retraining run.

**Governance approval for major versions:** Minor model updates (parameter tuning within the same architecture) are deployed by the protocol team without governance approval. Major version changes (new architectures, expanded feature sets, or changes to confidence calibration methodology) require governance approval via the SURCHI DAO before deployment.

***

## Latency Targets

Inference latency is a first-class performance metric.
The Sentinel Core monitors per-model inference latency against these targets:

| Sentinel                   | Inference Target | P99 Target | Notes                                                 |
| -------------------------- | ---------------- | ---------- | ----------------------------------------------------- |
| Alpha Sentinel             | \< 25ms          | \< 60ms    | Market opportunity windows are short-lived            |
| Liquidity Sentinel         | \< 40ms          | \< 100ms   | Pool health changes on block cadence                  |
| Execution Sentinel         | \< 15ms          | \< 40ms    | Execution timing precision requires minimal latency   |
| Cross-Sentinel Aggregation | \< 10ms          | \< 25ms    | Adds to per-sentinel latency; combined target \< 50ms |

Latency SLA breaches trigger automated alerting. Persistent latency degradation above P99 targets initiates capacity scaling procedures.

***

Model weights are updated by the protocol team and subject to governance approval for major version changes.
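
As an illustration of how the latency targets in the table could be checked against observed samples, here is a minimal Python sketch. It is not the protocol's monitoring code. The names `LatencyTarget`, `TARGETS`, `check_latency`, and `combined_within_target` are hypothetical, the nearest-rank percentile method is one common choice among several, and comparing the "Inference Target" column against the median is an assumption, since the source does not say which statistic that column refers to.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class LatencyTarget:
    typical_ms: float  # "Inference Target" column (assumed here to mean the median)
    p99_ms: float      # "P99 Target" column


# Values copied from the latency table; the dictionary keys are illustrative.
TARGETS = {
    "alpha": LatencyTarget(25.0, 60.0),
    "liquidity": LatencyTarget(40.0, 100.0),
    "execution": LatencyTarget(15.0, 40.0),
    "aggregation": LatencyTarget(10.0, 25.0),
}

COMBINED_TARGET_MS = 50.0  # per-sentinel plus aggregation, from the table notes


def p99(samples_ms):
    """Return the 99th percentile using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.99 * n, avoiding floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def check_latency(component, samples_ms):
    """Compare observed latencies for one component with its targets.

    The table states targets as strict upper bounds ("< 25ms"), so a value
    equal to the target counts as a breach.
    """
    target = TARGETS[component]
    observed_median = median(samples_ms)
    observed_p99 = p99(samples_ms)
    return {
        "component": component,
        "median_ms": observed_median,
        "p99_ms": observed_p99,
        "typical_breach": observed_median >= target.typical_ms,
        "p99_breach": observed_p99 >= target.p99_ms,
    }


def combined_within_target(sentinel_ms, aggregation_ms):
    """True if sentinel plus aggregation latency is under the combined target."""
    return sentinel_ms + aggregation_ms < COMBINED_TARGET_MS


if __name__ == "__main__":
    # Two slow outliers in 100 samples push the P99 over the Alpha target.
    report = check_latency("alpha", [10.0] * 98 + [70.0, 70.0])
    print(report)
```

In this sketch a breach is only reported; wiring the result to alerting or capacity scaling is left out because the source does not describe those mechanisms.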