HOSA
A bio-inspired architecture for endogenous resilience in Linux systems. Multivariate anomaly detection via Mahalanobis Distance with kernel-space instrumentation and autonomous graduated response.
This project presents HOSA, a software architecture for autonomous resilience of Linux operating systems. HOSA replaces the dominant model of exogenous telemetry — where anomaly detection and failure mitigation depend on external central servers — with a model of Endogenous Resilience, where each computational node possesses autonomous capacity for multivariate detection and real-time local mitigation, independent of network connectivity.
Anomaly detection is performed through multivariate statistical analysis based on the Mahalanobis Distance and its temporal rate of change, with signal collection via eBPF in the Linux Kernel Space. Mitigation is executed through deterministic manipulation of Cgroups v2 and XDP, implementing a graduated response system inspired by the human nervous system's reflex arc.
HOSA does not replace orchestrators or global monitoring systems. It complements them by operating in the temporal interval where these systems are structurally incapable of acting: the milliseconds between the onset of collapse and the arrival of the first metric at the external control plane.
The operational cycle of exogenous monitoring follows a discrete sequence with cumulative latency. The interval between the onset of lethal stress and the arrival of the first usable metric at the external monitoring system constitutes what this work terms the Lethal Interval — the window where systems collapse without the external observer having awareness of the problem.
memory.high containment at t=2s. No process killed.
Memory leak begins in payment-service. Rate: ~50 MB/s.
Prometheus last scraped 8s ago. Next scrape in 7s. Data shows: "healthy."
Mahalanobis Distance crosses vigilance threshold. Sampling rate increased from 100ms to 10ms.
Prometheus: no scrape in this interval. Zero awareness.
Dominant contributor identified: /kubepods/pod-payment-service-7b4f.
Action: memory.high reduced from 2G → 1.6G. Webhook dispatched.
Prometheus: next scrape in 5s. Still showing stale data from t=−8s.
HOSA detected and contained the anomaly before any external monitoring system could collect its first post-leak data point. In the counterfactual scenario without HOSA, the OOM kill occurs at t≈40s and Prometheus alerts at t≈100s, about 50× later than HOSA's containment at t=2s.
Derivative decelerating — containment is effective. No escalation needed.
Prometheus: scrapes now. Sees mem=1.47GB. Rule requires >1.8GB for 1m. Result: OK
Memory at 74% — plateau reached. Derivative near zero. System degraded but functional. All transactions preserved. No process killed.
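The containment step in this timeline (memory.high lowered from 2G to 1.6G) can be sketched as a single write to the cgroup v2 `memory.high` interface file. This is a minimal illustration, not HOSA's actuator: the function name and the `keep_percent` parameter are hypothetical, and the demo writes to a temporary directory standing in for a real cgroup directory under `/sys/fs/cgroup`.

```python
import tempfile
from pathlib import Path


def tighten_memory_high(cgroup_dir, current_limit_bytes, keep_percent=80):
    """Write a lower memory.high limit (in bytes) and return the new value.

    keep_percent=80 mirrors the 2G -> 1.6G step in the timeline; the
    parameter itself is an assumption made for this sketch.
    """
    new_limit = current_limit_bytes * keep_percent // 100
    (Path(cgroup_dir) / "memory.high").write_text(f"{new_limit}\n")
    return new_limit


# Demo against a temporary directory, so nothing on the host is throttled.
with tempfile.TemporaryDirectory() as fake_cgroup:
    new_limit = tighten_memory_high(fake_cgroup, 2 * 1024**3)
    written = (Path(fake_cgroup) / "memory.high").read_text().strip()
```

Because `memory.high` throttles and reclaims rather than killing, raising the value again (or writing `max`) undoes the action, which matches the reversibility the response levels describe.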
payment-service killed mid-transaction. Data corrupted. CrashLoopBackOff begins.
Customers receive 502 errors.
Alert fires 60 seconds after the first crash. The rule's "for 1m" condition is finally satisfied.
On-call engineer paged. Postmortem begins.
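The timing figures quoted in this scenario can be cross-checked with a few lines of arithmetic. The variable names are illustrative; every number is taken from the timeline itself.

```python
# All figures come from the scenario above; only the names are made up.
seconds_since_last_scrape = 8   # "Prometheus last scraped 8s ago"
seconds_to_next_scrape = 7      # "Next scrape in 7s"
hosa_containment_s = 2          # memory.high containment at t=2s
oom_kill_s = 40                 # counterfactual OOM kill at t≈40s
alert_hold_s = 60               # alert fires 60 seconds after the first crash

scrape_interval_s = seconds_since_last_scrape + seconds_to_next_scrape
alert_s = oom_kill_s + alert_hold_s
slowdown = alert_s / hosa_containment_s

print(f"scrape interval: {scrape_interval_s}s")
print(f"counterfactual alert at t={alert_s}s, {slowdown:.0f}x later than HOSA")
```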
The architecture draws from the biological reflex arc: the spinal cord retracts your hand from a hot surface in milliseconds, then notifies the brain for contextual processing. HOSA applies this pattern — immediate local action followed by opportunistic notification to the central control plane.
No static thresholds. HOSA learns the behavioral profile of the node — how CPU, memory, I/O, and network correlate — and detects deviations using the Mahalanobis Distance. It identifies anomalous correlation structures that per-metric alerts miss.
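To make the idea concrete, here is a minimal sketch of a Mahalanobis Distance computed against a learned baseline, using only the Python standard library. The two-metric baseline (CPU % and network load that normally rise together) is invented for illustration, and the helper names are hypothetical; nothing here reflects HOSA's actual engine.

```python
import math


def mean_and_cov(samples):
    """Mean vector and sample covariance matrix of a list of row vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    cov = [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^-1 (x - mean)), without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    z = solve(cov, diff)
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))


# Invented baseline: CPU and network load move together on this node.
baseline = [[20, 21], [30, 29], [40, 41], [50, 49], [60, 61], [70, 69]]
mean, cov = mean_and_cov(baseline)

consistent = [65.0, 65.0]  # high load, but the usual correlation holds
broken = [65.0, 25.0]      # each value is within its usual range, the pairing is not

print(mahalanobis(consistent, mean, cov))
print(mahalanobis(broken, mean, cov))
```

The second point would pass any per-metric threshold, since both values lie inside the observed ranges, yet its distance is far larger because the CPU/network relationship is violated.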
Metrics collected via eBPF probes attached directly to kernel tracepoints. No polling, no scraping. Data flows through ring buffers with microsecond latency for kernel↔user space transitions.
HOSA computes the velocity (dD̄_M/dt) and acceleration (d²D̄_M/dt²) of deviation from homeostasis. It detects the trajectory toward collapse, not merely the arrival at a critical state.
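One simple way to obtain such signals is to smooth the distance series and take finite differences. The sketch below assumes an exponentially weighted moving average for the smoothing and a fixed sampling interval `dt`; both choices, and the function names, are assumptions for illustration rather than HOSA's documented method.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average, seeded with the first value."""
    smoothed, s = [], values[0]
    for v in values:
        s = alpha * v + (1 - alpha) * s
        smoothed.append(s)
    return smoothed


def derivatives(series, dt):
    """First and second finite differences of an evenly sampled series."""
    velocity = [(b - a) / dt for a, b in zip(series, series[1:])]
    acceleration = [(b - a) / dt for a, b in zip(velocity, velocity[1:])]
    return velocity, acceleration


# Invented distance samples taken every 0.5 s: flat, then curving upward.
distance = [1.0, 1.0, 1.5, 2.5, 4.0]
velocity, acceleration = derivatives(distance, dt=0.5)
smoothed = ewma(distance, alpha=0.5)

print(velocity)
print(acceleration)
```

A positive acceleration while the distance itself is still small is the kind of early signal that a magnitude-only threshold cannot see.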
Six response levels from passive observation to autonomous quarantine, proportional to severity. No binary kill switches. Throttle first, contain second, isolate only as last resort. Every action is reversible at Levels 0–4.
No TSDB, no message broker, no cloud API required for primary function. Communication with orchestrators is opportunistic — utilized when available, never required for node survival.
Every autonomous action is logged with its mathematical justification: D_M value, derivative, threshold crossed, target cgroup, action taken. Full transparency for postmortem analysis. No black boxes.
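A log entry carrying those fields might look like the following sketch, which serializes one record as a JSON line. The field names, the record type, and all numeric values are hypothetical; only the cgroup path and the action text are taken from the example scenario on this page.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ActionRecord:
    """One autonomous action plus the numbers that justified it (illustrative schema)."""
    timestamp_s: float
    level: int
    d_m: float
    d_m_velocity: float
    threshold_crossed: float
    target_cgroup: str
    action: str


def to_log_line(record):
    """Serialize a record as one JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = ActionRecord(
    timestamp_s=2.0,
    level=3,                 # illustrative value
    d_m=9.4,                 # illustrative value
    d_m_velocity=3.1,        # illustrative value
    threshold_crossed=8.0,   # illustrative value
    target_cgroup="/kubepods/pod-payment-service-7b4f",
    action="memory.high 2G -> 1.6G",
)
line = to_log_line(record)
print(line)
```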
Inspired by biological threat response — proportional force from silent observation to network isolation, determined by the magnitude and acceleration of the Mahalanobis Distance.
| Level | Name | Action | Reversibility |
|---|---|---|---|
| 0 | Homeostasis | None. Suppress redundant telemetry. Heartbeat only. | — |
| 1 | Vigilance | Increase sampling rate. Log locally. No intervention. | Automatic |
| 2 | Soft Containment | renice non-essential processes. Webhook notification. | Automatic |
| 3 | Active Containment | CPU/memory throttling via cgroups. Partial load shedding via XDP. | Auto w/ hysteresis |
| 4 | Severe Containment | Aggressive throttling. Block inbound traffic except healthchecks. Freeze non-critical cgroups. | Sustained recovery |
| 5 | Quarantine | Network isolation. Freeze non-essential processes. Environment-aware recovery mode. | Manual |
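As a rough illustration of how magnitude and acceleration could jointly select a level from this table, the sketch below maps a distance value onto Levels 0 to 5 and escalates one step when the acceleration is high. The threshold values, the `accel_limit` parameter, and the one-step escalation rule are all assumptions made for this example; the page does not publish HOSA's actual thresholds or transition conditions.

```python
import bisect

# Hypothetical distance thresholds separating Levels 0..5.
THRESHOLDS = [3.0, 5.0, 8.0, 12.0, 20.0]
LEVEL_NAMES = [
    "Homeostasis",
    "Vigilance",
    "Soft Containment",
    "Active Containment",
    "Severe Containment",
    "Quarantine",
]


def select_level(d_m, acceleration, accel_limit=1.0):
    """Pick a level from the distance, then escalate once if it is accelerating."""
    level = bisect.bisect_right(THRESHOLDS, d_m)
    if acceleration > accel_limit:
        level = min(level + 1, len(LEVEL_NAMES) - 1)
    return level


for d_m, accel in [(2.0, 0.0), (6.0, 0.0), (6.0, 2.0), (25.0, 5.0)]:
    level = select_level(d_m, accel)
    print(d_m, accel, level, LEVEL_NAMES[level])
```

A production version would also need hysteresis on the way down, as the Reversibility column indicates for Level 3, so that a node does not oscillate between adjacent levels.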
Three functional layers — sensory (eBPF), cortex (mathematical engine), motor (cgroups/XDP) — operating in a continuous loop. Kernel↔user space transitions occur via eBPF ring buffers with microsecond latency.
Used when available. Never required for primary function.
"Orchestrators and centralized monitoring systems are essential instruments for capacity planning, load balancing, and long-term infrastructure governance. However, they are structurally — not accidentally — too slow to guarantee a node's survival in real time. If collapse occurs in the interval between perception and exogenous action, the capacity for immediate decision must reside in the node itself."
— HOSA Whitepaper v2.1, §1.3
From foundational concepts to implementation details.
Endogenous Resilience, the Lethal Interval, and why local autonomy matters.
The perceptive-motor cycle, warm-up calibration, and system design decisions.
Mahalanobis Distance, Welford updates, EWMA, derivatives, and regime taxonomy.
Six graduated levels, cgroups actuation, XDP load shedding, quarantine modes.
Baseline recalibration, weighted Welford, safety guards, and event-driven suppression.
Behavioral state classification, derivative signals, and transition conditions.
Deployment modes, core parameters, safelists, webhooks, and CLI reference.
Full academic foundation — 52 pages covering theory, taxonomy, and validation plan.
Amorim, F. R. (2026). HOSA — Homeostasis Operating System Agent: A Bio-Inspired Architecture for Autonomous Linux Resilience. Whitepaper v2.1. IMECC, Universidade Estadual de Campinas (Unicamp).
% BibTeX
@techreport{amorim2026hosa,
title = {HOSA --- Homeostasis Operating System Agent},
author = {Amorim, Fabricio Roney de},
year = {2026},
institution = {IMECC, Universidade Estadual de Campinas},
type = {Whitepaper},
version = {2.1},
url = {https://bricio-sr.github.io/hosa/}
}