Abstract
This paper presents MILAGRO, a multiband passive Wi-Fi sensing system aligned with IEEE 802.11bf standardization. It fuses Channel State Information (CSI) from two bands, 5 GHz OFDM beacons (sub-7 GHz) and 60 GHz mmWave Beamforming Training (BT) frames, without any modification to existing Wi-Fi communications. Its two-block CNN architecture performs pre-classification on mmWave BT data and then integrates 5 GHz beacon CSI to produce final inferences. The system demonstrates human presence detection and corridor movement tracking with 90–100% accuracy. The paper also quantifies model generalization across furniture modifications, room changes, and temporal drift, and addresses security considerations (evil P-STA attacks, spoofing AP attacks).
Fig. 8 — Testbed building blocks (top) and hardware used in the implementation (bottom): 60 GHz Mikrotik router, 5 GHz AP Turris, USRP B210 SDR, Ryzen 7 + RTX 3070.
Key Concepts
- IEEE 802.11bf: Wi-Fi sensing standard amendment defining active sensing (dedicated signals, higher accuracy) and passive sensing (reusing existing Wi-Fi traffic). This paper focuses on the passive mode.
- Passive Sensing Station (P-STA): Unregistered receiver that listens to Wi-Fi signals transmitted between AP and STA for environmental sensing without interfering with communications or requiring network membership.
- Channel State Information (CSI): The channel frequency response H(f,t), where Y(f,t) = H(f,t)·X(f,t) + A(f,t), with X the transmitted signal, Y the received signal, and A(f,t) the additive noise term. Movements in the environment alter multipath propagation, causing detectable changes in H(f,t). CSI is a tensor over [antenna_Rx × antenna_Tx × subcarrier × time].
- Sub-7 GHz Passive Sensing (5 GHz beacons): OFDM beacon frames transmitted every ~100 ms over 20 MHz bandwidth. 52 active subcarriers per beacon. Captured with USRP B210 SDR; processed to extract 52×100 CSI matrix (subcarriers × beacons). Good penetration through walls; lower spatial resolution.
- mmWave Beamforming Training (60 GHz BT): IEEE 802.11ay BT sweeps multiple Antenna Weight Vectors (AWVs) from 60°–120°. Each AWV tests a beam configuration using Golay-sequence TRN fields. P-STA extracts Power Delay Profile (PDP) per AWV: 64×3000 input matrix. High spatial resolution; LOS-only (poor NLOS performance).
- MILAGRO (Multi-band Intelligence LeArning for Generalized Recognition and Observation): Two-block CNN model. Block 1 processes mmWave BT data (coarse pre-classification, narrows inference space). Block 2 integrates Block 1 output with 5 GHz beacon CSI for final classification. Adam optimizer, categorical cross-entropy loss.
- Auto-labeling with YOLOX: MILAGRO uses a YOLOX computer vision model to label training data automatically by detecting person presence and position via camera. The camera is removed after training for privacy.
- Passive vs Active ISAC: No dedicated sensing waveform required — environmental information extracted from standard communications traffic. Passive mode cannot match active sensing accuracy but adds zero spectrum/energy overhead.
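To make the CSI concept concrete, the sketch below evaluates a two-path multipath sum on 52 subcarriers and shows how a 3 cm change in one reflected path alters the CSI magnitude. The 5.18 GHz centre frequency, the path amplitudes and lengths, and all function names are illustrative assumptions rather than values from the paper. The 312.5 kHz spacing is the usual 20 MHz OFDM subcarrier spacing.

```python
import cmath

C = 299_792_458.0        # speed of light in m/s
SPACING_HZ = 312_500.0   # 20 MHz channel / 64-point FFT
CENTER_HZ = 5.18e9       # assumed channel centre frequency (illustrative)


def channel_response(freq_hz, paths):
    """H(f) = sum_i A_i * exp(-j*2*pi*d_i/lambda) over (amplitude, length_m) pairs."""
    wavelength = C / freq_hz
    return sum(a * cmath.exp(-2j * cmath.pi * d / wavelength) for a, d in paths)


def csi_magnitudes(paths):
    """|H| on 52 subcarriers: indices -26..26 with the DC subcarrier skipped."""
    indices = [k for k in range(-26, 27) if k != 0]
    return [abs(channel_response(CENTER_HZ + k * SPACING_HZ, paths)) for k in indices]


# Hypothetical scene: a direct path plus one reflection.
empty_room = [(1.0, 4.0), (0.4, 7.50)]
# The reflecting object moves, lengthening the reflected path by 3 cm.
person_moved = [(1.0, 4.0), (0.4, 7.53)]

before = csi_magnitudes(empty_room)
after = csi_magnitudes(person_moved)
max_change = max(abs(a - b) for a, b in zip(after, before))
print(f"{len(before)} subcarriers, largest |H| change: {max_change:.3f}")
```

Because the 5 GHz wavelength is only a few centimetres, a path-length change of that order rotates the phase of the reflected component by a large fraction of a cycle, which is why small movements are visible in the CSI.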
Key Equations and Algorithms
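- Channel response model: Y(f,t) = H(f,t)·X(f,t) + A(f,t), where H(f,t) = Σᵢ Aᵢ·exp(−j2π·dᵢ(t)/λ), summed over i = 1…L (a multipath sum of L paths, where path i has amplitude Aᵢ and length dᵢ(t), and λ is the wavelength). The per-path amplitude Aᵢ is distinct from the additive term A(f,t).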
- Conv1D output: out(i,j,:) = b(j) + Σc [in(i,c,:) ⋆ w(j,c,:)], where ⋆ is the cross-correlation operator (kernel size 2, 64 filters)
- MaxPool1D: out size = floor((input_size − K)/s) + 1, with pool size K = 2 and stride s. The stride is not stated; frameworks such as Keras default it to the pool size.
- Loss function: Categorical cross-entropy between predicted and true class distributions
- Adam optimizer: SGD variant with adaptive learning rates; learning rate = 0.001 in MILAGRO
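The Conv1D, MaxPool1D, and loss formulas above can be checked on toy inputs with a few lines of plain Python. This is a minimal sketch and not the paper's implementation: it assumes no padding, a convolution stride of 1, and a pooling stride equal to the pool size, and the function names are hypothetical.

```python
import math


def conv1d(x, w, b):
    """Cross-correlation: x is [channels][length], w is [filters][channels][kernel], b is [filters]."""
    kernel = len(w[0][0])
    out_len = len(x[0]) - kernel + 1  # no padding, stride 1 (assumed)
    return [
        [
            b[j] + sum(
                x[c][i + k] * w[j][c][k]
                for c in range(len(x))
                for k in range(kernel)
            )
            for i in range(out_len)
        ]
        for j in range(len(w))
    ]


def maxpool1d(x, pool_size=2, stride=None):
    """Max pooling over a 1D sequence; stride defaults to the pool size (assumed)."""
    stride = pool_size if stride is None else stride
    n_out = (len(x) - pool_size) // stride + 1
    return [max(x[i * stride:i * stride + pool_size]) for i in range(n_out)]


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss between a one-hot target and a predicted class distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Two input channels, one filter with kernel size 2.
x = [[1, 2, 3, 4, 5], [0, 1, 0, 1, 0]]
w = [[[1, -1], [1, 0]]]
b = [0.5]

print(conv1d(x, w, b))                      # [[-0.5, 0.5, -0.5, 0.5]]
print(maxpool1d([1, 3, 2, 5, 4]))           # [3, 5]
print(round(categorical_cross_entropy([0, 1, 0], [0.1, 0.8, 0.1]), 4))  # 0.2231
```

The pooled length follows the formula directly: floor((5 − 2)/2) + 1 = 2, so the trailing element is dropped.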
Key Claims and Findings
- Multiband outperforms single-band: MILAGRO achieves 100% accuracy on 16-label lab scenario; single-band approaches fail beyond 8–12 labels due to inadequate spatial resolution (5 GHz) or LOS dependency (mmWave).
- mmWave as coarse filter: mmWave BT achieves 100% accuracy detecting LOS interruption but degrades rapidly in NLOS and multi-label scenarios. Its role in MILAGRO is to pre-classify and reduce the search space for the 5 GHz block.
- Corridor tracking: MILAGRO detects 13/14 tile positions at 3 km/h; degrades to 10/14 at 10 km/h. Combined 5 GHz + mmWave reaches 94% vs. 85% with 5 GHz alone across 14 corridor labels.
- Temporal generalization: After 6 months without retraining, performance is unchanged in rooms and shows only minor degradation in corridors. The passage of time alone does not degrade performance; degradation stems from physical changes to the environment.
- Spatial generalization failure: Model trained in Room A fails in Room B (different wall materials/furniture → different multi-path propagation). Full retraining required per deployment environment.
- Training saturation: Accuracy saturates at ~60 samples and ~120 epochs for all class counts. Beyond these, additional samples/epochs offer marginal benefit and may cause overfitting.
- Person generalization: Model trained on up to 4 people generalizes correctly to 5 different individuals — the model captures spatial patterns, not person-specific features.
- Security: Spoofing attacks (a second AP on the same SSID and channel) corrupt CSI measurements. More complex anti-sensing defenses, such as dynamic beacon intervals and antenna polarization modification, are also feasible.
Terminology
- P-STA (Passive STA): Unregistered passive receiver reusing existing Wi-Fi signals for sensing. Requires labeled training data collected in the target environment.
- AWV (Antenna Weight Vector): mmWave beam configuration specifying amplitude and phase per antenna in the array. BT sweeps many AWVs to find optimal beam.
- TRN (Training) field / TRN-Unit / TRN Subfield: Fields in 802.11ay mmWave frames used during BT. Filled with Golay sequences; one TRN-Unit per AWV.
- PDP (Power Delay Profile): Power-vs-delay profile extracted from BT frames per AWV; used as mmWave sensing input to MILAGRO.
- Active sensing (802.11bf): Uses dedicated sensing frames, achieving higher accuracy but consuming additional spectrum/energy.
- Passive sensing (802.11bf): Reuses existing beacon and BT frames; zero communication overhead; less accurate than active but entirely non-intrusive.
- CSI tensor: H(f,t) as [antenna_Rx × antenna_Tx × K_subcarriers × time]; each dimension carries environmental information.
- MILAGRO Block 1 / Block 2: Block 1 (mmWave path): Conv1D → MaxPool1D → Dense(128, ReLU) → Dropout(0.5) → Dense(labels, softmax). Block 2 (5 GHz + Block 1 output): same layer structure. The model is trained by backpropagation with the Adam optimizer.
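As a rough aid for reading the Block 1 layer list, the sketch below traces output shapes and trainable parameter counts through that stack. Several details are assumptions, not facts from the paper: the 64×3000 input is treated as 3000 steps with 64 channels, a Flatten step is inserted before the first Dense layer, the convolution uses no padding and stride 1, and the pooling stride equals the pool size. The function name is hypothetical.

```python
def block_shape_trace(steps, channels, n_labels,
                      filters=64, kernel=2, pool=2, dense_units=128):
    """Return (layer, output_shape, trainable_params) tuples for the assumed stack."""
    trace = []

    conv_steps = steps - kernel + 1  # no padding, stride 1 (assumed)
    trace.append(("Conv1D", (conv_steps, filters),
                  kernel * channels * filters + filters))

    pool_steps = (conv_steps - pool) // pool + 1  # stride == pool size (assumed)
    trace.append(("MaxPool1D", (pool_steps, filters), 0))

    flat = pool_steps * filters  # Flatten is an assumption, not listed in the source
    trace.append(("Flatten", (flat,), 0))

    trace.append(("Dense+ReLU", (dense_units,), flat * dense_units + dense_units))
    trace.append(("Dropout(0.5)", (dense_units,), 0))
    trace.append(("Dense+softmax", (n_labels,), dense_units * n_labels + n_labels))
    return trace


# 16 labels matches the lab scenario; the axis ordering is an assumption.
trace = block_shape_trace(steps=3000, channels=64, n_labels=16)
for layer, shape, params in trace:
    print(f"{layer:<14} {str(shape):<14} {params:>12,}")
```

Under these assumptions almost all parameters sit in the first Dense layer, so the choice of axis ordering and pooling stride dominates model size. Swapping `steps` and `channels` in the call shows how much the result depends on that unverified choice.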
MILAGRO Architecture
Fig. 7 — MILAGRO scheme: mmWave CSI feeds Block 1 for coarse pre-classification; result merges with 5 GHz beacon CSI in Block 2 for final inference.
The two-block design exploits the complementary strengths of each band:
- Block 1 (mmWave): high spatial resolution where LOS is available → determines which beam paths are obstructed → coarse-grained pre-classification
- Block 2 (5 GHz): broad coverage through obstacles → fine-grained classification within the pre-classified subset
This cascade architecture avoids the accuracy collapse of naive data-fusion approaches and outperforms late fusion of single-band state-of-the-art methods.
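One way to picture the coarse-to-fine idea is a two-stage decision in which the coarse class restricts which fine labels are considered. The sketch below illustrates only that principle. In MILAGRO, Block 2 is a learned CNN that takes the Block 1 output as an input, so the hard masking, the label grouping, and the function names here are simplifying assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cascade_predict(coarse_scores, fine_scores, groups):
    """Pick a coarse class, then the best fine label inside that class.

    groups[g] lists the fine-label indices that belong to coarse class g
    (a hypothetical mapping chosen for this example).
    """
    coarse_probs = softmax(coarse_scores)
    coarse = max(range(len(coarse_probs)), key=coarse_probs.__getitem__)
    candidates = groups[coarse]
    fine_probs = softmax([fine_scores[i] for i in candidates])
    best = max(range(len(candidates)), key=fine_probs.__getitem__)
    return candidates[best], coarse


groups = [[0, 1], [2, 3]]          # coarse class 0 covers labels 0-1, class 1 covers 2-3
coarse_scores = [0.2, 2.0]         # stand-in for mmWave evidence: coarse class 1
fine_scores = [3.0, 0.1, 0.5, 1.5] # stand-in for 5 GHz evidence; label 0 scores highest overall

print(cascade_predict(coarse_scores, fine_scores, groups))  # (3, 1)
```

In this toy case the fine scores alone would select label 0, but restricting the choice to the coarse class's labels yields label 3. This mirrors how a reliable pre-classification can shrink the search space for the lower-resolution band.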
IEEE 802.11bf Use Case Coverage
| Use Case | Passive feasibility | MILAGRO result |
|---|---|---|
| Presence detection | Yes | 100% (AP inside), >80% (AP in corridor) |
| People counting / room sensing | Yes (moderate accuracy) | 100% on 16 labels with multiband |
| Human activity recognition | Yes | 80–100% per workstation (pose) |
| Corridor tracking | Yes | 94% at 3 km/h, degrades at 10 km/h |
| Healthcare (vital signs) | Insufficient | Not tested |
| Gesture recognition | Partially | Not a MILAGRO target |
| Localization / object tracking | Insufficient | Tile-level granularity only |
Connections to Existing Wiki Pages
- 5G PRS-Based Sensing — complementary active ISAC approach using 5G NR PRS; contrasts with passive-only approach here
- ML angle — MILAGRO CNN architecture cross-section
- Attention Is All You Need — transformer-based architecture for comparison; MILAGRO uses 1D CNNs instead of attention for sequential CSI data
- ISAC section index