Overview

This cross-section entry covers the ML model architecture and learning aspects of the 802.11bf multiband passive sensing paper. For the full summary covering passive sensing, 802.11bf protocol, and experimental results, see the primary page: 802.11bf Multiband Passive Sensing (primary).

MILAGRO: A Two-Block CNN for Multiband Fusion

MILAGRO (Multi-band Intelligence LeArning for Generalized Recognition and Observation) is a hierarchical CNN that processes heterogeneous multi-band RF data (60 GHz mmWave beam training (BT) measurements plus 5 GHz beacon CSI) for indoor human sensing. It differs from standard single-stream CNNs in its cascade architecture: Block 1 produces a coarse pre-classification that constrains the inference space for Block 2.

Block 1 (mmWave BT path):

  • Input: 64 × 3000 PDP matrix (antenna array × power measurements per AWV sweep)
  • Conv1D(64 filters, kernel=2) → MaxPool1D(pool=2) → Dense(128, ReLU) → Dropout(0.5) → Dense(labels, softmax)
  • Output: coarse pre-classification that selects a routing path in Block 2

Block 2 (5 GHz + Block 1 output):

  • Input: 52 × 100 CSI matrix (subcarriers × beacons) + Block 1 output
  • Same layer structure as Block 1, with multiple parallel paths selected by Block 1 output
  • Training: Adam optimizer (lr=0.001) with categorical cross-entropy loss
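
The two-block cascade can be sketched in NumPy. This is a minimal forward-pass illustration under stated assumptions, not the authors' implementation: weights are random placeholders, Dropout is a no-op at inference, and the 4-class Block 1 output and the single stand-in path for Block 2 are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w, b):
    """Valid 1D convolution with ReLU. x: (T, C_in), w: (k, C_in, C_out)."""
    k = w.shape[0]
    out = np.empty((x.shape[0] - k + 1, w.shape[2]))
    for t in range(out.shape[0]):
        out[t] = np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)

def maxpool1d(x, pool=2):
    T = (x.shape[0] // pool) * pool
    return x[:T].reshape(-1, pool, x.shape[1]).max(axis=1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def block(x, n_labels):
    """Conv1D(64, kernel=2) -> MaxPool1D(2) -> Dense(128, ReLU) -> softmax.
    Weights are random placeholders; Dropout(0.5) is skipped at inference."""
    h = conv1d(x, 0.01 * rng.standard_normal((2, x.shape[1], 64)), np.zeros(64))
    h = maxpool1d(h).ravel()
    h = np.maximum(h @ (0.01 * rng.standard_normal((h.size, 128))), 0.0)
    return softmax(h @ (0.01 * rng.standard_normal((128, n_labels))))

# Block 1: mmWave BT PDP, treated as 3000 time steps x 64 antenna channels.
pdp = rng.standard_normal((3000, 64))
pre = block(pdp, 4)               # coarse pre-classification (4 classes assumed)
route = int(np.argmax(pre))       # selects one parallel path in Block 2

# Block 2: 5 GHz beacon CSI, 100 beacons x 52 subcarriers, routed by Block 1.
csi = rng.standard_normal((100, 52))
probs = block(csi, 16)            # stands in for the path indexed by `route`
```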

Key architectural choices and why:

  • 1D CNN over 2D CNN: Sequential/temporal nature of beacon CSI and BT sweeps suits 1D convolution; each filter captures short-term temporal dependencies (kernel=2 examines consecutive time steps)
  • Cascade rather than late fusion: Late fusion of state-of-the-art single-band models underperforms MILAGRO; mmWave pre-classification constrains the 5 GHz search space and removes confusing NLOS cases
  • Softmax outputs: Multi-class classification with probability vectors per label; same architecture used for both binary (presence/absence) and multi-label (16 workstation combinations) tasks
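
The consecutive-time-step intuition behind kernel=2 can be seen with a toy difference filter (illustrative, not a trained MILAGRO filter): convolving a step signal with [-1, 1] responds only where adjacent samples change.

```python
import numpy as np

signal = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # a step change at index 3
kernel = np.array([-1.0, 1.0])                     # width-2 "difference" filter

# Valid 1D cross-correlation, as computed inside a Conv1D layer
response = np.array([signal[i:i + 2] @ kernel for i in range(len(signal) - 1)])
# response is nonzero only at the position where consecutive time steps differ
```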

Training and Generalization Properties

  • Sample efficiency: Accuracy saturates at ~60 training samples and ~120 epochs across 4, 8, and 16 class settings
  • Person generalization: MILAGRO generalizes across different individuals in the same environment — it learns spatial/path-based patterns, not person-specific RF signatures
  • Spatial transfer failure: Cannot transfer to new rooms without retraining — CSI encodes room-specific multipath propagation
  • Temporal robustness: No significant accuracy decay over 6 months under stable physical environments; degradation is caused by furniture changes, not time passage itself
  • Auto-labeling: YOLOX (a computer-vision object detector) provides automated training labels by detecting person positions via camera; the camera is removed after training to preserve privacy
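
The training setup stated above (Adam at lr=0.001 with categorical cross-entropy) can be sketched in NumPy. This is a generic illustration of the loss and a single Adam update, not code from the paper.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Mean cross-entropy between one-hot labels and softmax probabilities."""
    return float(-np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1)))

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; returns the new (w, m, v) triple."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)     # bias correction for step t (1-indexed)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Uniform predictions over 4 classes give loss ln(4) ~ 1.386; perfect ones ~ 0.
y_onehot = np.eye(4)
loss_uniform = categorical_crossentropy(y_onehot, np.full((4, 4), 0.25))
```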

Relationship to Other ML Architectures in the Wiki

  • vs Transformer (Attention Is All You Need): Both process sequential data. Transformer uses self-attention (O(n²) complexity, long-range dependencies). MILAGRO uses 1D CNN (O(n) complexity, local temporal patterns). For CSI data with short-range temporal correlations, CNN is a more practical choice.
  • vs NCP-AAI Agentic AI concepts: MILAGRO is a narrow supervised classification model, not an agentic system; the sensing context illustrates where classical ML models remain appropriate over LLM-based approaches.
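
The O(n²)-vs-O(n) contrast above can be made concrete with a rough multiply-accumulate count (the constants and the single-head, single-layer scope are simplifying assumptions):

```python
def attention_macs(n, d):
    """Rough MACs for one self-attention head: QK^T scores plus value mixing."""
    return 2 * n * n * d

def conv1d_macs(n, k, c_in, c_out):
    """Rough MACs for one valid Conv1D layer."""
    return (n - k + 1) * k * c_in * c_out

# For a 100-beacon CSI sequence (d = 52), doubling the sequence length
# quadruples attention cost but only roughly doubles Conv1D cost.
att_ratio = attention_macs(200, 52) / attention_macs(100, 52)
conv_ratio = conv1d_macs(200, 2, 52, 64) / conv1d_macs(100, 2, 52, 64)
```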

Connections to Existing Wiki Pages