Frequency-Adaptive Pan-Sharpening with Mixture of Experts: Breaking Frequency Bias and Enhancing Content Adaptation

Abstract

High-resolution multispectral imagery is valuable in agriculture, mapping, and environmental monitoring. However, sensors trade spatial resolution against spectral resolution, so such images are hard to acquire directly. Pan-sharpening bridges this gap by fusing a high-resolution panchromatic (PAN) image with a low-resolution multispectral image. While deep learning-based methods have advanced pan-sharpening, they still struggle with frequency bias and with adapting to diverse remote sensing content.

This work introduces FAME-Net, the Frequency-Adaptive Mixture of Experts Network, a deep learning-based method for pan-sharpening. FAME-Net employs a frequency mask predictor for adaptive high- and low-frequency masking, allowing expert networks to focus on specific frequency ranges. A gating mechanism then combines the expert outputs so that the network adapts dynamically to varying image content.

Methodology

FAME-Net’s architecture has three components that together address frequency bias and content adaptation: a frequency mask predictor, a pair of expert networks, and a gating mechanism.

Frequency Mask Predictor

The frequency mask predictor, a crucial component of FAME-Net, generates frequency masks that selectively extract high- and low-frequency features from the input PAN and multispectral images. This adaptive masking enables expert networks to concentrate on specific frequency ranges, enhancing spectral quality and spatial detail.
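To make the idea of frequency masking concrete, the following is a minimal NumPy sketch rather than FAME-Net's actual module. It assumes a 2-D FFT as the transform and uses a fixed, soft radial mask in place of the learned mask predictor; the names radial_low_mask and split_frequencies and the cutoff and sharpness parameters are hypothetical.

```python
import numpy as np

def radial_low_mask(height, width, cutoff=0.25, sharpness=20.0):
    """Soft low-pass mask with values in (0, 1) on the unshifted FFT grid.

    Hypothetical stand-in for a learned mask predictor: a sigmoid roll-off
    around `cutoff` (in cycles per pixel).
    """
    fy = np.fft.fftfreq(height)[:, None]  # vertical frequencies in [-0.5, 0.5)
    fx = np.fft.fftfreq(width)[None, :]   # horizontal frequencies in [-0.5, 0.5)
    radius = np.sqrt(fy ** 2 + fx ** 2)
    return 1.0 / (1.0 + np.exp(sharpness * (radius - cutoff)))

def split_frequencies(image, low_mask):
    """Split a single-band image into low- and high-frequency parts.

    The high-frequency mask is the complement of the low-frequency mask,
    so the two parts sum back to the input.
    """
    spectrum = np.fft.fft2(image)
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * (1.0 - low_mask)).real
    return low, high
```

Because the two masks are complementary, low + high reconstructs the input up to floating-point error. In a learned setting the mask would be predicted per image instead of fixed, which is what makes the separation adaptive.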

Expert Networks

FAME-Net employs two expert networks, one per frequency range. The high-frequency expert network reconstructs fine spatial detail such as edges and texture, while the low-frequency expert network handles the smooth content that carries most of the spectral (color) information. This division of labor lets each expert specialize, so that the pan-sharpened image retains both spectral fidelity and spatial detail.

Gating Mechanism

The gating mechanism in FAME-Net dynamically integrates the outputs of the expert networks. It weights the contribution of each expert network according to the image content, so that the final pan-sharpened image balances spectral fidelity and spatial detail.
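Content-dependent gating of this kind can be sketched as follows. This is an illustrative NumPy example, not FAME-Net's implementation: it assumes a per-pixel softmax gate computed from a linear projection of some feature map, and the function gate_and_mix, the array shapes, and the gate_weights parameter are all hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def gate_and_mix(expert_outputs, features, gate_weights):
    """Blend expert outputs with per-pixel softmax weights.

    expert_outputs: (E, H, W, C) array, one output image per expert
    features:       (H, W, F) array the gate uses to judge image content
    gate_weights:   (F, E) linear gate (assumed form; it would be learned)
    """
    logits = features @ gate_weights    # (H, W, E) score per expert per pixel
    weights = softmax(logits, axis=-1)  # weights sum to 1 at every pixel
    mixed = np.einsum("ehwc,hwe->hwc", expert_outputs, weights)
    return mixed, weights
```

With an all-zero gate every expert receives the same weight and the result is the plain average of the expert outputs; a trained gate would instead shift weight toward whichever expert suits the local content.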

Experimental Results

Experiments on the WorldView-III dataset and on full-resolution remote sensing imagery show that FAME-Net outperforms existing state-of-the-art methods.

Quantitative Evaluation

FAME-Net consistently outperforms the compared methods on the quantitative metrics: spectral angle mapper (SAM), correlation coefficient (CC), root mean square error (RMSE), and erreur relative globale adimensionnelle de synthèse (ERGAS). Lower values are better for SAM, RMSE, and ERGAS, while higher values are better for CC. These results indicate that FAME-Net preserves spectral quality while enhancing spatial resolution.
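For reference, the four metrics can be computed as in the NumPy sketch below, which follows their standard definitions rather than any evaluation code from the paper. It assumes images are stored as (height, width, bands) arrays with positive band means. The function names are illustrative, and the default resolution_ratio of 0.25 (PAN pixel size divided by multispectral pixel size) is an assumption corresponding to a 4x scale factor.

```python
import numpy as np

def sam_degrees(reference, fused, eps=1e-12):
    """Mean spectral angle in degrees between per-pixel spectra."""
    dot = np.sum(reference * fused, axis=-1)
    norms = np.linalg.norm(reference, axis=-1) * np.linalg.norm(fused, axis=-1)
    cos = np.clip(dot / np.maximum(norms, eps), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())

def correlation_coefficient(reference, fused):
    """Pearson correlation per band, averaged over bands."""
    values = []
    for k in range(reference.shape[-1]):
        r = reference[..., k].ravel()
        f = fused[..., k].ravel()
        values.append(np.corrcoef(r, f)[0, 1])
    return float(np.mean(values))

def rmse(reference, fused):
    """Root mean square error over all pixels and bands."""
    return float(np.sqrt(np.mean((reference - fused) ** 2)))

def ergas(reference, fused, resolution_ratio=0.25):
    """ERGAS: 100 * ratio * sqrt(mean over bands of (RMSE_k / mean_k)^2)."""
    band_rmse = np.sqrt(np.mean((reference - fused) ** 2, axis=(0, 1)))
    band_mean = np.mean(reference, axis=(0, 1))
    relative = (band_rmse / band_mean) ** 2
    return float(100.0 * resolution_ratio * np.sqrt(np.mean(relative)))
```

SAM is insensitive to a uniform brightness scaling of the fused image, which is why it is read as a spectral measure, whereas RMSE and ERGAS penalize any per-pixel intensity error.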

Qualitative Assessment

Visual comparisons show that FAME-Net’s pan-sharpened images preserve colors and textures accurately, with sharp, well-defined edges. In contrast, other methods often exhibit spectral distortion and loss of spatial detail.

Conclusion

FAME-Net, the Frequency-Adaptive Mixture of Experts Network, combines a frequency mask predictor, expert networks, and a gating mechanism to address the challenges of frequency bias and content adaptation in pan-sharpening. Its performance in both quantitative and qualitative evaluations suggests that the approach could also benefit related image processing applications, including image fusion, super-resolution, and denoising.