Multimodal Learning 相关度: 7/10

U-Net with Hadamard Transform and DCT Latent Spaces for Next-day Wildfire Spread Prediction

Yingyi Luo, Shuaiang Rong, Adam Watts, Ahmet Enis Cetin

arXiv: 2602.11672v1 发布: 2026-02-12 更新: 2026-02-12

下载 PDF arXiv 页面

AI 摘要

TD-FusionUNet模型利用哈达玛变换和DCT进行野火蔓延预测，在精度和效率间取得平衡。

主要贡献

提出TD-FusionUNet模型，融合哈达玛变换和DCT
引入随机边缘裁剪和高斯混合模型预处理技术
在WildfireSpreadTS数据集上优于UNet ResNet18基线

方法论

使用深度学习模型TD-FusionUNet，结合可训练的变换层，从多模态卫星数据预测次日野火蔓延。

原文摘要

We developed a lightweight and computationally efficient tool for next-day wildfire spread prediction using multimodal satellite data as input. The deep learning model, which we call Transform Domain Fusion UNet (TD-FusionUNet), incorporates trainable Hadamard Transform and Discrete Cosine Transform layers that apply two-dimensional transforms, enabling the network to capture essential "frequency" components in orthogonalized latent spaces. Additionally, we introduce custom preprocessing techniques, including random margin cropping and a Gaussian mixture model, to enrich the representation of the sparse pre-fire masks and enhance the model's generalization capability. The TD-FusionUNet is evaluated on two datasets which are the Next-Day Wildfire Spread dataset released by Google Research in 2023, and WildfireSpreadTS dataset. Our proposed TD-FusionUNet achieves an F1 score of 0.591 with 370k parameters, outperforming the UNet baseline using ResNet18 as the encoder reported in the WildfireSpreadTS dataset while using substantially fewer parameters. These results show that the proposed latent space fusion model balances accuracy and efficiency under a lightweight setting, making it suitable for real time wildfire prediction applications in resource limited environments.

arXiv 分类

cs.CV

AI 摘要

主要贡献

方法论

原文摘要

标签

arXiv 分类