Journal of Shandong University (Engineering Science) ›› 2024, Vol. 54 ›› Issue (5): 101-110. doi: 10.6040/j.issn.1672-3961.0.2023.271

• Machine Learning and Data Mining •

Federated long-term time series forecasting algorithm based on decomposed Transformer

LIU Donglan1,4*, LIU Xin1,4, LIU Jiale2, ZHAO Peng3, CHANG Yingxian3, WANG Rui1,4, YAO Honglei1,4, LUO Xin2

  1. State Grid Shandong Electric Power Research Institute, Jinan 250003, Shandong, China;
  2. School of Software, Shandong University, Jinan 250101, Shandong, China;
  3. State Grid Shandong Electric Power Company, Jinan 250001, Shandong, China;
  4. Shandong Smart Grid Technology Innovation Center, Jinan 250003, Shandong, China
  • Published: 2024-10-18
  • About the author: LIU Donglan (1987— ), female, born in Xuanwei, Yunnan, China; senior engineer with a master's degree; her research interests include network security, data security, privacy-preserving computation, and blockchain. E-mail: liudonglan2006@126.com
  • Supported by: Science and Technology Project of State Grid Shandong Electric Power Company (520626220018)

Abstract: To address the high computational cost of Transformer-based methods and their inability to capture the overall trend of a time series, the Transformer was combined with seasonal-trend decomposition, and a federated long-term time series forecasting algorithm based on a decomposed Transformer was proposed, in which the decomposition captured the global profile of the series. In practical settings, time series data come from multiple different clients. To account for data privacy, federated learning was used to obtain an overall optimal forecasting model from the clients, and an optimizer based on local sharpness-aware minimization was adopted to improve the generalization of the global model. Compared with state-of-the-art methods, the proposed method improved both multivariate and univariate forecasting on four benchmark datasets, with gains of up to 26.9% on the electricity consuming load (ECL) dataset. The experimental results demonstrated the effectiveness of seasonal-trend decomposition and the local sharpness-aware minimization optimizer for long-term time series forecasting.
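The three ingredients named in the abstract each have a standard form: a moving-average seasonal-trend decomposition (as popularized by Autoformer), sharpness-aware minimization (SAM) for the local client updates, and FedAvg-style weighted aggregation of client models. The sketch below illustrates these building blocks in PyTorch; it is a minimal illustration under assumed names, shapes, and default hyperparameters, not the authors' implementation.

# Illustrative sketch only (not the paper's code): the three ingredients in
# PyTorch. All names, shapes, and hyperparameters here are assumptions.
import copy

import torch
import torch.nn as nn


class SeriesDecomposition(nn.Module):
    """Split a series into a moving-average trend that captures the global
    profile and a seasonal residual, in the style of Autoformer."""

    def __init__(self, kernel_size: int = 25):
        super().__init__()
        self.avg = nn.AvgPool1d(kernel_size, stride=1,
                                padding=kernel_size // 2,
                                count_include_pad=False)

    def forward(self, x: torch.Tensor):
        # x: (batch, length, channels)
        trend = self.avg(x.transpose(1, 2)).transpose(1, 2)
        seasonal = x - trend
        return seasonal, trend


def sam_backward(model: nn.Module, loss_fn, x, y, rho: float = 0.05):
    """Sharpness-aware minimization: evaluate the gradient at the worst-case
    weight perturbation inside an L2 ball of radius rho, which biases training
    toward flat minima. The caller zeroes gradients before this call and runs
    optimizer.step() afterwards."""
    loss_fn(model(x), y).backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    scale = rho / (torch.norm(torch.stack([g.norm(2) for g in grads])) + 1e-12)
    perturbations = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            e = p.grad * scale
            p.add_(e)                 # step to the sharpest nearby point
            perturbations.append((p, e))
    model.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                   # gradient taken at the perturbed weights
    with torch.no_grad():
        for p, e in perturbations:
            p.sub_(e)                 # restore the original weights
    return loss


def fed_avg(client_states, client_sizes):
    """FedAvg-style aggregation: average client state_dicts, weighted by each
    client's number of local samples. Assumes floating-point tensors."""
    total = float(sum(client_sizes))
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(s[key] * (n / total)
                       for s, n in zip(client_states, client_sizes))
    return avg

In one federated round under this sketch, each client would run a few sam_backward plus optimizer.step() updates on its private series, upload model.state_dict(), and receive fed_avg(states, sizes) back as the next global model; the decomposition block would sit inside the Transformer forecaster itself.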

Key words: privacy preservation, federated learning, long-term forecasting, model generalization, Transformer

CLC number: TP183