A SYSTEMATIC REVIEW OF MULTIMODAL ANALYTICAL TECHNIQUES FOR ENHANCING CYBER RESILIENCE IN SMART CITIES
DOI:
https://doi.org/10.24191/myse.v12i2.7056
Keywords:
Cyber security, Multimodal analysis, Urban resilience, Machine learning, Systematic review
Abstract
In recent years, the advancement of smart city technologies has necessitated a rethinking of urban resilience strategies, especially in the face of escalating cyber threats. Enhancing cyber resilience in smart cities is crucial because they rely on interconnected digital infrastructures that manage essential services such as transportation, energy, healthcare, and public safety. Cyber attacks on these systems can disrupt services, compromise data, and undermine public trust. This systematic review examines 27 selected articles on the application of multimodal analytical techniques to enhance the cyber resilience of smart cities. By evaluating these studies according to their analytical frameworks, the techniques used, and their research contexts, the review identifies the benefits and challenges of integrating multimodal data, such as text, audio, video, and sensor data, into urban cyber resilience strategies. The findings reveal that various multimodal strategies, including machine learning, data fusion, and real-time monitoring, contribute significantly to the robustness of smart urban infrastructures against cyber attacks. The review provides a comprehensive account of how multimodal analytics is applied to safeguard smart city infrastructures, highlighting best practices, technological advancements, and future research directions.
Copyright (c) 2025 Malaysian Journal of Sustainable Environment

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The Myse journal is a scholarly, online, open-access, peer-reviewed journal.
Started in June 2023, the Malaysian Journal of Sustainable Environment is licensed under CC BY-NC-ND 4.0.