A Multi-Agent Reinforcement Framework for Autonomous Cloud Resource Scheduling and Optimization
DOI: https://doi.org/10.63282/3117-5481/AIJCST-V6I3P102
Keywords: Cloud Computing, Multi-Agent Reinforcement Learning, Autonomous Resource Scheduling, Deep Q-Learning, Proximal Policy Optimization, SLA Optimization, Energy-Aware Cloud Orchestration
Abstract
Cloud computing has become the core of digital ecosystems, providing highly available and scalable access to computational services. Nevertheless, dynamic resource demands, service-level agreement (SLA) enforcement, energy constraints, and variation in cost-performance pose significant challenges for cloud resource management. Conventional heuristic-based schedulers in cloud orchestrators such as Kubernetes and OpenStack fail to adapt to real-time workload variability, so resource fragmentation, SLA violations, and inflated operational costs frequently occur. Reinforcement learning (RL), with its adaptive, reward-driven decision-making, has emerged as a promising alternative to heuristic policies. However, centralized RL faces scalability and state-observability limitations in distributed clouds. To overcome these limitations, this study proposes a Multi-Agent Reinforcement Learning (MARL) framework for autonomous resource scheduling and optimization in cloud infrastructures. The proposed framework uses decentralized intelligent agents that jointly manage compute, storage, and network allocations. Each agent adapts its local behavior from environment feedback and uses cooperative communication to improve global optimization. The system uses a hybrid reinforcement learning scheme that combines Deep Q-Learning (DQN) and Proximal Policy Optimization (PPO) to balance exploration, stability, and on-policy improvement. An adaptive reward model targets the key performance indicators (KPIs): SLA adherence, task completion latency, energy efficiency, and balanced resource utilization. Experiments were conducted with CloudSim Plus in a simulated multi-cluster environment, extended with container virtualization features.
Compared with baseline strategies such as Round-Robin, Min-Min, and centralized DQN, the proposed MARL scheduler improved throughput by 18.7%, reduced SLA violations by 27.4%, and reduced energy consumption by 14.3%. Inter-agent communication shortened convergence time and improved decision coordination under workload bursts. This work contributes a forward-looking, intelligent cloud orchestration framework that can adapt itself in large-scale distributed systems. Additionally, we offer algorithmic improvements, performance analysis, system workflow optimization, and theoretical support for cooperative reinforcement learning in clouds. These findings suggest that the approach could be extended to edge and fog computing settings and to Industry 5.0 automation.
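As an illustration of the kind of multi-objective reward the abstract describes, the following minimal Python sketch combines the four named KPIs into one scalar per scheduling step. It is not the authors' reward function: the weighted-sum form, the weight values, and the names `RewardWeights`, `utilization_imbalance`, and `scheduling_reward` are assumptions made for this example, and all inputs are assumed to be normalized to the range 0 to 1.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the abstract does not state the actual values.
    sla: float = 0.4
    latency: float = 0.3
    energy: float = 0.2
    balance: float = 0.1


def utilization_imbalance(utilizations: Sequence[float]) -> float:
    """Population standard deviation of per-node utilization (0 = balanced)."""
    if not utilizations:
        raise ValueError("at least one node utilization is required")
    n = len(utilizations)
    mean = sum(utilizations) / n
    return (sum((u - mean) ** 2 for u in utilizations) / n) ** 0.5


def scheduling_reward(
    sla_met_ratio: float,
    norm_latency: float,
    norm_energy: float,
    utilizations: Sequence[float],
    w: RewardWeights = RewardWeights(),
) -> float:
    """Reward SLA adherence; penalize latency, energy use, and imbalance.

    Assumed weighted-sum form, with every input normalized to [0, 1].
    """
    return (
        w.sla * sla_met_ratio
        - w.latency * norm_latency
        - w.energy * norm_energy
        - w.balance * utilization_imbalance(utilizations)
    )
```

In a multi-agent setting, each agent could evaluate such a function on its local observations, with a shared term added for the global objective. How the paper's adaptive model actually adjusts the weights is not specified in the abstract.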
