Adaptive Resource Allocation and Scheduling in Edge–Cloud Continuum Using Multi-Agent Federated Learning Architectures
DOI: https://doi.org/10.63282/3117-5481/AIJCST-V1I4P102

Keywords: Edge–Cloud Continuum, Multi-Agent Systems, Federated Learning, Adaptive Scheduling, Resource Allocation, Reinforcement Learning, Multi-Objective Optimization, Privacy-Preserving Coordination, Tail Latency, Energy Efficiency, SLO Assurance, Digital Twin

Abstract
The edge–cloud continuum presents volatile demand, heterogeneous resources, and stringent latency/security constraints that strain centralized schedulers. We propose an adaptive resource allocation and scheduling framework that combines multi-agent systems with federated learning (FL) to coordinate decisions across edge sites and cloud regions without sharing raw data. Each site hosts an autonomous agent that forecasts workload (arrival rate, task mix, QoS risk) and locally optimizes CPU/GPU slots, memory, and network bandwidth via constrained reinforcement learning; periodically, model updates are aggregated in a privacy-preserving FL round to capture global patterns (diurnal bursts, cross-region spillovers) while respecting data locality. A hierarchical controller reconciles short-horizon edge actions with cloud-level placement and migration using multi-objective optimization over latency, cost, energy, and reliability. Safety shields enforce SLOs and policy constraints (e.g., admission limits, thermal caps), and a digital-twin simulator probes “what-if” scenarios to calibrate exploration. In evaluations on mixed IoT/stream-analytics traces emulated across heterogeneous edge clusters and two public clouds, our approach achieves consistent tail-latency reductions (18–32% at p95), 10–22% energy savings at comparable throughput, and 25–40% fewer policy violations versus strong baselines (a centralized heuristic and non-federated MARL). Ablations show that FL rounds improve transfer under non-stationary demand and reduce negative interference between agents. The architecture is implementation-agnostic and compatible with Kubernetes-style orchestrators, offering a practical path to scalable, privacy-aware, and SLO-robust edge–cloud scheduling.
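To make the coordination loop concrete, the sketch below illustrates one plausible shape of the mechanism described above: each site agent scalarizes the latency/cost/energy/reliability objectives into a reward, applies a local policy-parameter update, and shares only its model delta, which a coordinator averages FedAvg-style before broadcasting back. This is a minimal illustrative sketch, not the paper's implementation; the class and function names (SiteAgent, fedavg), the reward weights, and the plain gradient step standing in for constrained RL are all assumptions.

```python
# Minimal sketch (assumed, not the authors' code): per-site agents with local
# policy weights, a scalarized multi-objective reward, and a FedAvg-style
# weighted average of model deltas. No raw workload data leaves a site.
import numpy as np

class SiteAgent:
    def __init__(self, n_features, seed):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(size=n_features)  # local policy parameters

    def reward(self, latency_ms, cost, energy_j, violations, w=(0.4, 0.2, 0.2, 0.2)):
        # Scalarized multi-objective reward: lower latency/cost/energy/violations is better.
        return -(w[0] * latency_ms + w[1] * cost + w[2] * energy_j + w[3] * violations)

    def local_update(self, grad, lr=0.01):
        # Stand-in for a constrained-RL policy step (e.g., a projected gradient update).
        delta = -lr * grad
        self.weights += delta
        return delta  # only the model delta is shared with the coordinator

def fedavg(deltas, n_samples):
    # Weighted average of per-site deltas, proportional to local sample counts.
    total = float(sum(n_samples))
    return sum(d * (n / total) for d, n in zip(deltas, n_samples))

if __name__ == "__main__":
    agents = [SiteAgent(n_features=8, seed=s) for s in range(3)]
    deltas, counts = [], []
    for i, agent in enumerate(agents):
        grad = np.random.default_rng(100 + i).normal(size=8)  # stand-in local gradient
        deltas.append(agent.local_update(grad))
        counts.append(1000 * (i + 1))  # hypothetical per-site sample counts
    global_delta = fedavg(deltas, counts)
    for agent in agents:
        agent.weights += global_delta  # broadcast the aggregated update to every site
    print("aggregated delta norm:", np.linalg.norm(global_delta))
```

In a deployment, secure aggregation and differential-privacy noise would typically wrap the delta exchange, and the hierarchical controller would sit above this loop to reconcile edge actions with cloud-level placement.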
