AI-Powered Optimization of Networked Computing Infrastructures for Low-Latency Data Processing

Bastin Thiyagaraj

doi:10.63282/3117-5481/AIJCST-V2I1P101

Authors

Dr. Bastin Thiyagaraj, Department of IT, St. Joseph’s College (Autonomous), Tiruchirappalli, Tamil Nadu, India.

Keywords:

Edge Computing, Low-Latency Orchestration, Reinforcement Learning, Operator Placement, SLO-Aware Autoscaling, Intent-Based Networking, Kernel-Level Telemetry, Model Compression, Stream Processing, Digital Twin

Abstract

Modern applications, from real-time analytics to cyber-physical control, demand millisecond-level responsiveness across heterogeneous, networked infrastructures. This paper presents an AI-driven optimization stack that jointly orchestrates compute, network, and data paths to minimize end-to-end latency under dynamic workloads. We model the system as a constrained Markov decision process and employ reinforcement learning for SLO-aware autoscaling, operator placement, and flow steering across edge and cloud tiers. A complementary learning-to-rank scheduler prioritizes critical microservices using online features (queue depths, p95 latency, tail-loss risk) derived from kernel-level telemetry. To shrink inference delays, we combine model-aware compilation and compression (quantization and distillation) with zero-copy data planes and adaptive batching. Network latency is reduced through intent-based routing that couples RL policies with programmable switches for congestion- and jitter-aware path selection. A control-theoretic safety layer enforces stability and cost caps, while a digital-twin emulator enables fast policy evaluation before deployment. A prototype implementation on containerized clusters demonstrates consistent reductions in tail latency and recovery time under bursty and faulted conditions, with the improvements attributable to closed-loop decisions rather than static heuristics. The framework is modular, explainable through attribution on scheduling actions, and portable to diverse hardware. We conclude with guidelines for production roll-out and discuss open challenges in cross-layer observability and multi-tenant fairness.
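
The abstract names a constrained Markov decision process but does not state its form. As a rough illustration only, the standard textbook formulation is sketched below. The symbols are assumptions, not the paper's notation: r is a reward such as negative end-to-end latency, each c_i is a constraint signal such as resource cost or SLO violation, d_i is the corresponding budget, and gamma is a discount factor. The second line shows the usual Lagrangian relaxation, one common way to train such a policy with an unconstrained RL method; the paper may use a different approach.

```latex
% Constrained MDP (generic form; symbols are illustrative assumptions)
\max_{\pi} \; J_r(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
\quad \text{s.t.} \quad
J_{c_i}(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\right] \le d_i,
\qquad i = 1, \dots, m

% Lagrangian relaxation with multipliers \lambda_i \ge 0
\min_{\lambda \ge 0} \; \max_{\pi} \;
L(\pi, \lambda) = J_r(\pi) - \sum_{i=1}^{m} \lambda_i \left( J_{c_i}(\pi) - d_i \right)
```

Under this reading, a multiplier grows while its constraint is violated and shrinks toward zero once the policy stays within budget, which is one way a cost cap or SLO bound can shape autoscaling and placement decisions.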

