A Comparative Study of Edge Intelligence Frameworks for Scalable Computing Applications

Authors

  • R. Saranya, School of Technology, Central University of Tamil Nadu, Thiruvarur, Tamil Nadu, India.

DOI:

https://doi.org/10.63282/3117-5481/AIJCST-V2I6P101

Keywords:

Edge Computing, Edge Intelligence, Scalable Inference, Federated Learning, KubeEdge, EdgeX Foundry, Ray, NVIDIA Triton, ONNX/OpenVINO, Model Compression, Quantization, Orchestration, Observability, Fault Tolerance, Energy Efficiency, Privacy-Preserving ML, MLOps, Heterogeneous Hardware, Tail Latency, Throughput per Watt

Abstract

This paper presents a comparative study of contemporary edge-intelligence frameworks that enable scalable, low-latency ML inference and learning across heterogeneous devices. We evaluate open and vendor ecosystems spanning data plumbing and device management (e.g., EdgeX-style microservices), edge–cloud orchestration (e.g., KubeEdge/Ray-on-K8s patterns), model serving (e.g., Triton-like backends and ONNX/OpenVINO runtimes), and privacy-preserving training (federated/stream learning toolchains). We propose a layered evaluation rubric: device abstraction, runtime portability, observability, autoscaling, the online model lifecycle (convert, quantize, deploy, monitor), security, and energy awareness, each mapped to repeatable benchmarks in video analytics and time-series anomaly detection. Using identical models compressed via quantization/distillation and deployed with containerized operators, we analyze tail latency, throughput per watt, orchestration overhead, failure recovery, and cost on mixed CPU/NPU/GPU edge nodes with cloud offload. Results indicate that lightweight microservice stacks minimize control-plane overhead at small scale, while Kubernetes-native designs deliver superior multi-site elasticity and policy control beyond ~100 nodes. Accelerator-aware runtimes consistently improve p95 latency and energy efficiency but require careful model-format harmonization and telemetry. Federated pipelines reduce raw-data egress and meet privacy goals, though scheduling and straggler mitigation remain bottlenecks. We conclude with a decision matrix aligning workloads to frameworks, and with guidance on MLOps guardrails (profiling, canarying, drift detection, and closed-loop retraining) to sustain SLOs under real-world volatility.
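To make the headline metrics concrete, the sketch below shows how p95 tail latency and throughput per watt might be computed from raw benchmark samples. This is an illustrative Python sketch, not the paper's actual measurement harness; the function names and sample numbers are hypothetical, and it assumes per-request latencies in milliseconds plus an average node power draw in watts measured over the run.

    import math

    def p95_latency_ms(samples_ms):
        # Nearest-rank 95th percentile: sort samples, take the value at rank ceil(0.95 * n).
        ordered = sorted(samples_ms)
        rank = math.ceil(0.95 * len(ordered))
        return ordered[rank - 1]

    def throughput_per_watt(requests, duration_s, avg_power_w):
        # (requests / second) / watts = inferences per joule of energy consumed.
        return (requests / duration_s) / avg_power_w

    # Hypothetical measurements, for illustration only.
    latencies_ms = [12.1, 9.8, 11.4, 35.0, 10.2, 13.7, 9.9, 28.3, 10.8, 11.1]
    print(f"p95 latency: {p95_latency_ms(latencies_ms):.1f} ms")              # tail latency
    print(f"efficiency: {throughput_per_watt(10000, 60.0, 18.5):.2f} inf/J")  # throughput per watt

Under these assumptions, a quantized model that lowers average power draw and shortens the latency tail improves both figures at once, which is why the abstract reports p95 latency and throughput per watt together rather than mean latency alone.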

References

[1] Shi, W., Cao, J., Zhang, Q., Li, Y., & Xu, L. (2016). Edge Computing: Vision and Challenges. IEEE IoT Journal. https://ieeexplore.ieee.org/document/7488250

[2] Satyanarayanan, M. (2017). The Emergence of Edge Computing. IEEE Computer. https://ieeexplore.ieee.org/document/8016573

[3] Mao, Y., You, C., Zhang, J., Huang, K., & Letaief, K. B. (2017). A Survey on Mobile Edge Computing. arXiv. https://arxiv.org/abs/1701.01090

[4] Zhou, Z., Chen, X., Li, E., Zeng, L., Luo, K., & Zhang, J. (2019). Edge Intelligence: Paving the Last Mile of AI. arXiv. https://arxiv.org/abs/1905.10083

[5] Chen, T., et al. (2018). TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. arXiv. https://arxiv.org/abs/1802.04799

[6] Flower: A Friendly Federated Learning Framework. https://flower.dev/

[7] McMahan, H. B., et al. (2017). Communication-Efficient Learning of Deep Networks from Decentralized Data. PMLR. https://proceedings.mlr.press/v54/mcmahan17a.html

[8] Dwork, C., & Roth, A. (2014). The Algorithmic Foundations of Differential Privacy. Foundations & Trends. https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf

[9] Shokri, R., et al. (2017). Membership Inference Attacks Against ML Models. IEEE S&P. https://www.cs.cornell.edu/~shmat/shmat_oak17.pdf

[10] Jacob, B., et al. (2018). Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. arXiv. https://arxiv.org/abs/1712.05877

[11] Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the Knowledge in a Neural Network. arXiv. https://arxiv.org/abs/1503.02531

[12] Gupta, O., & Raskar, R. (2018). Distributed Learning of Deep Neural Network using Split Learning. arXiv. https://arxiv.org/abs/1812.00564

[13] Sze, V., Chen, Y.-H., Yang, T.-J., & Emer, J. (2017). Efficient Processing of Deep Neural Networks. Proceedings of the IEEE. https://arxiv.org/abs/1703.09039

[14] Satyanarayanan, M., Bahl, P., Cáceres, R., & Davies, N. (2009). The Case for VM-Based Cloudlets in Mobile Computing. IEEE Pervasive Computing. (Introduction of the “cloudlet” concept, a precursor architecture to edge computing.)

[15] Wang, N., Varghese, B., Matthaiou, M., & Nikolopoulos, D. S. (2017). ENORM: A Framework for Edge NOde Resource Management. arXiv preprint.

Published

2020-11-04

Issue

Vol. 2 No. 6 (2020)

Section

Articles

How to Cite

[1]
R. Saranya, “A Comparative Study of Edge Intelligence Frameworks for Scalable Computing Applications”, AIJCST, vol. 2, no. 6, pp. 1–13, Nov. 2020, doi: 10.63282/3117-5481/AIJCST-V2I6P101.
