Lakehouse-Integrated Graph Risk Scoring Architectures for Advanced Fraud Detection

Authors

  • Dilliraja Sundar, Independent Researcher, USA
  • Jayant Bhat, Independent Researcher, USA

DOI:

https://doi.org/10.63282/3117-5481/AIJCST-V7I6P107

Keywords:

Fraud Detection, Lakehouse Architecture, Graph Analytics, Risk Scoring, Financial Crime, Network-Based Fraud, Big Data Analytics, Machine Learning, Transaction Graphs

Abstract

The rapid growth of online transactions, real-time payment systems, and interconnected financial ecosystems has made fraud detection one of the most serious analytical challenges in contemporary digital economies. Traditional fraud detection methods, which depend mostly on rule engines and isolated machine learning classifiers, cannot cope with the scale and velocity of fraud or, more importantly, with the relational complexity of modern fraud patterns. Fraud increasingly appears as organized activity spanning networks of users, devices, accounts, merchants, and transactions, so analyzing single records in isolation is no longer sufficient.

At the same time, lakehouse platforms are reshaping enterprise data architectures by combining the scalability and affordability of data lakes with the governance, reliability, and performance guarantees of data warehouses. Although lakehouses are well suited to handling large volumes of structured and semi-structured data, they are not necessarily optimized for the graph-based reasoning that is critical to detecting collusive and network-based fraud. This gap motivates the integration of graph analytics with lakehouse architectures to enable scalable, interpretable, and real-time fraud intelligence.

This paper proposes the Lakehouse-Integrated Graph Risk Scoring Architecture (LIGRSA) for advanced fraud detection. The proposed architecture unifies transactional data stored in a lakehouse with dynamic graph construction, graph feature extraction, and hybrid risk scoring models that combine graph metrics with machine learning and statistical inference. The architecture supports both batch and streaming fraud detection pipelines, enabling risk to propagate across entity networks in near real time. Mathematical formulations for graph-based risk aggregation and composite fraud scoring are introduced, along with scalability, system-level design, governance, and explainability considerations.

Experimental results on representative financial transaction workloads show that lakehouse-integrated graph risk scoring outperforms other models in fraud detection accuracy, early fraud detection, and false positive reduction. The paper concludes by discussing deployment considerations, limitations, and future research directions, including real-time graph learning, privacy-preserving analytics, and large language model (LLM)-aided fraud investigation.
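
The abstract does not reproduce the paper's formulations, so the following is only a minimal sketch of what graph-based risk propagation and composite scoring could look like. It assumes an undirected entity graph, a damped neighbour-averaging update, and a simple weighted blend of a model score with the propagated graph risk. The function names (`propagate_risk`, `composite_score`), the parameters (`damping`, `weight`), and the sample accounts and devices are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def propagate_risk(edges, seed_risk, damping=0.5, iterations=20):
    """Spread seed risk over an undirected entity graph.

    Each round, a node's risk becomes a blend of its own seed risk and the
    mean risk of its neighbours. This is an illustrative update rule, not
    the paper's formulation.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    nodes = set(neighbours) | set(seed_risk)
    risk = {n: seed_risk.get(n, 0.0) for n in nodes}

    for _ in range(iterations):
        updated = {}
        for n in nodes:
            own = seed_risk.get(n, 0.0)
            nbrs = neighbours.get(n, ())
            if not nbrs:
                # Isolated nodes simply keep their seed risk.
                updated[n] = own
                continue
            nbr_mean = sum(risk[m] for m in nbrs) / len(nbrs)
            updated[n] = (1 - damping) * own + damping * nbr_mean
        risk = updated
    return risk


def composite_score(model_score, graph_risk, weight=0.6):
    """Blend a per-transaction model score with propagated graph risk."""
    return weight * model_score + (1 - weight) * graph_risk


# Hypothetical entities: two accounts share a device, a third does not.
edges = [
    ("acct_A", "device_1"),
    ("acct_B", "device_1"),
    ("acct_C", "device_2"),
]
seed_risk = {"acct_A": 1.0}  # acct_A is a confirmed fraud case

risk = propagate_risk(edges, seed_risk)
for account in ("acct_A", "acct_B", "acct_C"):
    print(account, round(composite_score(0.2, risk[account]), 3))
```

In this toy graph, `acct_B` inherits some risk through the device it shares with `acct_A`, while `acct_C` sits in a separate component and receives none. In a lakehouse setting, the edge list would presumably be derived from transaction tables rather than written inline.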

Published

2025-11-24

Section

Articles

How to Cite

D. Sundar and J. Bhat, “Lakehouse-Integrated Graph Risk Scoring Architectures for Advanced Fraud Detection”, AIJCST, vol. 7, no. 6, pp. 70–80, Nov. 2025, doi: 10.63282/3117-5481/AIJCST-V7I6P107.