Automating Cyber Attack Detection through Integrated Deep Learning and Scalable Data Science Pipelines

Kunwar Narayan Singh

Journal of Scientific Innovation and Advanced Research (JSIAR). Published: July 2025; Volume: 1, Issue: 4; Pages: 264-268

Review Article

Kunwar Narayan Singh¹

¹Department of Computer Science and Engineering, Jaypee Institute of Information Technology, Noida, India

*Author for correspondence: Kunwar Narayan Singh
Department of Computer Science and Engineering, Jaypee Institute of Information Technology, Noida, India
E-mail ID: knsinghverma@gmail.com

ABSTRACT

Cyber attacks have become increasingly sophisticated and pose severe threats to critical digital infrastructure across sectors. Traditional intrusion detection systems often struggle with evolving attack patterns, high false alarm rates, and limited scalability. To address these challenges, this paper proposes an integrated framework that combines deep learning with scalable, automated data science pipelines for cyber attack detection. The approach deploys a real-time data ingestion and preprocessing pipeline and then trains a deep neural network, specifically an LSTM-based model, to identify anomalous behavior in network traffic. The system is designed to scale to high-velocity data streams while maintaining high detection accuracy. Experiments on the CICIDS2017 dataset show a detection accuracy of 96.2% and a notable reduction in false positives relative to baseline models. Integrating deep learning with data engineering components thus strengthens threat detection and offers a practical, scalable solution for modern cybersecurity environments.

Keywords: Cyber Attack Detection, Deep Learning, Data Science Pipelines, Intrusion Detection System (IDS), LSTM Networks, Real-Time Analytics
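
As an illustration of the ingestion and preprocessing stage summarized in the abstract, the following is a minimal sketch of how a stream of numeric flow records might be scaled and grouped into fixed-length sequences of the kind an LSTM-based model consumes. It is not the paper's implementation: the function name `sliding_windows`, the `timesteps` parameter, and the use of running min-max scaling are assumptions made for this example, and the LSTM itself is omitted.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Optional, Sequence


def sliding_windows(
    records: Iterable[Sequence[float]], timesteps: int
) -> Iterator[List[List[float]]]:
    """Yield overlapping windows of `timesteps` scaled feature vectors.

    Hypothetical helper: each incoming record is min-max scaled using the
    minimum and maximum seen so far in the stream (an assumed choice; rows
    already in the buffer are not rescaled when the statistics change).
    """
    buffer: Deque[List[float]] = deque(maxlen=timesteps)
    lo: Optional[List[float]] = None
    hi: Optional[List[float]] = None

    for record in records:
        values = [float(v) for v in record]

        # Update the running per-feature minimum and maximum.
        if lo is None or hi is None:
            lo, hi = values[:], values[:]
        else:
            lo = [min(a, b) for a, b in zip(lo, values)]
            hi = [max(a, b) for a, b in zip(hi, values)]

        # Scale into [0, 1]; a feature with no spread yet maps to 0.0.
        scaled = [
            (v - l) / (h - l) if h > l else 0.0
            for v, l, h in zip(values, lo, hi)
        ]
        buffer.append(scaled)

        # Emit a copy of the window once enough records have arrived.
        if len(buffer) == timesteps:
            yield [row[:] for row in buffer]
```

Each yielded window has shape (`timesteps`, number of features), so a batch of such windows forms the three-dimensional input that sequence models typically expect. A production pipeline would more likely fit the scaling statistics on training data and reuse them at inference time.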