Intrusion Detection in the Cloud Computing Using Neural Network and Artificial Bee Colony Optimization Algorithm

Info: 5360 words (21 pages) Dissertation
Published: 9th Dec 2019

Tagged: Information SystemsCyber Security

Share this: Facebook Twitter Reddit LinkedIn WhatsApp

Intrusion detection in the cloud computing using neural network and artificial bee colony optimization algorithm

Abstract

Recently, cloud computing is employed to solve many problems and fulfill user-based demands. As cloud computing grows, security becomes more and more important. An intrusion detection system (IDS) attempts to detect the attacks by examining various data records observed in processes on the network. In this paper, a new IDS is introduced based on a combination of multi-layer perceptron (MLP), artificial bee colony (ABC) and fuzzy clustering algorithm. Fuzzy clustering technique is used to generate different training subsets. MLP is utilized as a classifier to distinguish the normal and abnormal packets in the network traffic and ABC algorithm is employed for training MLP by optimizing the values of linkage weights and bias. The simulation is accomplished in CloudSim and NSL-KDD dataset is used. Experimental results have shown that the proposed method improves similar IDSs in different evaluation criteria such as mean absolute error (MAE), root mean square error (RMSE) and Kappa statistics.

Introduction

Cloud computing is a network-based infrastructure where information technology (IT) and computing resources such as operating systems, storage, networks, hardware, databases, and even entire software applications are delivered to users as on-demand services (Milani and Navimipour 2016). The cloud metaphor is a reference to the ubiquitous availability and accessibility of computing resources via Internet technologies (Navimipour and Milani 2015). It provides computing resources which are dynamically allocated to user programs based on their needs, and users just pay for the resources their programs actually consume, similar to the conventional pay-per-use metered service for utility consumptions of water, electricity, and natural gas (Ashouraie and Jafari Navimipour 2015). Software as a service (SaaS) (Sucahyo et al. 2017), Infrastructure as a service (IaaS) (Tao and Gao 2017), Platform as a service (PaaS) (Bassiliades et al. 2017) and Expert as a service (EaaS) (Navimipour et al. 2015) are the most important provided services in the cloud computing (Keshanchi, Souri, and Navimipour 2017).

On the other hand, with the coming of Internet age, network security has become the key foundation to web applications, such as online retail sales, online auctions, etc. An intrusion detection system (IDS) attempts to detect computer attacks by examining various data records observed in the network [9]. In general, IDSs can be divided into two categories: anomaly and misuse (signature) detection based on their detection approaches where anomaly detection tries to determine whether deviation from the established normal usage patterns can be flagged as intrusions and on the other hand, misuse detection uses patterns of well-known attacks or weak spots of the system to identify intrusions [10]. There are several algorithms to improve intrusion detection in the cloud where most of them are based on evolutionary computing and meta-heuristic methods considering that it is a NP-Hard problem.

In this paper, after reviewing some related works we came up with a new idea based on a combination of artificial neural network (ANN), artificial bee colony (ABC) and fuzzy clustering algorithm. An ANN can act on his own in an IDS but combination of ANN, ABC and fuzzy clustering makes a powerful IDS. Fuzzy clustering prepares homogeneous subsets of training data so enhances training speed rate as dataset is divided into uniform subsets and ABC helps ANN to gain ideal values of linkage weights and bias faster. Also adding two more algorithms beside ANN has performance cost.

The main aims and contributions of this paper are listed below:

Introducing a new method based on combination of ANN, ABC and fuzzy clustering for intrusion detection in the cloud computing.
Reducing incorrectly classified instance and mean absolute error (MAE) and root mean squared error in cloud IDS.
Improving Kappa statistics and correctly classified instance of cloud IDS technique.

The remainder of this paper is organized as follows, some related papers are discussed in section 2. Then, a new method is proposed in Section 3 and simulation details and experimental results are analyzed in Section 4. Finally, in Section 5, we conclude this paper and suggest some ideas for future works.

2. Related works

Many researches have argued that ANNs can improve the performance of IDSs when compared with traditional methods. However, for ANN-based IDSs, detection precision, especially for low-frequent attacks, and detection stability are still needed to be enhanced. In (Wang et al. 2010), a new approach, called FC-ANN, based on ANN and fuzzy clustering (FC) is proposed to solve the problem and help IDSs achieve higher detection rate, less false positive rate and stronger stability. The general procedure of FC-ANN is as follows: firstly fuzzy clustering technique is used to generate different training subsets. Subsequently, based on different training subsets, different ANN models are trained to formulate different models. Finally, a meta-learner, fuzzy aggregation module, is employed to aggregate these results. Experimental results on the KDD CUP 1999 dataset have shown that the proposed approach outperforms back propagation neural network (BPNN) and other well-known methods such as decision tree (Jo, Sung, and Ahn 2015), the naïve Bayes (Farid et al. 2014) in terms of detection precision and detection stability.

In Enache and Patriciu (2014), an IDS model based on information gain for feature selection combined with the support vector machine (SVM) classifier is proposed. The parameters for SVM are selected by a swarm intelligence algorithm. The NSL-KDD (Tavallaee et al. 2009) data set is also employed and obtained results have shown that the model can achieve higher detection rate and lower false alarm rate than regular SVM.

In Ganeshkumar and Pandeeswari (2016), an anomaly detection system at hypervisor layer named hypervisor detector has been developed and evaluated to detect the malicious activities in the cloud environment. Deployment of fuzzy systems in IDSs has the ability to detect the presence of uncertain and imprecise nature of anomalies in cloud environment. But, they fail in constructing models based on target data. One of the successful approaches based on target data is integration of fuzzy systems with adaptation and learning proficiencies of neural network called Adaptive Neuro-Fuzzy Inference System (ANFIS) model. The hypervisor detector is designed and developed with ANFIS and is practiced with a hybrid algorithm, which is a combination of back propagation (BP) gradient descent technique with least square method. For experiments, DARPA’s KDD cup dataset is used. The performance analysis and obtained results have shown that the proposed hypervisor detector based on ANFIS is well designed to detect the anomalies in the cloud environment with minimum false negative rate and high detection accuracy.

Also in another paper Pandeeswari and Kumar (2016) an anomaly detection system at the hypervisor layer that uses a hybrid algorithm which is a mixture of Fuzzy C-Means clustering algorithm and Artificial Neural Network (FCM-ANN) to improve the accuracy of the detection system has been proposed. The proposed system is implemented and compared with Naïve Bayes classifier and Classic ANN algorithm. The DARPA’s KDD cup dataset 1999 is used for experiments. Based on extensive theoretical and performance analysis, it is evident that the proposed system is able to detect the anomalies with high detection accuracy and low false alarm rate even for low frequent attacks thereby outperforming Naïve Bayes classifier and classic ANN.

In Karaboga and Ozturk (2011), ABC has been used for data clustering on benchmark problems and the performance of ABC algorithm is compared with Particle Swarm Optimization (PSO) algorithm and other nine classification techniques from the literature. Thirteen of typical test data sets from the UCI machine learning repository are used to demonstrate the results of the techniques. The simulation results have indicated that ABC algorithm can efficiently be used for multivariate data clustering.

Also, in Alomari and Othman (2012), a wrapper–based feature selection approach using bees algorithm as a search strategy for subset generation, and using SVM as the classifier is proposed. The experiments have used four random subsets collected from KDD’99. Each subset contains around 4000 records. The performance of the proposed approach is evaluated based on standard IDS measurements such as: detection rate, false alarm rate, and classification accuracy comparing with feature selection techniques such as Linear Genetic Programming (LGP) (Zahiri and Azamathulla 2014), Multivariate Regression Splines (MARS) (Kisi 2015), and Support Vector Decision Function Ranking (SVDF) (Zainal, Maarof, and Shamsuddin 2006). The result have shown that the feature subset produced by BA-SVM has yielded better quality IDS.

A hybrid approach called Neural Network with Indicator Variable using Rough Set (NNIV-RS) is proposed in (Sadek, Soliman, and Elsayed 2013). It aimed to reduce the amount of computer resources which are required to run the detection process such as memory and processor time. Rough set theory is used to select important features. Indicator variable is also used to represent dataset in more efficient way. ANN is used for network traffic packet classification. Tests and comparison have been done on NSL-KDD dataset. The experimental results have shown that the proposed algorithm gives better and robust representation of data as it is able to select features resulting in 80.4% data reduction, select significant attributes from the selected features and achieve detection accuracy with a low false alarm rate.

3. Proposed method

In this section, fuzzy clustering, ABC algorithm and MLP are being used to build an efficient IDS. Fuzzy clustering technique is used to generate different training subsets. The MLP is utilized as a classifier to distinguish the normal and abnormal packets in the network traffic. The structure of MLP has been created relying on the features of NSL-KDD 99 dataset. In addition, the ABC algorithm is employed for training MLP by optimizing the values of linkage weights and bias.

Proposed method

Artificial neural network (ANN) is a branch of artificial intelligence. It is a computational system inspired by central nervous systems. The most common applications of ANN are machine learning, classification, pattern recognition as well as prediction. Usually, ANN structure consists of at least three layers which are input, hidden and output layers. Each layer includes a number of nodes which is determined based on the problem which is wanted to be solved. Each node connects to all nodes in the next layer through the linkage weights. In addition, there are node in each layer called bias node also are connected to all node on the particular layer by the bias weights. The training process includes update the values of the linkage weights and biases weights between the layers of ANN structure by one of the optimization algorithm. Furthermore, the output for each node in each layer is based on the weighted inputs of the node and the used activation function.

The proposed method includes three steps: training, validation, and testing. In training, fuzzy rules are extracted and optimized by the proposed IDS architecture using the training data to achieve maximum accuracy on the dark data. The validation step is used to assess how precisely a system will act upon in practice. This method observes the error on validation dataset and terminates system training when this error starts to increase. In the testing phase, the test data are passed through the saved trained model to detect intrusions. Dataset is divided into 3 subsets; TR for training; VA for validation and TE for testing.

Clustering is the process of recognizing natural groupings or clusters in multidimensional data based on some similarity measures. The aim of fuzzy cluster is to partition a given set of data into clusters, and it should have the following properties: homogeneity within the clusters, concerning data in same cluster, and heterogeneity between clusters, where data belonging to different clusters should be as different as possible. Through fuzzy clustering module, the training set is clustered into several subsets. Due to the fact that the size and complexity of every training subset is reduced, the efficiency and effectiveness of subsequent ANN module can be improved. Fuzzy c-means (FCM) is used as the cluster. FCM is one of the most popular techniques for data clustering. Since FCM tends to balance the number of data points in each cluster, centers of smaller clusters are forced to drift to larger adjacent clusters (Lin et al. 2014). Steps for K-Means clustering is as follows (Ghosh and Dubey 2013):

1) Set K – To choose a number of desired clusters, K.

2) Initialization – To choose K starting points which are used as initial estimates of the cluster centroids. They are taken as the initial starting values.

3) Classification – To examine each point in the dataset and assign it to the cluster whose centroid is nearest to it.

4) Centroid calculation – When each point in the data set is assigned to a cluster, it is needed to recalculate the new k centroids.

5) Convergence criteria – The steps of (iii) and (iv) require to be repeated until no point changes its cluster assignment or until the centroids no longer move.

After performing clustering phase, TR is clustered into k subset (TR_k). The back propagation (BP) neural network algorithm, as a multi-layer ANN, is one of the most widely applied neural network models which is being used to learn and train IDS model. Its learning rule is to adopt the steepest descent method in which the back propagation is used to regulate the weight value and threshold value of the network to achieve the minimum error sum of square (Li et al. 2012). In BP, the gradient descent method is adopted to regulate the weight value of all layers, and the learning algorithm of weight value is expressed as follows:

for output units, error is calculated where

δ_k = Ok (1-O_k)(t_k – O_k)

(2)

for hidden layers, error is calculated where

δ_h = O_h (1-O_h) Σ_k W_kh δ_k

(3)

each node weight is obtained as

W_ji = W_ji + ΔW_ji

(4)

where

ΔW_ji = η δ_j X_ji

(5)

also network function is calculated as

Y= if input>Ѳ 1 if -Ѳ≤ input≤+Ѳ 0if input<Ѳ -1

(6)

The initial weight of network is generally generated at random in certain interval; the
training starts with an initial point and reaches gradually to a minimum of error along
the slope of error function. To reach optimal node weights, Artificial Bee Colony (ABC) is used to optimizing the values of linkage weights and bias. ABC algorithm is proposed for optimizing numerical problems. The algorithm simulates the intelligent foraging behavior of honey bee swarms. It is a very simple, robust and population based stochastic optimization algorithm. In ABC algorithm, the colony of artificial bees contains three groups of bees: employed bees, onlookers and scouts. A bee waiting on the dance area for making a decision to choose a food source is called onlooker and one going to the food source visited by it before is named employed bee. The other kind of bee is scout bee that carries out random search for discovering new sources. The position of a food source represents a possible solution to the optimization problem and the nectar amount of a food source corresponds to the quality (fitness) of the associated solution. If the output model has acceptable error rate while testing TE dataset, algorithm terminates. Otherwise optimizing IDS model continues with correcting linkage weights and bias of ANN through ABC.

Fig 1. Flowchart of proposed method

Experimental results

In this section the proposed method is simulated and evaluated via CloudSim. Subsection 4.1 explains simulation process and used dataset and subsection 4.2 shows the obtained results.

Simulation environment

The experiment is accomplished in CloudSim version 4 on a computer with core-i7 2700 CPU, 16GB of RAM DDR3. CloudSim is a framework for modeling and simulation of cloud computing infrastructures and services as its primary objective is to provide a generalized, and extensible simulation framework that enables seamless modeling, simulation, and experimentation of emerging cloud computing infrastructures and application services.

Dataset

In the experiments, KDD CUP 1999[1] dataset is used which is a version of the original 1998 DARPA intrusion detection evaluation program, prepared and managed by the MIT Lincoln Laboratory. Training and Testing is performed by means of using NSL-KDD dataset, which is the improved version of KDD’99 dataset. The dataset contains about five million connection records as training data and about two million connection records as test data. The dataset includes a set of 41 features derived from each connection (table 2) and a label which specifies the status of connection records as either normal or specific attack type. These features have all forms of continuous, discrete, and symbolic variables, with significantly varying ranges falling in four categories:

(1) The first category consists of the intrinsic features of a connection, which include the basic features of individual TCP connections. The duration of the connection, the type of the protocol (TCP, UDP, etc.), and network service (http, telnet, etc.) are some of the features.

(2) The content features within a connection suggested by domain knowledge are used to assess the payload of the original TCP packets, such as the number of failed login attempts.

(3) The same host features examine established connections in the past two seconds that have the same destination host as the current connection, and calculate the statistics related to the protocol behavior, service, etc.

(4) The similar same service features inspect the connections in the past two seconds that have the same service as the current connection.

Table 2. Features derived from each connection

1	Num_ob_cmds	22	Dst_Host_Serror_Rate
2	isrv host login	23	Dst_Host_Srv_Serror_Rate
3	Duration	24	Protocol_Type
4	Dst_Bytes	25	Service
5	Logged_In	26	Src_Bytes
6	Su_Attempted	27	Count
7	Num_Root	28	Srv_Count
8	Num_File_Creations	29	Rerror_Rate
9	Num_Shells	30	Srv_Rerror_Rate
10	Num_Access_Files	31	Dst_Host_Same_Src_Port_Rate
11	Srv_Diff_Host_Rate	32	Dst_Host_Rerror_Rate
12	Dst_Host_Coun	33	Dst_Host_Srv_Rerror_Rate
13	Dst_Host_Srv_Diff_Host_Rate	34	Hot
14	Flag	35	Num_Compromised
15	Serror_Rate	36	Land
16	Srv_Error_Rate	37	Wrong_Fragment
17	Same_Srv_Rate	38	Urgent
18	Diff_Srv_Rate	39	Num_Failed_Logins
19	Dst_Host_Srv_Count	40	Root_Shell
20	Dst_Host_Same_Srv_Rate	41	Is_Guest_Login
21	Dst_Host_Diff_Srv_Rate

All attacks fall into four categories (table3):

(1) Denial of Service (DoS): making some computing or memory resources too busy to accept legitimate users access these resources.

(2) Probe (PRB): host and port scans to gather information or find known vulnerabilities.

(3) Remote to Local (R2L): unauthorized access from a remote machine in order to exploit machine’s vulnerabilities.

(4) User to Root (U2R): unauthorized access to local super user (root) privileges using system’s susceptibility.

Table 3. Four main categories of attack types

DOS	Back, land, Neptune pod smurf teardrop
Probe	Ipsweep, nmap portsweep, lht_port_attack
R2L(Remote-to-Local)	Imap, ftp_write, guess_passwd, multihop, phf, spy,warezclient, warezmaster
U2R(User-to-Root)	buffer_overﬂow, Rootkit, Perl, Loadmodule

4.3 Evaluation criteria and results

To evaluate the proposed method, some evaluation criteria is defined and results are gained. The first criterion is MAE. MAE is a quantity used to measure how close forecasts or predictions are to the eventual outcomes. The second one is RMSE (Root mean squared error) and the third one is Kappa. RMSE is a frequently used measure of the differences between values (sample and population values) predicted by a model or an estimator and the values actually observed. Kappa statistic compares observed accuracy and expected accuracy and can be used to evaluate classifiers between themselves. It accounts for agreement with a random classifier and hence is less misleading than accuracy.

MAE=∑i=1n|yi-xi|n

RMSE=∑i=1n|yi-xi|n

Kappa=ObservedAgreement – ExpectedAgreement1 – ExpectedAgreement

Table 4 illustrates a comprehensive result of execution and comparison of our proposed method with some familiar IDSs like FC-ANN, NNID and SRF. It’s obvious that the proposed method overcame all in evaluation criteria. It shows a 2.23% improvement in correctly classified instance (true positive and false positive) and decrease in incorrectly classified instance (true negative and false negative). Our method shows a higher Kappa statistic with 0.0481 improvement in comparison with SRF. In a brief summary, the proposed method showed a 2.2371 to 7.1671% in correctly classified instances (fig 2), a 0.0481to 0.2101in Kappa statistics (fig 3). Also the proposed method had lower mean absolute error (MAE) and root mean squared error (RMSE) as illustrated in fig 4 and 5.

Table 4. Simulation parameters

No. of hidden layers	3
Learning rate	0.1
Max. epoch	2000
No. of bees	40
No. of food sources	20

Fig 2. Correctly Classified Instances

Fig 3. Kappa statistic

Fig 4. Mean absolute error

Fig 5. Root mean squared error

Conclusion and future works

In this paper, we proposed a new IDS based on combination of ANN, ABC and fuzzy clustering. Fuzzy clustering technique is used to generate different training subsets. The MLP is utilized as a classifier to distinguish the normal and abnormal packets in the network traffic and ABC algorithm is employed for training MLP by optimizing the values of linkage weights and bias. The simulation is accomplished in CloudSim and NSL-KDD dataset is used. Experimental results have shown that the proposed method improves similar IDSs in different evaluation criteria such as MAE, RMSE and Kappa statistics. In the future, it seems that future improvements might be achievable if other combination of meta-heuristic methods such as genetic algorithm is employed.

REFRENCES

Alomari, Osama, and Zulaiha Ali Othman. 2012. ‘Bees algorithm for feature selection in network anomaly detection’, Journal of Applied Sciences Research, 8: 1748-56.

Ashouraie, Mehran, and Nima Jafari Navimipour. 2015. ‘Priority-based task scheduling on heterogeneous resources in the Expert Cloud’, Kybernetes, 44: 1455-71.

Bassiliades, Nick, Moisis Symeonidis, Georgios Meditskos, Efstratios Kontopoulos, Panagiotis Gouvas, and Ioannis Vlahavas. 2017. ‘A semantic recommendation algorithm for the PaaSport platform-as-a-service marketplace’, Expert Systems with Applications, 67: 203-27.

Enache, Adriana-Cristina, and Victor Valeriu Patriciu. 2014. “Intrusions detection based on support vector machine optimized with swarm intelligence.” In Applied Computational Intelligence and Informatics (SACI), 2014 IEEE 9th International Symposium on, 153-58. IEEE.

Farid, Dewan Md, Li Zhang, Chowdhury Mofizur Rahman, M Alamgir Hossain, and Rebecca Strachan. 2014. ‘Hybrid decision tree and naïve Bayes classifiers for multi-class classification tasks’, Expert Systems with Applications, 41: 1937-46.

Ganeshkumar, P, and N Pandeeswari. 2016. ‘Adaptive neuro-fuzzy-based anomaly detection system in cloud’, International Journal of Fuzzy Systems, 18: 367-78.

Ghosh, Soumi, and Sanjay Kumar Dubey. 2013. ‘Comparative analysis of k-means and fuzzy c-means algorithms’, International Journal of Advanced Computer Science and Applications, 4: 35-39.

Jo, Seongrae, Haengnam Sung, and Byunghyuk Ahn. 2015. ‘A comparative study on the performance of intrusion detection using Decision Tree and Artificial Neural Network models’, Journal of the Korea Society of Digital Industry and Information Management, 11: 33-45.

Karaboga, Dervis, and Celal Ozturk. 2011. ‘A novel clustering approach: Artificial Bee Colony (ABC) algorithm’, Applied soft computing, 11: 652-57.

Keshanchi, Bahman, Alireza Souri, and Nima Jafari Navimipour. 2017. ‘An improved genetic algorithm for task scheduling in the cloud environments using the priority queues: Formal verification, simulation, and statistical testing’, Journal of Systems and Software, 124: 1-21.

Kisi, Ozgur. 2015. ‘Pan evaporation modeling using least square support vector machine, multivariate adaptive regression splines and M5 model tree’, Journal of Hydrology, 528: 312-20.

Li, Jing, Ji-hang Cheng, Jing-yuan Shi, and Fei Huang. 2012. ‘Brief introduction of back propagation (BP) neural network algorithm and its improvement’, Advances in Computer Science and Information Engineering: 553-58.

Lin, Phen-Lan, Po-Whei Huang, Chun-Hung Kuo, and YH Lai. 2014. ‘A size-insensitive integrity-based fuzzy c-means method for data clustering’, Pattern Recognition, 47: 2042-56.

Milani, Bahareh Alami, and Nima Jafari Navimipour. 2016. ‘A comprehensive review of the data replication techniques in the cloud environments: Major trends and future directions’, Journal of Network and Computer Applications, 64: 229-38.

Navimipour, Nima Jafari, and Farnaz Sharifi Milani. 2015. ‘Task scheduling in the cloud computing based on the cuckoo search algorithm’, International Journal of Modeling and Optimization, 5: 44.

Navimipour, Nima Jafari, Amir Masoud Rahmani, Ahmad Habibizad Navin, and Mehdi Hosseinzadeh. 2015. ‘Expert Cloud: A Cloud-based framework to share the knowledge and skills of human resources’, Computers in Human Behavior, 46: 57-74.

Pandeeswari, N, and Ganesh Kumar. 2016. ‘Anomaly detection system in cloud environment using fuzzy clustering based ANN’, Mobile Networks and Applications, 21: 494-505.

Sadek, Rowayda A, M Sami Soliman, and Hagar S Elsayed. 2013. ‘Effective anomaly intrusion detection system based on neural network with indicator variable and rough set reduction’, International Journal of Computer Science Issues (IJCSI), 10: 227-33.

Sucahyo, Yudho Giri, Yan Yonathan Rotinsulu, Achmad Nizar Hidayanto, Devi Fitrianah, and Kongkiti Phusavat. 2017. ‘Software as a service quality factors evaluation using analytic hierarchy process’, International Journal of Business Information Systems, 24: 51-68.

Tao, Chuanqi, and Jerry Gao. 2017. ‘On building a cloud-based mobile testing infrastructure service system’, Journal of Systems and Software, 124: 39-55.

Tavallaee, Mahbod, Ebrahim Bagheri, Wei Lu, and Ali A Ghorbani. 2009. “A detailed analysis of the KDD CUP 99 data set.” In Computational Intelligence for Security and Defense Applications, 2009. CISDA 2009. IEEE Symposium on, 1-6. IEEE.

Wang, Gang, Jinxing Hao, Jian Ma, and Lihua Huang. 2010. ‘A new approach to intrusion detection using Artificial Neural Networks and fuzzy clustering’, Expert Systems with Applications, 37: 6225-32.

Zahiri, A, and H Md Azamathulla. 2014. ‘Comparison between linear genetic programming and M5 tree models to predict flow discharge in compound channels’, Neural Computing and Applications, 24: 413-20.

Zainal, Anazida, Mohd Aizaini Maarof, and Siti Mariyam Shamsuddin. 2006. “Feature selection using rough set in intrusion detection.” In Tencon 2006. 2006 IEEE Region 10 Conference, 1-4. IEEE.

[1] http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html