Sensitivity Analysis of Training Neural Networks by Orthogonal Weight Functions and its Application in Intrusion Detection

This paper presents a statistical sensitivity analysis for neural networks trained with a new kind of orthogonal weight functions (OWFs). The weights obtained after training are orthogonal functions defined on the sets of input variables (input patterns). We design a classifier for intrusion detection based on this network. By extracting parameters with the sensitivity formula of OWF neural networks derived in this paper, the test data for intrusion detection are optimized. We show that the OWF neural network classifier has the advantages of an optimized architecture and a high detection rate.


Introduction
To overcome the drawbacks of early algorithms such as BP and RBF, a new training algorithm for artificial neural networks using cubic spline weight functions (CSWFs) was proposed in [1].
Based on [1], this paper introduces a new algorithm using orthogonal weight functions (OWFs) to implement the weight functions for training neural networks, so it retains the advantages of CSWFs [1].
It is well known that sensitivity refers to how a system's output is influenced by perturbations of its input, and sensitivity analysis is closely related to the architecture of neural networks and their training algorithms.
If the input patterns are affected by noise, the output of the network will change; sensitivity is usually used to analyze such changes [2], [3]. Piché [4] used a statistical approach to relate the output error to weight perturbations for an ensemble of Madalines with several activation functions, such as linear, sigmoid, and threshold. In the related literature, other sensitivity measures have been defined, such as output sensitivity, trajectory sensitivity, and function sensitivity. Based on the sensitivity analysis of OWF neural networks, this paper studies the theoretical error and the approximation error caused by noise.
In order to analyze the sensitivity of orthogonal weight function neural networks, the coefficients of the weight functions are calculated and the sensitivity formula is derived for the case where the orthogonal functions are the Legendre orthogonal polynomials.
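As a brief numerical aside (our own illustration, not part of the paper), the Legendre polynomials used as the basis here can be evaluated by their standard three-term recurrence, and their orthogonality on [-1, 1] can be checked numerically:

```python
import numpy as np

def legendre(k, x):
    """Evaluate the Legendre polynomial P_k(x) via the three-term
    recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    x = np.asarray(x, dtype=float)
    p_prev, p = np.ones_like(x), x.copy()
    if k == 0:
        return p_prev
    for n in range(1, k):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

# Numerical check of orthogonality on [-1, 1]:
# (P_j, P_k) = 0 for j != k, and 2 / (2k + 1) for j = k.
x = np.linspace(-1.0, 1.0, 100001)
dx = x[1] - x[0]

def inner(j, k):
    f = legendre(j, x) * legendre(k, x)
    return np.sum(0.5 * (f[:-1] + f[1:])) * dx   # trapezoid rule
```

Here `inner(2, 3)` is numerically zero while `inner(3, 3)` is close to 2/7, matching the Legendre normalization.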
Intrusion detection technology refers to computer software or hardware systems that discover unauthorized network access and attacks by analyzing user and system data, and then raise alarms and take other response measures. Advanced techniques used in intrusion detection nowadays include neural networks, data mining, data fusion, computer immunology, and genetic algorithms.
Detection methods based on neural networks determine an intrusion by extracting mode characteristics from the normal or abnormal behaviors of users or systems and creating profiles of these behavioral characteristics; during intrusion detection, the audit data are judged as normal or anomalous against these profiles. Through training, the network can continually learn and adjust the mode characteristics of the subject, so as to build an adaptive characteristics profile.
It is important to select an appropriate neural network. In this paper, the OWF neural network is used for intrusion detection.

Network's architecture and algorithm
The network's architecture is m-n, which is different from that of BP or RBF networks. The network has two layers: an input layer and an output layer.
There are m points in the input layer, denoted by $x_i$, $i = 1, 2, \dots, m$; that is, the input dimension of the network is m. Each input vector $x$ is m-dimensional, and $x_i$ is its i-th element. There are n points in the output layer, denoted by $\mathrm{Add}_j$, $j = 1, 2, \dots, n$; that is, the output dimension of the network is n. The neuron $\mathrm{Add}_j$ is used as an adder. Note that each of the m inputs is connected to each of the n neurons (adders). Each output vector $z$ is n-dimensional, and $z_j$ is its j-th element.
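The forward pass of such an m-n weight-function network can be sketched as follows (a minimal illustration of the adder structure described above; the function names and toy weight functions are our own assumptions, not from the paper):

```python
import numpy as np

def forward(weight_funcs, x):
    """Forward pass of an m-n weight-function network.

    weight_funcs[j][i] is the trained weight function w_ji(.) connecting
    input x_i to adder neuron j; each output is z_j = sum_i w_ji(x_i).
    """
    return np.array([sum(w_ji(x_i) for w_ji, x_i in zip(row, x))
                     for row in weight_funcs])

# Toy usage with hypothetical weight functions (m = 2 inputs, n = 1 output):
w = [[lambda t: 2.0 * t, lambda t: t ** 2]]
z = forward(w, np.array([1.0, 3.0]))   # z_0 = 2*1 + 3**2 = 11
```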

$w_{ji}(x_i)$ is the theoretical weight function of the input pattern $x_i$ corresponding to the j-th neuron; $z_j$ is the theoretical output of the j-th neuron; $w_{sji}(x_i)$ and $z_{sj}$ are the approximations of $w_{ji}(x_i)$ and $z_j$, respectively. The mapping relation between the output layer and the input layer is

$$z_j = \sum_{i=1}^{m} w_{ji}(x_i), \qquad j = 1, 2, \dots, n.$$

The one-variable weight function $w_{ji}(x)$ can be found from the N+2 interpolating patterns, which determine the interpolating function through the knots $(x, z)$; it is called the weight function between the j-th output point (neuron) and the i-th input point (variable). $z_{sj}$ denotes the output value of the network's j-th neuron, and $z_j$ indicates the target pattern at the j-th point of the network.

In order to find $w_{sji}(x_i)$, the definition of best square approximation is first given. Suppose $f \in C[a, b]$, and let $\varphi = \operatorname{span}\{P_0(x), P_1(x), \dots, P_N(x)\}$ be the subspace spanned by the orthogonal polynomials. $S^*$ is the best square approximation of $f$ in the subset $\varphi$ if it minimizes $\int_a^b [f(x) - S(x)]^2\,dx$ over all $S \in \varphi$. If the approximation polynomial is expressed as

$$S^*(x) = \sum_{k=0}^{N} a_k P_k(x), \qquad (6)$$

the coefficients of (6) can be expressed as

$$a_k = \frac{(f, P_k)}{(P_k, P_k)} = \frac{\int_a^b f(x) P_k(x)\,dx}{\int_a^b P_k^2(x)\,dx}, \qquad k = 0, 1, \dots, N. \qquad (7)$$

For multi-dimensional input and multi-dimensional output, the error of the orthogonal weight function neural network can be expressed as

$$E_j = z_j - z_{sj} = \sum_{i=1}^{m} \left[ w_{ji}(x_i) - w_{sji}(x_i) \right], \qquad j = 1, 2, \dots, n, \qquad (8)$$

$$E = \left[ \sum_{j=1}^{n} E_j^2 \right]^{1/2}, \qquad (9)$$

where $w_{sji}$ is the best square approximation of $w_{ji}$. Expressions (8) and (9) are important for some applications of generalization.
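To make the coefficient computation concrete, here is a small numerical sketch (our own illustration; the target function is hypothetical) that computes best-square-approximation coefficients against the Legendre polynomials on [-1, 1]:

```python
import numpy as np
from numpy.polynomial import legendre as L

def best_square_coeffs(f, N, num=200001):
    """Coefficients a_k = (f, P_k) / (P_k, P_k) of the best square
    approximation S*(x) = sum_k a_k P_k(x) on [-1, 1], computed by
    numerical integration; (P_k, P_k) = 2 / (2k + 1) for Legendre."""
    x = np.linspace(-1.0, 1.0, num)
    dx = x[1] - x[0]
    coeffs = []
    for k in range(N + 1):
        Pk = L.legval(x, [0.0] * k + [1.0])   # evaluate P_k at x
        fk = f(x) * Pk
        inner = np.sum(0.5 * (fk[:-1] + fk[1:])) * dx   # trapezoid rule
        coeffs.append(inner / (2.0 / (2 * k + 1)))
    return np.array(coeffs)

# Hypothetical weight function to approximate:
f = lambda x: np.exp(x)
a = best_square_coeffs(f, N=5)
S = lambda x: L.legval(x, a)   # reconstructed approximation S*(x)
```

With only six coefficients the approximation of `exp` on [-1, 1] is already accurate to a few parts in 100,000, which illustrates why a low-order orthogonal expansion suffices for smooth weight functions.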

Sensitivity Analysis
Suppose that $x^*$ is the vector of input patterns and $\Delta x$ is the vector of perturbations; the disturbed input pattern can then be expressed as

$$\tilde{x} = x^* + \Delta x. \qquad (13)$$

The theoretical output error of the j-th neuron caused by the noise is

$$\Delta z_j = \sum_{i=1}^{m} \left[ w_{ji}(x_i^* + \Delta x_i) - w_{ji}(x_i^*) \right]. \qquad (14)$$

Expanding the weight functions in the orthogonal polynomials, formula (14) can be transformed into

$$\Delta z_j = \sum_{i=1}^{m} \sum_{k=0}^{N} a_{jik} \left[ P_k(x_i^* + \Delta x_i) - P_k(x_i^*) \right], \qquad (16)$$

where $P_k(x)$ is the value of the k-th orthogonal polynomial and $a_{jik}$ are the expansion coefficients of the weight function $w_{ji}$. When the perturbations of the input patterns tend to zero, the output perturbation can be expressed as

$$\Delta z_j \approx \sum_{i=1}^{m} w_{ji}'(x_i^*)\, \Delta x_i. \qquad (20)$$

The statistical sensitivity of a weight function neural network is defined as

$$S_j = \lim_{\sigma \to 0} \frac{\sqrt{\operatorname{var}(\Delta z_j)}}{\sigma}, \qquad (21)$$

where $\sigma$ is the standard deviation of the input perturbations. Assuming that the input variables are independent, the theoretical sensitivity follows from (20) and (21) as

$$S_j^{t} = \left[ \sum_{i=1}^{m} \left( w_{ji}'(x_i^*) \right)^2 \right]^{1/2}, \qquad (24)$$

and we can calculate the sensitivity values based on the theoretical error by (24). The sensitivity of the approximation error is obtained analogously from the approximated weight functions:

$$S_j^{s} = \left[ \sum_{i=1}^{m} \left( w_{sji}'(x_i^*) \right)^2 \right]^{1/2}. \qquad (25)$$
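The statistical sensitivity can also be estimated directly by Monte Carlo simulation, which gives a useful cross-check of the analytic formula. The sketch below is our own illustration with hypothetical weight functions, not code from the paper: perturb the inputs with small independent zero-mean Gaussian noise and measure the ratio of the output standard deviation to the input standard deviation.

```python
import numpy as np

def statistical_sensitivity(weight_funcs, x_star, sigma=1e-4,
                            trials=20000, seed=0):
    """Monte Carlo estimate of S = lim_{sigma->0} std(dz) / sigma for one
    adder neuron z = sum_i w_i(x_i), perturbing the inputs independently."""
    rng = np.random.default_rng(seed)
    z0 = sum(w(x) for w, x in zip(weight_funcs, x_star))
    dx = rng.normal(0.0, sigma, size=(trials, len(x_star)))
    dz = np.array([sum(w(x + d) for w, x, d in zip(weight_funcs, x_star, row)) - z0
                   for row in dx])
    return dz.std() / sigma

# Hypothetical neuron with weight functions w_1(t) = t^2 and w_2(t) = 3t:
w = [lambda t: t ** 2, lambda t: 3.0 * t]
x_star = np.array([1.0, 2.0])
S = statistical_sensitivity(w, x_star)
# Analytic value under independence: sqrt((2*1)^2 + 3^2) = sqrt(13)
```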

Example
From the results of the sensitivity analysis given in this paper, we see that when the disturbance of the input patterns increases, the corresponding error of the network's output also increases. By this principle we can remove some undetected patterns to obtain a higher detection rate.
The example given below describes intrusion detection by an orthogonal weight function neural network. The network's architecture used in this example is 24-1, which means that there are 24 input nodes and 1 output node. The example uses 540 data patterns, including Back, Ipsweep, and Satan attacks, normal data streams, and some remaining undetected data.
We select a group of data as learning patterns, calculate the perturbations of the other data patterns (test patterns), and then, according to the sensitivity values, remove those intrusion detection data with large perturbation values. The simulation results are as follows.
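The removal step can be sketched as follows (our own illustration; the pattern names, scores, and the [0.95, 1] band are chosen to mirror the example, not taken from the paper's data): test patterns whose normalized perturbation score falls in the high-sensitivity band are dropped before detection.

```python
import numpy as np

def filter_patterns(patterns, scores, low=0.95, high=1.0):
    """Drop test patterns whose normalized perturbation score lies in
    [low, high], the band where sensitivity changes sharply; keep the rest."""
    scores = np.asarray(scores, dtype=float)
    keep = ~((scores >= low) & (scores <= high))
    return [p for p, k in zip(patterns, keep) if k], keep

# Hypothetical normalized scores for five test patterns:
patterns = ["p1", "p2", "p3", "p4", "p5"]
scores = [0.10, 0.97, 0.50, 0.99, 0.30]
kept, mask = filter_patterns(patterns, scores)   # p2 and p4 are removed
```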

Fig. 2 shows the actual sensitivity curve of the input disturbances for the 540 intrusion patterns. As can be seen, in the range [0.95, 1] the sensitivity value changes sharply, which means that input disturbances in this range exert an enormous influence on the network. Because the input perturbations of the intrusion detection information are normalized, taking [0.95, 1] as the boundary, the original data corresponding to these inputs can be identified and then removed.
Using the sensitivity analysis given in this paper, we can improve the detection efficiency; see Fig. 3.
In this case, the theoretical output of the j-th neuron is $z_j = \sum_{i=1}^{m} w_{ji}(x_i)$.

Fig. 1 Intrusion detection experiment with original data

Summary
In this paper, two kinds of sensitivity are discussed, i.e., the sensitivity of the theoretical error and the sensitivity of the approximation error; the theoretical error has been given above.