Documente Academic
Documente Profesional
Documente Cultură
SFAM
Preecha Somwang1, 2 and Woraphon Lilakiatsakun2
1
Office of Academic Resources and Information Technology, Rajamangala University of Technology
Isan, Thailand
2
Faculty of Information Science and Technology, Mahanakorn University of Technology, Thailand
IA
JI
Abstract: Intrusion Detection System (IDS) has been an important tool for network security. However, existing IDSs that have
been proposed do not perform well for anomaly traffics especially Remote to Local (R2L) attack which is one of the most
concerns. We thus propose a new efficient technique to improve IDS performance focusing mainly on R2L attacks. The
Principal Component Analysis (PCA) and Simplified Fuzzy Adaptive Resonance Theory Map (SFAM) are used to work
collaboratively to perform feature selection. The results of our experiment based on KDD Cup99 dataset show that this
hybrid method improves classification performance of R2L attack significantly comparing to other techniques while
classification of the other types of attacks are still well performing.
rs
Fi
tO
1. Introduction
GOOD
Misuse Detection
Anomaly Detection
io
at
lic
BAD
ub
eP
in
nl
JI
IA
Variable name
duration
protocol_type
service
flag
src_bytes
dst_bytes
land
wrong_fragment
urgent
hot
num_failed_logins
logged_in
num_compromised
root_shell
su_attempted
num_root
num_file_creations
num_shells
num_access_files
num_outbound_cmds
is_host_login
Type
continuous
discrete
discrete
discrete
continuous
continuous
discrete
continuous
continuous
continuous
continuous
discrete
continuous
continuous
Continuous
Continuous
Continuous
Continuous
Continuous
continuous
discrete
No
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
Variable name
is_guest_login
count
srv_count
serror_rate
srv_serror_rate
rerror_rate
srv_rerror_rate
same_srv_rate
diff_srv_rate
srv_diff_host_rate
dst_host_count
dst_host_srv_count
dst_host_same_srv_rate
dst_host_diff_srv_rate
dst_host_same_src_port_rate
dst_host_srv_diff_host_rate
dst_host_serror_rate
dst_host_srv_serror_rate
dst_host_rerror_rate
dst_host_srv_rerror_rate
Normal or Attack
Type
discrete
Continuous
continuous
continuous
Continuous
Continuous
Continuous
Continuous
Continuous
Continuous
Continuous
Continuous
Continuous
Continuous
Continuous
Continuous
Continuous
Continuous
Continuous
continuous
discrete
No
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
io
at
lic
ub
eP
in
nl
tO
rs
Fi
mc = 33 continuous-value attributes.
JI
IA
tO
rs
Fi
Feature Name
duration
protocol_type
service
flag
src_bytes
dst_bytes
land
wrong_fragment
urgent
No
1
2
3
4
5
6
7
8
9
Feature Name
dst_host_count
Description
Count of connections having the same
32
destination.
dst_host_srv_count
Count of connections having the same
33
destination host and using the same service.
dst_host_same_srv_rate
% of connections having the same destination
34
host
% of different services on the current host.
35 dst_host_diff_srv_rate
dst_host_same_src_port_rate % of connections to the current host having the
36
same src port.
dst_host_srv_diff_host_rate % of connections to the same service coming
37
from different host.
dst_host_serror_rate
% of connections to the current host that have an
38
S0 error.
dst_host_srv_serror_rate
% of connections to the current host and
39
specified service that have an S0 error
dst_host_rerror_rate
% of connections to the current host that have an
40
RST error.
dst_host_srv_rerror_rate
% of connections to the current host and
41
specified service that an RST error.
Feature Name
hot
num_failed_logins
logged_in
num_compromised
root_shell
su_attempted
num_root
num_file_creations
num_shells
num_access_files
num_outbound_cmds
is_hot_login
is_guest_login
Description
Number of ``hot'' indicators
Number of failed login attempts
1 if successfully logged in; 0 otherwise
Number of ``compromised'' conditions
1 if root shell is obtained; 0 otherwise
1 if ``su root'' command attempted; 0 otherwise
Number of ``root'' accesses
Number of file creation operations
Number of shell prompts
Number of operations on access control files
Number of outbound commands in an ftp session
1 if the login belongs to the ``hot'' list; 0 otherwise
1 if the login is a ``guest''login; 0 otherwise
26
27
28
29
30
31
Description
Number of connections to the same host as the current
connection in the past two seconds
srv_count
Number of connections to the same service as the
current connection in the past two seconds
serror_rate
% of connections that have ``SYN'' errors, S0 error
rate
srv_serror_rate
% of connections that have ``SYN'' errors, S0 error
rate for the same service as the current one
rerror_rate
% of connections that have ``REJ'' errors, RST error
rate
srv_rerror_rate
% of connections that have ``REJ'' errors, RST error
rate for the same service as the current one
same_srv_rate
% of connections to the same service
diff_srv_rate
% of connections to different services
srv_diff_host_rate % of connections to different hosts
25
Feature Name
count
io
at
lic
R2L
Unknown attack
apache2,mailbomb,
processtable, udpstorm
mscan, saint
ps, sqlattack, xterm
ub
eP
No
10
11
12
13
14
15
16
17
18
19
20
21
22
in
nl
Known attack
back, land, neptune, pod, smurf,
teardrop
ipsweep, nmap, portsweep, satan
buffer_overflow,loadmodule,
perl,rootkit
ftp_write, guess_passwd, phf,
imap,multihop, warezmaster,
Warezclient
4. Proposed Method
We developed a new framework based on 3 major
steps, as shown conceptually in Figure 2. The first step
is data pre-processing that handles missing and
Data set
Training
IA
Classification
Normal
JI
DoS
U2R
rs
Fi
telnet
Ntp_u
Remote_job
link
Pop_3
Tftp_u
Urp_i
Tim_i
Login
Imap4
Pop_2
Vmnet
Shell
prinetr
nntp
echo
mtp
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
Field Name
Ps
rootkit
sqlattack
xterm
Value
21
22
23
24
Group
4
4
4
4
2
2
2
2
2
2
2
3
3
3
3
3
3
4
4
4
ftp_write
guess_passwd
httptunnel
imap
multihop
named
phf
sendmail
snmpgetattack
snmguess
warezmaster
worm
xlock
xsnoop
warezclient
spy
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
x - min
(1)
max - min
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
Field
Value
Name Value Field Name
Bgp
35
Gopher
52
Ldap
36
Hostname
53
Uucp
37
Iso_tsap
54
Netstat
38
Klogin
55
Kshell
39 Netbios_dgm 56
Sql_net
40
Netbios_ns
57
Netbio_ssn 41
Pm_dump
58
http_443
42
Rje
59
Whois
43
Ssh
60
Courier
44
Sunrpc
61
Nnsp
45
Supdup
62
Csnet_ns
46
Systat
63
Ctf
47
Uucp_path
64
Daytime
48
Z39_50
65
Discard
49 Netbios_ssn
66
Efs
50
Urh_i
67
Exec
51
Red_i
68
Group
1
2
2
2
io
at
lic
Value
7
8
9
10
11
ub
eP
Value
tcp
udp
icmp
Field Name
Value
normal
1
apache2
2
back
3
4
land
mailbomp
5
neptune
6
pod
7
processtable
8
smurf
9
teardrop
10
udpstorm
11
ipsweep
12
mscan
13
nmap
14
portsweep
15
saint
16
satan
17
buffer_overflow
18
loadmodue
19
perl
20
in
nl
tO
Field Name
Field Name
S3
RSTOSO
RSTO
SH
OTH
R2L
Probe
Value
1
2
3
4
5
6
Feature selection
Knowledge
ftp
Private
Name
Domain
Daomain_u
http
Smtp
ftp_data
Icmp
Other
Eco_i
Auth
Ecr_i
IRC
X11
Finger
Time
DoS
Probe
U2R
JI
IA
tO
rs
Fi
x nxm
(2)
pod
smurt
back
land
neptune
teardrop
ipsweep
portsweep
nmap
satan
buffer_overflow
loadmodule
perl
rootkit
guess_passwd.
multihop
phf.
ftp_write.
imap.
spv.
warezclient.
warezmaster.
Population Size
Normal
DoS
Probe
U2R
R2L
Summary
5,763
3,530
2,164
70
6,689
18,216
(3)
(4)
as Equation 5:
C=
1 n
xi -
n i=1
)( x i - )
AA
(5)
(6)
4.3. Classification
io
at
lic
ub
eP
1n
x
=1 i
ni
%
21.396
0.066
49.918
0.354
0.006
18.691
0.296
1.200
1.149
0.501
1.617
0.010
0.004
0.002
0.007
3.132
0.006
0.001
0.003
0.004
0.001
0.842
0.795
100
Amount
66,395
206
154,901
1,098
19
58,001
918
3,723
3,564
1,554
5,019
30
13
6
21
9,720
18
4
8
12
2
2,613
2,468
310,313
Total
in
nl
x 11 , ..., x 1m
x 21 , ..., x 2m
= x , ..., x n
=
R2L
Sub Type
C1 C2 ... CM
Matchin
Output Layer
O1 O2 ... ON
Top-down
Input
I1
I2 ... In
JI
IA
Set w ji
rs
Fi
Step1:
Initialize
network
parameters w ji , , , .
= 1, j = 1,2,..., M , i = 1,2,..., N
weights
Select
values
and
for
( )
Tj I =
Iwj
(7)
+w j
the
x = xi
(8)
T j = max{T j : j = 1,2,..., M }
(9)
I w j
I w j
(10)
= Iwj
(old )
+ 1- w j
(old )
(12)
5. Experimental
The proposed method and the other techniques were
simulated on Microsoft windows XP operating system
by using MATLAB toolbox. The parameters
considered in the evaluation phase are: the number of
clusters in SFAM neural net, the number of epochs in
the training phase of SFAM neural net and the
vigilance parameter, presented in Table 13.
Table 13. Parameters of the proposed method.
No. of epochs
(time)
100
Vigilance
parameter ()
0.65
TP
(11)
(13)
TP + FN
FalseAlarmRate =
FP
(14)
FP + TN
Predicted Connection
Attack
Normal
True Positive (TP)
False Negative (FN)
False Positive (FP)
True Negative (TN)
( )
( new )
io
at
lic
MF I,w j =
wj
DetectionRate =
i =1
ub
eP
operator defined as
in
nl
tO
If similar go to Step 7.
Else go to next Step 6.
6. Results
We use accuracy of detection (Detection rate) and
error of detection (False Alarm rate) as performance
[4]
JI
IA
Normal
Probe
DoS
U2R
R2L
Average
Three-level [13]
Detection
False Alarm
rate
rate
94.68 %
5.32 %
93.50 %
6.50 %
98.54 %
1.46 %
97.14 %
2.86 %
48.91 %
51.09 %
86.55 %
13.45 %
[6]
# of
Record
5,763
2,164
3,530
70
6,689
18,216
Miss
5,719
2,091
3,473
62
6,539
17,884
44
73
57
8
150
332
Detection
Rate
99.24 %
96.63 %
98.38 %
88.57 %
97.75 %
96.11 %
False Alarm
Rate
0.76 %
3.37 %
1.62 %
11.43 %
2.25 %
3.89 %
tO
7. Conclusions
Hit
rs
Fi
Normal
Probe
DoS
U2R
R2L
Summary
[5]
[8]
in
nl
[7]
[3]
[12]
[13]
[2]
[11]
io
at
lic
[1]
[10]
ub
eP
References
[9]
[14]
[15]
[16]
[17]
[26]
[27]
IA
[18]
JI
[19]
[25]
[24]
[31]
io
at
lic
[23]
[30]
ub
eP
[22]
[29]
in
nl
[21]
tO
rs
Fi
[20]
[28]
JI
IA
io
at
lic
ub
eP
in
nl
tO
rs
Fi