
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/326543226

A highly Customizable Intrusion Dataset Creation Software

Presentation · July 2018


DOI: 10.13140/RG.2.2.18876.74889


3 authors, including:

Xianbin Wang, The University of Western Ontario
Nadun N Rajasinghe, The University of Western Ontario


All content following this page was uploaded by Nadun N Rajasinghe on 22 July 2018.



INSecS-DCS: A Highly Customizable Intrusion Dataset Creation Software

Nadun Rajasinghe, Jagath Samarabandu, Xianbin Wang

Copyright © 2018 Nadun Rajasinghe. Some Rights Reserved.
INSecS-DCS: Intelligent Network Security Systems - Dataset Creation System
• Interested in intrusion detection research

• Looked into
o Intrusion detection systems (IDS) in the literature
o Work done by previous members of the research group

• An IDS is trained and tested using datasets

• If the dataset is flawed, so is the IDS


➢ A standard dataset is created for the conditions that existed at the time, so it becomes outdated fast

▪ The versions of software and services in use change

▪ The nature of attacks and attack tools evolves with time

▪ The dataset contains what its creators thought was good

➢ Users of the dataset have no way to customize it

▪ Once a dataset is created and released, it is fixed

▪ No choice of the dataset's output format

o PCAP in most cases, sometimes CSV

PCAP - Weka

▪ No choice over the attributes

o Ex. NSL-KDD has 41 attributes that cannot be changed


➢ Title of paper: "ID2T: A DIY dataset creation toolkit for Intrusion Detection Systems"

▪ Authors: C. G. Cordero et al., Telecooperation Lab, TU Darmstadt, Germany

▪ They have addressed some of the problems, such as

o Ability to create datasets locally, anytime

o Inserting attacks of your choice

o Attribute selection
➢ Still not addressed:

▪ Choice of the input/output format of the dataset

▪ Making new attributes

o Ex: Kyoto and NSL-KDD provide attributes related to multiple packets present within a fixed time window

[Figure labels: Last 100; Destination IP; Source IP]

▪ These kinds of attributes are not available in datasets that provide only the PCAP file

▪ Even in datasets that provide these kinds of attributes, you cannot make new ones or customize them

o The number 100 in the above example from Kyoto cannot be changed
• Ability to make a network intrusion dataset at will

• Option to pick attributes

• Customized attributes for packets in a specified time window

• Choice of output format (CSV, PCAP)

• Choice of input format

o Use an existing PCAP dataset to make a custom dataset with custom attributes

o Run the software in a network of choice and make a dataset specific to that network
TShark is a network protocol analyzer

• It lets you capture packet data from a live network

• It can read packets from a previously saved capture file

• TShark's native capture file format is pcap, which is also the format used by tcpdump and various other packet analyzers
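The two input modes listed above (live capture and reading a saved capture file) correspond to TShark's `-i` and `-r` options. The following is a minimal sketch of how a script might drive them, not the INSecS-DCS implementation: the function names, the choice of `-T json` output, and the example file and interface names are assumptions made for illustration.

```python
import json
import subprocess


def build_tshark_command(interface=None, pcap_path=None, packet_count=None):
    """Build a tshark command line for live capture (-i) or file input (-r).

    Exactly one of `interface` or `pcap_path` must be given. JSON output
    (-T json) is requested so the result can be parsed into dictionaries.
    """
    if (interface is None) == (pcap_path is None):
        raise ValueError("give exactly one of interface or pcap_path")
    cmd = ["tshark", "-T", "json"]
    if interface is not None:
        cmd += ["-i", interface]   # capture from a live network interface
    else:
        cmd += ["-r", pcap_path]   # read a previously saved capture file
    if packet_count is not None:
        cmd += ["-c", str(packet_count)]  # stop after this many packets
    return cmd


def read_packets(cmd):
    """Run tshark (it must be installed) and return the decoded packet list.

    For live capture, pass a packet_count so that tshark terminates.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

For example, `build_tshark_command(pcap_path="capture.pcap")` yields the command for an existing dataset, while `build_tshark_command(interface="eth0", packet_count=100)` targets a network of choice; both names are placeholders.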

[Figure: system pipeline. Stages: Packet Capturing; Packet Pre-Processing; Collecting individual packet information; Dividing traffic into time windows; Collecting overall packet information for time window. Outputs: Raw Dataset; Processed Dataset.]


Packet Pre-Processing

• Convert the captured packets into a format that the algorithm can use easily: in this case, a dictionary of key-value pairs.
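As an illustration of this pre-processing step, the sketch below collapses a nested packet record into a single dictionary of key-value pairs. It assumes the nested layout that `tshark -T json` produces (layers containing fields named like `ip.src`); the sample record and the function name are hypothetical, and the actual INSecS-DCS code may work differently.

```python
def flatten_packet(packet):
    """Collapse a nested packet record into one flat dict of leaf key-value pairs.

    Nested dicts are walked recursively and each leaf is stored under its own
    key. If the same leaf key appears more than once, the first value is kept.
    """
    flat = {}

    def walk(node):
        for key, value in node.items():
            if isinstance(value, dict):
                walk(value)
            else:
                flat.setdefault(key, value)

    walk(packet)
    return flat


# Hypothetical sample shaped like one element of `tshark -T json` output.
sample = {
    "_source": {
        "layers": {
            "frame": {"frame.time_epoch": "1532217600.000000", "frame.len": "74"},
            "ip": {"ip.src": "10.0.0.5", "ip.dst": "10.0.0.9"},
            "tcp": {"tcp.srcport": "51000", "tcp.dstport": "443"},
        }
    }
}

flat = flatten_packet(sample)
print(flat["ip.src"], flat["tcp.dstport"])
```

A flat dictionary makes the later steps simple lookups by field name, at the cost of discarding the layer nesting.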
Collecting individual packet information

• Information from individual packets is collected by selecting key-value pairs of interest. These include:

o Protocols used: TCP, UDP, IP, FTP, SMTP, SSH, SSL, ARP, DHCP, HTTP

o Source and destination information: IP addresses, port numbers

• The software comes with a list of preselected attributes, but the user can customize this list.
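Selecting key-value pairs of interest can be sketched as a lookup against a user-editable attribute list. This is an illustrative sketch under stated assumptions: the default list, the function names, and the sample packet are hypothetical, and the keys follow Wireshark field naming (`ip.src`, `tcp.srcport`, `frame.protocols`) rather than the tool's actual preselected attributes.

```python
# Hypothetical default attribute list (Wireshark-style field names).
DEFAULT_ATTRIBUTES = ("ip.src", "ip.dst", "tcp.srcport", "tcp.dstport", "frame.len")

# Protocols of interest, taken from the list on the slide.
KNOWN_PROTOCOLS = ("tcp", "udp", "ip", "ftp", "smtp", "ssh", "ssl", "arp", "dhcp", "http")


def select_attributes(flat_packet, attributes=DEFAULT_ATTRIBUTES, missing=None):
    """Keep only the chosen keys; keys absent from the packet get `missing`."""
    return {name: flat_packet.get(name, missing) for name in attributes}


def protocols_used(flat_packet, known=KNOWN_PROTOCOLS):
    """Return the known protocols named in the colon-separated frame.protocols field."""
    stack = flat_packet.get("frame.protocols", "").split(":")
    return [name for name in known if name in stack]


# Hypothetical flattened packet.
packet = {
    "frame.protocols": "eth:ethertype:ip:tcp:ssl",
    "frame.len": "74",
    "ip.src": "10.0.0.5",
    "ip.dst": "10.0.0.9",
    "tcp.srcport": "51000",
    "tcp.dstport": "443",
}

print(select_attributes(packet, attributes=("ip.src", "udp.srcport")))
print(protocols_used(packet))
```

Passing a different `attributes` tuple is the customization point: the caller decides which columns end up in the dataset.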
Dividing traffic into time windows and collecting overall packet information for time window

• Select a time window and analyze the traffic flow during that time.

• As opposed to getting information from just the individual packets, here we get information on overall traffic behavior during the time window and identify common trends in traffic
➢ This allows users to create attributes like the following:

Attribute          Description
connection pairs   The number of different source and destination pairs
num ports          The number of different port numbers used
src bytes          The total amount of source traffic
tcp frame length   The total amount of frame bytes for TCP traffic
udp length         The total amount of UDP data
num ssl            The total number of packets containing SSL traffic

➢ Before INSecS-DCS, if you wanted more attributes, you had to do this manually for the entire dataset.
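The window-level attributes in the table can be computed by grouping packets into fixed-length time windows and summarizing each group. The sketch below shows one way this might look, with a configurable window length. The packet keys (`time`, `src_ip`, `dst_ip`, `src_port`, `dst_port`, `protocol`, `frame_len`) and the single `protocol` tag per packet are simplifying assumptions for illustration, not the tool's actual data model.

```python
from collections import defaultdict


def window_attributes(packets, window_seconds):
    """Group packets into fixed-length time windows and summarize each window.

    Each packet is a dict with the hypothetical keys: time (seconds),
    src_ip, dst_ip, src_port, dst_port, protocol, frame_len.
    Returns {window_index: summary_dict}, with windows counted from the
    earliest packet.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if not packets:
        return {}

    start = min(p["time"] for p in packets)
    windows = defaultdict(list)
    for p in packets:
        windows[int((p["time"] - start) // window_seconds)].append(p)

    summaries = {}
    for index in sorted(windows):
        group = windows[index]
        ports = set()
        for p in group:
            ports.update((p["src_port"], p["dst_port"]))
        summaries[index] = {
            # number of different (source, destination) address pairs
            "connection_pairs": len({(p["src_ip"], p["dst_ip"]) for p in group}),
            # number of different port numbers used
            "num_ports": len(ports),
            # total frame bytes of packets tagged as TCP
            "tcp_frame_length": sum(p["frame_len"] for p in group if p["protocol"] == "tcp"),
            # number of packets tagged as SSL
            "num_ssl": sum(1 for p in group if p["protocol"] == "ssl"),
        }
    return summaries


# Hypothetical packets spanning two one-second windows.
packets = [
    {"time": 0.0, "src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 50000, "dst_port": 443, "protocol": "ssl", "frame_len": 100},
    {"time": 0.4, "src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 50000, "dst_port": 443, "protocol": "tcp", "frame_len": 60},
    {"time": 0.9, "src_ip": "10.0.0.3", "dst_ip": "10.0.0.2", "src_port": 50001, "dst_port": 80, "protocol": "tcp", "frame_len": 40},
    {"time": 1.5, "src_ip": "10.0.0.1", "dst_ip": "10.0.0.4", "src_port": 50002, "dst_port": 53, "protocol": "udp", "frame_len": 80},
]

summary = window_attributes(packets, window_seconds=1.0)
```

Because `window_seconds` is a parameter, the window size is no longer fixed by the dataset creator, and further attributes can be added as extra entries in the summary dictionary.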
Outputs: Raw Dataset (PCAP file); Processed Dataset (CSV or txt)
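Writing the processed dataset as CSV can be sketched with Python's standard `csv` module. This is a minimal illustration rather than the tool's own writer: the column names, the `label` column, and its values are hypothetical.

```python
import csv
import io


def write_csv(rows, columns, out):
    """Write dict rows to `out` as CSV with a header row.

    Columns missing from a row are written as empty fields, and keys not
    listed in `columns` are ignored.
    """
    writer = csv.DictWriter(out, fieldnames=columns, restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical per-window rows; the second row has no label.
rows = [
    {"connection_pairs": 2, "num_ports": 4, "label": "normal"},
    {"connection_pairs": 1, "num_ports": 2},
]

buffer = io.StringIO()
write_csv(rows, ["connection_pairs", "num_ports", "label"], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of memory, open the target with `open(path, "w", newline="")` and pass the file object as `out`.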


INSecS-DCS vs ID2T toolkit

Capability                                                       INSecS-DCS   ID2T
Ability to label dataset                                         Yes          Yes
Open source                                                      Yes          Yes
Raw PCAP dataset                                                 Yes          Yes
Has a GUI                                                        No           Yes
Allows attack injection within the software                      No           Yes
Ability to divide traffic into time windows and get overall      Yes          No
  traffic attributes
Ability to select input method (packets captured on a network    Yes          No
  of choice, or a raw PCAP dataset from another source)
Processed dataset that can be fed into WEKA and other ML         Yes          No
  tools directly
Attribute selection for processed dataset                        Yes          No


• INSecS-DCS is provided for public use under an MIT license

• Hosted on GitHub - https://github.com/nrajasin/Network-intrusion-dataset-creator


Q&A

