Sunteți pe pagina 1din 12

Anomaly and Attribution in Networks with Temporally Detection

Correlated Traffic

INTRODUCTION

Anomalies in data network are patterns that deviate from the ‘normal’ expected
behavior of the network. These patterns might consist of suspicious behaviors such
network scans for vulnerable ports/services, attacks such as TCP SYN flooding,
DDoS amplification attacks, etc., or they could also be the result of spurious traffic
caused by network failures . Anomaly detection has many applications in various
areas of research. These include change detection in sensor networks , location
spoofing detection in IoT networks , fraud detection , etc. In this work, we
concentrate on anomalies in data network, but the model and algorithms we
develop can be applied to other domains as well.

There are numerous works that have attempted to solve the problem of network
anomaly detection. In the past works, a number of features with granularities
varying from packets to flows to sessions have been considered (see for a recent
analysis). Similarly, a number of models have been used to analyze these features
with the aim of detecting anomalies in network traffic (see Section II). In our work
here, we concentrate on temporally correlated features of network traffic which we
model via a Markov Chain, with the aim of studying the feasibility and
performance of such a model in detecting network anomalies. In the process, we
choose as our working example one important feature—state transitions of TCP
(Transmission Control Protocol) flows—for modeling traffic data. TCP state
transitions for normal flows are based on TCP’s Finite State Machine (FSM) which
evolves stochastically according to a first order Markov Chain [6], [7]. We note
that our model is general to encompass any feature that can be accurately modelled
as Markovian FSM (e.g.,. traffic rate).

In the simplest model for anomalies, one defines two mutually exclusive
hypotheses: in the first, all the traffic in the network is normal; and in the second,
there are one or more anomalous flows. This constitutes a binary hypotheses
testing, which can be solved via the classical decision theory framework [8], [9].
However, this approach may be too restrictive in many practical cases; since in
practice, we also need to identify the flows that are anomalous, in order to trigger
actions and also for further investigations. The realistic requirement makes the
problem much more difficult as it involves multiple competing models, and this is
the scenario we tackle in this paper. In particular, we consider a multi-flow system
(e.g., traffic of an enterprise network), where a subset of active and parallel traffic
flows may be anomalous. This problem is of practical interest, because it is
important to not only detect an anomalous event, but also to attribute the event to
the set of flows causing it, consequently attributing the event to the end-hosts. To
summarize, the main goal of this paper is to answer the following important
questions:

1) Can we reliably detect the existence of anomalies in temporally correlated


network traffic, which only affects a subset of flows? Also, how subtle an
anomaly can we detect?
2) Can we attribute the detected anomalous event to the specific subset of flows
that caused it? In other words, can we detect which subset of flows are
anomalous?

To the best of our knowledge, these important aspects have not been addressed
using a single model before. Instead, the treatment so far has been to perform a
flow-by-flow detection, [7] being a recent example. This approach is clearly
suboptimal, since the joint density of the flows’ observations do not decompose to
independent flows, but is only conditionally independent, where the conditioning
is on the model parameters (i.e., the transition matrix). As such, a principled
statistical analysis should consider the joint density and develop the detection
algorithms based on this quantity. This kind of joint anomaly detection and
attribution problem can be cast as a combinatorial hypotheses testing, which may
be feasible only when the number of flows is small—the number of hypotheses to
be tested is exponential in the number of flows. If there are n flows, the number of
competing models required to attribute the flows to the anomaly would be 2n; that
is, the search space is exponential in the number of flows.

To deal with this complexity, we frame the problem via a compact model in
which the system contains only two hypotheses: one of normal behaviour, which
describes the case where all the flows are normal; and the other of anomalous
behaviour which describes the case where at least one of the flows is anomalous.
We therefore embed all possible subsets of models which contain anomalous
flows into the alternative hypothesis. This simplifies the problem structure
considerably as we do not need to compare all possible sets of models.
Literature Survey

Title: Distributed detection in sensor networks over fading channels with multiple
antennas at the fusion centre

Author: I. Nevat, G. W. Peters, and I. B. Collings,

Description:

Develop new and optimal algorithms for distributed detection in sensor networks
over fading channels with multiple receive antennas at the Fusion Centre (FC).
Sensors observe a hidden physical phenomenon over fading channels and transmit
their observations using the amplify-and-forward scheme over fading channels to
the FC which is equipped with multiple antennas. We derive the optimal decision
rules and the associated probabilities of detection and false alarm for three
scenarios of Channel State Information (CSI) availability. For the most difficult
case of unknown CSI, we develop two new algorithms to derive the optimal
decision rule. The first is based on a Gaussian approximation method where we
quantify the approximation error and its rate of convergence (to a true Normal
distribution) via a multivariate version of the Berry-Esseen bound. The second is
based on a multivariate Saddle-point (Laplace) approximation which is obtained
via a non-convex optimisation problem which is solved efficiently via Bayesian
Expectation-Maximisation method. We show under which system configuration
which algorithm is suitable and should be used. For cases where the distribution of
the optimal decision rule can not be derived exactly, we develop a Laguerre series
expansion to approximate the resulting distribution. The performance of the
proposed algorithms is evaluated via analytic bounds and numerical simulations.
We show that the detection performance of the proposed algorithms is
significantly superior to a local vote decision fusion based algorithms
Title: Geo-spatial location spoofing detection for Internet of Things

Author: J. Y. Koh, I. Nevat, D. Leong, and W.-C. Wong

Description:

We develop a new location spoofing detection algorithm for geo-spatial tagging


and location-based services in the Internet of Things (IoT), called enhanced
location spoofing detection using audibility (ELSA), which can be implemented at
the backend server without modifying existing legacy IoT systems. ELSA is based
on a statistical decision theory framework and uses two-way time-of-arrival (TW-
TOA) information between the user's device and the anchors. In addition to the
TW-TOA information, ELSA exploits the implicit audibility information (or
outage information) to improve detection rates of location spoofing attacks. Given
TW-TOA and audibility information, we derive the decision rule for the
verification of the device's location, based on the generalized likelihood ratio test.
We develop a practical threat model for delay measurements' spoofing scenarios,
and investigate in detail the performance of ELSA in terms of detection and false
alarm rates. Our extensive simulation results on both synthetic and real-world
datasets demonstrate the superior performance of ELSA compared to conventional
non-audibility-aware approaches.
Title: Spatio-temporal network anomaly detection by assessing deviations of
empirical measures

Author: I. C. Paschalidis and G. Smaragdakis

Description:

Introduce an Internet traffic anomaly detection mechanism based on large


deviations results for empirical measures. Using past traffic traces we characterize
network traffic during various time-of-day intervals, assuming that it is anomaly-
free. We present two different approaches to characterize traffic: (i) a model-free
approach based on the method of types and Sanov's theorem, and (ii) a model-
based approach modeling traffic using a Markov modulated process. Using these
characterizations as a reference we continuously monitor traffic and employ large
deviations and decision theory results to ldquocomparerdquo the empirical
measure of the monitored traffic with the corresponding reference
characterization, thus, identifying traffic anomalies in real-time. Our experimental
results show that applying our methodology (even short-lived) anomalies are
identified within a small number of observations. Throughout, we compare the
two approaches presenting their advantages and disadvantages to identify and
classify temporal network anomalies. We also demonstrate how our framework
can be used to monitor traffic from multiple network elements in order to identify
both spatial and temporal anomalies. We validate our techniques by analyzing real
traffic traces with time-stamped anomalies.
Title: Statistical traffic anomaly detection in time-varying communication
networks

Author: J. Wang and I. C. Paschalidis,

Description:

Propose two methods for traffic anomaly detection in communication networks


where properties of normal traffic evolve dynamically. We formulate the anomaly
detection problem as a binary composite hypothesis testing problem and develop a
model-free and a model-based method, leveraging techniques from the theory of
large deviations. Both methods first extract a family of probability laws (PLs) that
represent normal traffic patterns during different time-periods, and then detect
anomalies by assessing deviations of traffic from these laws. We establish the
asymptotic Newman-Pearson optimality of both methods and develop an
optimization-based approach for selecting the family of PLs from past traffic data.
We validate our methods on networks with two representative time-varying traffic
patterns and one common anomaly related to data exfiltration. Simulation results
show that our methods perform better than their vanilla counterparts, which
assume that normal traffic is stationary.
Title: Parametric methods for anomaly detection in aggregate traffic

Author: G. Thatte, U. Mitra, and J. Heidemann

Description:

This paper develops parametric methods to detect network anomalies using only
aggregate traffic statistics, in contrast to other works requiring flow separation,
even when the anomaly is a small fraction of the total traffic. By adopting simple
statistical models for anomalous and background traffic in the time domain, one
can estimate model parameters in real time, thus obviating the need for a long
training phase or manual parameter tuning. The proposed bivariate parametric
detection mechanism (bPDM) uses a sequential probability ratio test, allowing for
control over the false positive rate while examining the tradeoff between detection
time and the strength of an anomaly. Additionally, it uses both traffic-rate and
packet-size statistics, yielding a bivariate model that eliminates most false
positives. The method is analyzed using the bit-rate signal-to-noise ratio (SNR)
metric, which is shown to be an effective metric for anomaly detection. The
performance of the bPDM is evaluated in three ways. First, synthetically
generated traffic provides for a controlled comparison of detection time as a
function of the anomalous level of traffic. Second, the approach is shown to be
able to detect controlled artificial attacks over the University of Southern
California (USC), Los Angeles, campus network in varying real traffic mixes.
Third, the proposed algorithm achieves rapid detection of real denial-of-service
attacks as determined by the replay of previously captured network traces. The
method developed in this paper is able to detect all attacks in these scenarios in a
few seconds or less.
Modules

Register

Login

Upload and send File

View Traffic

View anomaly detection

Module Description

Register

Register Your Account .A check register, also called a cash disbursements journal,


is the journal used to record all of the checks, cash payments, and outlays of cash
during an accounting period.

Login

A login is a set of credentials used to authenticate a user. Most often,


these consist of a username and password. However, a login may include other
information, such as a PIN number, pass-code, or passphrase. Some logins require
a biometric identifier, such as a fingerprint or retina scans.

Upload and Send File

Transferring a document on the server for their information in the cloud to move
the record starting with one hub then onto the next hub. Send a record to your
Destination.
View Traffic

System traffic alludes to the measure of information moving over a system at a


given purpose of time. System information is for the most part embodied in system
parcels, which give the heap in the system. System traffic is the principle segment
for system traffic estimation, arrange traffic control and recreation.

View anomaly detection

Anomaly location is a significant information investigation task that identifies


irregular or strange information from a given dataset. System conduct oddity
identification (NBAD) is the ceaseless checking of an exclusive system for
surprising occasions or patterns. NBAD is a vital piece of system conduct
investigation (NBA), which offers an extra layer of security to that given by
customary enemy of risk applications, for example, firewalls, antivirus
programming and spyware-discovery programming.
Architecture
Problem Statement

We built up another measurable structure for irregularity recognition in transiently


connected rush hour gridlock correspondence systems, by means of Markov Chain
demonstrating of the checked traffic highlights. We defined the ideal irregularity
recognition issue as the Neyman-Pearson Likelihood Ratio Test and created two
ideal discovery calculations. The primary calculation in light of Cross Entropy, not
just identifies the presence of an oddity, yet in addition credits it to the subset of
streams are strange. The subsequent calculation depends on stream conglomeration
which takes into account a conservative, low dimensional portrayal of the crude
traffic streams. We assessed the discovery, execution of our calculations by means
of broad reenactments utilizing engineered, semi-manufactured and genuine
information, and exhibited that great execution (regarding high discovery
likelihood and low false caution likelihood) is acquired notwithstanding for short
state-ways, just as when the segment of strange streams is less.

S-ar putea să vă placă și