
Volume 4, Issue 3, March 2014 ISSN: 2277 128X

International Journal of Advanced Research in Computer Science and Software Engineering
Research Paper
Available online at: www.ijarcsse.com
Survey on Insider Threat Detection
Aruna Singh*, Mrs. Smita Shukla Patel
IT, SKNCOE, Pune University
India

Abstract: Collaborative Information Systems (CIS) aggregate information in one place, where it can be accessed by
various categories of users for distinct or common purposes. Because the users are diverse, the security of the system
is a primary concern. Threats can come from the outside world (outsider threats) as well as from authorized users of
the system (insider threats), and the latter are considerably harder to detect. To counter outsider threats, almost all
information systems implement access control models that grant privileges according to users' rights, which are in
turn based on their roles. These roles are assumed to be known in advance. This assumption conflicts with the idea of a
collaborative information system, where roles may change according to the shifting needs of the organization. An
insider threat involves an authorized user accessing subjects for reasons other than genuine ones. This paper surveys
the diverse research on anomaly detection, covering approaches such as graph-based detection, statistical methods, and
pattern-based analysis. It emphasizes insider threat detection, since insider threats are more probable in
collaborative information systems.

Keywords: Anomaly Detection, CADS, CIS, k nearest neighbor, threat.

I. INTRODUCTION
Collaborative information systems allow users to cooperate on various tasks. They also give users much broader access
privileges, which makes the system more useful. But the larger scope of access rights also opens the door to
illegitimate and unethical use of information. As a result, much work has been done on access control, which helps to
provide security against outsider threats. Role-based access control and experience-based access management have also
been implemented to gain more flexibility. Insider threat detection has evolved much less than outsider threat
detection. Insider threats are also harder to detect, because the threats are authorized users performing activities
that are not genuine. Insider threat detection mainly relies on anomaly detection methods, because such activities tend
to show up as anomalous behaviour. An anomaly is understood as a pattern in the data that does not behave as expected.
Anomalies are also referred to as outliers, exceptions, peculiarities, surprises, etc. Data mining also offers
techniques for this type of threat detection. They are of two types: supervised learning and unsupervised learning.
Supervised learning makes use of training data that has to be available in advance. It has several drawbacks, e.g. the
need for labelled data and the inability to detect rare events. In contrast, unsupervised learning methods do not
require any such prior information. Their success depends mainly on the proper choice of similarity measures, features,
etc. This approach is very helpful in detecting rare events, called outliers or exceptions, by calculating the
deviation from expected behaviour.
Various systems, such as the specialized network anomaly detection system and the community anomaly detection framework
on which this project is based, use the anomaly detection approach to detect the unnatural activities of authenticated
users. Generally, there are two types of solutions to the insider threat. The first uses access rules within the system
to control users' activities. The second reviews the patterns in which users have behaved. In a large organization,
even when access rights are in place, information can still be gathered and leaked. The two types are explained below.
A. Prevention of the insider threat
The prevention of insider threats mostly relies on access control frameworks to constrain the activities of users.
Almost all access control frameworks check whether a request to the system is covered by the access rights granted to
the user and whether it agrees with a set of predefined rules. The problem with this kind of access control is that it
assumes the system to be static, i.e. that the users and the system itself always behave in the same manner. The
dynamic nature of collaborative information systems is difficult to manage under such an assumption. Defining roles and
dividing tasks is not always feasible or easy in a CIS, which may require a more flexible definition.
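The check performed by such a framework can be sketched in a few lines of Python. This is only an illustration under
stated assumptions: the role names, the permission table and the function is_allowed are hypothetical and do not come
from any particular access control product. The fixed tables show why the approach presumes that roles are known in
advance.

```python
# Hypothetical, statically defined tables: roles and their rights are fixed up front.
ROLE_PERMISSIONS = {
    "nurse": {("patient_record", "read")},
    "physician": {("patient_record", "read"), ("patient_record", "write")},
}
USER_ROLES = {"alice": "nurse", "bob": "physician"}


def is_allowed(user, resource, action):
    """Return True if the user's role grants the (resource, action) pair."""
    role = USER_ROLES.get(user)  # None for users without an assigned role
    return (resource, action) in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("bob", "patient_record", "write"))    # physician may write
print(is_allowed("alice", "patient_record", "write"))  # nurse may only read
```

Any change in who needs which access requires the tables to be edited by hand, which is the rigidity described above.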
B. Detection of the insider threat
The approach discussed previously defines zones in which the user can act and obtain the required information from the
system. It is, however, quite possible to perform illicit activities within the allowed domain too, i.e. users
authorized for a given zone can perform activities that are unethical but technically permitted. Information from the
authorized zones can be misused. There are two types of internal users who can harm the system:
1) Masqueraders
2) Traitors

© 2014, IJARCSSE All Rights Reserved Page | 844


Singh et al., International Journal of Advanced Research in Computer Science and Software Engineering 4(3),
March - 2014, pp. 844-847
Masqueraders are the most familiar type of insider threat. They have very little knowledge of the system and of how it
behaves during unusual accesses of subjects. These users may be exploring to gather extra knowledge, or may be acting
through accounts that have been compromised. As they are not familiar with the behavior of the system, their unusual
accesses may lead to a distinct deviation from the normal patterns. This makes it easier to detect the unnatural
activity. A traitor, in contrast, is a user who has proper knowledge of the system and of its behavior in case of
unusual accesses. Such a user makes sure that there is no unusual deviation from the common patterns. It is therefore
very difficult to detect this type of insider threat, and a much more precise analysis of patterns is required. This
project implements the detection of masqueraders.

II. RELATED WORK


A recent work on detecting insider anomalies uses a framework called the community anomaly detection system (CADS),
which infers user communities from the access logs of the users. It then uses the nearest neighbour algorithm to
measure each user's deviation from the other patterns. Any unusual deviation points to an insider threat [1].
Specialized network anomaly detection is an earlier work. It builds patterns, or communities, from the accesses of
various users to one subject. It then checks whether the community pattern with and without a given user is
sufficiently different. A remarkable difference from the earlier community indicates that the corresponding user is
creating an anomaly. This approach creates a model based on the subject and the users who access that particular
subject. The model works under the hypothesis that if a user accesses a subject for an anomalous rather than a genuine
reason, the access will not match the existing pattern and the user's similarity with the network will be much lower
than that of the genuine users. The approach is termed specialized network anomaly detection because it considers only
a local view of the system, i.e. one subject and the users around it. It does not give a complete view of the network
as a whole, including all the users and all the subjects. The method gets its required information from the access logs
of the users. The system has two basic components:
1) Similarity Management
2) Anomaly Evaluation
The first step, similarity management, derives the access network of the users for a particular subject. It does so for
every subject individually. It tries to extract a general representation of the social network rather than focusing on
individual features.
The second step evaluates the anomaly by comparing the new access network with the already existing sub-network. The
assumption is that a genuine access will not produce a very distinct network, i.e. the newly constructed access network
will follow a similar pattern and the behaviour of the access network will not change much. If the new network is very
distinct, the access is an anomaly [2].
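The with-and-without comparison can be illustrated by a minimal Python sketch. It is not the implementation from [2]:
the use of Jaccard similarity over sets of accessed subjects, the function names and the toy access log are all
assumptions made for illustration only.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def network_similarity(users, accesses):
    """Mean pairwise similarity among the given users (0.0 if fewer than two)."""
    pairs = list(combinations(sorted(users), 2))
    if not pairs:
        return 0.0
    return sum(jaccard(accesses[u], accesses[v]) for u, v in pairs) / len(pairs)


def suspicion_scores(subject, accesses):
    """For each user of `subject`: gain in network similarity when that user is removed."""
    users = {u for u, subs in accesses.items() if subject in subs}
    base = network_similarity(users, accesses)
    return {u: network_similarity(users - {u}, accesses) - base for u in users}


# Toy access log (assumed data): user -> set of subjects accessed.
accesses = {
    "alice": {"r1", "r2", "r3"},
    "bob": {"r1", "r2", "r3"},
    "carol": {"r1", "r2", "r3"},
    "mallory": {"r1", "r9"},
}
scores = suspicion_scores("r1", accesses)
most_suspicious = max(scores, key=scores.get)
print(most_suspicious, round(scores[most_suspicious], 3))
```

In this toy log, removing mallory makes the access network around r1 noticeably more cohesive, so mallory receives the
highest score; removing any of the other users lowers the cohesion instead.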
Another work makes use of role-based access control and experience-based access management. It develops a concept
called “role prediction”, which refers to the ability to learn whether a particular user fits a particular role from
the user's access pattern. This concept is very useful for role engineers, who need to reason about the various roles
within an organization. Rather than defining roles unaided, it is comparatively easier to take usage data and analyze
it to extract roles. Role prediction also acts as a valuable tool for making assignments to the extracted roles. It
also helps in situations where two roles are being confused with each other; in such cases the roles can be merged. It
relates to experience-based access management because roles are extracted from experience data. It is very helpful in
dynamic environments, where it is not easy to define all the roles once and for all. The work makes the following major
contributions:
A. Role classification
It examines to what extent the existing expert-defined role titles help in distinguishing the various roles. It also
helps in assessing the accuracy of role specifications.
B. Intelligent role abstraction
The hypothesis here is that certain abstractions could lead to more meaningful role definitions. An algorithm called
the “roll up” algorithm was developed in this context, which removes duplications.
C. Empirical evaluation
This was done to analyze how role prediction performs and how the roles can be optimized.
These contributions show that role prediction can make effective use of access logs and other information. It also
helps in generalizing the roles, making the system more effective without losing specificity [3].
Another related implementation makes use of access transactions. It reviews existing access control policies by
auditing access logs. It helps in assessing the existing policies as well as uncovering unknown behavior, if present.
This implementation also has two steps:
1) Network Construction
2) Association Rule Discovery
In the first step, access transactions are converted to graphs, giving a general view of user interaction with the
system. Inferring the underlying network makes it easier to proceed to the next step. In the second step, the
relationships are converted to association rules that describe them pairwise. A rule captures the relative frequency
with which a user accesses records, and gives the likelihood of a user accessing a record given that another user
accessed the same record. In this way the graph helps in extracting the relations between users and subjects as well as
between similar patterns. This proves very useful for generating new policies and for checking abrupt accesses [4].
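The pairwise rules can be illustrated with a short Python sketch. It is a simplified reading of the idea, not the
method of [4]: the function pairwise_rules, the confidence formula (shared records divided by the records accessed by
the first user) and the sample log are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_rules(transactions):
    """Map (a, b) to an estimate of P(b accessed a record | a accessed it).

    transactions: iterable of (user, record) access events.
    """
    # Step 1: network construction, here a record -> users mapping.
    users_by_record = defaultdict(set)
    for user, record in transactions:
        users_by_record[record].add(user)

    # Step 2: count records per user and records shared by each user pair.
    records_per_user = defaultdict(int)
    shared = defaultdict(int)
    for users in users_by_record.values():
        for u in users:
            records_per_user[u] += 1
        for a, b in combinations(sorted(users), 2):
            shared[(a, b)] += 1
            shared[(b, a)] += 1
    return {(a, b): n / records_per_user[a] for (a, b), n in shared.items()}


# Assumed sample log of (user, record) accesses.
log = [("nurse", "p1"), ("doctor", "p1"),
       ("nurse", "p2"), ("doctor", "p2"),
       ("doctor", "p3"), ("clerk", "p3")]
rules = pairwise_rules(log)
for pair, confidence in sorted(rules.items()):
    print(pair, round(confidence, 2))
```

Pairs that never share a record produce no rule at all, so an access that would create a previously unseen pair stands
out as a candidate for review.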
All of the above works use particular techniques and representations to make anomaly detection easier. There has also
been earlier work on graph-based anomaly detection. Anomalies become much easier to find if the relationships are
depicted as a graph. Each vertex or edge in the graph corresponds to some relationship between the users and the
subjects accessed. In one of these implementations, a graph substructure is considered anomalous if it is not
isomorphic to the graph's normative substructure but comes within X% of it. Here X is the percentage of vertices and
edges that would need to be added or removed to make the substructure isomorphic to the normative one. There are three
basic categories of graph anomalies:
1) Insertions: These constitute the presence of an unexpected edge or vertex which has to be removed.
2) Deletions: These constitute a missing edge or vertex which has to be added to fix the graph.
3) Modifications: These can constitute both of the above as well as changing the vertices of an edge [5].
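A simplified Python sketch of these categories follows. It assumes that vertices carry labels, so the correspondence
between the candidate and the normative substructure is already known and no isomorphism search is needed; this is a
strong simplification of [5]. Dividing by the size of the normative substructure to obtain X is likewise an assumption,
and graph_diff and edge are hypothetical helper names.

```python
def edge(u, v):
    """Undirected edge represented as a frozenset of its two vertex labels."""
    return frozenset((u, v))


def graph_diff(normative, candidate):
    """Compare two labelled graphs given as (vertex_set, edge_set) tuples.

    Returns (insertions, deletions, percent), where percent is the number of
    changed vertices and edges relative to the size of the normative graph.
    """
    n_vertices, n_edges = normative
    c_vertices, c_edges = candidate
    # Unexpected elements present only in the candidate.
    insertions = (c_vertices - n_vertices) | (c_edges - n_edges)
    # Expected elements missing from the candidate.
    deletions = (n_vertices - c_vertices) | (n_edges - c_edges)
    size = len(n_vertices) + len(n_edges)
    percent = 100.0 * (len(insertions) + len(deletions)) / size
    return insertions, deletions, percent


normative = ({"user", "record", "approval"},
             {edge("user", "record"), edge("user", "approval")})
candidate = ({"user", "record", "approval", "email"},
             {edge("user", "record"), edge("user", "approval"), edge("record", "email")})

insertions, deletions, percent = graph_diff(normative, candidate)
print(len(insertions), len(deletions), percent)
```

In this simplified view a modification appears as a deletion paired with an insertion, for example when one endpoint of
an edge is changed.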
Another concept is incremental outlier detection, in which the technique identifies an anomaly as soon as it occurs.
This approach is also used for activity monitoring. An incremental outlier detection algorithm scores each data point
based on the nearest neighbors surrounding it. The work further showed that inserting or deleting a data point affects
only the few neighbors that are very close to it. Consequently, the number of updates required per insertion or
deletion is limited and does not grow with the size of the data set [6].
The graph-based approaches generally use bipartite graphs. The vertices of a bipartite graph can be partitioned into
two disjoint sets such that every edge connects a vertex in one set to a vertex in the other, for example traders vs.
stocks. Such graphs make it simple to find association rules, and they support operations for identifying similar
neighbors as well as detecting anomalous insiders [7].
Statistical methods observe the activities of subjects and generate profiles to represent their behavior. A profile
consists of various measures, such as the distribution of record accesses and the CPU time used during a particular
transaction. Such systems keep two profiles: the previously stored profile and the current profile. The intrusion
detection system regularly updates the current profile and, after a set interval, compares it with the stored profile.
An anomaly score is calculated as the difference between the current and the stored profile. A threshold for detecting
an anomaly is fixed in advance. If the calculated anomaly score reaches or exceeds that threshold, an anomaly is
reported [8].
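The profile comparison can be sketched as follows. The measure names, the sample values, the threshold and the choice
of a plain sum of absolute differences as the score are assumptions for illustration; a practical system would normally
weight or normalise the measures before combining them.

```python
def anomaly_score(stored, current):
    """Sum of absolute differences over every measure in either profile."""
    measures = stored.keys() | current.keys()
    return sum(abs(current.get(m, 0.0) - stored.get(m, 0.0)) for m in measures)


def is_anomalous(stored, current, threshold):
    """Flag the current profile once the score reaches or exceeds the threshold."""
    return anomaly_score(stored, current) >= threshold


# Hypothetical profiles for one user.
stored = {"records_per_hour": 20.0, "cpu_seconds": 5.0}
current = {"records_per_hour": 95.0, "cpu_seconds": 6.0}

score = anomaly_score(stored, current)
print(score, is_anomalous(stored, current, threshold=50.0))
```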
Insider detection has also been based on scenarios of anomalous behaviour, temporal sequences, unfamiliar graph
evolution, etc., using various algorithms and representations for dynamic graph processing. This was done with a view
to scaling to business-level requirements and deployment on real-time data streams [9].

III. CADS FRAMEWORK


The community anomaly detection system (CADS) framework doesn't depend on any prior knowledge of users' roles or any
other extra information for detecting anomalies. It has two basic steps, which cover not only anomaly detection but
also the data processing required before it:
1) Graph based Community Extraction
2) K Nearest Neighbor based anomaly detection
A. Community extraction
As the users access various subjects in a collaborative information system, a bipartite graph is created depicting
which user accessed which subject. Communities are extracted from this graph: users who access similar subjects fall
into the same community. The extracted communities are then used in anomaly detection.
B. Anomaly detection
Anomaly detection compares the behaviour of each user with the communities inferred earlier. It is based on the
k-nearest-neighbour algorithm, which compares each user with its nearest neighbours. A user who does not fall into any
community is considered anomalous [10].
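The two steps can be illustrated with a minimal Python sketch. It should not be read as the CADS implementation of
[10]: the bipartite graph is reduced to a mapping from users to the subjects they accessed, Jaccard similarity stands
in for the community extraction step, and the function knn_deviation, the value of k and the flagging threshold are
assumptions chosen for the example.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_deviation(accesses, k):
    """Map each user to 1 minus the mean similarity to its k most similar users."""
    scores = {}
    for user, subjects in accesses.items():
        sims = sorted(
            (jaccard(subjects, other) for name, other in accesses.items() if name != user),
            reverse=True,
        )
        nearest = sims[:k]
        scores[user] = (1.0 - sum(nearest) / len(nearest)) if nearest else 1.0
    return scores


# User side of the bipartite access graph (assumed data): user -> subjects.
accesses = {
    "u1": {"s1", "s2", "s3"},
    "u2": {"s1", "s2", "s3"},
    "u3": {"s1", "s2"},
    "u4": {"s7", "s8"},
}
scores = knn_deviation(accesses, k=2)
flagged = [u for u, s in scores.items() if s > 0.8]  # threshold is an assumption
print(flagged)
```

Users u1 to u3 share most of their subjects and therefore have close neighbours, whereas u4 shares nothing with anyone
and is the only user flagged.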

IV. CONCLUSIONS
This paper has discussed various approaches to insider threat detection. Insider threat detection has gained much
importance lately due to the rapidly increasing use of common data by various users. The survey shows that making data
available exclusively to distinct users is no longer possible, so various users hold access to common data. This makes
the detection of an anomalous user very difficult. The work done in this field covers various ways to handle this
situation using the techniques mentioned above. There is still much scope to improve the efficiency of existing
frameworks. This may include modifying the algorithms used, combining two or more approaches, or devising an altogether
new method.

REFERENCES
[1] Y. Chen and B. Malin, “Detection of Anomalous Insiders in Collaborative Environments via Relational Analysis of
Access Logs,” Proc. First ACM Conf. Data and Application Security and Privacy, pp. 63-74, 2011.
[2] Y. Chen, S. Nyemba, W. Zhang, and B. Malin, “Leveraging Social Networks to Detect Anomalous Insider Actions
in Collaborative Environments,” Proc. IEEE Ninth Intelligence and Security Informatics, pp. 119-124, 2011.
[3] W. Zhang, C. Gunter, D. Liebovitz, J. Tian, and B. Malin, “Role Prediction Using Electronic Medical Record
System Audits,” Proc. Ann. Symp. Am. Medical Informatics Assoc., pp. 858-867, 2011.

[4] B. Malin, S. Nyemba, and J. Paulett, “Learning Relational Policies from Electronic Health Record Access Logs,” J.
Biomedical Informatics, vol. 44, no. 2, pp. 333-342, 2011.
[5] W. Eberle and L. Holder, “Applying Graph-Based Anomaly Detection Approaches to the Discovery of Insider
Threats,” Proc. IEEE Int’l Conf. Intelligence and Security Informatics, pp. 206-208, 2009.
[6] D. Pokrajac, A. Lazarevic, and L. Latecki, “Incremental Local Outlier Detection for Data Streams,” Proc. IEEE
Symp. Computational Intelligence and Data Mining, pp. 504-515, 2007.
[7] J. Sun, H. Qu, D. Chakrabarti, and C. Faloutsos, “Neighborhood Formation and Anomaly Detection in Bipartite
Graph,” Proc. IEEE Fifth Int’l Conf. Data Mining, pp. 418-425, 2005.
[8] A. Patcha and J. M. Park, “An Overview of Anomaly Detection Techniques: Existing Solutions and Latest
Technological Trends,” Computer Networks, vol. 51, no. 12, pp. 3448-3470, 2007.
[9] T. E. Senator, H. G. Goldberg, A. Memory, W. T. Young, B. Rees, R. Pierce, D. Huang, M. Reardon, D. A. Bader,
E. Chow, I. Essa, J. Jones, V. Bettadapura, D. H. Chau, O. Green, O. Kaya, A. Zakrzewska, E. Briscoe, R. I. L.
Mappus, R. McColl, L. Weiss, T. G. Dietterich, A. Fern, W. Wong, S. Das, A. Emmott, J. Irvine, J. Lee, D. Koutra,
C. Faloutsos, D. Corkill, L. Friedland, A. Gentzel, and D. Jensen, “Detecting Insider Threats in a Real
Corporate Database of Computer Usage Activity,” Proc. 19th ACM SIGKDD Int’l Conf. Knowledge Discovery and Data
Mining, New York, NY, USA, pp. 1393-1401, 2013.
[10] Y. Chen, S. Nyemba, and B. Malin, “Detecting Anomalous Insiders in Collaborative Information Systems,”
IEEE Trans. Dependable and Secure Computing, vol. 9, no. 3, pp. 332-344, May-June 2012.
