
Phishing-Aware: A Neuro-Fuzzy Approach for Anti-Phishing on Fog Networks

Abstract—

Today, search engines are tightly coupled with social networks and present users with a double-edged sword: they retrieve information of interest to users, but they are also capable of spreading viruses introduced by hackers. It is

challenging to characterize how a search engine spreads viruses, since the search

engine serves as a virtual virus pool and creates propagation paths over the

underlying network structure. In this paper, we quantitatively analyze virus

propagation effects and the stability of the virus propagation process in the presence

of a search engine in social networks. First, although social networks have a

community structure that impedes virus propagation, we find that a search engine

generates a propagation wormhole. Second, we propose an epidemic feedback model

and quantitatively analyze propagation effects employing four metrics: infection

density, the propagation wormhole effect, the epidemic threshold, and the basic

reproduction number. Third, we verify our analyses on four real-world data sets and

two simulated data sets. Moreover, we prove that the proposed model has the

property of partial stability. Evaluation results show that, compared with a case without a search engine present, virus propagation with the search engine has a

higher infection density, shorter network diameter, greater propagation velocity,

lower epidemic threshold, and larger basic reproduction number.
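
For reference, two of these metrics can be written in standard SIS-epidemic notation; this is a textbook sketch under homogeneous-mixing assumptions, not necessarily the exact feedback model analyzed in the paper:

% beta = per-contact infection rate, delta = recovery rate,
% <k> = mean node degree, I(t) = infected nodes at time t, N = network size
\[
\text{infection density: } \rho(t) = \frac{I(t)}{N}, \qquad
\text{basic reproduction number: } R_0 = \frac{\beta \langle k \rangle}{\delta}
\]

An outbreak persists when R_0 > 1; equivalently, the epidemic threshold is the critical infection rate \(\beta_c = \delta / \langle k \rangle\). Anything that raises the effective mean degree, as the search engine's wormhole does, raises R_0 and lowers the threshold.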


Objective:

Search engines provide a highly effective means of information retrieval. But the search engine is also a platform for spreading information. Because of these

features, the propagators of malicious code have kept in step with search engines,

building a hidden relationship with them. In recent years, so-called worms have

utilized the search engine to spread themselves across the Web. A search engine is

a quick and easy vehicle for malicious code to locate new targets. But it also has

some downsides. The search engine can be a single point of failure. Operators

can choose not to return results for the malicious code’s query or even purge the

search engine database of web pages that match the query if they discover they are

being used for malicious purposes. Moreover, search engine poisoning (SEP) has been employed by malicious software that publishes vicious and fake pages to push page rankings higher and attract more visits.

Existing System:

Recent years have witnessed the rapid development of viruses, and the wide

variety of security threats caused by viruses has heightened the need to study virus

propagation. For example, a new threat, called search engine poisoning, has recently appeared, exacerbating the situation by spreading viruses like wildfire. Hundreds of thousands, even millions, of people all over the globe have

become victims. In many cases, the search engine plays a vital role in the

propagation of viruses. For example, a user publishes a post on Facebook on a particular topic in which malicious code is hidden; other social network users, such

as Twitter users, may search for that topic and subsequently visit the malicious

Facebook web pages. Through the search engine, the malicious codes are then

propagated from Facebook to Twitter.

Disadvantages:

• Search engines put illegitimate search results ahead of legitimate ones.

• Search engines had no strict control over advertisements and search results, which were exploited by attackers.

• Search results were manipulated to increase advertising revenue, allowing attackers to spread malicious code.

• This caused many users' identities to be exposed.

Proposed System:

Our goal is to address these challenges by analyzing the virus propagation

effects of the search engine, which appears to be a hidden power for virus

propagation. To achieve our research goal, we first need to analyze how a search
engine increases propagation sources and routes in social networks. As a virtual

virus pool, a search engine may contain a lot of viruses to increase propagation

sources: any user accessing web pages may be infected, and thus those activities

increase the propagation routes. Second, we need to quantitatively analyze the

propagation effect of the search engine. In building a specific propagation model

that combines the social network and the search engine, some key metrics of virus

propagation need to be analyzed. Third, we design experiments to verify this

analysis. Data sets of current real social networks should be tested and discussed.
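
To make the intended analysis concrete, the following minimal Java sketch simulates an SI process on two weakly coupled ring communities, with an optional search-engine hub that exposes every node to a global virus pool. The topology and parameter values are illustrative assumptions, not the paper's exact model.

import java.util.*;

/**
 * Minimal discrete-time SI simulation of the "propagation wormhole":
 * two ring communities joined by one weak bridge, plus an optional
 * search-engine hub linking every node to a global virus pool.
 */
public class WormholeSimulation {

    public static void main(String[] args) {
        System.out.println("Without search engine: " + simulate(false));
        System.out.println("With search engine:    " + simulate(true));
    }

    /** Runs the SI process and returns the final infection density. */
    static double simulate(boolean withSearchEngine) {
        final int communitySize = 50, steps = 30;
        final double beta = 0.05;                  // per-contact infection rate
        Random rng = new Random(42);

        // Build two ring communities joined by a single bridge edge.
        int n = 2 * communitySize;
        List<Set<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new HashSet<>());
        for (int c = 0; c < 2; c++)
            for (int i = 0; i < communitySize; i++) {
                int u = c * communitySize + i;
                int v = c * communitySize + (i + 1) % communitySize;
                adj.get(u).add(v);
                adj.get(v).add(u);
            }
        adj.get(0).add(communitySize);
        adj.get(communitySize).add(0);             // the weak inter-community bridge

        boolean[] infected = new boolean[n];
        infected[0] = true;                        // seed the virus in community 0

        for (int t = 0; t < steps; t++) {
            boolean[] next = infected.clone();
            for (int u = 0; u < n; u++) {
                if (infected[u]) continue;
                for (int v : adj.get(u))           // infection over real edges
                    if (infected[v] && rng.nextDouble() < beta) next[u] = true;
                // The hub exposes every susceptible node to the global virus
                // pool (non-empty here, since the seed stays infected in SI).
                if (withSearchEngine && rng.nextDouble() < beta) next[u] = true;
            }
            infected = next;
        }
        int count = 0;
        for (boolean b : infected) if (b) count++;
        return (double) count / n;
    }
}

Under these assumptions, the run with the hub infects far more of the second community in the same number of steps, which is exactly the wormhole effect the analysis targets.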

Advantages:

• We discover that a search engine generates a propagation wormhole effect by

delivering virtual virus sources to each community.

• The search engine gathers viruses from a global domain and serves as a virtual

virus pool, and the propagation wormhole spreads those viruses across the

whole network.

• We theoretically analyze the positive feedback epidemic model with and

without the presence of a search engine.

• We find that virus propagation with the search engine has partial stability.
SYSTEM CONFIGURATION:

H/W SYSTEM CONFIGURATION:

• Processor - Intel Pentium

• RAM - 4 GB (min)

• Hard Disk - 260 GB

S/W SYSTEM CONFIGURATION:

• Operating System - Windows 7/8/10

• Front End - HTML, J2EE

• Database - MySQL

• Database Connectivity - JDBC
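
As a minimal illustration of the JDBC/MySQL connectivity listed above, the sketch below opens a connection and runs a parameterized query; the schema name, table, and credentials are placeholders, not actual project values.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

/** Minimal JDBC/MySQL sketch; database, table, and credentials are placeholders. */
public class DbConnectivity {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/phishing_aware"; // hypothetical schema
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "SELECT url, risk_score FROM checked_urls WHERE url = ?")) {
            ps.setString(1, "http://example.com");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next())
                    System.out.println(rs.getString(1) + " -> " + rs.getDouble(2));
            }
        }
    }
}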

Scope of Project:

The rest of this paper is organized as follows. We present the social network

model and our problem description; the propagation model with the search engine is

established. A theoretical mathematical analysis is presented. Evaluations are discussed, and the related work is briefly reviewed.

CONCLUSION
With the proliferation of social networks and their ever-increasing use, viruses

have become much more prevalent. We investigate the propagation effect of search

engines, and characterize the positive feedback effect and the propagation wormhole

effect. The virtual virus pool and virtual infection paths that are formed by a search

engine make propagation take place much more quickly. We show that propagation

velocity is higher, infection density is larger, the epidemic threshold is lower, and the basic reproduction number is greater in the presence of a search engine. Finally,

we conduct experiments that verify the propagation effect in terms of both infection

density and virus propagation velocity. Results show the significant influence of a

search engine, particularly its ability to accelerate virus propagation in social

networks.

Literature Survey

PAPER: 1

TITLE: Systematization of Knowledge (SoK): A Systematic Review of Software-Based Web Phishing Detection.

AUTHOR: Zuochao Dou, Issa Khalil, Abdallah Khreishah, Ala Al-Fuqaha

PUBLICATION: 2017(IEEE)

CONCEPT DISCUSSED:
Phishing is a form of cyber-attack that leverages social engineering approaches and other sophisticated techniques to harvest personal information from users of websites. The average annual growth rate of the number of unique phishing websites detected by the Anti-Phishing Working Group (APWG) is 36.29% for the past six years

and 97.36% for the past two years.

WORKDONE:

In this paper, we provide a systematic study of existing phishing detection works

from different perspectives. We first describe the background knowledge about the

phishing ecosystem and the state-of-the-art phishing statistics. Then we present a

systematic review of the automatic phishing detection schemes. Specifically, we

provide a taxonomy of the phishing detection schemes, discuss the datasets used in

training and evaluating various detection approaches, discuss the features used by

various detection schemes, and discuss the underlying detection algorithms and the

commonly used evaluation metrics.

PROBLEM IDENTIFICATION:

On the other hand, the NRTTs (network round-trip times) of both the phishing and the legitimate website are

measured at the same time in real time, and hence, permanent instabilities are not a

concern. Long term instabilities are also not a concern in our problem. This is
because local network congestion does not apply in the case of web hosting servers

compared to web clients who may have poor network connections.

KNOWLEDGE GAINED:

However, it is quite challenging to evaluate the robustness of a feature in a

systematic and measurable way. The importance of the problem has been recognized

by many researchers in other domains as well (e.g. [123], [82], [118]). However, to

the best of our knowledge, none of the existing approaches provide a framework that

can be used to quantitatively evaluate the robustness of features.

GAP:

Therefore, an approach may result in excellent detection accuracy at the time of

design or in its early deployment, but may fail miserably later due to either changes in

the dataset or deliberate manipulation of the features utilized by the approach. They

fail to handle large scale datasets and cannot cope with high data rates, frequent

dataset changes, or adaptive attack behaviors. Therefore, machine learning

technologies, which utilize data-driven algorithms, were introduced to help automate

the learning process. Different machine learning algorithms are used.

PAPER: 2

TITLE: A PageRank Based Detection Technique for Phishing Web Sites.


AUTHOR: A. Naga Venkata Sunil, Anjali Sardana

PUBLICATION: 2012(IEEE)

CONCEPT DISCUSSED:

Phishing is an attempt to acquire a user's information without their knowledge by tricking them with a look-alike website or with emails that appear to come from a legitimate site. Phishing is a social cyber-threat that causes severe economic loss to users; because of phishing attacks, the number of online transaction users is declining. This paper aims to design and implement a new technique to detect

phishing web sites using Google’s PageRank. Google gives a PageRank value to

each site in the web. This work uses the PageRank value and other features to

classify phishing sites from normal sites. We have collected a dataset of 100 phishing

sites and 100 legitimate sites for our use.

WORKDONE:

In this work, a new technique to detect phishing websites has been designed and

implemented. This is a lighter-weight technique for detecting phishing than the techniques discussed in [7] and [10], because those techniques require more computation. We have considered the GTR value as an additional heuristic, because Google's PageRank is more reliable, and for legitimate sites the GTR value will be high. So this technique will easily classify phishing URLs. Phishing sites will have a very low GTR value, so they can be easily identified as phishing sites by using the values of this heuristic and the other five heuristics.

PROBLEM IDENTIFICATION:

In this technique, the submitted URL is compared with a blacklist; if it matches a URL in the blacklist, then we can say that the URL is a phishing URL. The problem with blacklisting is that it does not cover all phishing sites.

KNOWLEDGE GAINED:

This technique implements a simple forward linear model for classification. This technique shows a high accuracy of 98%, with a high true positive rate of 0.98 and very low false positive and false negative rates of 0.02. In future, the system

can be implemented by adding more heuristics to the technique proposed to attain

high accuracy rate to classify the phishing sites from the legitimate sites. This

technique can be combined with other techniques to build a hybrid technique to

detect phished websites.
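
The "simple forward linear model" is not spelled out here; the sketch below shows one plausible reading, a weighted sum of normalized heuristic scores compared against a threshold. The feature order, weights, and threshold are illustrative assumptions, not values from the paper.

/** Plausible reading of a forward linear model over six heuristic scores. */
public class LinearPhishingClassifier {

    // Hypothetical weights, GTR-derived score first; all features are
    // normalized to [0,1], where higher means more suspicious.
    private static final double[] WEIGHTS = {0.35, 0.20, 0.15, 0.10, 0.10, 0.10};
    private static final double THRESHOLD = 0.5;

    public static boolean isPhishing(double[] features) {
        double score = 0.0;
        for (int i = 0; i < WEIGHTS.length; i++) score += WEIGHTS[i] * features[i];
        return score > THRESHOLD;
    }

    public static void main(String[] args) {
        // A site with a suspicious (low) GTR and suspicious URL features.
        double[] features = {0.9, 0.8, 0.7, 0.6, 0.5, 0.4};
        System.out.println(isPhishing(features)); // true: weighted score is 0.73
    }
}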

GAP:

Before shopping on the net, users need to insert their credit card into a card reader and input their PIN code; the card reader then produces a one-time security password, and users can perform transactions only after the right password is input. Another method is to use biometric characteristics for user authentication. There is a drawback to this method because of the extra hardware that is needed.

PAPER: 3

TITLE: Software-Defined Network Function Virtualization: A Survey

AUTHOR: Yong Li (Member, IEEE) and Min Chen (Senior Member, IEEE)

PUBLICATION: 2015 (IEEE)

CONCEPT DISCUSSED:

Network function virtualization (NFV) is proposed to address these issues by

implementing network functions as pure software on commodity and general

hardware. NFV allows flexible provisioning, deployment, and centralized

management of virtual network functions.

WORKDONE:
In this work, they present a comprehensive overview of NFV within the software-defined NFV architecture. They introduce NFV and its relationship with SDN. They also look at the history of NFV, presenting how middleboxes evolved into virtual network functions. In particular, they choose service chaining as a typical application of NFV.

PROBLEM IDENTIFICATION:

How to design flexible and efficient APIs for both north-bound and south-bound communication is an important problem in the research and development of NFV technologies.

KNOWLEDGE GAINED:

They introduce the software-defined NFV architecture as the state of the art of NFV and present the relationship between NFV and SDN. Then they provide a historic view of the evolution from middleboxes to NFV. Finally, they introduce significant challenges and relevant solutions of NFV, and discuss its future research directions across different application domains.

GAP:
Network managers would like to consume as much or as little of the network as

they need, but there is a gap between what enterprise customers want and what

service providers can offer today, which can be addressed by NFV. It enables the

dynamic provisioning of virtual network services on commodity servers within

minutes instead of months.

PAPER: 4

TITLE: Phishing-Alarm: Robust and Efficient Phishing Detection via Page

Component Similarity

AUTHOR: Jian Mao, Wenqian Tian, Pei Li, Tao Wei, and Zhenkai Liang

PUBLICATION: 2017(IEEE)

CONCEPT DISCUSSED:

As a traditional information-stealing technique, phishing attacks still manage to cause many privacy-violation incidents. In a Web-based phishing attack, an

attacker sets up scam Web pages (pretending to be an important Website such as a

social network portal) to lure users to input their private information, such as

passwords, social security numbers, credit card numbers, and so on. In fact, the

appearance of Web pages is among the most important factors in deceiving users,
and thus, the similarity among Web pages is a critical metric for detecting phishing

Websites.

WORK DONE:

They propose a robust phishing detection approach, Phishing-Alarm, based on CSS

features of web pages. They develop techniques to identify effective CSS features,

as well as algorithms to efficiently evaluate page similarity. They prototyped

Phishing-Alarm as an extension to the Google Chrome browser and demonstrated

its effectiveness in evaluation using real-world phishing samples.

PROBLEM IDENTIFICATION:

Attackers may easily evade these approaches by using images to replace the corresponding content fragments of web pages, and they may also insert invisible content. Both attacks can disable text-based detection without affecting the visual layout of the phishing web pages. Rendered-page-based mechanisms evaluate page similarity by comparing the pixels of the rendered page. Unfortunately, these methods introduce a high additional performance cost

during image extraction.


KNOWLEDGE GAINED:

Cascading Style Sheets (CSS) is the technique used to specify page layout across browser implementations; their approach uses CSS as the basis to accurately quantify the visual similarity of each page element. As page elements do not have the same influence on pages, they base their rating method on weighted page-component similarity. They prototyped their approach in the Google Chrome browser. Their large-scale evaluation using real-world websites shows the effectiveness of the approach. The proof-of-concept implementation verifies the correctness and

accuracy of their approach with a relatively low performance overhead.
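
Phishing-Alarm's exact weighting scheme is not reproduced here; as one possible instantiation of weighted page-component similarity, the sketch below computes a weighted Jaccard-style overlap between the CSS feature sets of two pages, with illustrative features and weights.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Weighted overlap of CSS features; features and weights are assumptions. */
public class CssSimilarity {

    /** Sum of shared feature weights over total weight (weighted Jaccard). */
    static double similarity(Map<String, Double> pageA, Map<String, Double> pageB) {
        Set<String> union = new HashSet<>(pageA.keySet());
        union.addAll(pageB.keySet());
        double shared = 0.0, total = 0.0;
        for (String feature : union) {
            double wa = pageA.getOrDefault(feature, 0.0);
            double wb = pageB.getOrDefault(feature, 0.0);
            shared += Math.min(wa, wb);
            total += Math.max(wa, wb);
        }
        return total == 0.0 ? 0.0 : shared / total;
    }

    public static void main(String[] args) {
        Map<String, Double> target = new HashMap<>(), suspect = new HashMap<>();
        target.put("body{font-family:Arial}", 1.0);
        target.put("#login{width:320px}", 2.0);   // login box weighted higher
        suspect.put("body{font-family:Arial}", 1.0);
        suspect.put("#login{width:320px}", 2.0);
        suspect.put(".ad{display:none}", 0.5);
        System.out.println(similarity(suspect, target)); // ~0.86: likely a visual clone
    }
}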

GAP:

These approaches are not resilient to evasion, where attackers can change the content used by the above solutions but can still lure victim users. Content-based

approaches generally extract content features of web pages to identify suspicious

websites. To deal with such evasion attempts, some solutions compare images of

rendered pages to evaluate their visual similarity.

PAPER: 5

TITLE: An Intelligent Anti-phishing Strategy Model for Phishing Website

Detection

AUTHOR: Weiwei Zhuang, Qingshan Jiang, Tengke Xiong


PUBLICATION: 2012(IEEE)

CONCEPT DISCUSSED:

As a new form of malicious software, phishing websites have appeared frequently in recent years, causing great harm to online financial services and data security.

In this paper, we design and implement an intelligent model for detecting phishing

websites.

WORK DONE:

We proposed a framework for intelligent phishing website detection via an

ensemble of the prediction results generated by different feature classifiers and a

hierarchical clustering algorithm for phishing categorization. Empirical studies on

large and real daily data sets collected by Kingsoft Internet Security Lab illustrate

that our method performs well in phishing website detection and categorization.

PROBLEM IDENTIFICATION:

Studies that use the URL address, domain name information, website ranking, etc. as webpage features always lead to lower recognition rates. Heuristic and machine learning methods that use features drawn from the text and images of the webpage have been introduced to phishing detection, but most of them have high complexity and high false positive rates. Most of the current studies were conducted on small experimental data sets, so the robustness and effectiveness of these algorithms on real large-scale data sets cannot be guaranteed. Furthermore, the number of phishing sites grows very fast, so how to identify phishing websites among masses of legitimate websites in real time must also be addressed.

KNOWLEDGE GAINED:

Heterogeneous classifiers are then built based on these different features. We

propose a principled ensemble classification algorithm to combine the predicted

results from different phishing detection classifiers. A hierarchical clustering technique has been employed for automatic phishing categorization. Case studies on large and real daily phishing websites collected from

Kingsoft Internet Security Lab demonstrate that our proposed model outperforms

other commonly used anti-phishing methods and tools in phishing website detection.
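
The ensemble step can be sketched as a weighted vote over the base classifiers' phishing probabilities; the paper's actual combination rule is more principled, and the classifiers and weights below are illustrative assumptions.

/** Weighted-average vote over heterogeneous classifier outputs. */
public class EnsembleDetector {

    static boolean isPhishing(double[] probabilities, double[] weights) {
        double weightedSum = 0.0, weightTotal = 0.0;
        for (int i = 0; i < probabilities.length; i++) {
            weightedSum += weights[i] * probabilities[i];
            weightTotal += weights[i];
        }
        return weightedSum / weightTotal > 0.5;    // assumed decision threshold
    }

    public static void main(String[] args) {
        double[] p = {0.9, 0.4, 0.7};  // e.g. URL-, text-, and image-based classifiers
        double[] w = {1.0, 0.5, 0.8};  // per-classifier trust weights
        System.out.println(isPhishing(p, w)); // (0.9+0.2+0.56)/2.3 ≈ 0.72 -> true
    }
}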

GAP:

This solution has greatly improved phishing detection efficiency compared to the old signature-based detection method. Prior work presented a whitelist-based approach that prevents access to known phishing sites and warns about phishing-suspicious accesses via a URL similarity check; a mechanism comparing DNS query results was also proposed to cope with local and DNS pharming attacks.

PAPER: 6

TITLE: Neural Markers of Cybersecurity: An fMRI Study of Phishing and Malware Warnings

AUTHOR: Ajaya Neupane, Nitesh Saxena, Jose O Maximo, and Rajesh Kana

PUBLICATION: 2016(IEEE)

CONCEPT DISCUSSED: The security of computer systems often relies upon

decisions and actions of end users. In this paper, we set out to investigate users’

susceptibility to cybercriminal attacks by concentrating on the most fundamental

component governing user behavior – the human brain.

WORK DONE:

We presented an fMRI study to bring insights into user-centered security by

focusing on phishing detection and responding to malware warnings. Our results

provide a largely positive perspective towards users’ capability and performance vis-

à-vis these crucial security tasks.

PROBLEM IDENTIFICATION:
We found that, in both phishing detection and malware warnings tasks,

impulsive individuals showed significantly less brain activation and connectivity in

regions governing decision-making and problem solving. This implies that

impulsive behavior might be counter-productive to phishing detection and malware

warnings task performance.

KNOWLEDGE GAINED:

We found that users showed significant brain activity in key regions known

to govern decision-making, attention, and problem-solving ability (phishing and

malware warnings) as well as language comprehension and reading (malware

warnings). Apart from that, we saw strong functional connectivity in several regions

of the brain while performing the phishing task.

GAP:

Each trial displayed a website snapshot for 6s followed by a gap of 6s. The

experiment started with the set of instructions followed by a fixation for 10s, and

after every 6 trials, a fixation of 10s was displayed on the screen.


Architecture Diagram:

MODULES:

1. Social Networks Model

2. Epidemic Model in Social Networks

3. Propagation Canalization Module

4. Feedback Model
MODULES DESCRIPTION:

SOCIAL NETWORKS MODEL

With the proliferation of social networks and their ever-increasing use, viruses have become much more prevalent. In this module, the user logs in to the application and uses the search engine to search for any content in the application, retrieving the required data with respect to the keywords entered in the search engine.

EPIDEMIC MODEL IN SOCIAL NETWORKS

The user clicks unofficial links and accesses the data along with a virus, which infects the application during data retrieval. In a static network,

weakly connected heterogeneous communities can have significantly different

infection levels.

PROPAGATION CANALIZATION MODULE

Results show the significant influence of a search engine particularly its ability to

accelerate virus propagation in social networks. In contrast, adaptation promotes

similar infection levels and alters the network structure so that communities have

more similar average degrees.

FEEDBACK MODEL:
Based on user reviews, the acceleration of the virus in the official link is predicted. Whenever the user visits any uniform resource locator (URL) through the search engine, before being redirected to that URL, the user is given broader details about the link: how badly it is affected and how safe it is to access.
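
A minimal sketch of this module in the project's Java stack might aggregate user reviews per URL into a risk score shown before redirection; the in-memory storage, scoring rule, and example below are illustrative assumptions (the project would persist reviews in MySQL via JDBC).

import java.util.HashMap;
import java.util.Map;

/** Aggregates per-URL user reviews into a risk score shown before redirection. */
public class UrlFeedback {

    private final Map<String, int[]> reviews = new HashMap<>(); // url -> {safe, unsafe}

    public void addReview(String url, boolean reportedSafe) {
        int[] counts = reviews.computeIfAbsent(url, k -> new int[2]);
        counts[reportedSafe ? 0 : 1]++;
    }

    /** Fraction of reviews reporting the URL as unsafe (0 if no reviews yet). */
    public double riskScore(String url) {
        int[] c = reviews.getOrDefault(url, new int[2]);
        int total = c[0] + c[1];
        return total == 0 ? 0.0 : (double) c[1] / total;
    }

    public static void main(String[] args) {
        UrlFeedback feedback = new UrlFeedback();
        feedback.addReview("http://example.com/login", false);
        feedback.addReview("http://example.com/login", false);
        feedback.addReview("http://example.com/login", true);
        System.out.printf("risk = %.2f%n", feedback.riskScore("http://example.com/login"));
        // risk = 0.67 -> warn the user before redirecting to the link
    }
}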
