Documente Academic
Documente Profesional
Documente Cultură
INTRODUCTION
With 20 million installs a day, third-party apps are a major reason for the popularity and
addictiveness of Facebook. Unfortunately, hackers have realized the potential of using apps for
spreading malware and spam. The problem is already significant, as we find that at least 13% of
apps in our dataset are malicious. So far, the research community has focused on detecting
malicious posts and campaigns. In this paper, we ask the question: Given a Facebook
application, can we determine if it is malicious? Our key contribution is in developing
FRAppE—Facebook’s Rigorous Application Evaluator—arguably the first tool focused on
detecting malicious apps on Facebook. To develop FRAppE, we use information gathered by
observing the posting behavior of 111K Facebook apps seen across 2.2 million users on
Facebook. First, we identify a set of features that help us distinguish malicious apps from benign
ones. For example, we find that malicious apps often share names with other apps, and they
typically request fewer permissions than benign apps. Second, leveraging these distinguishing
features, we show that FRAppE can detect malicious apps with 99.5% accuracy, with no false
positives and a high true positive rate (95.9%). Finally, we explore the ecosystem of malicious
Facebook apps and identify mechanisms that these apps use to propagate. Interestingly, we find
that many apps collude and support each other; in our dataset, we find 1584 apps enabling the
viral propagation of 3723 other apps through their posts. Long term, we see FRAppE as a step
toward creating an independent watchdog for app assessment and ranking, so as to warn
Facebook users before installing apps.
Wikipedia defines a social network service as a service which “focuses on the building and
verifying of online social networks for communities of people who share interests and activities,
or who are interested in exploring the interests and activities of others, and which necessitates the
use of software.”
A report published by OCLC provides the following definition of social networking sites: “Web
sites primarily designed to facilitate interaction between users who share interests, attitudes and
activities, such as Facebook, Mixi and MySpace.”
1.2 What Can Social Networks Be Used For?
The popularity and ease of use of social networking services have excited institutions with their
potential in a variety of areas. However effective use of social networking services poses a
number of challenges for institutions including long-term sustainability of the services; user
concerns over use of social tools in a work or study context; a variety of technical issues and
legal issues such as copyright, privacy, accessibility; etc.
Institutions would be advised to consider carefully the implications before promoting significant
use of such services.
What is networking?
Networking is the word basically relating to computers and their connectivity. It is very often
used in the world of computers and their use in different connections. The term networking
implies the link between two or more computers and their devices, with the vital purpose of
sharing the data stored in the computers, with each other. The networks between the computing
devices are very common these days due to the launch of various hardware and computer
software which aid in making the activity much more convenient to build and use.
General Network Techniques - When computers communicate on a network, they send out
data packets without knowing if anyone is listening. Computers in a network all have a
connection to the network and that is called to be connected to a network bus. What one
computer sends out will reach all the other computers on the local network.
Above diagrams show the clear idea about the networking functions
For the different computers to be able to distinguish between each other, every computer has a
unique ID called MAC-address (Media Access Control Address). This address is not only unique
on your network but unique for all devices that can be hooked up to a network. The MAC-
address is tied to the hardware and has nothing to do with IP-addresses. Since all computers on
the network receives everything that is sent out from all other computers the MAC-addresses is
primarily used by the computers to filter out incoming network traffic that is addressed to the
individual computer.
When a computer communicates with another computer on the network, it sends out both the
other computers MAC-address and the MAC-address of its own. In that way the receiving
computer will not only recognize that this packet is for me but also, who sent this data packet so
a return response can be sent to the sender.
On an Ethernet network: as described here, all computers hear all network traffic since
they are connected to the same bus. This network structure is called multi-drop.
One problem with this network structure is that when you have, let say ten (10) computers on a
network and they communicate frequently and due to that they sends out there data packets
randomly, collisions occur when two or more computers sends data at the same time. When that
happens data gets corrupted and has to be resent. On a network that is heavy loaded even the
resent packets collide with other packets and have to be resent again. In reality this soon
becomes a bandwidth problem. If several computers communicate with each other at high speed
they may not be able to utilize more than 25% of the total network bandwidth since the rest of
the bandwidth is used for resending previously corrupted packets. The way to minimize this
problem is to use network switches.
Characteristics of Networking:
The following characteristics should be considered in network design and ongoing maintenance:
6) Scalability defines how well the network can adapt to new growth, including new users,
applications, and network components.
7) Topology describes the physical cabling layout and the logical way data moves between
components.
Organizations of different structures, sizes, and budgets need different types of networks.
Networks can be divided into one of two categories:
peer-to-peer
server-based networks
1. Peer-to-Peer Network:
A peer-to-peer network has no dedicated servers; instead, a number of workstations are
connected together for the purpose of sharing information or devices. Peer-to-peer networks are
designed to satisfy the networking needs of home networks or of small companies that do not
want to spend a lot of money on a dedicated server but still want to have the capability to share
information or devices like in school, college, cyber cafe
2. Server-Based Networks:
In server-based network data files that will be used by all of the users are stored on the one
server. With a server-based network, the network server stores a list of users who may use
network resources and usually holds the resources as well.
This will help by giving you a central point to set up permissions on the data files, and it will
give you a central point from which to back up all of the data in case data loss should occur.
Network Communications:
Computer networks use signals to transmit data, and protocols are the languages computers
use to communicate.
Protocols provide a variety of communications services to the computers on the network.
Local area networks connect computers using a shared, half-duplex, baseband medium, and
wide area networks link distant networks.
Enterprise networks often consist of clients and servers on horizontal segments connected by a
common backbone, while peer-to-peer networks consist of a small number of computers on a
single LAN.
Advantages of Networking:
KRISHNA CHAITANYA DEGREE AND PG COLLEGE Page 7
Detecting Malicious facebook Applications
1. Easy Communication:
It is very easy to communicate through a network. People can communicate efficiently using a
network with a group of people. They can enjoy the benefit of emails, instant messaging,
telephony, video conferencing, chat rooms, etc.
This is one of the major advantages of networking computers. People can find and share
information and data because of networking. This is beneficial for large organizations to
maintain their data in an organized manner and facilitate access for desired people.
3. Sharing Hardware:
Another important advantage of networking is the ability to share hardware. For an example, a
printer can be shared among the users in a network so that there’s no need to have individual
printers for each and every computer in the company. This will significantly reduce the cost of
purchasing hardware.
4. Sharing Software:
Users can share software within the network easily. Networkable versions of software are
available at considerable savings compared to individually licensed version of the same software.
Therefore large companies can reduce the cost of buying software by networking their
computers.
5. Security:
Sensitive files and programs on a network can be password protected. Then those files can only
be accessed by the authorized users. This is another important advantage of networking when
there are concerns about security issues. Also each and every user has their own set of privileges
to prevent those accessing restricted files and programs
6. Speed:
Sharing and transferring files within networks is very rapid, depending on the type of network.
This will save time while maintaining the integrity of files.
SYSTEM OVERVIEW
HARDWARE REQUIREMENTS:
SOFTWARE REQUIREMENTS:
SYSTEM STUDY
FEASIBILITY STUDY
ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will
have on the organization. The amount of fund that the company can pour into the
research and development of the system is limited. The expenditures must be
justified. Thus the developed system as well within the budget and this was
achieved because most of the technologies used are freely available. Only the
customized products had to be purchased.
TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the
technical requirements of the system. Any system developed must not have a high
demand on the available technical resources. This will lead to high demands on the
available technical resources. This will lead to high demands being placed on the
client. The developed system must have a modest requirement, as only minimal or
null changes are required for implementing this system.
SOCIAL FEASIBILITY
The aspect of study is to check the level of acceptance of the system by the
user. This includes the process of training the user to use the system efficiently.
The user must not feel threatened by the system, instead must accept it as a
necessity. The level of acceptance by the users solely depends on the methods that
are employed to educate the user about the system and to make him familiar with
it. His level of confidence must be raised so that he is also able to make some
constructive criticism, which is welcomed, as he is the final user of the system.
SYSTEM ANALYSIS
EXISTING SYSTEM:
So far, the research community has paid little attention to OSN apps specifically. Most
research related to spam and malware on Facebook has focused on detecting malicious
posts and social spam campaigns.
Gao et al. analyzed posts on the walls of 3.5 million Facebook users and showed that
10% of links posted on Facebook walls are spam. They also presented techniques to
identify compromised accounts and spam campaigns.
Yang et al. and Benevenuto et al. developed techniques to identify accounts of spammers
on Twitter. Others have proposed a honey-pot-based approach to detect spam accounts on
OSNs.
Yardi et al. analyzed behavioral patterns among spam accounts in Twitter.
Chia et al.investigate risk signaling on the privacy intrusiveness of Facebook apps and
conclude that current forms of community ratings are not reliable indicators of the
privacy risks associated with an app.
PROPOSED SYSTEM:
The proposed work is arguably the first comprehensive study focusing on malicious
Facebook apps that focuses on quantifying, profiling, and understanding malicious apps
and synthesizes this information into an effective detection approach.
Several features used by FRAppE, such as the reputation of redirect URIs, the number of
required permissions, and the use of different client IDs in app installation URLs, are
robust to the evolution of hackers.
Software Environment
Java Technology
Simple
Architecture neutral
Object oriented
Portable
Distributed
High performance
Interpreted
Multithreaded
Robust
Dynamic
Secure
You can think of Java byte codes as the machine code instructions for the
Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a
development tool or a Web browser that can run applets, is an implementation of
the Java VM. Java byte codes help make “write once, run anywhere” possible. You
can compile your program into byte codes on any platform that has a Java
compiler. The byte codes can then be run on any implementation of the Java VM.
That means that as long as a computer has a Java VM, the same program written in
the Java programming language can run on Windows 2000, a Solaris workstation,
or on an iMac.
Native code is code that after you compile it, the compiled code runs
on a specific hardware platform. As a platform-independent environment,
the Java platform can be a bit slower than native code. However, smart
compilers, well-tuned interpreters, and just-in-time byte code compilers can
bring performance close to that of native code without threatening
portability.
What Can Java Technology Do?
The most common types of programs written in the Java programming
language are applets and applications. If you’ve surfed the Web, you’re
probably already familiar with applets. An applet is a program that adheres
to certain conventions that allow it to run within a Java-enabled browser.
However, the Java programming language is not just for writing cute,
entertaining applets for the Web. The general-purpose, high-level Java
programming language is also a powerful software platform. Using the
generous API, you can write many types of programs.
An application is a standalone program that runs directly on the Java
platform. A special kind of application known as a server serves and
supports clients on a network. Examples of servers are Web servers, proxy
servers, mail servers, and print servers. Another specialized program is a
servlet. A servlet can almost be thought of as an applet that runs on the
server side. Java Servlets are a popular choice for building interactive web
applications, replacing the use of CGI scripts. Servlets are similar to applets
in that they are runtime extensions of applications. Instead of working in
browsers, though, servlets run within Java Web servers, configuring or
tailoring the server.
How does the API support all these kinds of programs? It does so with
packages of software components that provides a wide range of
functionality. Every full implementation of the Java platform gives you the
following features:
The essentials: Objects, strings, threads, numbers, input and output,
data structures, system properties, date and time, and so on.
Applets: The set of conventions used by applets.
Networking: URLs, TCP (Transmission Control Protocol), UDP
(User Data gram Protocol) sockets, and IP (Internet Protocol)
addresses.
Internationalization: Help for writing programs that can be localized
for users worldwide. Programs can automatically adapt to specific
locales and be displayed in the appropriate language.
Security: Both low level and high level, including electronic
signatures, public and private key management, access control, and
certificates.
Software components: Known as JavaBeansTM, can plug into
existing component architectures.
Object serialization: Allows lightweight persistence and
communication via Remote Method Invocation (RMI).
Java Database Connectivity (JDBCTM): Provides uniform access to
a wide range of relational databases.
KRISHNA CHAITANYA DEGREE AND PG COLLEGE Page 18
Detecting Malicious facebook Applications
The Java platform also has APIs for 2D and 3D graphics, accessibility,
servers, collaboration, telephony, speech, animation, and more. The
following figure depicts what is included in the Java 2 SDK.
We can’t promise you fame, fortune, or even a job if you learn the Java
programming language. Still, it is likely to make your programs better and
requires less effort than other languages. We believe that Java technology
will help you do the following:
Get started quickly: Although the Java programming language is a
powerful object-oriented language, it’s easy to learn, especially for
programmers already familiar with C or C++.
Write less code: Comparisons of program metrics (class counts,
method counts, and so on) suggest that a program written in the Java
programming language can be four times smaller than the same
program in C++.
ODBC
JDBC
In an effort to set an independent database standard API for Java; Sun
Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a
generic SQL database access mechanism that provides a consistent interface to a
variety of RDBMSs. This consistent interface is achieved through the use of “plug-
in” database connectivity modules, or drivers. If a database vendor wishes to have
JDBC support, he or she must provide the driver for each platform that the
database and Java run on.
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on
ODBC. As you discovered earlier in this chapter, ODBC has widespread support
on a variety of platforms. Basing JDBC on ODBC will allow vendors to bring
JDBC drivers to market much faster than developing a completely new
connectivity solution.
JDBC was announced in March of 1996. It was released for a 90 day public
review that ended June 8, 1996. Because of user input, the final JDBC v1.0
specification was released soon after.
The remainder of this section will cover enough information about JDBC for you
to know what it is about and how to use it effectively. This is by no means a
complete overview of JDBC. That would fill an entire book
.
JDBC Goals
Few software packages are designed without goals in mind. JDBC is one
that, because of its many goals, drove the development of the API. These goals, in
conjunction with early reviewer feedback, have finalized the JDBC class library
into a solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some
insight as to why certain classes and functionalities behave the way they do. The
eight design goals for JDBC are as follows:
5. Keep it simple
This goal probably appears in all software design goal listings. JDBC is no
exception. Sun felt that the design of JDBC should be very simple, allowing for
only one method of completing a task per mechanism. Allowing duplicate
functionality only serves to confuse the users of the API.
6. Use strong, static typing wherever possible
Strong typing allows for more error checking to be done at compile time;
also, less error appear at runtime.
7. Keep the common cases simple
Because more often than not, the usual SQL calls used by the programmer
are simple SELECT’s, INSERT’s, DELETE’s and UPDATE’s, these queries
should be simple to perform with JDBC. However, more complex SQL
statements should also be possible.
Java ha two things: a programming language and a platform. Java is a high-level
programming language that is all of the following
Simple Architecture-neutral
Object-oriented Portable
Distributed High-performance
Interpreted multithreaded
Robust Dynamic
Secure
Java is also unusual in that each Java program is both compiled and interpreted.
With a compile you translate a Java program into an intermediate language called
Java byte codes the platform-independent code instruction is passed and run on
the computer.
Compilation happens just once; interpretation occurs each time the program is
executed. The figure illustrates how this works.
Compilers My Program
You can think of Java byte codes as the machine code instructions for the Java
Virtual Machine (Java VM). Every Java interpreter, whether it’s a Java
development tool or a Web browser that can run Java applets, is an
implementation of the Java VM. The Java VM can also be implemented in
hardware.
Java byte codes help make “write once, run anywhere” possible. You can compile
your Java program into byte codes on my platform that has a Java compiler. The
byte codes can then be run any implementation of the Java VM. For example, the
same Java program can run Windows NT, Solaris, and Macintosh.
Networking
TCP/IP stack
The TCP/IP stack is shorter than the OSI one:
IP datagram’s
The IP layer provides a connectionless and unreliable delivery system.
It considers each datagram independently of the others. Any association
between datagram must be supplied by the higher layers. The IP layer
supplies a checksum that includes its own header. The header includes the
source and destination addresses. The IP layer handles routing through an
Internet. It is also responsible for breaking up large datagram into smaller
ones for transmission and reassembling them at the other end.
UDP
UDP is also connectionless and unreliable. What it adds to IP is a
checksum for the contents of the datagram and port numbers. These are
used to give a client/server model - see later.
TCP
TCP supplies logic to give a reliable connection-oriented protocol
above IP. It provides a virtual circuit that two processes can use to
communicate.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses
an address scheme for machines so that they can be located. The address
is a 32 bit integer which gives the IP address. This encodes a network ID
and more addressing. The network ID falls into various classes according
to the size of the network address.
Network address
Class A uses 8 bits for the network address with 24 bits left over for
other addressing. Class B uses 16 bit network addressing. Class C uses 24
bit network addressing and class D uses all 32.
Subnet address
Internally, the UNIX network is divided into sub networks. Building 11
is currently on one sub network and uses 10-bit addressing, allowing 1024
different hosts.
Host address
8 bits are finally used for host addresses within our subnet. This places
a limit of 256 machines that can be on the subnet.
Total address
Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit
number. To send a message to a server, you send it to the port for that
service of the host that it is running on. This is not location transparency!
Certain of these ports are "well known".
Sockets
A socket is a data structure maintained by the system to handle network
connections. A socket is created using the call socket. It returns an integer
that is like a file descriptor. In fact, under Windows, this handle can be
used with Read File and Write File functions.
#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Here "family" will be AF_INET for IP communications, protocol will
be zero, and type will depend on whether TCP or UDP is used. Two
Implement a new (to JFreeChart) feature for interactive time series charts --- to
display a separate control that shows a small version of ALL the time series
data, with a sliding "view" rectangle that allows you to select the subset of the
time series data to display in the main chart.
3. Dashboards
4. Property Editors
MIME Type or Content Type: If you see above sample HTTP response header, it
contains tag “Content-Type”. It’s also called MIME type and server sends it to
client to let them know the kind of data it’s sending. It helps client in rendering the
data for user. Some of the mostly used mime types are text/html, text/xml,
application/xml etc.
Understanding URL
URL is acronym of Universal Resource Locator and it’s used to locate the server
and resource. Every resource on the web has it’s own unique address. Let’s see
parts of URL with an example.
http://localhost:8080/FirstServletProject/jsps/hello.jsp
http:// – This is the first part of URL and provides the communication protocol to
be used in server-client communication.
localhost – the unique address of the server, most of the times it’s the hostname of
the server that maps to unique IP address. Sometimes multiple hostnames point to
same IP addresses and web server virtual host takes care of sending request to the
particular server instance.
8080 – This is the port on which server is listening, it’s optional and if we don’t
provide it in URL then request goes to the default port of the protocol. Port
numbers 0 to 1023 are reserved ports for well known services, for example 80 for
HTTP, 443 for HTTPS, 21 for FTP etc.
Web Container
Tomcat is a web container, when a request is made from Client to web server, it
passes the request to web container and it’s web container job to find the correct
resource to handle the request (servlet or JSP) and then use the response from the
resource to generate the response and provide it to web server. Then web server
sends the response back to the client.
When web container gets the request and if it’s for servlet then container creates
two Objects HTTPServletRequest and HTTPServletResponse. Then it finds the
correct servlet based on the URL and creates a thread for the request. Then it
invokes the servlet service() method and based on the HTTP method service()
method invokes doGet() or doPost() methods. Servlet methods generate the
dynamic page and write it to response. Once servlet thread is complete, container
converts the response to HTTP response and send it back to client.
Some of the important work done by web container are:
Communication Support – Container provides easy way of communication
between web server and the servlets and JSPs. Because of container, we
don’t need to build a server socket to listen for any request from web server,
parse the request and generate response. All these important and complex
tasks are done by container and all we need to focus is on our business logic
for our applications.
Lifecycle and Resource Management – Container takes care of managing
the life cycle of servlet. Container takes care of loading the servlets into
memory, initializing servlets, invoking servlet methods and destroying them.
Container also provides utility like JNDI for resource pooling and
management.
Deployment Descriptor
Web.xml file is the deployment descriptor of the web application and contains
mapping for servlets (prior to 3.0), welcome pages, security configurations,
session timeout settings etc.
That’s all for the java web application startup tutorial, we will explore Servlets
and JSPs more in future posts.
MySQL:
MySQL, the most popular Open Source SQL database management system, is
developed, distributed, and supported by Oracle Corporation.
The MySQL Web site (http://www.mysql.com/) provides the latest information
about MySQL software.
The MySQL Database Server is very fast, reliable, scalable, and easy to
use.
If that is what you are looking for, you should give it a try. MySQL Server
can run comfortably on a desktop or laptop, alongside your other
applications, web servers, and so on, requiring little or no attention. If you
dedicate an entire machine to MySQL, you can adjust the settings to take
advantage of all the memory, CPU power, and I/O capacity available.
MySQL can also scale up to clusters of machines, networked together.
You can find a performance comparison of MySQL Server with other
database managers on our benchmark page.
The official way to pronounce “MySQL” is “My Ess Que Ell” (not “my sequel”),
but we do not mind if you pronounce it as “my sequel” or in some other localized
way.
SYSTEM DESIGN
SYSTEM MODEL:
SYSTEM ARCHITECTURE:
LOGIN
ADMIN
USER
UNAUTHORIZED USER
VIEW USER DETALIS
SENDING REQUEST
ENTER KEY
UPLOAD IMAGE
SHARE IMAGES
BLOCK COMMAND
PRICACY SETTING
CONTROLLING APPLICATION
ENTER ATTACKER
MAINTAIN SECURITY
FIND URL
CLASSIFIED
GET NOTIFICATION DATAS
STOP REDIRECT
BLOCK SPAM
ENTER SPAM
INCREASE
End Process
ACCURACY
UML DIAGRAMS
GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling Language so that
they can develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the core
concepts.
3. Be independent of particular programming languages and development
process.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of OO tools market.
6. Support higher level development concepts such as collaborations,
frameworks, patterns and components.
7. Integrate best practices.
LOGIN
FIND FRIENDS
SHARE IMAGES
AUTHORIZED KEY
VIEW DATAS
FIND URL
ADMIN
STOP HYPERLINK
USER
FATCH ATTACKERS
MAINTAIN DATAS
ENTER SPAM
STOP SPAM
CLASS DIAGRAM:
SEQUENCE DIAGRAM:
login
enter key
authorized keys
block commands
find friends
enter attacker
post command
classified datas
find url
stop spam
enter spam
ACTIVITY DIAGRAM:
LOGIN
FIND FRIEND
AUTHORIZED KEY
SHARE IMAGES
UPLOAD IMAGE
CHECK DATAS
FIND URL
STOPHYPER LINK
STOP SPAM
ACCURACY
INPUT DESIGN
The input design is the link between the information system and the user. It
comprises the developing specification and procedures for data preparation and
those steps are necessary to put transaction data in to a usable form for processing
can be achieved by inspecting the computer to read data from a written or printed
document or it can occur by having people keying the data directly into the system.
The design of input focuses on controlling the amount of input required,
controlling the errors, avoiding delay, avoiding extra steps and keeping the process
simple. The input is designed in such a way so that it provides security and ease of
use with retaining the privacy. Input Design considered the following things:
OBJECTIVES
2. It is achieved by creating user-friendly screens for the data entry to handle large
volume of data. The goal of designing input is to make data entry easier and to be
free from errors. The data entry screen is designed in such a way that all the data
manipulates can be performed. It also provides record viewing facilities.
3. When the data is entered it will check for its validity. Data can be entered with
the help of screens. Appropriate messages are provided as when needed so that the
user will not be in maize of instant. Thus the objective of input design is to create
an input layout that is easy to follow
OUTPUT DESIGN
A quality output is one, which meets the requirements of the end user and presents
the information clearly. In any system results of processing are communicated to
the users and to other system through outputs. In output design it is determined
how the information is to be displaced for immediate need and also the hard copy
output. It is the most important and direct source information to the user. Efficient
and intelligent output design improves the system’s relationship to help user
decision-making.
The output form of an information system should accomplish one or more of the
following objectives.
IMPLEMENTATION
MODULES:
Data collection
Feature extraction
Training
Classification
Detecting Suspicious
MODULES DESCRIPTION:
Data collection
The data collection component has two subcomponents: the collection of facebook apps with
URLs and crawling for URL redirections. Whenever this component obtains a facebook app with
a URL, it executes a crawling thread that follows all redirections of the URL and looks up the
corresponding IP addresses. The crawling thread appends these retrieved URL and IP chains to
the tweet information and pushes it into a queue. As we have seen, our crawler cannot reach
malicious landing URLs when they use conditional redirections to evade crawlers. However,
because our detection system does not rely on the features of landing URLs, it works
independently of such crawler evasions.
Feature extraction
The feature extraction component has three subcomponents: grouping of identical domains,
finding entry point URLs, and extracting feature vectors.
To classify a post, MyPageKeeper evaluates every embedded URL in the post. Our key novelty
lies in considering only the social context (e.g., the text message in the post, and the number of
Likes on it) for the classification of the URL and the related post. Furthermore, we use the fact
that we are observing more than one user, which can help us detect an epidemic spread.
Training
The training component has two subcomponents: retrieval of account statuses and training of the
classifier. Because we use an offline supervised learning algorithm, the feature vectors for
training are relatively older than feature vectors for classification. To label the training vectors,
we use the account status; URLs from suspended accounts are considered malicious whereas
URLs from active accounts are considered benign. We periodically update our classifier using
labeled training vectors.
Classification
The classification component executes our classifier using input feature vectors to classify
suspicious URLs. When the classifier returns a number of malicious feature vectors, this
component flags the corresponding URLs information as suspicious.
The classification module uses a Machine Learning classifier based on Support Vector
Machines, but also utilizes several local and external white lists and blacklists that help speed up
the process and increase the over-all accuracy. The classification module receives a URL and the
related social context features extracted in the previous step.
These URLs, detected as suspicious, will be delivered to security experts or more sophisticated
dynamic analysis environments for an in-depth investigation.
Detecting Suspicious
The Detecting Suspicious and notification module notifies all users who have social malware
posts in their wall or news feed. The user can currently specify the notification mechanism,
which can be a combination of emailing the user or posting a comment on the suspect posts.
Screenshots:
LOGIN
ADMIN
USER
UNAUTHORIZED USER
VIEW USER DETALIS
SENDING REQUEST
ENTER KEY
UPLOAD IMAGE
SHARE IMAGES
BLOCK COMMAND
PRICACY SETTING
CONTROLLING APPLICATION
ENTER ATTACKER
MAINTAIN SECURITY
FIND URL
CLASSIFIED
GET NOTIFICATION DATAS
STOP REDIRECT
BLOCK SPAM
ENTER SPAM
INCREASE
End Process
ACCURACY
UML DIAGRAMS
GOALS:
The Primary goals in the design of the UML are as follows:
8. Provide users a ready-to-use, expressive visual modeling Language so that
they can develop and exchange meaningful models.
9. Provide extendibility and specialization mechanisms to extend the core
concepts.
10.Be independent of particular programming languages and development
process.
11.Provide a formal basis for understanding the modeling language.
12.Encourage the growth of OO tools market.
13.Support higher level development concepts such as collaborations,
frameworks, patterns and components.
14.Integrate best practices.
Registration
Login
Find Friends
Add App's
User
Admin
Provide License
Access App's
Logout
CLASS DIAGRAM:
SEQUENCE DIAGRAM:
Registration
Login
Login
Add Friends
Create App's
Access App's
LogOut
Logout
SYSTEM TESTING
Software system meets its requirements and user expectations and does not fail in
an unacceptable manner. There are various types of test. Each test type addresses a
specific testing requirement.
TYPES OF TESTS
Unit testing
Unit testing involves the design of test cases that validate that the internal
program logic is functioning properly, and that program inputs produce valid
outputs. All decision branches and internal code flow should be validated. It is the
testing of individual software units of the application .it is done after the
completion of an individual unit before integration. This is a structural testing, that
relies on knowledge of its construction and is invasive. Unit tests perform basic
tests at component level and test a specific business process, application, and/or
system configuration. Unit tests ensure that each unique path of a business process
performs accurately to the documented specifications and contains clearly defined
inputs and expected results.
Integration testing
Functional test
System Test
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results. An
example of system testing is the configuration oriented system integration test.
System testing is based on process descriptions and flows, emphasizing pre-driven
process links and integration points.
document. It is a testing in which the software under test is treated, as a black box
.you cannot “see” into it. The test provides inputs and responds to outputs without
considering how the software works.
Unit testing is usually conducted as part of a combined code and unit test
phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.
Test objectives
All field entries must work properly.
Pages must be activated from the identified link.
The entry screen, messages and responses must not be delayed.
Features to be tested
Verify that the entries are of the correct format
No duplicate entries should be allowed
All links should take the user to the correct page.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
CONCLUSION
Applications present convenient means for hackers to spread malicious content on
Facebook. However, little is understood about the characteristics of malicious apps
and how they operate. In this paper, using a large corpus of malicious Facebook
apps observed over a 9-month period, we showed that malicious apps differ
significantly from benign apps with respect to several features. For example,
malicious apps are much more likely to share names with other apps, and they
typically request less permission than benign apps. Leveraging our observations,
we developed FRAppE, an accurate classifier for detecting malicious Facebook
applications. Most interestingly, we highlighted the emergence of app-nets—large
groups of tightly connected applications that promote each other. We will continue
to dig deeper into this ecosystem of malicious apps on Facebook, and we hope that
Facebook will benefit from our recommendations for reducing the menace of
hackers on their platform.
REFERENCES
[1] C. Pring, “100 social media statistics for 2012,” 2012 [Online]. Available:
http://thesocialskinny.com/100-social-media-statistics-for-2012/
[2] Facebook, Palo Alto, CA, USA, “Facebook Opengraph API,” [Online]. Available:
http://developers.facebook.com/docs/reference/api/
[5] “Whiich cartoon character are you—Facebook survey scam,” 2012 [Online]. Available:
https://apps.facebook.com/mypagekeeper/?status=scam_report_fb_survey_scam_whiich_cartoon_charact
er_are_you_2012_03_30
[6] G. Cluley, “The Pink Facebook rogue application and survey scam,” 2012 [Online].
Available: http://nakedsecurity.sophos.com/2012/02/27/pink-facebook-survey-scam/
[7] D. Goldman, “Facebook tops 900 million users,” 2012 [Online]. Available:
http://money.cnn.com/2012/04/23/technology/facebookq1/index.htm
[8] R. Naraine, “Hackers selling $25 toolkit to create malicious Facebook apps,” 2011 [Online].
Available: http://zd.net/g28HxI
[9] HackTrix, “Stay away from malicious Facebook apps,” 2013 [Online]. Available:
http://bit.ly/b6gWn5
[10] M. S. Rahman, T.-K. Huang, H. V. Madhyastha, and M. Faloutsos, “Efficient and scalable
socware detection in online social networks,” in Proc. USENIX Security, 2012, p. 32.
[11] H. Gao et al., “Detecting and characterizing social spam campaigns,” in Proc. IMC, 2010,
pp. 35–47.
[12] H. Gao, Y. Chen, K. Lee, D. Palsetia, and A. Choudhary, “Towards online spam filtering in
social networks,” in Proc. NDSS, 2012.
[13] P. Chia, Y. Yamamoto, and N. Asokan, “Is this app safe? A large scale study on application
permissions and risk signals,” in Proc. WWW, 2012, pp. 311–320.
[14] “WhatApp? (beta)—A Stanford Center for Internet and Society Website with support from
the Rose Foundation,” [Online]. Available: https://whatapp.org/facebook/
[16] Facebook, Palo Alto, CA, USA, “Facebook platform policies,” [Online]. Available:
https://developers.facebook.com/policy/
[17] Facebook, Palo Alto, CA, USA, “Application authentication flow using OAuth 2.0,”
[Online]. Available: http://developers.facebook. com/docs/authentication/
[18] “11 million bulk email addresses for sale—Sale price $90,” [Online]. Available:
http://www.allhomebased.com/BulkEmailAddresses.htm
[19] E. Protalinski, “Facebook kills app directory, wants users to search for apps,” 2011
[Online]. Available: http://zd.net/MkBY9k
[20] SocialBakers, “SocialBakers: The recipe for socialmarketing success,” [Online]. Available:
http://www.socialbakers.com/
[23] Facebook, Palo Alto, CA, USA, “Permissions reference,” [Online]. Available:
https://developers.facebook.com/docs/authentication/permissions/
[24] Facebook, Palo Alto, CA, USA, “Facebook developers,” [Online]. Available:
https://developers.facebook.com/docs/appsonfacebook/tutorial/
[25] “Web-of-Trust,” [Online]. Available: http://www.mywot.com/
[26] F. J. Damerau, “A technique for computer detection and correction of spelling errors,”
Commun. ACM, vol. 7, no. 3, pp. 171–176,Mar. 1964.
[27] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,” Trans. Intell.
Syst. Technol., vol. 2, no. 3, 2011, Art. no. 27.
[28] J. Ma, L. K. Saul, S. Savage, and G. M. Voelker, “Beyond blacklists: Learning to detect
malicious Web sites from suspicious URLs,” in Proc. KDD, 2009, pp. 1245–1254.
[29] A. Le, A.Markopoulou, and M. Faloutsos, “PhishDef: URL names say it all,” in Proc. IEEE
INFOCOM, 2011, pp. 191–195.
[35] C. Yang, R. Harkreader, and G. Gu, “Die free or live hard? Empirical
evaluation and new design for fighting evolving Twitter spammers,” in Proc.
RAID, 2011, pp. 318–337.
[43] J. King, A. Lampinen, and A. Smolen, “Privacy: Is there an app for that?,” in
Proc. SOUPS, 2011, Art. no. 12.
[44] M. Gjoka, M. Sirivianos, A. Markopoulou, and X. Yang, “Poking Facebook:
Characterization of OSN applications,” in Proc. 1st WOSN, 2008, pp. 31–36.
[45] T. Stein, E. Chen, and K. Mangla, “Facebook immune system,” in Proc. 4th
Workshop Social Netw. Syst., 2011, Art. no. 8.
[46] L. Parfeni, “Facebook softens its app spam controls, introduces better tools for
developers,” 2011 [Online]. Available: http://bit.ly/LLmZpM