DEPARTMENT OF COMPUTER ENGINEERING
D. Y. PATIL COLLEGE OF ENGINEERING,
AKURDI, PUNE - 44
LABORATORY MANUAL







PROGRAMMING LABORATORY- II

TE COMPUTER

SEMESTER I
2014-15


Teaching Scheme                         Examination Scheme
Practical : 4 Hrs/Week                  Term Work : 50 Marks
                                        Oral : 50 Marks



Prepared By: Ms. S. C. Shinde
Checked By: Ms. D. B. Gothwal
Verified By: Mrs. M. A. Potey, HOD COMP

DEPARTMENT OF COMPUTER ENGINEERING
D.Y. PATIL COLLEGE OF ENGINEERING, AKURDI, PUNE-44.

List of Assignments
Assignments Group A

1. Implementation of the following spoofing assignments using C++ programming:
a) IP Spoofing
b) Web Spoofing

3. Write a computer forensic application program in Java/Python/C++ for Recovering Deleted
Files and Deleted Partitions.

Assignments Group B

1. Develop a GUI and write a Java/Python/C++ program to monitor Network Forensics,
Investigating Logs and Investigating Network Traffic.

7. Write a program to implement a packet sniffing tool in C++/Java/Python.

9. Install and use open source tools to identify various types of Wi-Fi attacks. Write a
C++/Java/Python program to identify at least one such attack.

17. Write a program for identifying the tampering of digital signatures using Python.

18. Write a C++/Java program for Log Capturing and Event Correlation.

19. Write a tool to detect and prevent capturing of mobile messages in Python/Java.


Assignment Group C:

1. Implementation of a steganography program.

2. Implement a program to generate and verify CAPTCHA images.







(GR:A) ASSIGNMENT NO: 1

Implementation of the following spoofing assignments using C++ programming:
a) IP Spoofing
b) Web Spoofing

Problem Statement:

Implementation of the following spoofing assignments using C++ programming:
a) IP Spoofing
b) Web Spoofing

Theory:

IP Spoofing:

IP spoofing is a technique used to gain unauthorized access to computers, whereby the intruder sends
messages to a computer with an IP address indicating that the message is coming from a trusted host. To
engage in IP spoofing, a hacker must first use a variety of techniques to find an IP address of a trusted
host and then modify the packet headers so that it appears that the packets are coming from that host.
The concept of IP spoofing was initially discussed in academic circles in the 1980s. While known about
for some time, it was primarily theoretical until Robert Morris, whose son wrote the first Internet Worm,
discovered a security weakness in the TCP protocol known as sequence prediction. Stephen Bellovin
discussed the problem in depth in Security Problems in the TCP/IP Protocol Suite, a paper that addressed
design problems with the TCP/IP protocol suite. Another infamous attack, Kevin Mitnick's Christmas
Day crack of Tsutomu Shimomura's machine, employed IP spoofing and TCP sequence prediction
techniques. While the popularity of such cracks has decreased due to the demise of the services they
exploited, spoofing can still be used and needs to be addressed by all security administrators.

Technical Discussion

To completely understand how these attacks can take place, one must examine the structure of the
TCP/IP protocol suite. A basic understanding of these headers and network exchanges is crucial to the
process.

Internet Protocol IP

Internet protocol (IP) is a network protocol operating at layer 3 (network) of the OSI model. It is a
connectionless model, meaning there is no information regarding transaction state, which is used to route
packets on a network. Additionally, there is no method in place to ensure that a packet is properly
delivered to the destination.
Examining the IP header, we can see that the first 12 bytes (or the top 3 rows of the header) contain
information about the packet. The next 8 bytes (the next 2 rows) contain the source and
destination IP addresses. Using one of several tools, an attacker can easily modify these addresses,
specifically the source address field. It is important to note that each datagram is sent independently of
all others due to the stateless nature of IP. Keep this fact in mind as we examine TCP in the next section.
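As an illustration of how easily the source field can be forged, here is a minimal Python sketch (assuming
Linux, root privileges, and an isolated lab network; the addresses are placeholders) that builds an IPv4
header with a spoofed source address and sends it through a raw socket:

import socket
import struct

def build_ipv4_header(src_ip, dst_ip, payload_len):
    # Version 4, IHL 5 (20-byte header, no options)
    ver_ihl = (4 << 4) | 5
    total_len = 20 + payload_len
    return struct.pack("!BBHHHBBH4s4s",
                       ver_ihl, 0, total_len,
                       54321, 0,                   # identification, flags/fragment offset
                       64, socket.IPPROTO_UDP,     # TTL, protocol
                       0,                          # checksum; Linux fills it in for raw sockets
                       socket.inet_aton(src_ip),   # forged source address
                       socket.inet_aton(dst_ip))

payload = b"test"
packet = build_ipv4_header("10.0.0.99", "10.0.0.1", len(payload)) + payload

s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)
s.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)   # we supply the IP header ourselves
s.sendto(packet, ("10.0.0.1", 0))

The receiver, and any sniffer in between, will report 10.0.0.99 as the sender even though the packet never
touched that host.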

Transmission Control Protocol TCP

IP can be thought of as a routing wrapper for layer 4 (transport), which contains the Transmission
Control Protocol (TCP). Unlike IP, TCP uses a connection-oriented design. This means that the
participants in a TCP session must first build a connection via the 3-way handshake (SYN, SYN/ACK,
ACK), then update one another on progress via sequences and acknowledgements. This conversation
ensures data reliability, since the sender receives an OK from the recipient after each packet exchange.
The TCP header is very different from an IP header. We are concerned with the first 12 bytes of the TCP
packet, which contain port and sequencing information. Much like an IP datagram, TCP packets can be
manipulated using software. The source and destination ports normally depend on the network
application in use (for example, HTTP via port 80). What is important for our understanding of spoofing
are the sequence and acknowledgement numbers. The data contained in these fields ensures packet
delivery by determining whether or not a packet needs to be resent. The sequence number is the number
of the first byte in the current packet, relative to the data stream. The acknowledgement number, in turn,
contains the value of the next expected sequence number in the stream: for example, if a segment carries
sequence number 1000 and 500 bytes of data, the receiver acknowledges it with acknowledgement
number 1500. This relationship confirms, on both ends, that the proper packets were received. This is
quite different from IP, since transaction state is closely monitored.

Consequences of the TCP/IP Design

Now that we have an overview of the TCP/IP formats, let's examine the consequences. Obviously, it's
very easy to mask a source address by manipulating an IP header. This technique is used for obvious
reasons and is employed in several of the attacks discussed below. Another consequence, specific to
TCP, is sequence number prediction, which can lead to session hijacking or host impersonation. This
method builds on IP spoofing, since a session, albeit a false one, is built. We will examine the
ramifications of this in the attacks discussed below.

Spoofing Attacks

There are a few variations on the types of attacks that successfully employ IP spoofing. Although some
are relatively dated, others are very pertinent to current security concerns.

Non-Blind Spoofing

This type of attack takes place when the attacker is on the same subnet as the victim. The sequence and
acknowledgement numbers can be sniffed, eliminating the potential difficulty of calculating them
accurately. The biggest threat of spoofing in this instance would be session hijacking. This is
accomplished by corrupting the data stream of an established connection, then re-establishing it, based on
correct sequence and acknowledgement numbers, with the attack machine. Using this technique, an
attacker could effectively bypass any authentication measures taken to establish the connection.

Blind Spoofing

This is a more sophisticated attack, because the sequence and acknowledgement numbers are
unreachable. In order to circumvent this, several packets are sent to the target machine in order to sample
sequence numbers. While not the case today, machines in the past used basic techniques for generating
sequence numbers. It was relatively easy to discover the exact formula by studying packets and TCP
sessions. Today, most OSs implement random sequence number generation, making it difficult to predict
them accurately. If, however, the sequence number was compromised, data could be sent to the target.
Several years ago, many machines used host-based authentication services (i.e. Rlogin). A properly
crafted attack could add the requisite data to a system (i.e. a new user account), blindly, enabling full
access for the attacker who was impersonating a trusted host.



Man In the Middle Attack

Both types of spoofing are forms of a common security violation known as a man in the middle (MITM)
attack. In these attacks, a malicious party intercepts a legitimate communication between two friendly
parties. The malicious host then controls the flow of communication and can eliminate or alter the
information sent by one of the original participants without the knowledge of either the original sender or
the recipient. In this way, an attacker can fool a victim into disclosing confidential information by
spoofing the identity of the original sender, who is presumably trusted by the recipient.

Denial of Service Attack

IP spoofing is almost always used in what is currently one of the most difficult attacks to defend
against: denial of service (DoS) attacks. Since crackers are concerned only with consuming bandwidth
and resources, they need not worry about properly completing handshakes and transactions. Rather, they
wish to flood the victim with as many packets as possible in a short amount of time. In order to prolong
the effectiveness of the attack, they spoof source IP addresses to make tracing and stopping the DoS as
difficult as possible. When multiple compromised hosts are participating in the attack, all sending
spoofed traffic, it is very challenging to quickly block traffic.

Misconceptions of IP Spoofing

While some of the attacks described above are a bit outdated, such as session hijacking for host-based
authentication services, IP spoofing is still prevalent in network scanning and probes, as well as denial of
service floods. However, the technique does not allow for anonymous Internet access, which is a
common misconception for those unfamiliar with the practice. Any sort of spoofing beyond simple
floods is relatively advanced and used in very specific instances such as evasion and connection
hijacking.

Defending Against Spoofing

There are a few precautions that can be taken to limit IP spoofing risks on your network, such as:
Filtering at the Router - Implementing ingress and egress filtering on your border routers is a great place
to start your spoofing defense. You will need to implement an ACL (access control list) that blocks
private IP addresses on your downstream interface. Additionally, this interface should not accept
addresses with your internal range as the source, as this is a common spoofing technique used to
circumvent firewalls. On the upstream interface, you should restrict source addresses outside of your
valid range, which will prevent someone on your network from sending spoofed traffic to the Internet.
Encryption and Authentication - Implementing encryption and authentication will also reduce spoofing
threats. Both of these features are included in IPv6, which will help eliminate current spoofing threats.
Additionally, you should eliminate all host-based authentication measures, which are sometimes common
for machines on the same subnet. Ensure that the proper authentication measures are in place and carried
out over a secure (encrypted) channel.
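To make the ingress/egress idea concrete, here is a hedged Python sketch of the decision logic such
filters apply (the internal range is an assumption; production filtering belongs in router ACLs, not
application code):

import ipaddress

INTERNAL_NET = ipaddress.ip_network("192.168.0.0/16")    # assumed internal range
PRIVATE_NETS = [ipaddress.ip_network(n) for n in
                ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8")]

def ingress_allowed(src_ip):
    # Drop packets arriving from outside that claim an internal or private source
    ip = ipaddress.ip_address(src_ip)
    return not (ip in INTERNAL_NET or any(ip in net for net in PRIVATE_NETS))

def egress_allowed(src_ip):
    # Only let traffic leave if its source is inside our valid range
    return ipaddress.ip_address(src_ip) in INTERNAL_NET

print(ingress_allowed("192.168.1.5"))   # False: spoofed internal source from outside
print(egress_allowed("203.0.113.7"))    # False: not our address space, likely spoofed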

b) Web Spoofing:

Website spoofing is the act of creating a website, as a hoax, with the intention of misleading readers into
believing that the website has been created by a different person or organization. Normally, the spoof
website will adopt the design of the target website and sometimes has a similar URL. A more
sophisticated attack results in an attacker creating a "shadow copy" of the World Wide Web by having all
of the victim's traffic go through the attacker's machine, causing the attacker to obtain the victim's
sensitive information.
Another technique is to use a 'cloaked' URL. By using domain forwarding, or inserting control
characters, the URL can appear to be genuine while concealing the address of the actual website.
The objective may be fraudulent, often associated with phishing or e-mail spoofing, or to criticize or
make fun of the person or body whose website the spoofed site purports to represent. Because the
purpose is often malicious, "spoof" (an expression whose base meaning is innocent parody) is a poor
term for this activity, so more accountable organizations such as government departments and banks
tend to avoid it, preferring more explicit descriptors such as "fraudulent" or "phishing".
As an example of the use of this technique to parody an organization, in November 2006 two spoof
websites, www.msfirefox.com and www.msfirefox.net, were produced claiming that Microsoft had
bought Firefox and released Microsoft Firefox 2007.

Techniques

A variety of techniques are used in website spoofing. Most techniques involve creating websites that are
designed to look and act the same as the target website. More sophisticated attacks involve JavaScript
and web server plug-ins. First, a victim is infected through a malicious website or an infected email. A
web browser is displayed on the victim's machine that matches the look of the normal web browser.
Then, in this infected window, all traffic is sent through a malicious server, allowing the server to
intercept information, possibly containing passwords, usernames, and sensitive data. As long as the
victim uses the infected browser, the malicious server intercepts all information while still preserving a
normal web experience, so the victim is unable to detect the attack.

How to identify and prevent web spoofing

One of the main types of website spoofing occurs on websites that have anything to do with money. For
example, any website one might use for banking, buying, selling or transferring money may be subject
to website spoofing. When using any website where a credit card number must be entered, one of the
first steps to identifying a spoofed website is making sure the website is secured with SSL/TLS (Secure
Sockets Layer/Transport Layer Security). SSL is used to verify the identity of the server. If the website
does not have SSL, it is most likely a spoof. The best way to prevent spoofing is to avoid relying on
hyperlinks. For example, instead of using a link attached in an email, type the website's address into the
address bar yourself. One additional tip to avoid spoofing is to avoid using the same password for every
website.
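The SSL/TLS check described above can be automated. The following Python sketch (standard library
only; the hostname is illustrative) attempts a fully verified TLS handshake and fails loudly when the
server's certificate does not match the name typed by the user:

import socket
import ssl

def verified_cert(hostname, port=443):
    # create_default_context() enables certificate and hostname verification
    ctx = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()

try:
    cert = verified_cert("example.com")          # illustrative hostname
    print("Certificate subject:", cert["subject"])
except ssl.SSLError as err:
    print("TLS verification failed - possible spoofed site:", err)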
How to respond to website spoofing

There are procedures that can be followed in response to a spoofed website which will help mitigate
risks. These procedures will, in theory, eliminate the threat of identity theft and financial fraud.
Mitigating the risk of website spoofing can be done in the following ways. Firstly, educating customers
on how to be aware of a spoof can be helpful. This can be done with website alerts that explain and warn
about various internet-related scams. If possible, certain employees should be assigned to monitor the
site and make sure there are no fraudulent sites being created. If a fraudulent site is found, these
employees are responsible for responding correctly.

The most common method of detecting a fraudulent site is encountering emails that return to a website's
mail server but were not sent by the website. A large increase in customer calls or contact to the website
in general is also sometimes a sign that a website is being spoofed. If it has been determined that a site
has been targeted for spoofing, gathering information is necessary. This information will help identify the
fraudulent website, determine whether customer information has been obtained, and assist law
enforcement agencies in any investigation. It is also imperative to communicate promptly with the
internet service provider (ISP) responsible for hosting the fraudulent website, demanding that it be taken
down. Contact the domain name registrars with the same intention, and demand that the incorrect use of
trademarks end immediately.

How to detect a spoofed webpage

Triple-check the spelling of the URL, and look for small differences such as a hyphen (-) or an
underscore (e.g. suntrust.com vs. sun-trust.com).
Mouse over links to see their real destination (careful: this can be spoofed too!).
Beware of pages that use server scripting such as PHP; these tools make it easy to obtain your information.
Beware of JavaScript as well.
Beware of longer than average load times.


Conclusion:
Hence, we have successfully studied the concepts of IP spoofing and web spoofing.
(GR:A) ASSIGNMENT NO: 3

Write a computer forensic application program in Java/Python/C++ for
Recovering Deleted Files and Deleted Partitions
Problem Statement:

Write a computer forensic application program in Java/Python/C++ for Recovering Deleted
Files and Deleted Partitions.

Theory

Undeletion is a feature for restoring computer files which have been removed from a file system
by file deletion; the broader activity is often called data recovery. Deleted data can be recovered on
many file systems, but not all file systems provide an undeletion feature. Recovering data without an
undeletion facility is usually called data recovery, rather than undeletion. Although undeletion can
help prevent users from accidentally losing data, it can also pose a computer security risk, since
users may not be aware that deleted files remain accessible.
Not all file systems or operating systems support undeletion. Undeletion is possible on FAT16
file systems, with Microsoft providing undelete utilities for both MS-DOS 5-6.22 and 16-bit
Windows operating systems. It is not supported by most modern UNIX file systems, though AdvFS
is a notable exception. The ext2 file system has an add-on program called e2undel which allows file
undeletion. The similar ext3 file system does not officially support undeletion, but ext3grep was
written to automate the recovery of deleted files on ext3 volumes. Undelete was proposed for ext4,
but has yet to be implemented. However, a trash bin feature was posted as a patch on December 4,
2006. The trash bin feature uses undelete attributes in the ext2/3/4 and Reiser file systems.
Graphical user environments often take a different approach to undeletion, instead using a "holding
area" for files to be deleted. Undesired files are moved to this holding area, and all of the files in the
holding area are deleted periodically or when a user requests it. This approach is used by the Trash
can in Macintosh operating systems and by the recycle bin in Microsoft Windows. This is a natural
continuation of the approach taken by earlier systems, such as the limbo group used by LocoScript.
This approach is not subject to the risk that other files written to the file system will quickly disrupt a
deleted file; permanent deletion happens on a predictable schedule or with manual
intervention only.
Another approach is offered by programs such as Norton GoBack (formerly Roxio GoBack): a
portion of the hard disk space is set aside for file modification operations to be recorded in such a
way that they may later be undone. This process is usually much safer in aiding recovery of deleted
files than the undeletion operation described below.
Similarly, file systems that support "snapshots" (like ZFS or btrfs), can be used to make snapshots
of the whole file system at regular intervals (e.g. every hour), thus allowing recovery of files from
an earlier snapshot.
Limitations
Data recovery is not fail-safe. In general, the sooner data recovery is attempted, the more likely it
will be successful. Fragmentation of the deleted file may also reduce the probability of recovery,
depending on the type of file system (see below). A fragmented file is scattered across different
parts of the disk, instead of occupying a contiguous area.
Mechanics
The workings of data recovery depend on the file system on which the deleted file was stored. Some
file systems, such as HFS, cannot provide an undeletion feature because no information about the
deleted file is retained (except by additional software, which is not usually present). Some file
systems, however, do not erase all traces of a deleted file, including the FAT file system:
FAT file system
When a file is "deleted" using a FAT file system, the directory entry remains unchanged,
preserving most of the "deleted" file's name, along with its time stamp, file length and most
importantly its physical location on the disk. The list of disk clusters occupied by the file will,
however, is erased from the File Allocation Table, marking those sectors available for use by other
files created or modified thereafter.
When an undeletion operation is attempted, the following conditions must be met for a successful
recovery of the file:
The entry of the deleted file must still exist in the directory, meaning that it must not yet be
overwritten by a new file (or folder) that has been created in the same directory. Whether
this is the case can fairly easily be detected by checking whether the remaining name of the
file to be undeleted is still present in the directory.
The sectors formerly used by the deleted file must not yet be overwritten by other files. This
can fairly well be verified by checking that the sectors are not marked as used in the File
Allocation Table. However, if, in the meantime, a new file had been written to the disk
using those sectors, and then deleted again, freeing those sectors again, this cannot be
detected automatically by the undeletion program. In this case the undeletion operation,
even if appearing successful, might fail because the recovered file contains different data.
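As a sketch of the first condition, the following Python fragment scans a raw FAT16 directory region for
deleted entries; a deleted entry is marked by 0xE5 in the first byte of its name, and the image path and
root-directory offset are assumptions that depend on the volume layout:

import struct

DIR_ENTRY_SIZE = 32
ROOT_DIR_OFFSET = 0x2600            # assumed offset of the root directory in this image

with open("fat16.img", "rb") as img:                 # hypothetical disk image
    img.seek(ROOT_DIR_OFFSET)
    while True:
        entry = img.read(DIR_ENTRY_SIZE)
        if len(entry) < DIR_ENTRY_SIZE or entry[0] == 0x00:
            break                                    # 0x00 marks the end of the directory
        if entry[0] == 0xE5:                         # 0xE5 marks a deleted entry
            name = b"?" + entry[1:8]                 # first character of the name is lost
            ext = entry[8:11]
            cluster, size = struct.unpack_from("<HI", entry, 26)
            print("deleted:", name.decode(errors="replace").strip(),
                  ext.decode(errors="replace").strip(),
                  "start cluster", cluster, "size", size)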
File recovery process:
The file recovery process can be briefly described as scanning a drive or folder to find deleted entries
in the Master File Table (MFT), then, for the particular deleted entry, determining the chain of clusters
to be recovered, and finally copying the contents of these clusters to a newly created file.
Different file systems maintain their own specific logical data structures; however, basically each
file system:
Has a list or catalog of file entries, so we can iterate through this list and find entries marked as
deleted;
Keeps for each entry a list of data clusters, so we can try to find the set of clusters
composing the file.
After finding the proper file entry and assembling the set of clusters composing the file, we
read and copy these clusters to another location.
MFT Record structure
An MFT record has a pre-defined structure. It has a set of attributes defining any file or folder
parameters.
An MFT record begins with a standard File Record Header (offset 0x00):
"FILE" identifier (4 bytes)
Offset to update sequence (2 bytes)
Size of update sequence (2 bytes)
$LogFile Sequence Number (LSN) (8 bytes)
Sequence Number (2 bytes)
Reference Count (2 bytes)
Offset to Update Sequence Array (2 bytes)
Flags (2 bytes)
Real size of the FILE record (4 bytes)
Allocated size of the FILE record (4 bytes)
File reference to the base FILE record (8 bytes)
Next Attribute Id (2 bytes)
The most important information for us in this block is the file state: deleted or in-use. If the low-order
bit (0x01) of the Flags field is set, the file is in use; if it is clear, the record describes a deleted file.
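A minimal Python sketch of this check is shown below (the dump file name is hypothetical, and a real
tool must also undo the update sequence fixups before trusting a record's contents):

import struct

MFT_RECORD_SIZE = 1024

def record_state(record):
    if record[0:4] != b"FILE":
        return "not a FILE record"
    flags = struct.unpack_from("<H", record, 0x16)[0]    # Flags field at offset 0x16
    if flags & 0x0001:
        return "in use (directory)" if flags & 0x0002 else "in use (file)"
    return "deleted"

with open("mft.bin", "rb") as f:        # hypothetical raw dump of the $MFT
    n = 0
    while True:
        rec = f.read(MFT_RECORD_SIZE)
        if len(rec) < MFT_RECORD_SIZE:
            break
        print("record", n, ":", record_state(rec))
        n += 1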
The chances of recovering deleted files are higher on FAT16 than on FAT32 drives;
fragmentation of files is usually lower on FAT16 due to its larger cluster sizes (1 KB, 2 KB,
4 KB, 8 KB, 16 KB, 32 KB and 64 KB, the last supported only by Windows NT), as compared to
FAT32 (4 KB, 8 KB and 16 KB only).
If the undeletion program cannot detect clear signs of the above requirements not being met, it
will restore the directory entry as being in use and mark all consecutive sectors (clusters), beginning
with the one recorded in the old directory entry, as used in the File Allocation Table. It is then up
to the user to open the recovered file and to verify that it contains the complete data of the formerly
deleted file.
Recovery of fragmented files (after the first fragment) is therefore not possible by automatic
processes, but only by manual examination of each (unused) block of the disk. This requires
detailed knowledge of the file system, as well as the binary format of the file type being recovered.
Microsoft included a similar UNDELETE program in versions 5.0 to 6.22 of MS-DOS, but applied
the Recycle Bin approach instead in later operating systems using FAT.
Partition Recovery Concepts
System Boot Process:
In most cases, the first indication of a problem with hard drive data is a refusal of the machine to
boot properly. For the computer to be able to find the startup partition and start booting, the
following conditions must apply:
The Master Boot Record (MBR) or GUID Partition Table (GPT) exists and is intact;
The Partition Table exists and contains at least one Active partition;
The Active partition contains all necessary, undamaged system files for the OS launch.
If the above is in place, executable code in the MBR selects an active partition and passes control
there, so it can start loading the standard files (COMMAND.COM, NTLDR, BOOTMGR ...)
depending on the OS and the file system type on that partition. If these files are missing or
corrupted it will be impossible for the OS to boot - you understand the situation if you have ever
seen the famous "NTLDR is missing ..." error message.
Volume Visibility
A more serious situation exists if your computer will start but cannot see a drive partition. For the
partition to be visible to the Operating System the following conditions must apply:
- Partition/Drive can be found via the Partition Table
- Partition/Drive/Volume boot sector is intact
- Volume system areas (MFT, Root) are intact and accessible
If the above conditions are true, the Operating System can read the partition or physical drive
parameters and display the drive in the list of the available drives. If the file system is damaged
(e.g., Master File Table (MFT) records on NTFS), the drive's content might not be displayed and we
might see errors like "MFT is corrupted" or "Drive is invalid". If this is the case it is less likely that
you will be able to restore your data in full. Do not despair, as there may be some tricks or tips to
display some of the residual entries that are still intact, allowing you to recover your data to another
location.
Partition Recovery Includes
1. Physical partition recovery. The goal is to identify the problem and write information to
the proper place on the hard drive (to the MBR and Boot Sectors) so that the partition becomes
visible to the Operating System again. This can be done using manual disk editors along
with proper guidelines, or using partition recovery software designed specifically for this
purpose.
2. Virtual partition recovery. The goal is to determine the critical parameters of the
deleted/damaged/overwritten partition and render it open to scanning in order to display its
content and copy important data to a safe place. This approach can be applied in some cases
when physical partition recovery is not possible (for example, the partition boot sector is dead
and physically unreadable) and is commonly used by file recovery software. This process is
almost impossible to implement manually.
Other Hard Drive Partition Recovery Topics
Let's consider the topics related to the recovery of partitions in general, not specific to a
particular file system. We have the following cases:
Master Boot Record (MBR) is damaged
Partition is deleted or Partition Table is damaged
Partition Boot Sector is damaged
Missing or Corrupted System Files
1. Master Boot Record (MBR) is damaged
The Master Boot Record (MBR) is created when you create the first partition on the
hard disk. It is a very important data structure on the disk. The Master Boot Record contains
the Partition Table for the disk and a small amount of executable code for the boot start. Its
location is always the first sector on the disk.
2. Partition is deleted or Partition Table is damaged
The information about primary partitions and the extended partition is contained in the Partition
Table, a 64-byte data structure located in the same sector as the Master Boot Record
(cylinder 0, head 0, sector 1). The Partition Table conforms to a standard layout, which is
independent of the operating system.
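Because the Partition Table has a fixed layout, it is easy to inspect. Here is a hedged Python sketch that
dumps the four primary entries from a raw disk image (the file name is illustrative; reading a real device
such as /dev/sda requires root privileges):

import struct

with open("disk.img", "rb") as disk:    # or a device node, with root privileges
    mbr = disk.read(512)                # the MBR occupies the first sector

assert mbr[510:512] == b"\x55\xaa", "missing MBR boot signature"

for i in range(4):                      # four 16-byte entries start at offset 446
    entry = mbr[446 + 16 * i : 446 + 16 * (i + 1)]
    boot_flag, ptype = entry[0], entry[4]
    lba_start, sectors = struct.unpack_from("<II", entry, 8)
    if ptype != 0:
        active = "active" if boot_flag == 0x80 else "inactive"
        print("partition", i, ": type", hex(ptype), active,
              "start LBA", lba_start, "sectors", sectors)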
3. Partition Boot Sector is damaged
The Partition Boot Sector contains information which the file system uses to access the
volume.
On personal computers, the Master Boot Record uses the Partition Boot Sector on the
system partition to load the operating system kernel files. The Partition Boot Sector is the first
sector of the partition.
4. Missing or Corrupted System Files
For the Operating System to boot properly, the required system files must be present and intact.
Windows Vista, Windows 2008 Server, Windows 7 - BOOTMGR and the Boot folder,
located at the root folder of the bootable volume. The Boot folder should contain the BCD file
holding the boot configuration.
Windows NT / 2000 / XP / Windows 2003 Server - NTLDR, ntdetect.com, boot.ini,
located at the root folder of the bootable volume; Registry files (i.e., SAM, SECURITY,
SYSTEM and SOFTWARE), etc.
Windows 95 / 98 / ME - msdos.sys, config.sys, autoexec.bat, system.ini at the root folder;
system.dat, user.dat, etc.
If these files have been deleted, corrupted, or damaged by a virus, Windows will be unable to
boot, and you'll see an error message such as "NTLDR is missing" or "BOOTMGR is missing".
Once it is determined that the operating system won't start, the next step in the recovery
process is to check the existence and integrity of these vital system files.

Conclusion
Hence, we have successfully studied the concepts of recovering deleted files and deleted partitions.
(GR:B) ASSIGNMENT NO: 1

Develop a GUI and write a Java/Python/C++ program to monitor Network
Forensics, Investigating Logs and Investigating Network Traffic.





Problem Statement:

Develop a GUI and write a Java/Python/C++ program to monitor Network Forensics,
Investigating Logs and Investigating Network Traffic.


Theory:

Network Forensics:

Network forensics is the process of identifying criminal activity and the people behind it.
Network forensics can be defined as the sniffing, recording, acquisition and analysis of network
traffic and event logs in order to investigate a network security incident.
It allows investigators to inspect network traffic and logs to identify and locate the attacking system.

Network forensics can reveal:

Source of security incidents and network attacks
Path of the attack
Intrusion techniques used by attackers.


Network Addressing Scheme:

There are two types of network addressing schemes


1. LAN Addressing:
Each node in a LAN has a MAC address that is factory-programmed into its NIC.
Data packets are addressed to either one of the nodes or to all of the nodes.

2. Internet Addressing:
The Internet is a collection of LANs and/or other networks that are connected
with routers.
Each network has a unique address and each node on the network has a
unique address, so an Internet address is a combination of network and node
addresses.
IP is responsible for network layer addressing in the TCP/IP protocol suite.


Overview of Network Protocols:





Network Vulnerabilities:

Internal network vulnerabilities occur due to the overextension of bandwidth and to bottlenecks.

External network vulnerabilities occur due to threats such as DoS/DDoS attacks and network
data interception.



Investigating Logs:

Use log-capturing tools to capture the log files of various devices and applications.
Log files from the following devices and applications can be used as evidence for network security
incidents:

DYPCOE, Akurdi. Department of Computer Engineering


TE Comp (SEMESTER I) Programming Laboratory-II 15






EventLog Analyzer:

EventLog Analyzer is a web-based, real-time event log and application log monitoring and
management software.
It collects, analyzes, reports on, and archives:
Event logs from distributed Windows hosts
Syslog from distributed Unix hosts, routers, switches, and other syslog devices
Application logs from IIS web servers, IIS FTP servers, MS SQL servers, Oracle database servers,
and DHCP servers on Windows and Linux.


Log-Capturing Tools:

ManageEngine Firewall Analyzer
ManageEngine Firewall Analyzer is a firewall log analysis tool for security event management that
collects, analyzes, and archives logs from network perimeter security devices and generates reports.

GFI EventsManager
GFI EventsManager automatically processes and archives logs, collecting the information you need
to know about the most important events occurring in your network.
It supports a wide range of event types such as W3C, Windows events, syslog, SQL Server and
Oracle audit logs, and SNMP traps generated by devices such as firewalls, routers and sensors.

Kiwi SysLog Server

Kiwi Syslog Server is a syslog server for Windows that receives, logs, displays and forwards
syslog messages from hosts such as routers, switches, Unix hosts and other syslog-enabled devices.

Handling Logs as Evidence

Use Multiple Logs as Evidence
Recording the same information on two different devices makes the evidence stronger.
Firewall logs, IDS logs, and tcpdump output can contain evidence of an Internet user connecting
to a specific server at a given time.

Avoid Missing Logs
When no log files exist, there is no way of knowing if the server got no hits or if the log file was
actually deleted.
Determine whether the server was running and online during the time for which log entries are not
available by monitoring the server uptime records.

Log File Authenticity:
An investigator can prove the authenticity of log files if they are unaltered from the time they were
originally recorded.
If the server is compromised, the investigator should move the logs off the compromised server.
Move the logs to a master server and then to secondary storage media such as a DVD or disk.


Use Signatures, Encryption, and Checksums:

To ensure that the log file is not modified, encrypt the log by using the public-key encryption
scheme.
File signature makes the log file more secure.
Use Fsum tool, MD5 to generate the hash code.
Store the signature and hashes with the log.
Store a secure copy in a separate location.
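A hedged Python equivalent of the hashing step (file names are illustrative):

import hashlib

def file_md5(path, chunk_size=8192):
    # Stream the file so large logs do not need to fit in memory
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

log_hash = file_md5("server.log")               # illustrative log file
with open("server.log.md5", "w") as out:        # store the hash with a secure copy
    out.write(log_hash + "\n")
print("MD5:", log_hash)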

Work With Copies: Do not use original log files for analysis; always work on copies.

Ensure System Integrity:

Always stay up to date on service packs and hotfixes to ensure that the system files are valid.
Audit all changes to binary files in the Windows System directory.
If an intruder modifies the system files that record log files, then the log files are not valid as
evidence.
Access Control:
Once a log file is created, it is important to restrict access to the file and to audit any
authorized and unauthorized access.
If you properly secure and audit a log file using NTFS permissions, you will have documented
evidence to establish its credibility.

Chain of Custody:
As you move log files from the server and later to an offline device, you should keep track of where
the file goes.
This can be done through technical or non-technical methods such as MD5 authentication.

Condensing Log Files:
Log files can be collected using syslog, but the resulting output is a single large log file.
This makes it difficult for the forensic team to find the important log entries, so
log entries need to be filtered as per the requirements of the investigation, as in the sketch below.
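A minimal Python sketch of such filtering (the log path and keywords are assumptions):

# Reduce a syslog-style file to the entries an investigator cares about
KEYWORDS = ("Failed password", "authentication failure")    # assumed patterns

with open("/var/log/auth.log") as log, open("filtered.log", "w") as out:
    for line in log:
        if any(k in line for k in KEYWORDS):
            out.write(line)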

Investigate Network Traffic:

Network traffic is investigated to learn who is generating the troublesome traffic and where the traffic
is being transmitted to or received from, to locate suspicious network traffic, and to identify network
problems.

Evidence Gathering via Sniffing
Investigators should configure sniffers for the size of frames to be captured.
A sniffer is computer software or hardware that can intercept and log traffic passing over a digital
network or part of a network.
Sniffers, which put NICs in promiscuous mode, are used to collect digital evidence at the physical
layer.
Spanned ports and hardware taps help with sniffing in a switched network.
Sniffers collect traffic from the network and transport layers, in addition to the physical and
data-link layers.


Capturing Live Data Packets Using Wireshark:

Wireshark is a traffic capturing and sniffing tool. It uses WinPcap to capture packets, so it can only
capture packets on the networks supported by WinPcap.
It captures live network traffic from Ethernet, IEEE 802.11, PPP/HDLC, ATM, Bluetooth, USB,
Token Ring, Frame Relay and FDDI networks.
Captured files can be programmatically edited via the command line. A set of filters for customized
data display can be refined using a display filter.
Acquiring Traffic Using DNS Poisoning Techniques:
DNS poisoning is a technique that tricks a DNS server into believing that it has received authentic
information when, in reality, it has not.
It results in the substitution of a false IP address at the domain name service level, where
web addresses are converted into numeric IP addresses.
An attacker can perform DNS poisoning by setting up a fake website.



Conclusion

Hence, we have successfully studied how to monitor network forensics, investigate logs and
investigate network traffic.
(GR:B) ASSIGNMENT NO: 7


Write a program to implement a packet sniffing tool in C++/Java/Python.
Problem Statement:

Write a program to implement a packet sniffing tool in C++/Java/Python.

THEORY

Packet Sniffer:
A packet sniffer (also known as a network analyzer, protocol analyzer or for particular types of
networks, an Ethernet sniffer or wireless sniffer) is a computer program or a piece of computer
hardware that can intercept and log traffic passing over a digital network or part of a network. As
data streams flow across the network, the sniffer captures each packet and, if needed, decodes the
packet's raw data, showing the values of various fields in the packet, and analyzes its content.
Capabilities:
On wired broadcast LANs, depending on the network structure (hub or switch), one can capture
traffic on all or just parts of the network from a single machine within the network; however, there
are some methods to avoid traffic narrowing by switches to gain access to traffic from other systems
on the network (e.g., ARP spoofing). For network monitoring purposes, it may also be desirable to
monitor all data packets in a LAN by using a network switch with a so-called monitoring port,
whose purpose is to mirror all packets passing through all ports of the switch when systems
(computers) are connected to a switch port. Using a network tap is an even more reliable solution
than using a monitoring port, since taps are less likely to drop packets during high traffic load.
On wireless LANs, one can capture traffic on a particular channel, or on several channels when
using multiple adapters.
On wired broadcast and wireless LANs, to capture traffic other than unicast traffic sent to the
machine running the sniffer software, multicast traffic sent to a multicast group to which that
machine is listening, and broadcast traffic, the network adapter being used to capture the traffic
must be put into promiscuous mode; some sniffers support this, others do not. On wireless LANs,
even if the adapter is in promiscuous mode, packets not for the service set for which the adapter is
configured will usually be ignored. To see those packets, the adapter must be in monitor mode.
When traffic is captured, either the entire contents of packets can be recorded, or just the headers can
be recorded without the rest of the packet's content. This can reduce storage requirements and avoid
legal problems, while still retaining enough data to reveal the essential information required for
problem diagnosis.
The captured information is decoded from raw digital form into a human-readable format that
permits users of the protocol analyzer to easily review the exchanged information. Protocol
analyzers vary in their abilities to display data in multiple views, automatically detect errors,
determine the root causes of errors, generate timing diagrams, reconstruct TCP and UDP data
streams, etc.
Some protocol analyzers can also generate traffic and thus act as the reference device; these can act
as protocol testers. Such testers generate protocol-correct traffic for functional testing, and may also
have the ability to deliberately introduce errors to test for the DUT's ability to deal with error
conditions.
Protocol analyzers can also be hardware-based, either in probe format or, as is increasingly
common, combined with a disk array. These devices record packets (or a slice of the packet) to a
disk array. This allows historical forensic analysis of packets without the users having to recreate
any fault.
Uses:
The versatility of packet sniffers means they can be used to:
Analyze network problems
Detect network intrusion attempts
Detect network misuse by internal and external users
Documenting regulatory compliance through logging all perimeter and endpoint traffic
Gain information for effecting a network intrusion
Isolate exploited systems
Monitor WAN bandwidth utilization
Monitor network usage (including internal and external users and systems)
Monitor data-in-motion
Monitor WAN and endpoint security status
Gather and report network statistics
Filter suspect content from network traffic
Serve as primary data source for day-to-day network monitoring and management
Spy on other network users and collect sensitive information such as login details or users'
cookies (depending on any content encryption methods that may be in use)
Reverse engineer proprietary protocols used over the network
Debug client/server communications
Debug network protocol implementations
Verify adds, moves and changes
Verify internal control system effectiveness (firewalls, access control, Web filter, spam
filter, proxy)
Packet capture can be used to fulfill a warrant from a law enforcement agency (LEA) to produce all
network traffic generated by an individual. Internet service providers and VoIP providers in the
United States must comply with CALEA (Communications Assistance for Law Enforcement Act)
regulations. Using packet capture and storage, telecommunications carriers can provide the legally
required secure and separate access to targeted network traffic and are able to use the same device
for internal security purposes. Collection of data from a carrier system without a warrant is illegal
due to laws about interception.
Capturing Packets with libpcap:
All data on the network travels in the form of packets, the data unit for the network. To
understand the data a packet contains, we need to understand the protocol hierarchy in the reference
models. The network layer is where the term packet is used for the first time. Common protocols at
this layer are IP (Internet Protocol), ICMP (Internet Control Message Protocol), IGMP (Internet
Group Management Protocol) and IPsec (a protocol suite for securing IP). The transport layer's
protocols include TCP (Transmission Control Protocol), a connection-oriented protocol; UDP (User
Datagram Protocol), a connectionless protocol; and SCTP (Stream Control Transmission Protocol),
which has features of both TCP and UDP. The application layer has many protocols that are
commonly used, like HTTP, FTP, IMAP, SMTP and more.
Capturing packets means collecting data being transmitted on the network. Every time a network
card receives an Ethernet frame, it checks whether the frame's destination MAC address matches its
own. If it does, it generates an interrupt request. The routine that handles this interrupt is the network
card's driver; it copies the data from the card buffer to kernel space, then checks the ethertype field of the
Ethernet header to determine the type of the packet, and passes it to the appropriate handler in the
protocol stack. The data is passed up the layers until it reaches the user-space application, which
consumes it.
When we are sniffing packets, the network driver also sends a copy of each received packet to the
packet filter. To sniff packets, we will use libpcap, an open source library.
Understanding libpcap
libpcap is a platform-independent open source library for capturing packets (the Windows version is
WinPcap). Famous sniffers like tcpdump and Wireshark make use of this library.
To write our packet-capturing program, we need a network interface on which to listen. We can
specify this device, or use a function which libpcap provides:
char *pcap_lookupdev(char *errbuf)
This returns a pointer to a string containing the name of the first network device suitable for packet
capture; on error, it returns NULL (like other libpcap functions). The errbuf is a user-supplied buffer
for storing an error message in case of an error; it is very useful for debugging your program.
This buffer must be able to hold at least PCAP_ERRBUF_SIZE (currently 256) bytes.
Getting control of the Network Device
Next, we open the chosen network device using the function
pcap_t *pcap_open_live(const char *device, int snaplen, int promisc, int to_ms, char *errbuf)
It returns an interface handle of type pcap_t, which other libpcap functions will use.
The first argument is the network interface we want to open; the second is the maximum number of
bytes to capture. Setting it to a low value is useful when we only want to grab packet headers.
The maximum Ethernet frame size is 1518 bytes; a value of 65535 is enough to hold any packet from
any network. The promisc flag indicates whether the network interface should be put into
promiscuous mode or not. (In promiscuous mode, the NIC passes all frames it receives to the
CPU, instead of just those addressed to the NIC's MAC address.)
The to_ms option tells the kernel to wait for a particular number of milliseconds before copying
information from kernel space to user space. A value of zero causes the read operation to wait
until enough packets are collected. To save extra overhead in copying from kernel space to user
space, we set this value according to the volume of network traffic.
Actual capture
Now we need to start getting packets. Let's use
u_char *pcap_next(pcap_t *p, struct pcap_pkthdr *h)
Here, *p is the pointer returned by pcap_open_live(); the other argument is a pointer to a variable of
type struct pcap_pkthdr, in which the first packet that arrives is returned.
The function int pcap_loop(pcap_t *p, int cnt, pcap_handler callback, u_char *user) is used to collect the
packets and process them. It returns when cnt packets have been captured. A callback
function is used to handle captured packets (we need to define this callback function). To pass extra
information to this function, we use the *user parameter, which is a pointer to a u_char variable (we
will have to cast it ourselves, according to our needs, in the callback function).
The callback function signature should be of the form:
void callback_function(u_char *arg, const struct pcap_pkthdr *pkthdr, const u_char *packet)
The first argument is the *user parameter we passed to pcap_loop(); the next argument is a pointer to a
structure that contains information about the captured packet. The structure of

struct pcap_pkthdr is as follows (from pcap.h):
struct pcap_pkthdr {
struct timeval ts; /* time stamp */
bpf_u_int32 caplen; /* length of portion present */
bpf_u_int32 len; /* length of this packet (off wire) */
};

An alternative to pcap_loop() is pcap_dispatch(pcap_t *p, int cnt, pcap_handler callback, u_char *user). The only
difference is that it returns when the timeout specified in pcap_open_live() is exceeded.
Filtering traffic
Until now, we have just been getting all the packets coming to the interface. Now we'll use a pcap
facility that allows us to filter the traffic coming to a specific port. We might use this to only
process packets of a specific protocol, like ARP or FTP traffic, for example. First, we have to
compile the filter using the following function:
int pcap_compile(pcap_t *p, struct bpf_program *fp, const char *str, int optimize, bpf_u_int32 mask)
The first argument is the same as before; the second is a pointer that will store the compiled version
of the filter. The next is the expression for the filter. This expression can be a protocol name like
ARP, IP, TCP, UDP, etc. You can see a lot of sample expressions in the pcap-filter or tcpdump man
pages, which should be installed on your system.
The next argument indicates whether to optimize or not (0 is false, 1 is true). Then comes the
netmask of the network the filter applies to. The function returns -1 on error (if it detects an error in
the expression). After compiling, let's apply the filter using int pcap_setfilter(pcap_t *p, struct bpf_program
*fp). The second argument is the compiled version of the expression.

Finding IPv4 information
int pcap_lookupnet(const char *device, bpf_u_int32 *netp, bpf_u_int32 *maskp, char *errbuf)

We use this function to find the IPv4 network address and netmask associated with the device.
The address is returned in *netp and the mask in *maskp.
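For comparison with the libpcap flow above, here is a minimal sniffing sketch in Python using a raw
AF_PACKET socket (Linux only, root required; it assumes untagged Ethernet frames). It prints the MAC
and IPv4 addresses of each captured frame:

import socket
import struct

ETH_P_ALL = 0x0003      # capture frames of every protocol

# A raw packet socket delivers whole Ethernet frames to user space
sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))

while True:
    frame, _ = sock.recvfrom(65535)
    dst, src, ethertype = struct.unpack_from("!6s6sH", frame)
    if ethertype == 0x0800 and len(frame) >= 34:        # IPv4 over Ethernet
        src_ip = socket.inet_ntoa(frame[26:30])         # IPv4 source at offset 26
        dst_ip = socket.inet_ntoa(frame[30:34])         # IPv4 destination at offset 30
        print(src.hex(":"), "->", dst.hex(":"), " ", src_ip, "->", dst_ip)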


Conclusion

Hence, we have successfully studied the concept of a packet sniffer.

(GR:B) ASSIGNMENT NO: 9

Install and use open source tools to identify various types of Wi-Fi attacks.
Write a C++/Java/Python program to identify at least one such attack.






Problem Statement:


Install and use open source tools to identify various types of Wi-Fi attacks. Write a
C++/Java/Python program to identify at least one such attack.

THEORY



Wireless Networks:

Wi-Fi is based on the IEEE 802.11 standards and is widely used in wireless
communication. It provides wireless access to applications and data across a radio network.
Wi-Fi offers numerous ways to build up a connection between the transmitter and the
receiver, such as DSSS, FHSS, infrared (IR) and OFDM.




Advantages:

Installation is fast and easy and eliminates wiring through walls and ceilings.
It is easier to provide connectivity in areas where it is difficult to lay cable.
Access to the network can be from anywhere within range of an access point.
Public places like airports, libraries, schools or even coffee shops offer you a constant
Internet connection using wireless LAN.




Disadvantages:

Security is a big issue and may not meet expectations.
As the number of computers on the network increases, the bandwidth suffers.
Wi-Fi standards change, which may require the replacement of wireless cards and/or access points.
Some electronic equipment can interfere with Wi-Fi networks.




Wireless Terminologies:

GSM - Global System for Mobile Communications, the system used for mobile communication in wireless networks worldwide.
Directional Antenna - Used to broadcast and obtain radio waves from a single direction.
Omni-Directional Antenna - Used to broadcast and obtain radio waves from all sides.
Wi-Fi Finder - A device used to find a Wi-Fi network.
Association - The process of connecting a wireless device to an access point.
Authentication - The process of identifying a device prior to allowing access to network resources.
BSSID - The MAC address of an access point that has set up a Basic Service Set (BSS).
Wi-Fi Protected Access (WPA) - An advanced WLAN client authentication and data encryption
protocol using TKIP, MIC, and AES encryption.
Gigahertz - A frequency represented as billions of cycles per second.
Hotspot - A place where a wireless network is available for public use.
Access Point - Used to connect wireless devices to a wireless network.
ISM band - A range of radio frequencies that are assigned for use by unlicensed users.
Bandwidth - Describes the amount of information that may be broadcast over a connection.
Wired Equivalent Privacy (WEP) - A WLAN client authentication and data encryption protocol.



Wireless Components:

Antenna - Part of a transmitting or receiving system that is designed to radiate or receive
electromagnetic waves.
Wireless Access Point - Wireless Access Points (APs or WAPs) are specially configured nodes on
wireless local area networks (WLANs).
Wireless Router - A device in a wireless local area network (WLAN) that determines the next
network point to which a packet should be forwarded toward its destination.
Wireless Modem - A type of modem that connects to a wireless network instead of to the
telephone system.
SSID - A service set identifier (SSID) is the name of a wireless local area network (WLAN).
Mobile Station - A user's wireless device, which can be a cell phone or a system in a vehicle.
Base Station Subsystem - The Base Station Subsystem (BSS) controls the radio link with the
mobile station.
Network Subsystem - Responsible for handling call control.
Base Station Controller - Maintains radio connections toward the Mobile Station and a terrestrial
connection toward the NSS.
Mobile Switching Center - The Mobile Switching Center (MSC) performs switching of user calls
and provides the necessary functionality to handle mobile subscribers.



Types of Wireless Networks:





Wi-Fi Chalking:

WarChalking - A method of drawing symbols in public places to advertise open Wi-Fi networks.
WarWalking - Attackers walk around with Wi-Fi enabled laptops to detect open wireless networks.
WarFlying - In this technique, attackers fly around with Wi-Fi enabled laptops to detect open
wireless networks.
WarDriving - Attackers drive around with Wi-Fi enabled laptops to detect open wireless networks.

Access Control Attacks:
Wireless access control attacks aim to penetrate a network by evading WLAN access control
measures, such as AP MAC filters and Wi-Fi port access controls. An attacker listens to beacons or
sends probe requests to discover wireless LANs, thereby providing a launch point for further attacks.
Installing an unsecured AP creates an open backdoor into a trusted network, which can be used to
hijack the connections of legitimate network users.
A hacker spoofs the MAC address of WLAN client equipment to masquerade as an authorized client.
An attacker connects to an AP as an authorized client and eavesdrops on sensitive information.
Connecting directly to an unsecured station to circumvent AP security, or to attack a station, is
similar to an Evil Twin attack, but is not based on fooling a user into finding a free unsecured
network; instead, it forces the user to connect to the unsecured network.

Client Mis-Association:
The attacker sets up a rogue access point outside the corporate perimeter and lures the employees of the
organization to connect with it.
Once associated, employees may bypass the enterprise security policies.
Unauthorized Association:
Soft access points are client cards or embedded WLAN radios in some PDAs and laptops that can be
launched inadvertently or through a virus program.
Attackers infect the victim's machine and activate soft APs, allowing them unauthorized connection to
the enterprise network.

AP Misconfiguration:
Access points are configured to broadcast SSIDs to authorized users.
To verify authorized users, network administrators incorrectly use the SSIDs as passwords.
SSID broadcasting is a configuration error that allows intruders to steal an SSID and have the AP
assume they are allowed to connect.

Integrity Attacks:
In integrity attacks, attackers send forged control, management or data frames over a wireless
network to misdirect the wireless devices in order to perform another type of attack (e.g., DoS).
Data Frame Injection Crafting and sending forged 802.11 frames.
WEP Injection WEP injection is used to crack WEP encryption keys using tools such as the
Aircrack-ng suite.
Data Replay Capturing 802.11 data frames for later replay.
Initialization Vector Replay Attacks A network attack in which a known plaintext message is
sent to an observable wireless LAN client. The attacker sniffs the wireless LAN looking for the
predicted ciphertext and uses the known frame to derive the key stream. The attacker can then grow
the key stream using the same IV/WEP key pair to subvert the network.
Bit-Flipping Attacks The attack relies on the weakness of the ICV. Although the data payload size
varies, many elements remain constant and in the same bit position. The attacker tampers with the
payload portion of the frame to modify the higher layer packet.
Extensible AP Replay Capturing 802.1X Extensible Authentication Protocol messages, such as EAP
Identity, Success, and Failure, for later replay.
RADIUS Replay Capturing RADIUS Access-Accept or Access-Reject messages for later replay.
Wireless Network Viruses Wireless networks are an excellent target for computer viruses, as they
are always connected to the Internet and do not have specific software to protect them.


Confidentiality Attacks:
These attacks attempt to intercept confidential information sent over wireless associations,
whether sent in cleartext or encrypted by Wi-Fi protocols.

Availability Attacks:
Denial of Service attacks aim to prevent legitimate users from accessing resources in a
wireless network.

Authentication Attacks:
The objective of authentication attacks is to steal the identity of Wi-Fi clients, their personal
information, login credentials, etc. to gain unauthorized access to network resources.
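
Of the attack categories above, availability attacks are among the easiest to recognize programmatically,
because a deauthentication flood produces an abnormal burst of 802.11 deauth frames. Below is a
minimal detection sketch in Python using the third-party Scapy library; it assumes a Linux host with a
wireless card already in monitor mode, and the interface name "wlan0mon" and the threshold values are
illustrative assumptions rather than fixed requirements.

from collections import deque
import time

from scapy.all import sniff
from scapy.layers.dot11 import Dot11Deauth

WINDOW_SECONDS = 10   # length of the sliding observation window
THRESHOLD = 30        # deauth frames per window considered suspicious
timestamps = deque()  # arrival times of recently seen deauth frames

def handle(pkt):
    if not pkt.haslayer(Dot11Deauth):
        return
    now = time.time()
    timestamps.append(now)
    # Drop timestamps that have fallen out of the window
    while timestamps and now - timestamps[0] > WINDOW_SECONDS:
        timestamps.popleft()
    if len(timestamps) > THRESHOLD:
        print("Possible deauth flood: %d frames in %ds"
              % (len(timestamps), WINDOW_SECONDS))

# Requires root privileges; press Ctrl+C to stop capturing.
sniff(iface="wlan0mon", prn=handle, store=False)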





Conclusion:

Hence, we have studied various types of Wi-Fi attacks.































(GR:B) ASSIGNMENT NO: 13


Write a program to implement a packet sniffing tool.







Problem Statement:

Write a program to implement a packet sniffing tool.

THEORY

Packet Sniffer:
A packet sniffer (also known as a network analyzer, protocol analyzer or, for particular types of
networks, an Ethernet sniffer or wireless sniffer) is a computer program or a piece of computer
hardware that can intercept and log traffic passing over a digital network or part of a network. As
data streams flow across the network, the sniffer captures each packet and, if needed, decodes the
packet's raw data, showing the values of various fields in the packet, and analyzes its content.
Capabilities:
On wired broadcast LANs, depending on the network structure (hub or switch), one can capture
traffic on all or just parts of the network from a single machine within the network; however, there
are some methods to avoid traffic narrowing by switches to gain access to traffic from other systems
on the network (e.g., ARP spoofing). For network monitoring purposes, it may also be desirable to
monitor all data packets in a LAN by using a network switch with a so-called monitoring port,
whose purpose is to mirror all packets passing through all ports of the switch when systems
(computers) are connected to a switch port. Using a network tap is an even more reliable solution
than using a monitoring port, since taps are less likely to drop packets during high traffic load.
On wireless LANs, one can capture traffic on a particular channel, or on several channels when
using multiple adapters.
On wired broadcast and wireless LANs, to capture traffic other than unicast traffic sent to the
machine running the sniffer software, multicast traffic sent to a multicast group to which that
machine is listening, and broadcast traffic, the network adapter being used to capture the traffic
must be put into promiscuous mode; some sniffers support this, others do not. On wireless LANs,
even if the adapter is in promiscuous mode, packets not for the service set for which the adapter is
configured will usually be ignored. To see those packets, the adapter must be in monitor mode.
When traffic is captured, either the entire contents of packets can be recorded, or just the headers can be
recorded without the rest of the packet's content. This reduces storage requirements and avoids some
legal problems, while still retaining enough data to reveal the essential information required for
problem diagnosis.
The captured information is decoded from raw digital form into a human-readable format that
permits users of the protocol analyzer to easily review the exchanged information. Protocol
analyzers vary in their abilities to display data in multiple views, automatically detect errors,
determine the root causes of errors, generate timing diagrams, reconstruct TCP and UDP data
streams, etc.
Some protocol analyzers can also generate traffic and thus act as the reference device; these can act
as protocol testers. Such testers generate protocol-correct traffic for functional testing, and may also
have the ability to deliberately introduce errors to test for the DUT's ability to deal with error
conditions.
Protocol analyzers can also be hardware-based, either in probe format or, as is increasingly more
common, combined with a disk array. These devices record packets (or a slice of the packet) to a
disk array. This allows historical forensic analysis of packets without the users having to recreate
any fault.
Uses:
The versatility of packet sniffers means they can be used to:
Analyze network problems
Detect network intrusion attempts
Detect network misuse by internal and external users
Document regulatory compliance by logging all perimeter and endpoint traffic
Gain information for effecting a network intrusion
Isolate exploited systems
Monitor WAN bandwidth utilization
Monitor network usage (including internal and external users and systems)
Monitor data-in-motion
Monitor WAN and endpoint security status
Gather and report network statistics
Filter suspect content from network traffic
Serve as primary data source for day-to-day network monitoring and management
Spy on other network users and collect sensitive information such as login details or user
cookies (depending on any content encryption methods that may be in use)
Reverse engineer proprietary protocols used over the network
Debug client/server communications
Debug network protocol implementations
Verify adds, moves and changes
Verify internal control system effectiveness (firewalls, access control, Web filter, spam
filter, proxy)
Packet capture can be used to fulfill a warrant from a law enforcement agency (LEA) to produce all
network traffic generated by an individual. Internet service providers and VoIP providers in the
United States must comply with CALEA (Communications Assistance for Law Enforcement Act)
regulations. Using packet capture and storage, telecommunications carriers can provide the legally
required secure and separate access to targeted network traffic and are able to use the same device
for internal security purposes. Collection of data from a carrier system without a warrant is illegal
due to laws about interception.
Capturing Packets with libpcap:
All data on the network travels in the form of packets, which is the data unit for the network. To
understand the data a packet contains, we need to understand the protocol hierarchy in the reference
models. The network layer is where the term packet is used for the first time. Common protocols at
this layer are IP (Internet Protocol), ICMP (Internet Control Message Protocol), IGMP (Internet
Group Management Protocol) and IPsec (a protocol suite for securing IP). The transport layer's
protocols include TCP (Transmission Control Protocol), a connection-oriented protocol; UDP (User
Datagram Protocol), a connection-less protocol; and SCTP (Stream Control Transmission Protocol),
which has features of both TCP and UDP. The application layer has many protocols that are
commonly used, like HTTP, FTP, IMAP, SMTP and more.
Capturing packets means collecting data being transmitted on the network. Every time a network
card receives an Ethernet frame, it checks if its destination MAC address matches its own. If it does,
it generates an interrupt request. The routine that handles this interrupt is the network card's driver;
it copies the data from the card buffer to kernel space, then checks the ethertype field of the
Ethernet header to determine the type of the packet, and passes it to the appropriate handler in the
protocol stack. The data is passed up the layers until it reaches the user-space application, which
consumes it.
When we are sniffing packets, the network driver also sends a copy of each received packet to the
packet filter. To sniff packets, we will use libpcap, an open source library.
Understanding libpcap
libpcap is a platform-independent open source library to capture packets (the Windows version is
WinPcap). Famous sniffers like tcpdump and Wireshark make use of this library.
To write our packet-capturing program, we need a network interface on which to listen. We can
specify this device, or use a function which libpcap provides:
char *pcap_lookupdev(char *errbuf)
This returns a pointer to a string containing the name of the first network device suitable for packet
capture; on error, it returns NULL (like other libpcap functions). The errbuf argument is a user-supplied
buffer for storing an error message in case of an error; it is very useful for debugging your program.
This buffer must be able to hold at least PCAP_ERRBUF_SIZE (currently 256) bytes.
Getting control of the Network Device
Next, we open the chosen network device using the function
pcap_t *pcap_open_live(const char *device, int snaplen, int promisc, int to_ms, char *errbuf).
It returns an interface handler of type pcap_t,
which other libpcap functions will use.
The first argument is the network interface we want to open; the second is the maximum number of
bytes to capture. Setting it to a low value is useful when we only want to grab packet headers.
The maximum Ethernet frame size is 1518 bytes, so a value of 65535 is enough to hold any packet from
any network. The promisc flag indicates whether the network interface should be put into
promiscuous mode or not. (In promiscuous mode, the NIC passes all frames it receives to the
CPU, instead of just those addressed to the NIC's MAC address.)
The to_ms option tells the kernel to wait for a particular number of milliseconds before copying
information from kernel space to user space. A value of zero will cause the read operation to wait
until enough packets are collected. To save extra overhead in copying from kernel space to user
space, we set this value according to the volume of network traffic.
Actual capture
Now, we need to start getting packets. Let's use
u_char *pcap_next(pcap_t *p, struct pcap_pkthdr *h)
Here, *p is the pointer returned by pcap_open_live(); the other argument is a pointer to a variable of
type struct pcap_pkthdr, in which the header information of the first packet that arrives is returned
(the function's return value points to the packet data itself).
The function int pcap_loop(pcap_t *p, int cnt, pcap_handler callback, u_char *user) is used to collect the
packets and process them. It will return when cnt number of packets have been captured. A callback
function is used to handle captured packets (we need to define this callback function). To pass extra
information to this function, we use the *user parameter, which is a pointer to a u_char variable (we
will have to cast it ourselves, according to our needs in the callback function).
The callback function signature should be of the form:
void callback_function(u_char *arg, const struct pcap_pkthdr* pkthdr, const u_char* packet).
The first argument is the *user parameter we passed to pcap_loop(); the next argument is a pointer to a
structure that contains information about the captured packet. The structure of

struct pcap_pkthdr is as follows (from pcap.h):
struct pcap_pkthdr {
struct timeval ts; /* time stamp */
bpf_u_int32 caplen; /* length of portion present */
bpf_u_int32 len; /* length of this packet (off wire) */
};

An alternative to pcap_loop() is pcap_dispatch(pcap_t *p, int cnt, pcap_handler callback, u_char *user). The only
difference is that it returns when the timeout specified in pcap_open_live() is exceeded.
Filtering traffic
Until now, we have just been getting all the packets coming to the interface. Now, we'll use a pcap
facility that allows us to filter the traffic, for example to a specific port. We might use this to only
process packets of a specific protocol, like ARP or FTP traffic. First, we have to
compile the filter using the following function:
int pcap_compile(pcap_t *p, struct bpf_program *fp, const char *str, int optimize, bpf_u_int32 mask);
The first argument is the same as before; the second is a pointer that will store the compiled version
of the filter. The next is the expression for the filter. This expression can be a protocol name like
ARP, IP, TCP, UDP, etc. You can see a lot of sample expressions in the pcap-filter or tcpdump man
pages, which should be installed on your system.
The next argument indicates whether to optimize or not (0 is false, 1 is true). Then comes the
netmask of the network the filter applies to. The function returns -1 on error (if it detects an error in
the expression). After compiling, let's apply the filter using int pcap_setfilter(pcap_t *p, struct bpf_program
*fp). The second argument is the compiled version of the expression.

Finding IPv4 information
int pcap_lookupnet(const char *device, bpf_u_int32 *netp, bpf_u_int32 *maskp, char *errbuf)

We use this function to find the IPv4 network address and the netmask associated with the device.
The address will be returned in *netp and the mask in *maskp.
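
The same open-filter-capture-callback flow sketched above with the C API can be reproduced in Python
using the third-party Scapy library, which drives libpcap underneath. This is a minimal sketch, not a full
tool: the interface name "eth0" and the filter expression are illustrative assumptions, and root privileges
are required.

from scapy.all import sniff

def callback(pkt):
    # pkt.time corresponds to pcap_pkthdr.ts; len(pkt) to its len field
    print("%f  %d bytes  %s" % (pkt.time, len(pkt), pkt.summary()))

# filter= takes the same BPF expression syntax that pcap_compile() accepts;
# count= plays the role of the cnt argument of pcap_loop()
sniff(iface="eth0", filter="tcp port 80", prn=callback, count=10, store=False)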


Conclusion

Hence, we have successfully studied the concept of a packet sniffer.








(GR:B) ASSIGNMENT NO: 17


Write a program for Identifying the tampering of digital signature using
Python.





Problem Statement:

Write a program for Identifying the tampering of digital signature using Python.

THEORY


Digital Signatures:

Digital signatures are commonly used for software distribution, financial transactions, and in other cases
where it is important to detect forgery or tampering. Digital signatures ("digital thumbprints") are also
commonly used to identify electronic entities for online transactions. A digital signature uniquely identifies
the originator of digitally signed data and also ensures the integrity of the signed data against tampering or
corruption.

One possible method for creating a digital signature is for the originator of data to create the signature by
encrypting all of the data with the originator's private key and enclosing the signature with the original
data. Anyone with the originator's public key can decrypt the signature and compare the decrypted
message to the original message. Because only someone with the private key can create the signature, the
integrity of the message is verified when the decrypted message matches the original. If an intruder alters
the original message during transit, the intruder cannot also create a new valid signature.


If an intruder alters the signature during transit, the signature does not verify properly and is invalid.
However, encrypting all data to provide a digital signature is impractical for three reasons:
The ciphertext signature is the same size as the corresponding plaintext, so message sizes are
doubled, consuming large amounts of bandwidth and storage space.
Public key encryption is slow and places heavy computational loads on computer processors, so
network and computer performance can be significantly degraded.
Encrypting the entire contents of information produces large amounts of ciphertext, which can be
used for cryptanalysis attacks, especially known plaintext attacks (where certain parts of the
encrypted data, such as e-mail headers, are known beforehand to the attacker).
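
For these reasons, practical signature schemes sign a fixed-size cryptographic hash of the data rather
than encrypting all of it. Below is a minimal sketch in Python, assuming the third-party "cryptography"
package (pip install cryptography); it signs a message with RSA-PSS over SHA-256 and shows that
verification fails when even one byte of the message is altered in transit. The message contents are
illustrative assumptions.

from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.exceptions import InvalidSignature

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

message = b"transfer Rs. 1000 to account 12345"
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)
# Sign a hash of the message with the originator's private key
signature = private_key.sign(message, pss, hashes.SHA256())

# Simulate tampering in transit: a single changed digit is detected
tampered = b"transfer Rs. 9000 to account 12345"
for candidate in (message, tampered):
    try:
        public_key.verify(signature, candidate, pss, hashes.SHA256())
        print(candidate, "-> signature valid")
    except InvalidSignature:
        print(candidate, "-> TAMPERING DETECTED")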













Tampering of digital signature:

An electronic signature is only as good as the security that protects it. Some electronic signature services
offer a feature called tamper evidence. If someone tries to change any part of the document (even
something as simple as deleting a space or capitalizing a word), there is proof that tampering took place.
When dealing with important documents or high-value transactions, advanced tamper evidence is vital. In fact,
many banks and credit unions require this kind of tamper evidence to protect their customers.

Intuitively, a signature scheme is tamper-evident if there is only a single valid signature for any given
message and any given transcript. Note that in practice a verifier (whom we call the observer) must check the
validity and covert-validity of all the signatures output by the signer. If the signer refuses to engage in the
Verify* protocol with the observer, or (in the non-interactive case which we focus on) if it does not output
a proof of covert-validity, the observer announces that the signer failed the test of covert-validity and is
thus untrustworthy.




Conclusion

Hence, we have successfully studied the concept of identifying tampering of a digital signature.






















(GR:B) ASSIGNMENT NO: 18


Write a C++/Java program for Log Capturing and Event Correlation.



Problem Statement:

Write a C++/Java program for Log Capturing and Event Correlation.

THEORY

Log Capturing :
Every device on the network emits some kind of log, and keeping track of those logs is an
important piece of the puzzle in knowing your security posture. One reason for capturing logs is
investigation: the technologies that do correlation best will help you see things on your
network that you would have trouble seeing yourself. These systems are typically complicated and
take a lot of planning effort in order to ensure good results. You need to have an intimate
knowledge of your network to know avenues of attack and vital systems so you can set up rules and
alerts. They also require maintenance when changes are made on your network.
However, if done right, these tools can give you a very good look into your network's security, and
they can help find problems much quicker than a human could. However, if your priority is not
alerting and complicated correlation, then perhaps you simply want to capture the logs for forensic
purposes and some simpler alerting. If this is the case, then you need to find those technologies that
focus on disk space (high native capacity and possible expansion), the least amount of log
normalization, and log protection (encryption and non-repudiation).
The reason behind limited log normalization (or none, if you can get it) and protecting logs is in
case you have a security violation that will possibly involve a court case. You need to have the
ability to prove that the logs are accurate and have not been changed in order for them to be accepted in
court. These boxes also need to have the ability to move logs easily to storage without affecting
non-repudiation. The reason you need disk space is that if you are focusing on forensics, you
will probably need to keep logs for a while. Another reason that people have these types of devices
is as an audit or compliance widget. Though this may be the least important reason behind
implementing log management, if it will get an auditor off your back, use it. Many
manufacturers build in extensive audit and compliance reporting that is an auditor's dream.
If this is your need/desire, then make sure you focus on devices that have strong reporting
characteristics. Speaking of reporting, in my experience with these types of devices, I often find that
a device is either very strong in one of the above characteristics, or it is very strong in reporting.
Very few have strengths in both areas. However, it is my contention that manufacturers should
focus heavily on both. Having all the information in the world does you no good if the user has no
idea how to retrieve it. And security admins the world over love to have configurable dashboards
that they can give to their boss (and the auditors mentioned above) so they get fewer questions about
what is going on in the network.
Most technologies available for log management will have all of the above features in some way,
shape, or form, but they will vary in strength. When you are performing your risk analysis,
determine what the focus of your company needs to be with log management. For example, if
you are a large enterprise with a very complicated network, then you may need to find a good
correlation engine. Or, if you are a smaller firm that has high-value intellectual property, you may
want a box that focuses on forensic capabilities so you can be sure to track down violations and
recover your losses in a court of law.

JAMon API
The Java Application Monitor (JAMon) is a free, simple, high performance, thread safe, Java API
that allows developers to easily monitor production applications. JAMon can be used to determine
application performance bottlenecks, user/application interactions, and application scalability.
JAMon gathers summary statistics such as hits, execution times (total, average, minimum,
maximum, standard deviation), and simultaneous application requests. JAMon statistics are
displayed in the clickable JAMon Report.

Event Correlation
Event correlation is a technique for making sense of a large number of events and pinpointing the
few events that are really important in that mass of information. The central unit of information for
any event correlation engine is the event. Events can be viewed as generalized log records
produced by various agents, including standard Unix syslog. As such they can be related to any
significant change in the state of the operating system or application. Events can be generated
not only for problems but also for successful completions of scheduled tasks: for example, a host
being rebooted, an attempt to log in as administrator, or a hard drive being nearly full.
A typical event flow is not that different from email flow: each event has its origin, creation time,
subject and body. Often events have severity and other fixed parameters. As with email,
many events are just spam. As with email, they can be sorted into multiple event streams: for example,
an operator event stream, a Unix administrators' event stream, a Web server and WebSphere
administrators' event stream, etc. As in Lotus Notes, events can be processed much like database
records using some kind of SQL-like or generic scripting language.
Actually, as we will see below, the analogy runs deeper than that.
Event processing flow includes several stages. Among them:
Event detection and forwarding to processing (for example, Unix has a built-in event
collection system called syslog)
Pre-filtering, or stateless event correlation
More complex, or stateful, event correlation
Event notification (often integrated with a help-desk system)
Event response
Event archiving
Event correlation is one of the most important parts of the event processing flow. Proper event
correlation and filtering is critical to ensuring service quality and the ability to respond rapidly to
exceptional situations. The key to this is having experts encode their knowledge about the
relationship between event patterns and actions to take. Unfortunately, doing so is time-consuming
and knowledge-intensive.
Simple approaches based on collecting events on an "enterprise console" often lead to information
overload: the system is "crying wolf" way too often and as a result even useful alerts get
ignored due to the noise level. Correlation of events, while not a panacea, can substantially reduce the
load on the human operator and thus improve the chances that a relevant alert will be noticed and reacted
to in due time. But the devil is in the details. As Marcus Ranum noted:
"Correlation is something everyone wants, but nobody even knows what it is. It's like liberty or
free beer -- everyone thinks it's a great idea and we should all have it, but there's no road map for
getting from here to there."
Still, there are at least a couple of established technologies associated with event
correlation:
Stateless correlation: the correlation engine does not use its current state for the
decision. It is usually limited to filtering.
Stateful correlation: the correlation engine works with a "sliding window"
(peephole) of events and can match the latest event against any other event in the window as
well as its own state.
Stateful correlation is essentially pattern recognition applied to a narrow domain: the process of
identifying patterns of events, often across multiple systems or components, that might
signify hardware or software problems, attacks, intrusions, misuse or failure of components. It can
also be implemented as a specialized database with SQL as a query and peephole-manipulation engine.
The most typical operations include but are not limited to:
1. Transformation (or enrichment)
2. Duplicates removal
3. Filtering
4. Aggregation
5. Generalization
6. Auto-closure
7. Time-linking
8. Topology based correlation
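
A minimal sketch of two of the operations above, duplicates removal and aggregation, is shown below in
Python. It keeps a sliding "peephole" window of recent events; the event shape (timestamp, host, type),
the window length and the aggregation threshold are illustrative assumptions.

import time
from collections import deque

WINDOW_SECONDS = 60

class Correlator:
    def __init__(self):
        self.window = deque()   # tuples of (timestamp, host, event_type)

    def feed(self, host, event_type, now=None):
        now = now if now is not None else time.time()
        # Expire events that fell out of the peephole
        while self.window and now - self.window[0][0] > WINDOW_SECONDS:
            self.window.popleft()
        # Duplicates removal: drop the event if an identical one is in view
        if any(h == host and t == event_type for _, h, t in self.window):
            return None
        self.window.append((now, host, event_type))
        # Aggregation: emit one combined alert if the same event type is
        # seen on many hosts inside the window
        hosts = {h for _, h, t in self.window if t == event_type}
        if len(hosts) >= 5:
            return "AGGREGATED: %s on %d hosts" % (event_type, len(hosts))
        return "%s on %s" % (event_type, host)

c = Correlator()
print(c.feed("web01", "disk_full"))   # -> "disk_full on web01"
print(c.feed("web01", "disk_full"))   # -> None (duplicate suppressed)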
Event correlation is often associated with root cause analysis: the process of determining the root
cause of one or more events. For example, a failure situation on the network usually generates
multiple alerts, but only one of them can be considered the root cause. This is because a
failure condition on one device may render other devices inaccessible: polling agents are unable to
access the device which has the failure condition, and in addition they are unable to
access other devices rendered inaccessible by the error on the original device. The events generated
to indicate that all of these devices are inaccessible are essentially spam. All we need is a root cause
event.
The most typical event stream that serves as a playground for event correlation is Unix (or other
OS) system logs. Log analysis is probably the major application domain of event correlation. Unix
syslog provides rich information about the state of the system that permits building sophisticated
correlation schemes. Essentially each log entry is translatable to an event, although many can be
discarded as non-essential. Syslog often serves as the guinea pig for enterprise correlation efforts, and
rightly so: the implementation is simple (syslog can easily be centralized) and the return on investment
is immediate, as syslog in Unix contains a mass of important events that are often overlooked.
Additional events can be forwarded to syslog from cron scripts and other sources.
With log-based events as a constituent part of the event stream, the number of events in a typical
large corporate IT infrastructure, or just its Unix part, can be quite large. That means that
raw events typically go through a special preprocessing phase, often called normalization, which
somewhat trims the number of events for subsequent processing. Many events extracted
from syslog are also discarded as useless. Normalization eliminates minor, non-essential variations
and converts all events into a standard format, or at least a format more suitable for further processing.
During this procedure each event is also assigned a unique (often numeric) ID. In some ways this is
similar to the rewriting of the envelope in email systems like Sendmail.
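
A minimal sketch of such a normalization step in Python is shown below: raw syslog lines are parsed
into a standard event structure and assigned a numeric ID, and unparsable entries are discarded. The
regular expression targets the traditional syslog line format and is an illustrative assumption.

import re
import itertools

SYSLOG_RE = re.compile(
    r"(?P<time>\w{3}\s+\d+\s[\d:]{8})\s"   # e.g. "Oct  5 12:34:56"
    r"(?P<host>\S+)\s"
    r"(?P<prog>[\w\-/]+)(?:\[\d+\])?:\s"   # program name, optional PID
    r"(?P<msg>.*)")

event_id = itertools.count(1)

def normalize(line):
    m = SYSLOG_RE.match(line)
    if not m:
        return None                         # discard non-essential entries
    event = m.groupdict()
    event["id"] = next(event_id)            # assign a unique numeric ID
    return event

print(normalize("Oct  5 12:34:56 web01 sshd[812]: Failed password for root"))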

Pre-filtering vs. "deep" correlation
It does not make much sense to perform event correlation in a single step. It is more productive to
use a separate stage for each event stream, usually called "pre-filtering" (or surface
correlation), as opposed to "deep" correlation:
Pre-filtering. Pre-filtering is also called surface correlation. It should be optimized for
speed and implement simple processing rules which can perform aggregation of duplicate
events and elimination of non-essential, "spam" events. The main advantage is that it
lessens the load on the main, more complex correlation engine. Spam-filter based
technologies can be adapted for pre-filtering. A regular expression is one such mechanism
that is definitely useful in event correlation. Among mail filters that use regular expressions
for this purpose, procmail is worth mentioning. The event stream itself is very similar to an
SMTP stream, and many problems typical of SMTP gateways are usually present too. Pre-
filtering can also be implemented using in-memory SQL databases. See Memory-based SQL
databases.

Deep correlation. Historically, Tivoli was one of the first systems to introduce "deep"
event correlation technologies. The Tivoli event correlation engine is based on Prolog and
operates on a special purpose database. Not that it was tremendously successful, but at least
it created a critical mass of experience with "deep correlation". That means the techniques of
deep correlation are best studied by comparing with Tivoli TEC capabilities, which should be
considered "classic" for this area.

IBM plans to discontinue its Prolog-based TEC engine in 2012, and that opens possibilities
for open source implementations which can borrow concepts from the pre-existing TEC
infrastructure and documentation and improve upon them. As there are open source Prolog
implementations of reasonable quality, all Tivoli correlation techniques can be replicated
with open source software. But in my opinion scripting languages represent a better
platform for "deep" correlation technologies. Prolog-like constructs can be emulated if
necessary within the scripting language. See, for example, Prolog Interpreters in Python.
As these two technologies are complementary, they should generally be deployed together as two
different stages of the correlation engine:
a preliminary stage (fast but with limited functionality; possibly programmed for speed in C)
the main correlation engine (one that uses a scripting language and is programmable)
Attempting to do correlation in one stage is usually counterproductive, as "noise events" stress the
engine.
The complementary nature of pre-filtering and deep correlation means that advertisements for a
particular correlation engine based on claims that it can process a tremendous number of events
per second (Micromuse used to boast about "thousands of events per second") are pretty naive and
tell us something about the quality of the architecture.
For example, with a 10K event cache, IBM TEC 3.8 (and by extension 3.9) can process around 50
events per second using a reasonably optimal split of the rule set. Assuming newer 3.2GHz dual-core
Intel CPUs, Linux and DB2 this might get closer to 100, and such a speed is pretty much
adequate for most purposes if pre-filtering is used. It is very difficult to imagine more than 100
"important" events per second if noise events are filtered out. In a way, any speed above 100 events
per second probably does not improve the quality of the "deep" correlation engine but may just point
to architectural problems of the particular system and/or deceptive advertising designed to
fool PHBs.

The Structure of the Event
The complexity of the event correlation engine is somewhat related to the structure of the event. Events
can be strictly structured (essentially making them equal to structures in C and other programming
languages) or fuzzily structured (the number and names of slots can be dynamic). One form can usually
be converted into another, but different forms have different strong and weak points,
for example different flexibility.
Typically all events contain several common fields such as:
Time
Event type
Host on which the event occurred
Object to which the event is related (for example, syslog and disk file system monitors are two
types of objects)
Application that is affected by the event
An event can also contain several large text or XML fields such as:
Standard output from the probe
Annotations
Operator instructions
Action button definitions
There are two large classes of systems:
Systems with a fixed structure of events
Systems with a flexible structure of events
For example, HP Operations Monitor and Tivoli both belong to the class of systems with strictly structured
events (both use an Oracle database for storing them), but treat this quite differently.
In Tivoli, each event has a certain number of predefined, strongly typed fields (slots). The structure is
defined in the special BAROC (Basic Recorder Of Objects in C) language. The latter is not that
different from the notation used for C structures. Before an event can be sent to the system you need
to add it to the database of event class definitions. Otherwise the system will fail to recognize the
event.
For example:
The following example defines attribute names of name, address, employer, and hobbies for the
Person event class:
TEC_CLASS:
Person ISA EVENT
DEFINES {
name: STRING, dup_detect=YES;
address: STRING, dup_detect=YES;
employer: STRING;
hobbies: STRING;
};
The data type and any facets are comma-separated. The attribute definition should always be
terminated with a semi-colon.
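
For comparison, the same strictly structured event can be sketched in Python as a dataclass, where every
slot is likewise predefined and typed; this is an illustrative analogy, not part of the BAROC tooling.

from dataclasses import dataclass

@dataclass
class Person:
    name: str            # dup_detect=YES in the BAROC definition above
    address: str         # dup_detect=YES
    employer: str = ""
    hobbies: str = ""

event = Person(name="John", address="Pune", employer="DYPCOE")
print(event)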
If the main correlation engine is SQL-based, it usually presupposes strictly structured events. To
simplify processing by the SQL engine they might even have a "uniform" structure, which is a kind of
straitjacket (all fields are predefined and cannot be changed). In this case you can fool the system by using
some string fields to extend the straitjacket, using them as substructures that are interpreted by the
correlation engine. This is more or less convenient only if the string processing capabilities of the engine
are good.
Again, to stress: it is usually possible to convert events from one scheme to another. For
example, IBM faced this task due to the transition from TEC to Netcool. As a result it developed a
conversion tool called the BAROC conversion tool (nco_baroc2sql).
Another approach is fuzzy structuring of events, similar to the structure of SMTP messages. That means
that an event consists of two parts: one rigidly structured (the header) and one which is not.
Actually, in SMTP messages even the header is flexible and can be extended by so-called X fields, and
that approach has value in the description of events too. As mentioned before, there is a strong
similarity between events and e-mail messages. You can consider events as email messages to
operators with a special browser and special additional properties.
Another distinction is connected with the data carried by the event. Events can be completely
passive (data only; though some data can trigger interpretive actions, as in Tivoli), or have active parts
(interpreted by some built-in scripting engine). For example, in SMTP messages the body can
contain MIME attachments which can include executable scripts.
In general, an event does not need to have any passive data fields at all and can be a statement or
sequence of statements in some language. In this case a passive event is just a print statement in that
language that "spits out" all the necessary information, or, for example, a procedure for performing some
actions on the event window (an SQL insert statement). That, of course, raises some security questions,
but if operations are allowed only on the event window ("sandbox") they are not very relevant.
The beauty of this approach is that you can send complex events that manipulate event windows in
non-trivial ways. The simplest examples of this approach are so-called "cancelling" events: events
specifically designed to remove other event(s) of the same type (or with a set of matching attributes)
from the event queue.


Conclusion

Hence, we have successfully studied the concepts of log capturing and event correlation.















(GR:B) ASSIGNMENT NO: 19


Write a tool to detect and prevent Capturing mobile messages in Python/Java.

Problem Statement:

Write a tool to detect and prevent Capturing mobile messages in Python/Java.

THEORY




Mobile Phone Features:

The mobile phone or cellular phone is a complex electronic device that contains many features.

Voice and text messaging
Personal Information Management(PIM)
SMS and MMS messaging
Internet and email
Chat
Storage of images and videos
Games
Camera with Video recorder
Bluetooth and infrared
GPS navigator




Different Mobile Devices:

BlackBerry BlackBerry is a personal wireless handheld device that supports email, mobile phone
capabilities, text messaging, web browsing and other wireless information services.
A BlackBerry can be used as a phone, address book, or calendar, and to create to-do lists and access
wireless Internet.


iPod is a portable digital audio and video player offering a huge storage capacity.

iPhone is an Internet-connected smartphone designed and marketed by Apple Inc., with a multi-touch
screen and a minimal hardware interface.





Hardware Characteristics of Mobile Devices:
















Software Characteristics of Mobile Devices:







Components of Cellular Network:
Mobile Switching Center (MSC) is the switching system for the cellular network.
Base Transceiver Station (BTS) is radio transceiver equipment that communicates with mobile
phones.
Base Station Controller (BSC) manages the transceiver equipment and performs channel
assignment.
Base Station Subsystem (BSS) is responsible for managing the radio network and is controlled by
the Mobile services Switching Center (MSC).
It consists of the elements BSC (Base Station Controller), BTS (Base Transceiver Station), and TC
(Transcoder).
Home Location Register (HLR) is the database at the MSC. It is the central repository system for
subscriber data and service information.
Visitor Location Register (VLR) is the database used in conjunction with the HLR for mobile phones
roaming outside their service area.


Precautions to be taken Before Investigation:

Handle cell phone evidence properly to maintain physical evidence such as fingerprints.
To avoid unwanted interaction with devices found on the scene, turn off wireless interfaces such as
Bluetooth and Wi-Fi radios on equipment brought into the search area.
Photograph the crime scene, including mobile phones, cables, cradles, power connectors,
removable media, and connections.
If the device's display is ON, the screen's contents should be photographed and, if necessary,
recorded manually, capturing the time, service status, battery level, and other displayed icons.
Collect other sources of evidence such as the SIM and other hardware in the phone, but do not remove
them from the device.




Conclusion

Hence, we have studied how to detect and prevent capturing of mobile messages.
































(GR:C) ASSIGNMENT NO: 1


Design and implementation of Steganography






Problem Statement:

Design and implementation of Steganography


THEORY

Steganography:
The art and science of hiding information by embedding messages within other, seemingly harmless
messages. Steganography works by replacing bits of useless or unused data in regular computer files
(such as graphics, sound, text, HTML, or even floppy disks) with bits of different, invisible
information. This hidden information can be plain text, cipher text, or even images.
Steganography sometimes is used when encryption is not permitted. Or, more commonly,
steganography is used to supplement encryption. An encrypted file may still hide information using
steganography, so even if the encrypted file is deciphered, the hidden message is not seen.

Implementation:
Steganography takes cryptography a step further by hiding an encrypted message so that no one suspects
it exists. Ideally, anyone scanning your data will fail to notice that it contains encrypted data.
In modern digital steganography, data is first encrypted by the usual means and then inserted, using a
special algorithm, into redundant (that is, provided but unneeded) data that is part of a particular file
format such as a JPEG image. Think of all the bits that represent the same color pixels repeated in a
row. By applying the encrypted data to this redundant data in some random or inconspicuous way, the
result will be data that appears to have the "noise" patterns of regular, nonencrypted data. A trademark
or other identifying symbol hidden in software code is sometimes known as a watermark.
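
A minimal sketch of the most common image technique, least-significant-bit (LSB) embedding, is shown
below in Python using the third-party Pillow library (pip install Pillow). The file names are illustrative
assumptions, and a lossless format such as PNG is assumed, since JPEG recompression would destroy
the embedded bits.

from PIL import Image

def embed(cover_path, out_path, message):
    img = Image.open(cover_path).convert("RGB")
    # Message bytes as a bit string, NUL-terminated so extraction knows when to stop
    bits = "".join(format(b, "08b") for b in message.encode() + b"\x00")
    flat = [c for px in img.getdata() for c in px]
    assert len(bits) <= len(flat), "message too long for this cover image"
    for i, bit in enumerate(bits):
        flat[i] = (flat[i] & ~1) | int(bit)   # overwrite the lowest bit only
    stego = Image.new("RGB", img.size)
    stego.putdata(list(zip(flat[0::3], flat[1::3], flat[2::3])))
    stego.save(out_path)

def extract(stego_path):
    flat = [c for px in Image.open(stego_path).convert("RGB").getdata() for c in px]
    data = bytearray()
    for i in range(0, len(flat) - 7, 8):
        byte = int("".join(str(c & 1) for c in flat[i:i + 8]), 2)
        if byte == 0:                          # NUL terminator reached
            break
        data.append(byte)
    return data.decode(errors="replace")

embed("cover.png", "stego.png", "secret message")
print(extract("stego.png"))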
Digital text:
Making text the same color as the background in word processor documents, e-mails, and forum
posts.
Using Unicode characters that look like the standard ASCII character set. On most systems, there is
no visual difference from ordinary text. Some systems may display the fonts differently, and the
extra information would be easily spotted.
Using hidden (control) characters, and redundant use of markup (e.g., empty bold, underline or
italics) to embed information within HTML, which is visible by examining the document source.
HTML pages can contain code for extra blank spaces and tabs at the end of lines, and colours, fonts
and sizes, which are not visible when displayed.
Using non-printing Unicode characters Zero-Width Joiner (ZWJ) and Zero-Width Non-Joiner
(ZWNJ).

These characters are used for joining and disjoining letters in Arabic, but can be used in
Roman alphabets for hiding information because they have no meaning in Roman alphabets:
because they are "zero-width" they are not displayed. ZWJ and ZWNJ can represent "1" and "0".
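
A minimal sketch of this zero-width technique in Python is shown below: ZWNJ (U+200C) encodes a 0
bit and ZWJ (U+200D) a 1 bit, appended invisibly to a cover sentence. The bit-per-character encoding is
an illustrative choice.

ZWNJ, ZWJ = "\u200c", "\u200d"

def hide(cover, secret):
    bits = "".join(format(b, "08b") for b in secret.encode())
    return cover + "".join(ZWJ if bit == "1" else ZWNJ for bit in bits)

def reveal(text):
    bits = "".join("1" if ch == ZWJ else "0"
                   for ch in text if ch in (ZWJ, ZWNJ))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(errors="replace")

stego = hide("This sentence looks perfectly ordinary.", "hi")
print(repr(stego))    # the zero-width characters are invisible when displayed
print(reveal(stego))  # -> "hi"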
Network:
All information hiding techniques that may be used to exchange steganograms in telecommunication
networks can be classified under the general term of network steganography. This nomenclature was
originally introduced by Krzysztof Szczypiorski in 2003. Contrary to the typical steganographic
methods which utilize digital media (images, audio and video files) as a cover for hidden data, network
steganography utilizes communication protocols' control elements and their basic intrinsic functionality.
As a result, such methods are harder to detect and eliminate.
Typical network steganography methods involve modification of the properties of a single network
protocol. Such modification can be applied to the PDU (Protocol Data Unit), to the time relations
between the exchanged PDUs, or both (hybrid methods).
Moreover, it is feasible to utilize the relation between two or more different network protocols to enable
secret communication. These applications fall under the term inter-protocol steganography. Network
steganography covers a broad spectrum of techniques, which include, among others:
Steganophony - the concealment of messages in Voice-over-IP conversations, e.g. the employment of
delayed or corrupted packets that would normally be ignored by the receiver (this method is called
LACK - Lost Audio Packets Steganography), or, alternatively, hiding information in unused header
fields.
WLAN Steganography - the utilization of methods that may be exercised to transmit steganograms in
Wireless Local Area Networks. A practical example of WLAN steganography is the HICCUPS system
(Hidden Communication System for Corrupted Networks).
Printed:
Digital steganography output may be in the form of printed documents. A message, the plaintext, may
be first encrypted by traditional means, producing a ciphertext. Then, an innocuous covertext is
modified in some way so as to contain the ciphertext, resulting in the stegotext. For example, the letter
size, spacing, typeface, or other characteristics of a covertext can be manipulated to carry the hidden
message. Only a recipient who knows the technique used can recover the message and then decrypt it.
Francis Bacon developed Bacon's cipher as such a technique.
The ciphertext produced by most digital steganography methods, however, is not printable. Traditional
digital methods rely on perturbing noise in the channel file to hide the message, as such, the channel file
must be transmitted to the recipient with no additional noise from the transmission. Printing introduces
much noise in the cipher text, generally rendering the message unrecoverable. There are techniques that
address this limitation; one notable example is ASCII Art Steganography.
Using puzzles
Concealing data in a puzzle can take advantage of the degrees of freedom in stating the
puzzle, using the starting information to encode a key within the puzzle / puzzle image.
For instance, steganography using Sudoku puzzles has as many keys as there are possible solutions of a
Sudoku puzzle, which is about 6.71x10^21. This is equivalent to around 70 bits, making it much stronger than
the DES method, which uses a 56 bit key.[22]
Additional terminology
In general, terminology analogous to (and consistent with) more conventional radio and
communications technology is used; however, a brief description of some terms which show up in
software specifically, and are easily confused, is appropriate. These are most relevant to digital
steganographic systems.
The payload is the data to be covertly communicated. The carrier is the signal, stream, or data file into
which the payload is hidden; which differs from the "channel" (typically used to refer to the type of
input, such as "a JPEG image"). The resulting signal, stream, or data file which has the payload encoded
into it is sometimes referred to as the package, stego file, or covert message. The percentage of bytes,
samples, or other signal elements which are modified to encode the payload is referred to as the
encoding density and is typically expressed as a number between 0 and 1.
In a set of files, those files considered likely to contain a payload are called suspects. If the suspect was
identified through some type of statistical analysis, it might be referred to as a candidate.
Countermeasures and detection
Detection of physical steganography requires careful physical examination, including the use of
magnification, developer chemicals and ultraviolet light. It is a time-consuming process with obvious
resource implications, even in countries where large numbers of people are employed to spy on their
fellow nationals. However, it is feasible to screen mail of certain suspected individuals or institutions,
such as prisons or prisoner-of-war (POW) camps. During World War II, a technology used to ease
monitoring of POW mail was specially treated paper that would reveal invisible ink. In computing,
detection of steganographically encoded packages is called steganalysis. The simplest method to detect
modified files, however, is to compare them to known originals. For example, to detect information
being moved through the graphics on a website, an analyst can maintain known-clean copies of these
materials and compare them against the current contents of the site. The differences, assuming the
carrier is the same, will compose the payload. In general, using an extremely high compression rate makes
steganography difficult, but not impossible. While compression errors provide a hiding place for data,
high compression reduces the amount of data available to hide the payload in, raising the encoding
density and facilitating easier detection (in extreme cases, even by casual observation).
Applications
Usage in modern printers
Steganography is used by some modern printers, including HP and Xerox brand color laser printers.
Tiny yellow dots are added to each page. The dots are barely visible and contain encoded printer serial
numbers, as well as date and time stamps.


Conclusion

Hence, we have successfully studied the concept of steganography.












(GR:C) ASSIGNMENT NO: 2

Implement a program to generate and verify CAPTCHA image.

Problem Statement:

Implement a program to generate and verify CAPTCHA image.


THEORY


A CAPTCHA (an acronym for "Completely Automated Public Turing test to tell Computers and
Humans Apart") is a type of challenge-response test used in computing to determine whether or not
the user is human.
It is used to differentiate between humans and robots. This form of CAPTCHA requires that the user
type the letters of a distorted image, sometimes with the addition of an obscured sequence of
letters or digits that appears on the screen. This helps in differentiating between a program and a
human. Because the test is administered by a computer, in contrast to the standard Turing test
(a test of a machine's ability to exhibit intelligent behaviour) that is administered by a human, a
CAPTCHA is sometimes described as a reverse Turing test.

History:
The term was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas J. Hopper of
Carnegie Mellon University and John Langford of IBM. The most common type of CAPTCHA was
first invented by Mark D. Lillibridge, Martin Abadi, Krishna Bharat and Andrei Z. Broder.

Characteristics:
1) CAPTCHAs are by definition fully automated, requiring little human maintenance or
intervention to administer. This has obvious benefits in cost and reliability.
2) The algorithm used to create the CAPTCHA must be made public, though it may be covered by a
patent. This is done to demonstrate that breaking it requires the solution to a difficult problem in the
field of artificial intelligence (AI) rather than just the discovery of the (secret) algorithm, which
could be obtained through reverse engineering or other means.
3) Modern text-based CAPTCHAs are designed such that they require the simultaneous use of three
separate abilities: invariant recognition (the ability to recognize the large amount of variation in the
shapes of letters), segmentation, and parsing, to correctly complete the task with any consistency.
4) There are nearly an infinite number of versions for each character that a human brain can
successfully identify. The same is not true for a computer, and teaching it to recognize all those
differing formations is an extremely challenging task.
5) Segmentation, or the ability to separate one letter from another, is also made difficult in
CAPTCHAs, as characters are crowded together with no white space in between.

6) Unlike computers, humans excel at differentiating between unusual fonts. While segmentation
and recognition are two separate processes necessary for understanding an image for a computer,
they are part of the same process for a person. The human brain will not be fooled by variations in
letters.

Application:
CAPTCHAs are used to prevent bots from using various types of computing services or
collecting certain types of sensitive information. Applications include preventing bots from taking
part in online polls, registering for free email accounts (which may then be used to send spam) and
collecting email addresses. CAPTCHAs can prevent bot-generated spam by requiring that the
(unrecognized) sender pass a CAPTCHA test before the email message is delivered, but the
technology can also be exploited by spammers by impeding OCR detection of spam in images
attached to email messages.

Relation with AI:
While used mostly for security reasons, CAPTCHAs also serve as a benchmark task for
artificial intelligence technologies. Any program that passes the tests generated by a CAPTCHA can
be used to solve a hard unsolved AI problem.

The argument is that the advantages of using hard AI problems as a means for security are
twofold. Either the problem goes unsolved and there remains a reliable method for distinguishing
humans from computers, or the problem is solved and a difficult AI problem is resolved along with
it. In the case of image and text based CAPTCHAs, if an AI were capable of accurately completing
the task without exploiting flaws in a particular CAPTCHA design, then it would have solved the
problem of developing an AI that is capable of complex object recognition in scenes.

reCAPTCHA:
reCAPTCHA is a free service provided by Google to protect your website from spam and abuse.
reCAPTCHA uses an advanced risk analysis engine and adaptive CAPTCHAs to keep automated
software from engaging in abusive activities on your site. It does this while letting your valid users
pass through with ease.
reCAPTCHA offers more than just spam protection. Every time our CAPTCHAs are solved,
that human effort helps digitize text, annotate images, and build machine learning datasets. This in
turn helps preserve books, improve maps, and solve hard AI problems.

In this assignment, we have used the code provided by Google, using JSP.

Explanation of the program:
Program is divided into two main parts
(1) Index.jsp
(2) Validate.jsp

In the program we are simply utilizing the reCAPTCHA code provided by Google. For this we must
save our program with the extension .jsp; thus, we are using Java code to access the reCAPTCHA.
The first code, index.jsp, provides the title and presentation of the page and is also used to
import the reCAPTCHA. Username and password fields are displayed, and keys are provided to validate
the CAPTCHA.
The second code, validate.jsp, checks whether the input entered matches the CAPTCHA code. Then,
by re-importing the reCAPTCHA and providing the public and private keys respectively, we can validate
the input and show the user to be registered if the CAPTCHA is valid. validate.jsp consists of three
important fields:
1. remoteaddr: it takes the address required for running the program.
2. challenge: it checks whether the text entered matches the CAPTCHA.
3. uresponse: it registers the user if the CAPTCHA is valid, else shows a failure message.

Apart from the code, we need to install tools like Java 6.0 and Tomcat 6.0, and make the required
connections. Once the connections are made we must enter the following address in the browser:
http://localhost:8080/
Then enter into the manager and open the saved file by providing the path.

Conclusion

Hence, we have successfully studied the concepts of generation and verification of a CAPTCHA image.
