1. INTRODUCTION
The term "computer virus" was introduced in 1984 by mathematician Dr. Frederick Cohen, the "father" of the computer virus, who defined it as follows: "A virus is a program that is able to infect other programs by modifying them to include a possibly-evolved copy of itself." Computer viruses are harmful and have the following features.
1. A computer virus is an executable program that can attack a host program.
2. A computer virus can place an exact or possibly evolved copy of itself in other programs (the host programs) by modifying them, thereby infecting them.
3. A computer virus can also modify related information or parts of the host program and link the virus code to the host program, thereby infecting it.
Computer viruses consume computer resources and gravely affect system performance. First, viruses occupy large quantities of computer resources, especially CPU time, which makes the computer run slowly. Second, viruses access data illegally. They can read the user's secret or private data and even steal business or government secrets. Third, viruses unlawfully destroy or delete data stored in a computer. This data may be important user data or system files. The deletion of user data may cause inestimable cost to users, and the destruction of system files may cause the system to break down.
Because computer viruses do serious harm, anti-virus technology has developed greatly. There are four main anti-virus methods: detection, removal, immunity, and prevention. Among them, detection based on virus signatures is the simplest and the most widely applied, and it is widely used in commercial antivirus software. At present, viruses spread widely through high-speed networks. This places higher requirements on the performance of virus scan systems, such as high throughput and accuracy. In this paper, a hardware-accelerated virus scan system, HVSS (HIFN/Hardware accelerated Virus Scan System), is proposed to meet the needs of high-speed networks.
This paper is organized as follows. Section 2 introduces related work on computer viruses and antivirus technology, points out the common shortcomings of current virus scan systems, and gives our own solutions to these problems. Section 3 introduces the design and implementation of HVSS. The experimental results and performance analysis are given in Section 4, and Section 5 is the conclusion.
1.1. VIRUS
A computer virus is a computer program that can copy itself and infect a computer. The term "virus" is also commonly but erroneously used to refer to other types of malware, adware, and spyware programs that do not have the reproductive ability. A true virus can only spread from one computer to another (in some form of executable code) when its host is taken to the target computer, for instance because a user sent it over a network or the Internet, or carried it on a removable medium such as a floppy disk, CD, DVD, or USB drive. Viruses can increase their chances of spreading to other computers by infecting files on a network file system or a file system that is accessed by another computer. The term "computer virus" is sometimes used as a catch-all phrase to include all types of malware. Malware includes computer viruses, worms, Trojans, most rootkits, spyware, dishonest adware, crimeware, and other malicious and unwanted software. Viruses are sometimes confused with computer worms and Trojan horses, which are technically different. A worm can exploit security vulnerabilities to spread itself to other computers without needing to be transferred as part of a host, and a Trojan horse is a program that appears harmless but has a hidden agenda. Worms and Trojans, like viruses, may cause harm to a computer system's hosted data, functional performance, or networking throughput when they are executed. Some viruses and other malware have symptoms noticeable to the computer user, but many are surreptitious or go unnoticed.
scanned. Fast infectors rely on their fast infection rate to spread. The disadvantage of this method is that infecting many files may make detection more likely, because the virus may slow down a computer or perform many suspicious actions that can be noticed by anti-virus software. Slow infectors, on the other hand, are designed to infect hosts infrequently. Some slow infectors, for instance, only infect files when they are copied. Slow infectors are designed to avoid detection by limiting their actions: they are less likely to slow down a computer noticeably and will, at most, infrequently trigger anti-virus software that detects suspicious behavior by programs. The slow infector approach, however, does not seem very successful.
1.2. CONCURRENCY
In computer science, concurrency is a property of systems in which several computations are executing simultaneously, and potentially interacting with each other. The computations may be executing on multiple cores in the same chip, preemptively time-shared threads on the same processor, or executed on physically separated processors. A number of mathematical models have been developed for general concurrent computation including Petri nets, process calculi, the synchronous model and the Actor model.
1.3. SCANNING
Real-time protection, on-access scanning, background guard, resident shield, auto protect, and other synonyms refer to the automatic protection provided by most antivirus, antispyware, and other antimalware programs, which is arguably their most important feature. This monitors computer systems for suspicious activity such as computer viruses, spyware, adware, and other malicious objects in 'real-time', in other words while data is coming into the computer, for example when inserting a CD, opening an email, or browsing the web or when a file already on the computer is opened or executed, in other words loaded into the computer's active memory. This means all data in files already on the computer is analyzed each time that the user attempts to access the files. This can prevent infection by not yet activated malware that entered the computer unrecognized before the antivirus received an update. Real-time protection and its synonyms are used in contrast to the expression "on-demand scan" or similar expressions that mean a user-activated scan of part or all of a computer.
Three steps to keep the computer free from viruses:
1. Keep Windows critical updates current. For Windows XP or Windows 2000, it is important to keep the Windows critical updates current. Some viruses attack Windows vulnerabilities. The updates protect against many flaws in the operating system. These updates and virus scans should be run in Safe Mode.
2. Have antivirus software installed and run scans.
3. Remove adware and spyware from the computer.
2. RELATED WORK
The term "computer virus" was introduced by Cohen in 1984 at the recommendation of his advisor, Professor Leonard Adleman, who picked the name from science fiction novels. The most important feature of a computer virus is self-replication. The idea of self-replicating systems has been around since the Hungarian-American John von Neumann suggested it in 1948. Von Neumann proposed the idea of self-replication and gave a model of nature's self-reproduction based on self-building automata. A few years later, Stanislaw Ulam suggested to von Neumann that he use cellular automata to describe this model. Instead of "machine parts," states of cells were introduced. Von Neumann's model mathematically proved the possibility of self-reproducing structures: regular, non-living parts (molecules) could be combined to create self-reproducing structures and potentially living organisms. In 1968, Codd simplified von Neumann's model using cells that had eight states in 5-cell environments. This simplification is the basis for the "self-replicating loops" developed by artificial life researchers such as Christopher G. Langton in 1979. Such replication loops remove the complexity of the universal machine from the system and focus on the needs of replication.
A computer virus is an executable program that can reproduce itself and infect host programs. People did not recognize the damage that early viruses could cause until the famous Morris worm incident of 1988 (although, strictly speaking, a worm is not a virus). Since then, especially with the development of networks, computer virus technology has shown some new features.
1. General-purpose computer viruses: these viruses can infect different kinds of operating systems.
2. New viruses for new operating systems: as new operating systems appeared, new viruses were produced to infect them.
3. Targeting the network: as networks developed, some viruses were created to attack the network in order to access network resources illegally, to acquire other people's accounts and passwords, and even to attack online banking systems or important national departments.
4. Tools that generate viruses automatically: the appearance of such tools has simplified the production of viruses, and large numbers of viruses have appeared.
The wide spread of viruses has drawn attention to anti-virus technology, and many anti-virus techniques have recently been proposed. There are four classes of antivirus technology: virus detection, virus removal, virus immunity, and virus prevention. Among them, virus detection is the most important. Viruses are executable programs, and before they can be resisted they must be identified. There are two situations. First, if the computer has not been infected, data downloaded from the network or from other devices can be scanned first to avoid downloading viruses. Second, if the computer has already been infected, it can be scanned for viruses, which are then deleted. In either case, viruses should be identified and removed as soon as possible.
The most widely used virus detection technologies are the character code (virus signature), the checksum, behavior monitoring, and the infection experiment. The character code is the simplest method and has the lowest cost. The steps are as follows.
Step 1: Gather virus samples.
Step 2: Extract the character code from the virus samples.
Step 3: Add the character code to the virus database.
Step 4: Scan the file for the character codes stored in the virus database.
Step 5: If a character code is found in the file, the file is infected.
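Steps 4 and 5 of the character code method can be sketched in a few lines. This is a minimal illustration only, not the HVSS implementation: the signature names and byte patterns in `SIGNATURES` are invented, and a plain substring search stands in for whatever pattern match algorithm a real scanner uses.

```python
from typing import Dict, Optional

# Hypothetical virus database: virus name -> character code (byte pattern).
# Both the names and the patterns are invented for illustration.
SIGNATURES: Dict[str, bytes] = {
    "Demo.A": b"\xde\xad\xbe\xef",
    "Demo.B": b"EVIL_MARKER",
}


def scan(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> Optional[str]:
    """Return the name of the first character code found in data, else None."""
    for name, pattern in signatures.items():
        if pattern in data:  # Step 4: search the data for a stored character code
            return name      # Step 5: a match means the data is infected
    return None


if __name__ == "__main__":
    print(scan(b"ordinary file content"))
    print(scan(b"header" + b"\xde\xad\xbe\xef" + b"trailer"))
```

Because the function returns the matching name, the sketch also shows why this method can identify a virus by name, and why its cost grows with the size of the database.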
HVSS uses the character code method because it has the following advantages.
1. It detects viruses quickly and accurately.
2. It can identify viruses by name.
3. It has a low false alarm rate.
4. The detection result helps in removing the viruses.
The character code method also has the following typical shortcomings, especially in the current high-speed network environment.
1. It cannot detect unknown viruses.
2. As the number of viruses grows, a large virus database must be maintained. Scanning against the large database takes time, and the virus scan can seriously affect network bandwidth.
To address these shortcomings of the character code method, we propose our own solutions in HVSS. HVSS has the following features.
1. HVSS is hardware supported. A HIFN accelerator card, YUQUAN, is used for the virus scan. YUQUAN adopts a Pattern Match algorithm that provides an accurate and fast virus scan.
2. HVSS uses an effective protocol. The hpmd communication protocol is specially designed to provide effective communication between the client and the virus scan server.
3. HVSS supports multiple clients. Multiple clients can connect to the virus scan server at the same time, which enhances concurrent processing and increases the throughput of the server.
4. HVSS has well-designed software. The software has a clear structure and complete functionality. The synchronization mechanism is designed to allow maximum concurrency.
Fig.3.1 Architecture of HVSS
YUQUAN is an accelerator card that is installed directly in a PCIe slot of the virus scan server. It operates as a flow-through or a look-aside security processor to allow a modular upgrade to security appliances. Virus scanning against a large database places a heavy load on the host CPU. YUQUAN is believed to have the following advantages:
1. A potentially higher pattern match speed per thread (about 250M) makes YUQUAN competitive with other existing solutions.
2. It has a flow-through path for specified applications.
3. There are eight pattern match engines in YUQUAN. Each engine handles one scan thread. Together they achieve 1Gbps performance.
4. It contains a bandwidth control.
In addition, a host can also send local files to the virus scan server for scanning. A compact pattern match engine allows YUQUAN to hold a larger number of rules.
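The "one scan thread per engine" arrangement can be mimicked in software with a fixed pool of eight workers. This is only a software analogy under stated assumptions: the real engines are hardware on the YUQUAN card, and `scan_block` with its marker pattern is a hypothetical placeholder rather than the card's interface.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List

ENGINE_COUNT = 8  # the text describes eight pattern match engines


def scan_block(block: bytes) -> bool:
    """Placeholder scan: True if a hypothetical marker pattern is present."""
    return b"EVIL_MARKER" in block


def scan_concurrently(blocks: Iterable[bytes]) -> List[bool]:
    """Scan blocks on up to ENGINE_COUNT worker threads, keeping input order."""
    with ThreadPoolExecutor(max_workers=ENGINE_COUNT) as pool:
        return list(pool.map(scan_block, blocks))


if __name__ == "__main__":
    print(scan_concurrently([b"clean", b"xxEVIL_MARKERxx", b"clean again"]))
```

`pool.map` returns results in the order the blocks were submitted, so each result can still be matched to its block even though the scans overlap in time.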
Fig.3.2 Message format from proxy to virus scan server.
Message Header: 8 bytes
Flag: 16 bits
  Bit 0: set to 0 for a request packet
  Bits 1~13: reserved
  Bits 14~15: block flag
    0 0 - the middle data block in the file
    0 1 - the first data block in the file
    1 0 - the last data block in the file
    1 1 - the full file
File ID: 16-bit integer, file identification. A file can be divided into blocks and sent to the server for a virus scan. The File ID identifies which file a block belongs to.
Data Length: 32-bit integer giving the length of the data to be scanned. It is the length of the message data and does not include the length of the message header.
Message Data: a character sequence. It is the data block sent from the proxy to the server for a virus scan, and its length is given by the "Data Length" field in the message header.
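The request format above can be sketched with Python's `struct` module. Several details are assumptions, because the text does not state them: the field order (Flag, File ID, Data Length), network byte order, and the convention that bit n has the value `1 << n`, so that the block flag sits in the two highest bits of the 16-bit flag word. The function and constant names are illustrative only.

```python
import struct
from typing import Tuple

# Assumed layout: 16-bit flag, 16-bit File ID, 32-bit Data Length, big-endian.
HEADER = struct.Struct("!HHI")  # 2 + 2 + 4 = 8 bytes

# Block flag codes from the format description (bits 14~15).
BLOCK_MIDDLE, BLOCK_FIRST, BLOCK_LAST, BLOCK_FULL = 0b00, 0b01, 0b10, 0b11


def pack_request(file_id: int, block_flag: int, data: bytes) -> bytes:
    """Build a request message: 8-byte header followed by the data block."""
    flag = (block_flag & 0b11) << 14  # bit 0 stays 0, which marks a request
    return HEADER.pack(flag, file_id, len(data)) + data


def unpack_request(message: bytes) -> Tuple[int, int, bytes]:
    """Return (block_flag, file_id, data) from a request message."""
    flag, file_id, length = HEADER.unpack_from(message)
    data = message[HEADER.size:HEADER.size + length]
    return (flag >> 14) & 0b11, file_id, data


if __name__ == "__main__":
    msg = pack_request(file_id=7, block_flag=BLOCK_FIRST, data=b"abc")
    print(msg.hex(), unpack_request(msg))
```

Data Length is derived from the payload rather than passed in, so the header cannot disagree with the data that follows it.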
Fig.3.4 Message format from virus scan server to proxy.
Message Header: 8 bytes
Fig.3.5 Flag format of response.
Flag: 16 bits
  Bit 0: set to 1 for a response packet
  Bits 1~9: reserved
  Bit 10: set to 1 when an unexpected error occurred
  Bit 11: set to 1 when a virus was found
  Bits 12~15: reserved
File ID: 16-bit integer, file identification. After a block is scanned, the server sends the result to the proxy, and the File ID indicates which file the block belongs to.
Data Length: 32-bit integer giving the length of the data returned by the server. It is the length of the Attribute Data and does not include the length of the message header.
Attribute Data: the response sent from the server to the proxy as the result of the scan. It includes one or more attributes, and its length is given by the "Data Length" field in the message header. Each attribute has three fields:
  Type: 8 bits; indicates the scan result
  Length: 8-bit integer; the length of the attribute value
  Value: a character sequence whose length is given by the "Length" field
Attribute types:
  Type 0 - The scanned block in the file is normal. The scanned block length is returned as the value.
    Length: 4 (the block length occupies 4 bytes)
    Value: the length of the block that was scanned
  Type 1 - Viruses were found in the block. The virus name is returned as the value.
    Length: the string length of the virus name
    Value: the virus name
  Type 2 - Errors occurred during the virus scan. The error bitmap is returned as the value.
    Length: 4 (the error bitmap occupies 4 bytes)
    Value: the error bitmap
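A response in this format can be decoded as a header followed by type-length-value attributes. The sketch below is a hedged illustration, not the hpmd source: it assumes the same field order and network byte order as the request, assumes bit n has the value `1 << n`, and assumes virus names are ASCII text. All function and constant names are hypothetical.

```python
import struct
from typing import List, Tuple, Union

HEADER = struct.Struct("!HHI")  # assumed: flag, File ID, Data Length (8 bytes)

RESPONSE_BIT = 1 << 0   # bit 0: 1 for a response packet
ERROR_BIT = 1 << 10     # bit 10: unexpected error
VIRUS_BIT = 1 << 11     # bit 11: virus found

ATTR_NORMAL, ATTR_VIRUS, ATTR_ERROR = 0, 1, 2

Attribute = Tuple[int, Union[int, str]]


def parse_response(message: bytes) -> Tuple[int, str, List[Attribute]]:
    """Return (file_id, status, attributes) decoded from a response message."""
    flag, file_id, length = HEADER.unpack_from(message)
    if not flag & RESPONSE_BIT:
        raise ValueError("bit 0 is not set: not a response packet")
    body = message[HEADER.size:HEADER.size + length]

    attributes: List[Attribute] = []
    offset = 0
    while offset + 2 <= len(body):
        attr_type, attr_len = body[offset], body[offset + 1]
        raw = body[offset + 2:offset + 2 + attr_len]
        if attr_type in (ATTR_NORMAL, ATTR_ERROR):
            if len(raw) != 4:
                raise ValueError("block length and error bitmap are 4 bytes")
            attributes.append((attr_type, struct.unpack("!I", raw)[0]))
        else:
            # Assumption: virus names are ASCII strings.
            attributes.append((attr_type, raw.decode("ascii", errors="replace")))
        offset += 2 + attr_len

    if flag & VIRUS_BIT:
        status = "virus"
    elif flag & ERROR_BIT:
        status = "error"
    else:
        status = "normal"
    return file_id, status, attributes
```

The status is taken from the flag bits, while the attributes carry the detail (scanned length, virus name, or error bitmap), mirroring the split between header and Attribute Data in the format.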
the blocks in the same file will be deleted and a corresponding response, encapsulated in the stated message format, will be sent to the proxy. If the block is normal, the subsequent blocks in the same file will be sent for a virus scan until all the blocks in the file have been scanned or an abnormality (virus or error) occurs. For a message sent from hpmd to the proxy, bit 0 of the flag field is set to 1. For a virus block, bit 11 is set to 1. For an error block, bit 10 is set to 1. For a normal block, bits 10~11 are "0 0". The combination "1 1" in bits 10~11 should not occur.
3. When the proxy receives the response from hpmd, it checks the response type. If a virus was found or an error occurred, the proxy does not send the block to the user and deletes all the blocks with the same File ID as that block (since a virus has been found or an error has occurred, the remaining blocks of the file need not be scanned). If the response is normal, the proxy sends the block to the user and deletes only that block.
The hpmd communication protocol has the following advantages.
1. It saves disk space: First, the virus scan server does not need to keep a data block for a long time. After a data block is sent to YUQUAN for a scan, the data block can be deleted. Second, when the proxy receives the scan result, if a virus was found or an error occurred, all the data blocks belonging to the same file are deleted. If the result is normal, the data block is sent to the client, and the proxy does not need to keep the block any longer.
2. It is suitable for high-speed networks: hpmd is designed to support multiple clients. It is a single-process, multi-threaded function module, so hpmd offers a high degree of parallelism. Experimental results show that two hpmds greatly enhance the throughput of the virus scan.
3. It is easy to implement: First, the message format is simple. Second, a good deal of open source code, such as frox, can be used.
4. When a virus or an error occurs during the processing of a block, the subsequent blocks in the same file will not be scanned and all the blocks with the same File ID on the hpmd will be deleted.
Second, the action of frox:
1. Frox divides a file into blocks according to the default block size. These blocks are identified by the File ID.
2. After frox receives the response message from hpmd, it processes the block according to the flag of the message. If the scan result of the block is virus or error, frox does not send the block to the user and deletes all the blocks with the same File ID as the scanned block. If the block is normal, frox sends the block to the user and deletes the block from the buffer.
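The proxy-side behavior described above (split a file into blocks, forward each normal block, and drop the rest of the file on a virus or error) can be sketched as follows. This is not frox code: `split_blocks`, `relay_file`, and the `scan_block` callback are hypothetical names, and the callback stands in for the round trip to hpmd.

```python
from typing import Callable, Iterator, List, Tuple

# Block flag codes as listed in the request format.
BLOCK_MIDDLE, BLOCK_FIRST, BLOCK_LAST, BLOCK_FULL = 0b00, 0b01, 0b10, 0b11


def split_blocks(data: bytes, block_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (block_flag, block) pairs for one file."""
    chunks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    last = len(chunks) - 1
    for index, chunk in enumerate(chunks):
        if last == 0:
            yield BLOCK_FULL, chunk      # the whole file fits in one block
        elif index == 0:
            yield BLOCK_FIRST, chunk
        elif index == last:
            yield BLOCK_LAST, chunk
        else:
            yield BLOCK_MIDDLE, chunk


def relay_file(
    data: bytes,
    block_size: int,
    scan_block: Callable[[bytes], str],
) -> Tuple[str, List[bytes]]:
    """Return (final status, blocks delivered to the user)."""
    delivered: List[bytes] = []
    for _flag, block in split_blocks(data, block_size):
        status = scan_block(block)       # "normal", "virus", or "error"
        if status != "normal":
            return status, delivered     # remaining blocks are dropped unscanned
        delivered.append(block)          # normal block: forward it, then forget it
    return "normal", delivered
```

Neither side holds more than the current block in this sketch, which is the disk-space saving the protocol description claims; the trade-off is that blocks forwarded before a later block fails have already reached the user.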
Fig. 4.1 Ixia with single frox, hpmd disabled network configuration
Table 4.1. Results for single frox, with virus scan disabled
File size                  Throughput (Mbps)
4MB                        424
1MB                        384
256KB                      184
64KB                       200
15M (clean)                424
15M (virus at beginning)   424
15M (virus at middle)      424
15M (virus at end)         424
Table 4.2. Results for single frox, with virus scan enabled
File size                  Throughput (Mbps)
4MB                        408
1MB                        384
256KB                      176
64KB                       184
15M (clean)                408
15M (virus at beginning)   416
15M (virus at middle)      408
15M (virus at end)         408
Fig. 4.3 Ixia with two froxes, hpmd disabled network configuration
Table 4.3. Results for two froxes, with virus scan disabled
File size                  Throughput (Mbps)
4MB                        920
1MB                        816
256KB                      440
64KB                       424
15M (clean)                920
15M (virus at beginning)   920
15M (virus at middle)      920
15M (virus at end)         920
Fig.4.4 Ixia with two froxes, hpmd enabled network configuration
Table 4.4. Results for two froxes, with virus scan enabled
File size                  Throughput (Mbps)
4MB                        792
1MB                        704
256KB                      392
64KB                       376
15M (clean)                776
15M (virus at beginning)   864
15M (virus at middle)      824
15M (virus at end)         776
The results with hpmd disabled reflect the real throughput of our network testing environment. It can be seen from Figure 4.3 and Table 4.3 that the throughput of the test network approaches 1Gbps, so it can be used as a high-speed network environment.
Fig. 4.5 The throughput comparison of these four kinds of situations
Figure 4.5 shows the throughput comparison of these four kinds of situations. From these figures, we can draw the following conclusions:
1. When downloading files from ftp servers, the virus scan does not have an obvious effect on ftp throughput. Comparing the curves with the virus scan enabled and disabled shows that enabling the virus scan does not obviously reduce ftp throughput. This indicates that the virus scan is fast enough to work on a high-speed network. Throughput can reach nearly 900Mbps when two froxes are used.
2. When the number of froxes is doubled, ftp throughput is almost doubled. First, this shows that HVSS has good scalability. Second, it shows that the virus scan is fast with hardware acceleration.
3. The throughput for large files is higher than for small files. The reason is that, when a total of 1000MB of data is downloaded, many more files are transferred when the files are small, and the per-file overhead costs more transfer time. Of course, the file size cannot grow without limit, and there is a balance point. Experiments show that performance is higher with a file size between 1MB and 15MB.
According to the testing results, HVSS does not obviously affect the bandwidth of the high-speed network, and it has good scalability. It can support virus scanning for multiple ftp servers at the same time.
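The comparison can also be checked numerically from the four tables. The short script below computes, for each file size, the relative throughput reduction when the virus scan is enabled and the scaling factor from one frox to two. The throughput values are copied from Tables 4.1 to 4.4; the function names and the choice of these two metrics are ours, not part of the original evaluation.

```python
from typing import List, Tuple

FILES = ["4MB", "1MB", "256KB", "64KB", "15M clean",
         "15M virus at begin", "15M virus at middle", "15M virus at end"]

# Throughput in Mbps, copied from Tables 4.1 to 4.4.
SINGLE_DISABLED = [424, 384, 184, 200, 424, 424, 424, 424]
SINGLE_ENABLED = [408, 384, 176, 184, 408, 416, 408, 408]
DOUBLE_DISABLED = [920, 816, 440, 424, 920, 920, 920, 920]
DOUBLE_ENABLED = [792, 704, 392, 376, 776, 864, 824, 776]


def overhead_percent(disabled: float, enabled: float) -> float:
    """Relative throughput reduction caused by enabling the virus scan."""
    return (disabled - enabled) / disabled * 100.0


def scaling_factor(single: float, double: float) -> float:
    """How much throughput grows when going from one frox to two."""
    return double / single


def summarize() -> List[Tuple[str, float, float, float]]:
    """Rows of (file, overhead with 1 frox, overhead with 2, scaling when enabled)."""
    rows = []
    for i, name in enumerate(FILES):
        rows.append((
            name,
            overhead_percent(SINGLE_DISABLED[i], SINGLE_ENABLED[i]),
            overhead_percent(DOUBLE_DISABLED[i], DOUBLE_ENABLED[i]),
            scaling_factor(SINGLE_ENABLED[i], DOUBLE_ENABLED[i]),
        ))
    return rows


if __name__ == "__main__":
    for name, one, two, scale in summarize():
        print(f"{name:22s} {one:5.1f}% {two:5.1f}% x{scale:.2f}")
```

For the 4MB row, for instance, this gives a reduction of about 3.8% with one frox and about 13.9% with two, and a scaling factor of about 1.94 with the scan enabled.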
5. CONCLUSION
HVSS is designed for high-speed networks. To work on a high-speed network, HVSS uses an accelerator card to accelerate the virus scan. In addition, a special protocol is designed to provide effective communication between the proxy and the virus scan server. The protocol is easy to implement, has a lower error rate, and lets HVSS run smoothly. Experimental results show that HVSS does not have an obvious effect on the bandwidth of a high-speed network and has good scalability. It can provide virus scanning for multiple ftp servers at the same time. Though tested only with the ftp protocol, HVSS was designed from the beginning to support multiple protocols. It can be used to scan data encapsulated in other protocols with little modification. HVSS will be improved further in the near future.
BIBLIOGRAPHY
1. H. T. Zhang, The technology of computer virus and anti-virus, Tsinghua University Publishing Company, 2nd edition, December 1996.
2. P. Szor, The art of computer virus research and defense, Addison Wesley Professional, February 3, 2005.
3. F. B. Cohen, A Short Course on Computer Viruses, Wiley Professional Computing, New York, 2nd edition, 1994, ISBN: 0471007684.
4. Baojun Zhang, Jiebing Wang and Xuezeng Pan, Virus scan system based on hardware acceleration, IEEE paper.
5. V. V. Bontchev, Methodology of computer anti-virus research, University of Hamburg Dissertation, 1998.
6. J. V. Neumann, The general and logical theory of automata, Hixon Symposium, 1948.
7. www.wikipedia.org
8. www.ixiacom.com