
Interactive Network Active-Traffic Visualization

Nathan Robinson, Jeff Scaparra
Computer Science Department
Texas A&M University, College Station, TX 77843, USA
{robinson, jcs8644}@cs.tamu.edu

STATEMENT OF CONCEPT

Interactive network active-traffic visualization (INAV) is a monitoring solution for real-time network environments. It is unique in that it monitors only the traffic that is currently active between nodes within a network and affords an intuitive visualization of that information, organized as multiple layers of information and interaction. Several considerations motivate this choice of research: investigating the problems and solutions associated with visualizing large graphs and large amounts of data, visualizing the collected information without overwhelming the user, and, most importantly, visualizing the network infrastructure in real time. INAV is designed for network administrators as a tool for discovering network traffic trends and behavior. This is relevant because the network administrator is responsible for the reliability of, and the traffic between, network segments. INAV collects this traffic using passive scanning techniques.

INAV will have two parts, the server and the UI. The server is responsible for collecting traffic and sending it to the UI module (the client), which will display a graph in which nodes are associated with IP addresses and edges represent the traffic between two nodes. As traffic is collected, new nodes will emerge, and their edges will become thicker and change in hue/saturation to represent the throughput of traffic between those nodes (figure 1). The display will also be interactive, providing feedback to the user at different layers of information: computer/router, traffic data types, and the active network infrastructure.
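As a rough illustration of the edge encoding described above, the following sketch maps a measured throughput to an edge width and colour. It is not the INAV implementation: the function name `edge_style`, the logarithmic scaling, the green-to-red hue range, and the default link capacity are all assumptions chosen for the example.

```python
import colorsys
import math


def edge_style(bytes_per_sec, max_bytes_per_sec=125_000_000,
               min_width=1.0, max_width=10.0):
    """Return (width, (r, g, b)) for an edge carrying the given throughput.

    All parameters are illustrative assumptions; the default capacity
    corresponds to a 1 Gbit/s link expressed in bytes per second.
    """
    # Clamp to [0, capacity] so out-of-range samples cannot break the scale.
    clamped = min(max(bytes_per_sec, 0), max_bytes_per_sec)
    # Logarithmic load in [0, 1] keeps light and heavy flows distinguishable.
    load = math.log1p(clamped) / math.log1p(max_bytes_per_sec)
    # Thicker edges for busier links.
    width = min_width + load * (max_width - min_width)
    # Hue moves from green (1/3) when idle to red (0) at capacity,
    # and saturation rises with load.
    hue = (1.0 - load) / 3.0
    saturation = 0.4 + 0.6 * load
    return width, colorsys.hsv_to_rgb(hue, saturation, 1.0)
```

A display loop could call `edge_style` each time the server reports new throughput for a pair of nodes and redraw only the affected edge.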

By providing this information in real time and on demand, INAV will let the network administrator evaluate the health of the network and discover network traffic/routing trends. This is important because the efficiency and robustness of a network depend on properly configuring equipment to suit current traffic needs. The important processes in our research are the dynamic, interactive visualization coupled with real-time active statistics of network traffic. This bridges an existing gap between "snapshot" network dataflow graphs and real-time packet sniffing applications.

CONTRIBUTION AND BENEFITS

Many network mapping applications are already under development, and network diagnostic software also exists. However, there is currently a missing link between the visualization of network traffic in a real-time environment and real-time traffic analysis, and INAV addresses both. This is achieved by utilizing multiple layers alongside details on demand. For network administrators, it is essential to know network health and efficiency. INAV will provide this information through an interactive graph; at the highest level, the user will see the throughput thresholds between nodes (figure 1) and be able to monitor their efficiency. The INAV project meets these needs by providing feedback and interaction through a web interface. Another prospective use for INAV is by small office/home office (SOHO) users, who would find this information valuable in troubleshooting network connections. For example, if your ISP goes down, you will notice that link deactivated. INAV will also allow SOHO users to discover malware or an unknown virus on a system, because the malware's traffic will be visible the moment it communicates with servers across the internet. The feedback mechanism will be intuitive and based upon current conceptual models and ideas so that the information presented can be easily understood; otherwise it is useless.

2007

Figure 1

At the data link layer, where the INAV server analyzes packet information (figure 2), there will be a server with two network interfaces. The first interface is for data collection and is where the traffic will be sniffed. The second is for communicating with the client (UI module), which will display the data, and for performing simple trace routes. The trace routes will be used by the INAV visualization to dynamically draw/map the network as we collect information from it. One visible problem is that performing trace routes for large quantities of traffic can place a large strain on the network. To overcome this problem, INAV will cap the trace routes at some restrictive bandwidth limit and cache the results in a lookup table. The only active network detection method used is an ICMP trace route.
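The cap-and-cache idea for trace routes can be sketched as follows. This is a minimal illustration under stated assumptions, not the INAV server code (which is planned in C++): the class name `TraceRouteCache`, the `trace_fn` callable that stands in for an ICMP trace route, the time-to-live on cached entries, and the use of a traces-per-second cap in place of the bandwidth limit mentioned above are all hypothetical.

```python
import time


class TraceRouteCache:
    """Lookup table for trace route results with a simple rate cap.

    trace_fn(src, dst) is a hypothetical callable that performs the actual
    trace route and returns a list of hops; it is injected so that the
    caching and capping logic can be exercised without network access.
    """

    def __init__(self, trace_fn, max_per_second=2.0, ttl_seconds=300.0,
                 clock=time.monotonic):
        self._trace_fn = trace_fn
        self._min_interval = 1.0 / max_per_second
        self._ttl = ttl_seconds
        self._clock = clock
        self._table = {}           # (src, dst) -> (timestamp, hops)
        self._next_allowed = 0.0   # earliest time a new trace may start

    def lookup(self, src, dst):
        """Return hops for (src, dst), or None if capped and nothing is cached."""
        now = self._clock()
        key = (src, dst)
        entry = self._table.get(key)
        # Fresh cache hit: no new probe traffic is generated.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Over the cap: fall back to a stale result if one exists.
        if now < self._next_allowed:
            return entry[1] if entry is not None else None
        # Under the cap: run one trace and remember the result.
        self._next_allowed = now + self._min_interval
        hops = self._trace_fn(src, dst)
        self._table[key] = (now, hops)
        return hops
```

A caller that receives `None` can leave the pair in its queue and retry later, so the number of outstanding probes stays bounded regardless of how many new nodes appear.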

Figure 2

PRIOR WORK ANALYSIS

Many other applications collect or visualize network traffic in part. Open source projects such as Cheops-ng [1] map networks and can dynamically find devices on the network (figure 3) with little or no user input. Other papers and applications address visualizing network traffic and dealing with the large amounts of data associated with computer data networks. Our research will combine these efforts and incorporate real-time information about the network to give a relative snapshot of the current state of the network.

Figure 3

One such network management tool for mapping and monitoring a network is Cheops-ng [1]. Cheops-ng is split into a server and a client, much like our design, with the server running on a separate computer on the network. The client interacts with the server to actively discover entities on the network. Cheops-ng can also be used to communicate with devices after their discovery. Right-clicking on an entity opens a menu that allows connections to the services discovered via a port scan of that system. It is also possible to rerun the scans and reverse lookups of the hostname. Another useful feature of Cheops-ng is operating system detection, which allows system administrators to see the different types of systems that are on the network.

Another study, performed by five national research institutions (Lawrence Berkeley National Laboratory, High Performance Computing Research Department, National Energy Research Scientific Computing Center, Los Alamos National Laboratory, and the University of New Mexico), worked toward identifying, characterizing, and visualizing "anomalous subsets of as large of a collection of network connection data as possible" [2]. They analyzed petabytes of network traffic over the course of 24 months. In order to manage the data, Lawrence Berkeley "developed a unique indexing, storage and retrieval system known as FastBit that uses extremely efficient compressed bitmap indices" [2]. In addition to this advanced database, machine learning is used for automatic classification and novelty detection. Los Alamos's Emaad system can detect anomalies unsupervised. With these capabilities they were able to perform intrusion detection to identify failed connections and graph them. For visualization of the data, two different methods were used, the SpaceShield and HyperSpace. The SpaceShield Viewer (figure 4) displays hosts on a globe representing the world, with the internal hosts of the network in the middle. In contrast, the HyperCube Viewer (figure 5) shows cubes that represent similar connections. "Expert analysts are experimenting with these clusters to try to identify an intuitive understanding of how these data are correlated" [2].

Figure 4

Figure 5

A similar paper, NicheWorks - Interactive Visualization of Very Large Graphs [3], focuses on managing the large amounts of data that networks provide and displaying them visually. NicheWorks, a tool created by the Bell Labs Visualization Group, is part of a suite of visualization views for performing interactive analysis of large datasets. The visualization takes place in two parts. In the first part, the initial layout, the algorithm must be capable of laying out up to a million nodes on a graph in just a few minutes. In the second part, the improvement phase, the user can apply more advanced incremental algorithms to shape the graph further. Studies have shown that humans perceive things as instantaneous if they occur "sufficiently fast" (within 200 ms). Because a million-node graph cannot be redrawn in that time, partial-drawing solutions had to be developed. By predicting the number of nodes that can be drawn in 200 ms, the authors are able to keep the graph's shape very similar to that of the full graph with all the nodes. "NicheWorks allows users to visualize weighted networks with hundreds of thousands of nodes and edges." [3]

EVALUATION PLAN

In order to properly evaluate the INAV project, we will utilize several methods. Usability testing will be used to measure and score the following:

Time on Task: How long does it take people to complete basic tasks? (For example, can malware traffic be detected? Can network segment outages be detected, or rather, should they be detected?)
Accuracy: How many mistakes did people make? (And were they fatal or recoverable with the right information?)
Recall: How much does the person remember afterwards or after periods of non-use?
Emotional Response: How does the person feel about the tasks completed? (Confident? Stressed? Would the user recommend this system to a friend?)

There will be several different cases to evaluate these needs. Our initial testing will be performed in house, followed by a pilot study at different levels. The in-house testing will determine whether node placement and visualization techniques are effective in providing sufficient feedback on the status of the network, and whether the UI is responsive enough to be usable as a real-time network map. This initial series of evaluations is important because it allows us to develop the application iteratively. We will also employ several testing methodologies, such as unit testing, a procedure used to validate individual functions or modules of source code. Usually, test cases are independent of one another, so that the different units can be tested in isolation. By using unit testing, we will be able to ensure that our application responds to the proper data stream sent by the server application. Our pilot study will involve the participation of several
different categories of users: the network administrator, the SOHO user, and a college student.

The network administrator will use the software and its usability will be evaluated. We are interested in seeing whether INAV is able to properly visualize a large graph, and whether it is able to provide necessary information about a node, such as the operating system, the IP address, the MAC address, the different network services available, and the type of traffic that node is either receiving or generating.

The small office/home office (SOHO) users and college students will be more interested in evaluating node connectivity, as connectivity with the internet is important and they do not consistently know how to troubleshoot these issues. It is also useful to know whether the network looks the way you expect (i.e., is there malware or a virus present?).

PROJECT PLAN

The project will be developed in stages, with the server and the display being developed concurrently. The first iteration will be devoted to the development of the communication channel between the INAV server and the INAV display. Then we will add features to the server and develop their visualizations for the display. We will need data first, so we will add the sniffing capabilities to the server and allow it to begin gathering information. The next step will be to add bandwidth monitoring and a system to keep track of the amount of data flowing between any two nodes on the network. We will then need to add filtering and queuing to manage the massive amounts of data.

INAV server

2/1/2007 Server/Display communication defined and developed. Unit test written for the communication channel.

2/15/2007 Network sniffing and transferring of sniffed data to display

2/22/2007 Bandwidth monitoring and reporting

3/8/2007 Filtering

3/15/2007 Iterative testing and refinement

3/22/2007 Fine-tune parameters

4/12/2007 Evaluation and testing

INAV Display

Server/Display communication defined and developed.

Node/edge implementation and visualization.

Details-on-demand visualization.

Iterative testing and refinement

Fine-tune parameters

Evaluation and testing

Iterative feedback and design with network administrators

The project will be split into two distinct parts. The INAV server will gather information and relay it to the INAV display, which will graphically represent the data. Jeff Scaparra will be the lead developer on the server and Nathan Robinson will be the lead developer on the INAV display. The server and display will communicate over a TCP/IP network, which will allow them to run in different locations or on the same physical device. The server will be developed in C++ and the display in Java, which will allow the INAV display to run on most operating systems. The INAV server will run on a Linux machine with root privileges, which are needed to allow the software to put the interface into promiscuous mode.

In order for the INAV server to see the traffic, it needs either to be configured as a router that the traffic passes through or to have an interface to which the switch copies traffic. The INAV server will collect information and discover the routes traffic is taking between nodes. It will hold all of the state information and relay it to the INAV display. To collect the information, the device will be put into promiscuous mode, and the data will be gathered and sent to the display. The nodes will then go into a queue, from which trace routes between the nodes will be performed in parallel. The hops that the traffic takes between the two end nodes will then be relayed to the display. The INAV server will also be responsible for filtering, so that the viewer can filter the display down to only a certain type of information. In addition, there will be a sliding window that bounds which data is visualized. This is necessary because of the amount of traffic that networks create. A more active network would want a smaller window so that the traffic does not become too dense and render the tool useless. The server will also be responsible for bandwidth reporting, which allows the INAV display to show link congestion. The INAV display will serve as a lightweight GUI responsible for relaying user input to the server and displaying server data.
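The sliding window and per-pair bandwidth tracking described in the project plan could be sketched along the following lines. This is only an illustration in Python, not the planned C++ server: the class name `SlidingTrafficWindow`, the 30-second default, and the choice to treat traffic between two nodes as undirected are assumptions, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class SlidingTrafficWindow:
    """Track bytes exchanged between node pairs over a recent time window."""

    def __init__(self, window_seconds=30.0):
        self.window_seconds = window_seconds
        self._events = deque()            # (timestamp, pair, nbytes), oldest first
        self._totals = defaultdict(int)   # pair -> bytes currently in the window

    def add(self, timestamp, src, dst, nbytes):
        """Record one sniffed packet or flow sample."""
        # Sort the endpoints so A->B and B->A accumulate on the same edge.
        pair = tuple(sorted((src, dst)))
        self._events.append((timestamp, pair, nbytes))
        self._totals[pair] += nbytes
        self._expire(timestamp)

    def _expire(self, now):
        """Drop samples that have fallen out of the window."""
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, pair, nbytes = self._events.popleft()
            self._totals[pair] -= nbytes
            if self._totals[pair] == 0:
                del self._totals[pair]    # the edge is no longer active

    def throughput(self, now):
        """Return {pair: average bytes per second over the window}."""
        self._expire(now)
        return {pair: total / self.window_seconds
                for pair, total in self._totals.items()}
```

Pairs whose traffic has aged out disappear from the result, which matches the goal of showing only active traffic; a busier network would simply be configured with a smaller `window_seconds`.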

ABSTRACT

The Interactive Network Active-Traffic Visualization (INAV) research will provide users with an interactive view of network traffic in real time, a combination that, to our knowledge, existing tools do not offer. It combines several elements and filtering methods that are useful for analyzing malware, peer-to-peer programs, and other information about how data traverses the network. Furthermore, it can be extended with machine learning techniques for identifying anomalies, which can be useful for discovering malicious or unauthorized access on the network.

CONCLUSION

By developing the INAV application, we will be able to provide a tool that is not only visually stimulating but also useful. We want to bridge the current gap in software by combining the usefulness of the visualizations in Cheops-ng with the robustness of Nmap (or other similar network diagnostic utilities). The INAV system differs considerably from prior work in that it integrates a visualization layer on top of a real-time collection agent, whereas what currently exists is real-time monitoring, or network visualization (mapping), or even artistic/abstract versions of these. INAV is different in that it aims to collect active network data interactively and display feedback about that data, whereas others may do one or several of these but not all of them. We expect this to be challenging, both in data collection (there is a lot of data) and in visualization (visualizing large data sets). We will achieve this through clever collation of data, optimized network communication between the server and client, and a carefully planned visualization to represent that data. We have not finalized our decision, but we are interested in both DUX and InfoVis.

REFERENCES

1. Priddy, B. Cheops-ng. http://cheops-ng.sourceforge.net/

2. Stockinger, K., Wu, K., Campbell, S., Lau, S., Fisk, M., Gavrilov, E., Kent, A., Davis, C.E., Olinger, R., Young, R., Prewett, J., Weber, P., Caudell, T.P., Bethel, E.W., Smith, S. Network Traffic Analysis With Query Driven Visualization. SC 2005 HPC Analytics Results. High Performance Computing Research Department (HPCRD/LBNL).

3. Wills, Graham J. NicheWorks - Interactive Visualization of Very Large Graphs.