
FlowMatrix Tutorial

A message from the creators: When developing FlowMatrix, our main goal was to provide a better and more affordable network anomaly detection and network behavior analysis tool for network security professionals. We chose a different technology for network intrusion detection than most modern IDS products use. FlowMatrix uses signature-free, self-tuning, multidimensional statistical models of network behavior to identify anomalies, including previously unknown ones, in real time. To classify detected anomalies we use a combination of fuzzy logic, Bayesian networks and a few other techniques.

We tried to keep FlowMatrix focused on a specific security goal and free of unnecessary functionality. In our opinion, many existing products attempt to cover too much ground, diluting their main purpose and burdening the user with extensive configuration before the product can be deployed. Special effort went into achieving a low rate of false positives (FP) with a high confidence level; a high false positive rate plagues many current IDS products and puts a heavy workload on security analysts. It is a well-known and hard problem for any network anomaly detection tool, but we did our best to strike an acceptable balance between the detection rate and the false positive rate, and we are constantly working to improve it.

Overall we believe we've mostly succeeded, and we hope that you, our users, will find our product superior, easy to use and more affordable than what is currently available on the market. We welcome suggestions from our users on how the product could serve its purpose better. Please send your suggestions to support@akmalabs.com

FlowMatrix modus operandi


FlowMatrix consists of three main functional parts: the processing engine (back end), the database and the Web GUI (front end). The engine receives NetFlow records from capable sources, such as routers, switches and firewalls, and processes them in real time to detect anomalies in the network and in network applications.

When first installed, the engine operates in learning mode. In this phase it can build network models from just a few hours of network data. Note that with so few initial data points, detected anomaly events may be false positives. After learning is complete (usually 7-14 days), the engine enters fully operational mode. In this mode the computed network models are constantly compared with current data derived from incoming NetFlows, and alarms are generated if the two differ beyond allowed tolerances. The models and their associated thresholds are self-adjusting: they are derived from the processed data itself (NetFlows) and are not hardcoded externally. Internal models are updated periodically without requiring a dedicated learning interval.

Models are built not just for the whole network but also for 3 groups of user-configurable applications, and each is compared independently. This means that FlowMatrix can detect behavioral anomalies for applications defined by the user. An application group is defined by the set of ports mostly used by the applications in that group.
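As a rough illustration of port-based application groups, the sketch below assigns a flow to a group by its ports. The group names, the port sets and the rule of matching on either the source or the destination port are assumptions made for this example; the source does not describe how FlowMatrix performs the match internally.

```python
# Hypothetical application groups (names and ports are illustrative only).
APP_GROUPS = {
    "web": {80, 443, 8080},
    "mail": {25, 110, 143},
    "database": {1433, 3306, 5432},
}


def classify_flow(src_port, dst_port, groups=APP_GROUPS):
    """Return the first group whose port set contains either port, else None.

    Matching on either port is an assumption of this sketch.
    """
    for name, ports in groups.items():
        if src_port in ports or dst_port in ports:
            return name
    return None
```

A flow from an ephemeral client port to port 443 would land in the hypothetical "web" group, while a flow between two unlisted ports would belong only to the whole-network model.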

This allows application anomalies to be detected even when the network as a whole appears to operate correctly. The feature is intended for monitoring business-critical network applications. Detected network or application anomalies are classified when possible. Regardless of classification, detailed relevant information is presented so that manual classification can be performed.

For the user, the model boundaries are expressed as thresholds which, if crossed, may trigger an anomaly event. The engine can detect two major classes of anomalies: volume based and traffic characteristics based (number of unique source or destination ports, unique source or destination IP addresses, number of connections, connection properties, etc.). Volume based anomalies are caused by an abnormal number of IP packets, substantially lower or higher than usual. Traffic characteristics alarms are triggered by unusual changes in the distribution of one or more IP packet characteristics: src and/or dst addresses, src and/or dst ports.

Note: Network conditions should be close to typical when the product is first deployed, so that the computed network and application models closely match the reality of your network. It is therefore advisable not to deploy the product while you are under heavy attack; a light level of attack is considered acceptable and will be compensated for by the models. Correct initial models improve detection and lower the false positive rate.
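To make the two anomaly classes more concrete, here is a minimal sketch of the kind of per-interval quantities involved: a packet total for volume, and unique address and port counts for traffic characteristics. The `Flow` tuple and the `interval_features` function are hypothetical names introduced for this example, and this is not FlowMatrix's actual model, which the tutorial describes only at a high level.

```python
from collections import namedtuple

# Hypothetical minimal flow record for illustration.
Flow = namedtuple("Flow", "src_ip dst_ip src_port dst_port packets")


def interval_features(flows):
    """Summarize one processing interval (for example, one minute) of flows."""
    flows = list(flows)
    return {
        "packets": sum(f.packets for f in flows),                 # volume
        "flows": len(flows),                                      # connections
        "unique_src_ips": len({f.src_ip for f in flows}),         # characteristics
        "unique_dst_ips": len({f.dst_ip for f in flows}),
        "unique_src_ports": len({f.src_port for f in flows}),
        "unique_dst_ports": len({f.dst_port for f in flows}),
    }
```

A host scan, for instance, would barely change the packet total but would raise the unique destination address count sharply, which is why the two classes are tracked separately.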

Deployment Model
FlowMatrix is most effective when deployed so that it receives NetFlow records about the most important traffic on your network. We recommend a deployment similar to that shown in Figure 1. Configure the internal NetFlow sources that handle traffic between corporate hosts and the Internet, such as routers, switches and firewalls, to export NetFlows to the FlowMatrix server. For best results and more visibility, make sure those sources see clear traffic that has not been NATed. If you have multiple connections to the Internet, make sure that traffic from all of them is observed and that the NetFlow records are sent to FlowMatrix for processing.

Figure 1

Installation
Have the installation executable ready on the server where FlowMatrix is going to be installed. You can download the latest version of FlowMatrix from this location: http://www.akmalabs.com/downloads_flowmatrix.php

Launch the downloaded executable, in this example FMSetup.exe.

Click Next on the Welcome screen and select a destination install folder, if appropriate.

Click Install. The installer checks for existing Apache and PostgreSQL installations. FlowMatrix assumes that it owns the Apache and PostgreSQL installations on the server, so the install will not proceed if either is already present; you would need to uninstall them and start the installation process over. Otherwise, the installer continues and installs all necessary components and services.

Note: All FlowMatrix components are installed under the destination folder. No files are installed in the system directories.

Once the install process finishes successfully, press Ok to exit the installer. To access the Web GUI, click Start->All Programs->FlowMatrix->Flow Matrix. You can also create a shortcut on your desktop.

Note: The FlowMatrix Web GUI works best with the IE or Firefox browsers. Support for other browsers may come in future releases.

Log on to the system using the default user name and password: admin / admin

Now you need to configure the product and the NetFlow sources.

Setup FlowMatrix and configure NetFlow sources


What is NetFlow?
NetFlow is a protocol for collecting network traffic information, developed by Cisco Systems, Inc. NetFlow-enabled devices, which include Cisco routers and switches as well as switches and routers from other supporting vendors, generate records that are sent from the device in UDP packets using the NetFlow protocol format. A NetFlow collector must then collect these packets as they stream from the device.

Some of the information that NetFlow provides for each reported flow: source and destination IP addresses of the flow; protocol type (the IP header field that indicates which protocol follows); source and destination ports of the flow, when applicable to the protocol; total number of packets and bytes in the flow; other information (TCP flags, etc.).

Note: A router sends the information about a given conversation only after the conversation has ended, or when a timeout is reached if one is configured.

Examples of devices and vendors that support NetFlow: Cisco routers; Cisco switches (some models); Juniper; Enterasys; Checkpoint (through 3rd party modules).
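The per-flow fields listed above map directly onto the NetFlow version 5 export format (a 24-byte header followed by 48-byte flow records). The sketch below decodes one such UDP payload using Python's standard `struct` module. The function name `parse_netflow_v5` and the dictionary keys are choices made for this example, and the code illustrates the wire format only, not how the FlowMatrix engine itself parses records.

```python
import socket
import struct

# NetFlow v5 layout: 24-byte header, then `count` records of 48 bytes each.
HEADER = struct.Struct("!HHIIIIBBH")
RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")


def parse_netflow_v5(datagram):
    """Decode a NetFlow v5 UDP payload into a list of per-flow dicts."""
    version, count = struct.unpack_from("!HH", datagram, 0)
    if version != 5:
        raise ValueError("not a NetFlow v5 datagram (version %d)" % version)
    flows = []
    for i in range(count):
        f = RECORD.unpack_from(datagram, HEADER.size + i * RECORD.size)
        flows.append({
            "src_ip": socket.inet_ntoa(f[0]),
            "dst_ip": socket.inet_ntoa(f[1]),
            "packets": f[5],    # dPkts
            "bytes": f[6],      # dOctets
            "src_port": f[9],
            "dst_port": f[10],
            "tcp_flags": f[12],
            "protocol": f[13],  # IP protocol number, e.g. 6 for TCP
        })
    return flows
```

A datagram whose header carries any version other than 5 is rejected, which mirrors the tutorial's statement that only version 5 collection is supported.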

In addition to the commercial products listed above, there are numerous implementations of NetFlow probes that monitor network traffic in real time and produce NetFlow records just as the routers and switches from the vendors above do. If you don't have a NetFlow-capable router or switch, the following link has a good collection of free, open source NetFlow probes you can use instead: http://www.switch.ch/network/projects/completed/TF-NGN/floma/software.html

Configure NetFlow receiving port on FlowMatrix
By default, the FlowMatrix engine listens for NetFlow traffic on port 2055. This can be changed under Settings->System Options in the Web GUI.

Example of configuring NetFlow on a Cisco router
Telnet to the router and enter the following commands in global config mode:

(config)#ip flow-export destination <ip address of FlowMatrix> <port number on FlowMatrix 2055>
(config)#ip flow-export version 5 peer-as
(config)#ip flow-export source <interface number>
(config)#ip flow-cache timeout active 1

Note: Currently FlowMatrix only supports collection using version 5 of NetFlow.

Then for each interface you configured above enable NetFlow collection:

interface <interface>
ip route-cache flow

A short time (about 1-2 minutes) after NetFlow has been configured on both FlowMatrix and the router, you can verify that the FlowMatrix engine is receiving NetFlow traffic by looking at the NetFlows received (per minute last hour) graph in Dashboard->Summary View.

Configure Learning Intervals (Optional)
Learning Intervals influence FlowMatrix's internal models and, as a result, the user-visible thresholds and the accuracy of anomaly detection. The default values provided should be appropriate for most deployments. The current version defines four intervals: day and night for weekdays, and day and night for weekends.

Unless your network activity differs substantially from the default intervals, you don't need to change them during initial setup. You can change them later without breaking the existing models, which are updated with the new intervals a short time (1-2 hours) after your changes are saved, once the models have been reloaded. If your network usage intervals do differ substantially and you would like to change the defaults, go to the Settings -> Learning Intervals page and adjust the settings.

Click Save to commit changes.
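The four-interval scheme can be pictured as a simple lookup from a timestamp to one of four buckets, each with its own model. In the sketch below the 08:00 and 20:00 boundaries are placeholders chosen for illustration; the tutorial does not state FlowMatrix's default values, and the function name is hypothetical.

```python
from datetime import datetime

# Hypothetical day boundaries; the real defaults are set on the
# Settings -> Learning Intervals page and are not given in this tutorial.
DAY_START_HOUR = 8
DAY_END_HOUR = 20


def learning_interval(ts):
    """Map a timestamp to one of the four learning intervals."""
    week_part = "weekend" if ts.weekday() >= 5 else "weekday"  # Sat=5, Sun=6
    day_part = "day" if DAY_START_HOUR <= ts.hour < DAY_END_HOUR else "night"
    return week_part + "_" + day_part
```

Keeping a separate model per bucket is what lets quiet weekend nights and busy weekday mornings each count as normal for their own interval.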

FlowMatrix Workflow
The FlowMatrix interface was designed with a specific workflow in mind. Following this workflow will help you get the most out of FlowMatrix's capabilities; later you will most likely work out your own. The recommended workflow starts at the Dashboard->Summary View screen, which provides an aggregate view of the key events the system tracks.

To get familiar with the main operational page, let's walk through the available graphs and explain their meaning.

The Anomaly Events Count graph shows the total number of anomaly alarms per interval. The default execution interval in FlowMatrix is 1 minute. This graph covers the last 60 minutes of events. Please note that the graph shows only alarms, not warnings.

The Anomaly Events table shows a summary of all anomaly alarms and warnings for the last 60 minutes. You can drill down into each alarm by clicking on its summary description. This takes you to pages with more detailed information on the alarm and an explanation of why it was considered an alarm or a warning. When possible, the page also classifies the alarm as belonging to a particular type of anomaly. You get most of the relevant detail needed to make a decision on the spot: accept the alarm and investigate it further, or ignore it as a false positive or a known anomalous condition.

The Anomaly Detection in Subspace #1 and Anomaly Detection in Subspace #2 graphs present anomaly detection statistics relative to the corresponding thresholds. The statistical models divide all network traffic into clusters and each cluster into subspaces; the subspaces apply to individual clusters as well as to the whole network. Typically, small-magnitude anomalies (low intensity attacks, scans, etc.) show up in the Subspace #1 graph, while large-scale distortions (worms, alpha flows, etc.) may show up in Subspace #2 or, most of the time, in both. Both subspaces, as applied to multiple traffic clusters, can detect volume and characteristics anomalies simultaneously and independently. Volume anomalies are caused by abnormal packet volume, either higher or lower than usual, while characteristics alarms are caused by abnormal changes in the distribution of IP flow properties such as src/dst addresses and ports. Both volume and characteristics anomalies generate alarm conditions when their values cross the normalized threshold lines (red). The black warning threshold lines signal that attention is needed. The subspace graphs show a one-hour detection window. Click on either subspace graph to go to the Anomaly Analysis page, where you can see these graphs with configurable window sizes.

Note: The FlowMatrix retention period is 14 days, although the summary screen shows only a one-hour window, which is considered the relevant real-time information. All events from the past 14 days are available on the Anomaly pages.

The Rules Triggered graph shows the number of rules that have been triggered by their matching conditions. Each rule can generate one or more events. The Rule Events table lists the events generated by the triggered rules. Rules are part of more manual network monitoring and are not discussed in this guide, but they are documented in the FlowMatrix User Manual.

The Packet Volume from all Devices graph shows total packets as the sum of the packets reported in NetFlows from all devices. FlowMatrix doesn't show packet volume per reporting device, only the aggregate volume from all of them. The graph also shows the mean (blue lines) and 2x standard deviation (red lines) of packet volume for the entire retention period (14 days). This graph can be useful for a number of things, such as determining proper learning intervals, detecting volume based anomalies and choosing thresholds for manual rules. Click the graph to go to the NetFlow Statistics page. On that page you will find not only packet volume graphs but also byte volume and src/dst IP and port counts.

The NetFlow received graph shows the total number of NetFlow packets received per minute from all NetFlow sources. This graph shows counters for the last 60 minutes; the rest of the data is discarded.
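The mean and 2x standard deviation lines on the packet volume graph suggest a simple manual check, sketched below: compute the band from historical per-minute packet counts and test whether a new value falls outside it. This is a plain statistical band meant for reading the graph or for choosing thresholds for manual rules, not the multidimensional model the engine uses; the function names and the sample numbers are invented for the example.

```python
from statistics import mean, pstdev


def volume_band(history, k=2.0):
    """Return (low, high) as mean -/+ k standard deviations of history."""
    m = mean(history)
    s = pstdev(history)
    return m - k * s, m + k * s


def outside_band(value, history, k=2.0):
    """True if value is below or above the k-sigma band of history."""
    low, high = volume_band(history, k)
    return value < low or value > high


# Invented per-minute packet counts, for illustration only.
history = [100, 102, 98, 101, 99]
print(outside_band(150, history))  # a spike well above the band
print(outside_band(101, history))  # a value inside the band
```

Because the check is two-sided, it flags drops in volume as well as spikes, matching the tutorial's definition of volume anomalies as traffic that is substantially lower or higher than usual.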

Note: FlowMatrix models not only the behavior of the network but also the behavior of 3 groups of network applications. You can switch to an application group view by clicking on the appropriate application group next to the Switch view to: label in the top right corner of the page. Some elements of the page do not change with the switch since they are common to every view.

We suggest that a security analyst keep the Dashboard summary view page in view for continuous monitoring. The things to pay attention to in this mode are the Anomaly Events graph, the Anomaly Events table, the Packets Volume graph and the NetFlows received gauge. When a threshold is crossed, there will be a corresponding event in the Anomaly Events table. If the event is of interest to you, click on it to go to the Details page. This page shows a friendly summary of the important flows for the most anomalous virtual traffic clusters. Clusters in which the anomaly is clearly represented are classified, and summary information about the classification is provided with the most likely violators clearly stated. Please keep in mind that virtual traffic clusters do not correspond to any grouping of hosts on your network; they are determined from IP addresses and a number of other parameters. For more summary information, drill down by clicking the Show by IP count or Show by packet count links.

Note: In two-dimensional space, an anomaly's location is determined by its time (shown in the events table) and its IP flow(s). Since multiple anomalies can occur at any given time, or a single anomaly can span multiple IP flows, virtual clustering provides a way to separate the affected IP flows from the unaffected ones.

Show by IP count shows the top src and dst hosts and ports sorted by the number of unique IP addresses they communicated with. Show by packet count shows the top src and dst hosts and ports sorted by number of packets, i.e. volume. Click on a link to see the cluster details screen:

There are five tables that help you pinpoint the source of an anomaly:

Top 10 src IP addresses, sorted by either IP or packet count (depending on which link you clicked). This table shows which unique dst IP addresses each of the top 10 src IPs contacted. For example, the screen above shows that host 64.95.76.7 contacted 8 different hosts, listed in the Dst IP Addr column. For brevity only 1 IP address is shown; all of them are accessible when you click the show list link. Packet and Kbyte counts are shown in the corresponding columns.

Top 10 dst IP addresses, sorted by IP or packet count. This table is the same as the Top 10 src IPs table but from the dst point of view. In our example, the first entries in the two tables mirror each other, which indicates an active pair of hosts talking to each other. However, depending on the attack (anomaly), such a pairing does not always exist: when the traffic is asymmetric, the dst or src host may not even be in the top 10.

Top 10 src ports and Top 10 dst ports follow the same concept as Top 10 src IPs and Top 10 dst IPs.

Top 10 peers by packet count shows the 10 most verbose pairs of hosts, those that exchanged the largest number of packets. As noted above, with asymmetric traffic a peer may be missing from the Top 10 src and Top 10 dst tables, so this table shows the peers regardless of the direction of the traffic.

Typically, you would want to choose the option (by IP or packet count) that is relevant to the type of anomaly. For example, if an anomaly is classified as the SCAN type, the packet count option is not the optimal one, as scans don't usually produce a large number of packets. On the other hand, scans do cover a large number of IP addresses and ports, so choosing the by IP count option would give you more information about the source or sources of the anomaly.
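The difference between the two sort orders can be sketched with a small ranking function. The example assumes flows reduced to hypothetical (source, destination, packets) tuples; the function name and the sample traffic are invented for illustration and say nothing about how FlowMatrix computes its tables.

```python
from collections import defaultdict


def top_sources(flows, by="ip", n=10):
    """Rank source hosts by unique destinations ("ip") or packet total ("packets").

    flows is an iterable of (src, dst, packets) tuples.
    Returns a list of (src, unique_dst_count, packet_total).
    """
    peers = defaultdict(set)
    packets = defaultdict(int)
    for src, dst, pkts in flows:
        peers[src].add(dst)
        packets[src] += pkts
    if by == "ip":
        key = lambda s: len(peers[s])
    elif by == "packets":
        key = lambda s: packets[s]
    else:
        raise ValueError("by must be 'ip' or 'packets'")
    ranked = sorted(peers, key=key, reverse=True)[:n]
    return [(s, len(peers[s]), packets[s]) for s in ranked]


# Invented traffic: one scanner touching 20 hosts, one bulk transfer.
flows = [("10.0.0.9", "10.0.1.%d" % i, 1) for i in range(20)]
flows.append(("10.0.0.5", "10.0.2.1", 5000))
```

With this sample, ranking by IP count puts the scanning host first even though it sent only 20 packets, while ranking by packet count puts the bulk transfer first and pushes the scanner down the list.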

Once you have the tables in front of you, additional information can be obtained about a particular IP address or port. Click on an IP address to fetch whois-type details about the address. Clicking on a port gives you the IANA port assignment, if one is available for the port.

Aside from the Dashboard view there are two options to see all anomaly and warning events retained by the system: Anomaly Events and Anomaly Analysis.

Anomaly Analysis lets you see graphs of traffic characteristics and volume in the two detection subspaces. By default, the last 12 hours are shown and updated automatically, but you can select a different time scale for the live update if you wish. Click Go to start updating. To isolate an anomaly, use the graph's time coordinate (X axis) and a time query. For example, suppose we want to look closely at the first two traffic characteristics anomalies in subspace #1. We see that the first began on Tuesday at roughly 1:43 (a 24-hour scale is used) and the second ended shortly before 3:06.

So, let's construct a query like this:

We can now see the anomalies' start and end times, as well as their nature (duration, magnitude, etc.), more precisely.

With the times noted, we can go to the Anomaly Events table to find the corresponding entries. Use the same query to limit the number of events:

Our two events occurred at 1:44:35 and 2:17:35 (the third event is in subspace #2). Click on any of them to go to the Event Details.

Other Tools of Interest
There are a few auxiliary tools that facilitate anomaly detection and situational awareness. These tools can be used for standalone visual detection and also to verify and correlate system-generated anomalies.

Network Patterns->Traffic Characteristics Color Map provides a color-coded representation of the degree of standard deviation for src/dst IP addresses and ports, grouped by virtual clusters. It covers the last 300 minutes. The warmer the color, the greater the deviation of a particular characteristic in a cluster. The idea is that a low intensity anomaly (a low intensity scan, for example) should cause a change in one of the four traffic characteristics, i.e. a deviation in one or more traffic clusters (rows in the matrix). The deviation may be too subtle for the system to raise an alarm, yet very often such a subtle anomaly shows a pattern that the human eye can easily spot. This feature requires some practice but can be an invaluable auxiliary tool.

Network Patterns->Subspace #1 and Subspace #2 Harmonics graphs show the raw harmonics of the network computed from NetFlow. Stable networks usually have some sort of periodicity in their harmonics, which changes if the network configuration itself changes. By checking visually whether the periodicity has changed, you may spot otherwise undetectable changes. It is useful to view these graphs on a time scale of more than 1 day to assess any possible changes.

Statistics->NetFlow Statistics assists in detecting anomalies that could be spotted by changes in the simple stats such as Mean and Std. Deviation.

For example, in this picture we can see that a sudden spike in byte and packet counts may indicate a volume based anomaly and it needs to be investigated. Again, use time scale and queries to spot or isolate anomalies.
