NAME

CUFlow - flowscan module that is a little more configurable than SubnetIO.pm in return for sacrificing some modularity.


SYNOPSIS

   $ flowscan CUFlow

or in flowscan.cf:

   ReportClasses CUFlow


DESCRIPTION

CUFlow.pm creates rrds matching the configuration given in CUFlow.cf. By default, it creates a 'total.rrd' file representing the total inbound and outbound traffic it receives. It also creates 2 rrd files for every Service directive in CUFlow.cf: service_servicename_src.rrd and service_servicename_dst.rrd.

It makes some assumptions about the nature of the flows exported to it: basically, that they are either inbound to your network or outbound from it. It is designed to be run on a border router, and is not written to handle flows that both originate and terminate inside your network, or both originate and terminate outside it. It should be used to monitor the traffic being sent out by networks contained in Subnet statements.


CONFIGURATION

CUFlow's configuration file is CUFlow.cf. This configuration file is located in the directory in which the flowscan script resides.

In this file, blank lines and any data after a pound sign (#) are ignored. Directives that can be put in this file include:

Router
By default, CUFlow does not care which router the flow records it is processing come from. Unless you specify a Router statement, it just aggregates all the traffic it gets and produces rrd files based on the total. But if you add a Router statement such as
        # Separate out traffic from foorouter
        # Router <Ip Address> <optional alias>
        Router 127.0.0.5 foorouter

then, in addition to generating the totals rrd files it normally does in OutputDir, CUFlow will create a directory whose name is the IP address specified (or the alias, if one is provided), and create all the same service_*, protocol_*, and total.rrd files in it, counting only traffic passed from the router whose address is <Ip Address>.

Note that it does not make any sense to have Router statements in your config unless you have more than one router feeding flow records to flowscan. With one router, the results in the per-router directory will be identical to the total records in OutputDir.

SampleRate
If you are using sampled netflow (mandatory on Juniper), the router exports records for only 1 in every n packets. Specify the sample rate in the config file with
        # Sample rate per exporter in case we're using sampled netflow
        # SampleRate <Ip Address> <rate>
        SampleRate 127.0.0.5 96

This will effectively multiply all data from that router by 96.
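
The scaling can be pictured with a minimal Python sketch. This is an illustration only, not CUFlow's own Perl code; the function name and the `sample_rates` table are hypothetical.

```python
def scale_flow(packets, octets, sample_rate=1):
    """Scale sampled counters up to an estimate of the real traffic."""
    return packets * sample_rate, octets * sample_rate

# Hypothetical per-exporter table, as it might be built from SampleRate lines.
sample_rates = {"127.0.0.5": 96}

# A record from 127.0.0.5 reporting 10 packets and 15000 bytes.
pkts, octets = scale_flow(10, 15000, sample_rates.get("127.0.0.5", 1))
print(pkts, octets)  # 960 1440000
```

Exporters without a SampleRate entry fall back to a rate of 1 in this sketch, so their counters are left unchanged.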

Subnet
Each Subnet entry in the file is an IP/length pair that represents a local subnet. E.g.:
        # Subnet for main campus
        Subnet 128.59.0.0/16

Add as many of these as necessary. Unlike CampusIO, CUFlow does not generate additional reports per subnet; it simply treats any packet destined to an address *not* in any of its Subnet statements as an outbound packet. The Subnet statements are used solely to determine whether a given IP address is ``in'' your network or not. For subnet-specific reporting, see the Network item below.
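
The in/out decision described above can be sketched in Python with the standard ipaddress module. This is a simplified illustration of the rule, not the module's actual implementation; `SUBNETS`, `is_local`, and `direction` are hypothetical names.

```python
import ipaddress

# Mirrors the Subnet example above.
SUBNETS = [ipaddress.ip_network("128.59.0.0/16")]

def is_local(addr):
    """True if addr falls inside any configured Subnet."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in SUBNETS)

def direction(dst_addr):
    """A flow is outbound when its destination is not in any Subnet."""
    return "in" if is_local(dst_addr) else "out"

print(direction("128.59.39.2"))  # in
print(direction("192.0.2.1"))    # out
```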

Network
Each Network statement in the cf file is used to generate an rrd file describing the bytes, packets, and flows in and out of a group of IP addresses within your larger Subnet blocks. E.g.:
        # Watson Hall traffic
        Network 128.59.39.0/24,128.59.31.0/24 watson

It consists of a comma-separated list of 1 or more CIDR blocks, followed by a label to apply to traffic into/out of those blocks. It creates rrd files named 'network_label.rrd'. Note that these unfortunately cover only the total traffic seen, and are not per-exporter as Service and Protocol are. Note also that a Network must be a subset of your defined Subnets.
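
One way to check the subset requirement before starting flowscan is sketched below in Python. CUFlow itself is not known to ship such a check; the helper name is hypothetical, and the sketch assumes IPv4 blocks only.

```python
import ipaddress

def network_within_subnets(network_blocks, subnets):
    """Return True if every CIDR block of a Network statement lies
    inside at least one of the configured Subnet blocks."""
    subnet_nets = [ipaddress.ip_network(s) for s in subnets]
    for block in network_blocks.split(","):
        net = ipaddress.ip_network(block)
        if not any(net.subnet_of(s) for s in subnet_nets):
            return False
    return True

# The watson example above, checked against the main campus Subnet.
print(network_within_subnets("128.59.39.0/24,128.59.31.0/24",
                             ["128.59.0.0/16"]))  # True
```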

Service
Each Service entry in the file is a port/protocol name that we are interested in, followed by a label. E.g.:
        # Usenet news
        Service nntp/tcp news

In this case, we are interested in traffic to or from port 119 on TCP, and wish to refer to such traffic as 'news'. The rrd files that will be created to track this traffic will be named 'service_news_src.rrd' (tracking traffic whose source port is 119) and 'service_news_dst.rrd' (for traffic with dst port 119). Each Service entry will produce these 2 service files.

The port and protocol can each be either symbolic (nntp, tcp) or absolute numeric (119, 6). A symbolic name is resolved with getservbyname or getprotobyname as appropriate.

Service tags may also define a range or group of services that should be aggregated together. E.g.:

        # RealServer traffic
        Service 7070/tcp,554/tcp,6970-7170/udp realmedia

This means that we will produce 'service_realmedia_dst.rrd' and 'service_realmedia_src.rrd' files, which will contain traffic data for the sum of the port/protocol pairs given above. Do not put spaces in the comma-separated list.

The label that follows the set of port/protocol pairs must be unique, as it is used to create the rrd file for the matching data. CUFlow will not start if there is a duplicate label name.
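
To make the Service syntax concrete, here is a rough Python sketch of how a port/protocol list could be expanded into matchable rules. It is an illustration, not the module's Perl parser. The function names are hypothetical, and the sketch assumes that a symbolic port is paired with a symbolic protocol, since Python's socket.getservbyname expects a protocol name.

```python
import socket

def parse_service(spec):
    """Expand e.g. '7070/tcp,554/tcp,6970-7170/udp' into a list of
    (protocol_number, low_port, high_port) tuples."""
    rules = []
    for item in spec.split(","):
        port, proto = item.split("/")
        # Protocol: numeric as given, otherwise looked up by name.
        proto_num = int(proto) if proto.isdigit() else socket.getprotobyname(proto)
        low, sep, high = port.partition("-")
        if port.isdigit():
            lo = hi = int(port)                  # single numeric port
        elif sep and low.isdigit() and high.isdigit():
            lo, hi = int(low), int(high)         # numeric range
        else:
            lo = hi = socket.getservbyname(port, proto)  # symbolic port
        rules.append((proto_num, lo, hi))
    return rules

def matches(rules, proto_num, port):
    """True if the protocol/port pair falls inside any rule."""
    return any(p == proto_num and lo <= port <= hi for p, lo, hi in rules)
```

With the realmedia example written numerically, `parse_service("7070/6,554/6,6970-7170/17")` yields three rules, and `matches(rules, 17, 7000)` is true because UDP port 7000 falls inside the 6970-7170 range.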

ASNumber
Each ASNumber entry in the configuration file specifies a foreign Autonomous System number we are interested in. Specify these in the config file as:
        # Track our traffic to as 23517
        ASNumber 23517 FooNET

where the first argument is the AS number (as reported in the netflow records), and the second is a label describing that AS. (Do not use spaces or characters that have meaning to the filesystem/shell in these labels!) An rrd file is created for every exporting Router (given by a Router entry). You may specify multiple AS numbers to graph together by providing a comma-separated list as the first argument, as in the Service tag above.

Also as in Service tags, do not put spaces in the comma-separated list, and use unique labels for each ASNumber statement.

Multicast
Add Multicast to your CUFlow.cf file to enable our cheap multicast hack. E.g.:
        # Log multicast traffic
        Multicast

Unfortunately, in cflow records, multicast traffic always has a nexthop address of 0.0.0.0 and an output interface of 0, meaning that by default CUFlow drops it (but still counts it for purposes of total.rrd). If you enable this option, CUFlow will create protocol_multicast.rrd in OutputDir (and exporter-specific rrds for any Router statements you have).

Protocol
Each Protocol entry means you are interested in gathering summary statistics for the protocol named in the entry. E.g.:
        # TCP
        Protocol 6 tcp

Each Protocol entry creates an rrd file named protocol_<protocol>.rrd in OutputDir. The protocol may be specified either numerically (6) or symbolically (tcp). It may be followed by an optional alias name. If symbolic, it will be resolved via getprotobyname. The rrd file will be named according to the alias or, if one is not present, the name/number supplied.

TOS
Each TOS entry means you are interested in gathering summary statistics for traffic whose TOS flag is contained in the range of the entry. E.g.:
        # Normal
        TOS 0 normal

Each TOS entry creates an rrd file named tos_<tos>.rrd in OutputDir. The TOS value must be specified numerically. The rrd file will be named according to the alias.

Similar to Service tags, you may define ranges or groups of TOS values to record together. E.g.:

        # first 8 values
        TOS 0-7 normal

This will graph data about all flows with the matching TOS data. TOS values are between 0 and 255 inclusive.
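
A TOS specification can be expanded along the following lines. This is a Python sketch under the assumption that groups use the same comma-separated syntax as Service lists; `parse_tos` is a hypothetical helper, not part of CUFlow.

```python
def parse_tos(spec):
    """Expand e.g. '0-7' or '0,8-9' into the set of matching TOS values."""
    values = set()
    for item in spec.split(","):
        low, sep, high = item.partition("-")
        lo = int(low)
        hi = int(high) if sep else lo
        # TOS values are between 0 and 255 inclusive.
        if not 0 <= lo <= hi <= 255:
            raise ValueError("bad TOS range: " + item)
        values.update(range(lo, hi + 1))
    return values

print(sorted(parse_tos("0-7")))  # [0, 1, 2, 3, 4, 5, 6, 7]
```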

OutputDir
This is the directory where the output rrd files will be written. E.g.:
        # Output to rrds
        OutputDir rrds

Scoreboard
The Scoreboard directive is used to keep a running total of the top consumers of resources. It produces HTML reports showing the top N (where N is specified in the directive) source addresses that sent the most (bytes, packets, flows) out, and the top N destination addresses that received the most (bytes, packets, flows) from the outside. Its syntax is
        # Scoreboard <NumberResults> <RootDir> <CurrentLink>
        Scoreboard 10 /html/reports /html/current.html

The above indicates that each table should show the top 10 of its category, that past reports should be kept in the /html/reports directory, and that the latest report should be pointed to by /html/current.html.

Within RootDir, we create a directory per day, and within that, a directory per hour of the day. In each of these directories, we write the scoreboard reports.

Scoreboarding measures all traffic we get flows for; it is unaffected by any Router statements in your config.

AggregateScore
The AggregateScore directive indicates that CUFlow should keep running totals for the various Scoreboard categories, and generate an overall report based on them, updating it every time it creates a new Scoreboard file. E.g.:
        # AggregateScore <NumberToPrint> <Data File> <OutFile>
        AggregateScore 10 /html/reports/totals.dat /html/topten.html

If you configure this option, you must also turn on Scoreboard. /html/reports/totals.dat is a data file containing an easily machine-readable form of the last several Scoreboard reports. CUFlow takes each entry's average values over every time it has appeared in a Scoreboard, then prints the top NumberToPrint of those. Every 100 samples, it drops all entries that have appeared only once and divides all the others by 2 (including the number of times they have appeared). So, if a given host were always in the regular Scoreboard, its appearance count would slowly grow from 50 to 100, then get cut in half, and repeat.
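
The decay scheme can be sketched in Python as follows. This is a simplified model rather than the module's code: it tracks bytes only, the data layout and function names are hypothetical, and it assumes the sample counter is halved together with the entries, which is what makes an always-present host cycle between 50 and 100 appearances as described.

```python
def update_aggregate(state, scoreboard, prune_every=100):
    """Fold one Scoreboard report ({address: bytes}) into the running state.

    state looks like {"samples": int,
                      "hosts": {address: {"count": int, "bytes": float}}}.
    """
    state["samples"] += 1
    for addr, nbytes in scoreboard.items():
        entry = state["hosts"].setdefault(addr, {"count": 0, "bytes": 0})
        entry["count"] += 1
        entry["bytes"] += nbytes

    if state["samples"] >= prune_every:
        survivors = {}
        for addr, entry in state["hosts"].items():
            if entry["count"] > 1:  # drop hosts that appeared only once
                survivors[addr] = {"count": entry["count"] // 2,
                                   "bytes": entry["bytes"] / 2}
        state["hosts"] = survivors
        state["samples"] //= 2      # assumption: counter is halved as well

def top_n(state, n):
    """Addresses with the highest average bytes per appearance."""
    avg = {a: e["bytes"] / e["count"] for a, e in state["hosts"].items()}
    return sorted(avg, key=avg.get, reverse=True)[:n]
```

Halving both the totals and the appearance counts leaves each surviving host's average unchanged while gradually reducing the weight of older reports.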

This is useful for trend analysis, as it enables you to see which hosts are *always* using bandwidth, as opposed to outliers and occasional users.

AggregateScoreboarding measures all traffic we get flows for; it is unaffected by any Router statements in your config.

ASScoreboard
ASScoreboard is similar to Scoreboard. It produces a similar graphing service, but shows the top N (where N is specified in the directive) AS'es sending or receiving the most (bytes, packets, flows). Its syntax is
        # ASScoreboard <NumberResults> <RootDir> <CurrentLink>
        ASScoreboard 10 /html/reports/AS /html/currentAS.html

This example indicates we should keep reports on the top 10 AS'es, keep past reports in the /html/reports/AS directory, and keep a symlink called currentAS.html that points to the latest report.

Within RootDir, we create a directory per day, and within that, a directory per hour of the day. In each of these directories, we write the scoreboard reports. If you also have Scoreboard configured, you should use a separate RootDir for the ASScoreboard reports.

Like Scoreboard, ASScoreboard measures all traffic we get flows for; it is unaffected by any Router statements in your config.


BUGS

Some.

Most notably, flows that have the same source and destination port end up being counted twice (in either the inbound or outbound direction) for the purposes of measuring the percentages in CUGrapher. The total traffic is correct, but the assumptions CUGrapher makes lead to a negative percentage of ``other traffic''.

This may just be a cosmetic bug in CUGrapher. If an inside host transfers 5 MB to an outside host from port 80 to port 80, we need to update both counters. The bug is in assuming that the sum of all the service totals will be less than the total traffic, which may not be the case if some traffic belongs to more than one service. More stuff for 2.0...
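
The arithmetic behind the negative figure can be shown with a short Python sketch. It assumes CUGrapher derives ``other traffic'' by subtracting the summed service counters from the total, as described above; the function and variable names are hypothetical.

```python
def other_traffic(total, service_counters):
    """Traffic left over once every service counter is subtracted."""
    return total - sum(service_counters)

# One 5 MB outbound flow from port 80 to port 80 updates both the
# src-port and the dst-port counter of the same service.
total_out = 5_000_000
http_src = 5_000_000
http_dst = 5_000_000

print(other_traffic(total_out, [http_src, http_dst]))  # -5000000
```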

scoreboard() assumes our flowfile is 300 seconds worth of data. It should figure out over how many seconds the records run, and divide by that instead.
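
A possible shape for that fix, sketched in Python rather than in the module's Perl: derive the time span from the flow timestamps and fall back to 300 seconds only when it cannot be determined. The function name and its arguments are hypothetical.

```python
def rate_per_second(total, flow_times, default_span=300):
    """Average rate over the period the flow records actually cover.

    flow_times is a list of record timestamps in seconds.
    """
    span = max(flow_times) - min(flow_times) if flow_times else 0
    if span <= 0:
        span = default_span  # keep the current 300-second assumption
    return total / span

print(rate_per_second(3000, [1000, 1150]))  # 20.0
```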

report() and its subroutines need to do locking.


AUTHOR

Johan Andersen <johan@columbia.edu>


CONTRIBUTORS

Matt Selsky <selsky@columbia.edu> - CUGrapher and co-developer
Terje Krogdahl <terje@krogdahl.net> - Sampled netflow support and AS graphing
Joerg Borchain <jd@europeonline.net> - CUGrapher menu on displayed graphs page


REPORT PROBLEMS

Please contact <cuflow-users@columbia.edu> to get help with CUFlow.