Many of you who use NetFlow Analyzer, or are evaluating it, will want to know how the product stores its data and handles historical reporting.
NetFlow Analyzer processes the NetFlow data exported from the devices and stores it in the database for traffic analysis and reporting. Its flexible data storage pattern is intended to keep historical data forever without a heavy impact on hard disk space, while still providing real time reporting.
Data stored in NetFlow Analyzer will help you achieve the following:
1. Troubleshooting Network spikes
6. Understanding Traffic Pattern and much more.
NetFlow Analyzer stores two types of data: raw data and aggregated data.
Raw Data Storage:
Raw data is every individual flow exported from the monitored interfaces of the routers. All of these flows are stored in the NetFlow Analyzer database as raw data. Because raw data includes every flow from the routers, it consumes a lot of disk space, so it can be stored for a maximum of 30 days. How long raw data can actually be kept depends on the number of flows the product receives from the monitored routers. To make the calculation easier, the product itself can suggest how long you can store raw data, based on the flow rate and the free space available in the installation directory.
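To get a feel for how flow rate and free space trade off, here is a rough sketch of that kind of estimate. It is not the product's own calculation: the function name and the 100 bytes per stored flow figure are assumptions made purely for illustration, and only the 30 day cap comes from the description above.

```python
SECONDS_PER_DAY = 86_400


def estimate_raw_retention_days(free_bytes, flows_per_second,
                                bytes_per_flow=100, max_days=30):
    """Estimate how many whole days of raw flows fit in free_bytes.

    bytes_per_flow is a hypothetical average size of one stored flow
    record; max_days mirrors the 30 day cap on raw data storage.
    """
    bytes_per_day = flows_per_second * bytes_per_flow * SECONDS_PER_DAY
    if bytes_per_day <= 0:
        return max_days  # no flows arriving, so only the cap applies
    return min(max_days, free_bytes // bytes_per_day)


# Example: 100 GB free and 1000 flows per second
print(estimate_raw_retention_days(100 * 10**9, 1000))
```

Doubling the flow rate roughly halves the number of days that fit, which is why the suggested retention period changes as more interfaces are monitored.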
Raw data storage can be configured in the product under Product Settings –> Storage Settings –> Raw data Storage. There are also options to alert you when free disk space falls below a specified percentage and to automatically delete the oldest raw data when free disk space falls below a specified percentage.
The product uses raw data to generate ‘Troubleshoot’ reports and any report covering the last 2 hours. Raw data carries complete port level information, which helps in detailed analysis of traffic.
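The two disk space options can be pictured as a pair of thresholds. The sketch below is only an illustration of that idea, not the product's code: the function names, the action labels and the 20% and 10% defaults are all hypothetical.

```python
import shutil


def free_percent(path):
    """Percentage of free space on the disk that holds path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def disk_action(free_pct, alert_below=20.0, purge_below=10.0):
    """Decide what to do for a given free space percentage.

    alert_below and purge_below are hypothetical thresholds standing in
    for the percentages configured in the storage settings.
    """
    if free_pct < purge_below:
        return "purge_oldest_raw_data"
    if free_pct < alert_below:
        return "alert"
    return "ok"
```

For example, `disk_action(free_percent("."))` would check the disk that holds the current directory.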
Apart from raw data, NetFlow Analyzer stores aggregated data, which is kept forever in the database. Aggregation runs in the back end at the same time as raw data storage. The aggregated data holds the top 100 records for applications and for conversations in every 10 minute interval, and it is aggregated further as time goes on.
NetFlow data is aggregated to avoid high disk space usage without affecting reporting or performance. The aggregated data in NetFlow Analyzer is used for historical reporting, capacity planning and trend analysis.
The following explanation shows how application data in NetFlow Analyzer is aggregated and stored in various tables.
Aggregation Mechanism for Application data:
Older data is repeatedly rolled up into coarser time intervals (10 minute, 1 hour, 6 hour, 24 hour and weekly). The top 100 application records, ranked by octet value, are stored for every 10 minute interval. As time passes, this data is further aggregated into an hourly table.
Take the period 10:00 to 10:59 as an example. NetFlow Analyzer stores the top 100 applications for each 10 minute interval (10:00, 10:10, 10:20, 10:30, 10:40 and 10:50) in the 10 minute table. The 600 records from these six intervals are then aggregated, and the top 100 are moved to the 1 hour table under 10:00.
In the same manner, the hourly table is aggregated into the 6 hour table, then into the daily table and finally into the weekly table. The most recent data is stored with 10 minute granularity, and data older than 92 days is stored with 1 week granularity.
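The roll-up described above can be sketched in a few lines. This is a minimal illustration, assuming each interval is held as a mapping from application name to octets; the function names and the data layout are hypothetical, not the product's actual schema.

```python
from collections import Counter
import heapq


def top_n(octets_by_app, n=100):
    """Keep only the n applications with the highest octet counts."""
    return dict(heapq.nlargest(n, octets_by_app.items(),
                               key=lambda item: item[1]))


def roll_up(intervals, n=100):
    """Merge several finer-grained intervals into one coarser record set.

    Octets for the same application are summed across the intervals,
    then only the top n applications are kept.
    """
    totals = Counter()
    for interval in intervals:
        totals.update(interval)  # adds octets per application
    return top_n(totals, n)


# Two tiny 10 minute intervals rolled up, keeping the top 2 applications
ten_minute = [{"http": 100, "dns": 10}, {"http": 50, "ssh": 30}]
print(roll_up(ten_minute, n=2))
```

The same `roll_up` call would apply at every level: six 10 minute intervals into one hour, six hourly records into one 6 hour record, and so on.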
The 10 minute table holds the most recent data, and data older than 25 hours is cleaned up. The data is retained at each granularity as follows:
10 minute granular data is stored for 25 hours (beyond which the older data is deleted)
1 hour granular data is stored for 45 days
6 hour granular data is stored for 62 days
24 hour granular data is stored for 92 days
1 week granular data is stored forever
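The retention periods above can be written down as a small lookup. The sketch below returns the finest granularity still retained for data of a given age; how the product itself picks a table for a report is not described here, so treat the function and its name as an illustrative assumption.

```python
from datetime import timedelta

# Retention per granularity, taken from the list above
RETENTION = [
    ("10 minute", timedelta(hours=25)),
    ("1 hour", timedelta(days=45)),
    ("6 hour", timedelta(days=62)),
    ("24 hour", timedelta(days=92)),
]


def finest_granularity(age):
    """Return the finest granularity still kept for data of this age."""
    for name, keep_for in RETENTION:
        if age <= keep_for:
            return name
    return "1 week"  # weekly data is kept forever


print(finest_granularity(timedelta(days=50)))
```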
Conversations are aggregated and stored in the database for historic reporting in the same way as applications. Application, Source, Destination, Conversation and QoS reports that cover more than the last 2 hours are generated from the aggregated data. The granularity of the data shown changes with the time period you select.
1 Minute traffic Data Storage:
Apart from raw data and aggregated data, NetFlow Analyzer stores 1 minute traffic data, which is used for real time reporting. Traffic data is aggregated in the same way as described for application data. A traffic report for any time period shorter than 24 hours is generated with 1 minute granularity, which gives you a detailed picture of the traffic going IN and OUT.
One minute data storage can be configured in the product under Product Settings –> Storage Settings –> One Minute Data Storage Settings.
We hope this blog gives you a better understanding of the data storage pattern in NetFlow Analyzer and helps you use the product better.