Abstract
Network traffic analysis is applied to detect intrusions and manage application traffic. Continuous batch network traffic analysis is a computationally demanding task. Because traffic intensity naturally varies between peaks and troughs, a network analysis cluster may have to be severely over-dimensioned to support 24/7 continuous capture and processing of packet blocks. In this paper, we characterize the computational requirements of network traffic packets under several conditions; this characterization is a useful tool for generating network workloads in simulated scenarios. Our target MapReduce jobs are map-intensive, including string-matching-based virus and malware detection. We present an architecture for a Hadoop-based network analysis solution that includes a scheduler, report on using this approach in a small cluster, and show scheduling performance results obtained through simulation. The scheduler considers a cloud-based traffic analysis solution that bursts traffic to the cloud to overcome local resource limitations. The results show that we can reduce the amount of traffic burst out to the cloud by up to 50% while still accomplishing continuous batch traffic analysis with run times comparable to those of a single job.