MalPaCA Feature Engineering

A comparative analysis between automated feature engineering and manual feature engineering on network traffic

Abstract

Identifying novel malware and its behaviour enables security engineers to prevent attacks and protect users whose devices are on the network. MalPaCA is an algorithm that helps to understand the behaviour of network traffic by clustering uni-directional network connections, which can then be analyzed further to determine which label suits each malicious connection. When clustering connections, the features extracted from the packet information were chosen manually, based on the generalizability of the information and on research into common malware characteristics. Alternatively, the feature set can be extracted automatically with an autoencoder to enrich the representation of each packet in the network traffic. A comparison of an autoencoder-generated feature set with the hand-crafted feature set shows that the hand-crafted feature set represents malicious traffic with higher accuracy and more insightful explainability. The comparative experiment is run on the IoT-23 dataset, a network traffic capture from Avast’s AIC laboratory.
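
To make the idea of automated feature extraction concrete, the following is a minimal sketch of an autoencoder that compresses per-packet feature vectors into a smaller learned code. It is not the implementation used in this work: the four input columns, the code size of 2, the training settings, and the synthetic data are all assumptions chosen for illustration, and the function names `train_autoencoder` and `encode` are hypothetical.

```python
import numpy as np


def train_autoencoder(X, code_dim=2, epochs=1000, lr=0.2, seed=0):
    """Train a small autoencoder (tanh encoder, linear decoder) with
    plain batch gradient descent on the mean squared reconstruction error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.1, (d, code_dim))
    b_enc = np.zeros(code_dim)
    W_dec = rng.normal(0.0, 0.1, (code_dim, d))
    b_dec = np.zeros(d)
    losses = []
    for _ in range(epochs):
        # Forward pass: encode to the code, then reconstruct the input.
        H = np.tanh(X @ W_enc + b_enc)
        X_hat = H @ W_dec + b_dec
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error.
        g_out = 2.0 * err / (n * d)
        g_W_dec = H.T @ g_out
        g_b_dec = g_out.sum(axis=0)
        g_pre = (g_out @ W_dec.T) * (1.0 - H ** 2)  # derivative of tanh
        g_W_enc = X.T @ g_pre
        g_b_enc = g_pre.sum(axis=0)
        # Gradient descent update.
        W_enc -= lr * g_W_enc
        b_enc -= lr * g_b_enc
        W_dec -= lr * g_W_dec
        b_dec -= lr * g_b_dec
    return (W_enc, b_enc), losses


def encode(X, params):
    """Map raw per-packet features to the learned code."""
    W_enc, b_enc = params
    return np.tanh(X @ W_enc + b_enc)


# Synthetic stand-in for per-packet features (ASSUMPTION: 4 numeric columns
# driven by 2 hidden factors); real input would come from a traffic capture.
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 4))
packets = latent @ mixing + 0.05 * rng.normal(size=(200, 4))

# Standardize each column so no single feature dominates the loss.
packets = (packets - packets.mean(axis=0)) / packets.std(axis=0)

params, losses = train_autoencoder(packets)
codes = encode(packets, params)  # one learned feature vector per packet
print(codes.shape, losses[0], losses[-1])
```

In a comparison like the one described above, the rows of `codes` would take the place of the hand-picked packet features as input to the clustering step; how the learned codes are aggregated per connection is left open here.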