Malware Datasets

Malware datasets are crucial for developing and evaluating machine learning models that detect and classify malicious software. Current research focuses on improving model accuracy and explainability using deep learning (e.g., convolutional neural networks), classical sequence models (e.g., hidden Markov models), ensemble methods, and novel feature extraction methods (e.g., from audio, images, or antivirus scan reports). Advancing malware detection depends on high-quality, diverse, and realistically challenging datasets, and on robust methods for mitigating adversarial attacks and concept drift, the shift in malware characteristics over time that degrades models trained on older samples.

Papers