ChatGPT's OSS Malware Review – Henrik Plate, Endor Labs

May 8, 2023

Endor Labs conducted an experiment to see how well ChatGPT performs at OSS malware review. Henrik asked the AI to make 1,800 binary classifications, malicious or benign, of artifacts published on PyPI and npm. His conclusion is that large language model-assisted malware reviews are not yet a viable alternative to manual reviews, but they can serve as one additional signal. They can help automatically review the large number of malware signals produced by noisy detectors, but one inherent problem is their reliance on identifiers and comments to “understand” code behavior. Identifiers and comments are a valuable source of information for code written by benign developers, but adversaries can easily misuse them to evade detection of malicious behavior.

Guest(s): Henrik Plate
Categories: Interviews
