Employing Latent Semantic Analysis to Detect Malicious Command Line Behavior by Jonathan Woodbridge.
From the post:
Detecting anomalous behavior remains one of security’s most impactful data science challenges. Most approaches rely on signature-based techniques, which are reactionary in nature and fail to predict new patterns of malicious behavior and modern adversarial techniques. Instead, as a key component of research in Intrusion Detection, I’ll focus on command line anomaly detection using a machine-learning based approach. A model based on command line history can potentially detect a range of anomalous behavior, including intruders using stolen credentials and insider threats. Command lines contain a wealth of information and serve as a valid proxy for user intent. Users have their own discrete preferences for commands, which can be modeled using a combination of unsupervised machine learning and natural language processing. I demonstrate the ability to model discrete commands, highlighting normal behavior, while also detecting outliers that may be indicative of an intrusion. This approach can help inform at scale anomaly detection without requiring extensive resources or domain expertise.
…
This is very cool and a must-read for all sides of cybersecurity.
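To make the idea concrete, here is a minimal sketch of latent semantic analysis applied to command histories. It is not Jonathan's actual pipeline. The toy sessions, the two user roles, the choice of k = 2, the use of raw command-name counts (no TF-IDF weighting, no arguments), and every function name are assumptions made for illustration. NumPy is assumed to be installed.

```python
import numpy as np

# Hypothetical toy data: each session is the list of command names a user ran.
dev_sessions = [
    ["git", "vim", "python", "git"],
    ["git", "vim", "make", "git"],
    ["cd", "git", "vim", "python"],
]
backup_sessions = [
    ["rsync", "tar", "scp"],
    ["tar", "rsync", "scp", "df"],
    ["df", "rsync", "tar"],
]
corpus = dev_sessions + backup_sessions

vocab = sorted({cmd for session in corpus for cmd in session})
index = {cmd: i for i, cmd in enumerate(vocab)}


def count_vector(session):
    """Bag-of-commands counts; commands outside the vocabulary are ignored."""
    vec = np.zeros(len(vocab))
    for cmd in session:
        if cmd in index:
            vec[index[cmd]] += 1.0
    return vec


# Term-by-session matrix: one row per command, one column per session.
A = np.column_stack([count_vector(s) for s in corpus])

# LSA step: a truncated SVD keeps the k strongest latent dimensions.
k = 2  # assumed; a real corpus would need a tuned value
U, S, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]


def embed(session):
    """Project a session into the k-dimensional latent space."""
    return U_k.T @ count_vector(session)


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


# The developer's profile: the mean of their embedded sessions.
dev_profile = np.mean([embed(s) for s in dev_sessions], axis=0)


def similarity_to_dev(session):
    """Higher means the session looks more like the developer's history."""
    return cosine(embed(session), dev_profile)


typical = similarity_to_dev(["git", "vim", "python"])
unusual = similarity_to_dev(["tar", "scp", "rsync"])
print(f"typical session: {typical:.2f}, backup-like session: {unusual:.2f}")
```

In this toy setup a session of bulk-copy commands scores low against the developer's profile, yet the same session would score high against the backup administrator's profile, which is why the choice of profile matters to both sides.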
Jonathan’s post takes the perspective of a defender: how do you detect “malicious” command line behavior?
Equally useful is the opposite question: what profiles do you mimic in order to avoid being detected as engaging in “malicious” command line behavior?
For example, do you mimic the profile of the sysadmin who is in charge of backups, since their “normal” behavior will be copying files in bulk and/or running scripts that accomplish that task?
Or, for that matter, how do you build up and possibly modify profiles over time, by running commands in the user’s absence, to avoid detection?
Opportunity is knocking. Are you going to answer the door?
I first saw this in a tweet by Kirk Borne.