In this talk, we will discuss the detection of three well-defined security problems (adversarial user behavior, lateral movement, and insider threats) using a relatively untapped dataset: shell and session commands. We'll discuss the machine learning (ML) techniques needed to analyze this data, present key research findings, and describe the effects of bias and the mitigations that help achieve higher accuracy. Additionally, we will explore techniques for safeguarding ML models based on this data.
The presentation will also outline a number of the tools used to develop these findings, including methods for analyzing and visualizing massive datasets comprising billions of Linux audit events. Finally, we'll cover advances in ML that can be leveraged to extract meaning from the data thrown into the data lake.