Big Data in Security – Part III: Graph Analytics by Levi Gundert.
The post takes the form of an interview with Michael Howe and Preetham Raghunanda.
You will find two parts of the exchange particularly interesting:
You mention very large technology companies, and obviously Cisco falls into this category as well. How is TRAC using graph analytics to improve Cisco Security products?
Michael: How we currently use graph analytics is an extension of work we have been doing for some time. We have been pulling data from different sources, like telemetry and third-party feeds, to look at the relationships between them, which previously required a lot of manual work: we would do analysis on one source, do analysis on another, and then pull the two together. Because of the benefits of graph technology, we can now shift that work to a common view of the data and give people the ability to quickly access all the data types, with minimal overhead, using one tool. Rather than having to query multiple databases or different types of data stores, we have a polyglot store that pulls data in from multiple types of databases to give us a unified view. This opens two avenues of investigation. First, security investigators can rapidly analyze data as it arrives in an ad hoc way (typically the pattern for security response teams), and response times drop dramatically because related information is easy to see in the correlations. Second, there are the large-scale data analytics: people with traditional machine learning backgrounds can apply algorithms that did not work on previous data stores, because those algorithms now run across a well-defined data type – the graph.
…
For intelligence analysts, being able to pivot quickly across multiple disparate data sets from a visual perspective is crucial to accelerating the process of attribution.
Michael: Absolutely. Graph analytics is enabling a much more agile approach from our research and analysis teams. Previously, when something of interest was identified, there was an iterative process: query, analyze the results, refine the query, wash, rinse, and repeat. That process now takes minutes or seconds instead of hours or days. We can quickly identify the known information, but more importantly, we can identify what we don't know. We have a comprehensive view that lets us spot data gaps and improve future use cases.
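To make the "common view" and the rapid pivoting from Michael's replies concrete, here is a minimal sketch. It uses Python and networkx rather than Titan, and the telemetry and feed records are invented for illustration; the point is only that once two sources land in one graph, a single traversal replaces separate queries against separate stores.

```python
import networkx as nx

# Hypothetical records from two separate stores; in the interview's setup
# these would come from telemetry and third-party feeds.
telemetry = [
    {"host": "10.0.0.5", "contacted": "evil.example.com"},
    {"host": "10.0.0.7", "contacted": "evil.example.com"},
]
feed = [
    {"domain": "evil.example.com", "campaign": "CampaignX"},
    {"domain": "other.example.net", "campaign": "CampaignX"},
]

# Fold both sources into one property graph: the "common view of the data".
g = nx.Graph()
for r in telemetry:
    g.add_node(r["host"], kind="host")
    g.add_node(r["contacted"], kind="domain")
    g.add_edge(r["host"], r["contacted"], source="telemetry")
for r in feed:
    g.add_node(r["domain"], kind="domain")
    g.add_node(r["campaign"], kind="campaign")
    g.add_edge(r["domain"], r["campaign"], source="feed")

# Pivot: from one suspicious domain, walk out two hops to find related
# hosts, the campaign, and sibling domains in a single traversal.
related = nx.single_source_shortest_path_length(g, "evil.example.com", cutoff=2)
for node, hops in sorted(related.items(), key=lambda kv: kv[1]):
    print(hops, g.nodes[node]["kind"], node)
```

The two-hop walk surfaces the sibling domain other.example.net, which neither source mentions alongside the compromised hosts: that is the "identify what we don't know" step in miniature.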
Did you catch the "…to a common view of the data…" caveat in the third sentence of Michael's first reply?
I say that not to deny the usefulness of Titan (the graph solution being discussed), but to point out that current graph solutions require normalization of the data.
For Cisco, that is a winning solution. Cisco is in a position to use a closed solution built on normalized data.
Importing data from, analyzing, and then returning results to heterogeneous clients could require a different approach.
As could legacy data that spans centuries, or data that crosses agencies, departments, and work groups.
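To see why normalization is the sticking point, consider a sketch of what has to happen before heterogeneous records can share one graph: each source needs a mapping into a common vocabulary of nodes and edges. The two record formats and the field names below are invented for illustration.

```python
from typing import Iterator, Tuple

# A normalized edge: (subject, relation, object).
Edge = Tuple[str, str, str]

# Two hypothetical agencies exporting the "same" fact in different shapes.
agency_a = [{"ip": "10.0.0.5", "fqdn": "evil.example.com"}]
agency_b = [{"source_host": "10.0.0.5", "dest_name": "EVIL.EXAMPLE.COM."}]

def normalize_a(records) -> Iterator[Edge]:
    for r in records:
        yield (r["ip"], "contacted", r["fqdn"].lower().rstrip("."))

def normalize_b(records) -> Iterator[Edge]:
    # Different field names, different casing, a trailing dot: all of it
    # must be reconciled before these records can live in one graph.
    for r in records:
        yield (r["source_host"], "contacted", r["dest_name"].lower().rstrip("."))

edges = set()
for norm, records in [(normalize_a, agency_a), (normalize_b, agency_b)]:
    edges.update(norm(records))

# One edge remains, not two: the duplicate collapsed only because both
# sources were forced through a shared schema first.
print(edges)
```

Every new source means another mapping function, and every disagreement between sources means a decision about whose vocabulary wins. That work is tractable inside one company; across heterogeneous clients, centuries of legacy data, or independent agencies, it is the hard part.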