The value of data science in security
Data science has been used to improve business processes and operations for some time – but it also offers a way to boost security capabilities.
If data is the oil of the modern economy, businesses may find themselves sitting on vast reserves, unaware of the untapped value they hold, or uncertain of the best way to exploit such resources.
Data science offers businesses a way to put the data they hold to work improving their processes and operations. In the field of security, it enables them to query diverse datasets and use the information they've gathered to block would-be attacks, investigate potential threats, and automate security practices.
Businesses have been expanding their use of data science for some years, but more recent advances in Big Data analytics, machine learning and artificial intelligence have accelerated the trend by allowing businesses to act on the insights generated.
For most businesses, the foundations of data science are already in place. Rule-based approaches are standard in many aspects of security, for example in basic anomaly detection, where alerts are created for a security team to investigate. With their flat architecture and ability to provide context, data lakes have also helped companies expand the amount of data available for security systems to interrogate.
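To make the rule-based starting point concrete, here is a minimal sketch of the kind of check that raises an alert for a security team. Everything in it is an assumption made for illustration: the `LoginEvent` fields, the threshold of five failed attempts and the list of expected countries are hypothetical, not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass
class LoginEvent:
    user: str
    failed_attempts: int  # failed logins in the last hour (hypothetical field)
    country: str          # two-letter country code of the source address


# Hypothetical rule parameters; a real deployment would tune these.
MAX_FAILED_ATTEMPTS = 5
EXPECTED_COUNTRIES = {"GB", "IE"}


def rule_based_alerts(events):
    """Return (user, reason) pairs for every event that breaks a fixed rule."""
    alerts = []
    for event in events:
        if event.failed_attempts > MAX_FAILED_ATTEMPTS:
            alerts.append((event.user, "too many failed logins"))
        if event.country not in EXPECTED_COUNTRIES:
            alerts.append((event.user, "login from unexpected country"))
    return alerts


events = [
    LoginEvent("alice", 1, "GB"),
    LoginEvent("bob", 9, "GB"),
    LoginEvent("carol", 0, "BR"),
]
for user, reason in rule_based_alerts(events):
    print(f"ALERT {user}: {reason}")
```

Rules like these are easy to explain and audit, but every threshold is fixed by hand, which is the limitation that more automated approaches aim to address.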
Data science builds on these fundamentals by allowing organisations to make use of the datasets they have in increasingly complex and automated ways. Machine learning and analytics can be put to work on organisations' raw data to identify patterns that are out of the ordinary, and then suggest or take the appropriate action to remediate external attacks, insider threats, or other events of concern.
While larger companies may now be investigating data science or have already set up a practice within the organisation, only a small proportion will have a dedicated security wing. This is likely to be because the jury's still out on whether data science should be a centralised or line-of-business set-up.
However, data science has a lot to offer the IT department, wherever the data science team sits in the organisation. One of the key advantages of well-deployed data science is its strong focus on demonstrating the business value of its projects. That leaves the board in no doubt as to the return on investment, and gives those in the IT department valuable ammunition when asking for more funding.
One of the most commonly cited examples of data science for security purposes comes from the banking and insurance industries. There, data science brings together a combination of analytics and machine learning to detect fraudulent transactions. By scanning various datasets relating to user and network behaviour, companies can detect anomalies and either respond or generate an alert, prioritised according to threat level, for security professionals to investigate further.
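As a rough illustration of detecting an anomaly and prioritising the alert by threat level, the sketch below scores each transaction against the same user's past amounts. It is a simple statistical stand-in rather than the method any bank is known to use: the z-score approach, the cut-offs of 3 and 6, and all function and field names are assumptions chosen for the example.

```python
from statistics import mean, stdev


def score_transaction(history, amount):
    """How many standard deviations `amount` lies from the user's past amounts.

    `history` needs at least two values for a standard deviation to exist.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return 0.0 if amount == mu else float("inf")
    return abs(amount - mu) / sigma


def threat_level(score):
    """Map a score to a priority label; the cut-offs are illustrative only."""
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return None  # not unusual enough to alert on


def prioritised_alerts(histories, transactions):
    """Return alerts for unusual transactions, most urgent first."""
    order = {"high": 0, "medium": 1}
    alerts = []
    for user, amount in transactions:
        score = score_transaction(histories[user], amount)
        level = threat_level(score)
        if level is not None:
            alerts.append(
                {"user": user, "amount": amount, "level": level, "score": score}
            )
    # Sort by threat level first, then by how extreme the score is.
    alerts.sort(key=lambda a: (order[a["level"]], -a["score"]))
    return alerts


histories = {
    "alice": [20, 25, 22, 18, 24],
    "bob": [100, 110, 90, 105, 95],
}
transactions = [("alice", 23), ("bob", 130), ("alice", 500), ("bob", 105)]
for alert in prioritised_alerts(histories, transactions):
    print(alert["level"], alert["user"], alert["amount"])
```

A production system would draw on far more signals than the transaction amount, but the shape is the same: score each event against expected behaviour, then rank what is passed to the security team.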
This basic premise can be put to work for countless security applications: detecting attempted intrusions on a company network, identifying users acting against corporate policies, or managing risk. And, thanks to machine learning, models and algorithms can be refined further over their lifetime – reflecting changes in staff behaviour, alterations in the technology using the network, or evolution in the threat landscape – to reduce the number of unnecessary alerts that staff are called on to look into.
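The idea of a model that adjusts as behaviour changes can be sketched with a rolling baseline. This is a deliberately simple statistical substitute for a trained machine learning model, offered only to show the principle: the `RollingBaseline` class, its window size and its three-standard-deviation threshold are all hypothetical choices.

```python
from collections import deque
from statistics import mean, stdev


class RollingBaseline:
    """Flags values far from the recent norm, then folds them into that norm."""

    def __init__(self, window=50, k=3.0, min_history=5):
        self.values = deque(maxlen=window)  # only the most recent readings
        self.k = k                          # alert beyond k standard deviations
        self.min_history = min_history      # stay silent until enough data

    def observe(self, value):
        """Return True if `value` is anomalous against the current window."""
        is_anomaly = False
        if len(self.values) >= self.min_history:
            mu = mean(self.values)
            sigma = stdev(self.values)
            is_anomaly = sigma > 0 and abs(value - mu) > self.k * sigma
        # Always learn from the new value, so a lasting change in behaviour
        # eventually becomes the new normal and stops generating alerts.
        self.values.append(value)
        return is_anomaly


# Daily megabytes uploaded by one user (made-up figures).
baseline = RollingBaseline(window=5)
for reading in [100, 102, 98, 101, 99]:
    baseline.observe(reading)

print(baseline.observe(200))  # a sudden jump against the old baseline
for reading in [202, 201, 199, 200]:
    baseline.observe(reading)
print(baseline.observe(198))  # the same level once it has become routine
```

The trade-off in this design is that learning from every value also lets a slow, deliberate attack shift the baseline, so a real system would need safeguards such as analyst feedback before accepting a new normal.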
However, as with any data science project, those in security can only advance with the right fuel – the appropriate data. With the falling cost of storing data and the increasing ease of gathering it, businesses may succumb to the temptation of collecting as much information as possible and holding on to it for as long as they can. With the advent of GDPR focusing minds on issues of data and consent, businesses may choose to examine how much of what they accumulate and keep is really necessary.
The first step for businesses looking to embark on a data science project is to identify the business need that it will address. From there, the data comes into its own: organisations need to ask whether they have the information required to generate the insight that will meet that need and, if not, develop a plan to obtain it.