When you’ve finished reading this article, you are going to go to a café and buy a latte and a chocolate eclair. After that, you’re going to spend three hours logged on to your computer at work.
You will check your Facebook feed seven times, maybe shop for and purchase cosmetics, and perhaps WhatsApp your brother in France.
At 5 o’clock you will leave the office and take 1,137 steps along the Grand Canal, maintaining an average heart rate of 95bpm. You’ll stop at the Exo garage between 5.15pm and 5.25pm and buy a bottle of red wine, probably Cabernet Sauvignon.
‘I can see the patterns in your behaviour’
How do I know this about you?
Because you leave a trail – at cash registers, on CCTV cameras, on your phone, your Fitbit, your desktop computer. And because it’s payday, and you always buy cakes, cosmetics and Cabernet Sauvignon on payday, I can see the patterns in your behaviour from the data trail that you leave everywhere you go.
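To make the idea of a "pattern" concrete, here is a minimal sketch of how recurring payday purchases might be picked out of a data trail. Everything in it is an assumption made for illustration: the baskets are invented, and `habitual_items` and its `min_share` threshold are hypothetical names, not a description of any real retailer's system.

```python
from collections import Counter

# Hypothetical data: what one shopper bought on each of four paydays.
payday_baskets = [
    {"eclair", "cosmetics", "cabernet sauvignon"},
    {"eclair", "cabernet sauvignon"},
    {"latte", "eclair", "cabernet sauvignon"},
    {"cosmetics", "cabernet sauvignon", "eclair"},
]


def habitual_items(baskets, min_share=0.75):
    """Return the items that appear in at least `min_share` of the baskets."""
    # Each basket is a set, so an item is counted at most once per payday.
    counts = Counter(item for basket in baskets for item in basket)
    return sorted(
        item for item, n in counts.items() if n / len(baskets) >= min_share
    )


print(habitual_items(payday_baskets))
```

With this invented data, the eclair and the Cabernet Sauvignon turn up on every payday, so they pass the threshold; cosmetics appear on only half of them and do not. Real systems are far more elaborate, but the underlying move is the same: count what recurs, and treat what recurs as a prediction.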
What’s the word that springs to mind here? Creepy? Maybe so, but it’s not some dystopian future. It’s the world we now live in, and however creepy it might seem, there is also a huge opportunity. To take advantage of this opportunity, however, ordinary people need to start paying attention to the trail of data that they leave behind them.
Companies that collect parts of your data trail, or the data trails of people a little bit like you – same age, gender, social class – already use that data to make choices for you. Our major banks are building algorithms around customer data that will decide whether you get a mortgage or credit approval, based on data they have about other people.
Every online advertisement you see is tailored to the interests observed within your data trail. Political parties use data about you to tailor the messages you receive. In the US, data is now being used to decide whether prison inmates coming up for parole should stay behind bars or walk free. Based on a set of variables (age, gender, type of crime, length of sentence) how likely is this person to reoffend? Feed the data into the algorithm and out comes the decision that determines your liberty.
A data scientist, but also a human
I’m a data scientist. I build algorithms, based on data, that power tools like the ones described above. I’m also a human being and I’m not inured to the ‘creepiness’ of the situations I’ve just described. However, I am excited by the possibilities of this science.
The application of data science offers potential for improvements in our lives, such as personalised medicine. People are different. Increasingly we are finding that different individuals react differently to the same treatments. We are already using data science in an effort to tailor medicine and treatments to different individuals. The overall aim, however, is disease prevention.
Data science could potentially hold the key to unlocking the earliest signs of injury or disease in your body, long before they start to become a problem. But that’s not all. Think about all the problems we face in society now, and how the wise use of data for better planning and efficiency might help to solve them.
Smart Cities technologies and data analysis can help with the planning of housing, traffic management, energy efficiency and problem prevention. In education, we can use data to ensure that we have the right schools in the right areas before the crises emerge.
It can teach us a lot
When we talk about college drop-out rates we should note that some universities already use algorithms to identify students who are at risk of falling behind and to provide tailored help to them. We can use data to plan for massive world events such as the refugee crisis. Big data holds a lot of solutions. We just need to come up with the right question.
The data matrix is getting deeper and wider – we can collect more data about more parts of our lives than ever before. And there’s no going back: we can’t turn off the data trail.
What we can do is make sure that ordinary citizens are not in the dark about the patterns they create and the ways those patterns are interpreted and used. We have some way to go before people have real control over their data, but it starts with education.
The first step is to recognise that data in itself is nothing to be scared of, but it is something to be aware of. It is the responsibility of all of us who work in publicly funded data science research (there are hundreds of us in the Insight Centre for Data Analytics) to reach out to the public and inform them of the possibilities and the potential drawbacks of this exciting field of research.
Your next move should be to be aware of the data trail you leave behind you, and what it is being used for.
Brian Mac Namee, who is a lecturer at UCD and a researcher with the Insight Centre for Data Analytics, will deliver the inaugural Royal Irish Academy Public Engineering and Computer Science Lecture series in 2016. The series, entitled ‘Show me your data and I’ll show you who you are’, started in Dublin in March, and moves to Cork, Galway, and Derry later in the year.
For more details, and to register, visit here. Brian is also co-author of “Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies” published by MIT Press in 2015 (www.machinelearningbook.com).