With artificial intelligence (AI) and machine learning (ML) driving business innovation and operational efficiencies, understanding data ethics is paramount to leveraging AI and ML for successful results.
The promise of AI & ML
AI promises to deliver considerable business benefit: IDC estimates that $52 billion will be spent on it annually by 2021. Companies around the globe are exploring ways to feed the right data into their AI solutions to reduce costs, meet regulatory demands, deliver an enhanced customer experience, and innovate. Getting the capturing, processing, managing and storing of that data right, however, is not as straightforward.
The Cambridge Analytica scandal brought the issue of data ethics to the headlines, particularly in the context of social media platforms, but other, more subtle concerns are being raised within the technology industry. For example, can an AI system learn morality and, if so, which set of morals should it learn? Morals differ dramatically from culture to culture, as a recent Massachusetts Institute of Technology (MIT) experiment showed. Others are asking whether the conscious and unconscious biases of those who assemble an AI solution will end up baked into that solution.
Think tanks are focusing on defining what the ethical treatment of data should look like. The Information Accountability Foundation recently published a paper that asks probing questions about risks and benefits around data ethics within organisations. The Center for Information Policy Leadership has published a report that also examines data ethics issues.
Google recently issued its own AI principles and we should expect other technology companies to follow suit with their own data ethics policies over the next year or two. Investors may not be asking for data ethics policies today, but they will be soon. The reputational risk from a data ethics failure can destroy considerable shareholder value overnight.
Embedding data ethics
Businesses need to avoid a ‘black box’ situation by becoming true data citizens. Understanding data, including how it is handled and managed, will ultimately help businesses maintain control and adhere to data ethics guidelines. While uncertainty still surrounds data ethics, there are five areas of clarity that every business should understand and act on:
Take a proactive approach to data ethics – Companies often find themselves focusing only on compliance with regulatory requirements. This can be a mistake. When a data-focused problem arises and no data governance programme is in place, businesses will be on the back foot, and unpreparedness often leads to poor business choices. By keeping a proactive mindset when it comes to data ethics, and asking, “what might we need to solve next?”, businesses will remain more resilient to the data challenges that lie ahead.
Understand the data journey – When thinking about how to treat data ethically, think through the entire journey the data will make. This includes, for example, the data governance framework, data application, model building, model validation, and model assignment. It’s important to consider issues such as data quality, data transparency, and data privacy at every stage the data passes through.
Explain the importance of good data to the business – The business often needs educating on why issues such as data quality and data ethics are so important. Neither topic should be seen as something separate from the business. The business needs to be involved in setting policies and owning compliance. Make the connection between ethics, transparency, and what matters to the business, such as delivering shareholder value and driving innovation.
Involve a wide range of stakeholders – In any new business project that involves using data, such as a new retail-focused AI solution, it helps to have more than just the CDO or CIO look at the potential data issues. From the IT team and analysts to the CEO, all employees need to be data-aware and understand the issues. A diverse group of people looking at potential data ethics issues in a project brings a wider variety of perspectives. A fuller, 360-degree view of the client increases the likelihood that any challenges will be flagged early on.
Monitor data uses on an ongoing basis – It is not enough to just validate the use of data at the beginning of a project, the development of a model, or the creation of an AI solution. As business circumstances evolve, so too may the uses of the data. Check in regularly that both the data and the system that is using it are continuing to deliver value and are performing as expected.
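As one illustration of what a regular check-in on data could look like in practice, the sketch below compares a recent batch of values against the batch a model was originally validated on and flags a large shift for human review. It is a minimal example under stated assumptions, not a recommended standard: the function names, the sample figures, and the 0.5 threshold are all hypothetical, and a real monitoring programme would look at far more than a single average.

```python
from statistics import mean, stdev


def drift_score(baseline, current):
    """Shift of the current mean from the baseline mean,
    measured in baseline standard deviations."""
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation; use a different check")
    return abs(mean(current) - mean(baseline)) / spread


def needs_review(baseline, current, threshold=0.5):
    """True when the shift exceeds the threshold.
    The 0.5 default is an illustrative assumption, not an industry rule."""
    return drift_score(baseline, current) > threshold


# Hypothetical data: customer ages seen at validation time and in two later batches.
baseline_ages = [34, 41, 29, 38, 45, 31, 36, 40]
stable_ages = [35, 39, 30, 37, 44, 33, 36, 41]
shifted_ages = [52, 58, 49, 61, 55, 50, 57, 60]

for label, batch in [("stable", stable_ages), ("shifted", shifted_ages)]:
    status = "review" if needs_review(baseline_ages, batch) else "ok"
    print(f"{label}: {status}")
```

A flag from a check like this is a prompt for people to look at whether the data is still being used as originally intended, not a verdict in itself.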
In the next year, many organisations will prioritise taking the next steps in creating a data ethics culture. Working together as data citizens across a business and as a wider cross-company initiative, we can all help shape each other’s journeys. We need to collaborate, share ideas and update each other on the progress each of our organisations is making in promoting a more ethical culture around its data resources.
Data ethics is undoubtedly still a work in progress, but it is one that could not only help companies profit but also generate real societal gains.
Stijn is the co-founder and CTO of Collibra and leads the company’s global product organisation with a focus on driving data governance technology innovation. Prior to co-founding Collibra, Stijn was a senior researcher at the Vrije Universiteit of Brussels, a leading semantic research centre in Europe, where he focused on application-oriented research in semantics. Stijn is a sought-after expert resource, industry speaker, and author on the topic of data governance and semantics. Stijn holds a Master of Science degree in Information Technology and a Master’s degree in Artificial Intelligence from Katholieke Universiteit Leuven and a Postgraduate in Industrial Corporate Governance from Europese Hogeschool Brussel.