As we move towards a digital economy, governments and industries are adopting AI to improve operational efficiency and user experience. To do so, conventional AI needs to collect data – lots of it! And a large proportion of this data is personal data such as names, telephone numbers, email and physical addresses, marital status, age, and so on.
But over the past decade, countries have been strengthening their privacy laws. Singapore’s Personal Data Protection Act 2012 (PDPA), the European Union’s General Data Protection Regulation (GDPR), China’s Cyber Security Law and the California Consumer Privacy Act (CCPA) are some of the regulations enacted to protect personal data. Not surprisingly, renowned companies such as Facebook, Capital One and even Google have been fined under these laws. In some cases, companies have lost data to security breaches or hacks, causing embarrassment and concern for governments, corporations, and customers or citizens.
Citizen Concerns Around Data Privacy
Yet, despite the presence of these regulatory frameworks, there is a growing lack of trust among citizens and consumers regarding how governments and organisations handle personal data. A survey on data privacy conducted by Pew Research Center in June 2019 indicated that 79% of American adults are concerned about how companies use their data, while 64% are concerned about how the government uses their data. A whopping 70% of adults think that their personal data is less secure now than it was five years ago.
Public awareness of data privacy rights has also been improving worldwide. The Cisco Consumer Privacy Study, conducted in May 2019 across 12 of the world’s largest economies in Europe, Asia Pacific and the Americas, showed that 84% of respondents care about privacy – and of these, 80% stated that ‘they are willing to act to protect it’.
While both surveys convey strong sentiments about data privacy, the findings also imply that awareness has not translated into a deeper understanding of privacy regulations. The Pew Research Center study showed that 63% of American adults have ‘very little or no understanding of the laws and regulations’ that are set to protect their privacy. A further 33% stated that ‘they have some understanding’, while only 3% claimed a good comprehension of these laws. In the Cisco Consumer Privacy Study, only approximately one-third of all respondents knew about the regulations.
The Importance of Building Trust
These findings illustrate that more needs to be done to gain consumer trust in the digital realm. If the majority of consumers do not trust how governments and organisations such as Google, Facebook, Microsoft and Amazon handle data, will the doors shut on the large AI community that depends on collecting data to help create a better world for all citizens? How could government agencies and organisations drive operational efficiency and better user experience if regulatory compliance requirements leave them unable to extract value and insights from anything but very limited datasets?
Lately, tech giants such as IBM, Google, Facebook and Microsoft have been researching and developing advancements towards a better AI world while remaining compliant with data privacy laws. Here in Singapore, institutions such as the National University of Singapore (NUS), the Nanyang Technological University (NTU) and the Agency for Science, Technology and Research (A*STAR) are similarly researching and developing privacy preservation technologies (PP technologies).
At A*STAR, a programme team named the Trusted Data Vault (TDV) Programme was formed to research and develop a suite of PP technologies – whether standalone or in combination – to be commercialised by industry partners.
The Role of Privacy Preservation Technologies
Through intensive research, technology development and industry collaboration, the TDV programme aims to champion the application of PP technologies to unlock the potential of AI. This will create new value for digital services while ensuring compliance with privacy regulations and high ethical standards.
At the heart of the TDV programme are the various PP technologies that have been garnering massive interest worldwide as a way to resolve the data privacy challenges stemming from the rapid advancement of digital technology and the spread of global privacy laws. The programme covers the research and development of federated learning (a term coined by Google back in 2016), homomorphic encryption (whose first fully homomorphic scheme was constructed by IBM’s Craig Gentry), secure multiparty computing, blockchain, and so on. PP technologies have progressively become a subject of interest for organisations pursuing ways to relieve the privacy concerns of consumers and fulfil the terms of privacy laws. In short, PP technologies aim to keep sensitive data secure and protected while it is being used, and to enhance the privacy of data when it is being analysed.
The traditional method used to preserve data privacy is anonymisation, where personally identifying information is disassociated from an individual’s record in a dataset. A dataset is anonymised through a combination of pseudonymisation (the replacement of clear identifiers with fictitious information) and de-identification.
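To make this concrete, here is a minimal sketch of pseudonymisation in Python. The field names and records are hypothetical, and a real system would also need to protect (or destroy) the lookup table:

```python
import secrets

# Toy pseudonymisation: replace direct identifiers with random tokens.
# The token-to-value mapping is kept separately; destroying it moves the
# process towards de-identification.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}  # hypothetical field names

def pseudonymise(records):
    lookup = {}    # token -> original value, stored apart from the dataset
    cleaned = []
    for record in records:
        out = {}
        for field, value in record.items():
            if field in DIRECT_IDENTIFIERS:
                token = secrets.token_hex(8)   # fictitious replacement
                lookup[token] = value
                out[field] = token
            else:
                out[field] = value             # non-identifying fields kept
        cleaned.append(out)
    return cleaned, lookup

records = [{"name": "Alice Tan", "email": "alice@example.com", "age": 34}]
cleaned, lookup = pseudonymise(records)
print(cleaned)   # identifiers replaced; 'age' retained for analysis
```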
This technique may have worked in the past, but it is not sufficient for dealing with the complexities of present technological advancements. Research has shown that anonymisation does not guarantee privacy. In some cases, researchers were able to re-identify individuals by using statistical techniques or by cross-referencing anonymised records with publicly available datasets.
As datasets become larger, more complex and more diversified, PP technologies have emerged as the most feasible solution to privacy issues. A patent analytics study conducted by the Intellectual Property Office of Singapore (IPOS) found that homomorphic encryption (HE), secure multiparty computing (MPC), differential privacy (DP) and federated learning (FL) have emerged as the most promising among the various PP technologies developed. These are the same technologies that institutions such as NUS, NTU and A*STAR are pursuing.
In summary (a toy code sketch of each follows the list):
- Homomorphic encryption (HE) encrypts data into a protected form, known as ciphertext, on which computation can still be performed; decrypting the result gives the same answer as computing on the original data, so data integrity is never compromised.
- Secure multiparty computing (MPC) is a cryptographic protocol that distributes computation across multiple parties such that no individual party can see the other parties’ data. MPC protocols enable data scientists and analysts to compliantly, securely and privately compute on distributed data without ever exposing or moving it.
- Differential privacy (DP) is simpler to deploy than MPC or HE: it introduces calibrated noise so that a query result cannot be used to infer much about any single individual, thereby providing privacy.
- Federated learning (FL) is an emerging PP technology: a machine learning technique that trains models without exchanging data between data owners and data consumers, preventing data leakage from the data owner’s premises.
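To make these ideas concrete, below are toy Python sketches of each technology. First, HE, illustrated with the Paillier cryptosystem, an additively homomorphic scheme. This is a sketch only: the primes are far too small for real security, and it is not necessarily the scheme used in the TDV programme:

```python
import math
import random

# Toy Paillier cryptosystem: multiplying ciphertexts adds the plaintexts.
# WARNING: demo primes only - a real deployment uses ~1024-bit primes.
p, q = 1009, 1013
n = p * q
n_sq = n * n
lam = math.lcm(p - 1, q - 1)   # Carmichael's lambda for n = p*q
mu = pow(lam, -1, n)           # modular inverse of lambda (valid for g = n + 1)

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:            # blinding factor must be co-prime to n
        r = random.randrange(1, n)
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    x = pow(c, lam, n_sq)
    return ((x - 1) // n) * mu % n

c1, c2 = encrypt(20), encrypt(22)
assert decrypt((c1 * c2) % n_sq) == 42    # computed on ciphertexts only
```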
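Next, MPC’s simplest building block, additive secret sharing: each party splits its private input into random shares, and only sums of shares are ever revealed. This is a toy protocol with a hypothetical scenario, not a full MPC framework:

```python
import random

Q = 2**61 - 1   # all arithmetic is modulo a large prime

def share(secret, n_parties):
    """Split a secret into additive shares that sum to it mod Q."""
    shares = [random.randrange(Q) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares

# Hypothetical: three hospitals want a total patient count without
# revealing their individual counts to one another.
inputs = [120, 340, 95]
all_shares = [share(x, 3) for x in inputs]

# Party i holds one share of every input; each share alone is just a
# uniformly random number and reveals nothing about any input.
partial_sums = [sum(col) % Q for col in zip(*all_shares)]
total = sum(partial_sums) % Q
assert total == sum(inputs)   # 555, computed without exposing any input
```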
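Third, DP via the classic Laplace mechanism. A counting query has sensitivity 1 (adding or removing one person changes the count by at most 1), so Laplace noise of scale 1/epsilon makes the answer epsilon-differentially private. A minimal sketch; real deployments also track a privacy budget across queries:

```python
import math
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise by inverting its CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1, u) * math.log(1 - 2 * abs(u))

def dp_count(values, predicate, epsilon=0.5):
    """Answer a counting query with epsilon-differential privacy."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)  # count sensitivity is 1

ages = [34, 41, 29, 67, 52, 45, 38]
print(dp_count(ages, lambda a: a > 40))   # noisy answer near the true count of 4
```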
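Finally, the idea behind FL, federated averaging: a server distributes a global model, each data owner trains it locally, and only the updated weights are averaged. This is a simplified sketch of Google’s FedAvg using a linear model and synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, epochs=20):
    """Train locally; only the updated weights leave the premises."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # mean-squared-error gradient
        w -= lr * grad
    return w

# Three clients hold private samples drawn from the same underlying model.
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    clients.append((X, y))

# Federated averaging: raw data never leaves a client; only weights move.
global_w = np.zeros(2)
for _ in range(10):                              # communication rounds
    local_ws = [local_update(global_w, X, y) for X, y in clients]
    global_w = np.mean(local_ws, axis=0)

print(global_w)   # approaches [2, -1] without pooling any raw data
```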
Of the four PP technologies outlined above, HE, MPC and FL are at the forefront of research and development in A*STAR’s TDV programme.
On the whole, industrial implementation of these PP technologies is still at an early stage. Patented inventions analysed by IPOS show that the financial sector is the top industry of interest, followed by healthcare, social media applications and logistics/supply chain.
On a global scale, patented inventions relating to PP technologies have been on the rise. From 2009 to 2018, over 23,000 PP-related inventions were published worldwide, with high annual growth of 18% over the most recent five years, according to IPOS. Given strong market demand, innovation in these technologies is expected to grow, even as more research and development work is needed to improve their capabilities and industrial applications.
The Role of Blockchain
A*STAR’s TDV programme also focuses on blockchain. As the literature shows, blockchain has grown into several variants since it was first mooted by Satoshi Nakamoto. It is now commonly classified into public blockchain, private blockchain, and consortium blockchain, sometimes known as federated blockchain.
A*STAR’s TDV Programme focuses on consortium/federated blockchain, which is deployed in a decentralised manner on hardware managed by different owners and in which authority is shared among members. A consortium/federated blockchain usually involves a group of enterprises collaborating to use blockchain technology to improve their businesses.
Ms. Angela Wang, the Programme Manager of TDV at A*STAR, says, “The big difference between public blockchain and consortium/federated blockchain is that anyone, even those you don’t trust, can join the public blockchain, while consortium/federated blockchain considers each member as a trusted partner to begin with.”
The future appears bright for the privacy landscape worldwide, as researchers and top tech companies continue to invest their resources in PP technologies to create secure digital platforms, products and services. This will give rise to a healthy ecosystem of digital trust between organisations and consumers.