Privacy by Design

When a person shares data with an online service, how far should it be shared? On one hand, inappropriate data sharing can break down trust in a service. On the other hand, no one likes to type in the same information every time they make a request of the service, so some saving and storing of information is certainly desirable.

I started looking at the options for data sharing within a digital service as part of my research into privacy by design. The resulting model shows that data sharing is occurring in a complex technical environment where the needs of different parties (people and/or organizations) need to be kept in balance.

Figure 1 is the start of the data sharing journey. The end user has entered some data on their mobile device that is connected to a digital service.

Figure 1: data sharing – to the digital service

Data sharing scope labelled (1) represents the case where the end user’s data does not leave their mobile device or PC. The digital service processes the data locally and potentially shares anonymized results, or requests for action, with the connected digital service. This sort of pattern is rare because many digital services want to capture the raw data about the user for future processing. However, this approach is very useful where highly sensitive data is being used – such as when biometric information is being used to confirm identity. The digital service does not need the biometric information of the end user – it only needs to know that they are an authorized user. This type of pattern is an example of the data minimization recommended by privacy-by-design practices. It can be used as a mechanism to gain permission to process data that a person does not want to share.
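As a rough illustration of scope (1), the sketch below compares a biometric sample with the enrolled template on the device and passes on only the outcome. The function names, the feature vectors and the threshold are all hypothetical; real biometric matching is far more involved than a simple distance check.

```python
import math


def matches_locally(sample, enrolled, threshold=0.5):
    """Compare a fresh sample with the enrolled template on the device.

    Both arguments are hypothetical feature vectors of equal length.
    """
    return math.dist(sample, enrolled) <= threshold


def build_request(user_id, sample, enrolled):
    """Build the message for the digital service.

    Only the user id and the outcome of the local check are included;
    the raw biometric data never leaves the device.
    """
    return {"user_id": user_id, "authorized": matches_locally(sample, enrolled)}


request = build_request("fred", [0.11, 0.52, 0.93], [0.10, 0.50, 0.90])
print(request)
```

The digital service receives a yes/no answer it can act on, which is all it needs to know.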

In the more typical case, data is sent from the end user’s mobile device to the digital service. The act of sending the data brings in data sharing scope (2). Between the end user’s mobile device and the digital service is a multitude of network service providers, each able to see the packets of information flowing. These service providers can see how much data is flowing, how often, and between which devices and services. Sometimes that is enough to guess what is going on. If the data exchange between the mobile device and the digital service is not secured, then the contents of the information packets are also being shared.

Figure 2 considers what happens inside the digital service. There are three data sharing scopes shown, labelled (3), (4) and (5). Each represents data that is only seen and used by the digital service, but for different lengths of time.

Figure 2: data sharing – inside the digital service

Data sharing scope (3) represents data that is used for a single transaction. You can think of a transaction as a business exchange – such as selecting items to purchase and then confirming the order and paying for it. Within the transaction are a number of exchanges of data. The digital service may refer to other data it has stored from earlier transactions.

Some of the data from each of the end user’s transactions is often kept for future use. Data sharing scope (4) is data kept for the exclusive use of the digital service when working on behalf of this specific end user. It includes commonly used information such as their name, contact details, preferences and historical information about their transactions. This end user data is typically removed when the end user closes their account or removes their profile. Data sharing scope (5) covers data that is used by the digital service for any user. For example, a navigation system may use the data from all users to detect where congestion is occurring and then use that insight to guide an individual user.

Often a digital service provider supports multiple digital services that the end user can sign up to either incrementally or as a package. The end user is encouraged to use the broader range of services because information they have already entered is pre-populated in the other digital services. This data sharing is shown in Figure 3.

Figure 3: data sharing – digital service packages

Data sharing scope (6) shows the sharing of data between the digital services during a transaction. This may be to offer the end user additional capabilities as they use the service.

Data sharing scope (7) covers the sharing of end user data between the digital services for the specific user and data sharing scope (8) covers the shared data that is made available to all digital services in the package for all users.

Some digital service providers have many digital services that are grouped in packages. Figure 4 shows data sharing within a digital service provider with multiple digital service packages.

Figure 4: data sharing – digital service platforms

The data sharing between multiple digital service packages follows a similar pattern to that within a single digital service package. There is sharing during a transaction – data sharing scope (9) – and sharing of user data across digital services from different packages whenever the specific user is being served – data sharing scope (10). Data sharing scope (11) covers data shared with all digital services from the digital service provider, irrespective of the package they are in.

I have called it out as a separate set of sharing scopes because these packages typically represent different lines of business. So there may be a package of digital services for banking, another for insurance, another for loans. From the end user’s perspective, just because these packages are owned and operated by the same organization does not mean that the sharing of information between them is always acceptable.

Figure 5 adds the complexity brought in by the use of external cloud platforms. Cloud platforms are particularly complex environments for understanding data sharing because there are often multiple organizations involved in supporting a cloud-based digital service. The result is that deep in the technical implementation, out of the sight or control of the end user, their data is being processed and stored on computer systems owned by organizations unknown to them.

The cloud platform provider is the organization that provides the data centers and the infrastructure (computers, operating systems and basic services for running a digital service).

Figure 5: data sharing – cloud platforms

The cloud platform provider can see the number and types of requests being received by the digital services they host. If the digital service provider does not properly secure and encrypt the data stored with the cloud provider using their own private encryption keys, they are inadvertently sharing their data with the cloud provider.

Digital service providers are not restricted to using a single cloud platform either. Figure 5 shows a digital service package that spans two cloud platforms. This means that data shared with or inferred by the cloud platform provider – data sharing scope (12) – may be received by multiple organizations.

Figure 6 shows the sharing of data with third parties. Data sharing scope (13) indicates the sharing of data within a transaction – for example, a call to a payment service during a purchase. Data sharing scope (14) is where accumulated information about the end user is passed to a third party. This typically occurs when the end user needs an account on the third party’s digital service platform for their digital services to operate properly. Data sharing scope (15) covers more general sharing of data with third parties by the digital service provider. This may include personal data about the end users but is more likely to be aggregated information about the digital service’s operation and the volumes of different types of requests it is processing.

Figure 6: data sharing – third parties

Once the data passes to a third party, the digital service provider loses control of the data and must rely on contracts and other legal obligations to govern how the third party uses it. This is why no details are shown on the diagram as to what happens to the data once it is received from the digital service provider.

Finally, Figure 7 shows the sharing of data with everyone – or at least, anyone who signs up to a service or open data site. At this point there is a total loss of control over how this data will be used, combined and shared going forward.

Figure 7: data sharing – public

Data sharing scope (16) shows the digital service provider making data public; (17) is the cloud provider publishing data and (18) is a third party that received data from the digital service provider that is making the data available for public use.

Public data sharing may of course be under the control of the individual – for example, when they send a message to social media to publicize that they have purchased something or have achieved a goal. It may also be an intentional behaviour of the service. However, for many digital service providers and cloud service providers, the presence of a particular data set in the public domain may be the first indication they have that they have had a data breach.

So what do these data sharing scopes teach us, apart from the fact that this is a complex topic :)? This model is firstly an analysis tool. It provides a scheme for a digital service provider to characterize the sharing scopes for each of the data sets they manage. This analysis may identify additional opportunities to share data, and places where additional security may be required. Secondly, this model provides a framework to explain to an end user how their personal data is being shared.
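One way to use the model as an analysis tool is to record, for each data set, the sharing scopes it falls into and then flag the data sets that are visible outside the digital service provider. The sketch below numbers the scopes as in the figures; the scope names, the example data sets and the choice of which scopes count as external (the network in scope 2, plus scopes 12 to 18) are my own illustrative assumptions.

```python
from enum import IntEnum


class Scope(IntEnum):
    """The data sharing scopes, numbered as in Figures 1 to 7."""
    ON_DEVICE = 1
    NETWORK = 2
    TRANSACTION = 3
    USER = 4
    ALL_USERS = 5
    PACKAGE_TRANSACTION = 6
    PACKAGE_USER = 7
    PACKAGE_ALL_USERS = 8
    PROVIDER_TRANSACTION = 9
    PROVIDER_USER = 10
    PROVIDER_ALL_USERS = 11
    CLOUD_PLATFORM = 12
    THIRD_PARTY_TRANSACTION = 13
    THIRD_PARTY_USER = 14
    THIRD_PARTY_GENERAL = 15
    PUBLIC_BY_PROVIDER = 16
    PUBLIC_BY_CLOUD = 17
    PUBLIC_BY_THIRD_PARTY = 18


# Assumption: scopes where another organization (or everyone) can see the data.
EXTERNAL = {s for s in Scope if s == Scope.NETWORK or s >= Scope.CLOUD_PLATFORM}

# Hypothetical catalogue mapping each data set to its sharing scopes.
catalogue = {
    "fingerprint": {Scope.ON_DEVICE},
    "order_history": {Scope.USER, Scope.PACKAGE_USER},
    "payment_details": {Scope.NETWORK, Scope.TRANSACTION,
                        Scope.THIRD_PARTY_TRANSACTION},
    "traffic_statistics": {Scope.ALL_USERS, Scope.PUBLIC_BY_PROVIDER},
}


def needs_review(catalogue):
    """Return the names of data sets shared beyond the provider, sorted."""
    return sorted(name for name, scopes in catalogue.items()
                  if scopes & EXTERNAL)


print(needs_review(catalogue))
```

The same catalogue could be read the other way round, looking for data sets whose scopes are narrower than they need to be.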

In summary, understanding the data sharing scopes helps digital service providers design the optimal use of the data they are capturing and where to secure it in order to balance the needs of their end users’ privacy and the breadth of services that could be offered.

Photo: Rhodopi Mountains, Bulgaria

Abstract:

Almost all data that is generated for our digital economy is about individuals, and data-driven services typically provide targeted, customised services to people as they go about their daily lives. How does an organization whose business is built around data ensure that their work is ethical?

Privacy is a very personal perspective and it is contextual. We tend to have fewer concerns about our privacy when we deal with an organization that we trust. More importantly, when trust is present, we are more likely to grant access to our data and allow the organization greater license to process it. Maintaining this trust is an essential part of a digital economy.

To most people, today’s digital technology is baffling. They use the technology and see its benefit – but when they hear about cyber attacks; identity theft; the buying and selling of their data; phishing, ransomware and other scams, they have no foundation on which to judge the size or seriousness of the problem.

Emerging Regulation

In answer to these threats, countries around the world are passing legislation that is aimed at protecting their citizens from the inappropriate and careless processing of their data.

For the European Union we have the General Data Protection Regulation (GDPR). The GDPR defines a comprehensive set of requirements for the processing and protection of personal data. It seeks to address this issue of trust by creating high standards for data security and transparency in the way this data is processed. The GDPR makes no specific recommendations on how this is to be achieved – just what the effect must be – which makes sense in such a fast moving technical landscape.

The breadth of the GDPR is also interesting – it covers all data that could potentially be connected with an individual. This includes the monitoring of assets, devices and activity at specific locations, since this data may be used to understand and target an individual. It is effectively scoped to our entire digital economy and will have a significant impact on all commercial activity in this space.

So what are the implications of this type of legislation? How will digital businesses thrive in an open and transparent way, protecting their investment whilst creating a level of choice and control in people’s lives?

As with all technology, our digital technology is essentially ethics agnostic but it pushes the art of the possible to new limits:

The availability of a wide range of data from many sources.
The ability to cheaply process and link this data together to understand a bigger picture.
The accuracy with which an individual can be identified and targeted.
The ability to pinpoint location for contextual insight and surveillance.
The application of this new insight to a wide range of activities and actions.
The operation of this insight in real-time or near real-time.

It is the use of technology that determines its impact – in terms of how it consumes people’s time, their ability to move about and the information they see when making a decision.

The digital economy begins to fail when individuals become overwhelmed with interruptions as they go about their daily lives. Imagine getting an unsolicited text message offering a new service every minute – how long would it be before you switched your phone off?

It also fails if people are reluctant to share their information with a digital service because of unknown and imagined negative consequences.

In many respects we are already living in a virtual reality where our perceptions of the world and the opportunities we see are shaped by the algorithms that funnel information and offers to us. If these algorithms are trained on data that reflects the prejudices and inequalities of our society, they optimise and amplify discrimination in a way that is both illegal and unethical.

So how does the digital economy respect an individual’s rights, privacy and freedom whilst providing personalised digital services?

Building Trust

One of the first lessons I learnt when I started looking at ethical issues around big data and analytics is that there are no universal definitions of concepts such as privacy, freedom and ethics. Each of these is measured by personal perspectives based on experience, education, religion, culture, upbringing and family – and this perspective is not static. It can change as a person becomes more familiar with a situation and sees the consequences and the benefits.

So you can imagine a sign-on sequence to a service as:

Computer:

Who are you?                                            

Person:

I am Fred, here is my password, fingerprint etc.

Computer:

OK, I can see that you are Fred, which means you can do a, b and d.

Person:

Thank you computer but before we go on I would like to set some
ground rules for our interaction. This is what you can do for me; 
this is what you can store about me to support our interaction and
this is what you can share.
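The ground rules in this exchange could be captured as a small permissions record that the service consults before it acts. This is only a sketch: the `GroundRules` class, its field names and the example values are hypothetical, and the actions a, b and d are the placeholders from the dialogue above.

```python
from dataclasses import dataclass, field


@dataclass
class GroundRules:
    """What one person permits a digital service to do, store and share."""
    may_do: set = field(default_factory=set)
    may_store: set = field(default_factory=set)
    may_share: set = field(default_factory=set)

    def permits(self, kind, item):
        """Return True only if the person has explicitly granted this."""
        granted = {"do": self.may_do,
                   "store": self.may_store,
                   "share": self.may_share}
        return item in granted[kind]


# What the service says Fred is authorized to do after sign-on.
authorized = {"a", "b", "d"}

# What Fred himself is comfortable with (hypothetical values).
freds_rules = GroundRules(
    may_do={"a", "b"},
    may_store={"name", "delivery_address"},
    may_share={"name"},
)

# The service acts only where authorization and permission overlap.
effective = authorized & freds_rules.may_do
print(sorted(effective))
print(freds_rules.permits("share", "delivery_address"))
```

Anything not explicitly granted is refused, so the default is the cautious one.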

Understanding what permissions an individual is likely to grant is where we need to bring in the expertise of social science and psychology.

From their work we know that individuals are not a single persona. Their behaviour is influenced by the context in which they find themselves. This also influences what information they are willing to share. We can think of this as spheres of trust. So we are more open with our friends than with strangers.

Figure 1 – spheres of trust

We also have relationships with organizations and we extend different levels of trust and data sharing accordingly.

The spheres of trust begin to break down when we use digital technology because the same device is used in multiple spheres.

Figure 2 – mobile use

In addition, it is a complex world of interlinking services behind the logo. Data can be shared with multiple organizations without the awareness of the individual.

How do we design systems that respect our spheres of trust when data is collected from our devices and shared with other organizations?

Privacy by design is an emerging discipline that recognizes this complexity and seeks to design systems that are cautious in their use of personal data. Such a system assumes that it cannot know which sphere of trust it is operating in, aims to minimize what it collects, processes and shares, and purposefully avoids identifying the individual in the data it collects.

It also encourages transparency in processing, so the individual understands what data is collected, how it is maintained, what it is used for, how long it is kept, and where and with whom it is shared.
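To make these transparency points concrete, a service could keep one record per data item and generate its plain-language explanation from that record. The sketch below is illustrative only: the `ProcessingNotice` class, its fields and the example values are assumptions, not a prescribed format.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ProcessingNotice:
    """One record per data item, covering the transparency points above."""
    data_item: str        # what is collected
    maintenance: str      # how it is maintained
    purpose: str          # what it is used for
    retention_days: int   # how long it is kept
    shared_with: tuple    # where and with whom it is shared

    def delete_after(self, collected_on):
        """Return the date on which the data should be removed."""
        return collected_on + timedelta(days=self.retention_days)

    def describe(self):
        """Return a plain-language summary for the individual."""
        recipients = ", ".join(self.shared_with) if self.shared_with else "no one"
        return (f"We collect your {self.data_item} for {self.purpose}. "
                f"It is {self.maintenance}, kept for {self.retention_days} days "
                f"and shared with {recipients}.")


notice = ProcessingNotice(
    data_item="delivery address",
    maintenance="updated when you edit your profile",
    purpose="delivering your orders",
    retention_days=30,
    shared_with=("the courier",),
)
print(notice.describe())
print(notice.delete_after(date(2024, 1, 1)))
```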

A digital business that respects an individual’s choice, gives them control over the processing that occurs and how data is shared, and operates its services in an open and transparent way is going to be given time to create familiarity, demonstrate value and build up trust in its operation. As trust grows, so does loyalty and the broader use of the organization’s digital services.

Summary

Establishing trust and transparency will become as necessary to a digital business as cyber-security.
There is huge scope for differentiation around the ethical processing of data since a higher level of trust leads to a more open sharing of data for a broader set of services.
Everyone has their own notions of privacy and ethical behaviour. It is important to offer choice and gather consent so individuals can customise the behaviour of their digital services to a level that they are comfortable with. Over time this builds trust and leads them to open up to a broader use of their data by digital services.

Photo: The Remarkables mountain range, Queenstown, New Zealand