A. INTRODUCTION

1. 'The world's most valuable resource is no longer oil but data' ran the title of a leader in The Economist in May 2017, drawing a comparison between Standard Oil in the early 20th century and the world's most valuable listed companies today.1 But, even as 'data as the new oil' has become a trope of the fourth industrial revolution, it's easy to overdo the parallels: oil is a finite natural resource whilst data is created by people – "the world's most renewable resource" in the words of Microsoft CEO Satya Nadella2 – and oil used by one person can't be consumed by anyone else, whereas data can be used time and again without lessening its value.

2. Data volumes created are doubling every two years or so. Information and data are also infinite, both as expression and communication and as a resource, and data volumes are growing exponentially. In the April 2019 edition of its 'Data Age 2025' white paper, research company IDC estimated that global data volumes will grow at a compound annual growth rate of 30% to 40% to reach 6x current levels by 2025.3 Four things are driving this growth: hyperscale data centres at the cloud's core; proliferating compute capabilities at the edge; ubiquitous mobile phones and devices; and (particularly) connected sensors, where the Internet of Things ('IOT') "aims to do for information what electricity did for energy".4 To put this growth in perspective, data volumes created have already doubled since The Economist's May 2017 leader.
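
As a rough cross-check of the growth figures in paragraph 2, the compound growth arithmetic can be sketched as follows. The function names and the six-year horizon (2019 to 2025) are assumptions made for illustration and are not taken from the IDC paper.

```python
import math


def growth_multiple(cagr: float, years: float) -> float:
    """Multiple of the starting volume after compounding at `cagr` for `years`."""
    return (1.0 + cagr) ** years


def implied_cagr(multiple: float, years: float) -> float:
    """Constant annual growth rate that produces `multiple` over `years`."""
    return multiple ** (1.0 / years) - 1.0


def doubling_time(cagr: float) -> float:
    """Years needed for volumes to double at a constant annual rate `cagr`."""
    return math.log(2.0) / math.log(1.0 + cagr)


if __name__ == "__main__":
    years = 6  # assumed horizon: 2019 to 2025
    for rate in (0.30, 0.40):
        print(
            f"{rate:.0%} a year for {years} years: "
            f"{growth_multiple(rate, years):.1f}x, "
            f"doubling every {doubling_time(rate):.1f} years"
        )
    print(f"Rate implied by 6x in {years} years: {implied_cagr(6.0, years):.1%}")
    print(f"Rate implied by doubling every two years: {implied_cagr(2.0, 2):.1%}")
```

On these assumptions, growth of 30% to 40% a year gives an increase of roughly 4.8x to 7.5x over six years, so a 6x rise sits inside that range, and doubling every two years corresponds to growth of roughly 41% a year.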

3. The value of personal data is becoming more measurable. Attributing value to data is another area of difference between oil and data. Whilst the market accurately informs the price and value of oil, the price and value of data are opaque. Data valuation will depend on what is being looked at – financial market data differs from individuals' personal data – but two recent studies of personal data show how value is becoming more measurable. In March 2019 the US Democratic Party strategy group Future Majority found the value generated from Americans' personal data to be US$366 per person in 2018, set to rise to US$612 in 2022.5 Then in July 2019 accounting firm EY published a study on valuing UK health care data which estimated that the entire dataset of the UK's National Health Service ('NHS') was worth £9.6bn per year in benefits, or £175 (US$217) per person across the 55 million record NHS estate.6 These two studies suggest that individuals' personal data is currently worth in the low hundreds of dollars per person per year and that this value is rising.
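
The per-person figures in paragraph 3 can be reproduced with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the exchange rate is simply the one implied by the two currency figures quoted above, not a rate stated in the EY study.

```python
def per_capita(total: float, population: float) -> float:
    """Value per person (or per record) for a given total."""
    return total / population


def annual_growth(start: float, end: float, years: int) -> float:
    """Constant annual growth rate taking `start` to `end` over `years`."""
    return (end / start) ** (1.0 / years) - 1.0


# EY study: £9.6bn a year across 55 million NHS records.
nhs_gbp_per_person = per_capita(9.6e9, 55e6)

# Exchange rate implied by the quoted pair £175 / US$217 (an inference, not a sourced rate).
implied_usd_per_gbp = 217 / 175

# Future Majority: US$366 per person in 2018, projected US$612 in 2022.
us_growth_rate = annual_growth(366, 612, 2022 - 2018)

if __name__ == "__main__":
    print(f"NHS value per record: £{nhs_gbp_per_person:.2f}")
    print(f"Implied exchange rate: US${implied_usd_per_gbp:.2f} per £1")
    print(f"Implied US growth rate: {us_growth_rate:.1%} a year")
```

The NHS figure works out at about £174.55 per record, which rounds to the £175 quoted, and the Future Majority projection implies growth of a little under 14% a year.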

4. Data is elusive in legal terms. For all the rapid growth in data volume, value and value measurability, data remains elusive in legal terms. When extracted, oil is goods under English law7 and as such may be owned, bought, sold or stolen. Data on the other hand isn't tangible and can't be bought and sold, at least in the same way; and under UK criminal law information has been held not to be intangible property either, so it can't be stolen.8 Although data is legally inert in and of itself, different legal rights and duties apply to and act on it in different ways and use cases: a useful heuristic is 'there are no rights in data, but rights arise in relation to data'. Reflecting the increasing value of data, the legal aspects of these rights (and the duties that are their converse) are developing rapidly. To quote The Economist again, in the context of IOT, "a world of connected sensors will generate huge amounts of data. It will also generate arguments about who can use and analyse those data".9 These rights and duties – as intellectual property ('IP'), contract and regulation and especially in the context of big data – are the main subject of this white paper.

5. Purpose and scope of this white paper. Accordingly, the purpose of this white paper is to provide a practical guide to legal rights in data - what they are, how they arise and how they can be managed. Its primary audience is in-house legal counsel lawyering their organisation's data estate and data operations. Section B overviews data across a number of different verticals (financial services, insurance, air transport, recorded music, healthcare and the public sector) and developing data policy. Section C looks at different types of data before offering a common 8-layer framework for the legal analysis of data. Sections D, E and F work through each level of the framework: (1) platform infrastructure, (2) information architecture, (3) IP rights, (4) contract rights, (5) data regulation, (6) data protection, (7) information security and (8) data governance.

This white paper is one in an occasional series on aspects of IT law. Others include the legal aspects of artificial intelligence, cloud contracting, cloud security and IOT, and demystifying IT law.10 This paper is not legal advice. It is written as at 30 September 2019 and from the standpoint of English law.11

B. THE BUSINESS AND POLICY CONTEXTS: DATA IN KEY VERTICALS

6. Introduction. This section briefly looks at data in the context of financial markets (paragraph 7), open banking (para 8), insurance (para 9), air transport (para 10), recorded music (para 11), healthcare (para 12) and public (para 13) sectors before looking at the policy perspective and directions of travel (para 14).

7. Financial market data. The financial sector is one of the largest users of IT globally. Trading platforms – complex computer systems to buy and sell securities, derivatives and other financial instruments – are its beating heart and data its lifeblood. Based on an ecosystem of exchanges, index providers, data vendors and data users (asset managers on the buy-side and banks and brokers on the sell-side), these platforms generate market data, indexes, reference data and analytics and together form the world's financial market data/analysis industry. Increasing regulatory requirements, the growing ability of AI to interpret data and rising market volatility are currently fuelling increasing demand both for financial market data (where global revenues hit $30bn for the first time in 2018) and exchanges (whose global revenues were $34bn in 2018).12

In legal terms this complex ecosystem is held in place by contract, with market practice based on agreement structures that license, restrict and allocate risk around data use. These contracts have grown up over the years and constitute a stable, cohesive normative framework in markets that have seen little litigation. Exchanges and data vendors will seek to apply their standard terms, which are almost universally based on the reservation to the data provider of all IP (copyright, database right in the EU and confidentiality) in the data being supplied and a limited licence to the customer to use the data for specified purposes. Points of contention in exchange, index and data vendor agreements typically centre on:

  • scope of licence and redistribution rights (internal use only or onward supply, and increasingly data use for AI, machine learning ('ML') and data science purposes);
  • treatment of data derived from the data initially supplied (who owns it; what the user may do with it);
  • use of the data after termination of the agreement; and
  • scope of compliance audits and remedies for unpermissioned use and over-deployment.

8. Open banking. In January 2018, two important data-related developments took place in the UK banking industry. First, the UK implemented the second Payment Services Directive ('PSD2'), which aims (among other things) to enable banks and other payment account providers, their customers and third parties to share data securely with each other.13 Second, in a sort of 'own brand' version of PSD2, the UK went live with its own Open Banking initiative, representing an important endorsement of Open Data principles (see paragraph 14 below). This mandates the nine largest UK banks to allow their personal and small business customers to share their account data securely and directly with third party providers regulated by the Financial Conduct Authority ('FCA') and enrolled in the Open Banking initiative. The Open Banking Ecosystem refers to all the components of Open Banking, including the Application Programming Interface ('API') standard and the security, processes and procedures, systems and governance to support participants in the initiative. As of September 2019, 143 FCA regulated providers are enrolled in Open Banking.14

9. The insurance sector. Insurance is based on the insured transferring the risk of a particular loss to the insurer by paying a premium in return for the insurer's commitment to pay if the loss occurs. The combination of big data and AI/ML enables insurance risk to be assessed and predicted much more precisely than in the past by reference to specific data about the insured and the risk insured. In turn, these factors enable the price of the policy to be calculated more accurately.

As well as the traditional 'top down' statistical and actuarial techniques of risk calibration and pricing, insurers can now rely on data relating to the insured person and insights delivered by AI. For example in vehicle insurance, location based data from the driver's mobile can show where they were at the time of the accident and other telematics data from on-board IT can show how safely they were driving; smart domestic sensors help improve responsiveness to the risk of fire, flooding or theft at home; and health apps and wearables provide information relevant to health and life insurance. Comparing this specific data with insights gleaned from trained AI/ML algorithms enables further accuracy in calibration.

These examples – data from location based services, vehicle telematics, home sensors and wearables – are having a material impact on vehicle, home and health insurance pricing and terms. Big data and analytics in insurance also point up two other common themes. First, the tension between the privacy of the insured's personal data and its availability to others – a tension that insurers are wrestling with in the context of genetic pre-disposition to illness and the socialisation of risk. Secondly, as in the banking sector, increasing regulatory scrutiny is accentuating the importance of data analytics. For example, the Solvency II directive15 regulates the amount of capital that an EU insurance company must hold against the risk of insolvency, and this required capital amount is based on the likelihood of aggregated policy pay outs, where again the predictive insights of big data and AI/ML are critical.

10. The air transport industry ('ATI'). The ATI has grown up with computerisation and standardisation as key components in getting passengers (4.3 billion globally in 2018, up by 75% from 2008) and their baggage to the airport of departure, on to the plane, and to and from the airport of arrival. In doing so, airlines and other ATI companies generate and hold vast amounts of data during all stages of the customer journey – for example, the average transatlantic flight generates 1 terabyte of data. But this data can be siloed in a particular application or airline, so big data techniques have emerged to support the service and efficiency improvements that lie at the heart of ATI growth. A recent study quoted in the Financial Times found that big data analytics was a higher priority for the ATI than any other industry16 as gathering, analysing and using big data enable airports and airlines to develop insights about customers and their air travel preferences and harness competitive advantage.

Using big data is also improving ATI efficiencies, as illustrated by Resolution 753 passed by ATI trade association and standards body IATA (the International Air Transport Association), which has required its member airlines since June 2018 to track passengers' baggage from start to finish. Over the decade to 2018, a combination of increasing standards, smart technology, automation and new processes has enabled an almost 50% reduction in baggage items mishandled (from 47 to 25 million) while passenger numbers have risen by 75%.17

11. The recorded music industry. The recorded music industry is a $19bn global business in full digital transformation as streaming comes to dominate music consumption. The structure of the industry has grown up around norms based on the individual and collective licensing and management of the various and distinct copyrights that arise in a song's composition, lyrics and publication, and in its recording and performance. These copyright norms operate primarily on a national basis with harmonisation and equivalence established internationally through copyright treaties like the Berne Convention and WIPO Treaties.

The big three record companies (Universal, Sony and Warner) together account for around 70% of the global recorded music market. The music track is effectively the product unit for the sector and PPL, the UK CMO (Collective Management Organisation) for the public performance rights of its 100,000 recording and performer members, operates a repertoire database of 15 million tracks that is currently growing by 37,000 new recordings per week. Management of data is a large part of PPL's work, driving more accurate distributions and better international collections, where the trend is towards standardising of data submission and exchange formats between country CMOs, their members and licensees.

The record industry is another sector where data techniques are enabling rapid insights into consumer preferences. These insights have historically been the province of record company A&R (Artist & Repertoire) teams but data is increasingly influencing musical taste, fashion, trends and hence the creation of music itself in a way that has not been possible before. In the words of Geoff Taylor, CEO of UK trade body, BPI:

"Increasingly, data, in all its forms – spanning metadata to big data – is playing a key role in shaping this process. As streaming comes to dominate music consumption, data is becoming a progressively more important part of the process of producing and marketing music and, arguably, one of the determinants of a song's popularity."18

12. The healthcare sector. Healthcare remains the sector where data use will have the greatest impact on people's daily lives. Four drivers lie behind data innovation in UK healthcare: intensifying cost pressures leading to demands for better data; increasing availability of national collections of clinical and treatment outcome datasets; growing investment in anonymising, aggregating and analysing data from individual care centres; and government support of open data and interoperability standards. Public spending on healthcare in the UK (principally the NHS) at around £162bn for the 2020 financial year accounts for roughly 20% of total UK public spending of £848bn. NHS Digital, part of the UK's Department of Health and Social Care, is responsible for the standardising, collecting and publishing of data from across UK health and care systems and in its September 2018 paper 'Data, insights and statistics', it commented:

"Artificial intelligence, machine learning, predictive analytics and the internet of things are no longer dreams of tomorrow. They are here and evolving at pace to support increasingly personalised care. The power of data also has huge potential to drive our economy. The NHS has an unrivalled data set covering a single, large population, stretching back two decades and supported by robust governance and the unique identifying NHS number. It is a major national asset and we have a responsibility to continue to build the trust and infrastructure that will allow the UK and the NHS to be global leaders in this space."19

13. The public sector. As in all developed states, the UK government's (HMG's) database about its citizens is the largest in the country, and government departments like BEIS, Education, Health and Social Care, HMRC, Home Office and Work and Pensions have huge and growing databases. As individual government departments increasingly master their own digital data and central government as a whole starts to move towards data sharing, HMG's data estate is now recognised as a valuable national asset. Looked at as an asset, the UK's data estate raises complex policy questions as to protection, growth, maintenance and monetisation, along with the reconciliation of competing interests, including protection of privacy and other individual liberties, the security of the State and its citizens, crime and fraud prevention, commercial interests, safeguards against State overreaching and maximising the benefits of technological progress for citizens.

The 'Public Sector Data Report 2019'20 noted four major trends. First, data and analytics are being widely used in the UK to help address the challenges of public spending cuts; second, security and data breaches top UK public sector concerns; third, half of the survey respondents were either confident and well-trained or confident and eager to learn about working with data and analytics; and fourth data and analytics are seen as high value in the UK public sector. For further information about AI and cloud security in the public sector, please see our white papers on the Legal Aspects of AI and Cloud Security21.

14. The policy perspective. In June 2018, the UK government announced that it would develop a national data strategy. The first step was to set up a new Centre for Data Ethics and Innovation22 and in June 2019 an open call for evidence was announced based on three areas of focus – people, economy and government.23 The European Commission has since 2014 driven a number of policy initiatives designed to build the data economy in the context of its Digital Single Market strategy, including the creation of a common European data space, reviewing the directive on the re-use of public sector information and the recommendation on access to and preservation of scientific information, and guidance on data sharing with private sector bodies.24

The elusive nature of data in legal terms has tended to confuse rather than clarify policy debates around data, but the following draws out a number of current themes:

  1. Data: ownership or control? A seminar held in October 2018 under the auspices of the British Academy, Royal Society and techUK found that:

    "use of the term "data ownership" raises significant challenges and may be unsuitable because data is not like property and other goods that can be owned or exchanged. Instead discussion should explore the rights and controls individuals, groups and organisations have over data, and should encompass a societal as well as individual point of view. Broader debate could help to better describe the data rights and controls that are often associated with the concept of 'data ownership'."25
  2. Data: asset or utility? Should organisations think of the data they use as an asset with value on the balance sheet or a utility like electricity?
  3. Data: asset or liability? Increasingly, as GDPR and other obligations and duties are perceived as giving rise to significant potential liabilities, organisations are looking at data not only as a benefit and an asset but also as a risk and potential liability.
  4. Data: proprietary or open? From the 1980s, the open source software ('OSS') movement rejected the traditional 'cathedral' based approach to software development in favour of an open 'bazaar' approach, and today OSS accounts for a large and growing share of software markets around the world. Similar developments are occurring in data where government and the public sector are increasingly open sourcing publicly held datasets and APIs and making open sourcing of data and research a condition of public funding. This is leading to change in market sectors including open banking (see paragraph 8 above) and scientific and academic research publishing. The new Copyright in the Digital Single Market Directive (see paragraph C.27(a) below) is an example of copyright legislation moving to accommodate this new approach.
  5. Data: regulation or market forces? Finally, there is a growing groundswell of views in the USA and the EU about whether and if so how to address the perceived influence and power of large data-oriented businesses like Google, Apple, Facebook and Amazon, particularly around whether competition rules should be refashioned or developed to provide greater regulation (see paragraph 31 below).


Footnotes

1 The Economist, Leaders, 6 May 2017 - https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data; see also 'Data, data everywhere', The Economist, Special Report, 27 February 2010: "data is becoming the new raw material of business: an economic input on a par with capital and labour" - https://www.economist.com/special-report/2010/02/27/data-data-everywhere

2 'Tools and Weapons: the Promise and the Peril of the Digital Age', Microsoft President Brad Smith and Carol Ann Browne, Hodder & Stoughton, 2019 at p.274.

3 'Data Age 2025 – the digitisation of the world from edge to core', IDC White Paper sponsored by Seagate, April 2019 - https://www.seagate.com/gb/en/our-story/data-age-2025/

4 'Drastic falls in cost are powering another computer revolution', The Economist, 12 September 2019 - https://www.economist.com/technology-quarterly/2019/09/12/drastic-falls-in-cost-are-powering-another-computer-revolution

5 'Who owns Americans' personal information and what is it worth?', Robert Shapiro and Siddhartha Aneja, Future Majority, 8 March 2019, at p. 21.

6 'Realising the value of health care data: a framework for the future', EY, 19 July 2019 

7 See, for example, Benjamin, Sale of Goods, 10th Edition (Sweet & Maxwell, 2017), paragraph 1-087, pp. 75 & 76.

8 Oxford v Moss ([1979] Crim LR 119) is authority that there is no property in data (in that case, confidential information in an exam question) as it was not 'intangible property' within the meaning of the Theft Act 1968.

9 'When humans are connected – what happens when humans are connected to smart machines', The Economist, 13 September 2019 - https://www.economist.com/technology-quarterly/2019/09/12/hugo-campos-has-waged-a-decade-long-battle-for-access-to-his-heart-implant

10 'Legal Aspects of Artificial Intelligence' (September 2018), 'Legal Aspects of Cloud Computing: Cloud Contracting' (June 2019), 'Legal Aspects of Cloud Computing: Cloud Security' (June 2018), 'Legal Aspects of the Internet of Things' (June 2017), 'Demystifying IT Law' (June 2018).

11 We have not addressed here issues relating to the withdrawal of the UK from the EU (Brexit) and will update the paper if necessary when the position is clearer. 

12 See 'Exchange industry revenues reach record levels in 2018', David Takaba, Burton-Taylor, Inc., 15 July 2019 - https://burton-taylor.com/exchange-industry-revenues-reach-record-levels-in-2018/; 'LSE Refinitiv deal would create the world's largest exchange group and second largest market data supplier', Burton-Taylor, Inc, 31 July 2019 - https://burton-taylor.com/london-stock-exchange-refinitiv-deal-would-create-the-worlds-largest-exchange-group-and-second-largest-market-data-supplier-burton-taylor-report/

13 Directive (EU) 2015/2366 of 25 November 2015 on payment services in the internal market (and amending previous directives) - https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32015L2366&from=EN; PSD2 was implemented in the UK by the Payment Services Regulations 2017 (UK SI 2017/752) - http://www.legislation.gov.uk/uksi/2017/752/contents/made

14 See also 'Open Banking – guidelines for Open Data Participants', Open Banking Limited, July 2018 - https://www.openbanking.org.uk/wp-content/uploads/Guidelines-for-Open-Data-Participants.pdf

15 Directive 2009/138/EC of 25 November 2009 on the taking-up and pursuit of the business of Insurance and Reinsurance (Solvency II) (OJ L 335, 17.12.09, p.1) - https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32009L0138&from=en

16 'How airlines aim to use big data to boost profits: technology provides carriers with treasure trove of information to optimise customer service', Financial Times, 8 May 2018 - https://www.ft.com/content/f3a931be-47aa-11e8-8ae9-4b5ddcca99b3

17 SITA, '2019 Baggage IT Insights' - https://www.sita.aero/resources/type/surveys-reports/baggage-it-insights-2019

18 'Magic Numbers: how data and analytics can really help the music industry – a special insight report by music:)ally for the BPI and the Entertainment Retailers Association', July 2018, at p.2, https://musically.com/wp-content/uploads/2018/07/MagicNumbersBPI_ERA-1.pdf

19 'Data, insights and statistics', NHS Digital, September 2018 - https://digital.nhs.uk/data-and-information/data-insights-and-statistics

20 https://bigdataldn.com/wp-content/uploads/2019/04/The-Public-Sector-Data-Report-2019.pdf

21 'Legal Aspects of Artificial Intelligence', September 2018, pages 33 to 36; 'Legal Aspects of Cloud Computing: Cloud Security', June 2018, pages 11 to 17.

22 https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/715760/CDEI_consultation__1_.pdf

23 https://www.gov.uk/government/publications/national-data-strategy-open-call-for-evidence/national-data-strategy-open-call-for-evidence

24 https://ec.europa.eu/digital-single-market/en/policies/building-european-data-economy

25 'Data ownership, rights and controls: reaching a common understanding', 3 October 2018, https://royalsociety.org/-/media/policy/projects/data-governance/data-ownership-rights-and-controls-October-2018.pdf
