In this first of our series of briefings on the DSM Directive, we look at the two new mandatory exceptions to copyright for text and data mining.

Directive (EU) 2019/790 on Copyright in the Digital Single Market (the "DSM Directive") was agreed at EU level on 17 April 2019 and is due to be transposed into national law by Member States by 7 June 2021. Companies engaged in the exploitation of copyright works should therefore be considering the upcoming changes to copyright law.

In this briefing we look at the two new mandatory exceptions to copyright for text and data mining ("TDM"). These exceptions, which are contained in Articles 3 and 4 of the DSM Directive, will need to be carefully navigated by users of TDM and by rightholders, as they are narrowly framed and will likely give rise to legal debate as to their precise limits.

The DSM Directive defines TDM as "any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations." In practice, this means that TDM can be used to add value to and make sense of big data sets. TDM has countless uses, but recent applications include: the mining of data sets of COVID-19 cases to create advanced mapping tools to track and predict the spread of the virus; a study that mined social media posts of food to monitor obesity rates in London; and the use of crime data to enable "predictive policing".

However, the process of TDM may involve acts that are restricted by copyright, such as extracting the contents of a database or reproducing large amounts of text, sounds or images. Without an exception or limitation to copyright for these acts, they may, without the prior authorisation of the rightholder, constitute copyright infringement. In order to remove this legal uncertainty around TDM, and in recognition of the fact that the use of TDM is, as noted in Recital 8 to the DSM Directive, "prevalent across the digital economy", the DSM Directive provides two new mandatory exceptions to copyright for TDM.

Article 3: TDM for the purposes of scientific research

Article 3 of the DSM Directive provides research organisations and cultural heritage institutions with a new mandatory exception to copyright that allows them to extract and reproduce text and data from databases or other sources to which they have lawful access, in order to carry out text and data mining for the purposes of scientific research. There is no requirement to obtain authorisation from rightholders to avail of the exception and rightholders are not entitled to any compensation.

Limiting conditions

While the TDM exception for scientific research appears on its face to be broad, it is subject to important conditions:

The first, and most obvious, is that the TDM must be carried out for the purposes of scientific research, which the recitals to the DSM Directive clarify covers both the natural sciences and the human sciences. This means that the commercial or industrial application of TDM will fall outside the scope of Article 3 (but may be able to benefit from Article 4).

The TDM must be performed by either a "cultural heritage institution" or a "research organisation." A cultural heritage institution is defined as "a publicly accessible library or museum, an archive or a film or audio heritage institution." A research organisation can be any research-performing entity, but it must conduct scientific research on a not-for-profit basis or pursuant to a public interest mission recognised by a Member State. Importantly, access to the results generated by any such scientific research cannot be enjoyed on a preferential basis by an undertaking that exercises a decisive influence upon the organisation. Therefore, commercially funded or orientated research organisations will fall outside the scope of Article 3.

Copies of the data sets on which TDM is performed must be "stored with an appropriate level of security" by the research organisation or cultural heritage institution, and rightholders are allowed to "apply measures to ensure the security and integrity of the networks and databases where the works or other subject matter are hosted." What this means in practice is unclear, but Article 3(4) requires Member States to encourage rightholders, research organisations and cultural heritage institutions to "define commonly agreed best practices" in this regard.
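For organisations that want to build the conditions above into an internal compliance checklist, the following is a minimal sketch of how they might be recorded and checked. It is illustrative only: the class, field and function names are hypothetical and do not come from the DSM Directive, and each true/false flag stands in for a legal assessment that will need to be made on the facts of each case.

```python
from dataclasses import dataclass

# Hypothetical labels for the two categories of Article 3 beneficiary.
ARTICLE_3_BENEFICIARIES = {"research_organisation", "cultural_heritage_institution"}


@dataclass
class TdmActivity:
    """Illustrative record of a proposed TDM activity (all fields are assumptions)."""
    performer: str             # e.g. "research_organisation" or "commercial_undertaking"
    scientific_research: bool  # is the TDM carried out for scientific research?
    lawful_access: bool        # does the performer have lawful access to the sources?
    secure_storage: bool       # are copies stored with an appropriate level of security?


def unmet_article_3_conditions(activity: TdmActivity) -> list[str]:
    """Return the conditions summarised above that the activity does not meet."""
    unmet = []
    if activity.performer not in ARTICLE_3_BENEFICIARIES:
        unmet.append("performer is not a research organisation or cultural heritage institution")
    if not activity.scientific_research:
        unmet.append("TDM is not for the purposes of scientific research")
    if not activity.lawful_access:
        unmet.append("no lawful access to the sources")
    if not activity.secure_storage:
        unmet.append("copies are not stored with an appropriate level of security")
    return unmet


university_project = TdmActivity("research_organisation", True, True, True)
commercial_project = TdmActivity("commercial_undertaking", False, True, True)

print(unmet_article_3_conditions(university_project))
print(unmet_article_3_conditions(commercial_project))
```

An empty list indicates only that none of the listed conditions has been flagged as unmet. It does not establish that the exception applies.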

Collaborations with the private sector

Recital 11 to the DSM Directive clarifies that research organisations that come within Article 3 should be able to benefit from the TDM exception when their research activities are carried out in the framework of public-private partnerships. The benefits to a research organisation of such partnerships are referred to in Recital 11, which states that research organisations should be "able to rely on their private partners for carrying out text and data mining, including by using their technological tools." However, this must be read in the context of the Article 3 exception overall, which cannot be relied upon where access to the results generated by the scientific research is enjoyed on a preferential basis by an undertaking that exercises a decisive influence upon the organisation. Therefore, it is doubtful whether a private partner can reap the full commercial benefits of such a partnership, given the limits of the Article 3 exception. This appears to be acknowledged in Recital 11, which states that research organisations and cultural heritage institutions should be the "beneficiaries" of the Article 3 exception.

Article 4: TDM for everything else (if permitted by rightholders)

If research organisations and cultural heritage institutions are the beneficiaries of Article 3, then everyone else is left with Article 4. However, while Article 4 covers a much broader group of users, its application is more limited than Article 3.

Article 4 allows for acts of reproduction and extraction from databases and other sources for the purposes of TDM generally. The motive behind the TDM is irrelevant, as Article 4 contains no purpose requirement or limitation of the kind found in Article 3. Therefore, commercial applications of TDM fall within Article 4. However, Article 4 contains an opt-out provision for rightholders, which means that they can exclude their copyright works from the scope of Article 4 by expressly reserving their rights in an "appropriate manner". An example of an appropriate manner is given in Article 4, being "machine-readable means" for content made publicly available online (i.e. technological restrictions on extraction from online databases and other sources). Other appropriate means would include contractual restrictions or a unilateral reservation of rights. Therefore, rightholders who stand to monetise TDM of their works can continue to do so under Article 4.
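To illustrate what a "machine-readable" reservation might look like in practice, the sketch below uses a robots.txt file, a widely used convention for telling automated crawlers which parts of a website they may access. This is an assumption made for illustration: the DSM Directive does not name robots.txt, and whether any particular technical signal amounts to a valid reservation under Article 4 is a legal question. The crawler name "ExampleTdmBot" and the example.com address are hypothetical. The code uses the robots.txt parser in Python's standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a rightholder: it blocks a (made-up)
# TDM crawler from the whole site while leaving other user agents unrestricted.
ROBOTS_TXT = """\
User-agent: ExampleTdmBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "https://example.com/articles/1"
print(may_fetch(ROBOTS_TXT, "ExampleTdmBot", page))  # the blocked crawler
print(may_fetch(ROBOTS_TXT, "OtherBot", page))       # any other crawler
```

A TDM user could run a check of this kind before mining a site, and a rightholder could use it to confirm that its file has the intended effect. In either case the check tests only what the file says, not whether the reservation is legally effective.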

What does this mean for your business?

With Articles 3 and 4 to be implemented into Irish law in a little over one year's time, organisations that use TDM and rightholders that own big data sets should start to think about the opportunities and challenges that these exceptions present. For example:

There are likely to be opportunities for public-private partnerships to be set up to avail of the exceptions to test, train and develop TDM technologies.

The agreements governing these public-private partnerships will need to be carefully considered and drafted.

Research organisations in particular will need to consider whether they fall within the definition of a "research organisation" under the DSM Directive and may need to undertake structural and operational changes to be able to avail of the exceptions.

Rightholders wishing to take their works outside the scope of Article 4 should consider how best to reserve their rights in an "appropriate manner".

Originally published 26 June, 2020

This article contains a general summary of developments and is not a complete or definitive statement of the law. Specific legal advice should be obtained where appropriate.