Is data really that important to a business?
A quick survey of the top companies by market capitalisation readily reveals that data is key.
To make better use of existing datasets paid for by public funds, a number of jurisdictions have sought to make public sector information (PSI) available to industry. Examples include geographical information, statistics and weather data.
This has been particularly important for the AI industry, which depends on access to sufficiently large datasets.
The EU Commission takes the view that, where PSI is made available, it should be shared for re-use by all in the EU, in a borderless manner, in order to promote the digital economy and innovation. To that end, the EU adopted the PSI Directive in 2003 (it was revised in 2013).
Significant aspects of the amended PSI Directive are as follows:
The European Open Data Portal contains examples of businesses that have used European PSI. Despite such examples of use, the PSI Directive has been subject to criticisms:
In light of such problems, the EU has since legislated for an Open Data Directive. It will be implemented by July 2021, and is intended to bring about better availability of publicly-held information as part of a package of measures aiming to facilitate the creation of a “Common Data Space.” The EU Common Data Space is intended to facilitate the exchange of data between data providers and users.
The Open Data Directive has similar provisions to the amended PSI Directive (e.g. the same exceptions apply), but it extends the scope of availability for re-use of data in the following ways:
High-value datasets are defined as documents whose re-use is associated with important benefits for society and the economy. All publicly held data is encouraged to be made available in line with the FAIR principles (that is, findable, accessible, interoperable and re-usable) as far as possible.
Private, company-led industrial data spaces already exist that facilitate secure data exchange among different organizations based on common standards, enabling easy integration of data from different sources within an agreed data governance framework. The EU Common Data Space is likely to be similar to these systems.
There are also EU initiatives to stimulate business-to-government (B2G) data sharing in order to improve policy making and public services, which may increase the range of information available for re-use.1
The UK has implemented the PSI Directive (as amended) but does not plan to implement the EU’s Open Data Directive. The Open Government Licence Framework is commonly used to facilitate the re-use of PSI. The requestor can procure the information by making a request under the Freedom of Information Act 2000.
In 2019 the United States enacted the Open, Public, Electronic, and Necessary Government Data Act, which specifically makes most government “mass data” open to the public.2 As of the date of publication, the US government has made more than 200,000 datasets publicly available at www.data.gov.
In Canada, open data directives and portals also exist at the regional level in eight provinces and one territory. For example, the Province of Ontario’s Open Data Directive requires all government data to be made public and sets out key principles and requirements for publishing open data.5
China does not have PSI laws for the time being.
Singapore does not have any legislation which provides for the sharing of PSI with the public. However, the Singaporean government has introduced a number of initiatives to share PSI with the public. These include:
The Public Sector (Governance) Act 2018 sets out a framework for the sharing of PSI between various Singaporean public sector agencies. There are 61 public sector agencies in total, which are categorised into three main groups: (1) agencies that fulfil public functions; (2) professional boards established to self-regulate members of their respective professions; and (3) bodies whose main function is to represent particular community interests or the volunteer movement.
In Australia the Commonwealth government has committed to optimising the use and re-use of PSI, as noted in the Digital Transformation Agency Open Data web page.
The Australian government had proposed introducing a Data Availability and Transparency Act as part of a general push towards greater sharing and use of government datasets, and to establish safeguards in respect of that sharing and use. However, as of the date of publication this has been delayed.
Availability of sector-specific data as a result of market failure
Although antitrust / competition law data issues are dealt with more generally elsewhere (see Antitrust / Competition Law Data Issues), this part deals with the specific question of whether there are circumstances in which antitrust / competition law, or sector-specific rules, will apply to provide access to a business’s data in the event of a market failure.
Although in principle data access should not be made compulsory (having regard to the legitimate interests of data holders), where there are market failures which antitrust / competition law cannot solve, the EU Commission has legislated for a sector-specific data access right under fair, transparent, reasonable, proportionate and/or non-discriminatory conditions.
The Directive for Automotives stipulates that original equipment manufacturers (OEMs) must make available necessary information to enable independent firms to provide aftermarket services.
The information may be provided on a fair, reasonable, and non-discriminatory (FRAND) basis.
The Directive has been amended recently to account for the fact that modern cars rely on increasingly critical software and telematics, and that new methods or techniques for vehicle diagnostics and repair (such as remote access to vehicle information) and software have become available.
In future, the European Commission may also be able to use a proposed new ex ante competition “tool” in circumstances like these.11 The proposed tool is based on the United Kingdom’s market investigation regime, which currently allows the United Kingdom’s Competition and Markets Authority to investigate markets that are not working well and to impose remedies, including access to information and data. For more information, see EU Commission Launches Consultations on Ex Ante Antitrust Tool and Platform Regulation.
Australia’s competition regulator, the Australian Competition and Consumer Commission (ACCC), has long taken the view that data must not be used in an anti-competitive manner, such as through collusive arrangements or through the abuse of a dominant market position. That is, Australia’s competition laws apply to conduct in respect of data in the same way as other conduct.
The ACCC has established a Data Analytics Unit which has been deployed in a number of market studies undertaken by the ACCC, as well as to support the work of ACCC investigations teams and economists.
There is little point in being able to transfer data if it cannot be re-used easily. Data can come in many formats (for example, Word, PDF or CSV). Datasets may not be in a format readily readable by machines, and some may lack metadata, reducing the usability of the data file.
Such problems in interoperability (that is, the ability to transfer data meaningfully between different IT platforms) make it difficult to share, aggregate or combine data and impede re-use.
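To make the point concrete, the following is a minimal Python sketch, offered for illustration only, of how a dataset held as CSV might be republished as JSON together with basic metadata so that another system can interpret it. The sample data, the function name `csv_to_described_json` and the metadata fields (`title`, `licence`, `fields`, `row_count`) are assumptions made for this example and are not drawn from any standard or from the initiatives discussed here.

```python
import csv
import io
import json

# Hypothetical sample data standing in for a dataset published as CSV.
SAMPLE_CSV = (
    "station,date,temp_c\n"
    "London,2021-01-01,4.5\n"
    "Paris,2021-01-01,6.0\n"
)


def csv_to_described_json(csv_text: str, title: str, licence: str) -> str:
    """Wrap CSV rows in a JSON document that also carries basic metadata."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)  # each row becomes a dict keyed by column name
    document = {
        # Illustrative metadata only; a real scheme would follow an agreed standard.
        "metadata": {
            "title": title,
            "licence": licence,
            "fields": reader.fieldnames,
            "row_count": len(rows),
        },
        # Values stay as strings; type conversion is left to the consumer.
        "records": rows,
    }
    return json.dumps(document, indent=2)


if __name__ == "__main__":
    print(csv_to_described_json(SAMPLE_CSV, "Example weather readings", "hypothetical-open-licence"))
```

The design choice here is simply to keep the description of the data (its fields, licence and size) in the same file as the data itself, which is one way of addressing the missing-metadata problem described above.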
Complex supply chains, service delivery models, and the possibility of providing complementary products and services may give rise to the need for businesses to share data with a variety of stakeholders. Specific interoperability requirements should be discussed with them so that data can be easily shared, combined and analyzed.
The EU has issued a Rolling Plan on ICT Standardisation to support the development of standardized and compatible formats and protocols for gathering and processing data from different sectors and vertical markets.12
Text and data mining
Uptake of Big Data analytics is supported by the emergence of exceptional technologies, such as data mining, which enable the gathering of large quantities of data. Accordingly, the technique is often used for collecting data for training and developing AI. Text and data mining,13 sometimes referred to as “data scraping,” has been the source of a number of intellectual property disputes in recent years in several jurisdictions (particularly in relation to data scraped from websites, such as flight schedules), reflecting the value of the data at issue.
In the EU, text and data mining is defined as “any automated analytical technique aimed at analyzing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations” (Article 2, Directive on Copyright in the Digital Single Market).14
In certain instances, text and data mining may involve acts protected by copyright and / or by the sui generis database right (notably the reproduction of works or other subject-matter and / or the extraction of contents from a database).15
However, the Directive on Copyright in the Digital Single Market includes provisions that enable research organizations to undertake such text and data mining of works protected by copyright or a database right without there being an infringement, provided: (1) they are carrying out scientific research (public-private partnerships can also benefit if the research organization is not under the control of a commercial undertaking);16 and (2) they have lawful access to the material.
The exception does not apply to works protected under the law of confidence. The Directive is to be implemented by member states by June 2021.
Data rights-holders retain some level of control, as they can still control access to their data. However, once they have given research organizations access to the data, they cannot disapply the exception by contract, nor obstruct text and data mining by implementing technical measures for that purpose (although they may implement such measures to ensure the security and integrity of their networks and databases).
Under the Directive, commercial entities can also carry out text and data mining on content to which they have lawful access, unless rights-holders explicitly reserve their rights in an appropriate, machine-readable manner.
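The text above does not say what counts as an appropriate, machine-readable reservation. Purely as an illustration, the Python sketch below assumes that a website signals its position through a robots.txt file and shows how a mining tool might check that file before collecting a page. The robots.txt content, the bot name `example-tdm-bot`, the URLs and the function `may_mine` are hypothetical. Whether a robots.txt entry amounts to a valid reservation of rights is a legal question that this sketch does not answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real tool would retrieve it from the site.
EXAMPLE_ROBOTS_TXT = """\
User-agent: example-tdm-bot
Disallow: /schedules/

User-agent: *
Allow: /
"""


def may_mine(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules supplied as text
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/schedules/flights.csv", "/about"):
        url = "https://example.com" + path
        print(path, may_mine(EXAMPLE_ROBOTS_TXT, "example-tdm-bot", url))
```

A tool built along these lines would skip any page for which the check returns False, and would still need to confirm separately that it has lawful access to the content.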
Uses for the purpose of scientific research, other than text and data mining, should remain covered by any exceptions provided for in the InfoSoc Directive.17
As at the date of publication, the UK government has confirmed that the UK has no plans to implement the Directive on Copyright in the Digital Single Market.18
As at the date of publication the US Supreme Court has been asked to rule on the scope of the federal law commonly used in data-scraping cases, the Computer Fraud and Abuse Act (CFAA). The case at issue is a federal appeals court case (hiQ Labs, Inc. v. LinkedIn Corp., 938 F.3d 985 (9th Cir. 2019)), permitting hiQ to scrape publicly available social media profile data from LinkedIn.
The federal appeals court held that hiQ had shown a likelihood of success on the merits in its claim that, when a computer network generally permits public access to its data, a user’s accessing of that publicly available data will not constitute access “without authorization” under the CFAA.
US law contains no research-purpose exception for the mining of accessible data. Instead, data mining is not prohibited insofar as it falls within the fair use provision.
Similar to the US, when copyright is recognized as subsisting in data, data mining is not prohibited by the Canadian Copyright Act insofar as it falls within a fair dealing exception,19 such as being for the purpose of research, private study, or education and being otherwise fair under the circumstances.
PRC laws have no specific rules relating to text and data mining.
Singapore's Copyright Act is, at the date of publication, being reviewed with a view to implementing several significant proposed amendments, including the creation of a new exception in the Copyright Act to allow the copying of copyright works for the purposes of data analysis.
The exception is expressly targeted at text and data mining, which is broadly understood to mean the use of automated techniques to analyze text, data and other content to generate insights and information that may not have been possible to obtain through manual effort.
The value that can be gained from data by businesses will inevitably lead to an increase in the use of data to improve daily operations and to develop new products, services and processes.
In many jurisdictions pure information, or data, is not considered to be property. This is because a claim to property in intangible information presents obvious definitional difficulties.
There is a patchwork of different rights, intellectual property rights and contract rights that may apply to data. Understanding the way in which these rights come into play enables a business to understand how its data assets can be protected.
Disruptive technologies, such as AI, IoT, AVs, distributed ledger technology (DLT), cryptocurrencies and smart contracts, generate many different forms of data. What are the particular characteristics of such data, and to what extent can intellectual property rights or other rights protect them?
In this section, we review the EU’s position with regard to industrial and non-personal data and look at whether other jurisdictions have similar initiatives.
Data location laws (in relation to industrial and non-personal data) can be restrictive (as in banking secrecy laws, which may require some types of data to remain onshore or to be “localised”) or liberalising (as in laws that ban the prohibition of export of data from a locality).
The exclusive possession or control of data can raise antitrust / competition law considerations, giving rise to access disputes.
The uncertain nature of intellectual property rights in data means that “contract is king” in data transactions.
Data is an incredibly valuable resource for businesses, enabling organisations to effectively operate and to make business improvements. In order to exploit this value most effectively, businesses must invest in good data management.
Errors, incompleteness or biases within data may flow through, and be amplified by, data analytics process outputs upon which a business's strategic and investment decisions may depend, potentially causing business losses. In this section we deal with liability arising out of use of data / datasets that are in some respect sub-optimal.