Conference papers

The regulation of Text and Data Mining

Abstract : Researchers need text and data that can be accessed and reused for data mining purposes. Research libraries play a role not only in providing this material, but also in giving legal advice on which actions can or cannot be performed on it. They negotiate licensing agreements with content providers. Research libraries also advise governments and participate in consultations such as the Licensing for Europe Text and Data Mining Working Group and the European Commission public consultation on copyright. The regulation of Text and Data Mining (TDM) is affected by legislation on the creation, use, access to and reuse of data related to research, copyright, public sector information, data protection, education and sectoral domains such as the environment. The analysis also considers the licensing agreements imposed by data providers and the licensing options available to research institutions as data producers, which range from all rights reserved to the public domain and include the various open licenses. This legal framework of law and licenses (regulation by law) is complemented by opportunities and restrictions embedded in the technical architecture of the platforms hosting the data (regulation by technology), which can make certain actions difficult or practically impossible to perform. The discrepancies between this techno-legal framework and the requirements of researchers' applications, which need to process data and perform queries, mining, visualization or other analysis tasks without restriction, indicate points of friction that should be resolved. The most important issues are attribution, non-commercial and share-alike requirements, the lack of a definition of data, the framing of TDM as an exception instead of a right, and technical restrictions. The methodology combines legal research and argumentation to produce policy recommendations.
The geographic focus is Europe, but US and Latin American Open Access legislation is included in the sources, as it should be analysed from a critical perspective. While most literature and projects deal with Open Access to publications, this article targets Open Access to research data more specifically and includes recent developments: the 4.0 Creative Commons licenses available since November 2013, the Horizon 2020 pilot published in December 2013, and the Elsevier TDM policy and the Twitter Data Grant, both released in February 2014.
Contributor : Melanie Dulong de Rosnay
Submitted on : Wednesday, February 11, 2015 - 2:17:40 PM
Last modification on : Monday, February 24, 2020 - 11:38:02 AM


Distributed under a Creative Commons Attribution 4.0 International License


  • HAL Id : halshs-01115017, version 1



Melanie Dulong de Rosnay. The regulation of Text and Data Mining. 43rd Annual Conference on ‘Research Libraries in the 2020 Information Landscape’, LIBER (Association of European Research Libraries), Jul 2014, Riga, Latvia. ⟨halshs-01115017⟩


