What is Text & Data Mining (TDM)?

Text & Data Mining (TDM) is a collective term for automated analysis methods that search large amounts of information, relate the items found to one another, and identify trends and new correlations. In Switzerland, TDM has been legally permitted for scientific research since the revised Copyright Act came into force in 2020. The precondition for TDM is lawful access to the texts being analysed. This means that TDM may be used on texts that have either been licensed by the library or published open access.
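
To make the idea of "identifying trends" concrete, the following is a minimal sketch of one very simple TDM step: counting how often a term occurs per year in a small corpus. The corpus, the function names `term_counts` and `term_trend`, and the tokenisation rule are all illustrative assumptions; real projects typically use far larger corpora and more careful text processing.

```python
import re
from collections import Counter


def term_counts(text):
    """Lower-case the text, split it into alphabetic words, and count each word."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def term_trend(corpus, term):
    """Return {year: number of occurrences of term} for a list of (year, text) pairs."""
    return {year: term_counts(text)[term] for year, text in corpus}


# Hypothetical mini-corpus standing in for licensed or open access texts.
corpus = [
    (2019, "Copyright law and research data"),
    (2020, "Text mining and data mining in research"),
    (2021, "Mining texts, mining data, mining metadata"),
]

trend = term_trend(corpus, "mining")
for year, count in trend.items():
    print(year, count)
```

A rising count across the years would be a first, very rough indication of a trend that can then be examined with more robust methods.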

Publishers and databases allowing TDM

Many publishers have general rules on the use of text and data mining in their publications (non-exhaustive list):

TDM in LORY and LARA

Text & data mining is also possible in LORY (the institutional repository of the Lucerne universities) and LARA (the repository of the ZHB Lucerne). LORY is based on the Zenodo platform and the necessary information on the APIs can be found under the following link
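
Because LORY is based on Zenodo, a harvesting script could start from Zenodo's public records search endpoint. The sketch below only builds the request URL and extracts titles and descriptions from an already decoded JSON response; it does not send any request. The community identifier `lory`, the function names and the default page size are assumptions made for illustration and should be checked against the repository and the API documentation before use.

```python
from urllib.parse import urlencode

# Zenodo's records search endpoint.
ZENODO_RECORDS_API = "https://zenodo.org/api/records"


def build_search_url(query, community="lory", size=25, page=1):
    """Build a search URL; the community identifier 'lory' is an assumption."""
    params = {"q": query, "communities": community, "size": size, "page": page}
    return f"{ZENODO_RECORDS_API}?{urlencode(params)}"


def extract_texts(response_json):
    """Collect title and description from each hit of a decoded search response."""
    hits = response_json.get("hits", {}).get("hits", [])
    return [
        {
            "title": hit.get("metadata", {}).get("title", ""),
            "description": hit.get("metadata", {}).get("description", ""),
        }
        for hit in hits
    ]


print(build_search_url("mining"))
```

The URL can then be fetched with any HTTP client, and the decoded JSON passed to `extract_texts` to obtain text for further analysis.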

TDM in Swissdox

For employees and students of the University of Lucerne:
Swissdox Liri is a tool, licensed for the University of Lucerne, for extracting press articles and other content from Swissdox.

Access via Switch edu-ID of the University of Lucerne:
Terms of use (only in German)

Freely accessible databases and TDM

There are also freely accessible databases that allow TDM (non-exhaustive list):

  • arXiv
    Free access to preprints from the fields of physics, mathematics, computer science, statistics, financial mathematics and biology
  • BioMed Central
    Over 300 open access journals from BioMed Central, Chemistry Central and SpringerOpen in the fields of biology and medicine
  • Europeana
    Digital library with digitised data on scientific and cultural heritage from over 2000 European institutions
  • HathiTrust Digital Library
    Digitised material from over 120 academic institutions worldwide
  • Internet Archive
    Access to millions of freely accessible electronic books and texts
  • Public Library of Science (PLOS)
    Access to the contents of the journals of the Public Library of Science, a scientific open access publisher
  • PubMed Central: Databases and Text Mining Tools
    Various freely accessible mining tools that can be used to search PubMed Central, an archive of freely accessible content from the fields of biology and biomedicine
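
As an illustration of how one of the databases listed above can be queried programmatically, the sketch below targets the arXiv API, which returns search results as an Atom feed. It builds a query URL and parses titles and abstracts from a feed that has already been downloaded; no request is sent. The function names, the `all:` search prefix and the result limit are illustrative choices, and the arXiv terms of use and rate limits should be consulted before harvesting at scale.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}  # namespace of the Atom feed


def build_query_url(term, start=0, max_results=5):
    """Build a search URL that looks for the term in all metadata fields."""
    params = {"search_query": f"all:{term}", "start": start, "max_results": max_results}
    return f"{ARXIV_API}?{urlencode(params)}"


def parse_feed(xml_text):
    """Extract title and summary (abstract) from each entry of an Atom feed."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": entry.findtext("atom:title", default="", namespaces=ATOM).strip(),
            "summary": entry.findtext("atom:summary", default="", namespaces=ATOM).strip(),
        }
        for entry in root.findall("atom:entry", ATOM)
    ]


# Hand-written sample feed used here instead of a live response.
sample_feed = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<entry><title> Example title </title><summary>Example abstract</summary></entry>"
    "</feed>"
)
print(build_query_url("mining"))
print(parse_feed(sample_feed))
```

The parsed titles and abstracts can then be fed into whatever analysis step the research question requires.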