RAS History & Philology / Русская литература (Russian Literature)

  • ISSN (Print) 0131-6095
  • ISSN (Online) 3034-591X

Metadata of Linguistic Resources: History and Current State

PII
S160578800018917-4
DOI
10.31857/S160578800018917-4
Publication type
Article
Status
Published
Authors
Volume / Issue
Volume 81 / Issue 1
Pages
21-36
Abstract

This article describes the main metadata projects for linguistic (language) resources developed over the past 20 years: the IMDI initiative, the OLAC metadata system, the META-SHARE metadata model, the International Standard Language Resource Number (ISLRN), the Language Resources and Evaluation (LRE) Map, and the CLARIN component metadata model. It also outlines the content of the ISO metadata standard and surveys projects to create dictionaries, ontologies, and lexical databases for language-resource metadata.

Keywords
Metadata, linguistic resources, language resources, standards, dictionaries, ontologies
Date of publication
11.03.2022
Year of publication
2022
