
GermaNet

From Wikipedia, the free encyclopedia

GermaNet is a semantic network for the German language. It relates nouns, verbs, and adjectives semantically by grouping lexical units that express the same concept into synsets and by defining semantic relations between these synsets.[1] GermaNet is free for academic use once a license agreement has been signed. It has much in common with the English WordNet and can be viewed as an on-line thesaurus or a light-weight ontology. GermaNet has been developed and maintained since 1997 by the research group for General and Computational Linguistics at the University of Tübingen. It has been integrated into EuroWordNet, a multilingual lexical-semantic database.[2]

Database


Contents


GermaNet partitions the lexical space into a set of concepts that are interlinked by semantic relations. A semantic concept is modeled by a synset. A synset is a set of words (called lexical units) where all the words are taken to have the same or almost the same meaning. Thus, a synset is a set of synonyms grouped under one definition, or "gloss".

In addition to the gloss, synsets are labeled with their syntactic function and accompanied by example sentences for each distinct meaning in the synset.[3] Just as in WordNet, for each word category the semantic space is divided into a number of semantic fields closely related to major nodes in the semantic network: Ort, or "location", Körper, or "body", etc.[2]
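The structure described above can be sketched as a small in-memory data model. This is only an illustration under stated assumptions: the class names, field names, relation name, and toy entries below are hypothetical and are not taken from GermaNet's actual schema or data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LexicalUnit:
    """One word in one of its senses (hypothetical, simplified)."""
    orth_form: str  # written form of the word


@dataclass
class Synset:
    """A set of lexical units taken to share (almost) the same meaning."""
    synset_id: str
    word_category: str   # "noun", "verb" or "adjective"
    semantic_field: str  # e.g. "Ort" (location) or "Körper" (body)
    gloss: str
    lexical_units: list[LexicalUnit] = field(default_factory=list)

    def synonyms(self) -> list[str]:
        """Return the written forms of all lexical units in this synset."""
        return [lu.orth_form for lu in self.lexical_units]


@dataclass
class SemanticNetwork:
    """Synsets plus named, directed relations between them."""
    synsets: dict[str, Synset] = field(default_factory=dict)
    # (relation name, source synset id) -> ids of related synsets
    relations: dict[tuple[str, str], set[str]] = field(default_factory=dict)

    def add_synset(self, synset: Synset) -> None:
        self.synsets[synset.synset_id] = synset

    def relate(self, name: str, source_id: str, target_id: str) -> None:
        self.relations.setdefault((name, source_id), set()).add(target_id)

    def related(self, name: str, source_id: str) -> set[str]:
        return self.relations.get((name, source_id), set())


# Toy entries, invented for illustration only.
body_part = Synset("s1", "noun", "Körper", "part of a body",
                   [LexicalUnit("Körperteil")])
head = Synset("s2", "noun", "Körper", "upper part of the body",
              [LexicalUnit("Kopf"), LexicalUnit("Haupt")])

net = SemanticNetwork()
net.add_synset(body_part)
net.add_synset(head)
net.relate("has_hypernym", "s2", "s1")

print(head.synonyms())
print(net.related("has_hypernym", "s2"))
```

Keeping relations outside the synset objects mirrors the idea that a synset models a concept, while the network models how concepts are linked.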

As of version 15.0 (released May 2020), GermaNet contains:[2]

  • Synsets: 144113
  • Lexical Units: 185000
  • Literals: 169521
  • Conceptual Relations: 157921
  • Lexical Relations (synonymy excluded): 12203
  • Split Compounds: 98905
  • Interlingual Index (ILI) Records: 28564
  • Wiktionary Sense Descriptions: 29548

Format


All GermaNet data is stored in a PostgreSQL relational database. The database schema follows the internal structure of GermaNet: there are tables to store synsets, lexical units, conceptual and lexical relations, etc.[3] GermaNet data is distributed both in this database format and as XML files. In the XML data, two types of files, one for synsets and the other for relations, represent all data available in the GermaNet database.[4]
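To illustrate how a two-file layout of this kind might be consumed, the sketch below parses one XML document holding synsets and another holding relations, then joins them by synset id. The element and attribute names (`synset`, `lexUnit`, `con_rel`, `from`, `to`) and the sample data are assumptions made for this example; the real GermaNet files may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML. Element and attribute names are assumptions
# for illustration and may not match the actual GermaNet distribution.
SYNSETS_XML = """
<synsets>
  <synset id="s1" category="noun">
    <lexUnit id="l1">Körperteil</lexUnit>
  </synset>
  <synset id="s2" category="noun">
    <lexUnit id="l2">Kopf</lexUnit>
    <lexUnit id="l3">Haupt</lexUnit>
  </synset>
</synsets>
"""

RELATIONS_XML = """
<relations>
  <con_rel name="has_hypernym" from="s2" to="s1"/>
</relations>
"""


def load_synsets(xml_text):
    """Map each synset id to the list of words in that synset."""
    root = ET.fromstring(xml_text.strip())
    return {
        synset.get("id"): [lu.text for lu in synset.findall("lexUnit")]
        for synset in root.findall("synset")
    }


def load_conceptual_relations(xml_text):
    """Return (relation name, source synset id, target synset id) triples."""
    root = ET.fromstring(xml_text.strip())
    return [
        (rel.get("name"), rel.get("from"), rel.get("to"))
        for rel in root.findall("con_rel")
    ]


synsets = load_synsets(SYNSETS_XML)
relations = load_conceptual_relations(RELATIONS_XML)

# Resolve the ids in each relation back to the words of the two synsets.
for name, source_id, target_id in relations:
    print(synsets[source_id], name, synsets[target_id])
```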

Interfaces


Software libraries and APIs are available for Java, Python, JavaScript, and Perl.[5][6] These libraries are distributed under free-software licenses and provide programmatic access to the information in various versions of GermaNet.

GermaNet Rover is an on-line application that can be used to search for synsets in GermaNet, explore the data associated with them, and calculate the semantic similarity of pairs of synsets. It features visualizations of the hypernym relation and advanced filtering options for synset searching.
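One common family of similarity measures over a hypernym hierarchy is based on path length. The sketch below is not Rover's implementation and does not use any GermaNet API; it applies a simple path-based measure, 1 / (1 + shortest path through a shared hypernym), to a tiny invented hierarchy. The graph contents and function names are assumptions for illustration.

```python
from collections import deque

# Toy hypernym graph (child -> list of hypernyms), invented for illustration.
HYPERNYMS = {
    "Kopf": ["Körperteil"],
    "Hand": ["Körperteil"],
    "Körperteil": ["Objekt"],
    "Stadt": ["Ort"],
    "Ort": ["Objekt"],
    "Objekt": [],
}


def ancestor_depths(node):
    """Map node and each of its hypernyms to the fewest upward steps needed."""
    depths = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in HYPERNYMS.get(current, []):
            if parent not in depths:
                depths[parent] = depths[current] + 1
                queue.append(parent)
    return depths


def path_length(a, b):
    """Shortest number of hypernym links joining a and b via a shared ancestor."""
    depths_a = ancestor_depths(a)
    depths_b = ancestor_depths(b)
    shared = depths_a.keys() & depths_b.keys()
    if not shared:
        return None  # no common hypernym in this graph
    return min(depths_a[n] + depths_b[n] for n in shared)


def path_similarity(a, b):
    """Simple path-based similarity in (0, 1]; None if the nodes are unconnected."""
    distance = path_length(a, b)
    return None if distance is None else 1 / (1 + distance)


print(path_similarity("Kopf", "Hand"))   # siblings under "Körperteil"
print(path_similarity("Kopf", "Stadt"))  # related only through "Objekt"
```

Under this measure, concepts that share a close hypernym score higher than concepts whose only shared hypernym is near the top of the hierarchy.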

Licenses


GermaNet 15.0 (released May 2020) can be distributed under one of the following types of license agreements:[7]

  • Academic Research License Agreement: for research at academic institutions. There is no license fee for academic use. Licenses are not issued to individual students; students seeking a license are required to consult an academic advisor.
  • Research and Development License Agreement: applies to non-academic institutions and research consortia. It permits use strictly for technology development and internal research.
  • Commercial License Agreement: applies to non-academic institutions and commercial enterprises. It permits technology development and internal research, and grants the non-exclusive right to distribute and market any derived product or service.

Alternatives


Open-de-WordNet is a freely available alternative to GermaNet which is compatible with WordNet.[8]

Linguistic Applications


GermaNet has been used for a variety of applications, including:

  • semantic analysis[9]
  • shallow recognition of implicit document structure[9]
  • compound analysis[9]
  • analyzing selectional preferences[10]
  • word sense disambiguation[11]

References

  1. Petra Storjohann (23 June 2010). Lexical-semantic relations: theoretical and practical perspectives. John Benjamins Publishing Company. pp. 165–. ISBN 978-90-272-3138-3. Retrieved 16 November 2011.
  2. "GermaNet - an Introduction". uni-tuebingen.de. Retrieved October 1, 2020.
  3. V. Henrich, E. Hinrichs. 2010. GernEdiT - The GermaNet Editing Tool. In: Proceedings of the Seventh Conference on International Language Resources and Evaluation.
  4. "Data format". Retrieved October 1, 2020.
  5. "Applications and Tools". uni-tuebingen.de. Retrieved October 1, 2020.
  6. "GermaNet::Flat". metacpan.org. Retrieved October 1, 2020.
  7. "Licenses". uni-tuebingen.de. Retrieved October 1, 2020.
  8. "GitHub - hdaSprachtechnologie/odenet: Open German WordNet". November 14, 2019. Retrieved November 20, 2019 – via GitHub.
  9. Manuela Kunze and Dietmar Rösner. 2004. Issues in Exploiting GermaNet as a Resource in Real Applications.
  10. Sabine Schulte im Walde. 2004. GermaNet Synsets as Selectional Preferences in Semantic Verb Clustering.
  11. Saito et al. 2002. Evaluation of GermaNet: Problems Using GermaNet for Automatic Word Sense Disambiguation.