OLAC Record: Hindi WordNet

OLAC Record
oai:www.ldc.upenn.edu:LDC2008L02

Metadata

Title: Hindi WordNet

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Bhattacharyya, Pushpak, Prabhakar Pande, and Laxmi Lupu. Hindi WordNet LDC2008L02. Web Download. Philadelphia: Linguistic Data Consortium, 2008

Contributor: Bhattacharyya, Pushpak; Pande, Prabhakar; Lupu, Laxmi

Date (W3CDTF): 2008

Date Issued (W3CDTF): 2008-05-19

Description: *Introduction* Hindi WordNet was developed by researchers at the Center for Indian Language Technology, Computer Science and Engineering Department, IIT Bombay. A member of the Indo-Iranian language family, Hindi is the primary national language of India and is spoken by approximately 500 million people, making it the fifth largest language in the world. Inspired by the well-known English-language WordNet, Hindi WordNet is the first wordnet for an Indian language. Wordnets are systems for analyzing the different lexical and semantic relations between words. Specifically, a wordnet is a word sense network in which words are grouped into semantically equivalent units called synsets. Each synset represents a lexical concept, and synsets are linked to each other by semantic relations (between synsets) and lexical relations (between words). Similar in design to the Princeton WordNet for English, Hindi WordNet incorporates additional features to capture the complexities of Hindi. This release of Hindi WordNet consists of 56,928 unique words and 26,208 synsets. Additional information about the development of Hindi WordNet is available at the Hindi WordNet web site.

*Data* Hindi WordNet contains nouns, verbs, adjectives and adverbs. Each entry consists of the following elements:

* Synset: a set of synonymous words. For example, "विद्यालय, पाठशाला, स्कूल" (vidyaalay, paaThshaalaa, skuul) represents the concept of school as an educational institution. The words in the synset are arranged according to their frequency of usage.
* Gloss: a description of the concept. It consists of two parts. Text definition: it explains the concept denoted by the synset. For example, "वह स्थान जहाँ प्राथमिक या माध्यमिक स्तर की औपचारिक शिक्षा दी जाती है" (vah sthaan jahaaM praathamik yaa maadhyamik star kii aupacaarik shikshaa dii jaatii hai) explains the concept of school as an educational institution. Example sentence: it shows the words of the synset used in a sentence. Generally, the words in a synset are interchangeable in that sentence. For example, "इस विद्यालय में पहली से पाँचवीं तक की शिक्षा दी जाती है" (is vidyaalay me pahalii se pancvii tak kii shikshaa dii jaatii hai) gives the usage for the words in the synset representing school as an educational institution.
* Position in Ontology: An ontology is a hierarchical organization of concepts, or more specifically, a categorization of entities and actions. A separate ontological hierarchy exists for each syntactic category (noun, verb, adjective, adverb). Each synset is mapped to a place in the ontology.

This release of Hindi WordNet is made available as a complete Java application along with an API to facilitate further development.
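The entry structure described above (synset, gloss with definition and example sentence, position in ontology) can be pictured with a minimal sketch. It is written in Python for brevity and is not the Java API shipped with the release. The class names, field names, and the `example_with` helper are hypothetical, and the ontology path is an illustrative placeholder rather than a value taken from the data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gloss:
    definition: str  # text definition of the concept
    example: str     # example sentence using a word from the synset


@dataclass
class SynsetEntry:
    words: List[str]          # synonymous words, ordered by frequency of usage
    gloss: Gloss
    category: str             # noun, verb, adjective, or adverb
    ontology_path: List[str]  # hypothetical encoding of the ontology position

    def head_word(self) -> str:
        """Return the first (most frequently used) word of the synset."""
        return self.words[0]

    def example_with(self, replacement: str) -> str:
        """Swap the synset word found in the example for another member.

        Naive substring replacement: it ignores inflection and agreement,
        so it only illustrates the idea that synset members are generally
        interchangeable in the example sentence.
        """
        if replacement not in self.words:
            raise ValueError("replacement must be a member of the synset")
        for word in self.words:
            if word in self.gloss.example:
                return self.gloss.example.replace(word, replacement)
        return self.gloss.example


# The "school as an educational institution" entry from the description.
school = SynsetEntry(
    words=["विद्यालय", "पाठशाला", "स्कूल"],
    gloss=Gloss(
        definition="वह स्थान जहाँ प्राथमिक या माध्यमिक स्तर की औपचारिक शिक्षा दी जाती है",
        example="इस विद्यालय में पहली से पाँचवीं तक की शिक्षा दी जाती है",
    ),
    category="noun",
    # Placeholder path for illustration only; not taken from the release.
    ontology_path=["noun", "<placeholder-category>"],
)

print(school.head_word())
print(school.example_with("पाठशाला"))
```

Keeping `words` as an ordered list rather than a set preserves the frequency ordering that the description says the synset carries.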

Extent: Corpus size: 21504 KB

Identifier: LDC2008L02

https://catalog.ldc.upenn.edu/LDC2008L02

ISBN: 1-58563-470-0

ISLRN: 853-261-507-123-4

DOI: 10.35111/s81s-5n27

Language: Hindi

Language (ISO639): hin

License: Hindi WordNet Agreement (LDC2008L02): https://catalog.ldc.upenn.edu/license/hindi-wordnet.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC2008L02

Rights Holder: Portions © 2007 IIT Bombay, © 2008 Trustees of the University of Pennsylvania

Subject: Hindi language

Subject (ISO639): hin

Type (DCMI): Text

Type (OLAC): lexicon

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC2008L02

DateStamp: 2026-02-20

GetRecord: OAI-PMH request for simple DC format
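The GetRecord entries above refer to OAI-PMH requests for this record. As a rough sketch, such a request is a URL carrying the `verb`, `identifier`, and `metadataPrefix` arguments defined by OAI-PMH. The base URL below is a placeholder, since the record does not state the repository endpoint, and the function name is hypothetical. The `olac` prefix is assumed for the OLAC format and `oai_dc` for simple Dublin Core.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint: the record does not give the real base URL.
BASE_URL = "https://example.org/oai"

OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2008L02"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# One request per metadata format listed in the record.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

`urlencode` percent-encodes the colons in the identifier, which keeps the query string safe to pass through HTTP clients unchanged.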

Search Info
Citation: Bhattacharyya, Pushpak; Pande, Prabhakar; Lupu, Laxmi. 2008. Linguistic Data Consortium.
Terms: area_Asia country_IN dcmi_Text iso639_hin olac_lexicon

Inferred Metadata
Country: India
Area: Asia

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2008L02
Up-to-date as of: Wed Jul 8 7:30:28 EDT 2026
