Selectional Preferences extracted from Semcor for WordNet 1.6 Synsets (v 1.0)

The selectional preferences computed here for WordNet 1.6 nouns and verbs are obtained from relations extracted from Semcor. First, we apply the Minipar parser [1] to obtain dependencies for all the examples in Semcor. We then extract [noun-synset, relation, verb-synset] triples for the relations "object" and "subject".

To compute the weight of a triple, we use the weights of all the concepts above the target noun and verb synsets in the WordNet hierarchy. The formula for these weights is based on estimated frequencies acquired from Semcor: for each occurrence of a synset or a triple in Semcor, we distribute the frequency among its ancestors. The formulas and a complete description of this work can be found in [2] and [3]. An application of the method to multilingual analysis is described in [4].
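The frequency distribution step can be illustrated with a minimal sketch. It assumes that each occurrence is split uniformly among the synset and all of its ancestors, which is a simplification chosen for illustration; the actual formulas are those given in [2] and [3]. The toy hierarchy, the synset names and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: synset -> list of direct hypernyms.
HYPERNYMS = {
    "dog": ["canine"],
    "canine": ["animal"],
    "cat": ["animal"],
    "animal": ["entity"],
    "entity": [],
}

def ancestors_and_self(synset, hypernyms):
    """Return the synset plus every synset above it in the hierarchy."""
    seen = []
    stack = [synset]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(hypernyms.get(current, []))
    return seen

def distribute_frequencies(counts, hypernyms):
    """Spread each observed count over the synset and its ancestors.

    The uniform split is an assumption made for this sketch.
    """
    estimated = defaultdict(float)
    for synset, count in counts.items():
        targets = ancestors_and_self(synset, hypernyms)
        share = count / len(targets)
        for target in targets:
            estimated[target] += share
    return dict(estimated)

# Hypothetical observed counts for two synsets.
observed = {"dog": 4, "cat": 3}
estimated = distribute_frequencies(observed, HYPERNYMS)
```

In this sketch the total mass is preserved: the estimated frequencies sum to the number of observed occurrences, and general concepts such as "animal" accumulate mass from every synset below them.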

With the frequency estimates obtained from Semcor, we can apply the formula to get the weight of any triple of the form [noun-synset, relation, verb-synset] for the relations "object" and "subject". A "pruning" method is also provided to limit the noun synsets that are linked to verb synsets (we keep only one synset per branch of the WordNet hierarchy).
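One possible reading of the pruning step is sketched below. It assumes that two noun synsets lie on the same branch when one is an ancestor of the other, and that the synset with the highest weight is the one kept; both are assumptions made for illustration, and the actual criterion is described in [2] and [3]. The hierarchy, weights and function names are hypothetical.

```python
# Hypothetical toy hierarchy: synset -> list of direct hypernyms.
HYPERNYMS = {
    "dog": ["canine"],
    "canine": ["animal"],
    "cat": ["animal"],
    "animal": ["entity"],
    "artifact": ["entity"],
    "entity": [],
}

def ancestors(synset, hypernyms):
    """Return the set of all synsets strictly above `synset`."""
    found = set()
    stack = list(hypernyms.get(synset, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(hypernyms.get(current, []))
    return found

def prune(noun_weights, hypernyms):
    """Keep at most one noun synset per branch, preferring higher weights.

    Greedy selection by weight is an assumption of this sketch.
    """
    kept = {}
    ranked = sorted(noun_weights.items(), key=lambda item: item[1], reverse=True)
    for synset, weight in ranked:
        above = ancestors(synset, hypernyms)
        on_kept_branch = any(
            other in above or synset in ancestors(other, hypernyms)
            for other in kept
        )
        if not on_kept_branch:
            kept[synset] = weight
    return kept

# Hypothetical weights of noun synsets linked to a single verb synset.
weights_for_verb = {
    "animal": 0.6, "dog": 0.9, "canine": 0.7, "cat": 0.4, "artifact": 0.2,
}
pruned = prune(weights_for_verb, HYPERNYMS)
```

Here "canine" and "animal" are dropped because they sit above the higher-weighted "dog", while "cat" and "artifact" survive because they are neither ancestors nor descendants of a kept synset.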

This distribution contains the weights for the selectional preferences extracted from Semcor, as well as for all the combinations formed with the ancestors of the directly extracted triples. The pruning step has not been applied. The weights have been normalized to range between 0 and 1 (the higher the weight, the stronger the relation).
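The combinations formed with the ancestors of a triple can be pictured with a short sketch: every synset at or above the noun is paired with every synset at or above the verb, keeping the relation fixed. The two toy hierarchies, the synset names and the function names are hypothetical, and the sketch does not reflect the format of the distributed files.

```python
from itertools import product

# Hypothetical toy hierarchies: synset -> list of direct hypernyms.
NOUN_HYPERNYMS = {"dog": ["canine"], "canine": ["animal"], "animal": []}
VERB_HYPERNYMS = {"bark": ["make_noise"], "make_noise": []}

def with_ancestors(synset, hypernyms):
    """Return the synset followed by every synset above it."""
    ordered = []
    stack = [synset]
    while stack:
        current = stack.pop()
        if current not in ordered:
            ordered.append(current)
            stack.extend(hypernyms.get(current, []))
    return ordered

def expand_triple(noun, relation, verb):
    """List every [noun-synset, relation, verb-synset] combination
    formed by the triple and the ancestors of its noun and verb."""
    nouns = with_ancestors(noun, NOUN_HYPERNYMS)
    verbs = with_ancestors(verb, VERB_HYPERNYMS)
    return [(n, relation, v) for n, v in product(nouns, verbs)]

combos = expand_triple("dog", "subject", "bark")
```

With three noun synsets and two verb synsets, the single extracted triple yields six combinations, which gives a sense of why the distributed files contain many more entries than the triples observed directly in Semcor.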

Contact: David Martinez (IXA NLP group)

Download:

README.txt
Syntactic relations in Semcor (646K)
Selectional preferences for Subject relations (950K)
Selectional preferences for Object relations (1.5M)

References

[1] Lin, D. 1993.
Principle-based parsing without overgeneration.
Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics. Columbus, Ohio. pp. 112-120.

[2] Agirre E., Martínez D. 2001.
Learning class-to-class selectional preferences.
Proceedings of the Workshop "Computational Natural Language Learning" (CoNLL-2001). In conjunction with ACL'2001/EACL'2001. Toulouse, France.

[3] Agirre E., Martínez D. 2002.
Integrating Selectional Preferences in WordNet.
Proceedings of the First International WordNet Conference. Mysore, India.

[4] Agirre E., Aldezabal I., Pociello E. 2003.
A pilot study of English Selectional Preferences and their Cross-Lingual Compatibility with Basque.
Proceedings of the International Conference on Text, Speech and Dialogue (TSD 2003). Czech Republic.