Commit f5bdf44

(add): add WN18 dataset; (feat): implement data-dependent entity num counter; (add): add README
1 parent 8be96d1 commit f5bdf44

9 files changed: 192,501 additions, 2 deletions

README.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
# PyTorch RGCN (Link Prediction)

## About

PyTorch implementation of relational link prediction with RGCN (Modeling Relational Data with Graph Convolutional Networks). See https://github.com/MichSchli/RelationPrediction and https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn for more information.

Requirements: python >= 3.6; pytorch >= 1.4.0; torch_geometric >= 1.4.0

data/wn18/README

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
----------------------------------------------
-- WORDNET TENSOR DATA -- A. Bordes -- 2013 --
----------------------------------------------

------------------
OUTLINE:
1. Introduction
2. Content
3. Data Format
4. Data Statistics
5. How to Cite
6. License
7. Contact
-------------------


1. INTRODUCTION:

This WORDNET TENSOR DATA consists of a collection of triplets (synset, relation_type,
synset) extracted from WordNet 3.0 (http://wordnet.princeton.edu). This data set can
be seen as a 3-mode tensor depicting ternary relationships between synsets.
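
To make the 3-mode tensor view concrete, here is a minimal sketch in Python. It assumes the synsets and relation types have already been mapped to integer indices, and it orders the modes as (left synset, relation, right synset); both the ordering and the toy values are assumptions made for illustration, not part of the dataset.

```python
from typing import Set, Tuple

# Hypothetical sizes and integer-indexed triplets, chosen only for illustration.
num_synsets = 4
num_relations = 2
indexed_triplets = [(0, 0, 1), (1, 1, 0), (2, 0, 3)]

# Tensor shape, with modes ordered as (left synset, relation, right synset).
shape = (num_synsets, num_relations, num_synsets)

# Sparse storage: keep only the coordinates of the observed (non-zero) entries.
nonzero: Set[Tuple[int, int, int]] = set(indexed_triplets)


def tensor_entry(lhs: int, relation: int, rhs: int) -> int:
    """Return 1 if the triplet is observed, 0 otherwise."""
    return 1 if (lhs, relation, rhs) in nonzero else 0


# Fraction of tensor cells that are non-zero.
density = len(nonzero) / (shape[0] * shape[1] * shape[2])
```

With the full data the tensor would be far too sparse to store densely, which is why the sketch keeps only the coordinates of observed triplets.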


2. CONTENT:

The data archive contains 6 files:
 - README                          3K
 - wordnet-mlj12-definitions.txt   4,2M
 - wordnet-mlj12-train.txt         4,5M
 - wordnet-mlj12-valid.txt         165K
 - wordnet-mlj12-test.txt          165K

The 3 files wordnet-mlj12-*.txt contain the triplets (training, validation
and test sets), while the file wordnet-mlj12-definitions.txt lists the WordNet
synsets definitions.


3. DATA FORMAT

The definitions file (wordnet-mlj12-definitions.txt) contains one synset
per line with the following format: synset_id (an 8-digit unique identifier),
intelligible name (word+POS_tag+sense_index), definition. These 3
pieces of information are separated by a tab ('\t').
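
As a minimal sketch of reading this format, the Python snippet below splits one line of the definitions file into its three tab-separated fields. The class name, function name, and sample line are hypothetical and made up for illustration; the sample is not taken from the actual file.

```python
from typing import NamedTuple


class SynsetDefinition(NamedTuple):
    synset_id: str   # 8-digit identifier, kept as a string to preserve leading zeros
    name: str        # intelligible name: word+POS_tag+sense_index
    definition: str  # free-text definition


def parse_definition_line(line: str) -> SynsetDefinition:
    """Split one tab-separated line of the definitions file into its three fields."""
    # maxsplit=2 keeps any stray tab inside the definition text intact.
    synset_id, name, definition = line.rstrip("\n").split("\t", 2)
    return SynsetDefinition(synset_id, name, definition)


# Hypothetical sample line, invented for this example.
sample = "00000001\t__example_NN_1\ta made-up definition\n"
record = parse_definition_line(sample)
```

Keeping `synset_id` as a string rather than an integer is a design choice in this sketch, so that identifiers with leading zeros still match the ids used in the triplet files.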

All wordnet-mlj12-*.txt files contain one triplet per line, with 2 synset_ids
and a relation type identifier in a tab-separated format. The first element is the
synset_id of the left hand side of the relation triple, the third one is the
synset_id of the right hand side and the second element is the name of the type
of relation between them.
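
The triplet format can be read with a few lines of Python. The sketch below also derives the number of entities and relation types from the data itself, which is one plausible reading of the "data-dependent entity num counter" named in the commit message; it is not the code from this commit. The function names and the sample triplets are hypothetical.

```python
import io
from typing import Dict, Iterable, List, Tuple

Triplet = Tuple[str, str, str]  # (left synset_id, relation name, right synset_id)


def read_triplets(lines: Iterable[str]) -> List[Triplet]:
    """Parse tab-separated 'lhs<TAB>relation<TAB>rhs' lines, skipping blank ones."""
    triplets = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        lhs, relation, rhs = line.split("\t")
        triplets.append((lhs, relation, rhs))
    return triplets


def build_index(triplets: Iterable[Triplet]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Assign consecutive integer ids to entities and relations in order of first appearance."""
    entity_to_id: Dict[str, int] = {}
    relation_to_id: Dict[str, int] = {}
    for lhs, relation, rhs in triplets:
        for entity in (lhs, rhs):
            entity_to_id.setdefault(entity, len(entity_to_id))
        relation_to_id.setdefault(relation, len(relation_to_id))
    return entity_to_id, relation_to_id


# Hypothetical sample data standing in for a wordnet-mlj12-*.txt file.
sample_file = io.StringIO(
    "00000001\t_hypernym\t00000002\n"
    "00000002\t_hyponym\t00000001\n"
    "00000003\t_hypernym\t00000002\n"
)
triplets = read_triplets(sample_file)
entity_to_id, relation_to_id = build_index(triplets)
num_entities = len(entity_to_id)
num_relations = len(relation_to_id)
```

To use it on a real file, pass an open file object instead of the `io.StringIO` sample, for example `read_triplets(open(path, encoding="utf-8"))`, where `path` points at one of the wordnet-mlj12-*.txt files.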


4. DATA STATISTICS

There are 40,943 synsets and 18 relation types among them. The training set contains
141,442 triplets, the validation set 5,000 and the test set 5,000.

All triplets are unique and we made sure that all synsets appearing in
the validation or test sets also occur in the training set.
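
The two properties stated above (unique triplets, and no validation or test synset missing from training) can be checked with a short Python sketch. It assumes the three splits have already been loaded as lists of (left synset_id, relation, right synset_id) tuples, and it interprets uniqueness as holding across all three splits combined, which is an assumption. The function names and toy data are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triplet = Tuple[str, str, str]  # (left synset_id, relation name, right synset_id)


def entities(triplets: Iterable[Triplet]) -> Set[str]:
    """Collect every synset_id that appears on either side of a triplet."""
    return {entity for lhs, _, rhs in triplets for entity in (lhs, rhs)}


def check_splits(train: List[Triplet], valid: List[Triplet], test: List[Triplet]) -> Dict[str, object]:
    """Report the total triplet count, uniqueness, and entities unseen in training."""
    all_triplets = train + valid + test
    unseen = (entities(valid) | entities(test)) - entities(train)
    return {
        "num_triplets": len(all_triplets),
        "all_unique": len(set(all_triplets)) == len(all_triplets),
        "unseen_entities": unseen,
    }


# Toy splits, invented for illustration.
train = [("a", "r1", "b"), ("b", "r2", "c")]
valid = [("a", "r2", "c")]
test = [("c", "r1", "a")]
report = check_splits(train, valid, test)
```

On the full dataset, the counts stated above would give 141,442 + 5,000 + 5,000 = 151,442 triplets in total, and `unseen_entities` would be expected to be empty.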

5. HOW TO CITE

When using this data, one should cite the original paper:
  @article{bordes-mlj13,
    title = {A Semantic Matching Energy Function for Learning with Multi-relational Data},
    author = {Antoine Bordes and Xavier Glorot and Jason Weston and Yoshua Bengio},
    journal={Machine Learning},
    publisher={Springer},
    year={2013},
    note={to appear}
  }

One should also point at the project page with either the long URL:
https://www.hds.utc.fr/everest/doku.php?id=en:smemlj12 , or the short
one: http://goo.gl/bHWsK .

6. LICENSE:

WordNet data follows the attached license agreement.

7. CONTACT

For all remarks or questions please contact Antoine Bordes: antoine
(dot) bordes (at) utc (dot) fr .

data/wn18/Wordnet3.0-LICENSE

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
WordNet Release 3.0 This software and database is being provided to you, the LICENSEE, by Princeton University under the following license. By obtaining, using and/or copying this software and database, you agree that you have read, understood, and will comply with these terms and conditions.: Permission to use, copy, modify and distribute this software and database and its documentation for any purpose and without fee or royalty is hereby granted, provided that you agree to comply with the following copyright notice and statements, including the disclaimer, and that the same appear on ALL copies of the software, database and documentation, including modifications that you make for internal use or for distribution. WordNet 3.0 Copyright 2006 by Princeton University. All rights reserved. THIS SOFTWARE AND DATABASE IS PROVIDED "AS IS" AND PRINCETON UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PRINCETON UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE LICENSED SOFTWARE, DATABASE OR DOCUMENTATION WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS. The name of Princeton University or Princeton may not be used in advertising or publicity pertaining to distribution of the software and/or database. Title to copyright in this software, database and any associated documentation shall at all times remain with Princeton University and LICENSEE agrees to preserve same.
