Marhabibi/TabSim

TabSim

A Siamese Neural Network for Accurate Estimation of Table Similarity

This repository contains a corpus of tables extracted from scientific articles published in PMC, together with the code and models for measuring the similarity between a pair of tables. You can also reproduce our results on tables from Wikipedia and arXiv articles.

Train a model

To train a model, run TabSimTrain.py and provide the number of epochs, the learning rate, the location of the trained embeddings (download them from here), the table data (a tables.pickle file), and the output directory for the trained model:

TabSimTrain.py [-h] -e N_EPOCH -l LR -v EMBEDDING_LOC -i INPUT_TABLES -o MODEL_DIR
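
The usage line above has the shape of an argparse-generated synopsis. As a rough illustration only (not the actual source of TabSimTrain.py), a parser consistent with that synopsis could look like the sketch below. The argument types (`int` for epochs, `float` for the learning rate), the help texts, and all example values such as `embeddings.bin` and `model/` are assumptions made for this sketch.

```python
import argparse


def build_train_parser():
    """Hypothetical parser mirroring the TabSimTrain.py usage line."""
    parser = argparse.ArgumentParser(prog="TabSimTrain.py")
    # The dest names are chosen so that argparse prints N_EPOCH, LR, ... as metavars.
    parser.add_argument("-e", dest="n_epoch", type=int, required=True,
                        help="number of training epochs (int type is assumed)")
    parser.add_argument("-l", dest="lr", type=float, required=True,
                        help="learning rate (float type is assumed)")
    parser.add_argument("-v", dest="embedding_loc", required=True,
                        help="location of the trained embeddings")
    parser.add_argument("-i", dest="input_tables", required=True,
                        help="table data, e.g. tables.pickle")
    parser.add_argument("-o", dest="model_dir", required=True,
                        help="directory in which to store the trained model")
    return parser


# Placeholder values for illustration; none of them come from the repository.
args = build_train_parser().parse_args(
    ["-e", "10", "-l", "0.001", "-v", "embeddings.bin",
     "-i", "tables.pickle", "-o", "model/"]
)
print(args.n_epoch, args.lr, args.embedding_loc, args.input_tables, args.model_dir)
```

All five options are required, so omitting any of them makes argparse exit with a usage error.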

Table Similarity Score

The following command measures the similarity between a query table and the tables in the repository. Train a model before running this command.

TabSimEval.py [-h] -m MODEL -i INPUT_TABLES -o OUTPUT_TAGS
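
For scripted use, the evaluation call can be assembled as an argument list and handed to `subprocess`. The helper below is a minimal sketch: the function name `eval_command` is hypothetical, the paths are placeholders, and it assumes the script is launched through the Python interpreter from the repository root.

```python
import shlex
import sys


def eval_command(model, input_tables, output_tags):
    """Build the argv list for TabSimEval.py (hypothetical helper)."""
    return [sys.executable, "TabSimEval.py",
            "-m", model, "-i", input_tables, "-o", output_tags]


# Placeholder paths; replace them with your trained model and table file.
cmd = eval_command("model/", "tables.pickle", "tags.out")
print(shlex.join(cmd[1:]))
```

Once a model has been trained, `subprocess.run(cmd, check=True)` would execute the command and raise an error if the script exits with a non-zero status.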

Citation

Please cite the following work:

@article{habibi2020tabsim,
  title={TabSim: A Siamese Neural Network for Accurate Estimation of Table Similarity},
  author={Habibi, Maryam and Starlinger, Johannes and Leser, Ulf},
  journal={arXiv preprint arXiv:2008.10856},
  year={2020}
}
