Tabular data concept type detection using star-transformers

2021
Tabular data is an invaluable information resource for search, information extraction, and question answering about the world. To fully exploit the information in tabular data, it is critical to understand the semantic concept types of table columns. In this paper, we focus on learning-based approaches for column concept type detection that do not rely on any metadata or on queries to existing knowledge bases. We propose a model that employs both statistical and semantic features of table columns and uses Star-Transformers to gather and scatter information across the whole table, boosting performance on individual columns. We apply distant supervision to construct a tabular dataset with columns annotated with DBpedia classes. Our experimental results show that our model achieves 93.57% accuracy on the dataset, exceeding that of the state-of-the-art baselines.
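
The gather-and-scatter idea can be illustrated with a small sketch. It is not the paper's model: it assumes each column has already been reduced to a fixed-length feature vector, it uses plain dot-product attention with no learned weights, and the names `attend`, `star_round`, and `encode_table` are hypothetical. A single relay node reads from all columns (gather), and each column then reads back from the relay (scatter), so information from the whole table can reach every column.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def attend(query, context):
    """Scaled dot-product attention of one query vector over context rows."""
    scores = context @ query / np.sqrt(query.shape[-1])
    return softmax(scores) @ context


def star_round(columns, relay):
    """One simplified round: scatter relay to columns, then gather into relay."""
    # Scatter: every column node attends over itself and the shared relay node.
    new_columns = np.stack([attend(c, np.stack([c, relay])) for c in columns])
    # Gather: the relay node attends over itself and all updated column nodes.
    new_relay = attend(relay, np.vstack([relay[None, :], new_columns]))
    return new_columns, new_relay


def encode_table(column_features, rounds=2):
    """Contextualize per-column feature vectors (assumed input) across a table."""
    columns = np.asarray(column_features, dtype=float)
    relay = columns.mean(axis=0)  # relay starts as the average column vector
    for _ in range(rounds):
        columns, relay = star_round(columns, relay)
    return columns, relay


if __name__ == "__main__":
    # Toy table: 4 columns, each described by an 8-dimensional feature vector.
    features = np.random.default_rng(0).normal(size=(4, 8))
    contextual, table_vector = encode_table(features)
    print(contextual.shape, table_vector.shape)
```

In a trained model, the attention steps would use learned projections and a classifier would map each contextualized column vector to a concept type; both are left out here to keep the sketch short.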
