Integrating noisy knowledge into language representations for e-commerce applications
2023
Integrating structured knowledge into language model representations increases the recall of domain-specific information useful for downstream tasks. Matching knowledge graph entities to entity mentions in text is straightforward when entity names are unique or explicit entity-linking data exist. When this setting is extended to new domains, however, newly mined knowledge contains ambiguous and incorrect information and carries no explicit linking annotations. For such settings, we design a framework that robustly links relevant knowledge to input texts as an intermediate modeling step while fine-tuning end-to-end on domain tasks. The framework first computes the similarity between existing task labels and candidate knowledge triplets to generate relevance labels, then uses these labels to train a relevance model that predicts how relevant an inserted triplet is to the original text. Integrating this relevance model into a language model yields our Knowledge Relevance BERT (KR-BERT) framework. We evaluate KR-BERT on entity linking tasks over a real-world e-commerce dataset as well as a public linking benchmark, showing performance improvements over strong baselines.
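As a concrete illustration of the relevance-labeling step described above, the sketch below embeds a task label and candidate knowledge triplets with a frozen encoder and thresholds their cosine similarity into binary relevance labels. Everything here is an assumption for exposition: the choice of `bert-base-uncased`, the mean-pooling `encode` helper, the example triplets, and the `RELEVANCE_THRESHOLD` value are hypothetical and not the paper's actual implementation.

```python
# Hypothetical sketch of relevance-label generation via label/triplet
# similarity, assuming a frozen BERT encoder with mean pooling.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()

def encode(texts):
    """Mean-pooled BERT embeddings for a list of strings."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state   # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)      # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)       # (B, H)

# A task label and candidate triplets mined for the same input text
# (all examples are invented for illustration).
task_label = "wireless noise-cancelling headphones"
triplets = [
    "(AcmeBuds X, product_type, headphones)",
    "(AcmeBuds X, feature, active noise cancellation)",
    "(AcmeBuds X, color, red)",
]

label_emb = encode([task_label])                            # (1, H)
triplet_embs = encode(triplets)                             # (N, H)
scores = torch.cosine_similarity(label_emb, triplet_embs)   # (N,)

# Threshold the similarities into relevance labels; the threshold
# value is an assumption, not taken from the paper.
RELEVANCE_THRESHOLD = 0.5
relevance_labels = (scores > RELEVANCE_THRESHOLD).long()
for t, s, y in zip(triplets, scores, relevance_labels):
    print(f"{t}: sim={s:.3f} relevant={bool(y)}")
```

In the full framework, labels produced this way would presumably supervise the relevance model that decides how much each inserted triplet contributes to the language model's representation of the input text.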