Accurate customer address matching via weak supervision for geocode learning
2024
Determining the precise location of customers is important for an efficient and reliable delivery experience, both for customers and for delivery associates. Address text is a primary source of information that customers provide about their location. In this paper, we study the important and challenging task of matching free-form customer address text to determine whether two addresses represent the same physical building. We introduce a novel address matching framework that uses a transformer-based encoder, avoiding the tedious and time-consuming manual feature engineering that the baseline model requires. Furthermore, our proposed framework employs weak supervision, using historical delivery information to generate high-quality labeled data. This reduces the need for the massive amounts of manually labeled data that transformer-based models typically require. Our experiments on manually curated datasets demonstrate the effectiveness and generality of our approach: we achieve an average 15.57% improvement in recall at 95% precision over the current baseline model across four geographies. We also introduce delivery point (DP) geocode learning for cold-start addresses as a downstream application of customer address matching. In addition to offline experiments, we ran online A/B experiments for DP geocode learning with our proposed approach and observed that, compared to the baseline model, delivery precision improved by 8.09% and delivery defects fell by 11.78% on average across four geographies.
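As a rough illustration of how historical delivery information could serve as weak supervision for address matching, the sketch below labels pairs of address strings by the distance between their past delivery locations. It is not the paper's labeling procedure: the record layout, the function names (`haversine_m`, `weak_label`, `build_weak_pairs`), and the 10 m and 200 m thresholds are all assumptions chosen for the example.

```python
import math
from itertools import combinations


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * 6_371_000 * math.asin(math.sqrt(min(1.0, h)))


def weak_label(dist_m, match_within_m=10.0, mismatch_beyond_m=200.0):
    """Return 1 (match), 0 (mismatch), or None (abstain).

    Both thresholds are hypothetical values, not taken from the paper.
    """
    if dist_m <= match_within_m:
        return 1
    if dist_m >= mismatch_beyond_m:
        return 0
    return None  # ambiguous distance: leave the pair unlabeled


def build_weak_pairs(deliveries, **thresholds):
    """Turn (address_text, (lat, lon)) records into weakly labeled address pairs."""
    pairs = []
    for (addr_a, loc_a), (addr_b, loc_b) in combinations(deliveries, 2):
        label = weak_label(haversine_m(loc_a, loc_b), **thresholds)
        if label is not None:
            pairs.append((addr_a, addr_b, label))
    return pairs


# Hypothetical delivery records: free-form address text plus a past delivery location.
deliveries = [
    ("12 Main St Apt 4", (47.60620, -122.33210)),
    ("12 main street #4", (47.60621, -122.33211)),
    ("900 Pine Ave", (47.62050, -122.34930)),
]
weak_pairs = build_weak_pairs(deliveries)
```

Pairs whose delivery locations fall between the two thresholds are dropped rather than guessed, which trades labeled-data volume for label quality. The resulting `(address_a, address_b, label)` triples could then be used to train a pairwise matching model.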