首页 | 本学科首页   官方微博 | 高级检索  
     


Detecting incorrect product names in online sources for product master data
Authors:Stephan Karpischek  Florian Michahelles  Elgar Fleisch
Affiliation:1. ETH Zürich, WEV G 222.2, Weinbergstrasse 56/58, 8092, Zürich, Switzerland
Abstract:The global trade item number (GTIN) is traditionally used to identify trade items and look up corresponding information within industrial supply chains. Recently, consumers have also started using GTINs to access additional product information with mobile barcode scanning applications. Providers of these applications use different sources to provide product names for scanned GTINs. In this paper we analyze data from eight publicly available sources for a set of GTINs scanned by users of a mobile barcode scanning application. Our aim is to measure the correctness of product names in online sources and to quantify the problem of product data quality. We use a combination of string matching and supervised learning to estimate the number of incorrect product names. Our results show that approximately 2 % of all product names are incorrect. The applied method is useful for brand owners to monitor the data quality for their products and enables efficient data integration for application providers.
Keywords:
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号