Zhang, Ji (2010) An efficient and effective duplication detection method in large database applications. In: NSS 2010: 4th International Conference on Network and System Security, 1-3 Sep 2010, Melbourne, Australia.
Full text available as:
|PDF|
Identification Number or DOI: 10.1109/NSS.2010.78
In this paper, we develop a robust data cleaning technique called PC-Filter+ (PC stands for Partition Comparison), building on its predecessor PC-Filter, for effective and efficient duplicate record detection in large databases. PC-Filter+ provides more flexible algorithmic options for constructing the Partition Comparison Graph (PCG). In addition, PC-Filter+ can perform duplicate detection under different memory constraints.
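The published version is under restricted access, so as a rough illustration only, the sketch below shows generic partition-based duplicate detection in Python: records are sorted so likely duplicates land near each other, split into fixed-size partitions, and compared pairwise within each partition. This is not the PC-Filter+ algorithm itself (which additionally constructs a Partition Comparison Graph to decide which partitions to compare across); the similarity function, threshold, and partition size here are all illustrative assumptions.

# Illustrative sketch only: generic partition-based duplicate detection,
# NOT the PC-Filter+ algorithm of the paper (PC-Filter+ also builds a
# Partition Comparison Graph to prune cross-partition comparisons).
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a: str, b: str) -> float:
    # Crude whole-record string similarity in [0, 1]; real deduplication
    # systems use field-aware metrics instead.
    return SequenceMatcher(None, a, b).ratio()

def detect_duplicates(records: list[str], partition_size: int = 100,
                      threshold: float = 0.9) -> list[tuple[str, str]]:
    # Sort, partition, then compare pairs only within each partition.
    # This naive version misses duplicates that straddle partition
    # boundaries, which is exactly the gap cross-partition comparison
    # schemes such as a PCG are meant to address.
    ordered = sorted(records)
    partitions = [ordered[i:i + partition_size]
                  for i in range(0, len(ordered), partition_size)]
    duplicates = []
    for part in partitions:
        for a, b in combinations(part, 2):
            if similarity(a, b) >= threshold:
                duplicates.append((a, b))
    return duplicates

if __name__ == "__main__":
    rows = ["John Smith, 12 Main St", "Jon Smith, 12 Main St.",
            "Jane Doe, 5 Oak Ave"]
    print(detect_duplicates(rows, partition_size=3, threshold=0.8))

Restricting comparisons to within-partition pairs is what makes such methods scale: the number of pairwise comparisons drops from quadratic in the table size to quadratic only in the partition size.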
|Item Type:||Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)|
|Additional Information:||Permanent restricted access to the published version due to publisher copyright restrictions.|
|Uncontrolled Keywords:||database management; duplicate detection; quality control|
|Fields of Research (FOR2008):||08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining|
|||15 Commerce, Management, Tourism and Services > 1503 Business and Management > 150307 Innovation and Technology Management|
|||08 Information and Computing Sciences > 0803 Computer Software > 080309 Software Engineering|
|Socio-Economic Objective (SEO2008):||E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences|
|Deposited On:||16 Feb 2011 11:53|
|Last Modified:||03 Feb 2012 16:29|