An efficient and effective duplication detection method in large database applications

Zhang, Ji (2010) An efficient and effective duplication detection method in large database applications. In: NSS 2010: 4th International Conference on Network and System Security, 1-3 Sep 2010, Melbourne, Australia.


Abstract

In this paper, we develop PC-Filter+ (PC stands for partition comparison), a robust data cleaning technique that builds on its predecessor to detect duplicate records in large databases effectively and efficiently. PC-Filter+ provides more flexible algorithmic options for constructing the Partition Comparison Graph (PCG). In addition, PC-Filter+ can perform duplicate detection under different memory constraints.


Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: Permanent restricted access to Published version, due to publisher copyright restrictions.
Depositing User: Dr Ji Zhang
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 16 Feb 2011 01:53
Last Modified: 03 Jul 2013 00:27
Uncontrolled Keywords: database management; duplicate detection; quality control
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
15 Commerce, Management, Tourism and Services > 1503 Business and Management > 150307 Innovation and Technology Management
08 Information and Computing Sciences > 0803 Computer Software > 080309 Software Engineering
Socio-Economic Objective (SEO2008): E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences
Identification Number or DOI: 10.1109/NSS.2010.78
URI: http://eprints.usq.edu.au/id/eprint/18208
