Zhang, Ji (2010) An efficient and effective duplication detection method in large database applications. In: NSS 2010: 4th International Conference on Network and System Security, 1-3 Sep 2010, Melbourne, Australia.
In this paper, we develop PC-Filter+ (PC stands for partition comparison), a robust data cleaning technique that extends its predecessor to detect duplicate records in large databases effectively and efficiently. Compared with its predecessor, PC-Filter+ offers more flexible algorithmic options for constructing the Partition Comparison Graph (PCG). PC-Filter+ can also perform duplicate detection under different memory constraints.
Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Additional Information: Permanent restricted access to Published version, due to publisher copyright restrictions.
Uncontrolled Keywords: database management; duplicate detection; quality control
Depositing User: Dr Ji Zhang
Date Deposited: 16 Feb 2011 01:53
Last Modified: 03 Jul 2013 00:27