Zhang, Ji (2010) An efficient and effective duplication detection method in large database applications. In: NSS 2010: 4th International Conference on Network and System Security, 1-3 Sep 2010, Melbourne, Australia.
Official URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5635844
Identification Number or DOI: 10.1109/NSS.2010.78
Abstract
In this paper, we develop a robust data cleaning technique called PC-Filter+ (PC stands for partition comparison), based on its predecessor, for effective and efficient duplicate record detection in large databases. PC-Filter+ provides more flexible algorithmic options for constructing the Partition Comparison Graph (PCG). In addition, PC-Filter+ can perform duplicate detection under different memory constraints.
| Item Type: | Conference or Workshop Item (Commonwealth Reporting Category E) (Paper) |
|---|---|
| Additional Information: | Permanent restricted access to Published version, due to publisher copyright restrictions. |
| Uncontrolled Keywords: | database management; duplicate detection; quality control |
| Fields of Research (FOR2008): | 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining; 15 Commerce, Management, Tourism and Services > 1503 Business and Management > 150307 Innovation and Technology Management; 08 Information and Computing Sciences > 0803 Computer Software > 080309 Software Engineering |
| Subjects: | UNSPECIFIED |
| Socio-Economic Objective (SEO2008): | E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences |
| ID Code: | 18208 |
| Deposited On: | 16 Feb 2011 11:53 |
| Last Modified: | 03 Feb 2012 16:29 |
