Data mining using Matlab

Woolf, Rodney J. (2005) Data mining using Matlab. [USQ Project] (Unpublished)


Abstract

Data mining is a relatively new field that is emerging in many disciplines. It is becoming more popular as technology advances and the need for efficient data analysis grows. The aim of data mining is not to derive strict rules by analysing the full data set; rather, it is used to predict with some certainty while analysing only a small portion of the data. This project seeks to compare the efficiency of a decision tree induction method with that of the neural network method. MATLAB has inbuilt data mining toolboxes; however, the decision tree induction method is not as yet implemented. Decision tree induction has been implemented in several forms in the past. The greatest contribution to this method has been made by Dr John Ross Quinlan, who brought it forward in the form of the ID3, C4.5 and C5 algorithms. The methodologies used within ID3 and C4.5 are well documented and therefore provide a strong platform for implementing this method in a higher-level language. The objectives of this study are to fully comprehend two methods of data mining, namely decision tree induction and neural networks, and to implement the decision tree induction method in the mathematical computing language MATLAB. The results obtained from analysing suitable data are compared with the results from the neural network toolbox already implemented in MATLAB. The data used to compare and contrast the two methods included voting records from the US House of Representatives, which consist of yes, no and undecided votes on sixteen separate issues. The voters are grouped by political party, either Republican or Democratic. The objective of using this data set is to predict which party a congressman is affiliated with by analysing their voting trends.
The findings of this study reveal that the decision tree method can accurately predict outcomes if an ideal data set is used for building the tree. The neural network method is less accurate in some situations; however, it is more robust to unexpected data.


Item Type: USQ Project
Refereed: No
Item Status: Live Archive
Additional Information: Additional files (C4.5 data, Golf verification, MATLAB files, case studies, LaTeX source) are available on CD-ROM held in the USQ library.
Depositing User: epEditor USQ
Faculty / Department / School: Historic - Faculty of Engineering and Surveying - Department of Mechanical and Mechatronic Engineering
Date Deposited: 11 Oct 2007 00:13
Last Modified: 02 Jul 2013 22:30
Uncontrolled Keywords: data mining, MATLAB, decision tree induction method, artificial neural networks
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080108 Neural, Evolutionary and Fuzzy Computation
08 Information and Computing Sciences > 0803 Computer Software > 080309 Software Engineering
08 Information and Computing Sciences > 0807 Library and Information Studies > 080704 Information Retrieval and Web Search
URI: http://eprints.usq.edu.au/id/eprint/58
