Harmony Analysis in A’Capella Singing

Oliver, Jarred (2019) Harmony Analysis in A’Capella Singing. [USQ Project]

Speech is produced by the larynx and then modified by the articulators, and it carries a large amount of useful information. Singing is produced in the same way, albeit with specific acoustic differences: it contains rhythm and is usually of higher intensity. Singing is almost always accompanied by musical instruments, which generally makes detecting and separating the voice difficult (Kim Hm 2012). A’Capella singing is performed without musical accompaniment, making it somewhat easier to retrieve vocal information.

The methods used to extract information from speech are well established and are found in many everyday household devices. Singing processing adapts a large portion of these techniques to detect vocal information about singers, including melody, language, emotion, harmony and pitch. The techniques used in speech and singing processing fall into one of three categories:

1. Time Domain
2. Frequency Domain
3. Other Algorithms

This project uses one algorithm from each category: the Average Magnitude Difference Function (AMDF), Cepstral Analysis and Linear Predictive Coding (LPC). AMDF takes the absolute value of the difference between a sample at time (k) and a delayed version of the signal at (k-n), averaged over the analysis frame. It is known to provide relatively good accuracy at low computational cost; however, it is sensitive to variations in background noise (Hui, L et al 2006).
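As an illustration of the AMDF idea only, the following is a minimal sketch in Python with NumPy. It is not the project's implementation: the function names `amdf` and `amdf_pitch`, the search limits `fmin` and `fmax`, and the choice of a single fixed analysis frame are all assumptions made for this example.

```python
import numpy as np

def amdf(frame, max_lag):
    """Average magnitude difference of a frame for lags 0 .. max_lag - 1."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    out = np.zeros(max_lag)
    for lag in range(max_lag):
        # Mean absolute difference between the frame and a copy delayed by `lag`.
        out[lag] = np.mean(np.abs(frame[lag:] - frame[:n - lag]))
    return out

def amdf_pitch(frame, fs, fmin=80.0, fmax=1000.0):
    """Estimate pitch in Hz as fs divided by the lag of the deepest AMDF valley.

    fmin and fmax are hypothetical search limits chosen for this sketch.
    """
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    d = amdf(frame, lag_max + 1)
    # Lags near zero are skipped because the AMDF is trivially zero at lag 0.
    lag = lag_min + int(np.argmin(d[lag_min:lag_max + 1]))
    return fs / lag
```

For a periodic signal the AMDF also dips at multiples of the true period, so a wide search range can return a pitch one octave too low. Narrowing `fmin` and `fmax` to the expected vocal range is one simple way to reduce that risk.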

Cepstral Analysis is known for separating a convolved signal into its source and vocal tract components, and it is computationally fast because it relies on the Fourier Transform and its inverse. LPC estimates each sample of a signal as a linear combination of its past values; the resulting predictor coefficients and prediction error are used to develop the spectral envelope for pitch detection.
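The cepstral approach can be sketched in the same way. The example below is a minimal illustration under stated assumptions, not the project's method: the function name `cepstral_pitch`, the Hamming window, the small constant added before the logarithm, and the `fmin` and `fmax` search limits are all choices made for this sketch.

```python
import numpy as np

def cepstral_pitch(frame, fs, fmin=80.0, fmax=500.0):
    """Estimate pitch in Hz from the peak of the real cepstrum.

    fmin and fmax are hypothetical search limits chosen for this sketch.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    # Forward transform, then log magnitude: convolution of source and
    # vocal tract becomes addition in the log-spectral domain.
    log_mag = np.log(np.abs(np.fft.rfft(windowed)) + 1e-12)
    # Inverse transform of the log spectrum gives the real cepstrum.
    cepstrum = np.fft.irfft(log_mag)
    q_min = int(fs / fmax)
    q_max = int(fs / fmin)
    # The pitch period appears as a peak at the matching quefrency (in samples).
    peak = q_min + int(np.argmax(cepstrum[q_min:q_max + 1]))
    return fs / peak
```

The low-quefrency part of the cepstrum mostly describes the slowly varying spectral envelope, which is why the search starts at `q_min` rather than at zero.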

The project tested the algorithms against 11 tracks containing different harmonic content. Each method was compared on speed, accuracy and, where applicable, the number of notes correctly identified. All three algorithms gave relatively good results on single-note tracks, with the LPC algorithm providing the most accurate results. When tested against multi-note tracks and pre-recorded singing tracks, the AMDF and Cepstral Analysis methods performed poorly in terms of accuracy and the number of correctly identified notes. The LPC method performed considerably better, correctly identifying an average of 66.8% of notes.

Item Type: USQ Project
Item Status: Live Archive
Additional Information: Bachelor of Electrical & Electronic Engineering
Faculty/School / Institute/Centre: Current - Faculty of Health, Engineering and Sciences - School of Mechanical and Electrical Engineering (1 Jul 2013 -)
Supervisors: Phythian, Mark
Date Deposited: 11 Aug 2021 04:44
Last Modified: 11 Aug 2021 04:44
URI: http://eprints.usq.edu.au/id/eprint/43103
