- Open Access
Frequency shifting approach towards textual transcription of heartbeat sounds
© Arvin et al; licensee BioMed Central Ltd. 2011
- Received: 12 May 2011
- Accepted: 4 October 2011
- Published: 4 October 2011
Auscultation is a common approach for diagnosing many cardiovascular problems. Automatic analysis of heartbeat sounds and extraction of their audio features can assist physicians in diagnosing diseases. Textual transcription records a continuous heart sound stream in a text format that requires very little memory compared with other audio formats. In addition, text-based data allows indexing and searching techniques to be applied to access critical events. Hence, transcribed heartbeat sounds provide useful information for monitoring the behavior of a patient over a long duration of time. This paper proposes a frequency shifting method to improve the performance of the transcription. The main objective of this study is to transfer heartbeat sounds to the music domain. The proposed technique is tested with 100 samples recorded from different heart disease categories. The observed results show that the proposed shifting method significantly improves the performance of the transcription.
- Window Size
- Heart Sound
- Audio Signal
- Digital Signal Processor
- Musical Note
Auscultation is the most remarkable approach that has been used in diagnosing many cardiovascular diseases for many years, and it still plays an important role in the diagnosis of heart disease. Sounds produced by the heart frequently reflect its structural abnormalities. Physicians use the stethoscope as a common tool to listen to heart sounds and make a correct diagnosis, and modern stethoscopes are making auscultation easier to perform. Although clear murmurs and tones are easily distinguished, weak murmurs and sounds below the audibility threshold easily disappear into the background. Analysis of heart sounds and extraction of their audio features is therefore important for the development of automatic diagnosis systems. A phonocardiogram (PCG) is a diagram of the sonic vibration of heartbeats. Most research has used the PCG as the audio input of a system to apply different digital signal processing techniques [1–3]. Based on the characteristics of the audio signals, various signal processing and modeling approaches can be applied. A healthy heart sound includes symmetric cycles and pulse values. In contrast, unhealthy heart sounds are commonly disordered by different unexpected frequencies.
Segmentation is a technique for separating cycles and their pulses [2, 3]. Classification of heart sounds is another research area that divides heartbeat sounds into different clusters based on their characteristics [1, 4, 5]. In a similar study, a neural network was used for the classification of different heart sounds such as normal, systolic and diastolic murmurs . A high-performance localization technique for the first heart sound pulse was proposed in . The localization was performed with an additional enhancement to improve the accuracy of pulse detection. In our previous study on real-time segmentation , a simple segmentation technique using amplitude reconstruction was proposed which divided the heartbeat sound pulses with high accuracy. However, its limitation was the loss of low-amplitude harmonics.
Automatic music transcription [9–12] processes audio signals to extract the pitch levels that can be notated as musical notes. Most research in automatic music transcription has attempted to increase the accuracy of the transcription to cover different frequency levels [9, 11]. Transcription can be applied to heartbeat sounds in order to represent them with music notation. In previous studies [13–15], heart sounds were represented in the MIDI (Musical Instrument Digital Interface) format, and a good transcription performance was illustrated. For long-duration sampling of heartbeat sounds and the development of a biomedical database, text-based formats (e.g. MIDI) are suitable mediums for converting and storing biomedical signals. Text-based music information retrieval [16, 17] allows the development of query-based systems to highlight various events of heartbeat sounds in particular. In our previous study , music transcription of heartbeat sounds was performed and demonstrated good accuracy for different heart sound samples. We proposed several preparation techniques for de-noising and cleaning heart sound signals for use in real-time systems. The results showed that heart sounds can be represented as musical notation. Since heart sound signals lie in a very low-frequency domain , automatic transcription techniques used for music are not suitable for this particular application. Therefore, in order to provide a high-accuracy transcription, two methods can be used. The first is to provide an automatic transcription technique with a new configuration to cover the very low-frequency spectrum, which requires complex algorithms and several modifications. The second is to transfer the heartbeat sounds to the frequency range used by music instruments, which allows ordinary music processing methods to be utilized.
In this paper, we propose a frequency shifting (transferring) method to increase the accuracy of heartbeat sound transcription. We modify automatic music transcription methods for use in a specific frequency spectrum. The process begins with a frequency estimation technique using the commonly used Fast Fourier Transform (FFT). Heart sounds are divided into several equal-sized parts, each called a window. The FFT is applied to each window and the estimated frequency is approximated to the nearest pitch number. The main problem in this step is the lower frequency of heart sounds in comparison with music. The proposed shifting method aims to solve this problem by transferring the low-frequency samples to high-frequency (music instrument) notes. Moreover, the textual transcription is implemented with two processing methods, real-time (RT) and non-real-time (NRT), and the performance of the transcription is investigated for both.
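The shifting step itself can be sketched in a few lines. This is a minimal illustration (not the authors' code): shifting a frequency upward by k semitones multiplies it by 2^(k/12), so 12 semitones doubles it; the default of 14 semi-notes is the shift size the paper reports selecting.

```python
def shift_semitones(freq_hz, semitones=14):
    """Transfer a low-frequency estimate upward by a number of semitones.

    Each semitone multiplies the frequency by 2**(1/12), so a shift of
    12 semitones doubles the frequency (one octave). A 14-semi-note
    shift (the value selected in the paper) scales it by about 2.24.
    """
    return freq_hz * 2 ** (semitones / 12)
```

For example, a 12-semitone shift maps 100 Hz to exactly 200 Hz, moving heart sound energy toward the range handled by ordinary music transcription methods.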
Transcription of polyphonic music is more complex than that of monophonic music due to the occurrence of several notes at a given point in time .
This section presents the transcription technique used to process the heartbeat sounds. Given the nature of the heart sound, music signal processing techniques can be adapted with a few modifications in terms of frequency and window sizes. These modifications are required due to the differences between the characteristics of music and heartbeat sounds.
N(p) = round(60 + 12 · log2(f(p) / 261.6))

where f(p) denotes the estimated frequency and N(p) is the nearest pitch approximation. Note number 60 is the musical note C4, with a frequency of 261.6 Hz. The formula shows that doubling f(p) adds 12, an octave interval, to N(p). Each calculated section is equivalent to one musical note, and binary codes are generated based on N(p).
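The pitch mapping described above can be written as a minimal Python sketch, assuming rounding to the nearest semitone as the "nearest pitch approximation":

```python
import math

def nearest_pitch(freq_hz):
    """Map an estimated frequency f(p) to the nearest pitch number N(p).

    Note 60 (C4) corresponds to 261.6 Hz; doubling the frequency adds
    12 to the note number (one octave interval).
    """
    return round(60 + 12 * math.log2(freq_hz / 261.6))
```

For instance, `nearest_pitch(261.6)` yields 60 (C4) and `nearest_pitch(523.2)`, one octave higher, yields 72.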
We use the term textual transcription instead of musical transcription because the storage format of the converted samples is plain text consisting of the binary values of the note numbers. Moreover, the notes start periodically with a constant window size.
This section explains the experimental configuration and platforms utilized to perform the proposed transcription, as well as the sampling method and resources. The aim of this study is to obtain a high-accuracy transcription of heartbeat sounds with both real-time and non-real-time methods. Furthermore, the converted textual files must be small enough to be used for continuous transcription of the audio heartbeat data stream.
The proposed transcription method is implemented using real-time and non-real-time processes. We investigate both processing methods in terms of accuracy and feasibility.
After real-time processing of each window, the estimated byte (pitch number) is sent to the PC and saved as plain text. The aim of using this module is to evaluate the feasibility of the RT process for future applications.
For the non-real-time (NRT) process, we use MATLAB version 6. Each record is loaded into an array of integer values, where each cell contains the magnitude of one sample. The array is divided into several sub-arrays of window length, and pitch extraction is performed for each window separately. The extracted notes are then stored in plain text format. To reduce the number of calculations, the estimated frequencies are limited to a range from 0 to 500 Hz.
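The NRT pipeline runs in MATLAB in the paper; the following Python/NumPy sketch is only illustrative of the same steps, with the 250 ms window and 0–500 Hz limit taken from the text and the function name `transcribe` being our own.

```python
import numpy as np

def transcribe(samples, fs, window_ms=250, f_max=500):
    """Split a record into fixed-size windows, estimate each window's
    dominant frequency via FFT (restricted to 0-500 Hz to reduce
    computation), and emit one pitch number per window."""
    win = int(fs * window_ms / 1000)
    notes = []
    for start in range(0, len(samples) - win + 1, win):
        frame = np.asarray(samples[start:start + win], dtype=float)
        spectrum = np.abs(np.fft.rfft(frame))
        freqs = np.fft.rfftfreq(win, d=1.0 / fs)
        band = freqs <= f_max                      # limit search to 0-500 Hz
        f_est = freqs[band][np.argmax(spectrum[band])]
        if f_est > 0:                              # skip silent/DC windows
            notes.append(int(round(60 + 12 * np.log2(f_est / 261.6))))
    return notes
```

A one-second 261.6 Hz test tone sampled at 4 kHz produces four windows, each transcribed as note 60 (C4).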
Table: category of samples used in this study versus the number of samples in each category (categories include Split Second Sound).
Therefore, higher threshold-value ratios show better performance. These results were also illustrated in our previous study . In the following experiments, the threshold level was assigned as T = 0.6·A_max, where A_max denotes the sample with the maximum magnitude.
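The thresholding step can be sketched as follows; this is a minimal illustration of the T = 0.6·A_max rule, with the helper name `pulse_mask` our own rather than from the paper.

```python
def pulse_mask(samples, ratio=0.6):
    """Mark the samples whose magnitude reaches the threshold
    T = ratio * A_max, where A_max is the maximum magnitude in the
    record (ratio = 0.6 in the paper's experiments)."""
    a_max = max(abs(s) for s in samples)
    threshold = ratio * a_max
    return [abs(s) >= threshold for s in samples]
```

On a toy record such as `[0.1, -1.0, 0.5, 0.7]`, A_max is 1.0, the threshold is 0.6, and only the second and fourth samples are marked as pulse content.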
In contrast with the previous studies [13, 14], the proposed frequency shifting could cover the low-frequency samples without complex calculations. In addition, since the duration of each pulse is long enough to apply the frequency estimation, selecting small window sizes was shown to make a real-time implementation of the frequency shifting possible.
In this paper, we proposed a frequency shifting method to increase the accuracy of the transcription. The method was tested on various recorded heart sound samples, including healthy and unhealthy cases, categorized into 8 groups. Suitable values for the signal processing configuration parameters, such as window size and threshold level, were estimated by the initial experiments (w = 250 ms and T = 0.6). Following this, the shifting method was evaluated and an appropriate shifting size (14 semi-notes) was selected. The performance of the transcription was tested on different heart sound samples using real-time and non-real-time processes. The observed results showed that the non-real-time process performed better than the real-time process (95% and 90% for healthy and unhealthy cases respectively), while the accuracy of the real-time process was also good (89% and 85% respectively). This reveals that the method can be used in real-time systems, such as household heart-problem detection systems, as early warning systems.
This work was supported by a grant from University Putra Malaysia (Grant number: 05-01-09-0810RU).
- Babaei S, Geranmayeh A: Heart sound reproduction based on neural network classification of cardiac valve disorders using wavelet transforms of PCG signals. Computers in Biology and Medicine. 2009, 39: 8-15. 10.1016/j.compbiomed.2008.10.004.
- Sepehri A, Gharehbaghi A, Dutoit T, Kocharian A, Kiani A: A novel method for pediatric heart sound segmentation without using the ECG. Computer Methods and Programs in Biomedicine. 2010, 99: 43-48. 10.1016/j.cmpb.2009.10.006.
- Yan Z, Jiang Z, Miyamoto A, Wei Y: The moment segmentation analysis of heart sound pattern. Computer Methods and Programs in Biomedicine. 2010, 98 (2): 140-150. 10.1016/j.cmpb.2009.09.008.
- Dokur Z, Olmez T: Heart sound classification using wavelet transform and incremental self-organizing map. Digital Signal Processing. 2008, 18 (6): 951-959. 10.1016/j.dsp.2008.06.001.
- Olmez T, Dokur Z: Classification of heart sounds using an artificial neural network. Pattern Recognition Letters. 2003, 24 (1-3): 617-629. 10.1016/S0167-8655(02)00281-7.
- Gupta C, Palaniappan R, Swaminathan S, Krishnan S: Neural network classification of homomorphic segmented heart sounds. Applied Soft Computing. 2007, 7: 286-297. 10.1016/j.asoc.2005.06.006.
- Ahlstrom C, Lanne T, Ask P, Johansson A: A method for accurate localization of the first heart sound and possible applications. Physiological Measurement. 2008, 29: 417. 10.1088/0967-3334/29/3/011.
- Arvin F, Doraisamy S: Real-time segmentation of heart sound pattern with amplitude reconstruction. IEEE EMBS Conference on Biomedical Engineering and Sciences. 2010, 130-133.
- Bello J, Sandler M: Blackboard system and top-down processing for the transcription of simple polyphonic music. COST G-6 Conference on Digital Audio Effects (DAFx-01). 2001.
- Plumbley M, Abdallah S, Bello J, Davies M, Monti G, Sandler M: Automatic music transcription and audio source separation. Cybernetics and Systems. 2002, 33 (6): 603-627. 10.1080/01969720290040777.
- Ryynanen M, Klapuri A: Polyphonic music transcription using note event modeling. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2005, 319-322.
- Arvin F, Doraisamy S: Real-Time Pitch Extraction of Acoustical Signals Using Windowing Approach. Australian Journal of Basic and Applied Sciences. 2009, 3 (4): 3557-3563.
- Modegi T, Iisaku S: Proposals of MIDI coding and its application for audio authoring. IEEE International Conference on Multimedia Computing and Systems. 1998, 305-314.
- Modegi T: MIDI encoding method based on variable frame-length analysis and its evaluation of coding precision. IEEE International Conference on Multimedia and Expo (ICME 2000). 2000, 2: 1043-1046.
- Modegi T: Studies in Health Technology and Informatics. 2001, 84: 366-370.
- Doraisamy S: Polyphonic Music Retrieval: The N-gram Approach. PhD thesis. 2004, Department of Computing, Imperial College London.
- Doraisamy S, Ruger S: Robust polyphonic music retrieval with n-grams. Journal of Intelligent Information Systems. 2003, 21: 53-70. 10.1023/A:1023553801115.
- Arvin F, Doraisamy S: Heart Sound Musical Transcription Technique Using Multi-Level Preparation. International Review on Computers and Software. 2010, 5 (6): 595-600.
- Phua K, Chen J, Dat T, Shue L: Heart sound as a biometric. Pattern Recognition. 2008, 41 (3): 906-919. 10.1016/j.patcog.2007.07.018.
- Jiang Z, Choi S: A cardiac sound characteristic waveform method for in-home heart disorder monitoring with electric stethoscope. Expert Systems with Applications. 2006, 31 (2): 286-298. 10.1016/j.eswa.2005.09.025.
- Khorasani E, Doraisamy S, Arvin F: An Approach for Heartbeat Sound Transcription. International Conference on Computer Technology and Development, IEEE. 2009, 38-41.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.