

Search results: Found 4

Listing 1 - 4 of 4

Article
Preprocessing Signal for Speech Emotion Recognition

Author: Bashar M. Nema
Journal: Al-Mustansiriyah Journal of Science مجلة علوم المستنصرية ISSN: 1814635X Year: 2017 Volume: 28 Issue: 3 Pages: 157-165
Publisher: Al-Mustansyriah University الجامعة المستنصرية

Abstract

This paper introduces signal preprocessing for speech emotion recognition and presents a literature review on the topic. Speech and music files are discriminated by comparing several statistical indicators, such as mean, standard deviation, energy and silence interval. Preprocessing includes silence removal, pre-emphasis, normalization and windowing; it is an important phase for obtaining a clean signal for the next stage (feature extraction). The wave files (male, female) and the music file used in this paper have a sample rate of 48000 Hz, a bit resolution of 16 bits and a mono channel. The wave files are taken from the Berlin dataset and the RAVDESS dataset.

Keywords

Speech --- Feature extraction --- Emotion --- Vocal --- LPC --- MFCC
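
The four preprocessing steps named in the abstract above (silence removal, pre-emphasis, normalization, windowing) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the energy threshold, the pre-emphasis coefficient 0.97, the tiny frame length, the peak normalization and the choice of a Hamming window are all assumptions made for the example.

```python
import math

def remove_silence(signal, frame_len=4, threshold=0.01):
    """Drop non-overlapping frames whose mean energy is below the threshold (assumed rule)."""
    kept = []
    for start in range(0, len(signal), frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy >= threshold:
            kept.extend(frame)
    return kept

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def normalize(signal):
    """Scale the signal so that its largest absolute sample is 1."""
    peak = max((abs(x) for x in signal), default=0.0)
    if peak == 0.0:
        return list(signal)
    return [x / peak for x in signal]

def hamming(n_points):
    """Hamming window coefficients (window type is an assumption)."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def frame_and_window(signal, frame_len, hop):
    """Split the signal into overlapping frames and apply the window to each."""
    window = hamming(frame_len)
    return [[signal[start + i] * window[i] for i in range(frame_len)]
            for start in range(0, len(signal) - frame_len + 1, hop)]

def preprocess(signal, frame_len=4, hop=2, threshold=0.01, alpha=0.97):
    """Run the four steps in the order listed in the abstract."""
    voiced = remove_silence(signal, frame_len, threshold)
    emphasized = pre_emphasis(voiced, alpha)
    scaled = normalize(emphasized)
    return frame_and_window(scaled, frame_len, hop)
```

For 48000 Hz audio, frame lengths in the range of a few hundred to a couple of thousand samples would be more typical than the toy value used here; the small default only keeps the example easy to trace by hand.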


Article
Advantages and Disadvantages of Automatic Speaker Recognition Systems

Authors: Rawia Ab. Mohammed --- Akbas E. Ali --- Nidaa F. Hassan
Journal: Journal of Al-Qadisiyah for Computer Science and Mathematics مجلة القادسية لعلوم الحاسوب والرياضيات ISSN: 20740204 / 25213504 Year: 2019 Volume: 11 Issue: 3 Pages: Comp Page 21-30
Publisher: Al-Qadisiyah University جامعة القادسية

Abstract

Automatic speaker recognition systems use machines to recognize an individual from a spoken sentence. These systems either identify a specific individual or confirm an individual’s claimed identity. Speaker recognition is the most common type of voice biometrics. Its task focuses on validating a person’s claimed identity using features obtained from their voice. Over the last decades a wide range of advances have been made in speaker recognition, but many problems remain unsolved or require better solutions. This paper first gives a brief overview of speech processing, then describes some feature extraction and classification techniques, and then compares and analyzes previous research in depth in order to determine the best methods for speaker recognition. Adaptive MFCC and deep learning methods are found to be more efficient and accurate than other methods in speaker recognition, and are therefore recommended as more suitable for practical applications.


Article
Evaluation of Human Voice Biometrics and Frog Bioacoustics Identification Systems Based on Feature Extraction Method and Classifiers
التقييم على انظمة تحديد الصوتيات البشرية والصوتيات الحيوية للضفدع اعتماداً على طريقة استخراج وتصنيف الخصائص

Author: Aws Saad Shawkat أوس سعد شوكت حسن
Journal: Journal of Al-Ma'moon College مجلة كلية المأمون ISSN: 19924453 Year: 2018 Issue: 31 Pages: 176-195
Publisher: AlMamon University College كلية المامون الجامعة

Abstract

Biometrics is defined as the science of recognizing humans by their personal biological characteristics, for example voice, fingerprint and signature. The biometrics approach has since been applied to recognizing animals for biological and ecological research and development. Because research on animal recognition is still in its infancy, this study evaluates how effectively the audio-based biometric approach carries over to a bioacoustics identification system. Frog calls are used as the bioacoustic signal to identify frog species. Accordingly, Mel-frequency Cepstral Coefficients (MFCC), the well-known features used in audio-based biometric systems, are tested as features for the frog bioacoustics identification system. For classification, the performance of Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Local Mean k-Nearest Neighbor (LMkNN) and Fuzzy k-NN (FkNN) classifiers is compared. The performance of the biometric system and the frog bioacoustics system with these classifiers is evaluated. The best performance was observed with the FkNN classifier, with an accuracy of 97% for the frog bioacoustics identification system and 93.38% for the biometric speaker identification system, with 20 training data.

يتم تعريف القياسات الحيوية كعلم تمييز الإنسان باستخدام خصائصه البيولوجية الشخصية على سبيل المثال الصوت وبصمات الأصابع والتوقيع. ثم تم تطبيق نهج القياسات الحيوية لتمييز الحيوان لغرض البحوث البيولوجية والبيئية والتنمية. ويرجع ذلك إلى كون بحوث التمييز على أساس الحيوان لا يزال في مرحلة الطفولة، لذلك في هذه الدراسة، يتم عمل تقييم فعالية النهج القائم على نظام القياسات الحيوية الصوتية لنظام تحديد الصوتيات الحيوية. يستخدم علم الصوتيات الحيوية على أساس دعوة الضفدع من أجل التعرف على الضفادع الاخرى في هذه الدراسة. ونتيجة لذلك، يتم اختبار الميزات المعروفة المستخدمة في نظام القياسات الحيوية الصوتية مثل استخدام Mel-frequency Cepstral Coefficients (MFCC) كميزات لنظام التعرف على الصوتيات الحيوية للضفدع. أما بالنسبة لعملية التصنيف، فقد تمت مقارنة أداء Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Local Mean k Nearest Neighbor (LMkNN) and Fuzzy k-NN (FkNN)وقد تم في هذه الدراسة تقييم أداء نظام القياسات الحيوية للضفدع على أساس المصنفات المقترحة. وقد لوحظ أفضل أداء باستخدام FkNN classifier مع دقة 97٪ لنظام التعرف على الصوتيات البيولوجية للضفدع و 93.38٪ لنظام تمييز القياسات الحيوية على المتحدث مع 20 بيانات التدريب.
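
As an illustration of the Fuzzy k-NN (FkNN) classifier that the abstract above reports as the best performer, here is a minimal sketch of the common crisp-label variant, in which each of the k nearest neighbors votes with a weight of 1 / d^(2/(m-1)). The fuzzifier m = 2, the Euclidean distance, the function name `fuzzy_knn` and the toy 2-D points standing in for MFCC feature vectors are assumptions for the example, not details taken from the paper.

```python
import math
from collections import defaultdict

def fuzzy_knn(train_x, train_y, query, k=3, m=2.0, eps=1e-9):
    """Return (predicted_label, memberships) for one query vector.

    Each of the k nearest training points contributes an inverse-distance
    weight 1 / d**(2 / (m - 1)) to the membership of its own class, and the
    memberships are normalized so that they sum to 1.
    """
    # Distance from the query to every training point, nearest first.
    neighbors = sorted(
        ((math.dist(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )[:k]

    exponent = 2.0 / (m - 1.0)
    memberships = defaultdict(float)
    total = 0.0
    for distance, label in neighbors:
        # eps guards against division by zero when the query equals a training point.
        weight = 1.0 / (max(distance, eps) ** exponent)
        memberships[label] += weight
        total += weight

    memberships = {label: w / total for label, w in memberships.items()}
    best = max(memberships, key=memberships.get)
    return best, memberships

# Hypothetical toy data: two classes of 2-D points in place of MFCC vectors.
train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
train_y = ["a", "a", "b", "b"]
label, memberships = fuzzy_knn(train_x, train_y, (0.2, 0.1), k=3)
```

Unlike plain k-NN, the result carries a graded membership per class, so a query that lies between two classes can be flagged as uncertain instead of being forced into a hard vote.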


Article
Influence of Noisy Environment on the Speech Recognition Rate Based on the Altera FPGA
تأثیر البیئة الصاخبة على معدل تمییز الكلام مستند على البوابات المنطقیة المبرمجة نوع اللتیرا

Authors: Eyad I. Abbas --- Alaa Abdulhussain Refeis
Journal: Engineering and Technology Journal مجلة الهندسة والتكنولوجيا ISSN: 16816900 24120758 Year: 2013 Volume: 31 Issue: 13 Part (A) Engineering Pages: 2513-2530
Publisher: University of Technology الجامعة التكنولوجية

Abstract

This paper introduces an approach to studying the effects of different levels of environmental noise on the recognition rate of speech recognition systems that do not use any type of filter to deal with this issue. This is achieved by implementing an embedded SoPC (System on a Programmable Chip) technique with an Altera Nios II processor for a real-time speech recognition system. The Mel Frequency Cepstral Coefficients (MFCCs) technique is used for speech signal feature extraction (the observation vector). The observation vector of voice information is modeled with a Gaussian Mixture Model (GMM); this model is passed to a Hidden Markov Model (HMM), a probabilistic model that processes the GMM statistically to decide on the recognition of uttered words, whether single or composite, with one or more syllables. The framework was implemented on an Altera Cyclone II EP2C70F896C6N FPGA chip on an ALTERA DE2-70 Development Board. Each word model (template) is stored as a Transition Matrix, Diagonal Covariance Matrices, and Mean Vectors in the system memory. Each word model uses only 4.45 Kbytes regardless of the spoken word length. The word recognition rate (digit/0 to digit/10) was 100% for the individual speaker. The test was conducted at different sound levels of the surrounding environment (53 dB to 73 dB) as measured by a Sound Level Meter (SLM) instrument.
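
The abstract above stores each word model as mean vectors and diagonal covariance matrices. A minimal sketch of how such parameters can score one observation vector is given below: the log-likelihood of a vector under a diagonal-covariance Gaussian mixture, which is the kind of emission score an HMM state would consume. The function names, the mixture weights argument and the use of log-sum-exp are assumptions for illustration; the paper's Nios II implementation is not described at this level of detail.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of x under a Gaussian with diagonal covariance.

    With a diagonal covariance the density factorizes over dimensions, so
    only the per-dimension variances need to be stored.
    """
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v
        for xi, mu, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of x under a mixture of diagonal Gaussians.

    Uses the log-sum-exp trick so that very small component densities
    do not underflow to zero.
    """
    terms = [
        math.log(w) + log_gaussian_diag(x, mu, var)
        for w, mu, var in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))
```

Storing only variances instead of full covariance matrices reduces the per-component storage from D*D to D values for D-dimensional features, which fits the small fixed per-word footprint the abstract reports.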
