Speech Processing Approaches towards Characterization and Identification of Dialects

Chittaragi, Nagaratna B.

Please use this identifier to cite or link to this item: https://idr.l3.nitk.ac.in/jspui/handle/123456789/16839

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Koolagudi, Shashidhar G.	-
dc.contributor.author	Chittaragi, Nagaratna B.	-
dc.date.accessioned	2021-08-17T10:43:29Z	-
dc.date.available	2021-08-17T10:43:29Z	-
dc.date.issued	2020	-
dc.identifier.uri	http://idr.nitk.ac.in/jspui/handle/123456789/16839	-
dc.description.abstract	Dialects are phonological, lexical, and grammatical variations in the usage of a language, often with very minor and subtle differences. These variations arise mainly from specific speaking patterns followed within a group of speakers. In the recent past, dialect identification from speech has emerged as one of the prominent speech research areas, mainly because of the extensive increase in the use of interactive voice-based systems. It is therefore essential to address speech variability caused by dialectal differences in order to achieve effective, realistic man-machine interaction. Existing research on the characterization and identification of dialects has mainly focused on acoustic, phonetic, and phonotactic approaches for several languages such as English, Chinese, Arabic, Hindi, and Spanish. However, these models have not been proven to be language independent. They may not perform equally well on other languages, as there are many fundamental differences between the dialects of different languages. Moreover, considerably fewer dialect processing models have been reported in the literature for Indian regional languages. In this thesis, an attempt is made to develop a few useful language-independent and language-dependent Automatic Dialect Identification (ADI) systems for the Kannada language. First, a new text-independent Kannada Dialect Speech Corpus (KDSC) is collected from native speakers belonging to five prominent dialectal regions of Karnataka. This thesis investigates the significance of the excitation source, spectral, and prosodic features of speech for dialect identification. Additionally, spectro-temporal variations across dialects are captured through 2D Gabor features, which are known to be biologically inspired. Further, the existence of non-conventional dialect-specific rhythmic and melodic correlations among dialects is explored using chroma features, which are well established in music-related applications. The robustness of the proposed features has been investigated under noisy background conditions and with short (limited-data) audio clips. In addition, word- and sentence-based ADI systems are proposed using intonation and intensity variations representing dynamic and static prosodic behaviors. Further, language-dependent dialect identification systems are proposed for Kannada using dialect information at the level of basic phonetic units. Additionally, dialect identification approaches based on Kannada language-specific 'case' (Vibhakthi Prathyayas) are proposed. A single-classifier approach based on Support Vector Machines (SVM) and multiple-classifier ensemble algorithms are used for dialect classification. Experiments are carried out using individual features and combinations of features. The use of different features illustrates their complementary nature for dialect processing. A performance comparison of the two categories of classification algorithms shows that ensemble algorithms perform better than single-classifier algorithms. Further, capturing rhythm-based aspects of dialects through chroma and spectral-shape features yields better performance than state-of-the-art i-vector features. Moreover, this feature set is more robust to noise than conventional MFCCs. This work also proposes intonation and intensity features to capture dialectal information from words and sentences for effective classification of dialects. Finally, duration, energy, pitch, three formants, and spectral features are also found to carry evidence for Kannada dialect classification.	en_US
dc.language.iso	en	en_US
dc.publisher	National Institute of Technology Karnataka, Surathkal	en_US
dc.subject	Department of Computer Science & Engineering	en_US
dc.subject	Kannada dialect identification	en_US
dc.subject	Spectral features	en_US
dc.subject	Prosodic features	en_US
dc.subject	Excitation source features	en_US
dc.subject	Spectro-temporal features	en_US
dc.subject	Chroma features	en_US
dc.subject	Spectral-shape features	en_US
dc.subject	Dynamic and static features	en_US
dc.subject	Cases	en_US
dc.subject	Support vector machine	en_US
dc.subject	Random forest	en_US
dc.subject	Extreme random forest	en_US
dc.subject	Extreme gradient boosting	en_US
dc.title	Speech Processing Approaches towards Characterization and Identification of Dialects	en_US
dc.type	Thesis	en_US
Appears in Collections:	1. Ph.D Theses

Files in This Item:

File	Description	Size	Format
155112CS15F09.pdf		1.9 MB	Adobe PDF
