Title: CONTINUOUS SPEECH RECOGNITION FOR THE PORTUGUESE USING HIDDEN MARKOV MODELS
Author: SIDNEY CERQUEIRA BISPO DOS SANTOS
Contributor(s): ABRAHAM ALCAIM - Advisor
Cataloged: 24/MAY/2006
Language(s): PORTUGUESE - BRAZIL
Type: TEXT
Subtype: THESIS
Notes: All data contained in the documents are the sole responsibility of the authors. The data used in the descriptions of the documents are in conformity with the systems of the administration of PUC-Rio.
Reference(s):
[pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=8372&idi=1
[en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=8372&idi=2
DOI: https://doi.org/10.17771/PUCRio.acad.8372
Abstract:
This work presents several contributions to the
improvement of CDHMM-based Continuous Speech Recognition
(CSR) systems. Most of these contributions are specific
to the Portuguese language.
Two reduced sets of phonetic units, based on the
characteristics of the Portuguese language, are proposed.
Several initialization procedures are analyzed, and a
fast and efficient method of model initialization is
proposed. Methods are described for segmenting
sentences and for concatenating units to form word and
sentence models. An efficient training algorithm for the
reduced sets of units is then proposed. Simulation results
show that the performance of the two sets is comparable
when bigrams are used. The number of units in these sets
is significantly smaller than the number of diphones or
triphones, which are widely used sets of context-dependent
units.
The performance of continuous speech recognizers is
strongly dependent on the speech features. For this
reason, a comparative evaluation of several sets of
features for the Portuguese language is carried out. The
PLP coefficients with their first and second derivatives
yielded the best results.
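
As an illustration of what "first and second derivatives" of a feature set usually means, the sketch below appends regression-based delta and delta-delta coefficients to a sequence of static feature vectors. It is a minimal sketch, not the thesis's procedure: the regression formula, the window of 2 frames, the edge handling, and the function names `deltas` and `add_derivatives` are all assumptions made for this example, and the static coefficients (for instance PLP) are taken as already computed.

```python
def deltas(frames, window=2):
    """Regression-based time derivative of a sequence of feature vectors.

    Assumed formula: d[t] = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2),
    for n = 1..window, with the first/last frame replicated at the edges.
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, window + 1))
    result = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        result.append(row)
    return result


def add_derivatives(frames, window=2):
    """Return each frame extended with its first and second derivatives."""
    d1 = deltas(frames, window)
    d2 = deltas(d1, window)  # second derivative = derivative of the deltas
    return [static + first + second
            for static, first, second in zip(frames, d1, d2)]


if __name__ == "__main__":
    # Toy 1-dimensional "feature" that grows linearly over 7 frames.
    ramp = [[float(t)] for t in range(7)]
    for frame in add_derivatives(ramp):
        print(frame)
```

With this layout, a feature vector of dimension D becomes a vector of dimension 3D, which is the form typically fed to the HMM output distributions.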
A Continuous Speech Recognition system that uses syntactic
knowledge of the Portuguese language is proposed. This
system makes use of task-dependent knowledge for
automatic dial-up telephone calls. The recognition system
allows parsing of digits as well as natural numbers.
This is a user-friendly feature that gives
the caller a large degree of freedom in placing a call.
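
To make the idea of accepting both digits and natural numbers concrete, the following sketch maps a sequence of recognized Portuguese words to a dial string, so that "dois um" and "vinte e um" both yield "21". It is only an illustration under stated assumptions: the vocabulary tables, the restriction to numbers below 100, and the function name `words_to_dial_string` are hypothetical and do not reproduce the grammar or the finite state machine used in the thesis.

```python
# Hypothetical vocabulary for the illustration (numbers below 100 only).
DIGITS = {"zero": 0, "um": 1, "dois": 2, "três": 3, "quatro": 4,
          "cinco": 5, "seis": 6, "sete": 7, "oito": 8, "nove": 9}
TEENS = {"dez": 10, "onze": 11, "doze": 12, "treze": 13, "catorze": 14,
         "quinze": 15, "dezesseis": 16, "dezessete": 17, "dezoito": 18,
         "dezenove": 19}
TENS = {"vinte": 20, "trinta": 30, "quarenta": 40, "cinquenta": 50,
        "sessenta": 60, "setenta": 70, "oitenta": 80, "noventa": 90}


def words_to_dial_string(words):
    """Convert recognized words (isolated digits or two-digit natural
    numbers) into the string of digits to be dialed."""
    out = []
    i = 0
    while i < len(words):
        word = words[i]
        if word in TENS:
            value = TENS[word]
            # An optional "e <digit 1-9>" completes the natural number,
            # as in "vinte e um" (twenty-one).
            if (i + 2 < len(words) and words[i + 1] == "e"
                    and DIGITS.get(words[i + 2], 0) > 0):
                value += DIGITS[words[i + 2]]
                i += 2
            out.append(str(value))
        elif word in TEENS:
            out.append(str(TEENS[word]))
        elif word in DIGITS:
            out.append(str(DIGITS[word]))
        else:
            raise ValueError(f"word outside the toy grammar: {word!r}")
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(words_to_dial_string("dois um".split()))
    print(words_to_dial_string("vinte e um".split()))
```

In a real recognizer this mapping would be part of the syntactic constraints applied during decoding rather than a post-processing step; the sketch only shows why the two ways of speaking a number can lead to the same call.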
Based on the finite state machine proposed for the
implementation of the speech recognizer described in this
thesis, two parsing algorithms are analyzed: Level
Building and One Pass. A new algorithm is then
proposed, which is more efficient than these two
techniques. The proposed scheme is also better suited to the
use of syntactic and task-dependent knowledge sources.
Another contribution of this thesis concerns the use
of syllables as phonetic units in Portuguese-based CSR
systems. Speaker-dependent and speaker-independent tasks are
examined. It is shown that syllables provide good results
when used as phonetic units in Portuguese-based CSR
systems, in contrast with their poor performance in
English-based recognition schemes.
Finally, the influence of function words is analyzed in
Portuguese-based speech recognition systems. Although
function words play a critical role in English-based CSR,
it was found that this is not true for the Portuguese
language.