The works made available in this Digital Library were published with the express authorization of their respective authors, in accordance with Law 9610/98.
Consultation of the texts, permitted by their respective authors, is free, as is the printing of excerpts or of one complete copy exclusively for personal use. Printing and reproducing complete works for any purpose other than the personal use of the person printing them is not permitted.
The reproduction of short excerpts, in the form of quotations in works by third parties other than the author of the consulted text, is permitted to the extent justified for understanding the quotation, provided that the name of the author of the original text and the source of the research are given alongside the quotation.
Violation of copyright is subject to civil and criminal penalties.
This work presents several contributions to the
improvement of CDHMM-based Continuous Speech Recognition
(CSR) systems. Most of these contributions are specific
to the Portuguese language.
Two reduced sets of phonetic units, based on the
characteristics of the Portuguese language, are proposed.
Several initialization procedures are analyzed, and an
efficient and fast method of model initialization is
proposed. Methods are described for segmentation of
sentences and for concatenation of units to form word and
sentence models. An efficient training algorithm for the
reduced sets of units is then proposed. Simulation results
show that the performance of the two sets is comparable
when bigrams are used. The number of units in these sets
is significantly smaller than the number of diphones or
triphones, which are widely used sets of context-dependent
units.
The performance of Continuous Speech Recognizers is
strongly dependent on the speech features. For this
reason, the performance of several feature sets is
compared for the Portuguese language. The
PLP coefficients with their first and second derivatives
yielded the best results.
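
As an illustration of what appending first and second derivatives to a feature vector involves, here is a minimal Python sketch. It assumes the commonly used regression formula for time derivatives, with the first and last frames repeated at the edges. The abstract does not say which formula or window length was used, so the names `delta`, `add_derivatives` and `window` are hypothetical, and the input frames stand in for PLP coefficients computed elsewhere.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], window: int = 2) -> List[Frame]:
    """Regression-based time derivative of a sequence of feature frames.

    d[t] = sum_{n=1..window} n * (c[t+n] - c[t-n]) / (2 * sum_{n} n^2)
    Frames beyond either end are replaced by the first or last frame.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    if not frames:
        return []
    norm = 2.0 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, last)][k]   # clamp at the last frame
                behind = frames[max(t - n, 0)][k]     # clamp at the first frame
                acc += n * (ahead - behind)
            row.append(acc / norm)
        out.append(row)
    return out


def add_derivatives(frames: List[Frame], window: int = 2) -> List[Frame]:
    """Append first and second derivatives to every static frame."""
    d1 = delta(frames, window)
    d2 = delta(d1, window)  # second derivative = derivative of the first
    return [c + a + b for c, a, b in zip(frames, d1, d2)]
```

With this layout, a sequence of frames with 12 static coefficients each would become a sequence of 36-dimensional vectors (static, first derivative, second derivative).
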
A Continuous Speech Recognition System that uses syntactic
knowledge of the Portuguese language is proposed. This
system makes use of task-dependent knowledge for
automatic telephone dialing. The recognition system
allows parsing of digits as well as natural numbers.
This user-friendly feature gives the caller a large
degree of freedom in placing a call.
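
To illustrate the difference between dictating a number digit by digit and using natural numbers, here is a minimal sketch of a word-to-digit parser. The vocabulary, the function name `words_to_digits`, and the rule that a tens word may be followed by "e" and a unit are assumptions made for this example only; they are not the grammar used in the thesis.

```python
from typing import List

# Hypothetical vocabulary for the illustration.
UNITS = {"zero": 0, "um": 1, "dois": 2, "três": 3, "quatro": 4,
         "cinco": 5, "seis": 6, "sete": 7, "oito": 8, "nove": 9}
TEENS = {"dez": 10, "onze": 11, "doze": 12, "treze": 13, "quatorze": 14,
         "quinze": 15, "dezesseis": 16, "dezessete": 17, "dezoito": 18,
         "dezenove": 19}
TENS = {"vinte": 20, "trinta": 30, "quarenta": 40, "cinquenta": 50,
        "sessenta": 60, "setenta": 70, "oitenta": 80, "noventa": 90}


def words_to_digits(words: List[str]) -> str:
    """Turn recognized number words into the digit string to dial.

    Accepts single digits ("dois"), teens ("dezenove") and tens with an
    optional "e <unit>" tail ("vinte e cinco"), in any mixture.
    """
    digits: List[str] = []
    i = 0
    while i < len(words):
        word = words[i]
        if word in UNITS:
            digits.append(str(UNITS[word]))
            i += 1
        elif word in TEENS:
            digits.append(str(TEENS[word]))
            i += 1
        elif word in TENS:
            value = TENS[word]
            # Optional compound: tens word + "e" + non-zero unit.
            if (i + 2 < len(words) and words[i + 1] == "e"
                    and UNITS.get(words[i + 2], 0) != 0):
                value += UNITS[words[i + 2]]
                i += 3
            else:
                i += 1
            digits.append(str(value))
        else:
            raise ValueError(f"unexpected word: {word!r}")
    return "".join(digits)
```

Under these assumptions, a caller could say either "dois cinco" or "vinte e cinco" and obtain the same two digits.
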
Based on the finite state machine proposed for the
implementation of the speech recognizer described in this
thesis, two parsing algorithms are analyzed: Level
Building and One Pass. A new algorithm is then
proposed, which is more efficient than the other two
techniques. The proposed scheme is more suitable for the
use of syntactic and task-dependent knowledge sources.
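
For readers unfamiliar with One Pass decoding, the sketch below shows the general idea on a toy problem: all word models are advanced together, one frame at a time, and a new word may only be entered from the best word that ended at the previous frame. This is not the new algorithm proposed in the thesis, nor its finite state machine. The word models here are hypothetical lists of scalar state means scored with squared error, with no skips and no language model, and the function name `one_pass` is an assumption.

```python
from typing import Dict, List, Sequence, Tuple

INF = float("inf")


def one_pass(obs: Sequence[float],
             models: Dict[str, List[float]]) -> Tuple[str, ...]:
    """Return the lowest-cost word sequence for obs (toy One Pass decoder).

    Each model is a non-empty left-to-right list of state means.
    """
    if not obs or not models:
        return ()
    # cost[w][j]: best cost of a path ending in state j of word w at this frame.
    # hist[w][j]: word sequence along that path, including w itself.
    cost = {w: [INF] * len(m) for w, m in models.items()}
    hist = {w: [()] * len(m) for w, m in models.items()}
    for w, m in models.items():
        cost[w][0] = (obs[0] - m[0]) ** 2  # any word may start at frame 0
        hist[w][0] = (w,)
    for x in obs[1:]:
        # Best word ending at the previous frame: the only entry point
        # into the first state of every word at the current frame.
        prev = min(models, key=lambda w: cost[w][-1])
        end_cost, end_hist = cost[prev][-1], hist[prev][-1]
        new_cost, new_hist = {}, {}
        for w, m in models.items():
            row_c, row_h = [], []
            for j, mean in enumerate(m):
                best_c, best_h = cost[w][j], hist[w][j]  # stay in state j
                if j > 0:
                    if cost[w][j - 1] < best_c:          # advance within the word
                        best_c, best_h = cost[w][j - 1], hist[w][j - 1]
                elif end_cost < best_c:                  # cross a word boundary
                    best_c, best_h = end_cost, end_hist + (w,)
                row_c.append(best_c + (x - mean) ** 2)
                row_h.append(best_h)
            new_cost[w], new_hist[w] = row_c, row_h
        cost, hist = new_cost, new_hist
    # The utterance must end in the last state of some word.
    best = min(models, key=lambda w: cost[w][-1])
    return hist[best][-1] if cost[best][-1] < INF else ()
```

Storing the full word history in each cell keeps the sketch short; a practical decoder would normally keep backpointers and trace back at the end instead.
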
Another contribution of this thesis concerns the use
of syllables as phonetic units in Portuguese-based CSR
systems. Speaker-dependent and speaker-independent tasks
are examined. It is shown that syllables provide good results
when used as phonetic units in Portuguese-based CSR
systems, in contrast with their poor performance in
English-based recognition schemes.
Finally, the influence of function words is analyzed in
Portuguese-based speech recognition systems. Although
function words play a critical role in English-based CSR,
it was found that this is not true for the Portuguese