Convert stereo to mono

Category : Speech Recognition Technology

http://en.wikipedia.org/wiki/Audio_file_format

Using Cubase

http://www.gearslutz.com/board/rap-hip-hop-engineering-production/628274-cubase-convert-stereo-track-mono-track.html

Using Audacity

http://www.ehow.com/how_5039268_convert-stereo-mono-mp.html
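The same conversion can also be scripted. Below is a minimal sketch in Python using only the standard library, assuming the input is an uncompressed 16-bit PCM stereo WAV file. The function names `stereo_to_mono` and `convert_file` are made up for this note, and averaging the two channels is just one common way to downmix.

```python
import struct
import wave


def stereo_to_mono(frames: bytes) -> bytes:
    """Average interleaved 16-bit little-endian stereo samples (L, R, L, R, ...)."""
    count = len(frames) // 2  # number of 16-bit samples
    samples = struct.unpack("<%dh" % count, frames[:count * 2])
    # Each mono sample is the (floored) mean of one left/right pair.
    mono = [(samples[i] + samples[i + 1]) // 2 for i in range(0, count - 1, 2)]
    return struct.pack("<%dh" % len(mono), *mono)


def convert_file(src_path: str, dst_path: str) -> None:
    """Read a 16-bit stereo WAV file and write a mono copy (assumed format)."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo input")
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(stereo_to_mono(frames))
```

Usage would be something like `convert_file("in.wav", "out.wav")`, where both file names are placeholders.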

HTK Julius Tutorial


Fix for a problem on Windows 7 (64-bit)

Step 1

http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial/data-prep/step-1/comments/windows-rror——-lexical-mistake-#O2x_KCy-WEaDrXyaNQqxZA

 

Installing Cygwin on Windows Server 2003


http://www.cygwin.com/ml/cygwin-xfree/2007-12/msg00033.html

http://cygwin.com/faq/faq.using.html#faq.using.bloda

Solution:

http://ist.uwaterloo.ca/~kscully/CygwinSSHD_W2K3.html

 

 

HTK Tutorials


http://www.voxforge.org/home/dev

http://www.voxforge.org/home/docs/cygwin-cheat-sheet

 

Tutorial: Create Acoustic Model – Manually

HTK (v.3.1): Basic Tutorial Nicolas Moreau / 02.02.2002

HTK Tutorial Giampiero Salvi

Introduction to HTK Toolkit Berlin Chen 2005

Building a Simple Speaker Identification System

A TUTORIAL ON USING HIDDEN MARKOV MODELS FOR PHONEME RECOGNITION

HTK


HTKbook

http://svr-www.eng.cam.ac.uk/comp.speech/Section1/Lexical/beep.html - BEEP dictionary

http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S1 - TIMIT database

 

http://htk.eng.cam.ac.uk/docs/faq.shtml

Installing HTK on Microsoft Windows

How to

http://www.phon.ucl.ac.uk/resource/sfs/howto/htk.htm - How To: Use HTK Hidden Markov modelling toolkit with SFS (http://www.phon.ucl.ac.uk/resource/sfs/)

http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html - VOICEBOX: Speech Processing Toolbox for MATLAB

 

http://www.activestate.com/activeperl/downloads

 

How the system works


Training – building the HMMs with the HTK toolkit

The speech signal is input. MFCC analysis extracts its features as a sequence of vectors.

The transcription file of the speech is combined with a dictionary to produce the phonetic dictionary.

The Baum-Welch (BW) algorithm matches the speech features against the phonetic dictionary to build the HMMs.

 

Testing – using MATLAB

The speech signal is input. MFCC analysis extracts its features as a sequence of vectors.

The Viterbi algorithm compares the speech features with the HMMs to find the best match for the input.

Scores are estimated from the likelihood of the input speech features given each HMM.
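To make the testing step more concrete, here is a minimal sketch of Viterbi scoring in Python. It assumes a toy discrete HMM (observations are symbol indices rather than MFCC vectors), and the names `viterbi` and `recognize` are made up for illustration; HTK and MATLAB toolboxes work with continuous Gaussian mixture densities, which this sketch does not model.

```python
import math


def _log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_state_path) for a discrete HMM.

    obs      -- list of observation symbol indices
    start_p  -- start_p[s]: probability of starting in state s
    trans_p  -- trans_p[r][s]: probability of moving from state r to s
    emit_p   -- emit_p[s][o]: probability of state s emitting symbol o
    """
    n_states = len(start_p)
    # delta[s] = best log-probability of any state path ending in s
    delta = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n_states)]
    backptr = []
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + _log(trans_p[r][s]))
            delta.append(prev[best] + _log(trans_p[best][s]) + _log(emit_p[s][o]))
            ptr.append(best)
        backptr.append(ptr)
    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path


def recognize(obs, models):
    """Pick the model name whose best-path log score for obs is highest."""
    return max(models, key=lambda name: viterbi(obs, *models[name])[0])
```

Here `models` is assumed to be a dict mapping a word (or phone) label to its `(start_p, trans_p, emit_p)` tuple, so recognition reduces to scoring the input against every model and keeping the best one.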

 

Acoustic model

Language model

Pronunciation model

Examples of speech processing in MATLAB


http://labrosa.ee.columbia.edu/matlab/

http://www.speech-recognition.de/matlab-files.html

http://www.lintech.org/speech/

http://www.mathworks.com/company/newsletters/digest/2010/jan/word-recognition-system-matlab.html

http://cronos.rutgers.edu/~lrr/dsp%20design%20course/lectures_2010/Lecture3a_2010.pdf

http://sipi.usc.edu/~nkatsam/nassos/resources/speech_course_2004/OnlineSpeechDemos/speechDemo_2004_Part1.html

 

 

HTK Toolkit steps


1. Set path > toolbox, adapt

2. Current folder > AR

3. >> HTK_Scorer_UIAr

clc: clears the Command Window

Speech Features – Acoustic Modelling


Discrete cosine transform (DCT)

Mel-frequency cepstral coefficients (MFCC)

Linear prediction coefficients (LPC)
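Two of these building blocks are easy to sketch in Python. The snippet below shows a commonly used Hz-to-mel mapping and an unnormalized DCT-II, which is the transform typically applied to log mel filterbank energies to obtain MFCCs. The function names are made up for this note, and real front ends (HTK's HCopy, VOICEBOX) add windowing, filterbanks, liftering and scaling that are not shown here.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def dct_ii(x):
    """Unnormalized DCT-II: X[j] = sum_k x[k] * cos(pi * (k + 0.5) * j / N)."""
    n = len(x)
    return [
        sum(x[k] * math.cos(math.pi * (k + 0.5) * j / n) for k in range(n))
        for j in range(n)
    ]
```

For example, `dct_ii(log_energies)[:13]` would keep the first 13 coefficients, where `log_energies` is a hypothetical list of log filterbank outputs for one frame.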

Speech recognition engines


HMM – Hidden Markov Model

ANN – artificial neural networks

DTW – dynamic time warping

A combination of fuzzy logic and ANN

GMM – Gaussian mixture model

 

Several pattern matching algorithms, such as DTW, HMM, and Gaussian mixture model (GMM), have been applied with some well-known speech features, such as MFCC and linear prediction coefficients (LPC).

SVM – support vector machines
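Of the approaches listed, DTW is the simplest to show in a few lines. The following is a minimal sketch in Python, assuming the two inputs are sequences of scalar features and using absolute difference as the local distance. The name `dtw_distance` and the `dist` parameter are made up for illustration; with MFCC vectors one would pass a vector distance such as Euclidean instead.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Return the dynamic time warping cost between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

A template-based recognizer would compute this cost between the input and each stored reference utterance and choose the reference with the smallest cost.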