Electronic Journal of University Malaya (EJUM)

Malaysian Journal of Computer Science (ISSN 0127-9084)

Elsevier Computer Speech and Language Journal

EURASIP Journal on Audio, Speech, and Music Processing

Arabian Journal for Science and Engineering

Speech Communication Journal

Speech Communication


Authors’ Home

Tutorial for Authors


Subjective comparison and evaluation of speech enhancement algorithms

Automatic speech recognition and speech variability: A review

Thai speech processing technology: A review

Web of Science

Don’t waste time on other databases; go straight to what matters most, i.e. Web of Science.


Journals for submission

The Arabian Journal for Science and Engineering (AJSE)


Lexical - In linguistics, the lexicon (from the Greek: Λεξικόν) of a language is its vocabulary, including its words and expressions. More formally, it is a language’s inventory of lexemes.

Lexicography - The practice of compiling dictionaries.

concordance - An alphabetical list of words present in a text, usually with citations of the passages concerned: “a concordance to the Bible”.

prosody - In linguistics, prosody (from Greek προσῳδία, prosōidía) is the rhythm, stress, and intonation of speech. …

speech synthesis - Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware. …

diphthong – In phonetics, a diphthong (also gliding vowel) (from Greek δίφθογγος, diphthongos, literally “two sounds” or “two tones”) is a contour vowel, that is, a unitary vowel that changes quality during its pronunciation, or “glides”, with a smooth movement of the tongue from one articulation to …

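F0 - In speech processing, the fundamental frequency: the rate at which the vocal folds vibrate during voiced speech, perceived as pitch. (In filter design, f0 instead denotes the center frequency: the midpoint of the bandpass filter passband, normally expressed as the arithmetic mean of the two -3 dB points.)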

suprasegmental - The cues of language that come from pitch, intensity and durational differences in the pattern of speech. Suprasegmentals are what allow an English speaker to recognize the inflection of a question, even though the question is asked in another language.

monosyllabic - Having only one syllable. A syllable is a unit of organization for a sequence of speech sounds. For example, the word water is composed of two syllables: wa and ter. A syllable is typically made up of a syllable nucleus (most often a vowel) with optional initial and final margins (typically, consonants).

polysyllabic - (of words) long and ponderous; having many syllables; “sesquipedalian technical terms”

Morphological - relating to or concerned with the formation of admissible words in a language

morpheme - minimal meaningful language unit; it cannot be divided into smaller meaningful units

semantic - of or relating to meaning or the study of meaning; “semantic analysis”

syntactic - of or relating to or conforming to the rules of syntax; “the syntactic rules of a language”

ambiguous - having more than one possible meaning; “ambiguous words”; “frustrated by ambiguous instructions, the parents were unable to assemble the toy”

Homograph - two words are homographs if they are spelled the same way but differ in meaning (e.g. fair)

concatenation - the state of being linked together as in a chain; union in a linked series

sparse - Having widely spaced intervals; Not dense; meager

heuristic - a commonsense rule (or set of rules) intended to increase the probability of solving some problem

Viterbi decoder - A Viterbi decoder uses the Viterbi algorithm for decoding a bitstream that has been encoded using forward error correction based on a convolutional code.
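In speech recognition the same Viterbi algorithm is usually run over a hidden Markov model rather than a convolutional code: it finds the single most likely state sequence for a series of observations. The following is a minimal sketch, assuming a toy two-state model; the state names (sil, speech), the observation labels and every probability are invented for illustration.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (log probability, state path) of the most likely state sequence."""
    # best[s] holds the best log score and the path that ends in state s.
    best = {
        s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), [s])
        for s in states
    }
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Pick the predecessor that maximizes the score of reaching s.
            prev = max(states, key=lambda p: best[p][0] + math.log(trans_p[p][s]))
            score = (best[prev][0] + math.log(trans_p[prev][s])
                     + math.log(emit_p[s][obs]))
            new_best[s] = (score, best[prev][1] + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])

# Hypothetical toy model: silence vs. speech, observed through energy levels.
states = ["sil", "speech"]
start_p = {"sil": 0.6, "speech": 0.4}
trans_p = {"sil": {"sil": 0.7, "speech": 0.3},
           "speech": {"sil": 0.4, "speech": 0.6}}
emit_p = {"sil": {"low": 0.5, "mid": 0.4, "high": 0.1},
          "speech": {"low": 0.1, "mid": 0.3, "high": 0.6}}

log_prob, path = viterbi(["low", "mid", "high"], states, start_p, trans_p, emit_p)
print(path, math.exp(log_prob))  # ['sil', 'sil', 'speech'] with probability ~0.01512
```

Working in log probabilities avoids numerical underflow on long observation sequences; the sketch assumes no zero probabilities in the model.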

pharyngeal - guttural: a consonant articulated in the back of the mouth or throat.

pharynx – throat: the passage to the stomach and lungs; in the front part of the neck below the chin and above the collarbone

uvular - Uvulars are consonants articulated with the back of the tongue against or near the uvula, that is, further back in the mouth than velar consonants. …

consonants - In articulatory phonetics, a consonant is a speech sound that is articulated with complete or partial closure of the vocal tract. …

accented - accent – dialect: the usage or vocabulary that is characteristic of a specific group of people; “the immigrants spoke an odd dialect of English”; “he has a strong German accent”; “it has been said that a language is a dialect with an army and navy”

lattice – In mathematics, a lattice is a partially ordered set

pervasive - Spreading widely throughout an area or a group of people.

geminated - In phonetics, gemination happens when a spoken consonant is pronounced for an audibly longer period of time than a short consonant.

syllable - A unit of pronunciation having one vowel sound, with or without surrounding consonants, forming the whole or a part of a word; e.g., there are two syllables in water and three in inferno.

vowel - A speech sound that is produced by a comparatively open configuration of the vocal tract, with vibration of the vocal cords but without audible friction, and that is a unit of the sound system of a language forming the nucleus of a syllable.

consonant - A basic speech sound in which the breath is at least partly obstructed and which can be combined with a vowel to form a syllable

semantics - Semantics (from Greek sēmantiká, neuter plural of sēmantikós) is the study of meaning. It focuses on the relation between signifiers, such as words, …

Semitic - Relating to or denoting a family of languages that includes Hebrew, Arabic, and Aramaic and certain ancient languages such as Phoenician and Akkadian, constituting the main subgroup of the Afro-Asiatic family.

Standard Arabic – has basically 34 phonemes, of which six are vowels, and 28 are consonants

phoneme - Any of the perceptually distinct units of sound in a specified language that distinguish one word from another, for example p, b, d, and t in the English words pad, pat, bad, and bat.

emphatic - 1. Showing or giving emphasis; expressing something forcibly and clearly. 2. (of an action or event or its result) Definite and clear.

narrator - 1. A person who narrates something, esp. the events of a novel or narrative poem: “his poetic efforts are mocked by the narrator of the story”. 2. A person who delivers a commentary accompanying a movie, broadcast, piece of music, etc.

formant - Formants are defined by Fant as ‘the spectral peaks of the sound spectrum |P(f)|’ of the voice. Formant is also used to mean an acoustic resonance, and, in speech science and phonetics, a resonance of the human vocal tract.

Resonance frequency – In physics, resonance is the tendency of a system (usually a linear system) to oscillate with larger amplitude at some frequencies than at others. These are known as the system’s resonant frequencies (or resonance frequencies).

allophone - Any of the spoken speech sounds that represent a single phoneme, such as the aspirated k in kit and the unaspirated k in skit, which are allophones of the phoneme /k/.

pronunciation - The way in which a word is pronounced.

phonology - 1. The branch of linguistics that deals with systems of sounds (including or excluding phonetics), esp. in a particular language. 2. The system of relationships among the speech sounds that constitute the fundamental components of a language.

morphology – In linguistics, morphology is the identification, analysis and description of the structure of morphemes and other units of meaning in a language, such as words.

syntax - The arrangement of words and phrases to create well-formed sentences in a language.

lexicon - 1. The vocabulary of a person, language, or branch of knowledge. 2. A dictionary, esp. of Greek, Hebrew, Syriac, or Arabic: “a Greek–Latin lexicon”.

acoustic model - An acoustic model is created by taking audio recordings of speech, and their text transcriptions, and using software to create statistical representations of the sounds that make up each word. It is used by a speech recognition engine to recognize speech.

language model - A statistical language model assigns a probability to a sequence of m words by means of a probability distribution.
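One common way to build such a distribution is an n-gram model, which approximates the probability of a word sequence as a product of conditional probabilities of each word given its predecessors. A minimal sketch of a bigram model with maximum-likelihood estimates and no smoothing follows; the three-sentence corpus and the function names are made up for illustration.

```python
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams, padding each sentence with <s> and </s>."""
    histories, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        histories.update(tokens[:-1])                 # every token that has a successor
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return histories, bigrams

def sentence_probability(sentence, histories, bigrams):
    """P(w_1..w_m) as the product of P(w_i | w_{i-1}), unsmoothed."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        if histories[prev] == 0:
            return 0.0                                # unseen history
        prob *= bigrams[(prev, word)] / histories[prev]
    return prob

corpus = ["the cat sat", "the dog sat", "the cat ran"]  # toy training data
histories, bigrams = train_bigram(corpus)
p_seen = sentence_probability("the cat sat", histories, bigrams)
p_unseen = sentence_probability("the dog ran", histories, bigrams)
print(p_seen, p_unseen)
```

The unseen word pair drives the second probability to zero, which is why practical language models add smoothing or back-off.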

continuum – 1. Dialect continuum, the transition of one language to another through a series of speech variations. 2. A continuous sequence in which adjacent elements are not perceptibly different from each other, although the extremes are quite distinct

adaptation - 1. The action or process of adapting or being adapted.

bootstrapping - Get (oneself or something) into or out of a situation using existing resources.

diacritic - Noun: A sign, such as an accent or cedilla, which when written above or below a letter indicates a difference in pronunciation from the same letter when unmarked or differently marked. Adjective: (of a mark or sign) Indicating a difference in pronunciation.

transcription - 1. A written or printed representation of something. 2. The action or process of transcribing something.

Buckwalter stemmer – An Arabic morphological analysis tool available from the LDC.

stemming - In linguistic morphology and information retrieval, stemming is the process for reducing inflected (or sometimes derived) words to their stem, base or root.
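The idea can be illustrated with a deliberately naive suffix stripper for English. This is only a sketch: the suffix list and the minimum stem length are assumptions chosen for the example, and it is neither the Porter algorithm nor the Buckwalter tool mentioned above.

```python
# Hypothetical, tiny rule set; longer suffixes are tried first.
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word, min_stem=3):
    """Strip the first matching suffix, provided enough of the word remains."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[:-len(suffix)]
    return word

stems = [naive_stem(w) for w in ["walking", "walked", "walks", "boxes", "sing", "bus"]]
print(stems)  # ['walk', 'walk', 'walk', 'box', 'sing', 'bus']
```

The minimum stem length stops short words such as "sing" and "bus" from being cut down to fragments.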

concomitant - Adjective: Naturally accompanying or associated. Noun: A phenomenon that naturally accompanies or follows something.

corpus - 1. A collection of written texts, esp. the entire works of a particular author or a body of writing on a particular subject. 2. The main body or mass of a structure.

Expectation-Maximization algorithm - In statistics, an expectation-maximization (EM) algorithm is a method for finding maximum likelihood estimates of parameters in statistical models, where the model depends on unobserved latent variables.
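A small worked case makes the two steps concrete. The sketch below assumes a textbook-style setting that is not taken from these notes: each trial is ten tosses of one of two biased coins, which coin was used is the unobserved latent variable, and both coins are equally likely a priori. The data and starting guesses are illustrative.

```python
import math

def em_step(heads_per_trial, tosses, theta_a, theta_b):
    """One EM iteration for a two-coin mixture with equal coin priors."""
    heads_a = tails_a = heads_b = tails_b = 0.0
    for h in heads_per_trial:
        t = tosses - h
        # E-step: posterior probability that coin A produced this trial.
        like_a = theta_a ** h * (1 - theta_a) ** t
        like_b = theta_b ** h * (1 - theta_b) ** t
        resp_a = like_a / (like_a + like_b)
        heads_a += resp_a * h
        tails_a += resp_a * t
        heads_b += (1 - resp_a) * h
        tails_b += (1 - resp_a) * t
    # M-step: re-estimate each bias from its expected head and tail counts.
    return heads_a / (heads_a + tails_a), heads_b / (heads_b + tails_b)

def log_likelihood(heads_per_trial, tosses, theta_a, theta_b):
    """Log-likelihood of the data, up to a constant (binomial coefficients omitted)."""
    total = 0.0
    for h in heads_per_trial:
        t = tosses - h
        total += math.log(0.5 * theta_a ** h * (1 - theta_a) ** t
                          + 0.5 * theta_b ** h * (1 - theta_b) ** t)
    return total

data, n = [5, 9, 8, 4, 7], 10          # heads observed in each ten-toss trial
theta_a, theta_b = 0.6, 0.5            # arbitrary starting guesses
history = [log_likelihood(data, n, theta_a, theta_b)]
for _ in range(20):
    theta_a, theta_b = em_step(data, n, theta_a, theta_b)
    history.append(log_likelihood(data, n, theta_a, theta_b))
print(theta_a, theta_b)
```

Each iteration should leave the log-likelihood no lower than before, which is the property that makes EM useful for training models such as HMMs.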

graphical modeling toolkit GMTK - The Graphical Models Toolkit (GMTK) is an open source, publicly available toolkit for developing graphical-model and dynamic Bayesian network (DBN) based …

triphone - In linguistics, a triphone is a sequence of three phonemes. Triphones are useful in models of natural language processing to establish the various contexts in which a phoneme can occur in a particular natural language
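In acoustic modeling, a triphone is usually written as a phone together with its left and right neighbours. The sketch below expands a phone sequence into such context-dependent labels; the left-phone+right label format follows the convention commonly seen in HTK-style model lists, and the word-internal treatment of the first and last phone is an assumption made for this example.

```python
def to_triphones(phones):
    """Expand a phone sequence into context-dependent 'left-phone+right' labels."""
    labels = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] + "-" if i > 0 else ""                 # no left context at the start
        right = "+" + phones[i + 1] if i < len(phones) - 1 else ""  # no right context at the end
        labels.append(left + phone + right)
    return labels

triphones = to_triphones(["k", "ae", "t"])
print(triphones)  # ['k+ae', 'k-ae+t', 'ae-t']
```

With N phones there are up to N cubed possible triphones, which is why toolkits cluster rare contexts, for example with decision trees.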

Modern Standard Arabic - Modern Standard Arabic (MSA; اللغة العربية الفصحى, “the most eloquent Arabic language”), also called Standard Arabic or Literary Arabic, is the standard and literary variety of Arabic used in writing and in formal speech. It is part of the Arabic macrolanguage.

West Point (Linguistic Data Consortium, LDC) - Modern Standard Arabic database http://www.ldc.upenn.edu/

Hidden Markov Model Toolkit (HTK) - The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models.

interpolated - 1. Insert (something) between fixed points. 2. Insert (words) in a book or other text, esp. in order to give a false impression as to its date.

glottal consonant - Glottal consonants, also called laryngeal consonants, are consonants articulated with the glottis. Many phoneticians consider them, or at least the so-called fricative, to be transitional states of the glottis without a point of articulation as other consonants have; in fact, some do not …

geminated consonant - In phonetics, gemination happens when a spoken consonant is pronounced for an audibly longer period of time than a short consonant.

In Arabic, gemination is defined as the succession of two identical consonants pronounced consecutively, and is written with the shadda (الشدّة) diacritic « ّ ».

emphatic consonant - Emphatic consonant is a term widely used in Semitic linguistics to describe one of a series of obstruent consonants which originally contrasted with series of both voiced and voiceless obstruents.

fricative consonant - a continuant consonant produced by breath moving against a narrowing of the vocal tract.

diacritize, diacritized, diacritization

Baum-Welch algorithm - In electrical engineering, computer science, statistical computing and bioinformatics, the Baum–Welch algorithm is used to find the unknown parameters of a hidden Markov model (HMM). It makes use of the forward-backward algorithm and is named for Leonard E. Baum and Lloyd R. Welch

decision tree method – A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.

perplexity - 1. Inability to deal with or understand something complicated or unaccountable. 2. A complicated or baffling situation or thing.

polysyllable – a word of more than three syllables.

Maximum Likelihood Linear Regression (MLLR)

Maximum a posteriori (MAP)

monolingual - Adjective: (of a person or society) Speaking only one language. Noun: A person who speaks only one language.

Colloquial - Adjective: (of language) Used in ordinary conversation; not formal or literary.

SAMPA - The Speech Assessment Methods Phonetic Alphabet (SAMPA) is a computer-readable phonetic script using 7-bit printable ASCII characters, based on the International Phonetic Alphabet (IPA) http://www.phon.ucl.ac.uk/home/sampa/

Buckwalter transliteration - The Buckwalter Arabic transliteration was developed at Xerox by Tim Buckwalter in the 1990s. It is an ASCII only transliteration scheme, representing Arabic orthography strictly one-to-one, unlike the more common romanization schemes that add morphological information not expressed in Arabic
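Because the scheme is one-to-one, it can be implemented as a plain character table and reversed without loss. The sketch below covers the basic letters and diacritics in two contiguous Unicode ranges; the table layout and function names are choices made for this example, and the extended symbols of the scheme (such as dagger alif and alif wasla) are left out.

```python
# Arabic code points U+0621..U+063A (hamza to ghain) and U+0641..U+0652 (fa to sukun).
ARABIC = ([chr(c) for c in range(0x0621, 0x063B)]
          + [chr(c) for c in range(0x0641, 0x0653)])
# Buckwalter symbols in the same order as the code points above.
LATIN = "'|>&<}AbptvjHxd*rzs$SDTZEg" + "fqklmnhwYyFNKaui~o"
assert len(ARABIC) == len(LATIN)

TO_BUCKWALTER = dict(zip(ARABIC, LATIN))
FROM_BUCKWALTER = dict(zip(LATIN, ARABIC))

def to_buckwalter(text):
    """Map each Arabic character to its ASCII symbol; leave other characters unchanged."""
    return "".join(TO_BUCKWALTER.get(ch, ch) for ch in text)

def from_buckwalter(text):
    """Inverse mapping, possible because the scheme is strictly one-to-one."""
    return "".join(FROM_BUCKWALTER.get(ch, ch) for ch in text)

word = "\u0643\u062A\u0627\u0628"   # kaf, ta, alif, ba: the word for "book"
ascii_form = to_buckwalter(word)
print(ascii_form)                   # ktAb
```

Note that the inverse function is only safe on text that is already in Buckwalter form, since ordinary Latin letters would also be mapped to Arabic.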

Buckwalter Morphological Analyzer - to generate all possible pronunciations of a word

International Phonetic Alphabet (IPA) - An internationally recognized set of phonetic symbols developed in the late 19th century, based on the principle of strict one-to-one correspondence between sounds and symbols.

UCLA Phonological Segment Inventory Database (UPSID) - The UCLA Phonological Segment Inventory Database (or UPSID) is a statistical survey of the phoneme inventories in 451 of the world’s languages.

pertinence - applicability: relevance by virtue of being applicable to the matter at hand.

prolongation - the act of prolonging something; “there was an indefinite prolongation of the peace talks”.

elongated - 1. Unusually long in relation to its width. 2. Having grown or been made longer.

PRAAT - Praat (also the Dutch word for “talk”) is a free scientific software program for the analysis of speech in phonetics. It has been designed and continuously developed by Paul Boersma and David Weenink of the University of Amsterdam.

perilous - 1. Full of danger or risk. 2. Exposed to imminent risk of disaster or ruin

Romanization - In linguistics, romanization or latinization is the representation of a written word or spoken speech with the Roman (Latin) script, or a system for doing so, where the original word or language uses a different writing system (or none).

Romanization of Arabic - http://en.wikipedia.org/wiki/Romanization_of_Arabic

Romanization is often termed “transliteration”, but this is not technically correct. Transliteration is the direct representation of foreign letters using Latin symbols, while most systems for romanizing Arabic are actually transcription systems, which represent the sound of the language. As an example, the rendering munāẓarat al-ḥurūf al-ʿarabiyyah of the Arabic مناظرة الحروف العربية is a transcription, indicating the pronunciation; an example transliteration would be mnaẓrH alḥrwf alʿrbyH.

ambiguous - 1. (of language) Open to more than one interpretation; having a double meaning. 2. Unclear or inexact because a choice between alternatives has not been made.

grapheme-based system - The grapheme-based recognizer will function on context-dependent models, which are generated by applying a decision tree defined … The major advantage of using graphemes as subword units is that the definition of the lexicon is easy. In previous studies, results comparable to phoneme-based …

phoneme based vs grapheme based

LDC Iraqi Arabic Morphological Lexicon

discriminative - capable of making fine distinctions

DARPA TransTAC (Spoken Language Communication and Translation System for Tactical Use)

Levant - The Levant (/ləˈvænt/) is the area of Western Asia bounded by the Mediterranean to the west, the Taurus Mountains to the north, the Arabian Desert to the south, and the Syrian Desert to the east. The Levant includes modern Lebanon, Syria, Jordan, Israel and the Palestinian Territories and is similar to the historic area called Syria, Greater Syria, or the Bilad al-Sham. The Levant has been described as the “crossroads of western Asia, the eastern Mediterranean and northeast Africa”.

concatenate – in computer programming, string concatenation is the operation of joining two character strings end-to-end. For example, the strings “snow” and “ball” may be concatenated to give “snowball”

densities - 1. The degree of compactness of a substance: “bone density”. 2. A measure of the amount of information on a storage medium (tape or disk).

Manner of production:

fricative - Adjective: Denoting a type of consonant made by the friction of breath in a narrow opening, producing a turbulent air flow. Noun: A consonant made in this way, e.g., f and th.

plosive - Noun: A plosive speech sound. The basic plosives in English are t, k, and p (voiceless) and d, g, and b (voiced). Adjective: Denoting a consonant that is produced by stopping the airflow using the lips, teeth, or palate, followed by a sudden release of air.

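trill - Noun: (Phonetics) A consonant produced by rapid vibration of one articulator against another, such as a rolled r. More generally, a quavering or vibratory sound, esp. a rapid alternation of sung or played notes. Verb: Produce a quavering or warbling sound.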

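lateral - Noun: (Phonetics) A consonant in which the airstream flows along one or both sides of the tongue, e.g., l. More generally, a side part of something, esp. a shoot or branch growing out from the side of a stem. Verb: Throw (a football) sideways or backward.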

nasal - Noun: A nasal speech sound. Adjective: Of, for, or relating to the nose: “the nasal passages”; “nasal congestion”; “a nasal spray”.

approximant - Noun: (Phonetics) A consonant produced by bringing one articulator close to another without narrowing the vocal tract enough to create turbulent airflow, e.g., w, y, and r in English.

Place of articulation:

glottal - (Phonetics) Articulated or pronounced at or with the glottis.

bilabial - Adjective: (of a speech sound) Formed by closure or near closure of the lips, as in p, b, m, w. Noun: A consonant sound made in such a way.

dental - Noun: A dental consonant. Adjective: Of or relating to the teeth.

velar - Noun: A velar sound. Adjective: Of or relating to a veil or velum.

alveolar - Noun: An alveolar consonant. Adjective: Relating to an alveolus.

palatal - Noun: A palatal sound. Adjective: Of or relating to the palate: “a palatal lesion”.

convolved - Past tense and past participle of convolve. Verb: Roll or coil together; entwine.