Commit 3ac6f2db authored by Michal Terepeta, committed by Ben Gamari

real: remove HMMS

Summary:
It doesn't compile and I didn't see any easy way to fix it (I got
stuck at `import Native`, which, according to a comment, comes from
`hbc`). According to another comment, there were other problems with
the test even when it did compile:
  "HMMS test only works on SPARC machines"

In any case, this has been broken for a while so I don't think anyone
will miss it.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>

Test Plan: run nofib

Reviewers: erikd, bgamari

Reviewed By: bgamari

Differential Revision: https://phabricator.haskell.org/D3088
parent 8a11a385
The purpose of executing the Viterbi algorithm is to get an
{\em alignment\/} between the states of the HMM that models an
utterance and the feature vectors generated by the signal processing
module. The programs that analyze alignment files in order to compute
new HMM network models or compute statistics need to be able to read
the alignment data.
\begin{haskell}{Alignments}
> module Alignments(
>       FrameData, Alignment, readAlignment,
>       strip_off_frame_number
>       ) where
> import MaybeStateT
> import PlainTextIO
> import Phones
> import HmmDigraphs
> import Data.Char(isSpace)--1.3
\end{haskell}
Formally, an {\em alignment\/}\index{alignment} is a sequence
of triples. The first element is the feature vector number, the
second element is the phonetic symbol (i.e., which HMM), and the third
component is the HMM state.
\begin{haskell}{Alignment}
> type FrameData = (Int, Phone, HmmState)
> type Alignment = [FrameData]
\end{haskell}
The function \verb~readAlignment~ reads alignment data from a
file. It is assumed that the file contains nothing but alignment
data.
\begin{haskell}{readAlignment}
> readAlignment :: [Char] -> Alignment
> readAlignment cs =
>       let
>         cs' = dropWhile isSpace cs
>       in
>         case readAlignmentFrame cs of
>           Nothing        -> if null cs'
>                             then []
>                             else error "unparsable chars"
>           Just (f, cs'') -> f : readAlignment cs''
\end{haskell}
\begin{haskell}{readAlignmentFrame}
> readAlignmentFrame :: MST [Char] FrameData
> readAlignmentFrame = readsItem `thenMST` \ i ->
>                      readsItem `thenMST` \ p ->
>                      readsItem `thenMST` \ s ->
>                      returnMST (i, p, s)
\end{haskell}
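The module \verb~MaybeStateT~ is not shown here. Judging from how
\verb~readAlignment~ pattern-matches on the result of
\verb~readAlignmentFrame~, a value of type \verb~MST s a~ appears to
behave like a function from a state to an optional result paired with
the remaining state. The following is a minimal sketch under that
assumption. The definitions of \verb~MST~, \verb~returnMST~,
\verb~thenMST~ and \verb~readsItem~ are reconstructions for
illustration, not the library's own code, and \verb~readTriple~ is a
hypothetical stand-in that reads three integers instead of a frame.

```haskell
-- Assumed shape of the parser type: consume some state and either fail
-- or return a value together with the unconsumed state.
type MST s a = s -> Maybe (a, s)

returnMST :: a -> MST s a
returnMST x = \s -> Just (x, s)

-- Run the first parser; on success feed its result to the continuation.
thenMST :: MST s a -> (a -> MST s b) -> MST s b
thenMST m k = \s -> case m s of
  Nothing      -> Nothing
  Just (x, s') -> k x s'

-- Assumed behaviour of readsItem: parse one item using the Read class.
readsItem :: Read a => MST String a
readsItem s = case reads s of
  [(x, rest)] -> Just (x, rest)
  _           -> Nothing

-- Hypothetical analogue of readAlignmentFrame for three Ints.
readTriple :: MST String (Int, Int, Int)
readTriple =
  readsItem `thenMST` \i ->
  readsItem `thenMST` \p ->
  readsItem `thenMST` \s ->
  returnMST (i, p, s)
```

With these definitions, a failure in any of the three reads makes the
whole parser return \verb~Nothing~, which is the case that
\verb~readAlignment~ inspects to decide between end of input and an
error.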
\begin{haskell}{strip_off_frame_number}
> strip_off_frame_number :: [(a,b,c)] -> [(b,c)]
> strip_off_frame_number = map drop_frame_number
> drop_frame_number :: (a,b,c) -> (b,c)
> drop_frame_number (_,b,c) = (b,c)
\end{haskell}
A balanced binary search tree is a data structure enabling
efficient retrieval of data. For training hidden Markov models on a
large set of training utterances with a large vocabulary, it is
necessary to provide efficient retrieval of word pronunciation models.
This section describes a Haskell implementation of balanced binary
search trees as described in~\cite{BirdWadl88} that is sufficient for
our needs (i.e., we don't implement all of the functions normally
associated with the abstract type).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{The Balanced Binary Search Tree Datatype}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
While the data type and functions are basically those of Bird
\& Wadler~\cite[Chapter 9]{BirdWadl88}, we have extended their search
tree structure by tagging each node in the tree with two values
instead of one: a {\em key\/} and a {\em definition}. Note that we
only provide a partial implementation of balanced binary search trees;
for example, we have not bothered to implement a ``delete'' function.
\begin{haskell}{BalBinSTrees}
> module BalBinSTrees(
>       BalBinSTree,      -- don't export the data constructors
>       bbstBuild,
>       bbstInsert,       -- error upon finding duplicate keys
>       bbstInsertQuiet,  -- don't complain about duplicate keys
>       bbstLookUp,
>       bbstMember,
>       bbstDepth,
>       bbstShowKeys,
>       bbstFlatten
>       ) where
\end{haskell}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Implementation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%====================================================================
\subsection*{Representation}
%%====================================================================
We let the balanced binary search tree data type derive the
class ``\verb~Show~'' (formerly ``\verb~Text~'', which also supported
reading) so that we can easily write such trees to plain text files.
The type variable \verb~a~ represents the type of the key and the type
variable \verb~b~ represents the type of the data to be retrieved.
\begin{haskell}{BalBinSTree}
> data BalBinSTree a b =
>       Nil | Node a b (BalBinSTree a b) (BalBinSTree a b)
>       deriving Show{-was:Text-}
\end{haskell}
%%====================================================================
\subsection*{bbstBuild}
%%====================================================================
The function \verb~bbstBuild~ takes an association list (i.e.,
a list of pairs where the first element of the pair is the ``key'' and
the second element of the pair is the ``definition'') as an argument
and returns a balanced binary search tree. The key type must belong to
the Haskell class \verb~Ord~ because the function \verb~bbstInsert~
uses the methods of that class.
\begin{haskell}{bbstBuild}
> bbstBuild :: (Ord a) => [(a,b)] -> BalBinSTree a b
> bbstBuild = foldl bbstInsert Nil
\end{haskell}
%%====================================================================
\subsection*{bbstInsert}
%%====================================================================
The function \verb~bbstInsert~ takes a balanced binary search
tree and an association pair and returns a new balanced binary search
tree which includes the pair, provided that the key is not already
found in the tree. If the key is already in the tree, an error is
signaled and evaluation is halted. The definition of
\verb~bbstInsert~ follows the definition of {\em insert\/}
of~\cite[p.\ 255]{BirdWadl88} except for the way we handle duplicate
keys.
\begin{haskell}{bbstInsert}
> bbstInsert :: (Ord a) => BalBinSTree a b -> (a,b) ->
>               BalBinSTree a b
> bbstInsert Nil (x,d) = Node x d Nil Nil
> bbstInsert (Node y e l r) (x,d)
>       | x <  y    = rebalance (Node y e (bbstInsert l (x,d)) r)
>       | x == y    = error "duplicate key"
>       | otherwise = rebalance (Node y e l (bbstInsert r (x,d)))
\end{haskell}
The function \verb~bbstInsertQuiet~ is similar to
\verb~bbstInsert~ but doesn't complain about duplicate keys, quietly
returning the original tree. The definition of this function follows
the definition of {\em insert\/} of~\cite[p.\ 255]{BirdWadl88}.
\begin{haskell}{bbstInsertQuiet}
> bbstInsertQuiet :: (Ord a) => BalBinSTree a b -> (a,b) ->
>                    BalBinSTree a b
> bbstInsertQuiet Nil (x,d) = Node x d Nil Nil
> bbstInsertQuiet t@(Node y e l r) a@(x,d)
>       | x <  y    = rebalance (Node y e (bbstInsertQuiet l a) r)
>       | x == y    = t
>       | otherwise = rebalance (Node y e l (bbstInsertQuiet r a))
\end{haskell}
The function \verb~rebalance~ is used to bring a binary tree
that is slightly out of balance back into balance. This is the
function {\em rebal\/} of~\cite[p.\ 255]{BirdWadl88}.
\begin{haskell}{rebalance}
> rebalance :: BalBinSTree a b -> BalBinSTree a b
> rebalance t = case slope t of
>                 2  -> shift_right t
>                 -2 -> shift_left t
>                 _  -> t
\end{haskell}
The function \verb~slope~ is the function {\em slope\/}
of~\cite[p.\ 253]{BirdWadl88}.
\begin{haskell}{slope}
> slope :: BalBinSTree a b -> Int
> slope Nil = 0
> slope (Node _ _ l r) = bbstDepth l - bbstDepth r
\end{haskell}
The function \verb~bbstDepth~ computes the depth of a binary
search tree; it is the function {\em depth\/} of~\cite[p.\
235]{BirdWadl88}.
\begin{haskell}{bbstDepth}
> bbstDepth :: BalBinSTree a b -> Int
> bbstDepth Nil = 0
> bbstDepth (Node _ _ l r) = 1 + (bbstDepth l `max` bbstDepth r)
\end{haskell}
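Since \verb~rebalance~ only acts when the slope reaches $2$ or $-2$,
the invariant being maintained is presumably that every node has a
slope of $-1$, $0$ or $1$. The sketch below spells out that reading.
It redeclares the data type so that it stands alone, and the names
\verb~depth~, \verb~isBalanced~, \verb~balanced~ and \verb~chain~ are
hypothetical helpers that are not part of the module.

```haskell
data BalBinSTree a b = Nil | Node a b (BalBinSTree a b) (BalBinSTree a b)

depth :: BalBinSTree a b -> Int
depth Nil            = 0
depth (Node _ _ l r) = 1 + max (depth l) (depth r)

slope :: BalBinSTree a b -> Int
slope Nil            = 0
slope (Node _ _ l r) = depth l - depth r

-- Hypothetical checker: every node's slope must lie between -1 and 1.
isBalanced :: BalBinSTree a b -> Bool
isBalanced Nil              = True
isBalanced t@(Node _ _ l r) =
  abs (slope t) <= 1 && isBalanced l && isBalanced r

-- A balanced three-node tree and a right-leaning chain of the same keys.
balanced, chain :: BalBinSTree Int Char
balanced = Node 2 'b' (Node 1 'a' Nil Nil) (Node 3 'c' Nil Nil)
chain    = Node 1 'a' Nil (Node 2 'b' Nil (Node 3 'c' Nil Nil))
```

Here \verb~slope chain~ is $-2$, the case in which \verb~rebalance~
would call \verb~shift_left~.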
The functions \verb~shift_right~ and \verb~shift_left~ are
used to rebalance a tree. These are the functions {\em shiftr\/} and
{\em shiftl\/} of~\cite[p.\ 255]{BirdWadl88}.
\begin{haskell}{shift_right}
> shift_right (Node x d l r)
>       | slope l == -1 = rotate_right (
>                           Node x d (rotate_left l) r)
>
>       | otherwise     = rotate_right (
>                           Node x d l r)
\end{haskell}
\fixhaskellspacing\begin{haskell}{shift_left}
> shift_left (Node x d l r)
>       | slope r == 1 = rotate_left (
>                          Node x d l (rotate_right r))
>
>       | otherwise    = rotate_left (
>                          Node x d l r)
\end{haskell}
The two rotation operations are defined as follows. These are
the functions {\em rotr\/} and {\em rotl\/} of~\cite[p.\
255]{BirdWadl88}.
\begin{haskell}{rotate_right}
> rotate_right (Node x d (Node y e t1 t2) t3) =
>       Node y e t1 (Node x d t2 t3)
\end{haskell}
\fixhaskellspacing\begin{haskell}{rotate_left}
> rotate_left (Node x d t1 (Node y e t2 t3)) =
>       Node y e (Node x d t1 t2) t3
\end{haskell}
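A small worked example may help in reading the rotation equations.
The sketch below applies a right rotation to a left-leaning chain of
three nodes and checks that the in-order sequence of keys is
unchanged. It is a standalone illustration: the data type is
redeclared with \verb~Eq~ and \verb~Show~, \verb~rotateRight~ adds a
catch-all equation that the original partial function does not have,
and \verb~keys~ and \verb~leftChain~ are hypothetical names.

```haskell
data BalBinSTree a b = Nil | Node a b (BalBinSTree a b) (BalBinSTree a b)
  deriving (Eq, Show)

-- Same equation as rotate_right, plus a catch-all so the sketch is total.
rotateRight :: BalBinSTree a b -> BalBinSTree a b
rotateRight (Node x d (Node y e t1 t2) t3) = Node y e t1 (Node x d t2 t3)
rotateRight t                              = t

-- In-order traversal of the keys.
keys :: BalBinSTree a b -> [a]
keys Nil            = []
keys (Node k _ l r) = keys l ++ [k] ++ keys r

-- Keys 1, 2, 3 inserted so that the tree leans entirely to the left.
leftChain :: BalBinSTree Int Char
leftChain = Node 3 'c' (Node 2 'b' (Node 1 'a' Nil Nil) Nil) Nil
```

Evaluating \verb~rotateRight leftChain~ makes the node with key 2 the
new root, with keys 1 and 3 as its children, while
\verb~keys~ still yields 1, 2, 3.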
%%====================================================================
\subsection*{bbstLookUp}
%%====================================================================
The function \verb~bbstLookUp~ looks for a given key in the
search tree and returns the definition associated with that key. We
restrict the key to the class \verb~Ord~ so that it can be compared to
other keys and to the class \verb~Show~ (formerly \verb~Text~) so that
an informative error message can be printed when a key is not found.
\begin{haskell}{bbstLookUp}
> bbstLookUp :: (Ord a, Show{-was:Text-} a) => BalBinSTree a b -> a -> b
> bbstLookUp Nil x = error ("key " ++ shows x " not found in tree")
> bbstLookUp (Node k d l r) x
>       | x <  k = bbstLookUp l x
>       | x == k = d
>       | x >  k = bbstLookUp r x
\end{haskell}
%%====================================================================
\subsection*{bbstMember}
%%====================================================================
The function \verb~bbstMember~ looks for a given key in the
search tree and returns \verb~True~ if found and \verb~False~ if not.
This is the function {\em member\/} of~\cite[p.\ 246]{BirdWadl88}.
\begin{haskell}{bbstMember}
> bbstMember :: (Ord a) => BalBinSTree a b -> a -> Bool
> bbstMember Nil _ = False
> bbstMember (Node k d l r) x
>       | x <  k = bbstMember l x
>       | x == k = True
>       | x >  k = bbstMember r x
\end{haskell}
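Because \verb~bbstLookUp~ halts with an error on a missing key, callers
are expected to test with \verb~bbstMember~ first or to be sure the
key exists. As an illustration only, the two searches can be folded
into one total function that returns a \verb~Maybe~. The names
\verb~lookupMaybe~, \verb~member~ and \verb~sample~ below are
hypothetical and are not part of the module; the data type is
redeclared so that the sketch stands alone.

```haskell
data BalBinSTree a b = Nil | Node a b (BalBinSTree a b) (BalBinSTree a b)

-- Hypothetical total lookup: Nothing instead of a call to error.
lookupMaybe :: Ord a => BalBinSTree a b -> a -> Maybe b
lookupMaybe Nil _ = Nothing
lookupMaybe (Node k d l r) x = case compare x k of
  LT -> lookupMaybe l x
  EQ -> Just d
  GT -> lookupMaybe r x

-- Membership expressed through the same search.
member :: Ord a => BalBinSTree a b -> a -> Bool
member t x = maybe False (const True) (lookupMaybe t x)

sample :: BalBinSTree String Int
sample = Node "b" 2 (Node "a" 1 Nil Nil) (Node "c" 3 Nil Nil)
```

Using \verb~compare~ performs one comparison per node and makes the
three cases exhaustive, whereas the guards above rely on the
\verb~Ord~ instance being a total order.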
%%====================================================================
\subsection*{bbstShowKeys}
%%====================================================================
We provide a function for displaying the tree structure along
with the keys for the special case when the key type belongs to the
type class \verb~Show~ (formerly \verb~Text~). The function uses tab
characters to indent the different levels.
\begin{haskell}{bbstShowKeys}
> bbstShowKeys :: (Show{-was:Text-} a) => Int -> BalBinSTree a b -> String
> bbstShowKeys ntabs Nil = tabs ntabs ++ "NIL\n"
> bbstShowKeys ntabs (Node x _ l r) =
>       bbstShowKeys (ntabs+1) r ++
>       tabs ntabs ++ show x ++ "\n" ++
>       bbstShowKeys (ntabs+1) l
> tabs ntabs = take ntabs (repeat '\t')
\end{haskell}
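To see what the display looks like, the following standalone sketch
repeats the definition under the hypothetical name \verb~showKeys~
(with \verb~replicate~ in place of \verb~take~ and \verb~repeat~) and
applies it to a three-node tree. The data type is redeclared and
\verb~sample~ is an invented example.

```haskell
data BalBinSTree a b = Nil | Node a b (BalBinSTree a b) (BalBinSTree a b)

-- Right subtree first, then the key, then the left subtree.
showKeys :: Show a => Int -> BalBinSTree a b -> String
showKeys ntabs Nil = tabs ntabs ++ "NIL\n"
showKeys ntabs (Node x _ l r) =
  showKeys (ntabs + 1) r ++
  tabs ntabs ++ show x ++ "\n" ++
  showKeys (ntabs + 1) l

tabs :: Int -> String
tabs n = replicate n '\t'

sample :: BalBinSTree Int Char
sample = Node 2 'b' (Node 1 'a' Nil Nil) (Node 3 'c' Nil Nil)
```

Printing \verb~showKeys 0 sample~ with \verb~putStr~ gives seven
lines: the root key 2 at the left margin, key 3 one tab in above it,
key 1 one tab in below it, and a \verb~NIL~ line two tabs in on either
side of each leaf. The tree therefore reads as if rotated a quarter
turn counter-clockwise.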
%%======================================================================
\subsection*{bbstFlatten}
%%======================================================================
The function \verb~bbstFlatten~ returns a list of all the
key-data pairs within the binary search tree, in ascending key order.
It is basically the function {\em labels\/} of~\cite[p.\ 247]{BirdWadl88}.
\begin{haskell}{bbstFlatten}
> bbstFlatten :: BalBinSTree a b -> [(a,b)]
> bbstFlatten Nil = []
> bbstFlatten (Node k v l r) = bbstFlatten l ++ [(k,v)] ++ bbstFlatten r
\end{haskell}
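The definition above appends lists at every node, so elements in left
subtrees are copied repeatedly. Should that cost matter, one common
alternative is to thread an accumulator through the traversal. The
sketch below is such a variant; \verb~flattenAcc~ and \verb~sample~ are
hypothetical names, the data type is redeclared so the example stands
alone, and nothing here is taken from the original module.

```haskell
data BalBinSTree a b = Nil | Node a b (BalBinSTree a b) (BalBinSTree a b)

-- Hypothetical variant of bbstFlatten: build the result right to left,
-- consing each pair onto the already-flattened remainder.
flattenAcc :: BalBinSTree a b -> [(a, b)]
flattenAcc t = go t []
  where
    go Nil            acc = acc
    go (Node k v l r) acc = go l ((k, v) : go r acc)

sample :: BalBinSTree Int Char
sample = Node 2 'b' (Node 1 'a' Nil Nil) (Node 3 'c' Nil Nil)
```

For \verb~sample~ the result lists the pairs for keys 1, 2 and 3 in
that order, the same order the in-order traversal with \verb~++~
produces.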
%%%%%%%%%% End of BalBinSTrees.lhs %%%%%%%%%%
This program aligns a collection of transcribed speech files
with their hidden Markov models. For each speech file, a plain-text
alignment file is produced.
\begin{haskell}{BatchAlign}
> module Main where
\end{haskell}
The module \verb~Printf~ is a library module provided with the
Chalmers and Glasgow Haskell compilers. It allows C-like printing of
integers, floating point numbers, and strings.
\begin{verbatim}
> import Printf -- needed if you want to use hbc; comment
> -- out this import if you use ghc v0.19
> -- import GhcPrintf -- needed if you want to use ghc v0.19;
> -- comment this import out if you use hbc
\end{verbatim}
The following modules are from a general library and are
described in later chapters in Part~\ref{part:library}.
\begin{verbatim}
> import NativeIO
> import PlainTextIO
\end{verbatim}
The following modules are specifically for training HMMs.
They were documented in earlier chapters (Part~\ref{part:modules}).
\begin{verbatim}
> import Phones
> import Pronunciations
> import HmmDigraphs
> import HmmDensities
> import Viterbi
> import HmmConstants
> import System.Environment
> import Data.Array
> import System.IO
> type Assoc a b = (a,b)
> (=:) a b = (a,b)
#define amap fmap
\end{verbatim}
In the main expression, the function \verb~build_tmt~
(Section~\ref{sc:tmt}) builds the tied-mixture table and then applies
a continuation.
\begin{verbatim}
> main = getArgs >>= \args ->
>       case args of
>       [gms_dir,       -- Gaussian mixtures directory
>        dmap_file,     -- density map file
>        dgs_file,
>        utts_file] -> readFile dmap_file >>= \cs0 ->
>                      readFile dgs_file  >>= \cs1 ->
>                      readFile utts_file >>= \cs2 ->
>                      let
>
>                        density_map = concat (
>                                        map restructure (
>                                            readElements cs0))
>
>                        hmm_dgs = get_log_probs (
>                                    build_hmm_array (
>                                        readHmms cs1))
>
>                        file_names = lines cs2
>
>                      in
>                        build_tmt gms_dir density_map [] >>=
>                        \hmm_tms -> align_each_file hmm_tms hmm_dgs
>                                      0 0.0 file_names
>
>       _ -> error usage
> usage = "usage: BatchAlign <gms dir> <density map file> <dgs file> <utt list file>"
> -- partain: got rid of string gap because of doing -cpp
\end{verbatim}
%----------------------------------------------------------------------
\section {The Density Map}
%----------------------------------------------------------------------
The association list that tells us which HMM states have their
own mixtures and which are tied to other HMM states is stored in a
{\em density map file}. The name of this file is the second argument
on this program's command line. Figure~\ref{fg:density-map-file}
shows the first six lines of an example file. In this example, all
HMMs have three states and all states have their own mixture,
indicated by the data constructor \verb~Mix~. If the density of any
state were {\em tied\/} to that of another state
(Chapter~\ref{ch:HmmDensities}), the constructor \verb~Mix~ would be
replaced by the constructor \verb~TiedM~ along with the
``targeted'' phone and state values. The targeted HMM state must have
its own mixture; that is, a state must either have its own density or
be tied to a state that does. Chains of ties are not allowed.
(This program does not explicitly check for this; it is the
responsibility of the user to make sure the density map file is
correctly structured.)
\begin{figure}
\begin{verbatim}
(AA, [1 =: Mix, 2 =: Mix, 3 =: Mix])
(AE, [1 =: Mix, 2 =: Mix, 3 =: Mix])
(AH, [1 =: Mix, 2 =: Mix, 3 =: Mix])
(AO, [1 =: Mix, 2 =: Mix, 3 =: Mix])
(AW, [1 =: Mix, 2 =: Mix, 3 =: Mix])
(AX, [1 =: Mix, 2 =: Mix, 3 =: Mix])
\end{verbatim}
\caption[]{The first six lines of an example density map file.}
\label{fg:density-map-file}
\end{figure}
\begin{haskell}{DensityMap}
> data DensityMap = Mix | TiedM Phone HmmState deriving (Read, Show)
\end{haskell}
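The function \verb~restructure~ pairs the phone of an entry
with each of that entry's state/density associations, so that every
HMM state gets its own list element. Hence, the lines shown in
Figure~\ref{fg:density-map-file} would be
restructured as shown in Figure~\ref{fg:restructure}.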
\begin{figure}
\begin{verbatim}
[(AA, 1 =: Mix), (AA, 2 =: Mix), (AA, 3 =: Mix)]
[(AE, 1 =: Mix), (AE, 2 =: Mix), (AE, 3 =: Mix)]
[(AH, 1 =: Mix), (AH, 2 =: Mix), (AH, 3 =: Mix)]
[(AO, 1 =: Mix), (AO, 2 =: Mix), (AO, 3 =: Mix)]
[(AW, 1 =: Mix), (AW, 2 =: Mix), (AW, 3 =: Mix)]
[(AX, 1 =: Mix), (AX, 2 =: Mix), (AX, 3 =: Mix)]
\end{verbatim}
\caption[]{The results of applying the function {\tt
restructure} to each of the first six lines of the example density map
file.}
\label{fg:restructure}
\end{figure}
\begin{haskell}{restructure}
> restructure :: (Phone, [Assoc HmmState DensityMap]) ->
>                [(Phone, Assoc HmmState DensityMap)]
> restructure (p,is) = [(p,k) | k<-is ]
\end{haskell}
These individual lists can be concatenated to form the density
map structure that is used to read a set of Gaussian mixture files.
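The restructuring and concatenation steps can be tried on a small
in-memory example. The sketch below is self-contained, so it replaces
the imported types with stand-ins: \verb~Phone~ is cut down to two
constructors and \verb~HmmState~ is assumed to be \verb~Int~. The
values \verb~entries~ and \verb~densityMap~ are invented for
illustration, including the tied state.

```haskell
-- Stand-ins for types defined in other modules (assumptions).
data Phone = AA | AE deriving (Eq, Show)
type HmmState = Int

data DensityMap = Mix | TiedM Phone HmmState deriving (Eq, Show)

type Assoc a b = (a, b)

(=:) :: a -> b -> Assoc a b
a =: b = (a, b)

restructure :: (Phone, [Assoc HmmState DensityMap])
            -> [(Phone, Assoc HmmState DensityMap)]
restructure (p, is) = [(p, k) | k <- is]

-- Two hypothetical density map entries; state 2 of AE is tied to AA.
entries :: [(Phone, [Assoc HmmState DensityMap])]
entries =
  [ (AA, [1 =: Mix, 2 =: Mix])
  , (AE, [1 =: Mix, 2 =: TiedM AA 2])
  ]

-- One flat list with an element per (phone, state) combination.
densityMap :: [(Phone, Assoc HmmState DensityMap)]
densityMap = concatMap restructure entries
```

Here \verb~concatMap restructure~ does in one step what the main
expression writes as \verb~concat~ applied to \verb~map restructure~.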
%----------------------------------------------------------------------
\section {Building the Tied Mixture Table}
\label{sc:tmt}
%----------------------------------------------------------------------
The mixtures are stored in separate files, one mixture per
file, all in the same directory. The name of that directory is the
first argument on this program's command line. Each file has a name
with the syntax
\begin{quote}\it
$<$phone$>$.$<$state$>$.gm
\end{quote}
where {\it $<$phone$>$} is one of the legal values of
\verb~Phone~ and {\it $<$state$>$} is the state index. Keeping the
densities in separate files is natural since the densities are
estimated individually by collecting all the feature vectors that
aligned with a given HMM state (or those that are tied to it) into a
single file\footnote{This redistributing of feature vectors is done
using a C program not described in this report.} and then estimating
the density parameters using a C program (not described in this
report).
The function \verb~get_gm_fname~ is used to build the filename
of a file containing Gaussian mixture parameters.
\begin{haskell}{get_gm_fname}
> get_gm_fname :: String -> Phone -> Int -> String
> get_gm_fname dir p k = dir ++ "/" ++ shows p ('.' : shows k ".gm")
\end{haskell}
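As a quick check of the naming scheme, the sketch below repeats the
definition under the hypothetical name \verb~getGmFname~ with a
two-constructor stand-in for \verb~Phone~. It assumes that the
\verb~Show~ instance of \verb~Phone~ prints the bare constructor name,
as a derived instance would; the directory names used with it are
invented.

```haskell
-- Stand-in for the Phone type from the Phones module (assumption).
data Phone = AA | AE deriving Show

-- Directory, phone and state index joined as <dir>/<phone>.<state>.gm
getGmFname :: String -> Phone -> Int -> String
getGmFname dir p k = dir ++ "/" ++ shows p ('.' : shows k ".gm")
```

For instance, \verb~getGmFname "gms" AA 1~ evaluates to
\verb~"gms/AA.1.gm"~.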
The function \verb~build_tmt~ builds the tied mixture table.
If the density for HMM $p$, state $k$, is tied to another mixture,
then we just pass that information through. If, however, the density
for HMM $p$, state $k$, is its own mixture, then we open the