HMM

Object
- HMM

```
public class HMM
extends Object
```
An instance of the HMM class represents a hidden Markov model, including the model's states, transitions, and the optimal annotation (i.e., parse) of an observation sequence.

Constructor Summary

Constructors
Constructor and Description
`HMM(String states_fileName, String transitions_fileName, String sequence_fileName)` Creates a hidden Markov model based on the specified states, transitions, and observation sequence.

Method Summary

Methods
Modifier and Type	Method and Description
`void`	`annotationToFile(String filename)` Outputs the optimal annotation and its score to a file.
`String`	`arrayToString(double[][] a)` Returns a String representation of a 2D array.
`String`	`arrayToString(int[][] a)` Returns a String representation of a 2D array.
`void`	`determineOptimalAnnotation()` Determines the optimal annotation of the observation sequence.
`void`	`fillInTableEntry(int row, int col)` Fills in the entry at row `row` and column `col` in both the dynamic programming table and the backtracking table.
`int[][]`	`getBacktrackTable()` Returns the backtracking table used by the Viterbi algorithm.
`double`	`getScoreOfOptimalAnnotation()` Returns the score (the natural logarithm of the probability) of the optimal annotation of the observation sequence.
`double[][]`	`getTable()` Returns the dynamic programming table used by the Viterbi algorithm.
`static void`	`main(String[] args)` The `main` method generates a `HMM` based on the state and transition information in two files specified by command line arguments.
`String`	`optimalAnnotationToString()` Returns a String representation of the optimal annotation.
`String`	`toString()` Returns a String representation of this HMM.
`void`	`viterbi()` Implements the Viterbi algorithm by filling in a dynamic programming table and a backtracking table.

Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - HMM
```
public HMM(String states_fileName,
   String transitions_fileName,
   String sequence_fileName)
```
    Creates a hidden Markov model based on the specified states, transitions, and observation sequence.
    The states of the HMM are read in from the file specified by the first parameter. The transitions between states of the HMM are read in from the file specified by the second parameter. The observation sequence emitting by the HMM is read in from the file specified by the third parameter. It is assumed that all paths (state sequences) through the HMM start in the starting state, which is the first state (i.e., the state at index 0). It is assumed that the first state emits characters only of length 1. It is assumed that for any state in the HMM, all characters emitted by that state are the same length. The constructor initializes the HMM, but it does not compute via the Viterbi algorithm the optimal annotation for the specified observation sequence.
    
    Parameters:
    states_fileName - the name of a tab-delimited file specifying the states of the HMM
    transitions_fileName - the name of a tab-delimited file specifying the transitions between states of the HMM
    sequence_fileName - the name of a FASTA formatted file containing an observation sequence, i.e., a sequence emitted by the HMM for which we want to determine the optimal annotation
- Method Detail
  - fillInTableEntry
```
public void fillInTableEntry(int row,
                    int col)
```
    Fills in the entry at row row and column col in both the dynamic programming table and the backtracking table.
    Each entry in the dynamic programming table corresponds to the natural logarithm of a probability. For the dynamic programming table, the table entry at row row and column col corresponds to the probability (to be precise, the natural logarithm of the probability) of the best annotation of the first (col+1) characters in the observationSequence assuming the best annotation of these characters ends in state row. While the dynamic programming table enables computation of the "probability" of the optimal annotation, the backtracking table enables computation of the optimal annotation itself. The backtracking table entry at row row and column col corresponds to the state prior to state row in the optimal annotation, assuming state row is the end state after observing the first (col+1) characters in the observationSequence. Any dynamic programming table entry corresponding to a state that is unreachable should be populated with a value of negative infinity corresponding to the natural logarithm of a probability of zero.
    
    Parameters:
    row - the row of the table entry to be filled in
    col - the column of the table entry to be filled in
  - viterbi
```
public void viterbi()
```
    Implements the Viterbi algorithm by filling in a dynamic programming table and a backtracking table.
    The method populates every entry in a dynamic programming table as well as every entry in a backtracking table. The tables have one row for each state in the HMM. The tables have one column for each character in the observationSequence. For the dynamic programming table, the table entry at row i and column j corresponds to the probability (to be precise, the natural logarithm of the probability) of the best annotation of the first (j+1) characters in the observationSequence assuming the best annotation of these characters ends in state i. While the dynamic programming table enables computation of the "probability" of the optimal annotation, the backtracking table enables computation of the optimal annotation itself. The backtracking table entry at row i and column j corresponds to the state prior to state i in the optimal annotation, assuming state i is the end state after observing the first (j+1) characters in the observationSequence. The following assumptions are made:
    - It is assumed that all paths (state sequences) through the HMM start in the starting state, which is the first state (i.e., the state at index 0).
    - It is assumed that the first state emits characters only of length 1.
    - It is assumed that for any state in the HMM, all characters emitted by that state are the same length.
    Since it is assumed that all annotations start in the first state, every other state besides the first state in the first column of the dynamic programming table should be populated with a value of negative infinity corresponding to the natural logarithm of a probability of zero.
  - determineOptimalAnnotation
```
public void determineOptimalAnnotation()
```
    Determines the optimal annotation of the observation sequence.
    Once the dynamic programming table and backtracking table have been filled in, this method uses the tables to construct a String representation of the optimal annotation. The maximum value in the final column of the dynamic programming table indicates the final state in the optimal annotation. The backtracking table then can be used to backtrack from this final state to determine the optimal state sequence (i.e., annotation). The optimal annotation should be the same length as the observation sequence and each state in the optimal annotation should be represented by the first character in the state's name.
  - getScoreOfOptimalAnnotation
```
public double getScoreOfOptimalAnnotation()
```
    Returns the score (the natural logarithm of the probability) of the optimal annotation of the observation sequence.
    
    Returns:
    a double representing the score of the optimal annotation
  - optimalAnnotationToString
```
public String optimalAnnotationToString()
```
    Returns a String representation of the optimal annotation. The annotation includes the observation sequence. The returned String is in FASTA format.
    
    Returns:
    a String representing the optimal annotation.
  - toString
```
public String toString()
```
    Returns a String representation of this HMM. The String representation of the HMM consists of a String representation of each state in the HMM.
    
    Overrides:
    
    toString in class Object
    
    Returns:
    a String representing this HMM.
  - getTable
```
public double[][] getTable()
```
    Returns the dynamic programming table used by the Viterbi algorithm.
    
    Returns:
    a 2D array of doubles corresponding to the dynamic programming table.
  - getBacktrackTable
```
public int[][] getBacktrackTable()
```
    Returns the backtracking table used by the Viterbi algorithm.
    
    Returns:
    a 2D array of ints corresponding to the backtracking table.
  - arrayToString
```
public String arrayToString(double[][] a)
```
    Returns a String representation of a 2D array.
    A helpful method for debugging. A String representation of the dynamic programming table can be obtained using this method. However, the method is only practical for small 2D arrays.
    
    Parameters:
    a - a 2D array of doubles
    
    Returns:
    a String representation of the 2D array
  - arrayToString
```
public String arrayToString(int[][] a)
```
    Returns a String representation of a 2D array.
    A helpful method for debugging. A String representation of the backtracking table can be obtained using this method. However, the method is only practical for small 2D arrays.
    
    Parameters:
    a - a 2D array of ints
    
    Returns:
    a String representation of the 2D array
  - annotationToFile
```
public void annotationToFile(String filename)
```
    Outputs the optimal annotation and its score to a file.
    
    Parameters:
    filename - the name of the output file
  - main
```
public static void main(String[] args)
```
    The main method generates a HMM based on the state and transition information in two files specified by command line arguments. An observation sequence is determined from a FASTA formatted file also specified as a command line argument. After the model is built based on the information in the three files, the Viterbi algorithm along with backtracking is used to determine the optimal annotation of the observation sequence. The optimal annotation, as well as its score, is output to the screen and to a file named annotation.txt. The application can be executed as follows:
    java HMM states.txt transitions.txt observation.txt
    
    Parameters:
    args - an array of Strings representing any command line arguments

Class HMM

Constructor Summary

Method Summary

Methods inherited from class Object

Constructor Detail

HMM

Method Detail

fillInTableEntry

viterbi

determineOptimalAnnotation

getScoreOfOptimalAnnotation

optimalAnnotationToString

toString

getTable

getBacktrackTable

arrayToString

arrayToString

annotationToFile

main