Finally, a new project update

It is about time to say something again about how the project is going. First of all, as a matter of excuse (probably a bad one though) I want to say why the blog has not been updated lately. There have been a couple of weeks in which I have been debugging the code for the primal formulation of the SO-SVM based on MOSEK using a simple multiclass classification example. Those two weeks were basically that, debugging testing, debugging again, testing, … it is not interesting stuff to tell the truth but very necessary in any case! However, last week I started to work in the – probably – most interesting part of the project, the Hidden Markov – Support Vector Machine [1].

I like to think of HM-SVMs as a particular instance of the general Structured Output (SO) problem I have been talking about in the previous posts. In SO learning we are trying to find a function f that takes an object from the space \mathcal{X}  and outputs and object from the space \mathcal{Y}. In HM-SVMs the objects of \mathcal{Y} are sequences of discrete data (e.g. integers). In order to deal with the structure of the sequences properly, we extract features of the data in the space \mathcal{Y} together with the data in the space \mathcal{X}. This is done via a function we denote as \Psi(x, y) whose definition appears in the tutorial about HM-SVM that Nico, my GSoC mentor, wrote [2, pages 4 and 5].

Another important piece of the SO-SVM is the argmax function. In HM-SVMs this function is computed using the well-known Viterbi algorithm. I had never studied before this algorithm but I was happy to discover its similar structure to the forward-backward algorithm used to train an HMM from data that I have already studied and implemented [3]. I found this tutorial [4] on the Viterbi algorithm very useful and easy to understand. The implementation I have done in SHOGUN of this algorithm is based on the one in the HMSVM toolbox [5] (in particular, it is in this file [6]).

One more piece of code I have implemented in the toolbox for the HM-SVM is the loss function, commonly denoted as \Delta. In this case, we use the Hamming distance defined for strings. In our case, the inputs of the loss function are sequences of the same length so the Hamming loss is simple the count of the number of elements that are different in the sequences (think of it as the minimum number of modifications one may do to one of the sequences to make it equal to the other. Providing that the sequences are of the same length, the only modification allowed is to change the value of an element in one of the sequences).

These three functions, together with some data structures to hold features and labels used by the HM-SVM, constitute what I have implemented so far into SHOGUN about this interesting discriminative model. The plan now is to correct with Nico the possible mistakes I have introduced into the code and to generate some training and test data so I can start debugging phase! My idea is to port this function [7] from HM-SVM toolbox that generates random two-state data.

Comments and suggestions are very welcome!

You can find the code I have been talking about in github:
https://github.com/iglesias/shogun/tree/so/src/shogun/
(mainly in the files structure/HMSVMModel.[cpp|h], features/MatrixFeatures.[cpp|h], structure/HMSVMLabels.[cpp|h]).

References

[1] Altun, Y., Tsochantaridis, I., Hofmann, T. Hidden Markov Support Vector Machines.
http://cs.brown.edu/~th/papers/AltTsoHof-ICML2003.pdf
[2] SHOGUN Hidden Markov SO-SVM written by Nico Görnitz.
https://dl.dropbox.com/u/11020840/shogun/hmsvm.pdf
[3] The code of my HMM implementation for one of the homeworks of the Artificial Intelligence course that I took at KTH. https://github.com/iglesias/hmm-ai-ducks
[4] Viterbi algorithm tutorial.
http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/
html_dev/viterbi_algorithm/s1_pg2.html
[5] Entry in mloss.org for the HM-SVM toolbox written by Gunnar Raetsch and Georg Zeller.
http://mloss.org/software/tags/hmsvm/
[6] Viterbi algorithm implementation in the HM-SVM toolbox.
https://dl.dropbox.com/u/11020840/shogun/best_path.cpp
[7] Data generation for a two-state model in the HM-SVM toolbox.
https://dl.dropbox.com/u/11020840/shogun/simulate_data.m 

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s