First Weekly Report GSoC 2012

Let’s kick off this summer’s weekly reports!

Although GSoC officially started a couple of days ago, on May 21st, I have already been working on the project for about two weeks. Here is a summary of the progress made during this time.

First of all, based on the code skeletons [1] that Nico wrote, I started with the design of the structured output (SO) framework in Shogun. The design decisions taken during this phase are summarized in the attached class diagram [2]. In the diagram, the classes in light green existed in Shogun prior to this project, whereas the classes filled with light red are brand new. Among the new classes, CStructuredModel aims to gather all the application-dependent parts of an SO problem instance in one place. The CLossFunction class turned out to be very handy, since I only needed to extend it with a few methods to support the functionality required by SO. The idea of this class is to provide a generic interface for well-defined loss functions (e.g. hinge loss). Needless to say, the design shown in the diagram is very likely to evolve. For example, CStructuredModel currently uses function pointers for some of its members, and this will change to a more understandable class-based interface.
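To make the planned move from function pointers to a class-based interface more concrete, here is a minimal sketch in Python rather than in Shogun's C++. All names in it (StructuredModel, MulticlassModel, joint_feature, delta_loss, labels, argmax) are hypothetical and chosen only for illustration; they are not the actual Shogun API, and the real design may well differ.

```python
from abc import ABC, abstractmethod


class StructuredModel(ABC):
    """Hypothetical interface gathering the application-dependent parts."""

    @abstractmethod
    def joint_feature(self, x, y):
        """Return the joint feature vector psi(x, y) as a list of floats."""

    @abstractmethod
    def delta_loss(self, y_true, y_pred):
        """Return the structured loss between the true and predicted outputs."""

    @abstractmethod
    def labels(self):
        """Return an iterable over all candidate outputs."""

    def score(self, w, x, y):
        # Linear score w . psi(x, y).
        return sum(wi * fi for wi, fi in zip(w, self.joint_feature(x, y)))

    def argmax(self, w, x):
        # Exhaustive search; only feasible for small output spaces.
        return max(self.labels(), key=lambda y: self.score(w, x, y))


class MulticlassModel(StructuredModel):
    """Toy instance: multiclass classification seen as an SO problem."""

    def __init__(self, num_classes, dim):
        self.num_classes = num_classes
        self.dim = dim

    def joint_feature(self, x, y):
        # Copy x into the block of psi that belongs to class y.
        psi = [0.0] * (self.num_classes * self.dim)
        psi[y * self.dim:(y + 1) * self.dim] = x
        return psi

    def delta_loss(self, y_true, y_pred):
        # 0-1 loss.
        return 0.0 if y_true == y_pred else 1.0

    def labels(self):
        return range(self.num_classes)


model = MulticlassModel(num_classes=3, dim=2)
w = [1.0, 0.0, 0.0, 1.0, -1.0, -1.0]  # one weight block per class
prediction = model.argmax(w, [0.2, 0.9])
```

With a subclass per application, the solver only needs to talk to the base-class methods, which is the readability gain expected over passing several function pointers around.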

Initial SO class diagram.

In addition, classes that provide labels with structure (e.g. sequences, graphs) have been added: labels/CStructuredLabels and lib/CStructuredData. This is probably the feature that most distinguishes SO learning from the other strategies already present in Shogun.

Finally, I have started implementing the optimization algorithm presented in [3]. This is still work in progress, and the code is in CPrimalMosekSOSVM. The main difficulty I have found here is that, in order to solve the quadratic program (QP) that arises, we need to use a tool that is not open source, since libqp does not support all the required constraints (in particular, inequality constraints of the type A \cdot x \leq b for the QP with box constraints). I have started to write some code in CPrimalMosekSOSVM that makes use of MOSEK to solve the QP. This code is still rather rough and exists only in my local repository.
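For reference, the QP in question can be written out as follows. This is the margin-rescaling formulation as I understand it from [3]; the symbols n (number of training examples), C (regularization constant), \Psi (joint feature map), \xi_i (slack variables) and \mathcal{Y} (output space) are notation assumed here and do not appear elsewhere in this report.

```latex
\begin{aligned}
\min_{w,\,\xi}\quad & \frac{1}{2}\,\|w\|^2 + \frac{C}{n}\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad & w^\top\bigl[\Psi(x_i, y_i) - \Psi(x_i, y)\bigr] \;\geq\; \Delta(y, y_i) - \xi_i
  \qquad \forall i,\ \forall y \in \mathcal{Y}\setminus\{y_i\}, \\
& \xi_i \geq 0 \qquad \forall i .
\end{aligned}
```

Stacking the unknowns as z = (w, \xi), each margin constraint becomes one row of A \cdot z \leq b, namely -[\Psi(x_i, y_i) - \Psi(x_i, y)]^\top w - \xi_i \leq -\Delta(y, y_i), while \xi_i \geq 0 is a simple lower bound. The rows of A are what call for a solver with general linear inequality constraints.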

The current working plan is, in this order: finish the code in CPrimalMosekSOSVM mentioned above (I have set myself a deadline of Friday, June 1st, for this), prepare the first use case with multiclass SVMs, and extend the design by creating one class for the \arg \max computation and another for the structured loss function \Delta(y_{pred}, y_{true}).
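As a rough idea of what these two planned classes could look like for the multiclass use case, here is a small Python sketch. The class and method names (ZeroOneLoss, MulticlassArgmax, predict, loss_augmented) are hypothetical and only illustrate the split between the loss and the \arg \max computation; they are not existing Shogun classes. The sketch assumes that the weight vector stores one block of weights per class.

```python
class ZeroOneLoss:
    """Delta(y_pred, y_true) for multiclass: 0 if equal, 1 otherwise."""

    def __call__(self, y_pred, y_true):
        return 0.0 if y_pred == y_true else 1.0


class MulticlassArgmax:
    """Exhaustive argmax over class labels 0 .. num_classes - 1."""

    def __init__(self, num_classes, loss):
        self.num_classes = num_classes
        self.loss = loss

    def score(self, w, x, y):
        # w holds one weight block of len(x) entries per class.
        dim = len(x)
        block = w[y * dim:(y + 1) * dim]
        return sum(wi * xi for wi, xi in zip(block, x))

    def predict(self, w, x):
        # Plain argmax_y of the score, used at prediction time.
        return max(range(self.num_classes), key=lambda y: self.score(w, x, y))

    def loss_augmented(self, w, x, y_true):
        # argmax_y of Delta(y, y_true) + score, used during training
        # to find the most violated constraint.
        return max(
            range(self.num_classes),
            key=lambda y: self.loss(y, y_true) + self.score(w, x, y),
        )


argmax = MulticlassArgmax(num_classes=2, loss=ZeroOneLoss())
w = [1.0, 0.0, 0.0, 1.0]
x = [0.6, 0.5]
predicted = argmax.predict(w, x)
most_violating = argmax.loss_augmented(w, x, y_true=0)
```

In this toy example the plain prediction is class 0, but the loss-augmented search returns class 1: its score is only slightly lower, so once the loss is added it becomes the output that violates the margin the most.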

References
[1] Gist with main concepts of the framework written by Nico Görnitz. https://gist.github.com/2634487.
[2] Structured Output framework – Class Diagram.
[3] Tsochantaridis, I., Hofmann, T., Joachims, T., Altun, Y. Support vector machine learning for interdependent and structured output spaces.
[4] SO learning branch in git: https://github.com/iglesias/shogun/tree/so.
