Let’s have some fun, fun with optimization!

Basically during the last days my work in the project has been strictly focused on the implementation of the optimization algorithm for SO-SVM (**S**tructured **O**utput – **S**upport **V**ector **M**achine). This algorithm is presented and fully analysed in [1]. In the paper the dual formulation of the optimization problem is explored whereas we are concerned with the primal version of it at this point in the project. The reason why the primal has been selected stems from its properties regarding complexity and efficiency; the dual requires an explicit expansion of the weight vector and the memory requirements are quadratic rather than linear as with the primal formulation. In [2] appears the primal formulation of the algorithm even though the main topic of this article is beyond our current approach for SO learning.

The algorithm belongs to a class called *Cutting Plane Algorithms. *These optimization methods are based on the idea of refining iteration by iteration the feasible set of solutions using the most violated constraints. In [3] there is a video lecture and around the instant 6:20 it is nicely introduced the intuition of how cutting plane methods work, which I really like.

As it was commented in the previous post, a quadratic program (QP) appears in the algorithm. In particular, every iteration of the algorithm we need to solve a QP that looks like

We are using MOSEK to solve this QP [4]. The QP shown above needs to be translated to a more general formulation as the one provided by MOSEK though. In order to do this, I have created an interface to MOSEK from Shogun inspired by the CPLEX one that is in the toolbox.

Right now the algorithm is close to be finished. However, nothing has been properly tested this far and this makes me feel uncomfortable. Once the algorithm is finally ready, I will start to prepare the first and simplest case of use of the framework, multiclass classification. Debugging and testing time will come then!

*References*

[1] Tsochantaridis, I., Hofmann, T., Joachims, T., Altun, Y. Support vector machine learning for interdependent and structured output spaces.

[2] Finley, T., Joachims, T. Training Structural SVMs when Exact Inference is Intractable.

[3] Video lecture by Thomas Finley on Training Structural SVMs when Exact Inference is Intractable. http://videolectures.net/icml08_finley_tssvm/

[4] MOSEK C API Quadratic Optimization. http://docs.mosek.com/6.0/capi/node007.html#329747704

[5] Nowozin, S., Lampert, C. H. Structural Learning and Prediction in Computer Vision. The section 6 is mainly related with this part of the project.