C++ Informative Vector Machine - Examples

C++ IVM Software

This page describes how to compile the C++ Informative Vector Machine (IVM) software, which is available for download here, and gives some examples of its use.

Design Philosophy

The software is written in C++ to provide flexibility in the models that can be used without a serious performance penalty.

The software is mainly written in C++, but it relies on FORTRAN code by other authors for some functions, as well as on the LAPACK and BLAS libraries.

Compiling the Software

The software was written with gcc version 3.2.2. There are definitely Standard Template Library issues on Solaris with gcc 2.95, so I suggest using version 3.2 or above.

Part of the reason for using gcc is the ease of interoperability with FORTRAN. The code base makes fairly extensive use of FORTRAN, so you need to have g77 installed.

The software is compiled by writing

$ make ivm

at the command line. Architecture-specific options are included in the make.ARCHITECTURE files. Rename the file for the relevant architecture to make.inc for it to be included.
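
To give a concrete flavour, a make.inc for a Linux machine using g77 and the reference LAPACK and BLAS might contain something along the lines of the sketch below. The variable names here are illustrative only; check the supplied make.ARCHITECTURE files for the ones the makefile actually expects.

# Hypothetical make.inc sketch; the variable names are illustrative,
# consult the supplied make.ARCHITECTURE files for the real ones.
CXX = g++
F77 = g77
CXXFLAGS = -O3 -Wall
# Link against the reference LAPACK and BLAS plus the g77 runtime.
LDFLAGS = -llapack -lblas -lg2c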

Optimisation

One of the advantages of interfacing to the LAPACK and BLAS libraries is that they are often optimised for particular architectures. The file make.atlas includes options for compiling against the ATLAS-optimised versions of LAPACK and BLAS that are available on a server I have access to. These options may vary for particular machines.

Cygwin

For Windows users, the code compiles under cygwin. However, you will need versions of the LAPACK and BLAS libraries available (see www.netlib.org). These can take some time to compile, and in the absence of any pre-compiled versions on the web I've provided some you may want to make use of (see the cygwin directory). Note that these pre-compiled versions are not optimised for any specific architecture and therefore do not give the speed-up you would hope for from LAPACK and BLAS.

Microsoft Visual C++

As of Release 0.101 the code compiles under Microsoft Visual Studio 7.1. A project file is provided in the current release in the directory MSVC/ivm. The compilation makes use of f2c versions of the FORTRAN code and the C version of LAPACK/BLAS, CLAPACK. Detailed instructions on how to compile are in the readme.msvc file. Much of the work to convert the code (which included ironing out several bugs) was done by William V. Baxter for the GPLVM code.

General Information

The software operates through the command line. There is one executable, ivm. Help can be obtained by writing

$ ivm -h

which lists the commands available in the software. Help for each command can then be obtained by writing, for example,

$ ivm learn -h

All the tutorial optimisations should take less than half an hour to run on my sub-2GHz Pentium IV machine; the first example runs in a couple of minutes. Below I suggest using the highest verbosity option, -v 3, in each of the examples so that you can track the iterations.

Bugs

Victor Cheng writes:

" ... I've tested your IVM C++ Gaussian Process tool (IVMCPP0p12 version). It is quite useful. However, the gnuplot function seems has a problem. Every time I type the command: "Ivm gnuplot traindata name.model", an error comes out as: "Unknown noise model!". When I test this function with IVMCPP0p11 IVM, its fine, but IVMCPP0p11 has another problem that it gives "out of memory" error in test mode! So I use two vesions simultaneously. "

I'm working (as of 31/12/2007) on a major rewrite, so it's unlikely that these bugs will be fixed in the near future. However, if anyone makes a fix I'll be happy to incorporate it! Please let me know.

Examples

The software loads in data in the SVM light format. Anton Schwaighofer has written a package which can write from MATLAB to the SVM light format.
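
For reference, the SVM light format stores one example per line: the target value followed by index:value pairs for the non-zero input dimensions. A two-class, two-dimensional data file might therefore look like the following (the numbers are invented for illustration):

+1 1:0.815 2:0.473
-1 1:0.092 2:0.660
+1 1:0.644 2:0.381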

Toy Data Sets

In this section we present some simple examples. The results will be visualised using gnuplot. It is suggested that you use gnuplot version 4.0 or above.

Provided with the software, in the examples directory, are some simple two-dimensional problems. We will first try classification with these examples.

The first example is data sampled from a Gaussian process with an RBF kernel function with an inverse width of 10. The input data is sampled uniformly from the unit square. This data can be learnt with the following command.

$ ivm -v 3 learn -a 200 -k rbf examples/unitsquaregp.svml unitsquaregp.model

The flag -v 3 sets the verbosity level to 3 (the highest level), which causes the iterations of the scaled conjugate gradient algorithm to be shown. The flag -a 200 sets the active set size. The kernel type is selected with the flag -k rbf.

Gnuplot

The learned model is saved in a file called unitsquaregp.model. This file has a plain text format to make it human readable. Once training is complete, the learned kernel parameters of the model can be displayed using

$ ivm display unitsquaregp.model

Loading model file.
... done.
IVM Model:
Active Set Size: 200
Kernel Type:
compound kernel:
rbfinverseWidth: 12.1211
rbfvariance: 0.136772
biasvariance: 0.000229177
whitevariance: 0.0784375
Noise Type:
Probit noise:
Bias on process 0: 0.237516

Notice that the kernel is composed of an RBF kernel, also known as the squared exponential or Gaussian kernel; a bias kernel, which is just a constant; and a white noise kernel, which is a diagonal term. The bias and white kernels are automatically added to the RBF kernel. Other kernels may also be used; see ivm learn -h for details.
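
As a sketch of how these parameters fit together (assuming the usual parameterisation in which the inverse width multiplies half the squared distance; the exact scaling convention in the code may differ), the compound kernel has the form

k(x, x') = \sigma^2_{\mathrm{rbf}} \exp\left(-\frac{\gamma}{2} \|x - x'\|^2\right) + \sigma^2_{\mathrm{bias}} + \sigma^2_{\mathrm{white}} \delta_{x,x'},

where \gamma is rbfinverseWidth, \sigma^2_{\mathrm{rbf}} is rbfvariance, \sigma^2_{\mathrm{bias}} is biasvariance, \sigma^2_{\mathrm{white}} is whitevariance, and \delta_{x,x'} is one when x = x' and zero otherwise.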

For this model the input data is two dimensional, so you can visualise the decision boundary using

$ ivm gnuplot examples/unitsquaregp.svml unitsquaregp.model unitsquaregp

The unitsquaregp supplied as the last argument acts as a stub from which gnuplot file names are created, so, for example (using gnuplot version 4.0 or above), you can write

$ gnuplot unitsquaregp_plot.gp

and obtain the plot shown below.


The decision boundary learnt for data sampled from a Gaussian process classification model. Note that the active points (blue stars) typically lie along the decision boundary.


Feature Selection

Next we consider a simple ARD kernel. The toy data in this case is sampled from three Gaussian distributions. To separate the data, only one input dimension is necessary. The command is run as follows,

$ ivm learn -a 100 -k rbf -i 1 examples/ard_gaussian_clusters.svml ard_gaussian_clusters.model

Displaying the model, it is clear that it has effectively selected one of the input dimensions,

Loading model file.
... done.
IVM Model:
Active Set Size: 100
Kernel Type:
compound kernel:
rbfardinverseWidth: 0.12293
rbfardvariance: 2.25369
rbfardinputScale: 5.88538e-08
rbfardinputScale: 0.935148
biasvariance: 9.10663e-07
whitevariance: 2.75252e-08
Noise Type:
Probit noise:
Bias on process 0: 0.745098
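
To see why this amounts to feature selection, note that in an ARD RBF kernel each input dimension receives its own scale inside the exponential. As a sketch (again assuming the usual parameterisation, which may differ in detail from the code),

k(x, x') = \sigma^2_{\mathrm{rbf}} \exp\left(-\frac{\gamma}{2} \sum_i a_i (x_i - x_i')^2\right) + \sigma^2_{\mathrm{bias}} + \sigma^2_{\mathrm{white}} \delta_{x,x'},

where the a_i are the rbfardinputScale parameters. Here one of the two scales has been driven to roughly 5.9e-08 while the other is about 0.94, so the corresponding input dimension contributes essentially nothing to the kernel and has effectively been pruned from the model.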

Once again the results can be displayed as a two-dimensional plot,

$ ivm gnuplot examples/ard_gaussian_clusters.svml ard_gaussian_clusters.model ard_gaussian_clusters


The IVM learnt with an ARD RBF kernel. One of the input directions has been recognised as not relevant.

Semi-Supervised Learning

The software also provides an implementation of the null category noise model described in Lawrence and Jordan.
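
Roughly speaking (this is a sketch of the model in the paper, with \Phi the cumulative Gaussian and 2a the width of the null region), the null category noise model extends the probit likelihood by inserting a third, null category between the two classes:

p(y = 1 | f) = \Phi(f - a), \quad p(y = -1 | f) = \Phi(-f - a), \quad p(y = 0 | f) = \Phi(f + a) - \Phi(f - a).

The null category is constrained never to be observed, so placing unlabelled points near f = 0 is penalised and the decision boundary is pushed away from dense regions of unlabelled data.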

The toy example given in the paper is reconstructed here. To run it, type

$ ivm learn -a 100 -k rbf examples/semisupercrescent.svml semisupercrescent.model

The result of learning is

Loading model file.
... done.
IVM Model:
Active Set Size: 100
Kernel Type:
compound kernel:
rbfinverseWidth: 0.0716589
rbfvariance: 2.58166
biasvariance: 2.03635e-05
whitevariance: 3.9588e-06
Noise Type:
Ncnm noise:
Bias on process 0: 0.237009
Missing label probability for -ve class: 0.9075
Missing label probability for +ve class: 0.9075

and can be visualised using

$ ivm gnuplot examples/semisupercrescent.svml semisupercrescent.model semisupercrescent

followed by

$ gnuplot semisupercrescent_plot.gp

The result of the visualisation is shown below.


The result of semi-supervised learning on the crescent data. At the top is the result from the null category noise model. The bottom shows the result from training on the labelled data only with the standard probit noise model. Purple squares are unlabelled data, blue stars are the active set.
