PRTools: A Matlab toolbox for pattern recognition (imported pages from 37Steps)

### PRTools Introductory Examples

Make sure PRTools is in the Matlab path. The following should not produce an error (copy and paste statements like this into your Matlab command line):

`which ldc`

Get rid of old figures

`delfigs`

Take an arbitrary 2D dataset of two classes and plot it

```matlab
A = gendatb;   % the banana set
scatterd(A);   % show a scatterplot
```

The dataset `A` is a PRTools ‘object’. The formal name of this variable type is `prdataset`, which is also the name of its constructor. Typing `A` without a terminating `;` prints some information:

`A`

It can be converted to a structure by

`struct(A)`

Its contents can be listed by

`+A`

This shows all information that can possibly be stored in a dataset. PRTools routines make use of this information in computations and in annotating plots.

The two feature values of the first five objects can be inspected by

`+A(1:5,:)`

Mark them in the scatterplot

`hold on; scatterd(A(1:5,:),'o');`

Compute a simple classifier: Fisher’s Linear Discriminant

`W1 = A*fisherc;`

Note that in PRTools `A*PROC([],PARS)` is an alternative for `PROC(A,PARS)`. The advantage of the first notation is that `PROC([],PARS)` can be stored in a variable and supplied as a parameter to a function. PRTools operations on datasets are called mappings. The formal name of this variable type is `prmapping`, which is also the name of its constructor. `fisherc` in the above example is an untrained mapping. Here it is trained by the dataset `A`, resulting in the trained mapping, the classifier `W1`.
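As a small illustration of this notation (the variable names `U` and `W1b` are arbitrary), an untrained mapping can be stored in a variable and applied later; both lines below should yield the same trained classifier:

```matlab
U = fisherc;        % untrained mapping, no data involved yet
W1 = A*U;           % train it on A, equivalent to W1 = fisherc(A)
W1b = fisherc(A);   % the conventional function-call notation
```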

The error on the training set `A` can be found by:

`A*W1*testc`

Here the original dataset `A`, used for training the classifier, is also used for testing it. Formally this is not such a good idea (why not?). The classifier can be plotted in the scatterplot
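A less biased error estimate uses objects that were not seen in training, e.g. by cross-validation. A minimal sketch, assuming the PRTools cross-validation routine `prcrossval` (called `crossval` in older PRTools versions):

```matlab
err = prcrossval(A,fisherc,10);   % 10-fold cross-validation error estimate
```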

`plotc(W1)`

Compute a third-degree polynomial classifier based on `fisherc` and plot it

```matlab
W2 = A*polyc([],fisherc,3);
plotc(W2,'r')
```

The error on the training set

`A*W2*testc`

Now, let us split the dataset into separate parts for training and testing

```matlab
[AT,AS] = gendat(A,0.5)               % 50-50 split in trainset and testset
W = AT*{fisherc,polyc([],fisherc,3)}  % train classifiers by AT
testc(AS,W)                           % test classifiers by AS
```

### Exercises:

(skip these exercises if you continue with the introductory scatterplot examples)

• Repeat the above three statements for a much larger dataset, e.g. of 1000 objects per class, generated by `A = gendatb([1000 1000])`
• Try other classifiers, e.g. the nearest neighbor classifier `knnc` or a decision tree, `dtc`. Note that most classifiers have useful default parameters, e.g. `W = A*dtc` will do.
• Find in the user guide another 2D dataset, generate data, produce a scatterplot, compute a classifier and plot it.
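The last exercise can be sketched as follows, assuming the Highleyman dataset generator `gendath` (any other 2D PRTools generator can be substituted):

```matlab
delfigs;                  % remove old figures
A = gendath([100 100]);   % Highleyman classes, 100 objects per class
scatterd(A);              % scatterplot of the data
W = A*knnc;               % train a nearest neighbor classifier (default parameters)
plotc(W);                 % plot its decision boundary
A*W*testc                 % apparent error on the training set
```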