PRTools
A Matlab toolbox for pattern recognition Imported pages from 37Steps


Mappings

On this page the mapping is introduced. It is one of the key elements of PRTools.

Mapping background

Mappings pave the way from object observations to the recognition of the pattern class they belong to. They define the normalization of the raw measurements, the extraction of the initial features, the reduction to a small, relevant feature set, the estimation of densities of classes or models, the transformation to the output space of a classifier, which can be posteriors, confidences or distances, and finally the selection of the most likely class. In every step a mapping procedure defines how input data is transformed into output data. In programming terms:

output_data = mapping_procedure(input_data,parameters)

In PRTools input_data and output_data are usually a dataset or a datafile, but occasionally they may be a cell array or an array of doubles (in which rows represent the objects), or even a set of scalars or a string. There are many mapping procedures available, e.g. procedures for handling mappings, image handling, image operations, image features, feature selection, fixed mappings, trainable mappings and density estimation.

Mappings can be of the following types: fixed, fixed_cell, untrained, trained or combiner.

Classifiers are a special kind of trainable mapping as they map feature data (input_data) to class labels (output_data) or to class confidences. There is a large set of classifiers available: linear and quadratic classifiers, SVMs, neural networks, various others (e.g. decision trees and nearest neighbor rules) and combining classifiers.
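As a sketch of this chain, the following lines train a classifier and apply it to new data. It assumes PRTools is on the Matlab path. The routines gendatb (two-class banana-shaped data), ldc (a linear classifier), labeld and testc are PRTools routines not introduced on this page, and the sample sizes are arbitrary choices for illustration.

```matlab
% Sketch only: assumes PRTools is installed and on the Matlab path
A = gendatb([50 50]);   % training set: 2 classes, 50 objects each, 2 features
T = gendatb([50 50]);   % independent test set from the same distribution
W = ldc(A);             % classifier: a trainable mapping, trained on A
D = T*W;                % map the test objects to class confidences
lab = D*labeld;         % select the most likely class for every object
e = D*testc;            % estimated classification error on T
```

Here D is a dataset with one column per class, lab holds one label per test object and e is a scalar error estimate.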

Mapping definition

Mappings are typically defined inside functions that compute or apply a mapping, e.g. inside classifiers or routines for feature reduction. This section is therefore given just as background information for the interested reader. Users who use PRTools from the command line or write simple experimental scripts on the basis of the available routines do not need to study it.

The mapping constructor looks like

W = prmapping(command,type,data,labels,size_in,size_out)

The defined mapping W will usually be used for mapping a dataset A into another space, resulting in a dataset B:

B = A*W

Consequently, if A has a size of [m,size_in] and W has a size of [size_in,size_out], then the dataset B will have a size of [m,size_out]. B = A*W is thereby consistent with a Matlab expression in which B, A and W are matrices of the corresponding sizes.
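The size rule can be checked on a small example. This is a sketch that assumes PRTools is on the path and uses its iris dataset (150 objects, 4 features) together with the PCA mapping pcam; it relies on size being overloaded for datasets and mappings.

```matlab
% Sketch only: assumes PRTools is installed and on the Matlab path
a = iris;         % dataset of 150 objects with 4 features
w = pcam(a,2);    % trained PCA mapping from 4 features to 2
b = a*w;          % mapped dataset
size(a)           % [150 4], i.e. [m,size_in]
size(w)           % [4 2],   i.e. [size_in,size_out]
size(b)           % [150 2], i.e. [m,size_out]
```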

PRTools uses the values of size_in and size_out mainly for error detection and for returning appropriate error messages. In many cases the values may be replaced by 0, e.g. when the sizes are unknown at the moment of defining the mapping. This will not affect correct execution, but may result in error messages that are hard to understand when a dataset A of the wrong size is used as input.

If W defines an untrained mapping, a dataset A can be used for training W, resulting in a trained mapping V:

V = A*W

In this case the size of V does not follow the matrix product rule: it is [size_in,k], in which k is determined by the training procedure. In the case of a trainable classifier the size of V is [size_in,c], with c the number of classes.
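A minimal sketch of training by multiplication, assuming PRTools is on the path. The classifier ldc and the test functions isuntrained and istrained are PRTools routines not introduced above. The iris dataset has 4 features and 3 classes.

```matlab
% Sketch only: assumes PRTools is installed and on the Matlab path
a = iris;          % 150 objects, 4 features, 3 classes
w = ldc;           % no dataset supplied: an untrained mapping
v = a*w;           % training, equivalent to v = ldc(a)
isuntrained(w)     % true: w still has to be trained
istrained(v)       % true: v can be applied to data
size(v)            % [4 3], i.e. [size_in,c]
```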

Mapping overload

Mappings define transformations of vector spaces of possibly different dimensionality. Their size [size_in,size_out] corresponds to these dimensionalities. Consequently they behave somewhat like matrices of this size. Linear (affine) mappings are almost identical to such matrices but include a shift operation. In addition they may carry several other types of information as explained in the mapping definition.
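What an affine mapping computes can be written out in plain Matlab, without PRTools. This is only an illustration of the arithmetic: the names R and offset are hypothetical and do not necessarily match how PRTools stores an affine mapping internally.

```matlab
% Plain Matlab sketch of an affine map: a matrix product plus a shift
m = 5;
A = rand(m,4);                  % 5 objects in a 4-dimensional space
R = rand(4,2);                  % linear part, size [size_in,size_out]
offset = [10 -10];              % shift, one value per output feature
B = A*R + repmat(offset,m,1);   % result has size [m,size_out]
```

Without the shift, B = A*R is an ordinary matrix product, which is why mappings of size [size_in,size_out] can be treated much like matrices.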

Structure

A full list of all information stored in a mapping can be found by converting a mapping into a structure.

    a = iris;
    w = pcam(a,2);
    struct(w)
    %    mapping_file: 'affine'
    %    mapping_type: 'trained'
    %            data: [1x1 struct]
    %          labels: [2x1 double]
    %         size_in: 4
    %        size_out: 2
    %           scale: 1
    %        out_conv: 0
    %            cost: []
    %            name: 'PCA to 2D'
    %            user: []
    %         version: {[1x1 struct]  '07-Apr-2011 15:21:01'}

Every field has a corresponding set-command (e.g. setdata) to store it and a get-command (e.g. getdata) to retrieve it. In some cases the get-command does not return the exact field but some data derived from it. The table gives more information.

The fields of the mapping structure

mapping_file   The name of the command (m-file) that executes the mapping.
mapping_type   The type of the mapping: fixed, untrained, trained or combiner.
data           A structure or a cell array with all information needed to execute the mapping.
labels         Array with labels used to annotate the features (featlab) of the output dataset.
size_in        Input dimensionality. Number of features of the input dataset.
size_out       Output dimensionality. Number of features of the output dataset.
scale          A scalar or a vector to scale the output features.
out_conv       Type of desired output conversion.
cost           Classification costs in case the mapping defines a classifier.
name           String with a name, just used for annotation of plots and other outputs.
user           User defined field.
version        PRTools version number and date of creating the mapping.
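The get- and set-commands can be tried on the PCA mapping used above. This sketch assumes PRTools is on the path and that the commands follow the get/set naming pattern of the field names (getname, setname, getmapping_type, getsize_in, getsize_out). The new name 'My PCA' is an arbitrary example.

```matlab
% Sketch only: assumes PRTools is installed and on the Matlab path
a = iris;
w = pcam(a,2);
t = getmapping_type(w);    % type of the mapping, here a trained one
k = getsize_in(w);         % input dimensionality
n = getsize_out(w);        % output dimensionality
old_name = getname(w);     % name used for annotation
w = setname(w,'My PCA');   % store a new name in the mapping
new_name = getname(w);     % retrieve the name that was just stored
```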

Examples

A very simple example of a mapping is the routine featsel, which selects a pre-determined set of features. In the following example 10 objects in a 5-dimensional space are generated. After that, features 1, 2 and 5 are selected. The means before and after selection are computed and shown to make clear what is going on.

    % generate 10 objects in 5D, mean is [1 2 3 4 5], small variances
    A = gauss(10,[1:5],0.01*eye(5));
    % show the rounded values of the mean of A
    disp(round(mean(A)));
    % select features 1, 2 and 5
    B = featsel(A,[1 2 5]);
    % show the rounded values of the mean of B
    disp(round(mean(B)));

This shows

     1     2     3     4     5
     1     2     5

as B contains just the features 1 2 5 of A.

An important property of the way mappings are implemented in PRTools is that the following statements are equivalent:

    B = featsel(A,[1 2 5]);
    B = A*featsel([],[1 2 5]);
    W = featsel([],[1 2 5]); B = A*W

This is realized by overloading the matrix multiplication operator * for mappings. It has to be read as a piping symbol: the dataset A is fed into the mapping procedure and replaces the placeholder [] as the first parameter. This only works for the first parameter of a mapping routine, and only when [] is given there instead of a dataset variable.

The advantage of the construct in the last line of the above example is that a mapping, which is in fact a procedure together with some chosen parameters, can be stored in a variable, here called W. This variable can be used as an input for other PRTools routines that operate on arbitrary mappings.
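As a sketch of how a stored mapping can be reused, the following lines combine a feature selection with a classifier. It assumes PRTools is on the path. The classifier ldc and the evaluation routine testc are PRTools routines not introduced above, and the selected features [1 2 4] are an arbitrary choice for the 4-dimensional iris data.

```matlab
% Sketch only: assumes PRTools is installed and on the Matlab path
a = iris;                  % 150 objects, 4 features, 3 classes
W = featsel([],[1 2 4]);   % stored mapping: select features 1, 2 and 4
U = W*ldc;                 % sequence: feature selection, then untrained classifier
V = a*U;                   % train the whole sequence on a
e = a*V*testc;             % error of the trained sequence on the training set
```

The same variable W can be combined with a different classifier without repeating the feature selection parameters.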
