PRTools A Matlab toolbox for pattern recognition Imported pages from 37Steps

### DisTools examples: Spectra

Some exercises are defined on the basis of spectral data. It is assumed that readers are familiar with PRTools and will consult the following pages where needed:

Spectra arise in many pattern recognition application areas such as remote sensing, chemometrics, seismics and speech. Spectra behave very similarly to histograms, as they can be considered as counts (of, for instance, photons) on a set of predefined wavelengths. Spectra are even simpler than one-dimensional images, as the latter are usually shift invariant, while spectra are naturally aligned by the frequencies.

As with the number of pixels in an image, increasing the number of wavelengths used to define a spectrum, or the number of bins used for a histogram, cuts both ways: it may yield a better, more complete description of the objects, but many pattern recognition procedures suffer from the higher dimensionality. The dissimilarity representation may simply profit from a higher sampling rate, as it may result in a more accurate dissimilarity while the dimensionality remains constant (equal to the size of the representation set). In comparison with images, distances between spectra are better defined due to the natural alignment of spectra.
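The point about constant dimensionality can be illustrated with a minimal sketch in base MATLAB. It uses synthetic Gaussian-shaped spectra and a plain Euclidean distance, not PRTools or DisTools; the variable names, the choice of the first `nRep` spectra as representation set, and the use of implicit expansion (MATLAB R2016b or later) are all assumptions made for illustration only.

```matlab
% Sketch: dissimilarity dimensionality does not depend on the sampling rate.
% All data are synthetic; nothing here uses PRTools or DisTools.
rng(0);                                  % reproducible random numbers
nObj = 20;                               % number of spectra (objects)
nRep = 5;                                % size of the representation set
for nWave = [50 500]                     % two sampling rates
    x = linspace(0, 1, nWave);           % wavelength axis
    centers = rand(nObj, 1);             % one peak position per spectrum
    A = exp(-(x - centers).^2 / 0.01);   % nObj-by-nWave synthetic spectra
    R = A(1:nRep, :);                    % representation set (assumed choice)
    D = zeros(nObj, nRep);
    for k = 1:nRep
        D(:, k) = sqrt(sum((A - R(k, :)).^2, 2));  % Euclidean distance to R(k,:)
    end
    fprintf('%d wavelengths: %d features, %d dissimilarities\n', ...
        nWave, size(A, 2), size(D, 2));
end
```

The feature representation grows from 50 to 500 columns, while the dissimilarity matrix keeps `nRep` columns in both cases.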

Depending on the application, various distance measures are studied. The routine specproxm offers measures based on the L1 and L2 norms, applied to the original spectra, cumulative spectra and derivatives. The routine specdata gives access to a set of data examples, among others from chemometrics and remote sensing.

dismeas = {'L1','L2','C1','C2','D1','D2'}; % define distance measures
S = specdata('tecator'); % load a spectral dataset
E = zeros(1,6);          % space for LOO 1NN errors
for j=1:6
    D = S*specproxm(S,dismeas{j},1); % compute dissimilarity matrices
    E(j) = nne(D);                   % determine 1NN error
end
disp(E)                  % display
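To give a feeling for what such measures compute, here is a minimal sketch in base MATLAB on two tiny synthetic spectra. It is not the specproxm implementation: it assumes that 'C' stands for cumulative spectra (`cumsum`), 'D' for first differences (`diff`), and that the digits 1 and 2 select the L1 and L2 norm. Any normalisation or smoothing the real routine may apply is ignored.

```matlab
% Sketch of six spectral distances in base MATLAB (no PRTools/DisTools).
% ASSUMPTION: 'C' = cumulative spectra, 'D' = first differences, 1/2 = L1/L2.
a = [0 1 4 1 0 0];                   % synthetic spectrum with a peak at bin 3
b = [0 0 1 4 1 0];                   % same peak shifted by one bin
L1 = @(u, v) sum(abs(u - v));        % city-block distance
L2 = @(u, v) sqrt(sum((u - v).^2));  % Euclidean distance
d.L1 = L1(a, b);                     % on the original spectra
d.L2 = L2(a, b);
d.C1 = L1(cumsum(a), cumsum(b));     % on the cumulative spectra
d.C2 = L2(cumsum(a), cumsum(b));
d.D1 = L1(diff(a), diff(b));         % on the first differences
d.D2 = L2(diff(a), diff(b));
disp(d)
```

For this pair the L1 values are 8 (original), 6 (cumulative) and 12 (differences), so the three variants weigh a one-bin shift of the same peak quite differently.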

### Exercise

Analyze one or more of the spectral datasets offered by specdata.

1. Plot a number of spectra. Are there class differences visible?
2. Study scatterplots based on pcam and fisherm.
3. Determine some learning curves for 1NN and at least one other classifier.
4. Compare these learning curves with those found for a dissimilarity approach based on the above-mentioned distance measures. Make your own choice for the representation set.
5. In an advanced study, smaller representation sets may be compared with a feature representation found by resampling the spectra.
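For the last item, one possible way to obtain a resampled feature representation is sketched below in base MATLAB. Linear interpolation with `interp1` on an evenly spaced grid is an assumption made here for illustration; the exercise does not prescribe a resampling scheme, and the spectra are synthetic rather than taken from specdata.

```matlab
% Sketch: reduce spectra to fewer features by resampling (base MATLAB only).
% ASSUMPTION: linear interpolation on an evenly spaced grid; bin averaging
% or smoothing before subsampling are equally plausible alternatives.
nWave = 100;                         % original number of wavelengths
x = linspace(0, 1, nWave);           % synthetic wavelength axis
A = [exp(-(x - 0.3).^2 / 0.01);      % two synthetic spectra, one per row
     exp(-(x - 0.6).^2 / 0.02)];
nFeat = 10;                          % target number of features
xr = linspace(0, 1, nFeat);          % coarser wavelength grid
Ar = interp1(x, A', xr)';            % resample each spectrum (rows of A)
disp(size(Ar))                       % size of the reduced feature matrix
```

The reduced matrix `Ar` has `nFeat` columns, which can then be compared with a dissimilarity representation that uses a representation set of the same size.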