Examples and Tutorials

We provide examples using six different datasets (15-Scene, Corel, MNIST, Yale, KTH, and 20NG) to reproduce the results obtained in the original research paper. Note that slight deviations from the paper are expected due to some implementation changes (batch-based optimization, faster estimation of the scaling factor, and the port to PyTorch). Unless otherwise stated, a linear SVM is trained on the projected data for all reported results.

Prerequisites

To run the examples you have to install PySEF (please also refer to Installation):

pip install pysef

Before running any of the following examples, please download the pre-extracted feature vectors from the following Dropbox folder into a folder named data. Then, run the downloaded examples (refer to the following sections) from the root folder that contains the data folder. Alternatively, you can simply update the data path in the code (refer to Data loading for more details).

Please also install matplotlib, which is needed for some of the following examples/tutorials:

pip install matplotlib

Linear approximation of a high-dimensional technique

In unsupervised_approximation_multiple.py we demonstrate how to approximate the similarity induced by a 50-dimensional PCA projection using just 10 dimensions. The proposed method (abbreviated as S-PCA) is compared to a plain 10-d PCA. To run the example, simply download the aforementioned file and execute it:

python unsupervised_approximation_multiple.py

The following results should be obtained:

Dataset    PCA      S-PCA
15-scene   61.94%   67.20%
Corel      36.18%   38.55%
MNIST      82.88%   84.71%
Yale       56.69%   65.16%
KTH        76.82%   86.56%
20NG       39.73%   45.79%
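
The core of the example boils down to a few lines. The following is a minimal sketch, not the exact example script: the pre-loaded arrays x_train/y_train/x_test/y_test, the epoch count, and the 'copy' target with the target_data keyword are assumptions based on the project's README and approximation examples, and may need adjusting for your PySEF version.

import sef_dr
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

# x_train, y_train, x_test, y_test are assumed to be loaded already
# (e.g., using the utilities described in Data loading)

# The target similarity: a 50-d PCA of the training data
pca = PCA(n_components=50)
x_train_pca = pca.fit_transform(x_train)

# Learn a 10-d linear projection that mimics the 50-d PCA similarity
proj = sef_dr.LinearSEF(x_train.shape[1], output_dimensionality=10)
proj.fit(data=x_train, target_data=x_train_pca, target='copy', epochs=50)

# Evaluate with a linear SVM, as for all reported results
svm = LinearSVC()
svm.fit(proj.transform(x_train), y_train)
print('S-PCA accuracy:', svm.score(proj.transform(x_test), y_test))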

Supervised dimensionality reduction

In supervised_reduction_multiple.py we demonstrate how to perform supervised dimensionality reduction using the SEF. Two different setups are used: a) S-LDA-1, which uses the same dimensionality as the LDA method, and b) S-LDA-2, which doubles the number of dimensions. To run the example, simply download the aforementioned file and execute it:

python supervised_reduction_multiple.py

The following results should be obtained:

Dataset    LDA      S-LDA-1   S-LDA-2
15-scene   66.76%   75.58%    76.98%
Corel      37.28%   42.58%    42.33%
MNIST      85.66%   89.03%    89.27%
Yale       93.95%   92.50%    92.74%
KTH        90.38%   90.73%    91.66%
20NG       63.57%   70.35%    70.25%
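
For reference, the supervised setup looks roughly like the minimal sketch below. The fit() keywords follow the project README; the pre-loaded data arrays and the hyperparameter values are assumptions, not the exact settings of the example script.

import numpy as np
import sef_dr
from sklearn.svm import LinearSVC

# x_train, y_train, x_test, y_test are assumed to be loaded already
n_classes = len(np.unique(y_train))

# S-LDA-1 uses the LDA dimensionality (n_classes - 1);
# S-LDA-2 simply doubles it
proj = sef_dr.LinearSEF(x_train.shape[1], output_dimensionality=n_classes - 1)
proj.fit(data=x_train, target_labels=y_train, target='supervised',
         epochs=50, learning_rate=0.001, batch_size=128)

svm = LinearSVC()
svm.fit(proj.transform(x_train), y_train)
print('S-LDA-1 accuracy:', svm.score(proj.transform(x_test), y_test))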

Providing out-of-sample extensions

In linear_outofsample_mutiple.py we demonstrate how to provide out-of-sample extensions for the ISOMAP technique. Two different setups are used: a) cS-ISOMAP-1, where the dimensionality of the projection is set to 10, and b) cS-ISOMAP-2, where the dimensionality of the projection is set to 20. The proposed method is compared to linear regression (LR). To run the example, simply download the aforementioned file and execute it:

python linear_outofsample_mutiple.py

The following results should be obtained:

Dataset    LR       cS-ISOMAP-1   cS-ISOMAP-2
15-scene   58.29%   67.26%        69.04%
Corel      34.93%   38.70%        40.45%
MNIST      85.11%   85.93%        93.37%
Yale       35.97%   62.09%        82.58%
KTH        67.20%   86.56%        89.80%
20NG       33.14%   41.52%        47.97%
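
Conceptually, the out-of-sample extension works as in the minimal sketch below: the ISOMAP embedding is computed on the training data only, and a linear SEF projection is trained to mimic it, so unseen points can then be projected directly. As before, the pre-loaded data arrays, the epoch count, and the 'copy' target with the target_data keyword are assumptions that may differ from the actual example script.

import sef_dr
from sklearn.manifold import Isomap
from sklearn.svm import LinearSVC

# Compute the ISOMAP embedding of the training data (in-sample only)
isomap = Isomap(n_components=10)
x_train_isomap = isomap.fit_transform(x_train)

# Learn a linear projection that mimics the ISOMAP similarity
proj = sef_dr.LinearSEF(x_train.shape[1], output_dimensionality=10)
proj.fit(data=x_train, target_data=x_train_isomap, target='copy', epochs=50)

# Out-of-sample extension: directly project the (unseen) test data
svm = LinearSVC()
svm.fit(proj.transform(x_train), y_train)
print('cS-ISOMAP-1 accuracy:', svm.score(proj.transform(x_test), y_test))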

Kernel extensions can also be used (kernel_outofsample_mutiple.py); here the baseline is kernel regression (KR). The following results should be obtained:

Dataset    KR       cKS-ISOMAP-1   cKS-ISOMAP-2
15-scene   60.10%   63.89%         68.14%
Corel      36.22%   37.85%         42.27%
MNIST      89.48%   88.30%         91.35%
Yale       46.94%   29.84%         62.25%
KTH        72.31%   78.22%         83.31%
20NG       44.50%   41.57%         48.81%
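
The kernel variant follows the same pattern. Note that in the sketch below the KernelSEF constructor arguments (the training data, the dimensionalities, and the kernel type) are assumptions based on the project documentation and may differ across versions; the data arrays are again assumed to be pre-loaded.

import sef_dr
from sklearn.manifold import Isomap
from sklearn.svm import LinearSVC

# Target similarity: the ISOMAP embedding of the training data
x_train_isomap = Isomap(n_components=10).fit_transform(x_train)

# Kernel SEF projection (constructor signature assumed, see above)
proj = sef_dr.KernelSEF(x_train, x_train.shape[1],
                        output_dimensionality=10, kernel_type='rbf')
proj.fit(data=x_train, target_data=x_train_isomap, target='copy', epochs=50)

svm = LinearSVC()
svm.fit(proj.transform(x_train), y_train)
print('cKS-ISOMAP-1 accuracy:', svm.score(proj.transform(x_test), y_test))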

SVM-based analysis

PySEF can be used to mimic the similarity induced by the hyperplanes of 1-vs-1 SVMs and perform dimensionality reduction accordingly (svm_approximation_multiple.py). The proposed technique is combined with a lightweight Nearest Centroid Classifier. Two different setups are used: a) S-SVM-A-1, where the dimensionality of the projection is set to the number of classes, and b) S-SVM-A-2, where the dimensionality of the projection is set to twice the number of classes. To run the example, simply download the aforementioned file and execute it:

python svm_approximation_multiple.py

The following results should be obtained:

Dataset    Original   S-SVM-A-1   S-SVM-A-2
15-scene   59.67%     74.47%      74.10%
Corel      37.40%     42.15%      41.77%
MNIST      80.84%     86.71%      86.80%
Yale       13.95%     84.44%      88.63%
KTH        79.72%     92.24%      94.09%
20NG       60.79%     65.37%      65.78%
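
A rough outline of this pipeline is sketched below. The 'svm' target name and the exact fit() arguments are assumptions (in particular, the example script passes the trained SVM to the target through additional fit() parameters), so please treat svm_approximation_multiple.py as the authoritative version.

import numpy as np
import sef_dr
from sklearn.svm import SVC
from sklearn.neighbors import NearestCentroid

# x_train, y_train, x_test, y_test are assumed to be loaded already
n_classes = len(np.unique(y_train))

# Train the 1-vs-1 linear SVMs whose hyperplane similarity is mimicked
# (scikit-learn's SVC uses a 1-vs-1 scheme internally)
svm = SVC(kernel='linear')
svm.fit(x_train, y_train)

# Learn a projection with as many dimensions as classes (S-SVM-A-1);
# the fit() arguments for the 'svm' target are an assumption
proj = sef_dr.LinearSEF(x_train.shape[1], output_dimensionality=n_classes)
proj.fit(data=x_train, target_labels=y_train, target='svm', epochs=50)

# Evaluate with a lightweight Nearest Centroid Classifier
ncc = NearestCentroid()
ncc.fit(proj.transform(x_train), y_train)
print('S-SVM-A-1 accuracy:', ncc.score(proj.transform(x_test), y_test))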

PySEF tutorials

To run the tutorials you have to install the Jupyter Notebook (also refer to Installing Jupyter):

pip install jupyter

Then, download the notebook tutorial you are interested in. Currently, two tutorials are available: a) Supervised dimensionality reduction, and b) Defining new dimensionality reduction methods. After that, navigate to the folder that contains the notebook and start the Jupyter Notebook:

jupyter notebook

Finally, use your browser to navigate to the default URL of the Jupyter web app (http://localhost:8888) and select the notebook. Please make sure that you appropriately update the path to the folder that contains the MNIST dataset when running the tutorials (refer to Data loading for more details), or simply create an empty folder named data in the same root folder as the notebook and the dataset will be downloaded automatically.