Results

Note

This project was done within the scope of a class in Artificial Intelligence given by UniDistance and the Idiap Research Institute. It is merely an exercise in reproducibility, not published work, and should not be interpreted as such.

Hypothesis

The hypothesis that initiated this project was the following:

Random Forests can produce prediction models comparable to MLP neural networks on the MNIST digit dataset in significantly less training time.

To test this, we developed this project, which pulls data from the MNIST digit dataset, trains and tests different algorithms under different configurations, and generates reports comparing them.

By running our code with the simple command

mnist-predict -ts test_suite_example.json

we can observe these results. A partial view of the generated report (omitting redundant algorithm parameters) is the following table:

| Iteration | Algorithm     | Train Accuracy | Test Accuracy | Standard Error | Training time (s) |
|-----------|---------------|----------------|---------------|----------------|-------------------|
| 0         | MLP           | 0.984          | 0.9561        | 0.004          | 98.26             |
| 1         | MLP           | 0.996          | 0.9744        | 0.003          | 153.64            |
| 2         | Random Forest | 0.798          | 0.8054        | 0.008          | 1.7               |
| 3         | Random Forest | 0.999          | 0.9665        | 0.004          | 22.86             |
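
The project's own pipeline is driven by the test-suite file above. For orientation, here is a minimal sketch of an equivalent comparison built directly on scikit-learn (assuming OpenML access for MNIST; the estimator parameters are illustrative defaults, not the configurations used in the report's iterations):

import time
from sklearn.datasets import fetch_openml
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Load MNIST (70,000 28x28 images flattened to 784 features).
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X / 255.0  # scale pixel values to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=10000, random_state=0
)

# Illustrative configurations; the report's iterations used other parameters.
for name, clf in [
    ("MLP", MLPClassifier(hidden_layer_sizes=(100,), max_iter=30, random_state=0)),
    ("Random Forest", RandomForestClassifier(n_estimators=100, random_state=0)),
]:
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    print(
        f"{name}: train={clf.score(X_train, y_train):.4f} "
        f"test={clf.score(X_test, y_test):.4f} time={elapsed:.2f}s"
    )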

Conclusion

In conclusion, the data supports our hypothesis: Random Forests can indeed converge to acceptable results in much shorter training time. The most notable runs are the MLP in iteration 1 and the Random Forest in iteration 3: the Random Forest reaches a test accuracy of 0.9665 after 22.86 s of training, against 0.9744 after 153.64 s for the MLP, roughly a sevenfold speed-up at the cost of less than one percentage point of test accuracy. However, this conclusion does come with several limitations.

Future Works

A more in-depth study could likely improve both the training time and the accuracy of the two algorithms. For instance, using the generated views showing which pixels contribute most to the algorithm's final decision, as seen in Figures 1 and 2, we can determine that many pixels along the outer edges are mostly irrelevant to the final decision. Irrelevant pixels can be filtered out in pre-processing to speed up the MLP, as sketched below.
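
A minimal sketch of this filtering idea, assuming the scikit-learn estimators and the X_train/X_test/y_train/y_test split from the earlier sketch (the importance threshold is an illustrative choice, not a value from the project):

from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Rank pixels with a Random Forest, then drop the least informative ones
# before training the MLP (X_train, X_test, y_train, y_test as above).
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# Keep only pixels whose importance exceeds an illustrative threshold;
# edge pixels of MNIST images typically fall below it.
mask = rf.feature_importances_ > 1e-4
print(f"keeping {mask.sum()} of {mask.size} pixels")

mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=30, random_state=0)
mlp.fit(X_train[:, mask], y_train)
print(f"test accuracy on filtered pixels: {mlp.score(X_test[:, mask], y_test):.4f}")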

_images/rf_feat_importance.png
Figure 1

The feature importance of the Random Forest algorithm in iteration 3

_images/mlp_coefficients.png
Figure 2

The input-layer coefficients feeding into the first 16 neurons of the first hidden layer of the MLP in iteration 1
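
For reference, a view like Figure 2 can be produced along these lines, in a minimal sketch assuming an MLPClassifier named mlp fitted on the full 784-pixel inputs, as in the first sketch:

import matplotlib.pyplot as plt

# mlp.coefs_[0] has shape (784, n_hidden): one 784-weight column per
# first-hidden-layer neuron, each reshapable to a 28x28 image.
fig, axes = plt.subplots(4, 4, figsize=(6, 6))
for ax, column in zip(axes.ravel(), mlp.coefs_[0].T[:16]):
    ax.imshow(column.reshape(28, 28), cmap="RdBu")
    ax.set_xticks(())
    ax.set_yticks(())
fig.suptitle("Input-layer coefficients of the first 16 hidden neurons")
plt.show()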