Introduction

This article describes a Microsoft C# 4.0 WPF implementation of a framework that allows you to create, train and test convolutional neural networks against the MNIST dataset of handwritten digits. There is a magnificent article by Mike O'Neill on The Code Project about the same subject; without his great article and C++ demo code this project wouldn't exist. I also relied heavily on Dr. Yann LeCun's paper, Gradient-Based Learning Applied to Document Recognition, to understand more about the principles of convolutional neural networks and the reason why they are so successful in the area of machine vision. Mike O'Neill uses Patrice Simard's implementation, where the subsampling step is integrated into the structure of the convolutional layer itself. Dr. Yann LeCun's LeNet-5 uses a separate subsampling step and also uses layers that are not fully connected. The framework presented here allows you to use all of these layer types and adds a Max-Pooling layer that you can use instead of plain Average-Pooling. The default squashing function is tanh(), and the training target value is set to 0.8, following LeCun's advice to choose targets near the point of maximum second derivative of the non-linearity so that the output units are less prone to saturation. The input images are all normalised to the range [-1, 1], and the input layer uses a fixed 32x32 window.
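As a point of reference for the squashing function, LeCun's paper recommends a scaled form of tanh(); the sketch below shows that form and the derivative that the backpropagation pass needs. This is an illustration only: the constants come from the paper, and the tanh kernel in this project may use plain tanh() or different constants.

using System;

// Illustrative sketch only: the LeCun-style scaled tanh, f(x) = 1.7159 * tanh(2x/3),
// and its derivative as used during backpropagation. The project's own tanh kernel
// may use plain tanh() or other constants.
public static class ScaledTanh
{
    private const double Amplitude = 1.7159D;
    private const double Slope = 2.0D / 3.0D;

    public static double F(double x)
    {
        return Amplitude * Math.Tanh(Slope * x);
    }

    public static double Derivative(double x)
    {
        double t = Math.Tanh(Slope * x);
        return Amplitude * Slope * (1.0D - t * t);
    }
}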

The Code

The main goal of this project was to build an enhanced and extended version of Mike O'Neill's excellent C++ project, this time written in C# 4.0 and using WPF with a simple MVVM pattern as the GUI instead of Windows Forms. I also used the Windows API Code Pack for better integration with Windows 7, so Visual Studio 2010 and Windows Vista SP2 are the minimum requirements to use my application. I also made extensive use of the parallel functionality (the Task Parallel Library) offered in .NET 4.0.
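To illustrate the kind of data-parallel loop this makes possible (a sketch only, not the project's actual code), the feature maps of a layer are independent of each other and can be forward-propagated with Parallel.For from the Task Parallel Library:

using System.Threading.Tasks;

// Illustrative sketch only: the feature maps of a layer do not depend on each other,
// so they can be computed concurrently with the Task Parallel Library of .NET 4.0.
// featureMapCount and ComputeFeatureMap are hypothetical names used for the example.
Parallel.For(0, featureMapCount, mapIndex =>
{
    // Convolve the kernels of one feature map over the previous layer's outputs
    // and apply the squashing function to produce that map's activations.
    ComputeFeatureMap(mapIndex);
});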

Using the code

Here is example code to construct a LeNet-5 network with this framework (see the InitializeDefaultNeuralNetwork() function in MainViewWindows.xaml.cs):

       
// Create the network: name "LeNet-5", training target value 0.8, mean square error loss function
NeuralNetworks network = new NeuralNetworks("LeNet-5", 0.8D, LossFunctions.MeanSquareError, 0.02D);
// Input layer: a single 32x32 image plane
network.Layers.Add(new Layers(network, LayerTypes.Input, 1, 32, 32));
// C1: 6 feature maps of 28x28, 5x5 convolution kernels
network.Layers.Add(new Layers(network, LayerTypes.Convolutional, KernelTypes.Sigmoid, 6, 28, 28, 5, 5));
// S2: 6 maps of 14x14, 2x2 average pooling
network.Layers.Add(new Layers(network, LayerTypes.Subsampling, KernelTypes.AveragePooling, 6, 14, 14, 2, 2));

// Partial connection scheme between the 6 S2 maps and the 16 C3 maps
// (one row per S2 map, one column per C3 map)
List<bool> mapCombinations = new List<bool>(16 * 6)
{
true,  false, false, false, true,  true,  true,  false, false, true,  true, true,  true,  false, true,  true,
true,  true,  false, false, false, true,  true,  true,  false, false, true,  true,  true,  true,  false, true,
true,  true,  true,  false, false, false, true,  true,  true,  false, false, true,  false, true,  true,  true,
false, true,  true,  true,  false, false, true,  true,  true,  true,  false, false, true,  false, true,  true,
false, false, true,  true,  true,  false, false, true,  true,  true,  true,  false, true,  true,  false, true,
false, false, false, true,  true,  true,  false, false, true,  true,  true,  true,  false, true,  true,  true
};

// C3: 16 feature maps of 10x10, 5x5 kernels, partially connected to S2 via mapCombinations
network.Layers.Add(new Layers(network, LayerTypes.Convolutional, KernelTypes.Sigmoid, 16, 10, 10, 5, 5, new Mappings(network, 2, mapCombinations)));
// S4: 16 maps of 5x5, 2x2 average pooling
network.Layers.Add(new Layers(network, LayerTypes.Subsampling, KernelTypes.AveragePooling, 16, 5, 5, 2, 2));
// C5: 120 feature maps of 1x1, 5x5 kernels spanning the entire 5x5 S4 maps
network.Layers.Add(new Layers(network, LayerTypes.Convolutional, KernelTypes.Sigmoid, 120, 1, 1, 5, 5));
// Output layer: 10 fully connected units, one per digit class
network.Layers.Add(new Layers(network, LayerTypes.FullyConnected, KernelTypes.Sigmoid, 10));
// Initialise the weights before training
network.InitWeights();
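Written out like this, the six rows of sixteen boolean values reproduce Table I of the LeNet-5 paper: each row corresponds to one of the six S2 subsampling maps, each column to one of the sixteen C3 feature maps, and a true value means that C3 map takes input from that S2 map. Each C3 map is therefore connected to only 3, 4 or 6 of the S2 maps, which breaks the symmetry between the feature maps and keeps the number of connections within reasonable bounds.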


Design View

Design.png

This is the Design View, where you can see how the network is defined and inspect the weights of the convolutional layers.


Training View

Training.png

This is the Training View, where you train the network. The 'Play' button brings up the 'Select Training Parameters' dialog, where you can define the basic training parameters. The 'Training Schema Editor' button gives you the possibility to fully define your own training schemas and to save and load them as you want. Training can easily be aborted at any time.

SelectTrainingParameters.PNG

TrainingSchemaEditor.PNG


Testing View

Testing.png

In the Testing View you can test your network and get a graphical confusion matrix that represents all the misses.
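For readers unfamiliar with the term: a confusion matrix for the ten digit classes is just a 10x10 table of counts indexed by (true label, predicted label); the diagonal holds the correct classifications and every off-diagonal cell is a miss. A minimal sketch of how such a table can be accumulated (illustrative only, not the project's code; trueLabels and predictedLabels are assumed arrays of digit labels for the test set):

// Illustrative sketch only: build a 10x10 confusion matrix from the true labels
// and the network's predictions over the test set (both assumed int arrays, 0..9).
int[,] confusion = new int[10, 10];
for (int i = 0; i < trueLabels.Length; i++)
{
    // Row = true digit, column = predicted digit; off-diagonal counts are the misses.
    confusion[trueLabels[i], predictedLabels[i]]++;
}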


Calculate View

Calculate.PNG

In the Calculate View you can take a single digit with the desired properties, fire it through the network, and get a graphical view of all the outputs in every layer.


I would love to see a DirectCompute 5.0 integration for offloading the highly parallel task of training the neural network to a DirectX 11 compliant GPU, if one is available. But I've never programmed with DirectX or any other shader-based language before, so if there's anyone out there with more experience in this area, any help is very welcome.

I made an attempt to use a simple MVVM structure in this WPF application. In the Model folder you can find the files for the neural network class and also a DataProvider class that deals with loading and providing the necessary MNIST training and testing samples. There is also a NeuralNetworkDataSet class that is used by the project to load and save neural network definitions, weights or both (full) from or to a file on disk. The View folder contains the four different PageViews in the project and a global PageView that acts as a container for the different views (Design, Training, Testing and Calculate). In the ViewModel folder you find a PageViewModelBase class from which the four corresponding ViewModels are derived. Everything else is found in the MainViewWindows.xaml.cs class.

I hope there's someone out there who can actually use this code and improve on it: extend it with an unsupervised learning stage, for example (an encoder/decoder construction), implement a better loss function (negative log likelihood instead of MSE), extend it to more test databases than just the handwritten MNIST digits, make use of more advanced squashing functions, and so on.
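To make the loss-function suggestion a little more concrete: with a softmax over the output activations, the negative log likelihood of the target class is a natural replacement for the mean square error used here. A minimal sketch (illustrative only; it assumes an array of raw output activations and the index of the target digit):

using System;
using System.Linq;

// Illustrative sketch only: negative log likelihood of the target class under a
// softmax of the raw output activations, as an alternative to mean square error.
public static double NegativeLogLikelihood(double[] outputs, int targetClass)
{
    // Subtract the maximum activation for numerical stability before exponentiating.
    double max = outputs.Max();
    double sumOfExponentials = outputs.Sum(o => Math.Exp(o - max));
    double logProbabilityOfTarget = (outputs[targetClass] - max) - Math.Log(sumOfExponentials);
    return -logProbabilityOfTarget;
}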
