Chapter 2. Advanced Usage

This section describes some of the low-level functions and how they can be used to obtain more control of the fann library. For a full list of functions, please see the API Reference, which has an explanation of all the fann library functions. Also feel free to take a look at the source code.

This section describes several procedures that can help you get more power out of the fann library: Adjusting Parameters, Network Design, Understanding the Error Value, and Training and Testing.

2.1. Adjusting Parameters

Several different parameters exist in an ANN. These parameters are given defaults in the fann library, but they can be adjusted at runtime. There is no sense in adjusting most of these parameters after training, since doing so would invalidate the training, but it does make sense to adjust some of them during training, as will be described in Training and Testing. Generally speaking, these are parameters that should be adjusted before training.

The learning rate is one of the most important parameters, but unfortunately it is also a parameter for which it is hard to find a reasonable default. I (SN) have ended up using 0.7 several times, but it is a good idea to test several different learning rates when training a network. It is also worth noting that the activation function has a profound effect on the optimal learning rate [Thimm and Fiesler, 1997]. The learning rate can be set when creating the network, but it can also be set with the fann_set_learning_rate function.
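
A minimal sketch of adjusting the learning rate might look as follows. It assumes the FANN 2.x style creation call fann_create_standard (in older versions the learning rate was instead an argument to the creation function), and the layer sizes are purely illustrative:

    #include "fann.h"

    int main(void)
    {
        /* A small 2-4-1 network; the layer layout is only an example. */
        struct fann *ann = fann_create_standard(3, 2, 4, 1);

        /* 0.7 has often worked in practice, but it is worth testing
           several different learning rates for each problem. */
        fann_set_learning_rate(ann, 0.7f);

        fann_destroy(ann);
        return 0;
    }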

The initial weights are random values between -0.1 and 0.1. If other weights are preferred, they can be set with the fann_randomize_weights or fann_init_weights function.

In [Thimm and Fiesler, High-Order and Multilayer Perceptron Initialization, 1997], Thimm and Fiesler state that "An (sic) fixed weight variance of 0.2, which corresponds to a weight range of [-0.77, 0.77], gave the best mean performance for all the applications tested in this study. This performance is similar or better as compared to those of the other weight initialization methods."
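
A sketch of both initialization routes, using the [-0.77, 0.77] range recommended above (the fann_type parameters are written as floats here, which assumes a floating-point build of the library):

    #include "fann.h"

    /* ann is an already created network, data a loaded training set. */
    void initialize_weights(struct fann *ann, struct fann_train_data *data)
    {
        /* Replace the default [-0.1, 0.1] range with the [-0.77, 0.77]
           range suggested by Thimm and Fiesler. */
        fann_randomize_weights(ann, -0.77f, 0.77f);

        /* Alternatively, derive the initial weights from the training
           data instead (this overrides the randomization above). */
        fann_init_weights(ann, data);
    }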

The standard activation function is the sigmoid function, but it is also possible to use the threshold function. A list of the currently available activation functions is available in the Activation Functions section. The activation functions are chosen using the fann_set_activation_function_hidden and fann_set_activation_function_output functions.

These two functions set the activation function for the hidden layers and for the output layer. Likewise, the steepness parameter used in the sigmoid function can be adjusted with the fann_set_activation_steepness_hidden and fann_set_activation_steepness_output functions.
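
A short sketch of how these four functions might be used together; the particular activation constants and steepness values are only examples (0.5 is the library default steepness):

    #include "fann.h"

    void configure_activations(struct fann *ann)
    {
        /* Sigmoid in the hidden layers and on the output layer; other
           choices are listed in the Activation Functions section. */
        fann_set_activation_function_hidden(ann, FANN_SIGMOID);
        fann_set_activation_function_output(ann, FANN_SIGMOID);

        /* A higher steepness gives a more abrupt sigmoid transition;
           0.5 is the default. */
        fann_set_activation_steepness_hidden(ann, 1.0f);
        fann_set_activation_steepness_output(ann, 0.5f);
    }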

FANN distinguishes between the hidden layers and the output layer to allow more flexibility. This is especially useful for users who want discrete output from the network, since they can set the output activation function to threshold, as sketched below. Please note that it is not possible to train a network while using the threshold activation function, because it is not differentiable.
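
The resulting workflow is to train with a differentiable activation function and only switch the output layer to threshold afterwards. A sketch, where the training file name and stop criteria are purely illustrative:

    #include "fann.h"

    void train_then_discretize(struct fann *ann)
    {
        /* Train with a differentiable output activation function. */
        fann_set_activation_function_output(ann, FANN_SIGMOID);
        fann_train_on_file(ann, "train.data", 1000, 100, 0.001f);

        /* After training, switch to threshold to get discrete output;
           the network can no longer be trained in this state. */
        fann_set_activation_function_output(ann, FANN_THRESHOLD);
    }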
