Complex Systems

Symmetrization of Information-theoretic Error-measures Applied to Artificial Neural Network Training

Joseph C. Park
2d3D Incorporated,
2003 North Swinton Avenue,
Delray Beach, FL 33444

Salahalddin T. Abusalah
University of West Florida,
Department of Electrical and Computer Engineering,
Pensacola, FL 32514

Abstract

The typical training scenario for an artificial neural network involves minimization of a cost-function expressed in terms of the network output variables. Alternatively, the minimization may be carried out on the basis of the probability distributions of the network output, specifically, in terms of an informational entropy. A large class of such entropy-based cost-functions is not suitable for training, as these functions provide a directed divergence of mutual information between the network output and the desired behavior; that is, they are one-sided or asymmetric divergence functions. It is shown that a "symmetrization" of such divergence functions can transform them into suitable cost-functions for gradient-descent based optimization. Nine such divergence measures are explicitly detailed and employed in training a multilayer perceptron to demonstrate their utility as pragmatic cost-functionals.
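To illustrate the symmetrization idea described above, the following minimal sketch (not the paper's code) forms a symmetric cost from the directed Kullback-Leibler divergence by summing the two one-sided divergences, J(p, q) = KL(p || q) + KL(q || p). The function names, the softmax output model, and the small epsilon guard are assumptions introduced for this example only.

```python
import numpy as np

def softmax(z):
    """Map raw network outputs to a probability distribution."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def kl(p, q, eps=1e-12):
    """Directed (asymmetric) Kullback-Leibler divergence KL(p || q)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q))

def j_divergence(p, q):
    """Symmetrized divergence: usable as a cost since J(p, q) = J(q, p)."""
    return kl(p, q) + kl(q, p)

# Example: compare a target distribution with a network's softmax output.
target = np.array([0.7, 0.2, 0.1])
output = softmax(np.array([1.5, 0.3, -0.8]))
print(j_divergence(target, output))   # symmetric cost to be minimized
print(j_divergence(output, target))   # identical value, by symmetry
```

Because the symmetrized measure treats the network output and the desired distribution interchangeably, it can serve directly as a cost-functional for gradient-descent training, unlike its one-sided constituents.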