## Scaling Relationships in Back-propagation Learning

**Gerald Tesauro**
**Bob Janssens**
*Center for Complex Systems Research, University of Illinois at Urbana-Champaign*
*508 South Sixth Street, Champaign, IL 61820, USA*

#### Abstract

We present an empirical study of the training time required for neural networks to learn to compute the parity function using the back-propagation learning algorithm, as a function of the number of inputs. The parity function is a Boolean predicate whose order is equal to the number of inputs. We find that the training time behaves roughly as 4^n, where n is the number of inputs, for values of n between 2 and 8. This is consistent with recent theoretical analyses of similar algorithms. As part of this study we searched for optimal parameter tunings for each value of n. We suggest that the learning rate should decrease faster than 1/n, that the momentum coefficient should approach 1 exponentially, and that the initial random weight scale should remain approximately constant.
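The experiment the abstract describes, training a small network on n-bit parity with back-propagation plus momentum and recording how many epochs it takes to classify every pattern correctly, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the hidden-layer size (`2 * n`), learning rate, momentum coefficient, initial weight scale, and stopping tolerance are assumptions chosen to make a small demo converge, not the tuned values studied in the paper.

```python
import numpy as np

def parity_dataset(n):
    """All 2^n binary input vectors paired with their parity (XOR of bits)."""
    X = np.array([[(i >> b) & 1 for b in range(n)] for i in range(2 ** n)],
                 dtype=float)
    y = X.sum(axis=1) % 2
    return X, y.reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_parity(n, hidden=None, lr=0.3, momentum=0.9, weight_scale=1.0,
                 max_epochs=50_000, tol=0.25, seed=0):
    """Train an n-input, one-hidden-layer net on parity with batch
    back-propagation plus momentum.  Returns the epoch at which every
    pattern is on the correct side of 0.5 (within tol), or None if the
    run fails to converge.  All hyperparameter defaults are illustrative."""
    rng = np.random.default_rng(seed)
    h = hidden if hidden is not None else 2 * n  # assumed architecture
    X, y = parity_dataset(n)
    W1 = rng.uniform(-weight_scale, weight_scale, (n, h)); b1 = np.zeros(h)
    W2 = rng.uniform(-weight_scale, weight_scale, (h, 1)); b2 = np.zeros(1)
    vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
    vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)
    for epoch in range(1, max_epochs + 1):
        a1 = sigmoid(X @ W1 + b1)            # hidden activations
        out = sigmoid(a1 @ W2 + b2)          # network output
        if np.all(np.abs(out - y) < tol):    # every pattern learned
            return epoch
        d2 = (out - y) * out * (1 - out)     # output delta (squared error)
        d1 = (d2 @ W2.T) * a1 * (1 - a1)     # back-propagated hidden delta
        # Momentum update: velocity <- momentum * velocity - lr * gradient.
        vW2 = momentum * vW2 - lr * (a1.T @ d2); W2 += vW2
        vb2 = momentum * vb2 - lr * d2.sum(axis=0); b2 += vb2
        vW1 = momentum * vW1 - lr * (X.T @ d1); W1 += vW1
        vb1 = momentum * vb1 - lr * d1.sum(axis=0); b1 += vb1
    return None
```

A scaling study in the spirit of the paper would call `train_parity(n)` over several random seeds for each n and average the convergence epochs; note that individual runs can stall in local minima, so a few seeds per n are advisable.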