Complex Systems

Scaling Relationships in Back-Propagation Learning: Dependence on Training Set Size

Gerald Tesauro
Center for Complex Systems Research and Department of Physics,
University of Illinois at Urbana-Champaign,
508 South Sixth Street, Champaign, IL 61820, USA

Abstract

We study the amount of time needed to learn a fixed training set in the "back-propagation" procedure for learning in multi-layer neural network models. The task chosen was 32-bit parity, a high-order function for which memorization of specific input-output pairs is necessary. For small training sets, the learning time is consistent with a 4/3-power law dependence on the number of patterns in the training set. For larger training sets, the learning time diverges at a critical training set size which appears to be related to the storage capacity of the network.
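The measurement described in the abstract can be paraphrased in a short sketch: train a network by batch back-propagation on random subsets of the parity function and record the number of epochs until every training pattern is learned. The following is a minimal stand-in, not the paper's setup: the network size, learning rate, convergence tolerance, and the use of 6-bit (rather than 32-bit) parity are all assumptions chosen to keep the example small and runnable.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_parity_set(n_bits, n_patterns):
        """Sample random n-bit patterns; the target is each pattern's parity."""
        X = rng.integers(0, 2, size=(n_patterns, n_bits)).astype(float)
        y = X.sum(axis=1) % 2
        return X, y

    def train_backprop(X, y, n_hidden=16, lr=0.5, max_epochs=20000, tol=0.1):
        """Batch back-propagation on a one-hidden-layer sigmoid network.
        Returns the number of epochs until every pattern is within `tol`
        of its target, or None if the run fails to converge."""
        n_in = X.shape[1]
        W1 = rng.uniform(-0.5, 0.5, (n_in, n_hidden))
        b1 = np.zeros(n_hidden)
        W2 = rng.uniform(-0.5, 0.5, n_hidden)
        b2 = 0.0
        sig = lambda z: 1.0 / (1.0 + np.exp(-z))
        for epoch in range(1, max_epochs + 1):
            h = sig(X @ W1 + b1)           # hidden-layer activations
            out = sig(h @ W2 + b2)         # network output
            err = out - y
            if np.all(np.abs(err) < tol):  # every training pattern learned
                return epoch
            # backward pass: batch gradient of the summed squared error
            d_out = err * out * (1 - out)
            d_h = np.outer(d_out, W2) * h * (1 - h)
            W2 -= lr * h.T @ d_out
            b2 -= lr * d_out.sum()
            W1 -= lr * X.T @ d_h
            b1 -= lr * d_h.sum(axis=0)
        return None

    # Learning time as a function of training set size (a small-scale
    # stand-in for the paper's 32-bit parity experiments).
    for n_patterns in (4, 8, 16, 32):
        X, y = make_parity_set(6, n_patterns)
        epochs = train_backprop(X, y)
        print(f"{n_patterns:3d} patterns -> {epochs} epochs")

Plotting epochs against the number of patterns on log-log axes is the natural way to read off a power-law exponent from such runs; with plain gradient descent some runs may hit the epoch limit, which is the practical signature of the divergence the abstract describes.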