Complex Systems

Hyperplane Dynamics as a Means to Understanding Back-Propagation Learning and Network Plasticity

Frank J. Śmieja
German National Research Centre for Computer Science (GMD),
Schloss Birlinghoven, 52357 St. Augustin, Germany

Abstract

The processing performed by a feed-forward neural network is often interpreted in terms of decision hyperplanes at each layer. The adaptation process, however, is normally explained using the picture of gradient descent on an error landscape. In this paper, the dynamics of the decision hyperplanes is used as the model of the adaptation process. An electro-mechanical analogy is drawn in which the dynamics of the hyperplanes is determined by interaction forces between the hyperplanes and the particles that represent the patterns. Relaxation of the system is determined by increasing hyperplane inertia (mass). This picture is used to clarify the dynamics of learning, and it goes some way toward explaining learning deadlocks and the escape from certain local minima. Furthermore, network plasticity is introduced as a dynamic property of the system, and the reduction of plasticity as a necessary consequence of information storage. Hyperplane inertia is used to explain and avoid destructive relearning in trained networks.
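As a rough illustration of the hyperplane picture, the sketch below treats a single sigmoid unit with weights w and bias b as the decision hyperplane w·x + b = 0 and shows how one gradient-descent step on a misclassified pattern shifts that hyperplane. It is not the electro-mechanical model of the paper; it uses only the standard geometry of a unit and a plain delta-rule update. All function names, the learning rate, and the example pattern are assumptions chosen for illustration.

```python
import math

def hyperplane(weights, bias):
    """Unit normal and signed offset from the origin of the plane w.x + b = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    normal = [w / norm for w in weights]
    return normal, -bias / norm

def signed_distance(weights, bias, x):
    """Signed distance of point x from the plane (positive on the 'output 1' side)."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * xi for w, xi in zip(weights, x)) + bias) / norm

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def delta_rule_step(weights, bias, x, target, lr=0.5):
    """One gradient-descent step on squared error for a single sigmoid unit.
    (Illustrative only; lr is an assumed learning rate.)"""
    out = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    delta = (target - out) * out * (1.0 - out)
    new_w = [w + lr * delta * xi for w, xi in zip(weights, x)]
    return new_w, bias + lr * delta

# Assumed toy setup: the hyperplane starts as the line x1 = 0, and the
# pattern lies on its "output 0" side although its target is 1.
w, b = [1.0, 0.0], 0.0
pattern, target = [-1.0, 0.0], 1.0

_, offset_before = hyperplane(w, b)
dist_before = signed_distance(w, b, pattern)

w, b = delta_rule_step(w, b, pattern, target)

_, offset_after = hyperplane(w, b)
dist_after = signed_distance(w, b, pattern)

print("offset:", offset_before, "->", offset_after)
print("pattern distance:", dist_before, "->", dist_after)
```

In this toy setup the update moves the hyperplane toward the misclassified pattern, so the pattern's distance from the plane shrinks. This is loosely in the spirit of a pattern exerting a force on a hyperplane, though the correspondence to the paper's forces and inertia is not claimed here.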