Department of Computing and Information Science, Queen's University, Kingston, Canada K7L 3N6
Abstract:
We derive cost formulae for three different parallelisation techniques for training both supervised and unsupervised networks. These formulae are parameterised by properties of the target computer architecture. It is therefore possible to decide both which technique is best for a given parallel computer, and which parallel computer best suits a given technique. One technique, exemplar parallelism, is far superior on almost all parallel computer architectures. The formulae also take into account optimal batch learning as the overall training approach. Cost predictions are made for several of today's popular parallel computers.
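The cost formulae themselves are derived later in the paper; purely as an illustration of the idea of a training cost parameterised by machine properties, the sketch below expresses a per-epoch cost for an exemplar-parallel (data-parallel) scheme in terms of processor count, per-word communication time, and synchronisation overhead. All names and the particular cost expression are hypothetical assumptions and do not reproduce the paper's formulae.

```python
# Hypothetical illustration only: a simple per-epoch cost model for
# exemplar (data) parallelism, parameterised by machine properties.
# It does not reproduce the formulae derived in this paper.

def exemplar_parallel_epoch_cost(
    n_exemplars: int,       # training examples processed per batch
    n_weights: int,         # total weights in the network
    p: int,                 # number of processors
    flop_time: float,       # time per arithmetic operation (s)
    word_comm_time: float,  # time to communicate one weight value (s)
    sync_time: float,       # per-synchronisation overhead (s)
) -> float:
    """Rough cost: local forward/backward passes over n/p exemplars,
    followed by one global combination of weight updates per batch."""
    compute = (n_exemplars / p) * n_weights * flop_time
    combine = n_weights * word_comm_time + sync_time
    return compute + combine


# Example: compare two hypothetical machines for the same network.
for name, comm, sync in [("low-latency", 1e-7, 1e-5), ("commodity", 1e-6, 1e-3)]:
    cost = exemplar_parallel_epoch_cost(
        n_exemplars=10_000, n_weights=50_000, p=32,
        flop_time=1e-8, word_comm_time=comm, sync_time=sync,
    )
    print(f"{name}: {cost:.4f} s per batch")
```

Plugging different machine parameters into such a formula is what allows both questions in the abstract to be answered: fixing the technique and varying the machine, or fixing the machine and varying the technique.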