WG211/M12Fischer
Automatic Generation of Data Distribution Strategies for Data Parallel Programs by Bernd Fischer, joint work with Tristan Aubrey-Jones
Computer systems with distributed memory architectures like GPUs and clusters have become increasingly common, and are often used for large data parallel problems. However they remain considerably more difficult to program than architectures with a single memory and thread of control. Right at the heart of this difficulty is the problem of partitioning and distributing the data structures and computation to minimize expensive memory access/network traffic, and the considerable additional complexity this issue adds to programs. We present a technique to automatically generate data distribution strategies for programs written in a small data parallel DSL. This involves formalizing the concept of distributions for collections in our language, using these to characterize the distributed behaviours of various implementations of high-level data parallel combinators, and using a constraint solving algorithm to search for possible combinations of these distributed building blocks to implement a given program. In this talk we present a simplified version of this formalization, and a worked example