Ripple: Asynchronous Programming for Spatial Dataflow Architectures
Souradip Ghosh, Carnegie Mellon University
Spatial dataflow architectures (SDAs) have emerged as highly performant and highly efficient general-purpose computer architectures. Nonetheless, most SDAs are still used primarily for regular workloads, and few abstractions cover irregular applications. We observe that irregular programs have abundant parallelism, but commonly used sequential languages (e.g., C) encode parallel work poorly, especially when programs execute under the dataflow execution model. This creates an *abstraction inversion*, forcing a dataflow machine to recover the parallelism conservatively and serialize work excessively.
We find that applications written in an asynchronous, pipeline-parallel fashion are amenable to dataflow architectures; they map well to the execution model and to existing hardware resources such as hardware queues between processing elements.
We introduce Ripple: a parallel programming model (with first-class queueing, streaming, and synchronization primitives), a dataflow ISA, a compiler, and hardware extensions that improve performance and reduce code size for irregular programs on dataflow architectures. Ripple achieves an average speedup of 3.5x, reduces code size by an average of 2.1x, and halves the dynamic instruction count across important graph analytics and linear algebra workloads.