Compiling High Performance Recursive Filters

Chaurasia, Gaurav; Ragan-Kelley, Jonathan; Paris, Sylvain; Drettakis, George; Durand, Frédo

Compiling High Performance Recursive Filters

dc.contributor.author	Chaurasia, Gaurav	en_US
dc.contributor.author	Ragan-Kelley, Jonathan	en_US
dc.contributor.author	Paris, Sylvain	en_US
dc.contributor.author	Drettakis, George	en_US
dc.contributor.author	Durand, Frédo	en_US
dc.contributor.editor	Petrik Clarberg and Elmar Eisemann	en_US
dc.date.accessioned	2016-01-19T10:32:56Z
dc.date.available	2016-01-19T10:32:56Z
dc.date.issued	2015	en_US
dc.description.abstract	Infinite impulse response (IIR) or recursive filters, are essential for image processing because they turn expensive large-footprint convolutions into operations that have a constant cost per pixel regardless of kernel size. However, their recursive nature constrains the order in which pixels can be computed, severely limiting both parallelism within a filter and memory locality across multiple filters. Prior research has developed algorithms that can compute IIR filters with image tiles. Using a divide-and-recombine strategy inspired by parallel prefix sum, they expose greater parallelism and exploit producer-consumer locality in pipelines of IIR filters over multidimensional images. While the principles are simple, it is hard, given a recursive filter, to derive a corresponding tile-parallel algorithm, and even harder to implement and debug it. We show that parallel and locality-aware implementations of IIR filter pipelines can be obtained through program transformations, which we mechanize through a domain-specific compiler. We show that the composition of a small set of transformations suffices to cover the space of possible strategies. We also demonstrate that the tiled implementations can be automatically scheduled in hardwarespecific manners using a small set of generic heuristics. The programmer specifies the basic recursive filters, and the choice of transformation requires only a few lines of code. Our compiler then generates high-performance implementations that are an order of magnitude faster than standard GPU implementations, and outperform hand tuned tiled implementations of specialized algorithms which require orders of magnitude more programming effort-a few lines of code instead of a few thousand lines per pipeline.	en_US
dc.description.sectionheaders	High-Performance Data Processing	en_US
dc.description.seriesinformation	High-Performance Graphics	en_US
dc.identifier.doi	10.1145/2790060.2790063	en_US
dc.identifier.isbn	978-1-4503-3707-6	en_US
dc.identifier.pages	85-94	en_US
dc.identifier.uri	https://doi.org/10.1145/2790060.2790063	en_US
dc.publisher	ACM Siggraph	en_US
dc.subject	Image processing	en_US
dc.subject	IIR filter	en_US
dc.subject	GPU computation	en_US
dc.subject	parallelism	en_US
dc.subject	memory locality	en_US
dc.subject	compiler	en_US
dc.subject	domain	en_US
dc.subject	specific language	en_US
dc.subject	high performance	en_US
dc.title	Compiling High Performance Recursive Filters	en_US

Collections

High-Performance Graphics 2015

Compiling High Performance Recursive Filters

Files

Collections