Halide: decoupling algorithms from schedules for high-performance image processing

Authors:

Jonathan Ragan-Kelley,

Saman Amarasinghe, and

Frédo Durand

Communications of the ACM, Volume 61, Issue 1

Pages 106 - 115

https://doi.org/10.1145/3150211

Published: 27 December 2017

Abstract

Writing high-performance code on modern machines requires not just locally optimizing inner loops, but globally reorganizing computations to exploit parallelism and locality, for example by tiling and blocking whole pipelines to fit in cache. This is especially true for image processing pipelines, where individual stages do far too little work to amortize the cost of loading and storing results to and from off-chip memory. As a result, a pipeline globally optimized for parallelism and locality is often an order of magnitude faster than a naive implementation. However, with existing programming tools, writing high-performance image processing code requires sacrificing simplicity, portability, and modularity. We argue that this is because traditional programming models conflate the computations defining the algorithm with decisions about intermediate storage and the order of computation, which we call the schedule.

We propose a new programming language for image processing pipelines, called Halide, that separates the algorithm from its schedule. Programmers can change the schedule to express many possible organizations of a single algorithm. Given a schedule, the Halide compiler then synthesizes a globally combined loop nest for the entire algorithm. Halide models a space of schedules that is expressive enough to describe organizations that match or outperform state-of-the-art hand-written implementations of many computational photography and computer vision algorithms. The model is also simple enough that doing so often takes only a few lines of code, and small changes to the schedule generate efficient implementations for x86, ARM, graphics processors (GPUs), and specialized image processors, all from a single algorithm.

Halide has been public and open source for over four years, during which it has been used by hundreds of programmers to deploy code to tens of thousands of servers and hundreds of millions of phones, processing billions of images every day.
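
The separation of algorithm from schedule described in the abstract can be illustrated with a small sketch. This is plain Python, not Halide syntax, and every name in it (`blur_x`, `blur_y`, `schedule_breadth_first`, `schedule_tiled`, `schedule_inline`) is hypothetical and chosen only for illustration. The two-stage box blur is defined once as per-pixel functions; three different "schedules" then decide loop order and intermediate storage, and all of them should produce the same image.

```python
def clamp(v, lo, hi):
    """Clamp v into [lo, hi]; used for edge handling."""
    return max(lo, min(v, hi))


# Algorithm: what each pixel is. No loop order, no storage decisions.
def blur_x(img, x, y):
    w = len(img[0])
    left = img[y][clamp(x - 1, 0, w - 1)]
    right = img[y][clamp(x + 1, 0, w - 1)]
    return (left + img[y][x] + right) // 3


def blur_y(bx_at, x, y, h):
    # bx_at(x, y) returns the first-stage value, however it is stored.
    up = bx_at(x, clamp(y - 1, 0, h - 1))
    down = bx_at(x, clamp(y + 1, 0, h - 1))
    return (up + bx_at(x, y) + down) // 3


# Schedule A: breadth-first. Compute and store all of blur_x, then blur_y.
def schedule_breadth_first(img):
    h, w = len(img), len(img[0])
    bx = [[blur_x(img, x, y) for x in range(w)] for y in range(h)]
    return [[blur_y(lambda px, py: bx[py][px], x, y, h) for x in range(w)]
            for y in range(h)]


# Schedule B: tiled by row strips. Only the blur_x rows a strip needs are
# computed, into a small scratch buffer, just before they are consumed.
def schedule_tiled(img, strip=2):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y0 in range(0, h, strip):
        y1 = min(y0 + strip, h)
        lo, hi = max(y0 - 1, 0), min(y1, h - 1)  # inclusive row range needed
        scratch = {yy: [blur_x(img, x, yy) for x in range(w)]
                   for yy in range(lo, hi + 1)}
        for y in range(y0, y1):
            for x in range(w):
                out[y][x] = blur_y(lambda px, py: scratch[py][px], x, y, h)
    return out


# Schedule C: full inlining. No intermediate storage; blur_x is recomputed
# at every use, trading redundant work for locality.
def schedule_inline(img):
    h, w = len(img), len(img[0])
    return [[blur_y(lambda px, py: blur_x(img, px, py), x, y, h)
             for x in range(w)] for y in range(h)]
```

In this sketch, changing the organization means swapping one schedule function for another while `blur_x` and `blur_y` stay untouched. Halide itself expresses schedules declaratively and compiles them to loop nests; the hand-written loops here only mimic that idea.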
