References: EE599/699 GPU Computing

All materials posted here are for personal use only. This page has not yet been updated for Fall 2011.

Basic SIMD Architecture & Concepts

Architecture of a massively parallel processor (PDF)

This paper describes Ken Batcher's SIMD MPP design at Goodyear Aerospace.

@inproceedings{285977,
 author = {Kenneth E. Batcher},
 title = {Architecture of a massively parallel processor},
 booktitle = {ISCA '98: 25 years of the international symposia on Computer architecture (selected papers)},
 year = {1998},
 isbn = {1-58113-058-9},
 pages = {174--179},
 location = {Barcelona, Spain},
 doi = {http://doi.acm.org/10.1145/285930.285977},
 publisher = {ACM Press},
 address = {New York, NY, USA},
 }

DAP -- a distributed array processor (PDF)

This paper describes the ICL DAP, another early SIMD machine.

@inproceedings{803971,
 author = {S. F. Reddaway},
 title = {{DAP} -- a distributed array processor},
 booktitle = {ISCA '73: Proceedings of the 1st annual symposium on Computer architecture},
 year = {1973},
 pages = {61--65},
 doi = {http://doi.acm.org/10.1145/800123.803971},
 publisher = {ACM Press},
 address = {New York, NY, USA},
 }

Thinking Machines CM-2 (PDF)

A (relatively late) version of the "Connection Machine Model CM-2 Technical Summary, Version 6.0, November 1990." It includes a description of the (CM-200) floating-point hardware added to the design.

Activity Counter Implementation Of Enable Logic (PDF)

This paper describes a clever method for tracking nested SIMD enable/disable without use of a bit stack.

@inproceedings{ keryell93activity,
    author = "Ronan Keryell and Nicolas Paris",
    title = "Activity Counter: New Optimization for the Dynamic Scheduling of {SIMD} Control",
    booktitle = "Proceedings of the 1993 International Conference on Parallel Processing",
    volume = "II - Software",
    publisher = "CRC Press",
    address = "Boca Raton, FL",
    pages = "II--184--II--187",
    year = "1993",
    url = "citeseer.ist.psu.edu/keryell93activity.html" }
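
To make the activity-counter idea concrete, here is a minimal sketch in C that simulates it sequentially. It is not code from the paper: the names NPE, simd_reset, simd_if, simd_else, simd_endif, and pe_enabled are hypothetical, and the update rules follow the scheme as it is commonly described. Each processing element (PE) keeps one small counter instead of a stack of enable bits; a PE is enabled exactly when its counter is zero.

```c
#include <assert.h>

#define NPE 8  /* number of simulated PEs; an arbitrary choice for this sketch */

/* One activity counter per PE. 0 means enabled; k > 0 means the PE was
   disabled k nesting levels ago (counted from the innermost level). */
static unsigned counter[NPE];

/* Enable every PE (no conditional is open). */
static void simd_reset(void) {
    for (int i = 0; i < NPE; i++)
        counter[i] = 0;
}

/* Is PE i currently enabled? */
static int pe_enabled(int i) {
    assert(i >= 0 && i < NPE);
    return counter[i] == 0;
}

/* Open a nested conditional. PEs that are already disabled just record
   one more level; enabled PEs whose condition is false become disabled. */
static void simd_if(const int cond[NPE]) {
    for (int i = 0; i < NPE; i++) {
        if (counter[i] > 0)
            counter[i]++;
        else if (!cond[i])
            counter[i] = 1;
    }
}

/* Switch to the else branch of the innermost conditional. Only PEs with
   counter 0 or 1 took part in that conditional, so only they flip. */
static void simd_else(void) {
    for (int i = 0; i < NPE; i++) {
        if (counter[i] == 0)
            counter[i] = 1;
        else if (counter[i] == 1)
            counter[i] = 0;
    }
}

/* Close the innermost conditional: every disabled PE moves one level out. */
static void simd_endif(void) {
    for (int i = 0; i < NPE; i++)
        if (counter[i] > 0)
            counter[i]--;
}
```

A counter with enough bits to hold the maximum nesting depth replaces a bit stack of that depth, which is where the storage saving comes from. In this sketch, a matched simd_if / simd_endif pair (with or without simd_else in between) restores the counters to their values before the simd_if.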

Multimedia Extensions For Microprocessors: SIMD Within A Register (HTML)

One of the first talks on the concepts of SWAR... originally presented in February 1997 at Purdue University.

Compiling for SIMD within a Register (PDF)

One of the best generic descriptions of the concepts of SWAR. The above link is direct from Springer-Verlag.

@inproceedings{663771,
 author = {Randall J. Fisher and Henry G. Dietz},
 title = {Compiling for SIMD Within a Register},
 booktitle = {LCPC '98: Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing},
 year = {1999},
 isbn = {3-540-66426-2},
 pages = {290--304},
 publisher = {Springer-Verlag},
 address = {London, UK},
 }
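
As a small illustration of what SWAR means in practice, the sketch below adds four 8-bit fields packed into one 32-bit word using ordinary integer instructions. It is a widely used idiom and is not taken from the paper; the function names swar_add8 and swar_get8 are hypothetical. The trick is to mask off the top bit of every field so that carries cannot cross a field boundary, then fold the top bits back in with XOR.

```c
#include <assert.h>
#include <stdint.h>

/* Add four unsigned 8-bit fields packed in x and y, field by field,
   with each field wrapping modulo 256 independently of its neighbors. */
static uint32_t swar_add8(uint32_t x, uint32_t y) {
    const uint32_t H = 0x80808080u;        /* top bit of each 8-bit field */
    /* Add only the low 7 bits of each field. The largest per-field sum is
       0x7F + 0x7F = 0xFE, so any carry lands in bit 7 of the same field. */
    uint32_t low = (x & ~H) + (y & ~H);
    /* The top bit of each result field is x7 XOR y7 XOR carry-into-bit-7;
       the carry is already sitting in bit 7 of 'low'. */
    return low ^ ((x ^ y) & H);
}

/* Extract field i (0 = least significant byte) from a packed word. */
static unsigned swar_get8(uint32_t word, int i) {
    assert(i >= 0 && i < 4);
    return (word >> (8 * i)) & 0xFFu;
}
```

For example, adding the packed words 0x01FF7F80 and 0x01010180 gives 0x02008000: the fields 0xFF + 0x01 and 0x80 + 0x80 each wrap to 0x00 without disturbing their neighbors, whereas a plain 32-bit add would let those carries spill into the next field.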

GPU Computing In General

GPGPU (HTML)

This site contains a variety of news, paper links, etc., about the use of GPUs (Graphics Processing Units) for General-Purpose computing -- commonly known as GPGPU. Note that general-purpose is a misnomer; it is really about programming GPUs for tasks that are not entirely graphical.

A Performance-Oriented Data Parallel Virtual Machine for GPUs (PDF)

The first paper on ATI's CTM (Close To the Metal) software interface to GPUs (Graphics Processing Units) for general-purpose computing. Referenced directly from ATI's site, which is now part of AMD's site. There are also slides and a full manual at the ATI/AMD site.

GPU Programming Support

We'll be starting with NVIDIA's CUDA environment. The latest version is 4.0. Note that the version numbers are different for the various components of the CUDA system, and do not have any obvious relationship to the Compute Capability levels that are supported. However, version numbers are consistent across the supported platforms.

