
Technologies

Editor: Michael A. Gray, gray@american.edu

Getting Started with GPU Programming


By Michael A. Gray
This tutorial describes a step-by-step procedure for programming a Macintosh Nvidia GPU. General scientific programmers with some C knowledge can get started in parallel processing application development with relative ease.

A recent surge in articles describes the use of graphics processing units (GPUs) for scientific computation [1, 2]. This isn't surprising given that the processor resources available on GPUs (even those included with standard desktop computers) are surprisingly powerful and put high-performance computing within everyone's reach. Even if you don't want to use your workstation's onboard GPU, an add-on is reasonably inexpensive and easy to install. But many of us, myself included, lack experience in programming GPUs, so we're reluctant to start using them. A series of excellent discussions about how researchers have used GPUs in particular applications appears elsewhere [3, 4]. But those discussions don't provide a tutorial on GPU programming, so they lack step-by-step descriptions of where to start. Even understanding the example applications can be daunting for someone who's not in the field.

Therefore, I devote this article to a step-by-step walkthrough of how to use a GPU on a MacBook Pro with an Nvidia graphics unit. I chose the Nvidia graphics unit as my example GPU because Nvidia makes available a substantial development system called CUDA (compute unified device architecture) for programming its units. Because Macintosh computers often come equipped with Nvidia graphics units, I focus my discussion on the Mac. The CUDA system that runs on Mac OS X requires the Leopard version, so my comments refer to Mac OS X 10.5.6.

What Is a Graphics Unit?

The monitor for a modern computer displays a wide variety of visuals, ranging from DVD and video players to simple text-editing windows. A separate unit within the computer called the graphics unit drives this high-performance display. Early graphics units came on separate boards that installers plugged into the computer's bus, which gave rise to the term "graphics cards." They contained onboard memory for caching display information and an onboard processor for processing it. If you think of the monitor as a two-dimensional array of pixels, each of which has a color, then the graphics unit's role is to produce the next pixel array fast enough to give the appearance of continuous motion. Because the pixel array is large (1,440 × 900 = 1,296,000 pixels on my MacBook Pro), additional dedicated processing is required for the update.

The rise of more challenging display needs, such as modern 3D gaming with realistic lighting, caused developers to increase the unit's power. The operations of reading a set of vertices defining an object's display from the CPU, computing the vertices' new colors (shading the vertices), and assembling them into fragments for final processing fit a SOMD (single-operation, multiple-data) model naturally because graphics processors process the vertices in a largely independent fashion. Developers redesigned the graphics units to contain several processors that operate in parallel to stream the data through the unit to the output frame buffer. GPU programming takes advantage of these streaming units to conduct classic SOMD-type scientific computations of the same kind that were commonly done in the past on vector processor computers [5].

Determining the Mac's Graphics Resources

To determine your machine's graphics resources, you must first determine your Mac's graphics unit model. Because the Mac must run Mac OS X 10.5.6 (Leopard), the basic window is the standard Aqua GUI, which displays a fixed menu bar at the top; clicking on the Apple icon on the left side opens a drop-down menu. The first item on this menu is About This Mac, which displays basic information; selecting More Info gives detailed information about the hardware. Halfway down is the Graphics/Displays item, which shows the graphics unit's characteristics. The graphics unit must be on the list of CUDA-enabled units found at www.nvidia.com/object/cuda_learn_products.html to be a candidate for programming.

I performed the testing for this tutorial using a MacBook Pro that has an Nvidia GeForce 8600M GT with 256 Mbytes of memory. By going to Nvidia's Web site (www.nvidia.com) and searching for the GeForce 8600 GT's specifications, you'll find that the GeForce 8600M GT is a GPU with 32 stream processors and a shader clock speed of 1.45 GHz. So, this is a significant computational resource. Of course, every computer's GPU is programmable because they all execute a graphics driver program. But, in this case, I was most interested in finding a GPU with a general-purpose programming interface that's reasonably easy to master without becoming a graphics unit expert.

After determining that my GPU was CUDA-enabled, my next step was to investigate the CUDA menu under the Technologies main menu on Nvidia's Web site. (The "What is CUDA" drop-down menu provides an introduction to the Nvidia general-purpose programming system for its GPUs.) CUDA lets a GPU programmer use the C language to program the GPU and is free to download, so it fulfills the requirement of a general-purpose programming environment that's easy for the general scientific programmer to use.

July/August 2009
Copublished by the IEEE CS and the AIP
1521-9615/09/$25.00 © 2009 IEEE

Installing the CUDA Programming System

The CUDA Downloads page links to a CUDA driver, a toolkit, and software developer kit (SDK) code samples. To download and install CUDA, go to the downloads page, select Mac OS for the operating system and CUDA 2.1 for the version, then select the download button for the CUDA Toolkit 2.1 for Mac OS, and finally the button for CUDA SDK 2.1 code samples for Mac OS X. After you download the first of these (a 21.8-Mbyte file named cudatoolkit_2.1_macos.pkg), the installer automatically launches and shows the usual license agreement pages. It then installs a 59.4-Mbyte system of files on the Mac and puts up a completion screen with the warning that you should add the line

export PATH=<toolkit_install_path>/cuda/bin:$PATH

to the shell environment variables and that the toolkit install path is /usr/local. When I performed the tests at the end of this article, I had to modify the line above and add an additional line in the shell environment. The actual shell modifications I used for the testing were

export PATH=.:/usr/local/cuda/bin:/usr/local/cuda/lib:$PATH
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH

When the second of these two downloads (a 45.2-Mbyte file named NVIDIA_SDK10_CUDA_2.1_macosx.pkg) is complete, the installer launches and installs CUDA in the Developer/CUDA folder. The release notes call for the execution of the make file to create all the code examples. To go further, you must use the Terminal application and work at the Unix level.
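
The export lines above grow the search paths every time they are applied. As a minimal sketch only (prepend_dir is a hypothetical helper name, not part of the CUDA toolkit, and /usr/local/cuda is assumed to be the install location reported by the installer), the same settings can be written so that a directory is added once:

```shell
# Hypothetical helper (not part of CUDA): print LIST with DIR prepended,
# unless DIR is already present in the colon-separated LIST.
prepend_dir() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;                 # already present
        *)          printf '%s\n' "$dir${list:+:$list}" ;;   # add at the front
    esac
}

# Assumed install location, taken from the settings quoted in the text.
CUDA_HOME=/usr/local/cuda

PATH=$(prepend_dir "$CUDA_HOME/bin" "$PATH")
DYLD_LIBRARY_PATH=$(prepend_dir "$CUDA_HOME/lib" "${DYLD_LIBRARY_PATH:-}")
export PATH DYLD_LIBRARY_PATH
```

This sketch leaves out the "." and lib entries that my own PATH setting contains; add them in the same way if you want to run compiled programs from the current directory by name alone.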

Using the Terminal Application

The Mac's developers built Mac OS X 10.5.6 on top of a Unix system named Darwin. The Aqua GUI is just an interactive shell that lets the user operate Darwin without explicitly working in a low-level Unix command shell. However, to program in CUDA, you do need a low-level command shell, so you have to use Terminal, an application that connects you directly to a Unix shell. There are different kinds of Unix shells, each with its own command syntax; the Bourne-again shell (called bash) is the default shell for Terminal. You can launch Terminal.app (found in a Finder window in the Applications/Utilities folder or through the Go/Utilities menu) to open a bash shell.

At this level, you must insert the two export commands into the shell environment, which you can do by editing the .bash_profile file using a Unix editor (such as emacs). As an alternative that doesn't require modifying .bash_profile, you can use the Text Editor application to make a file named myBash, containing the two export commands, in the top-level Finder. Then, you use this file to modify the runtime environment by entering the command source myBash in a bash shell. Doing this inserts the names into their appropriate paths for use during the shell's lifetime. This approach's drawback is that you must use this command to start each shell session.

After preparing the shell environment, you must change to the /Developer/CUDA directory to make the SDK examples. The bash command cd /Developer/CUDA moves into the correct directory so that you can execute the make command, which builds the example projects and compiles them using the nvcc compiler. The following list shows partial output from an execution:

dhcp-103-206:CUDA mg$ make
q - obj/release/bank_checker.cpp.o
q - obj/release/cmd_arg_reader.cpp.o
...
make -C projects/alignedTypes/
make -C projects/asyncAPI/
...

If this step produces compiled examples in the Projects directory, then the toolkit is installed and working.

Computing in Science & Engineering

Preparing to Compile GPU Files

With the CUDA system installed and the PATH variable properly set up, you can proceed to the GPU program compilation. The CUDA programming system consists of a set of C libraries that you use to wrap up the actual GPU coding in simple C function calls. So, all that's left is to write GPU programs in CUDA C and let the library functions do the work. But, before starting compilation, make sure that the base C compiler (gcc 4.0) is available for CUDA to use. On the Mac OS X 10.5.6 system, this is a relatively simple task because the gcc compiler is available on the Leopard installation disk as part of the Xcode package. To see if the Xcode developer tools are installed on the Mac, open a Finder window and select the boot volume name. This area has several folders, including one named Developer, which is where you'll find the installed Xcode developer tools. Select the Developer folder and then select the subfolder named usr/bin. This folder should contain the gcc 4.0 compiler. If you don't find it there, install Xcode Developer Tools Essentials from the Leopard installation disk.

Installing the Xcode Developer Tools Essentials

To install the developer tools, insert the Leopard installation DVD, open the Optional Installs folder, open the Xcode Developer Tools Essentials folder, and then launch the file XcodeTools.mpkg. When the installer runs, go through the various licensing and destination screens to the Installation Type screen. To install just the developer tools, select Customize at the bottom to open a Custom Install screen. At a minimum, you must install the developer tools essentials, core reference library, and the Unix development support. Then, select Install for a standard installation or Customize for a custom installation.
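
The Finder check for gcc can also be done from Terminal. The following is a hedged sketch rather than a required step: have_tool is a hypothetical helper name (it isn't an Xcode or CUDA command), and the report assumes the export lines from the installation step have already been applied in the current shell.

```shell
# Hypothetical helper: succeed if the named command can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report which of the tools used in this tutorial are reachable.
for tool in gcc make nvcc; do
    if have_tool "$tool"; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done
```

If gcc or make is reported as missing, the Xcode installation described above is the likely remedy; if only nvcc is missing, the PATH setting is the first thing to recheck.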

Compiling GPU Files

A fast and effective way to check the GPU's execution is to compile and execute the test exercises found on the CUDA U Web page at www.nvidia.com/object/cuda_education.html. CUDA U is the part of the CUDA Web site that contains several tutorial articles on CUDA programming and a selection of sample CUDA code exercises. From the introductory CUDA training courses, download the Instructions for Exercises PDF file and the exercises (for Linux and Mac) .tar file, then extract the .tar file to produce a set of basic tutorials that are also excellent tests to see if the GPU is working. These exercises run the basic operations of copying data to or from GPU memory, launching GPU kernels, and a basic array reversal GPU program. The CUDA source code is in the files with the .cu extension.

To test the GPU operation, go in Finder to the solutions subfolder of the Exercises directory created when you extracted the file, and copy the files into your username login folder. Open a Terminal window, source your myBash file, and enter cd ~ (this moves to the Username directory). Listing the files there (use the command ls -al) should show the exercise programs. The Exercise Instructions PDF has a list of commands to compile the exercise code with the CUDA C compiler. For the Mac, look at the instructions for Linux. You'll find four variants; be sure to use the build release variant to test the GPU operation. Executing the command

nvcc cudaMallocAndMemcpy.cu -o cudaMallocAndMemcpy

will give an executable file named cudaMallocAndMemcpy that you can run by entering its name on the next line. If the GPU is working, the execution produces the output Correct!, which signals that the GPU is ready for use. It's worthwhile and very easy to compile and run the two additional programs in the Exercises folder, myFirstKernel.cu and reverseArray_multiblock_fast.cu, using the same method.
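
The compile-and-run check can be wrapped in a few lines of shell. This is only a sketch under stated assumptions: reports_correct is a hypothetical helper name, the exercise source is assumed to sit in the current directory, and the check relies on the Correct! string that the memory exercise prints when it succeeds.

```shell
# Hypothetical helper: run a command and succeed only if its output
# contains the string "Correct!" that signals a successful run.
reports_correct() {
    "$@" 2>&1 | grep -q 'Correct!'
}

# Compile and run the first exercise only when nvcc and the source exist.
src=cudaMallocAndMemcpy.cu
if command -v nvcc >/dev/null 2>&1 && [ -f "$src" ]; then
    nvcc "$src" -o cudaMallocAndMemcpy
    if reports_correct ./cudaMallocAndMemcpy; then
        echo "GPU test passed"
    else
        echo "GPU test did not report Correct!"
    fi
else
    echo "nvcc or $src not found; skipping"
fi
```

The same pattern should carry over to myFirstKernel.cu and reverseArray_multiblock_fast.cu, although whether those programs print the same success string is not confirmed here.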

The CUDA U site has a wealth of information about Nvidia GPU programming, including movie files, PDFs, and examples. The SDK has many more examples in the Projects files. A large source of C examples for gaining GPU programming experience is the classic Numerical Recipes in C, by William H. Press and his colleagues [6]. I hope this short tutorial will be helpful in starting you on the road to programming your Mac's GPU. A GPU provides you with your own parallel processing engine for porting your single-processor applications into the world of parallel processing.

Acknowledgments

I thank Michael Levin of American University for his assistance with testing the programs used in this article.


References

1. J. Kurzak et al., "The Playstation 3 for High-Performance Scientific Computing," Computing in Science & Eng., vol. 10, no. 3, 2008, pp. 84-87.
2. P. Messmer et al., "GPUlib: GPU Computing in High-Level Languages," Computing in Science & Eng., vol. 10, no. 5, 2008, pp. 70-73.
3. G. Stantchev et al., "Using Graphics Processors for High-Performance Computation and Visualization of Plasma Turbulence," Computing in Science & Eng., vol. 11, no. 2, 2009, pp. 52-59.
4. I.S. Ufimtsev et al., "Graphical Processing Units for Quantum Chemistry," Computing in Science & Eng., vol. 10, no. 6, 2008, pp. 26-34.
5. A. Watt and F. Policarpo, Advanced Game Development with Programmable Graphics Hardware, AK Peters, 2005.
6. W.H. Press et al., Numerical Recipes in C, Cambridge Univ. Press, 1988.


Michael A. Gray is an associate professor at American University. His research interests include computer science and physics. Gray has a PhD in physics from Pennsylvania State University. He's a member of the ACM, the IEEE, and the IEEE Computer Society. Contact him at gray@american.edu.
