Timing Operations in CUDA

Joel Adams, Calvin College, and Jeffrey Lyman, Macalester College


The purpose of this document is to teach students the basics of CUDA programming and to give them an understanding of when it is appropriate to offload work to the GPU.

By completing vector addition, multiplication, square root, and squaring programs, students will learn when the speedup from running on the GPU outweighs the overhead of creating threads and copying memory between the host and the device.

This activity contains three parts, linked below. First, there is a short introduction to setting up CUDA code to run on a GPU. Then you will run vector addition code on your GPU machine. Lastly, you will experiment with various types of operations and increasingly large array sizes to determine when it is worthwhile to use a GPU for general-purpose computing.
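To give a sense of what timing CUDA code can look like, here is a minimal sketch that times the three phases of a vector addition separately using CUDA events. It is not the module's own code: the names vecAdd, timeVectorAdd, Timings, and CUDA_CHECK, along with the array size and block size, are assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "CUDA error: %s (line %d)\n",           \
                    cudaGetErrorString(err_), __LINE__);            \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

// Each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Illustrative result record: milliseconds per phase plus a correctness flag.
struct Timings { float copyInMs, kernelMs, copyOutMs; bool correct; };

Timings timeVectorAdd(int n) {
    size_t bytes = (size_t)n * sizeof(float);
    float *hA = (float *)malloc(bytes);
    float *hB = (float *)malloc(bytes);
    float *hC = (float *)malloc(bytes);
    if (!hA || !hB || !hC) { fprintf(stderr, "host malloc failed\n"); exit(EXIT_FAILURE); }
    for (int i = 0; i < n; ++i) { hA[i] = (float)i; hB[i] = 2.0f * (float)i; }

    // Allocate device memory before timing starts, so allocation and
    // one-time context setup are not counted in any phase.
    float *dA, *dB, *dC;
    CUDA_CHECK(cudaMalloc(&dA, bytes));
    CUDA_CHECK(cudaMalloc(&dB, bytes));
    CUDA_CHECK(cudaMalloc(&dC, bytes));

    cudaEvent_t e0, e1, e2, e3;
    CUDA_CHECK(cudaEventCreate(&e0));
    CUDA_CHECK(cudaEventCreate(&e1));
    CUDA_CHECK(cudaEventCreate(&e2));
    CUDA_CHECK(cudaEventCreate(&e3));

    // Phase 1: copy the inputs from host to device.
    CUDA_CHECK(cudaEventRecord(e0));
    CUDA_CHECK(cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaEventRecord(e1));

    // Phase 2: run the kernel with enough blocks to cover all n elements.
    int threads = 256;  // assumed block size
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(dA, dB, dC, n);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaEventRecord(e2));

    // Phase 3: copy the result back to the host.
    CUDA_CHECK(cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaEventRecord(e3));
    CUDA_CHECK(cudaEventSynchronize(e3));  // wait until the last event has occurred

    Timings t;
    CUDA_CHECK(cudaEventElapsedTime(&t.copyInMs, e0, e1));
    CUDA_CHECK(cudaEventElapsedTime(&t.kernelMs, e1, e2));
    CUDA_CHECK(cudaEventElapsedTime(&t.copyOutMs, e2, e3));

    // Check the GPU result against the same sum computed on the CPU.
    t.correct = true;
    for (int i = 0; i < n; ++i)
        if (hC[i] != hA[i] + hB[i]) { t.correct = false; break; }

    CUDA_CHECK(cudaEventDestroy(e0)); CUDA_CHECK(cudaEventDestroy(e1));
    CUDA_CHECK(cudaEventDestroy(e2)); CUDA_CHECK(cudaEventDestroy(e3));
    CUDA_CHECK(cudaFree(dA)); CUDA_CHECK(cudaFree(dB)); CUDA_CHECK(cudaFree(dC));
    free(hA); free(hB); free(hC);
    return t;
}

int main() {
    int n = 1 << 20;  // assumed problem size; vary it to see how the balance shifts
    Timings t = timeVectorAdd(n);
    printf("n = %d: copy in %.3f ms, kernel %.3f ms, copy out %.3f ms, %s\n",
           n, t.copyInMs, t.kernelMs, t.copyOutMs, t.correct ? "correct" : "WRONG");
    return 0;
}
```

Assuming the file is saved as timing.cu, it could be compiled with nvcc timing.cu -o timing. Reporting the copies separately from the kernel makes it possible to compare the total GPU cost, transfers included, against a plain CPU loop over the same arrays.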

Learning Goals

  • Students will be able to time CUDA code and experiment with various data operations and problem sizes.
  • Students will be able to describe the circumstances in which using a GPU for general-purpose computing is beneficial.

Context for Use

This is an activity that can be completed in a lab or active classroom setting.

Description and Teaching Materials

You can visit the module in your browser:

Timing CUDA Operations

or you can download the module in PDF, LaTeX, or Word format.

PDF Format: TimingCUDAOperations.pdf.
LaTeX Format: TimingCUDA.tar.gz.
Word Format: TimingCUDAOperations.docx.

Teaching Notes and Tips

You will need the CUDA development tools installed on a machine with an NVIDIA GPU card that is capable of general-purpose computing.


No assessment instruments available.

References and Resources

You may want to refer to the NVIDIA CUDA developer documentation for more information, although this activity is designed to stand on its own.
