Map-reduce Computing for Introductory Students using WebMapReduce
Summary
This module is designed for the first CS course, yet is usable in many later courses. Students use a web application called WebMapReduce (WMR) and one of several supported programming languages to learn how multiple processes can help solve challenging problems faster. The module emphasizes data-parallel problems and solutions, the so-called 'embarrassingly parallel' problems in which the processing of input data can easily be split among several parallel processes. Examples in this module include manipulating very large data files, such as counting the frequency of words in large texts, and transforming a collection of numeric data values. Following the widely used map-reduce computational pattern, students write code for mapper and reducer functions, enter them in the web interface of WMR, specify the data file to be used as input, and submit the job to be run on a cluster. The readings, concept presentation material, active in-class exercises, and homework exercises build on material commonly covered in an introductory course, such as iteration over collections, working with strings, and file manipulation.
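The word-frequency example mentioned above can be sketched in Python as a mapper and a reducer. This is only a minimal local illustration, not WMR's actual interface: the emit parameter and the run_map_reduce driver are hypothetical stand-ins for what the WMR system and its cluster would provide, and treating each input line as the key with an empty value is an assumption of this sketch.

```python
from collections import defaultdict


def mapper(key, value, emit):
    """Split one line of text into words and emit (word, 1) for each word."""
    for word in key.split():
        emit(word.lower(), 1)


def reducer(key, values, emit):
    """Add up all the counts that were collected for a single word."""
    emit(key, sum(values))


def run_map_reduce(lines):
    """Hypothetical local driver: map every line, group by key, then reduce.

    On a real cluster the map calls and the reduce calls would each be
    spread over many processes; here they simply run one after another.
    """
    grouped = defaultdict(list)
    for line in lines:
        # Assumption: the whole line is the key and the value is empty.
        mapper(line, "", lambda k, v: grouped[k].append(v))

    results = {}
    for key in sorted(grouped):
        reducer(key, grouped[key], lambda k, v: results.__setitem__(k, v))
    return results


counts = run_map_reduce(["the cat sat", "the dog sat down"])
print(counts)
```

Because each mapper call looks at only one line and each reducer call looks at only one word, neither depends on the others, which is what allows the work to be split among several parallel processes.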
Module Characteristics
Languages Supported: Python, Scheme, C++, Java
Relevant Parallel Computing Concepts: Data Parallelism, Task Parallelism
Recommended Teaching Level: Introductory
Possible Course Use: Introduction to Computer Science
Learning Goals
- Students should be able to identify basic forms of data parallelism in computational problems.
- Students should be able to distinguish between sequential and parallel computation, and identify the practical significance of each.
Context for Use
Description and Teaching Materials
You can visit the module in your browser:
Map-reduce Computing for Introductory Students using WebMapReduce
or you can download the module in PDF, LaTeX, or Word format.
PDF Format: Map-reduce Computing for Introductory Students using WebMapReduce.pdf.
LaTeX Format: Map-reduce Computing for Introductory Students using WebMapReduce.tar.gz.
Word Format: Map-reduce Computing for Introductory Students using WebMapReduce.docx.
Teaching Notes and Tips
You will need WebMapReduce installed on some platform resource, such as a cluster or the cloud.
To have students work with very large files, you will need to place those files on your underlying Hadoop installation.