3858:13157
ShareConcurrent Access to Data Structures
Summary
This module enables students to experiment with creating a task-parallel solution to the problem of crawling the web by using Java threads and thread-safe data structures available in the java.util.concurrent package. We provide enough code for a sequential solution, which the students can use as the basis for creating a task-parallel threaded solution. We provide some code to get them started with the threaded solution, because in our case this is the first threaded program they try.
Module Characteristics
Languages Supported: Java
Relevant Parallel Computing Concepts: Shared Memory
Recommended Teaching Level: Introductory/Intermediate, perhaps in a data structures or algorithms course.
Learning Goals
- Students should be able to define what a thread-safe data structure object is
- Given a definition of an object, students should be able to distinguish whether it is properly thread-safe or not and why
- Students should be able to describe how locking is used to ensure proper concurrent access to data structures and how deadlock is avoided
- Given a problem that can be improved using parallel access to shared data, students should be able to use built-in thread-safe classes to implement parallel solutions and determine the speedup of their solution
Context for Use
Description and Teaching Materials
The code for students to complete is in ConcurrentDataStructures.jar (Jar Archive 131kB Jul19 14)
You can visit the module in your browser:
Concurrent Access to Data Structures
or you can download the module in either PDF format or latex format.
PDF Format: Concurrent Access to Data Structures.pdf.
Latex Format: Concurrent Access to Data Structures.tar.gz.
Word Format: Concurrent Access to Data Structures.docx.
Teaching Notes and Tips
This module utilizes the java.util.concurrent package.
- main() is within RunSpider.java (original sequential version) and RunThreadedSpider.java (new version for students to complete)
- The Spider and concurrentSpider folders contain the starting .java and .class files for students to used in exercises in the lab
- SpiderComplete and concurrentSpiderComplete contain the finished code (please request)
The concept of the lab/homework is that students should have worked out a sequential version of the Spider already. You could do this by explaining it to them, or by having them work on that first. We have used this in the past by staging assignments so that they get some parts of the code and then must complete to make a working spider. What is provided here is the already working sequential version.
You may feel like using different data structures for parts of this assignment- we realize there are other ways to hold the information kept by the crawler. However, we are pretty sure that the concurrent data structures that we chose for the threaded version likely are the proper ones from the java.util.concurrent library. We'd welcome comments or suggestion on this.
The lecture/class materials present the concepts of using multiple threads that share data structures in memory. Since the intended audience is introductory to intermediate students, there is not a deep conversation about race conditions, rather we intended a gentle introduction to this issue when data is shared, and an explanation of how the designers of the data structures in the java.util.concurrent library build them to ensure atomic operations by multiple therads (i.e. they are thread-safe).