One of the most important recent developments in computing is the growth in distributed and parallel applications.
In these tutorials, we distinguish between local programming (on a single machine) and distributed programming (using multiple components that communicate over a network).
We cover what designers and programmers need to consider when developing applications in a distributed environment. We also cover parallel computation using Hadoop, an open source MapReduce implementation that runs on a distributed file system. The goal is to help build an understanding of these important new trends and to provide opportunities to practice with them.
These submissions from industry and academia are designed to help teach distributed computing to students around the world.
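For readers new to the MapReduce model mentioned above, the following is a minimal single-process sketch of its three steps (map, group by key, reduce) applied to word counting. It does not use Hadoop or any Hadoop API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and chosen only for illustration. A real framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially in one process."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the quick fox", "The lazy dog"]))
```

Because each call to `map_phase` looks at only one document and each call to `reduce_phase` looks at only one key, both steps could in principle be distributed without changing the result, which is the property the courses listed below explore on real clusters.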
An introduction to advanced control flow with an emphasis on concurrency and writing concurrent programs at the programming-language level in C++. Programming techniques and styles are examined to express complex forms of control flow, such as exceptions, coroutines, and multiple forms of concurrency. Students will learn how to structure, implement, and debug complex control flow.
Three projects designed to familiarize students with developing client/server applications and dealing with issues of asynchronous communication and parallel programming.
The course covers a broad spectrum of topics encompassing system architecture, software abstractions, distributed algorithms, and issues pertaining to distributed environments such as security. Course topics include network communications, remote procedure calls, remote file systems, distributed agreement, clock synchronization, clustering, and a variety of security and system design topics.
During the summer of 2007, a week-long course in Cluster Computing and MapReduce was offered to interns working at Google. This submission contains the materials used in that class, along with video recordings of each lecture. This material builds on Introduction to Problem Solving on Large Scale Clusters, listed below.
The University of Washington ran an upper-division course on Distributed Computing with MapReduce in Spring 2007. This submission contains the materials used for the class: five lectures in PowerPoint format and four lab exercises designed to build a toolbox of distributed algorithms and data structures for the student. Students in the course completed the exercises on a cluster running Hadoop. This material builds on MapReduce in a Week, listed below.
This submission contains a complete set of lectures, programming assignments, and reading materials. It is designed to provide you with all the material you need in order to teach MapReduce as a section within a course on distributed systems.
Getting started with a distributed system environment can be challenging. To help with this, we've assembled a few tools and resources that can be useful to both students and educators.
Hadoop Virtual Image
This VMware image contains a preconfigured single-node instance of Hadoop. It provides the same interface as a full cluster without any of the overhead. It is suitable for educators exploring the platform and students working independently. The following Download and VMware Player links point to websites external to Google.
MapReduce Tools for Eclipse Plug-In
A robust plug-in that brings Hadoop support to the Eclipse platform. Features include server configuration, support for launching MapReduce jobs and browsing the distributed file system. The following Plug-in and Eclipse links point to websites external to Google.
Sample Datasets
The following links provide interesting data samples that are most efficiently manipulated using distributed systems techniques.
In this area, you will find a set of videotaped lectures from Google Video on various technology areas. These videos are great opportunities for students and faculty to hear directly from some of the current pioneers in high tech. They can also potentially serve as "guest lectures" for courses in these areas.
Presenter: Shiva Shivakumar - Google Distinguished Entrepreneur
Google deals with large amounts of data and millions of users. We'll take a behind-the-scenes look at some of the distributed systems and computing platforms that power Google's various products and make them scalable and reliable.
Presenter: Jeff Dean - Google Distinguished Engineer
In a talk at the University of Washington, Google's Jeff Dean discusses the Bigtable content storage system used in Google's backend.
Presenters: Martin Omander, Jason Huggins
Testing Distributed Systems with AJAX, XML - Lessons Learned from Google Checkout.