


The MapreducePipeline Class

Experimental!

Mapreduce is an experimental, innovative, and rapidly changing new feature for Google App Engine. Unfortunately, being on the bleeding edge means that we may make backwards-incompatible changes to Mapreduce. We will inform the community when this feature is no longer experimental.


A Pipeline used for Mapreduce jobs.

MapreducePipeline is provided by the google.appengine.ext.mapreduce module.

  1. Introduction
  2. Constructor
  3. Instance methods

Introduction

The MapreducePipeline class wires together all the steps needed to perform a specific Mapreduce job. It specifies the mapper, the reducer, the input reader, the optional output writer, and the related parameters used to carry out the job.

The pipeline returns the filenames produced by the output writer.
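
As a rough illustration of the two functions that a MapreducePipeline connects, the sketch below shows a word-count mapper and reducer written as plain Python generators. The function names are hypothetical, and the assumption that the mapper receives one line of text while the reducer receives a key together with its list of values is illustrative rather than taken from this page; the actual input depends on the input reader you choose. The run_locally helper only simulates the grouping step in memory so that the example can run without App Engine.

```python
import collections


def word_count_map(line):
    """Hypothetical mapper: emit a (word, "1") pair for every word in a line."""
    for word in line.lower().split():
        yield (word, "1")


def word_count_reduce(key, values):
    """Hypothetical reducer: emit one output line holding the count for a word."""
    yield "%s: %d\n" % (key, len(values))


def run_locally(lines):
    """Simulate map, grouping, and reduce in memory (not how App Engine runs a job)."""
    grouped = collections.defaultdict(list)
    for line in lines:
        for key, value in word_count_map(line):
            grouped[key].append(value)
    output = []
    for key in sorted(grouped):
        output.extend(word_count_reduce(key, grouped[key]))
    return output
```

In a real job the grouping between the two phases is handled by the Mapreduce framework; only the mapper and reducer would be named in mapper_spec and reducer_spec.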

Constructor

class MapreducePipeline(job_name, mapper_spec, reducer_spec, input_reader_spec, output_writer_spec=None, mapper_params=None, reducer_params=None, shards=None)
The MapreducePipeline constructor's arguments fully specify the Mapreduce job.

Arguments

job_name
The name of the Mapreduce job. This name shows up in the logs and in the UI.
mapper_spec
The name of the mapper used in this Mapreduce job. The mapper processes the input, line by line, from the input reader specified in the input_reader_spec argument.
reducer_spec
The name of the reducer used in this Mapreduce job. The reducer performs work and yields results, using the optional output writer specified in the output_writer_spec argument.
input_reader_spec
The name of the input reader that supplies the mapper with input for this Mapreduce job. The mapper processes this input line by line.
output_writer_spec
The name of the output writer (if any) used to store results from this Mapreduce job.
mapper_params
Parameters to use in the input reader.
reducer_params
Parameters to use in the output writer.
shards
Number of shards to use for this Mapreduce job.
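
The following sketch shows one way the constructor arguments above might be filled in for a word-count job. It is a minimal example under stated assumptions: the handler names (main.word_count_map, main.word_count_reduce), the reader and writer spec strings, and the parameter keys (blob_key, mime_type) are illustrative and not confirmed by this page. The MapreducePipeline class is passed in as pipeline_cls so that the sketch does not assume a particular import path.

```python
def build_pipeline(pipeline_cls, blob_key, shards=16):
    """Construct a word-count job from the constructor arguments described above.

    pipeline_cls is expected to be the MapreducePipeline class, imported
    however your application provides it.
    """
    return pipeline_cls(
        "word_count",              # job_name: shown in the logs and in the UI
        "main.word_count_map",     # mapper_spec (hypothetical handler name)
        "main.word_count_reduce",  # reducer_spec (hypothetical handler name)
        # input_reader_spec (assumed reader name)
        "mapreduce.input_readers.BlobstoreZipInputReader",
        # output_writer_spec (assumed writer name)
        output_writer_spec="mapreduce.output_writers.BlobstoreOutputWriter",
        mapper_params={"blob_key": blob_key},        # assumed key for the reader
        reducer_params={"mime_type": "text/plain"},  # assumed key for the writer
        shards=shards,
    )
```

Passing the optional arguments by keyword keeps the call readable and makes it obvious which values may be omitted.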

Instance Methods

A MapreducePipeline instance has the following methods:

start ()
Starts the Mapreduce job.
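
A short sketch of starting an already constructed job and building a link to its status page follows. The base_path and pipeline_id attributes are assumed to come from the underlying Pipeline library and are not described on this page, so treat the URL format as an assumption to verify against your installed version.

```python
def start_and_get_status_url(pipeline):
    """Start an already constructed job and return a URL for its status page."""
    pipeline.start()
    # base_path and pipeline_id are assumed attributes of the started pipeline.
    return "%s/status?root=%s" % (pipeline.base_path, pipeline.pipeline_id)
```

A request handler could redirect the user to the returned URL after launching the job.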
