Platform Comparison Checklist

A catalog of the attributes, qualities, and features supported by the listed platforms.

Note
This is a work in progress and likely contains errors.

See Notes on Cluster Computing for definitions.

Usage

Please feel free to open an issue or a pull request to contribute corrections or add new platforms.

Platforms should be listed in alphabetical order by project name.

Some attributes are independent and others are mutually exclusive; the first row of each table, labeled -, shows which applies.

  • O denotes mutually exclusive properties; a platform supports only one

  • X denotes independent properties; a platform may support one or more

  • NA denotes not applicable
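
The marker rules above can be sketched as a small consistency check for contributors adding a platform row. This is only an illustration, not part of the project: the function name `row_is_consistent`, the use of an empty string for a blank cell, and the reading that a row in an O table marks exactly one column are all assumptions.

```python
def row_is_consistent(kind, cells):
    """Check one platform row against the marker in the table's "-" row.

    kind  -- "O" (mutually exclusive columns) or "X" (independent columns)
    cells -- one entry per column: "O", "X", "NA", or "" for a blank cell
    """
    if kind not in ("O", "X"):
        raise ValueError(f"unknown legend marker: {kind!r}")

    # Ignore footnote asterisks such as "O*".
    marks = [c.rstrip("*") for c in cells]

    # A row made up entirely of NA means the attribute does not apply.
    if marks and all(m == "NA" for m in marks):
        return True

    # Any other marker (for example an "X" in an "O" table) is inconsistent.
    if any(m not in (kind, "") for m in marks):
        return False

    marked = sum(1 for m in marks if m == kind)
    if kind == "O":
        return marked == 1   # assumption: exactly one exclusive choice
    return marked >= 1       # one or more independent properties
```

For example, with hypothetical rows, `row_is_consistent("O", ["O", ""])` accepts a single exclusive choice, while `row_is_consistent("O", ["O", "O"])` rejects two.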

Comparison Tables

Model Design Attributes

  • MapReduce - Apache Hadoop 2.x MapReduce implementation

  • Tez - Apache Tez implementation

Table 1. Step Work Parallelism
Platform Sequential Parallel

-

O

O

MapReduce

O

Tez

O

Table 2. Node Work Parallelism
Platform Sequential Parallel

-

O

O

MapReduce

NA

NA

Tez

O

Table 3. Node Data Parallelism
Platform    Split   Partitioned
-           X       X
MapReduce   X       X
Tez         X       X

Table 4. Node Topology
Platform Two Node Directed In-Tree Directed Acyclic Graph Directed Acyclic MultiGraph

-

O

O

O

O

MapReduce

O

Tez

O

Table 5. Node Data Routing
Platform Forward Broadcast Ordered Scatter-Gather Unordered Scatter-Gather

-

X

X

X

X

MapReduce

X

Tez

X

X

X

X

Table 6. Intermediate Result Availability
Platform Unavailable Available

-

O

O

MapReduce

NA

NA

Tez

O

Table 7. Staging
Platform Simultaneous Incremental

-

X

X

MapReduce

X

Tez

X

Table 8. Mode
Platform Offline Online

-

X

X

MapReduce

X

Tez

X

Implementation Features

Table 9. Speculative Execution
Platform Unavailable Available

-

O

O

MapReduce

O

Tez

O

Table 10. Data Routing Service
Platform Unavailable Available

-

O

O

MapReduce

O*

Tez

O

Note
* Experimental.

Table 11. Routed Data Durability
Platform Shared Filesystem Local Filesystem On-Heap Memory Off-Heap Memory

-

X

X

X

X

MapReduce

X

Tez

X

Table 12. Intermediate Result Durability
Platform    Shared Filesystem   Local Filesystem   On-Heap Memory   Off-Heap Memory
-           X                   X                  X                X
MapReduce   NA                  NA                 NA               NA
Tez         NA                  NA                 NA               NA