Resilient Distributed Dataset

Glossary Page

An abstract distributed memory model that enables programmers to perform in-memory computations on extensive clusters while maintaining fault tolerance. RDDs address the inefficiencies of current computing frameworks when dealing with iterative algorithms and interactive data mining tools. By keeping data in memory, RDDs significantly enhance performance. To ensure efficient fault tolerance, RDDs utilize a restricted form of shared memory that relies on coarse-grained transformations rather than fine-grained updates to shared state.

http://dl.acm.org/citation.cfm?id=2228301 external-link

Latest Webinars

Latest Articles