RDD Actions with examples
- kumarnitinkarn
- Nov 6, 2019
- 2 min read
A transformation creates a new RDD from one or more existing RDDs, but nothing is computed until we trigger an action. When we trigger an action, no new RDD is created; instead, the action produces a non-RDD value.
The result of an action is returned to the driver or written to an external storage system. Actions are what set the lazy evaluation of RDDs in motion. An action is one of the ways of sending data from the executors to the driver.
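The lazy-then-triggered behaviour can be imitated in plain Python, without Spark. The sketch below is only a model: a generator stands in for a transformation, and sum() stands in for an action. The names double and evaluated are made up for this illustration.

```python
# Plain-Python model of lazy evaluation (this is not Spark itself).
evaluated = []  # records which elements have actually been processed


def double(x):
    evaluated.append(x)
    return x * 2


data = [1, 2, 3, 4]

# "Transformation": building the generator does not call double() yet.
doubled = (double(x) for x in data)
work_before_action = list(evaluated)

# "Action": consuming the pipeline forces the computation and
# produces a plain value, not another lazy dataset.
total = sum(doubled)
work_after_action = list(evaluated)

print(work_before_action, work_after_action, total)
```

Nothing is processed until the value is actually requested, which is the same idea as an RDD lineage waiting for an action.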
count() :
count() returns the number of elements in the RDD.
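As a rough mental model (plain Python, not Spark), counting can be pictured as each partition counting its own elements and the driver adding up those small numbers. The partitions list below is made-up data standing in for an RDD.

```python
# Hypothetical data: an "RDD" modelled as a list of partitions.
partitions = [["a", "b"], ["c"], [], ["d", "e", "f"]]


def count(parts):
    # Each partition is counted independently (on the executors in Spark)...
    per_partition = [sum(1 for _ in p) for p in parts]
    # ...and only the small per-partition counts are summed on the driver.
    return sum(per_partition)


n = count(partitions)
print(n)
```

Only one number per partition travels to the driver, which is why count() is cheap on the driver side even for a large RDD.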
collect() :
collect() returns the entire content of the RDD to the driver program.
A common use of collect() is unit testing, where the RDD is small and the entire result is expected to fit in the driver's memory.
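The unit-testing pattern can be sketched in plain Python (a model, not Spark): transform a tiny dataset, gather everything into one list, and compare it with the expected result. The helper square_all is a made-up stand-in for whatever transformation is being tested.

```python
from itertools import chain

# Hypothetical data: an "RDD" modelled as a list of partitions.
partitions = [[1, 2], [3], [4, 5]]


def collect(parts):
    # Every element of every partition ends up in one driver-side list,
    # so the whole dataset has to fit in the driver's memory.
    return list(chain.from_iterable(parts))


def square_all(parts):
    # Stand-in for a transformation under test, applied per partition.
    return [[x * x for x in p] for p in parts]


# Typical unit-test shape: transform a tiny dataset, collect, compare.
result = collect(square_all(partitions))
print(result)
```

On a large RDD the same call would pull every element to the driver, so take(n) is usually the safer way to peek at data.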
take(n) :
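take(n) returns up to n elements from the RDD. It tries to access as few partitions as possible, so the result is simply the first elements it finds and is not a representative sample. Unless the RDD has been sorted, we should not rely on which elements are returned or on their order.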
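The "access as few partitions as possible" idea can be modelled in plain Python as below. This is a simplified sketch, not Spark's actual scanning strategy: it reads one partition at a time and stops as soon as n elements have been found.

```python
def take(parts, n):
    # Returns up to n elements plus the number of partitions scanned.
    taken, scanned = [], 0
    for p in parts:
        if len(taken) >= n:
            break  # enough elements; remaining partitions are never read
        scanned += 1
        taken.extend(p[: n - len(taken)])
    return taken, scanned


# Hypothetical data: an "RDD" modelled as a list of partitions.
partitions = [[10, 11, 12], [20, 21], [30]]

first_two, scanned_for_two = take(partitions, 2)
first_four, scanned_for_four = take(partitions, 4)
print(first_two, scanned_for_two)
print(first_four, scanned_for_four)
```

Because only the leading partitions are read, the result is skewed toward whatever happens to sit there, which is the bias described above.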
top(n)
If an ordering is defined for the elements of our RDD, we can extract the n largest elements using top(n). It uses the default ordering of the data unless a custom ordering is supplied.
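One way to picture this (a plain-Python model, not Spark's implementation) is that every partition keeps only its own largest candidates, and those candidates are then merged. The key parameter below illustrates a custom ordering; the sample data is made up.

```python
import heapq


def top(parts, n, key=None):
    # Each partition keeps only its own n largest elements...
    local = [heapq.nlargest(n, p, key=key) for p in parts]
    # ...and the candidates are merged into the overall n largest.
    merged = [x for candidates in local for x in candidates]
    return heapq.nlargest(n, merged, key=key)


# Hypothetical data: an "RDD" modelled as a list of partitions.
partitions = [[5, 1, 9], [7, 3], [8, 2, 6]]

# Default ordering: largest numbers first.
largest_three = top(partitions, 3)

# Custom ordering: "largest" now means shortest string.
shortest_two = top([["pear", "fig"], ["banana"]], 2, key=lambda s: -len(s))

print(largest_three, shortest_two)
```

The result comes back in descending order under whichever ordering is in effect.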
countByValue()
countByValue() returns how many times each element occurs in the RDD, as a map from each distinct value to its count.
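A plain-Python model of this (not Spark itself) counts values inside each partition and then merges the per-partition tallies into one dictionary. The partitions list is made-up sample data.

```python
from collections import Counter


def count_by_value(parts):
    total = Counter()
    for p in parts:
        # Count within one partition, then merge into the running total.
        total.update(Counter(p))
    return dict(total)


# Hypothetical data: an "RDD" modelled as a list of partitions.
partitions = [["a", "b", "a"], ["b", "c"], ["a"]]

counts = count_by_value(partitions)
print(counts)
```

The whole map ends up on the driver, so this suits RDDs with a modest number of distinct values.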
reduce()
The reduce() function takes a function that accepts two elements of the RDD and produces an output of the same type as the input elements. The simplest example of such a function is addition: we can add up the elements of an RDD, or count the number of words. The function passed to reduce() must be commutative and associative, because elements are combined within partitions and across partitions in no guaranteed order.
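Why the operation has to be commutative and associative can be seen in a small plain-Python model (not Spark itself): each partition is reduced locally and the partial results are then combined. The two layouts below hold the same made-up numbers split differently across partitions.

```python
from functools import reduce
from operator import add, sub


def reduce_partitions(parts, f):
    # Reduce each non-empty partition locally, then combine the partials.
    partials = [reduce(f, p) for p in parts if p]
    return reduce(f, partials)


# The same four elements, partitioned in two different ways.
layout_a = [[1, 2], [3, 4]]
layout_b = [[1], [2, 3, 4]]

# Addition is commutative and associative: the layout does not matter.
sum_a = reduce_partitions(layout_a, add)
sum_b = reduce_partitions(layout_b, add)

# Subtraction is neither: the answer changes with the partitioning.
diff_a = reduce_partitions(layout_a, sub)
diff_b = reduce_partitions(layout_b, sub)

print(sum_a, sum_b, diff_a, diff_b)
```

Since the partitioning of a real RDD is not something the function can control, an operation like subtraction would give unpredictable results.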
foreach()
foreach() is useful when we want to apply an operation to each element of the RDD without returning any value to the driver.
For example, inserting each record into a database.
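The shape of such a call can be sketched in plain Python (a model, not Spark itself). The database list and insert_record function are hypothetical stand-ins for a real external store and its write call.

```python
# Hypothetical in-memory stand-in for an external database.
database = []


def insert_record(record):
    # Side effect only: the record is written out and nothing is returned.
    database.append(record)


def foreach(parts, f):
    for p in parts:
        for x in p:
            f(x)
    # Deliberately no return value: nothing goes back to the caller.


# Hypothetical data: an "RDD" modelled as a list of partitions.
partitions = [[{"id": 1}], [{"id": 2}, {"id": 3}]]

result = foreach(partitions, insert_record)
print(result, len(database))
```

In real Spark the function runs on the executors, so appending to a driver-side list as done here would not work; the side effect has to target something the executors can reach, such as an external database.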
Happy Learning !!