Persist Spark DataFrame/RDD

This node persists (caches) the incoming Spark DataFrame/RDD using the specified storage level. The available storage levels are described in detail in the Spark documentation.

Caching a Spark DataFrame/RDD can speed up operations that access the same DataFrame/RDD several times, e.g. when it is used within a loop body in a KNIME workflow.
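
Why repeated access benefits from persisting can be shown with a small toy model. The sketch below is plain Python and does not use Spark or KNIME: the class LazyDataset, its evaluations counter, and the function expensive_rows are hypothetical names made up for illustration. It only assumes that, as in Spark, a lazily evaluated dataset is recomputed on every action unless it has been persisted, and that persisting takes effect when the first action runs.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated DataFrame/RDD (not the Spark API)."""

    def __init__(self, compute):
        self._compute = compute   # function that rebuilds the data from scratch
        self._persisted = False
        self._cache = None
        self.evaluations = 0      # number of times the data was recomputed

    def persist(self):
        # Only marks the dataset; the data is stored on the first action.
        self._persisted = True
        return self

    def unpersist(self):
        # Drop the stored data and go back to recomputing on every action.
        self._persisted = False
        self._cache = None
        return self

    def collect(self):
        # An "action": return the cached data if present, else recompute.
        if self._persisted and self._cache is not None:
            return self._cache
        self.evaluations += 1
        data = self._compute()
        if self._persisted:
            self._cache = data
        return data


def expensive_rows():
    # Placeholder for a costly upstream computation.
    return [x * x for x in range(5)]


uncached = LazyDataset(expensive_rows)
cached = LazyDataset(expensive_rows).persist()

# Simulates a loop body that reads the same data in every iteration.
for _ in range(3):
    uncached.collect()
    cached.collect()

print(uncached.evaluations, cached.evaluations)  # prints "3 1"
```

In this model the unpersisted dataset is rebuilt in each of the three iterations, while the persisted one is built once and then served from the cache. The actual savings in a Spark job depend on the chosen storage level and on how expensive the upstream computation is.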

Node details

Input ports

Type: Spark Data

Spark DataFrame/RDD

Spark DataFrame/RDD to persist.

Output ports

Type: Spark Data

Persisted Spark DataFrame/RDD

The persisted Spark DataFrame/RDD.

Extension

The Persist Spark DataFrame/RDD node is part of this extension:
