This node creates a global set of terms over all documents. Optionally, the set can be restricted to the top-k terms ranked by frequency. Three frequency measures are available for filtering: the term frequency, the document frequency, and the inverse document frequency.
- Term Frequency (TF): Overall count of a term in all documents.
- Document Frequency (DF): Number of documents in which a term occurs.
- Inverse Document Frequency (IDF): The logarithm of the total number of documents divided by the DF.
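
The three measures can be illustrated with a minimal Python sketch. It is not the node's implementation. It assumes documents are already tokenized into lists of terms, that IDF uses the natural logarithm (the base is not stated above), and that ties are broken alphabetically. The function names `term_statistics` and `top_k_terms` are hypothetical.

```python
import math
from collections import Counter


def term_statistics(documents):
    """Compute TF, DF and IDF for every term in a list of tokenized documents."""
    tf = Counter()  # overall count of each term in all documents
    df = Counter()  # number of documents in which each term occurs
    for tokens in documents:
        tf.update(tokens)
        df.update(set(tokens))  # count each term at most once per document
    n_docs = len(documents)
    # Assumption: natural logarithm; the actual base may differ.
    idf = {term: math.log(n_docs / df[term]) for term in df}
    return tf, df, idf


def top_k_terms(scores, k):
    """Return the k highest-scoring terms (ties broken alphabetically, an assumption)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


docs = [
    ["data", "mining", "data"],
    ["text", "mining"],
    ["data", "text", "analysis"],
]
tf, df, idf = term_statistics(docs)
print(top_k_terms(tf, 2))   # terms with the highest TF
print(top_k_terms(idf, 1))  # term with the highest IDF
```

In this toy corpus, "data" occurs three times in total (TF = 3) but in only two documents (DF = 2), while "analysis" occurs in a single document and therefore has the highest IDF.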