fix idf calculation with documentsCount
Apparently the idf value is calculated incorrectly because the global documents count is not taken into account.
TODO
- Count all documents in advance
- Share the count via a broadcast variable
- Add documentsCount property to TFIDF.java and use this value instead of 2
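
A minimal sketch of what the last item could look like, under some assumptions: the class name `IdfSketch` and its methods are hypothetical and are not the actual contents of TFIDF.java, and the idf formula is assumed to be the plain `log(documentsCount / documentFrequency)` variant. The count is passed in through the constructor, standing in for the value that would be counted in advance and broadcast.

```java
// Hypothetical illustration only; not the real TFIDF.java.
final class IdfSketch {

    // Total number of documents in the corpus, counted in advance
    // (assumed to arrive here from the broadcast value).
    private final long documentsCount;

    IdfSketch(long documentsCount) {
        if (documentsCount <= 0) {
            throw new IllegalArgumentException("documentsCount must be positive");
        }
        this.documentsCount = documentsCount;
    }

    // idf = ln(documentsCount / documentFrequency), using the global count
    // rather than a fixed constant such as 2.
    double idf(long documentFrequency) {
        if (documentFrequency <= 0 || documentFrequency > documentsCount) {
            throw new IllegalArgumentException(
                "documentFrequency must be in [1, documentsCount]");
        }
        return Math.log((double) documentsCount / documentFrequency);
    }

    // tf-idf for a term that occurs termFrequency times in one document
    // and appears in documentFrequency documents overall.
    double tfIdf(long termFrequency, long documentFrequency) {
        return termFrequency * idf(documentFrequency);
    }
}
```

With this shape, a term that appears in every document gets an idf of 0 regardless of corpus size, which a fixed constant in place of documentsCount cannot guarantee.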