Improve checking and reporting of invalid documents.
When unparsable documents are added, the tagger does not fail on them, but the processed JSON result has total_words = 0, which causes NaNs in the analysis. Currently these files pass with the bad data. We should add a check for this kind of bad data and set the status to error.
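
A minimal sketch of what the check could look like, assuming the processed result is available as a Python dict. Only the `total_words` field comes from this ticket; the function name `check_processed_result` and the `"ok"` / `"error"` status strings are hypothetical and would need to match whatever the pipeline actually uses.

```python
from typing import Any, Dict, Optional, Tuple


def check_processed_result(result: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Return (status, reason) for a processed tagger result.

    Hypothetical helper: the status strings are placeholders,
    not the pipeline's real values.
    """
    total_words = result.get("total_words")

    # A missing or non-integer count means the result is malformed.
    # bool is excluded explicitly because it is a subclass of int.
    if isinstance(total_words, bool) or not isinstance(total_words, int):
        return "error", "total_words is missing or not an integer"

    # A zero (or negative) count is the symptom described above:
    # dividing by it later produces NaN in the analysis.
    if total_words <= 0:
        return "error", "total_words is 0; document was probably unparsable"

    return "ok", None
```

Running this right after the tagger step, before the result is handed to the analysis, would let the status be set to error with a readable reason instead of letting NaN values propagate.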