- Topic Documents
- Topic Correlations
- Time Series
- Vocabulary
- Downloads
- Upload
Documents are sorted by their proportion of the currently selected topic, biased to prefer longer documents.
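One way to get a length-biased ranking like this is additive smoothing: add a constant pseudo-count to every topic before computing the proportion, which pulls short documents toward the uniform value. The sketch below only illustrates that idea and is not necessarily how jsLDA scores documents. The function name `rank_documents`, the `topic_counts` field, and the `smoothing` default are all hypothetical.

```python
def rank_documents(docs, topic, num_topics, smoothing=10.0):
    """Rank documents by their smoothed proportion of one topic.

    docs: list of dicts, each with a 'topic_counts' list holding the number
    of tokens assigned to each topic (a hypothetical layout for this sketch).
    smoothing: pseudo-count added per topic; larger values favor long documents.
    """
    def score(doc):
        length = sum(doc["topic_counts"])
        # Short documents are pulled toward 1 / num_topics by the pseudo-counts.
        return (doc["topic_counts"][topic] + smoothing) / (length + smoothing * num_topics)

    return sorted(docs, key=score, reverse=True)
```

With a two-token document entirely in the selected topic and a 100-token document that is 90% in it, the smoothed score puts the longer document first even though its raw proportion is lower.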
Words occurring in only one topic have specificity 1.0; words evenly distributed among all topics have specificity 0.0.
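A measure with exactly these two endpoints is one minus the normalized entropy of a word's distribution over topics. The sketch below assumes that definition for illustration; the function name `topic_specificity` and the per-topic count list are hypothetical, and the treatment of unseen words is a choice made only for this sketch.

```python
import math

def topic_specificity(topic_counts):
    """Return 1 - H(p) / log(K) for a word's per-topic counts.

    topic_counts: how many times the word was assigned to each of K topics.
    Gives 1.0 when all occurrences fall in one topic and 0.0 when they are
    spread evenly over all topics.
    """
    total = sum(topic_counts)
    num_topics = len(topic_counts)
    if total == 0 or num_topics < 2:
        return 0.0  # assumption: treat unseen words or a single topic as unspecific
    entropy = -sum((c / total) * math.log(c / total) for c in topic_counts if c > 0)
    return 1.0 - entropy / math.log(num_topics)
```

A word split unevenly across topics, such as counts of `[9, 1, 0]`, lands strictly between the two extremes.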
View options
Word | Frequency | Topic Specificity | Stoplist |
Documents are grouped by their "date" field (the second column in the input file). These plots show the average document proportion of each topic at each date value. Date values are not parsed; they are simply kept in the order in which they appear in the input file.
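The grouping and averaging can be sketched as follows. This is an illustration only: the function name `topic_time_series` and the `(date, proportions)` input layout are hypothetical, and it reads the ordering rule as order of first appearance, with dates compared as raw strings.

```python
def topic_time_series(docs, num_topics):
    """Average per-topic proportions for each distinct date string.

    docs: iterable of (date, proportions) pairs, where proportions is a
    list of num_topics floats for one document.
    Returns a list of (date, averages) in order of first appearance.
    """
    sums = {}    # date -> running per-topic totals (dicts keep insertion order)
    counts = {}  # date -> number of documents seen for that date
    for date, proportions in docs:
        if date not in sums:
            sums[date] = [0.0] * num_topics
            counts[date] = 0
        for t in range(num_topics):
            sums[date][t] += proportions[t]
        counts[date] += 1
    return [(date, [s / counts[date] for s in sums[date]]) for date in sums]
```

Because dates are never parsed, "1900" listed before "1850" in the input stays before it in the output.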
Topics that occur together more than expected are blue; topics that occur together less than expected are red.
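"More than expected" can be made concrete with pointwise mutual information (PMI): compare how often two topics appear in the same document with how often they would if they were independent. The sketch below assumes that measure, and it also assumes a topic counts as present in a document when its proportion exceeds a threshold. The function name `topic_pmi` and the `threshold` value are hypothetical, and jsLDA's own calculation may differ.

```python
import math

def topic_pmi(doc_topic_proportions, a, b, threshold=0.05):
    """PMI of topics a and b over a list of per-document topic proportions.

    Positive: the topics co-occur more than independence predicts (blue).
    Negative: they co-occur less than independence predicts (red).
    Returns None when either topic, or the pair, never appears.
    """
    n = len(doc_topic_proportions)
    count_a = sum(1 for p in doc_topic_proportions if p[a] > threshold)
    count_b = sum(1 for p in doc_topic_proportions if p[b] > threshold)
    count_ab = sum(1 for p in doc_topic_proportions
                   if p[a] > threshold and p[b] > threshold)
    if count_a == 0 or count_b == 0 or count_ab == 0:
        return None  # assumption: leave undefined pairs uncolored
    # log( P(a, b) / (P(a) * P(b)) ), written with raw counts
    return math.log(count_ab * n / (count_a * count_b))
```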
Upload your own training text and stoplist.
The documents and stoplist files have their own specific formats:
Documents file. This is a tab-delimited file with one document per line. Each line has three fields in the following format:
[doc ID] [tab] [label] [tab] [text...]
Stopwords file. Each word gets its own line. The vocabulary tab may be useful for creating a new list of stopwords.
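Both formats are simple enough to check before uploading. The sketch below reads them as described above; the function names `parse_documents` and `parse_stoplist` are hypothetical, and skipping lines that lack three tab-separated fields is an assumption of this sketch rather than documented jsLDA behavior.

```python
def parse_documents(text):
    """Parse lines of the form: doc ID <tab> label <tab> text."""
    docs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Split on the first two tabs only, so the text field may contain tabs.
        fields = line.split("\t", 2)
        if len(fields) != 3:
            continue  # assumption: skip malformed lines instead of failing
        doc_id, label, body = fields
        docs.append({"id": doc_id, "label": label, "text": body})
    return docs


def parse_stoplist(text):
    """Parse a stopwords file with one word per line into a set."""
    return {line.strip() for line in text.splitlines() if line.strip()}
```

Running `parse_documents` over a file and comparing the number of parsed documents with the number of non-blank lines is a quick way to spot rows with missing fields.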
For more information, refer to jsLDA's README.
Each file is in comma-separated format.