Entropy and learnability

Dan DeGenaro


Unigrams vs. bigrams


Unigrams vs. bigrams, distribution


Vocabulary size


Model type


Embedding size


Softmax normalization