Entropy and learnability
Dan DeGenaro
Unigrams vs. bigrams
Unigrams vs. bigrams, distribution
Vocab size
Model type
Embedding size
Softmax normalization