Stanford Topic Modeling Toolbox

Last quarter, I started learning how to use topic modeling to analyze open-ended data (or unstructured data, in data science terms). Basically, this is data where participants can say anything they want in response to a prompt I give them.

I’ll write another post about what I did back then, but first I wanted to note something new I learned about topic modeling today!

Today I wanted to revisit my data with topic modeling and, in the process, learn more about topic modeling itself. So far I’ve only used it in a narrow sense: how to run it in R. Today I wanted to take a step back and understand what topic modeling is doing theoretically.

One of the first search results for “topic modeling” was actually a page from Stanford! So of course I had to check it out.

It’s really useful! They use Scala, so I want to see if I can take their advice and apply it in R (or Python?). But I really liked their explanations and tutorials. They discuss how to test the model and how to modify the parameters to create a better model. It’s neat! I want to try this tomorrow!

I realize now this wasn’t a very useful blog post, so let’s think of it as being a post-it note for useful information that will be revisited in the near future.

Okay fineee, here’s the real reason I wanted to make this post: to discuss this point.

Model Convergence

When I first read this point, I thought it was about refining the number of topics, but now I realize it’s not about topics at all. It’s about how many training iterations the model runs over the data. So I was super excited by this piece of advice, but now my excitement level has dropped about two notches. Still useful, but not what I was looking for.
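
To check that I understand the iterations idea, here’s a tiny Python sketch. It’s my own illustration, not code from the Stanford tutorial: the function name `converged_at`, the tolerance, and the log-likelihood numbers are all made up. The assumption is that the model reports some score (like log-likelihood) after each training iteration, and we call it “converged” once that score barely changes from one iteration to the next.

```python
def converged_at(log_likelihoods, tol=1e-3):
    """Return the 1-based iteration where the relative change in the
    score first drops below `tol`, or None if that never happens.

    `log_likelihoods` is a hypothetical per-iteration trace; a real
    topic modeling library would report this in its own way.
    """
    for i in range(1, len(log_likelihoods)):
        prev, curr = log_likelihoods[i - 1], log_likelihoods[i]
        denom = abs(prev) or 1.0  # avoid dividing by zero
        rel_change = abs(curr - prev) / denom
        if rel_change < tol:
            return i + 1  # convert the 0-based index to an iteration number
    return None


# Made-up scores, one per training iteration (illustration only).
trace = [-1000.0, -900.0, -880.0, -879.5, -879.4]
print(converged_at(trace))
```

If the function returns None, the score was still moving at the end of the trace, which would suggest running more iterations. Notice that the number of topics never appears here, which is exactly the distinction I missed the first time.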

Later on, however, the tutorial does mention refining the number of topics, but the sentence gets cut off:

Number of topics

“…has started to decrease at a” …a what?! a what?! Don’t leave me hanging here…

Sorry folks, you’ll just have to be left hanging! If I find out more info, I will update this post or make a new one. Until next time!


