LDA Analysis of Comments on Li Wenliang's Weibo

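[Figure: interactive LDA topic visualization with two panels]

Left panel: Intertopic Distance Map (via multidimensional scaling). Axes: PC1, PC2. Five topics (1 to 5) are shown; the size legend (Marginal topic distribution) is marked at 2%, 5%, and 10%.

Right panel: Top-30 Most Salient Terms (1). Bars show overall term frequency and estimated term frequency within the selected topic; the horizontal axis runs from 0 to 3,000. A slider adjusts the relevance metric (2), with λ ranging from 0.0 to 1.0.

Terms shown: 蜡烛 (candle), 晚安 (good night), 疫情 (epidemic), 世界 (world), 鲜花 (flowers), 感觉 (feeling), 工作 (work), 生活 (life), 开学 (start of school), 医生 (doctor), 女足 (women's football), 亮哥 (Brother Liang), 时间 (time), 家人 (family), 朋友 (friends), 有点 (a bit), 情人节 (Valentine's Day), 考研 (postgraduate entrance exam), 天堂 (heaven), 事情 (matters), 妈妈 (mom), 成绩 (grades), 日子 (days), 结果 (result), 我会 (I will), 月亮 (moon), 孩子 (child), 学校 (school), 和平 (peace), 社会 (society)

1. saliency(term w) = frequency(w) * [sum_t p(t | w) * log(p(t | w)/p(t))] for topics t; see Chuang et al. (2012)
2. relevance(term w | topic t) = λ * p(w | t) + (1 - λ) * p(w | t)/p(w); see Sievert & Shirley (2014)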
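The two footnote formulas (saliency and relevance) can be sketched in a few lines of Python. This is only an illustration of the formulas as they are printed on the chart, not the code that produced this page. The function names, the `use_log` flag, and the sample probabilities are all assumptions made for the example. Sievert & Shirley (2014) state relevance with logarithms, so `use_log=True` gives that variant.

```python
import math


def saliency(term_frequency, p_topic_given_term, p_topic):
    """Footnote 1: frequency(w) * sum_t p(t|w) * log(p(t|w) / p(t)).

    p_topic_given_term and p_topic are equal-length sequences over topics.
    """
    distinctiveness = sum(
        p_tw * math.log(p_tw / p_t)
        for p_tw, p_t in zip(p_topic_given_term, p_topic)
        if p_tw > 0  # a zero-probability topic contributes nothing
    )
    return term_frequency * distinctiveness


def relevance(p_term_given_topic, p_term, lam, use_log=False):
    """Footnote 2: lam * p(w|t) + (1 - lam) * p(w|t) / p(w).

    With use_log=True, both terms are log-transformed instead.
    """
    lift = p_term_given_topic / p_term
    if use_log:
        return lam * math.log(p_term_given_topic) + (1 - lam) * math.log(lift)
    return lam * p_term_given_topic + (1 - lam) * lift


if __name__ == "__main__":
    # Hypothetical numbers for one term across five topics (not from the page).
    p_t = [0.2, 0.2, 0.2, 0.2, 0.2]
    p_t_given_w = [0.7, 0.1, 0.1, 0.05, 0.05]
    print(saliency(1500, p_t_given_w, p_t))
    # Moving the slider changes lam and therefore the term ranking.
    for lam in (0.0, 0.5, 1.0):
        print(lam, relevance(0.04, 0.01, lam))
```

At λ = 1 a term is ranked purely by its probability within the topic; at λ = 0 it is ranked purely by its lift, p(w | t)/p(w), which favors terms that are rare overall but concentrated in the selected topic.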