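Thesis title (Chinese): 基於主題與時間序列模型之社群主題趨勢預測
Thesis title (English): Trend Forecasting for Social Topics Based on Topic and Time Series Model
University: National Taipei University of Technology (臺北科技大學)
College: College of Management
Department: Department of Information and Finance Management, Master's Program
Graduation academic year: 104 (ROC calendar; 2015–2016)
Graduation semester: Second semester
Name (Chinese): 許懷文
Name (English): Huai-Wen Hsu
Student ID: 103AB8008
Degree: Master's
Language: Chinese
Oral defense date: 2016/07/05
Advisor (Chinese name): 翁頌舜
Oral defense committee (Chinese names): 楊欣哲; 吳瑞堯; 陳育威; 翁頌舜
Keywords (Chinese): 趨勢預測; 主題模型; 時間序列模型; 社群媒體
Keywords (English): Trend Forecasting; Topic Model; Time Series Model; Social Media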
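Abstract (translated from Chinese): The rapid development of social media has led the public to take part, through social networking sites, in the trending issues that are emerging and being discussed around them; these sites are flooded with word-of-mouth messages and news articles. Identifying, within this large and noisy volume of social messages, the trends of the mainstream issues the public cares about, so as to seize market opportunities and adopt corresponding strategies, has become an important task for businesses and government organizations. Previous research on social topic detection focused on sentiment analysis of content. This study instead builds on the time-series behavior of user reviews, combining a topic detection model with a time series model to forecast social topic trends on the discussion boards of PTT (批踢踢實業坊), the bulletin board system at National Taiwan University. The results show that the approach can identify latent PTT social topics and their related keywords and can effectively forecast the trend of each topic. The model's average MAPE is 3.9%, indicating that the proposed method achieves good accuracy.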
Abstract (English): The rapid growth of social media has led people to take part, through social networks, in the popular topics being discussed around them. Large amounts of word-of-mouth content and news events have flooded social media. Recognizing the trends of the main topics that people care about within this huge and miscellaneous stream of social messages, grasping business opportunities, and adopting appropriate strategies have become an important task for businesses and for governmental and non-governmental organizations.
Previous research on social topic detection has focused on sentiment analysis of content. This study integrates a topic detection model and a time series model to forecast trends of social topics based on time series data of user reviews. Experimental results on a real dataset show that the approach can recognize latent social topics and keywords on PTT and can forecast the trend of each topic effectively. The average MAPE of the model is 3.9%, which indicates that the approach achieves good accuracy.
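To make the accuracy figure in the abstract concrete, the following minimal sketch shows how a mean absolute percentage error (MAPE) is commonly computed. The function name `mape` and the sample values are illustrative assumptions and are not taken from the thesis, which may compute its score over different quantities.

```python
def mape(actual, forecast):
    """Return the mean absolute percentage error, expressed in percent.

    Standard definition: 100/n * sum(|(a_t - f_t) / a_t|).
    This is a generic sketch, not the thesis's own implementation.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        # MAPE is undefined when an actual value is zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical observed and predicted values, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [95.0, 210.0, 400.0]
score = mape(actual, forecast)
print(f"MAPE = {score:.2f}%")
```

Under this definition a lower value means a closer fit, so a score of a few percent means the forecasts deviate from the observed values by only a few percent on average.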
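Table of contents: Abstract (Chinese)
ABSTRACT
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Chapter Overview and Research Process
Chapter 2 Literature Review
2.1 Social Media
2.1.1 PTT (批踢踢實業坊)
2.1.2 Social Media Marketing
2.1.3 Social Word-of-Mouth and Social Mining
2.2 Topic Detection
2.2.1 Keyword-based Topic Detection
2.2.2 Clustering-based Topic Detection
2.2.3 Topic Model-based Topic Detection
2.3 Topic Popularity and Trend Forecasting
2.3.1 Statistical Model-based Popular Topic and Trend Forecasting
2.3.2 Machine Learning-based Popular Topic and Trend Forecasting
Chapter 3 Research Method
3.1 Research Framework
3.2 Data Collection
3.3 Data Preprocessing
3.3.1 Word Segmentation
3.3.2 Stop Word Filtering
3.4 Topic Discovery and Detection
3.5 Topic Popularity and Trend Forecasting
3.5.1 Popularity State Transitions and Output Parameters
3.5.2 Topic Popularity Forecasting Model
Chapter 4 Experimental Results, Analysis, and Discussion
4.1 Experimental Environment
4.2 Experimental Design
4.3 Experimental Dataset and Preprocessing
4.4 Evaluation Method
4.5 Experimental Results and Discussion
4.5.1 Differences in the Topic Model under Different Numbers of Topics
4.5.2 Comparison of Article Score Prediction across Comment Sequences of Different Time Interval Lengths
4.5.3 Comparison of Article Score Prediction across Different Training Data Proportions
4.5.4 Comparison of Actual Cases and Average Prediction Results per Topic
Chapter 5 Conclusion
5.1 Research Conclusions and Contributions
5.2 Research Limitations and Future Work
References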
Full-text access permission: Authorization not granted