Research on Sentiment Target Recognition in Weibo Based on Conditional Random Fields
Published: 2019-04-12 13:20
[Abstract]: In recent years social networks have grown rapidly, and more and more people use Weibo to exchange and share information. Weibo is popular because its posts are short, it is convenient to use, and content spreads quickly. Users readily share their opinions and experiences there, so Weibo contains a large volume of user comments that carry sentiment. As this volume grows, manual methods cannot keep up with processing and analyzing it. How to process and mine Weibo comment data with computational techniques has therefore become an active research problem, and sentiment target recognition is one effective way to address it.

This thesis studies sentiment target recognition in Chinese Weibo text. Recognizing sentiment targets in unstructured text is difficult in itself, and existing work has shortcomings. On the one hand, Weibo posts differ from conventional text: they are short, loosely written, and often not standard written Chinese, so existing basic Chinese text-processing tools do not work well on them, which makes sentiment target recognition harder. To address this, the thesis normalizes Weibo text and builds several dictionaries, including an Internet slang dictionary, an emoticon dictionary, a sentiment dictionary, and a negation-word dictionary. This improves word segmentation and dependency parsing of Weibo text by existing tools and makes feature extraction from context more effective. On the other hand, current methods can identify sentiment targets that appear explicitly in the text but struggle with implicit ones. When the sentiment target appears directly in the text, the thesis therefore combines a conditional random field (CRF) model with a classification model to recognize it. When the target does not appear in the text, the thesis abstracts the implied target and proposes an improved CRF model with hidden nodes to recognize hidden sentiment targets.

The core idea is to treat sentiment target recognition as a sequence labeling problem: a CRF model labels targets in Weibo text at the sentence level and combines multiple kinds of features to improve recognition accuracy. Experiments on a public evaluation dataset and a self-built dataset show that the model recognizes explicit sentiment targets in Weibo well and can also identify hidden sentiment targets.
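The sequence labeling view described in the abstract can be illustrated with a small sketch. This is not the thesis's model: the BIO tag set, the feature names, the toy sentiment lexicon, the part-of-speech input, and the hand-set weights are all assumptions chosen for illustration (a real CRF learns its weights from annotated data). It only shows how a linear-chain model scores tag sequences from lexicon-based features and decodes the best one with the Viterbi algorithm.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]   # (word, POS tag), assumed to come from an upstream segmenter/tagger
TAGS = ["B", "I", "O"]    # begin / inside / outside a sentiment target

# Hypothetical toy sentiment dictionary (a stand-in for the dictionaries the abstract mentions).
SENTIMENT_LEXICON = {"好看", "喜欢", "失望"}

# Hand-set weights for illustration only; a trained CRF would estimate these.
EMIT: Dict[Tuple[str, str], float] = {
    ("bias", "O"): 0.5, ("bias", "B"): -1.0, ("bias", "I"): -1.0,
    ("noun_before_sentiment", "B"): 2.5, ("noun_before_sentiment", "I"): 2.5,
    ("is_sentiment", "O"): 1.0,
}
TRANS: Dict[Tuple[str, str], float] = {
    ("B", "I"): 1.0, ("I", "I"): 0.5, ("B", "B"): -1.0,
    ("O", "I"): -10.0, ("<s>", "I"): -10.0,   # discourage I without a preceding B
}

def features(sent: List[Token], i: int) -> List[str]:
    """Context features for token i, using the lexicon and the POS tag."""
    word, pos = sent[i]
    feats = ["bias", f"pos={pos}"]
    if word in SENTIMENT_LEXICON:
        feats.append("is_sentiment")
    if pos == "n" and any(w in SENTIMENT_LEXICON for w, _ in sent[i + 1:]):
        feats.append("noun_before_sentiment")
    return feats

def viterbi(sent: List[Token]) -> List[str]:
    """Return the highest-scoring tag sequence under the linear-chain scores."""
    if not sent:
        return []
    def emit(i: int, tag: str) -> float:
        return sum(EMIT.get((f, tag), 0.0) for f in features(sent, i))
    score = {t: TRANS.get(("<s>", t), 0.0) + emit(0, t) for t in TAGS}
    back: List[Dict[str, str]] = []
    for i in range(1, len(sent)):
        new_score, pointers = {}, {}
        for t in TAGS:
            prev = max(TAGS, key=lambda p: score[p] + TRANS.get((p, t), 0.0))
            pointers[t] = prev
            new_score[t] = score[prev] + TRANS.get((prev, t), 0.0) + emit(i, t)
        score = new_score
        back.append(pointers)
    tag = max(TAGS, key=lambda t: score[t])
    path = [tag]
    for pointers in reversed(back):   # follow back-pointers from the last token
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]

def extract_targets(sent: List[Token], tags: List[str]) -> List[str]:
    """Join B/I-tagged words into target strings."""
    targets, current = [], []
    for (word, _), tag in zip(sent, tags):
        if tag == "B":
            if current:
                targets.append("".join(current))
            current = [word]
        elif tag == "I" and current:
            current.append(word)
        else:
            if current:
                targets.append("".join(current))
            current = []
    if current:
        targets.append("".join(current))
    return targets

sentence = [("这款", "r"), ("手机", "n"), ("屏幕", "n"), ("很", "d"), ("好看", "a")]
tags = viterbi(sentence)
print(tags, extract_targets(sentence, tags))
```

With these toy weights the two nouns that precede the sentiment word are labeled as one target span. The sketch covers only explicit targets; the hidden-node extension for implicit targets mentioned in the abstract is not modeled here.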
[Degree-granting institution]: Guangdong University of Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP393.092;TP391.1
Article ID: 2457053
Article link: http://www.wukwdryxk.cn/guanlilunwen/ydhl/2457053.html