Research and Implementation of Evaluation Object Phrase Recognition in Sentiment Analysis
[Abstract]: In recent years, with the rapid development of the mobile Internet, Weibo has risen quickly as a new social media platform, and its users generate a huge volume of social data every day. As a main carrier of mobile social networking, Weibo is rich in content and its data is highly valuable. Mining Weibo data and analyzing its sentiment can provide an important reference for government public-opinion monitoring, enterprise advertising, user behavior prediction, and decision-making. Sentiment analysis of Weibo posts consists mainly of two tasks: evaluation object (opinion target) phrase recognition and sentiment orientation analysis. Because Weibo content is scattered, identifying the evaluation object of a post has become both a hot topic and a difficult problem in Weibo sentiment analysis. Research shows that the difficulty of recognizing out-of-vocabulary (unregistered) words is one of the important factors behind the low recognition rate of Chinese evaluation object phrases. It is therefore meaningful to study methods for extracting Weibo evaluation object phrases based on out-of-vocabulary word recognition. This thesis designs the out-of-vocabulary word recognition model from three aspects (feature selection, classifier selection, and feature template selection) in order to improve the recognition rate, and then applies the algorithm to evaluation object phrase recognition. The effectiveness of the approach is verified on a real Weibo corpus. The main work of this thesis is as follows:
1. Statistical features based on the text's word sequence, cohesion, and left and right degrees of freedom are proposed as the features for out-of-vocabulary word recognition. Five classification algorithms (naive Bayes, decision tree, logistic regression, support vector machine (SVM), and artificial neural network (ANN)) are then used to identify out-of-vocabulary words, and their recognition results are compared. The artificial neural network, which recognizes out-of-vocabulary words well, is selected as the decision model.
2. The three BIO labels are introduced, and conditional random fields (CRFs) are used to recast evaluation object phrase recognition as a sequence labeling problem. When identifying evaluation object phrases, an appropriate feature template is selected, and the out-of-vocabulary words produced by the trained artificial neural network are fed into the recognition process.
3. One day of Sina Weibo data is used as the data source. After manual annotation, evaluation object phrase recognition experiments are carried out. The results show that adding the automatically recognized out-of-vocabulary words from Weibo text to the CRF-based evaluation object phrase recognition algorithm significantly improves the accuracy and recall of evaluation object phrase extraction.
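As a rough illustration of two ideas mentioned in the abstract, the Python sketch below computes cohesion and left/right degrees of freedom for character bigram candidates, and converts annotated target spans into BIO labels. The abstract does not give formulas, so the definitions here are assumptions: cohesion is taken as the minimum pointwise mutual information over split points, and the degrees of freedom as the entropy of neighbouring characters. The function names `entropy`, `candidate_features`, and `bio_tags` and the sample sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(counter):
    """Shannon entropy (in nats) of a Counter of neighbouring characters."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counter.values())


def candidate_features(sentences, n=2):
    """Map each character n-gram (n >= 2) to (cohesion, left_freedom, right_freedom).

    Assumed definitions, not taken from the thesis:
      cohesion        = min over split points of log(p(w) / (p(left) * p(right)))
      left/right freedom = entropy of the characters adjacent to the candidate
    Probabilities are approximated as count / total number of characters.
    """
    counts = Counter()              # counts of all 1..n-grams
    left = defaultdict(Counter)     # characters seen just before each n-gram
    right = defaultdict(Counter)    # characters seen just after each n-gram
    total_chars = 0
    for s in sentences:
        total_chars += len(s)
        for k in range(1, n + 1):
            for i in range(len(s) - k + 1):
                gram = s[i:i + k]
                counts[gram] += 1
                if k == n:
                    if i > 0:
                        left[gram][s[i - 1]] += 1
                    if i + k < len(s):
                        right[gram][s[i + k]] += 1
    features = {}
    for gram, c in counts.items():
        if len(gram) != n:
            continue
        p = c / total_chars
        cohesion = min(
            math.log(p / ((counts[gram[:j]] / total_chars)
                          * (counts[gram[j:]] / total_chars)))
            for j in range(1, n)
        )
        features[gram] = (cohesion, entropy(left[gram]), entropy(right[gram]))
    return features


def bio_tags(tokens, target_spans):
    """Label tokens B/I/O given (start, end) evaluation-object spans, end exclusive."""
    tags = ["O"] * len(tokens)
    for start, end in target_spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


# Hypothetical toy corpus; real input would be segmented Weibo posts.
sentences = ["我喜欢给力的手机", "这个手机很给力", "给力的屏幕"]
features = candidate_features(sentences)
tags = bio_tags(["这个", "手机", "屏幕", "很", "给力"], [(1, 3)])
```

In a pipeline like the one described, feature tuples of this kind would be the input to a classifier that decides whether a candidate is an out-of-vocabulary word, and the BIO label sequences would be the training targets for a CRF.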
[Degree-granting institution]: Donghua University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1; TP18