Research on Short Text Classification and Information Extraction Based on Deep Learning

Published: 2018-03-14 03:34

Topic: deep learning　Focus: information extraction　Source: Zhengzhou University, 2017 master's thesis　Document type: degree thesis

[Abstract]: The development of the Internet and the explosive growth of online information bring people more comprehensive and timely information, but they also make it harder for users to find the information they need quickly and precisely. Information extraction can retrieve more accurate and concise information from massive amounts of data and return it to users, better meeting their needs. Text classification narrows the selection space for information extraction and allows different strategies to be set for different information types, which makes it an indispensable preliminary step for information extraction. At present, a thorough understanding of the syntax and semantics of natural language is the key to both text classification and information extraction. Manually extracting syntactic and semantic features of natural language is difficult and rather subjective, whereas deep learning can learn features by itself, which makes it a feasible approach to natural language understanding. Using the idea of deep learning, the syntactic and semantic features of text can be learned automatically, and from them the deep features of the information to be extracted. This reduces the difficulty of designing hand-crafted features and is more objective. For the text classification and information extraction problems, this thesis draws on the strengths of the CNN model, the LSTM model, and traditional syntax trees to build deep neural network models that mine the deep features of text. The main work is as follows. For text classification, the traditional convolutional neural network (CNN) is improved with a multi-granularity convolution kernel method and combined with a long short-term memory network (LSTM). Drawing on the strengths of both models, a new learning model (L-MFCNN) is proposed that learns word-order semantics and mines deep features well. Experimental results show that the method still performs well without elaborate hand-crafted feature rules.
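The abstract does not spell out the L-MFCNN architecture, so the following is only a minimal sketch of what "multi-granularity convolution kernels" could look like for text: kernels of several widths slide over the token embeddings, and each kernel's responses are reduced by max-over-time pooling. The function names, the ReLU activation, the kernel widths, and the toy dimensions are all assumptions made for illustration, and the LSTM branch of the model is omitted entirely.

```python
import random


def conv_max_pool(embeddings, kernel, bias=0.0):
    """Apply one kernel over the token sequence, then ReLU and max-pool over time.

    embeddings: list of token vectors (each a list of floats of length d)
    kernel:     list of `width` weight vectors, each of length d
    """
    width = len(kernel)
    best = 0.0  # ReLU floor; also the result when the text is shorter than the kernel
    for start in range(len(embeddings) - width + 1):
        total = bias
        for offset in range(width):
            token = embeddings[start + offset]
            total += sum(t * w for t, w in zip(token, kernel[offset]))
        best = max(best, total)
    return best


def multi_granularity_features(embeddings, kernels_by_width):
    """Concatenate pooled features from kernels of several widths (granularities)."""
    features = []
    for width in sorted(kernels_by_width):
        for kernel in kernels_by_width[width]:
            features.append(conv_max_pool(embeddings, kernel))
    return features


def random_kernel(width, dim, rng):
    """Hypothetical initializer: small random weights for one kernel."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(width)]


# Toy usage: a 5-token "sentence" with 4-dimensional embeddings,
# and two kernels each for widths 2, 3, and 4 (all values are made up).
rng = random.Random(0)
sentence = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
kernels = {w: [random_kernel(w, 4, rng) for _ in range(2)] for w in (2, 3, 4)}
features = multi_granularity_features(sentence, kernels)
print(len(features))
```

In a trained model the resulting feature vector would be passed to a classifier, and the kernel weights would be learned rather than drawn at random.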
For information extraction, this thesis represents question sentences and candidate information sentences with word vectors and uses a long short-term memory network (LSTM) to learn semantic relevance features between the question sentence and the candidate information sentence. It then uses dependency parse tree analysis to select syntactic structure features and combines these with surface features to build a deep neural network that learns the intrinsic associations among the question, the candidate information sentence, and the candidate information. Experimental results show that the method can learn the syntactic and semantic features of sentences on its own and achieves good information extraction performance. Finally, the thesis designs and implements a question answering system as an application of information extraction and applies the proposed deep neural network method to it. In practice, the system achieves good answer extraction performance without complex hand-crafted syntactic and semantic features.
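To make the idea of scoring candidate sentences against a question more concrete, here is a rough sketch under stated assumptions. It is not the thesis's method: averaged word vectors stand in for the LSTM sentence encoding, a simple word-overlap ratio stands in for the surface features, and the dependency-tree features are left out. The function names, the mixing weight `alpha`, and the toy vectors are hypothetical.

```python
import math


def average_vector(tokens, word_vectors, dim):
    """Mean of the vectors of known tokens; the zero vector if none are known."""
    total = [0.0] * dim
    count = 0
    for tok in tokens:
        vec = word_vectors.get(tok)
        if vec is not None:
            total = [a + b for a, b in zip(total, vec)]
            count += 1
    return [x / count for x in total] if count else total


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def overlap(question, candidate):
    """Surface feature: fraction of distinct question words found in the candidate."""
    q, c = set(question), set(candidate)
    return len(q & c) / len(q) if q else 0.0


def rank_candidates(question, candidates, word_vectors, dim, alpha=0.5):
    """Score each candidate by a weighted mix of semantic and surface similarity."""
    q_vec = average_vector(question, word_vectors, dim)
    scored = []
    for cand in candidates:
        semantic = cosine(q_vec, average_vector(cand, word_vectors, dim))
        surface = overlap(question, cand)
        scored.append((alpha * semantic + (1 - alpha) * surface, cand))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy usage with made-up 2-dimensional word vectors.
toy_vectors = {
    "capital": [1.0, 0.0], "france": [0.0, 1.0], "paris": [0.5, 0.5],
    "weather": [-1.0, 0.0], "rain": [0.0, -1.0],
}
question = ["capital", "france"]
candidates = [["weather", "rain"], ["paris", "capital", "france"]]
ranked = rank_candidates(question, candidates, toy_vectors, dim=2)
print(ranked[0][1])
```

A learned model would replace the fixed weighting with parameters trained on question and answer pairs. The sketch only shows how semantic and surface signals can be combined into a single ranking score.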
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.1


Related master's theses

1 Li Chao (李超); Research on Short Text Classification and Information Extraction Based on Deep Learning [D]; Zhengzhou University; 2017


Article ID: 1609418


Link to this article: http://www.wukwdryxk.cn/shoufeilunwen/xixikjs/1609418.html
