
Biomedical Relation Extraction Based on Word Representation and Deep Learning

Published: 2018-06-24 09:02

  Topic: word representation + deep learning; Source: Dalian University of Technology, doctoral dissertation, 2016


[Abstract]: Extracting relations between proteins and relations between drugs is important for building biomedical databases, for life science research, for drug development, and for disease prevention and treatment. Most existing biomedical relation extraction methods focus on selecting feature sets and designing kernel functions. After more than a decade of development, feature-based and kernel-based methods are relatively mature, and the room for further improvement is limited. To improve performance further, this dissertation studies extraction methods based on word representation and deep learning. Deep learning makes it possible to build deeper relation extraction models that improve extraction quality, while word representation, which encodes semantic information into word vectors, is a prerequisite for deep learning.

The main contributions are as follows. First, a word representation model is designed for the characteristics of biomedical text. Building on a traditional word representation model, it fuses five kinds of information (word form, part of speech, stem, syntactic chunk, and biomedical named entity) to strengthen the semantic expressiveness of the word vectors. It achieves good results on protein relation extraction and drug relation extraction, which confirms the value of incorporating part-of-speech, entity, and similar information into word representations, and it provides a sound word representation basis for deep-learning-based relation extraction.

Second, for binary protein relation extraction, an extraction model based on instance representation is proposed to overcome the dependence of traditional methods on features and kernel functions. The model has three parts: word vectors, skeleton features, and feature combination. It takes the characteristics of protein relation instances into account, uses word vectors as input, and combines them with skeleton features and vector composition so that the instance representation carries rich semantic information. On larger corpora its extraction performance reaches the current state of the art, which confirms the effectiveness of word representation and deep learning for protein relation extraction.

Third, for multi-class drug relation extraction, a two-stage method is proposed. In the first stage, instance representation is combined with syntactic features, and a logistic regression classifier identifies positive drug relation instances. In the second stage, a long short-term memory (LSTM) network assigns each positive instance to one of four drug relation types. To improve the second stage, the influence of several related factors on the LSTM network is examined in terms of importance, implementation cost, and computational cost. Experiments show that word vectors, distance vectors, part-of-speech vectors, and a two-layer bidirectional LSTM network all improve second-stage classification, and they are important reasons why the two-stage method performs well.

In summary, this dissertation addresses binary relation extraction between proteins and multi-class relation extraction between drugs, proposing extraction methods built on representation and deep learning techniques. These methods overcome, to some extent, the limitations of feature-based and kernel-based methods and achieve good results. Word representation and deep learning have been active research topics in recent years but arrived relatively late in biomedical text mining. The methods proposed here achieve some success on biomedical relation extraction tasks, which confirms their effectiveness and indicates that word representation and deep learning leave ample room for further research in biomedical text mining.
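The two-stage drug relation method described in the abstract can be sketched as a simple cascade. This is only an illustrative sketch, not the dissertation's implementation: the function names, the fixed weights, and the `type_0`/`type_1` labels are all hypothetical. In particular, the second stage here is a trivial argmax stand-in for the LSTM classifier, and instances are plain feature vectors rather than the instance representations the dissertation builds.

```python
import math
from typing import Callable, List, Optional, Sequence

# Hypothetical: one candidate drug pair, reduced to a plain feature vector.
Instance = Sequence[float]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_logistic_detector(
    weights: Sequence[float], bias: float, threshold: float = 0.5
) -> Callable[[Instance], bool]:
    """Stage 1: logistic regression with fixed (already trained) weights.

    Returns a function that flags an instance as a positive relation.
    """
    def detect(inst: Instance) -> bool:
        z = bias + sum(w * x for w, x in zip(weights, inst))
        return sigmoid(z) >= threshold
    return detect


def argmax_type(inst: Instance) -> str:
    """Stage 2 stand-in (assumption): picks the index of the largest feature.

    In the method described above this role is played by an LSTM classifier.
    """
    best = max(range(len(inst)), key=lambda i: inst[i])
    return f"type_{best}"


def two_stage_extract(
    instances: List[Instance],
    is_positive: Callable[[Instance], bool],
    classify_type: Callable[[Instance], str],
) -> List[Optional[str]]:
    """Run the cascade: only stage-1 positives reach the stage-2 classifier.

    Negatives are reported as None.
    """
    return [classify_type(x) if is_positive(x) else None for x in instances]


if __name__ == "__main__":
    detector = make_logistic_detector(weights=[1.0, 1.0], bias=-1.0)
    candidates = [[0.0, 0.0], [2.0, 0.5], [0.2, 3.0]]
    print(two_stage_extract(candidates, detector, argmax_type))
```

Keeping the two stages as separate callables mirrors the split described in the abstract: the binary detector filters out negative instances before the multi-class classifier runs, so the second stage only has to distinguish among relation types.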
[Degree-granting institution]: Dalian University of Technology
[Degree level]: Doctoral
[Year degree conferred]: 2016
[Classification number]: TP391.1

