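Prediction of Antigenic Variation in Influenza A Virus Based on Hemagglutinin Protein Sequences
Published: 2018-01-15 21:12
Source: Zhejiang Sci-Tech University (浙江理工大學), master's thesis, 2017. Document type: degree thesis
Keywords: antigenic variation; hemagglutinin protein sequence; matrix completion; random forest; substitution matrix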
[Abstract]: Timely identification of antigenic variation in newly emerging influenza viruses is vital for influenza vaccine design, influenza surveillance, and public health. Traditional experimental methods, such as the hemagglutination inhibition (HI) assay, perform well but have several drawbacks: they are time-consuming and laborious, so they cannot monitor influenza in a timely and effective way; some experiments cannot be completed, which leaves the serological data sparse (with many missing values); and the measurements carry human and systematic errors, so many values in the final serological data are too low. To speed up the prediction of influenza antigenic variation and improve its quality, bioinformatics methods based on influenza hemagglutinin (HA) protein sequences have been proposed continually. This thesis extracts sequence information from the HA protein and combines it with the corresponding serological data to analyze and predict antigenic variation in influenza A virus. The main contributions are as follows. 1. It reviews recent progress, in China and abroad, in predicting antigenic variation of influenza A virus, focusing on feature extraction from HA protein sequences and on prediction and classification algorithms. Common feature representations include binary encoding and grouping by amino acid physicochemical properties. The algorithms used include matrix completion, K-nearest neighbors, support vector machines, logistic regression, and the lasso. 2. It proposes a joint random forest algorithm based on HA protein sequences (JRFR) that predicts the antigenic distance of influenza A viruses directly. The algorithm combines 94 amino acid substitution matrices with HA1 to predict antigenic distance; it improves prediction accuracy and also predicts the antigenic variation of new virus sequences well. 3. It proposes a matrix completion algorithm based on HA protein sequences (BMCSI) that fills in and corrects HI assay data that are overly sparse and contain many unstable values, so that the antigenic distances between viruses can be computed more accurately. Multidimensional scaling (MDS) then maps these antigenic distances into two-dimensional space to visualize them. On the 1968-2003 data, the prediction accuracy of the method is 37% higher than in previous work (RMSE = 0.6586).
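The abstract reports its results as antigenic distances scored by RMSE but does not say how the distance is derived from HI titers. The sketch below is only an illustration under an assumed, commonly used definition: the distance of a virus from an antiserum is log2 of the highest titer recorded for that serum minus log2 of the virus's own titer. The table, the names `hi_table`, `antigenic_distance`, and `rmse`, and the predicted values are all hypothetical and are not taken from the thesis.

```python
import math

# Toy HI table with made-up titers: rows are test viruses, columns are antisera.
# None marks an assay that was never run or failed (a missing value).
hi_table = {
    "virus_A": {"serum_A": 1280, "serum_B": 160},
    "virus_B": {"serum_A": 320, "serum_B": 640},
    "virus_C": {"serum_A": None, "serum_B": 80},
}


def antigenic_distance(table, virus, serum):
    """Assumed definition: log2(max titer for serum) - log2(titer of virus).

    Returns None when the titer for this virus/serum pair is missing.
    """
    titer = table[virus].get(serum)
    if titer is None:
        return None
    column = [row[serum] for row in table.values() if row.get(serum) is not None]
    return math.log2(max(column)) - math.log2(titer)


def rmse(predicted, observed):
    """Root-mean-square error over the pairs where both values are present."""
    pairs = [(p, o) for p, o in zip(predicted, observed)
             if p is not None and o is not None]
    if not pairs:
        raise ValueError("no overlapping measurements to compare")
    return math.sqrt(sum((p - o) ** 2 for p, o in pairs) / len(pairs))


# Observed distances from serum_A, one per virus (None where the assay is missing).
observed = [antigenic_distance(hi_table, v, "serum_A") for v in hi_table]
predicted = [0.0, 1.0, 2.5]  # hypothetical model output, one value per virus
print(observed)
print(rmse(predicted, observed))
```

Under this assumed definition, one unit of distance corresponds to a twofold drop in titer, so `virus_B` sits about 2 units from `serum_A` (1280 versus 320). Missing entries such as `virus_C` against `serum_A` are skipped in the error calculation here; a matrix completion method would estimate those entries instead.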
[Degree-granting institution]: Zhejiang Sci-Tech University (浙江理工大學)
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: R373.13
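[Similar literature]
Related conference papers (top 1)
1 Zhang Cui (張萃); Variability of the influenza A (H1N1) virus and the immunogenicity of its new vaccine [A]; Proceedings of the 5th National Academic Symposium on Immunology of Traditional Chinese Medicine and the Comprehensive Interdisciplinary Conference on Environment, Immunity, and Tumor Prevention and Treatment [C]; 2009
Related master's theses (top 1)
1 Li Xianhong (李顯紅); Prediction of Antigenic Variation in Influenza A Virus Based on Hemagglutinin Protein Sequences [D]; Zhejiang Sci-Tech University; 2017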
Article ID: 1430065
Article link: http://www.wukwdryxk.cn/shoufeilunwen/mpalunwen/1430065.html