User Interest Community Discovery Fusing User Tags and Weibo Content
Published: 2019-04-26 19:08
[Abstract]: With the continued growth of social networks, Weibo has become an indispensable part of daily life. On Weibo, users' self-defined tags and their behaviors, such as posting and reposting, reflect their interests. Using this information to mine user interests and discover user interest communities is of considerable research significance and value. This thesis studies interest community discovery for Weibo users and makes the following contributions:

(1) A user tag interest modeling method based on feature mapping is proposed. Because user tags reflect user interests, tags are chosen as the features of the user interest model. To address the data sparsity and noise caused by inconsistent tag wording and by long tags, the idea of feature mapping is introduced: long tags are word-segmented and represented as sets of sub-tags; similarities between tags are computed, and each user tag is mapped to the feature-dimension tag with the highest similarity; the product of tag similarity and tag frequency is then used as the value of that feature dimension. The resulting user tag interest model is validated with a fuzzy clustering method.

(2) A user Weibo content interest modeling method based on guided LDA is proposed. To account for the effect that interactions between Weibo texts have on the topic distribution of posts, a guided LDA generative model for Weibo posts is proposed. It jointly considers four factors that influence the topic distribution of a user's interests: reposts, comments made by the user, replies, and comments from other users. Building on the traditional LDA model, the guided model yields the topic distribution of each post, from which the topic distribution of each user is derived, so that user interest is modeled from the perspective of post content.

(3) A user interest community discovery method that fuses user tags and Weibo content is proposed. Building on (1) and (2), user model similarity is used to construct a user tag interest network and a Weibo content interest network, which are fused with the follow network that already exists among Weibo users. On this basis, to handle the community overlap that arises because a Weibo user may belong to several communities, a k-clique-based user interest community discovery method is proposed: the community overlap matrix is solved to obtain the community connection matrix, which finally yields user interest communities, each made up of multiple connected k-cliques.

(4) Based on the above results, a prototype system for Weibo user interest community discovery is designed and implemented.
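The k-clique step in contribution (3) can be illustrated with a minimal sketch. The abstract does not give the thesis's exact procedure, so the code below assumes the standard clique percolation scheme: enumerate maximal cliques of size at least k, build a clique-by-clique overlap matrix (number of shared users), threshold it at k-1 to get a connection matrix, and take connected groups of cliques as communities. The function names, the edge-list input, and the example graph are all hypothetical.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def k_clique_communities(edges, k):
    """Return overlapping communities as frozensets of nodes (assumed scheme)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops are ignored in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Only maximal cliques with at least k members can contain a k-clique.
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]
    n = len(cliques)

    # Overlap matrix: entry (i, j) is the number of nodes cliques i and j share.
    overlap = [[len(ci & cj) for cj in cliques] for ci in cliques]

    # Connection matrix: two cliques are linked when they share >= k-1 nodes.
    connect = [[1 if i == j or overlap[i][j] >= k - 1 else 0
                for j in range(n)] for i in range(n)]

    # Each connected group of cliques forms one community.
    communities, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], set()
        while stack:
            i = stack.pop()
            members |= cliques[i]
            for j in range(n):
                if connect[i][j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        communities.append(frozenset(members))
    return communities


# Hypothetical fused interest network: two triangles sharing an edge,
# plus a third triangle that touches them only at user "d".
example_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"),
                 ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
example_communities = k_clique_communities(example_edges, 3)
# Communities found (in some order): {a, b, c, d} and {d, e, f}
```

In this example user "d" belongs to both communities, which is the overlap case the method is meant to handle. In the thesis setting the edges would come from the fused tag, content, and follow networks; how edge weights are thresholded into links is not stated in the abstract and is left out here.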
[Degree-granting institution]: Kunming University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.092
Article ID: 2466320
Article link: http://www.wukwdryxk.cn/guanlilunwen/ydhl/2466320.html