【Affiliation】: Information Research Center of Military Science, PLA Academy of Military Science, 100142 Beijing, China
【Source】: 6th China Computer Federation (CCF) Academic Conference on Big Data (第六届中国计算机学会大数据学术会议)
【Abstract】: As the number of scientific publications keeps growing, scientific impact prediction has become an urgent need. However, traditional scientific impact prediction, which relies mainly on citation networks accumulated over long periods, metadata, and the full text of papers, lags behind and can hardly keep pace with the rapid development of technology. Meanwhile, Twitter has become one of the most important channels for spreading the latest technical information because of how quickly information propagates on it. Its ability to publish new messages in real time can compensate for the shortcomings of traditional scientific impact prediction methods. Therefore, we propose a new approach that predicts scientific impact on Twitter in real time, before the paper content is published. After filtering scholarly tweets (ST tweets) and extracting Tweet Scholar Blocks (TSBs), which indicate paper metadata and help predict scientific impact in real time, we exploit author social features, venue popularity features, and title features to predict whether the article will increase the h-index of its first author after five years. Our model achieves a best accuracy of 80.95%. The best feature combination consists of the sum of friends and followers of all the co-authors, the follower count of the first author, and title embeddings. The number of followers of all the co-authors is the most critical feature. Our findings reveal that Twitter has the potential to predict scientific impact in real time. We hope that real-time scientific impact prediction on Twitter can help researchers expand their influence and more conveniently “stand on the shoulders of giants”.
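The prediction target above depends on the h-index: the largest h such that an author has at least h papers with at least h citations each. The following minimal sketch shows that definition and one possible reading of the binary label. The helper `raises_h_index` and its inputs are hypothetical illustrations, not the authors' implementation, and the abstract does not state exactly how the label is computed.

```python
from typing import Sequence


def h_index(citations: Sequence[int]) -> int:
    """Return the largest h such that at least h papers have >= h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def raises_h_index(other_papers: Sequence[int], paper_citations: int) -> bool:
    """Hypothetical label (an assumption, not taken from the paper):
    True if adding this paper's citation count raises the author's h-index.

    other_papers: citation counts of the author's other papers at the
    evaluation time (e.g. five years after publication).
    paper_citations: citation count of the paper being labelled.
    """
    with_paper = list(other_papers) + [paper_citations]
    return h_index(with_paper) > h_index(other_papers)
```

For example, an author whose other papers have 10, 8, 5, 5, and 3 citations has an h-index of 4. Under this reading, a new paper with 6 citations raises it to 5, so the paper would be labelled positive.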