Efficient Location-Aware Data Placement for Data-Intensive Applications in Geo-distributed Scientifi

Source: Tsinghua Science and Technology
Recent developments in cloud computing and big data have spurred the emergence of data-intensive applications for which massive scientific datasets are stored in globally distributed scientific data centers and accessed at high frequency by scientists worldwide. Multiple associated data items distributed across different scientific data centers may be requested for a single data processing task, and data placement decisions must respect the storage capacity limits of the scientific data centers. Therefore, optimizing the data access cost when placing data items in globally distributed scientific data centers has become an increasingly important goal. Existing data placement approaches for geo-distributed data items are insufficient because they either cannot cope with the cost incurred by associated data access or overlook storage capacity limitations, which are a very practical constraint of scientific data centers. In this paper, inspired by applications in the field of high energy physics, we propose an integer-programming-based data placement model that addresses the above challenges as a Non-deterministic Polynomial-time (NP)-hard problem. In addition, we use a Lagrangian-relaxation-based heuristic algorithm to obtain ideal data placement solutions. Our simulation results demonstrate that our algorithm is effective and significantly reduces overall data access cost.
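The abstract does not give the model itself, so the following is only a minimal sketch of the general technique it names, not the authors' formulation. It assumes a simplified placement problem: each data item has a size and must be stored in exactly one data center, each data center has a storage capacity, and placing item i in center j incurs a known access cost cost[i][j]. The cost of associated (jointly requested) data items is deliberately left out. All names (lagrangian_placement, repair, cost, size, capacity) are hypothetical. The capacity constraints are relaxed with multipliers lam, the relaxed problem is solved item by item, a greedy repair step restores feasibility, and the multipliers are updated by a subgradient step.

```python
from typing import List, Optional, Tuple


def repair(reduced: List[List[float]], size: List[float],
           capacity: List[float]) -> Optional[List[int]]:
    """Greedy repair: place large items first, each in the cheapest
    (by reduced cost) center that still has room. Returns None if stuck."""
    remaining = list(capacity)
    assign = [-1] * len(size)
    for i in sorted(range(len(size)), key=lambda i: -size[i]):
        feasible = [j for j in range(len(capacity)) if remaining[j] >= size[i]]
        if not feasible:
            return None
        j = min(feasible, key=lambda j: reduced[i][j])
        assign[i] = j
        remaining[j] -= size[i]
    return assign


def lagrangian_placement(cost: List[List[float]], size: List[float],
                         capacity: List[float], iterations: int = 50,
                         step: float = 1.0
                         ) -> Tuple[Optional[List[int]], float, float]:
    """Return (best feasible assignment, its cost, best lower bound)."""
    n_items, n_centers = len(size), len(capacity)
    lam = [0.0] * n_centers  # one multiplier per relaxed capacity constraint
    best_assign, best_cost = None, float("inf")
    best_bound = float("-inf")
    for k in range(1, iterations + 1):
        # Relaxed subproblem: each item independently picks the center
        # with the lowest cost plus capacity penalty.
        reduced = [[cost[i][j] + lam[j] * size[i] for j in range(n_centers)]
                   for i in range(n_items)]
        choice = [min(range(n_centers), key=lambda j: reduced[i][j])
                  for i in range(n_items)]
        bound = (sum(reduced[i][choice[i]] for i in range(n_items))
                 - sum(l * c for l, c in zip(lam, capacity)))
        best_bound = max(best_bound, bound)
        # Heuristic step: turn the relaxed solution into a feasible one.
        assign = repair(reduced, size, capacity)
        if assign is not None:
            total = sum(cost[i][assign[i]] for i in range(n_items))
            if total < best_cost:
                best_assign, best_cost = assign, total
        # Subgradient update with a diminishing step size step / k.
        load = [0.0] * n_centers
        for i, j in enumerate(choice):
            load[j] += size[i]
        lam = [max(0.0, lam[j] + (step / k) * (load[j] - capacity[j]))
               for j in range(n_centers)]
    return best_assign, best_cost, best_bound
```

The lower bound returned alongside the feasible cost gives a rough measure of how far the heuristic solution could be from optimal on a given instance; when the two values coincide, the placement is optimal for this simplified model.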