In the big data era, scientists must use multiple scientific workflow management systems in concert to complete a single large experiment. The data coming from these different environments and different workflow management systems constitute scientific big data, and its emergence poses challenges for data management within scientific workflow management systems. A scientific workflow generally consists of a number of tasks that operate on input data to produce new data for subsequent tasks; these data must be staged temporarily or stored long term, and must be retrievable when needed. Taking advantage of object storage, this paper lays out and optimizes the storage of a scientific workflow's input data, intermediate data, and output data under two different modes, providing a reference for data management in scientific workflows in cloud computing environments.
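To make the data categories concrete, the following is a minimal sketch of a workflow whose tasks read from and write to an object store, with each object tagged as input, intermediate, or output. Everything here is a hypothetical illustration: `InMemoryObjectStore`, `DataRole`, `run_workflow`, and the object keys are assumed names, the store is an in-memory stand-in rather than a real object storage service, and the sketch does not attempt to reproduce the two storage modes, which the abstract does not describe.

```python
from dataclasses import dataclass, field
from enum import Enum


class DataRole(Enum):
    """Hypothetical labels for the three data categories named in the abstract."""
    INPUT = "input"
    INTERMEDIATE = "intermediate"
    OUTPUT = "output"


@dataclass
class InMemoryObjectStore:
    """Assumed stand-in for an object store: flat keys, payloads, per-object metadata."""
    _objects: dict = field(default_factory=dict)

    def put(self, key, payload, role, producer=None):
        # Store the payload together with metadata describing its role and origin.
        self._objects[key] = (payload, {"role": role, "producer": producer})

    def get(self, key):
        return self._objects[key][0]

    def keys_by_role(self, role):
        # Retrieve object keys by metadata, as a lookup "when needed" might do.
        return sorted(k for k, (_, meta) in self._objects.items() if meta["role"] is role)


def run_workflow(store, tasks, final_task):
    """Run tasks in order; each task is (name, input_keys, output_key, fn)."""
    for name, input_keys, output_key, fn in tasks:
        inputs = [store.get(k) for k in input_keys]
        result = fn(*inputs)
        # Only the final task's result is treated as workflow output here (an assumption).
        role = DataRole.OUTPUT if name == final_task else DataRole.INTERMEDIATE
        store.put(output_key, result, role, producer=name)


# Usage: a two-task workflow that parses raw input, then sorts it.
store = InMemoryObjectStore()
store.put("raw/readings", b"3,1,2", DataRole.INPUT)
tasks = [
    ("parse", ["raw/readings"], "tmp/parsed",
     lambda raw: [int(x) for x in raw.decode().split(",")]),
    ("sort", ["tmp/parsed"], "result/sorted", sorted),
]
run_workflow(store, tasks, final_task="sort")
```

Tagging each object with its role is one plausible way to let a placement policy treat short-lived intermediate data differently from inputs and outputs that need long-term retention; the actual policy would depend on the storage modes chosen.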