Outlier Detection over Sliding Windows for Probabilistic Data Streams

Source: Journal of Computer Science & Technology
Outlier detection is a useful technique in many applications where data is generally uncertain and can be described using probability. Although it has been studied intensively for deterministic data, outlier detection is still novel in the emerging field of uncertain data. In this paper, we study the semantics of outlier detection on probabilistic data streams and present a new definition of a distance-based outlier over a sliding window. We then show that the problem of detecting an outlier over a set of possible world instances is equivalent to the problem of finding the k-th element in its neighborhood. Based on this observation, a dynamic programming algorithm (DPA) is proposed to reduce the detection cost from O(2^|R(e,d)|) to O(k·|R(e,d)|), where R(e,d) is the d-neighborhood of e. Furthermore, we propose a pruning-based approach (PBA) to effectively and efficiently filter non-outliers on a single window and to dynamically detect the recent m elements incrementally. Finally, detailed analysis and thorough experimental results demonstrate the efficiency and scalability of our approach.
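The reduction from enumerating possible worlds to a dynamic program can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: each neighbor in R(e,d) exists independently with a known probability, and e is treated as an outlier when the probability of having fewer than k existing neighbors reaches a threshold. The function names and the threshold parameter are hypothetical, and this is not necessarily the paper's exact DPA. The brute-force version enumerates all 2^|R(e,d)| worlds; the table-based version does O(k·|R(e,d)|) work.

```python
from itertools import product


def prob_fewer_than_k(neighbor_probs, k):
    """Probability that fewer than k of the independent neighbors exist.

    Runs in O(k * len(neighbor_probs)) time using a table of size k.
    """
    if k <= 0:
        return 0.0
    # dp[j] = probability that exactly j neighbors seen so far exist (j < k).
    # Probability mass for counts >= k is dropped, since it is never needed.
    dp = [1.0] + [0.0] * (k - 1)
    for p in neighbor_probs:
        # Update from high j to low j so dp[j - 1] is still the old value.
        for j in range(k - 1, 0, -1):
            dp[j] = dp[j] * (1.0 - p) + dp[j - 1] * p
        dp[0] *= 1.0 - p
    return sum(dp)


def prob_fewer_than_k_bruteforce(neighbor_probs, k):
    """Same quantity by enumerating every possible world (exponential cost)."""
    total = 0.0
    for world in product([0, 1], repeat=len(neighbor_probs)):
        if sum(world) < k:
            weight = 1.0
            for present, p in zip(world, neighbor_probs):
                weight *= p if present else 1.0 - p
            total += weight
    return total


def is_outlier(neighbor_probs, k, threshold):
    """Hypothetical decision rule: outlier if P(fewer than k neighbors) >= threshold."""
    return prob_fewer_than_k(neighbor_probs, k) >= threshold
```

For example, calling `is_outlier([0.5, 0.5], k=2, threshold=0.7)` evaluates the probability that fewer than two of the two neighbors exist and compares it with 0.7. The truncated table is what keeps the cost linear in the neighborhood size for a fixed k.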