The rapid growth of the Internet has turned the World Wide Web into a vast distributed information space that holds potentially valuable knowledge. This provides rich resources for data mining research and also poses new challenges. This paper first outlines the concept of data mining, its mining algorithms, and its main application areas. Then, taking into account the diversity of Web data, the rich and dynamic hyperlink information, and Web user access information, it describes in detail the concepts, definitions, main mining algorithms, and latest research progress of Web content mining, Web structure mining, and Web user access information mining (Web usage mining). Finally, it introduces the research directions and development trends of Web mining.