Discovering useful information in web logs is a pressing goal for every web site administrator, but the inaccuracy of web server logs complicates the data preparation phase. In earlier application areas of data mining, such as POS databases, transactions with natural boundaries already exist; web logs contain no such transactions, and they are not easy to derive through analysis. This paper first describes the user browsing behavior model that underlies the reference length transaction segmentation method, and then proposes two improvements to that model: a network delay parameter and explicit handling of noisy data. The improved model can cope with network delay that is large and varies over time, and it reflects users' actual browsing behavior more faithfully.
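The abstract does not give formulas, so the following is only a minimal sketch of how reference length segmentation with the two proposed adjustments might look. It assumes one common formulation of the reference length method, in which reference lengths are taken to be exponentially distributed and a cutoff time separates auxiliary (navigation) pages from content pages. The names `cutoff_time`, `split_transactions`, `aux_fraction`, `delay`, and `max_length` are hypothetical, and the way the delay is subtracted and the noise is capped is an assumption made for illustration, not the paper's stated procedure.

```python
import math
from typing import List, Sequence, Tuple


def cutoff_time(reference_lengths: Sequence[float], aux_fraction: float) -> float:
    """Cutoff between auxiliary and content references.

    Assumes exponentially distributed reference lengths, so the cutoff is
    -ln(1 - aux_fraction) times the observed mean reference length.
    """
    mean = sum(reference_lengths) / len(reference_lengths)
    return -math.log(1.0 - aux_fraction) * mean


def split_transactions(
    visits: Sequence[Tuple[str, float]],
    aux_fraction: float,
    delay: float = 0.0,               # assumed: estimated network delay in seconds
    max_length: float = float("inf"),  # assumed: noise cap for idle periods
) -> List[List[str]]:
    """Split one session of (page, reference_length) pairs into transactions.

    Each transaction is a run of auxiliary pages closed by a content page.
    """
    # Assumed delay handling: remove the network delay from every reference length.
    adjusted = [(page, max(length - delay, 0.0)) for page, length in visits]

    # Assumed noise handling: very long references do not influence the mean.
    usable = [length for _, length in adjusted if length <= max_length]
    if not usable:
        return [[page for page, _ in adjusted]] if adjusted else []
    cutoff = cutoff_time(usable, aux_fraction)

    transactions: List[List[str]] = []
    current: List[str] = []
    for page, length in adjusted:
        current.append(page)
        if length > cutoff:  # a long reference marks a content page
            transactions.append(current)
            current = []
    if current:  # trailing auxiliary pages with no closing content page
        transactions.append(current)
    return transactions


session = [("A", 5), ("B", 4), ("C", 120), ("D", 6), ("E", 90)]
print(split_transactions(session, aux_fraction=0.6, delay=2.0))
```

A time-varying delay could be modeled by passing a per-request delay instead of a single constant; the constant is used here only to keep the sketch short.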