“Context” is a classic topic in epistemology, the philosophy of language, and logic. The term carries two senses. In the first, context is the physical or communicative setting in which an interpretation takes place, a setting determined by time, place, names, signs, and symbols. In the second, context is the basic theoretical framework on which a discussion rests. In this paper, we presuppose an ideal language and thereby set aside the problems raised by context in the first sense. We then construct context in the second sense from a specific text corpus. In this construction, we borrow the idea of “cluster analysis” from data mining and treat a context as a cluster of sentences formed on the basis of the mutual information between sentences.
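To make the idea of clustering sentences by mutual information concrete, here is a minimal Python sketch. The abstract does not say how mutual information between sentences is estimated or which clustering procedure is used, so everything below is an illustrative assumption rather than the paper's method: each sentence is represented by a binary vector recording in which corpus documents it occurs, mutual information is computed from the empirical joint distribution of two such vectors, and sentences are grouped by simple threshold-based single linkage. The names `mutual_information`, `cluster_sentences`, `occurrence`, and `threshold` are hypothetical.

```python
import math
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two equal-length 0/1 sequences."""
    n = len(xs)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for x, y in zip(xs, ys) if x == a and y == b) / n
            p_a = sum(1 for x in xs if x == a) / n
            p_b = sum(1 for y in ys if y == b) / n
            if p_ab > 0:  # terms with zero joint probability contribute nothing
                mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def cluster_sentences(occurrence, threshold):
    """Group sentences whose pairwise mutual information reaches `threshold`.

    `occurrence` maps a sentence label to its 0/1 occurrence vector (assumed
    representation). Clusters are the connected components of the graph that
    links every pair with MI >= threshold (single linkage via union-find).
    """
    parent = {s: s for s in occurrence}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    for s, t in combinations(occurrence, 2):
        if mutual_information(occurrence[s], occurrence[t]) >= threshold:
            parent[find(s)] = find(t)

    clusters = {}
    for s in occurrence:
        clusters.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    # Toy corpus of four documents; each vector marks where a sentence occurs.
    occurrence = {
        "A": [1, 1, 0, 0],
        "B": [1, 1, 0, 0],
        "C": [1, 0, 1, 0],
        "D": [1, 0, 1, 0],
    }
    print(cluster_sentences(occurrence, threshold=0.5))
```

One design caveat: mutual information is symmetric in dependence, so two sentences that never co-occur score as high as two that always co-occur. A construction of contexts along these lines would need to decide whether such pairs belong in the same cluster.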