Large margin classification for combating disguise attacks on spam filters

Source: Journal of Zhejiang University (English Edition), Series C: Computers & Electronics
This paper addresses the challenge of large margin classification for spam filtering in the presence of an adversary who disguises spam messages to avoid detection. In practice, the adversary may strategically add good words indicative of a legitimate message or remove bad words indicative of spam. We assume that the adversary can afford to modify a spam message only to a certain extent without damaging its utility for the spammer. Under this assumption, we present a large margin approach for classifying spam messages that may be disguised. The proposed classifier is formulated as a second-order cone programming optimization problem. We performed a group of experiments using the TREC 2006 Spam Corpus. The results showed that the performance of the standard support vector machine (SVM) degrades rapidly as more words are injected or removed by the adversary, while the proposed approach is more stable under the disguise attack.
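The threat model described above can be illustrated with a small sketch. It is not the paper's second-order cone classifier; it only shows how an adversary with a limited edit budget might add good words or remove bad words to lower the score of a plain linear filter over binary word features. The weights, the bias, the example message, and the names `spam_score` and `disguise` are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Set, Tuple


def spam_score(weights: Dict[str, float], bias: float, words: Set[str]) -> float:
    """Linear score over binary word features; a positive score means 'spam'."""
    return bias + sum(weights.get(w, 0.0) for w in words)


def disguise(
    weights: Dict[str, float], words: Set[str], budget: int
) -> Tuple[Set[str], List[str]]:
    """Apply at most `budget` word edits that lower the spam score the most.

    With binary features and a linear score, each edit changes the score
    independently, so picking the largest reductions first is sufficient.
    """
    moves = []  # (score reduction, label, word, is_addition)
    for word, weight in weights.items():
        if word in words and weight > 0:
            # Removing a "bad" word that is present lowers the score by its weight.
            moves.append((weight, "remove " + word, word, False))
        elif word not in words and weight < 0:
            # Adding a "good" word that is absent lowers the score by |weight|.
            moves.append((-weight, "add " + word, word, True))
    moves.sort(key=lambda m: (-m[0], m[1]))  # largest reduction first, ties by label

    disguised = set(words)
    applied = []
    for _, label, word, is_addition in moves[:budget]:
        if is_addition:
            disguised.add(word)
        else:
            disguised.discard(word)
        applied.append(label)
    return disguised, applied


# Hypothetical filter weights and message (assumptions, not from the paper).
WEIGHTS = {
    "viagra": 2.0, "free": 1.5, "winner": 1.0,       # words indicative of spam
    "meeting": -1.2, "thanks": -0.8, "report": -0.5,  # words indicative of legitimate mail
}
BIAS = -0.5
SPAM = {"viagra", "free", "winner"}

if __name__ == "__main__":
    for budget in (0, 2, 3):
        disguised, applied = disguise(WEIGHTS, SPAM, budget)
        print(budget, applied, round(spam_score(WEIGHTS, BIAS, disguised), 2))
```

With these made-up weights, two edits leave the score positive, while a third edit pushes it below zero and the message passes as legitimate. The `budget` parameter mirrors the assumption that the adversary can modify a message only to a limited extent without destroying its usefulness to the spammer.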