In many network applications, attackers may exploit protocol vulnerabilities to spread malicious code, or use unknown protocols to transfer data covertly. Techniques for inferring protocol formats help to detect such attacks and to analyze these unknown protocols. This paper proposes a protocol format inference framework that automatically infers protocol formats from binary network traces. It first transforms the binary packets of an unknown protocol into hex messages; keyword units are then extracted and spliced into keywords. A K-Means-based algorithm clusters the messages according to their keyword distribution, and the Needleman-Wunsch algorithm is used to merge clusters that share the same format and to extract their common fields. The framework is tested by extracting the formats of the ARP and SMTP protocols, and the experimental results indicate its validity.
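As an illustration of the final alignment step only, the following is a minimal sketch of Needleman-Wunsch global alignment applied to two messages rendered as hex byte tokens. It is not the paper's implementation: the function names (`to_hex_tokens`, `needleman_wunsch`, `common_fields`), the scoring values (match +1, mismatch -1, gap -1), and the `??` placeholder for variable positions are all assumptions chosen for the example. The two sample messages are the first eight bytes of an ARP request and an ARP reply, which differ only in the opcode.

```python
from typing import List, Optional, Sequence, Tuple


def to_hex_tokens(packet: bytes) -> List[str]:
    """Render a binary packet as a list of two-character hex byte tokens."""
    return [f"{b:02x}" for b in packet]


def needleman_wunsch(
    a: Sequence[str],
    b: Sequence[str],
    match: int = 1,      # assumed score for equal tokens
    mismatch: int = -1,  # assumed penalty for differing tokens
    gap: int = -1,       # assumed penalty for an inserted gap
) -> Tuple[int, List[Optional[str]], List[Optional[str]]]:
    """Globally align two token sequences; None marks a gap."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    aligned_a: List[Optional[str]] = []
    aligned_b: List[Optional[str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append(None)
            i -= 1
        else:
            aligned_a.append(None)
            aligned_b.append(b[j - 1])
            j -= 1
    aligned_a.reverse()
    aligned_b.reverse()
    return score[n][m], aligned_a, aligned_b


def common_fields(
    aligned_a: Sequence[Optional[str]], aligned_b: Sequence[Optional[str]]
) -> List[str]:
    """Keep tokens that agree in both messages; mark the rest as variable."""
    return [
        x if x is not None and x == y else "??"
        for x, y in zip(aligned_a, aligned_b)
    ]


# First eight bytes of an ARP request and an ARP reply (Ethernet/IPv4).
request = to_hex_tokens(bytes.fromhex("0001080006040001"))
reply = to_hex_tokens(bytes.fromhex("0001080006040002"))
nw_score, al_req, al_rep = needleman_wunsch(request, reply)
template = common_fields(al_req, al_rep)
print(nw_score, " ".join(template))
```

In this sketch the positions that agree across both messages stand in for fixed fields, and the `??` position stands in for a field whose value varies between messages. How the framework itself scores alignments and delimits fields is not specified in the abstract.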