A new linear systolic array architecture is proposed for parallel modular multiplication based on the Montgomery algorithm. An n-bit modular multiplication takes 2n + 11 clock cycles. To reduce the work done in each cycle, a three-stage pipeline is implemented inside each processing element, so that the serial computation per cycle is only one full-adder delay. In addition, because the processing elements have only local interconnections, wiring delay is small, and the resulting systolic modular multiplier can run at a very high clock frequency. Each processing element is also simple, consisting of only 4 full adders and 14 flip-flops; for n-bit modular multiplication the total size is about 46n + 184 gates. The design is therefore optimized for both speed and area and is well suited to VLSI implementation. As a core arithmetic unit, it can be used effectively for encryption and decryption in many public-key cryptosystems such as RSA. In a 0.8 μm CMOS process a 200 MHz clock is entirely feasible, and with a single modular multiplier, 512-bit modular exponentiation for encryption and decryption reaches 129 kbit/s.
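As a rough illustration of the arithmetic behind these figures, the sketch below gives a software model of textbook bit-serial (radix-2) Montgomery multiplication, followed by a back-of-the-envelope throughput estimate. It does not model the systolic array or its pipeline. The function names `mont_mul` and `modexp_throughput` are hypothetical, and the average of 1.5n multiplications per exponentiation (binary square-and-multiply with a random exponent) is an assumption not stated in the abstract.

```python
def mont_mul(a: int, b: int, m: int, n: int) -> int:
    """Textbook radix-2 Montgomery product: a * b * 2^(-n) mod m.

    Assumes m is odd and 0 <= a, b < m < 2^n. This is a software model of
    the arithmetic only, not of the paper's processing elements.
    """
    s = 0
    for i in range(n):
        a_i = (a >> i) & 1      # scan the multiplier one bit per step
        s += a_i * b            # conditionally add the multiplicand
        q = s & 1               # quotient bit: makes s + q*m even
        s = (s + q * m) >> 1    # exact division by 2
    # Invariant s < 2m holds after every step, so one subtraction suffices.
    return s - m if s >= m else s


def modexp_throughput(n: int, clock_hz: float, mults_per_exp: float = None) -> float:
    """Estimated bits per second for n-bit modular exponentiation.

    Uses the 2n + 11 cycles per multiplication quoted in the abstract.
    mults_per_exp defaults to 1.5 * n, an ASSUMED average for binary
    square-and-multiply (not given in the source).
    """
    if mults_per_exp is None:
        mults_per_exp = 1.5 * n
    cycles_per_mult = 2 * n + 11
    seconds_per_exp = mults_per_exp * cycles_per_mult / clock_hz
    return n / seconds_per_exp


if __name__ == "__main__":
    m, n = 239, 8
    print(mont_mul(100, 200, m, n))
    print(modexp_throughput(512, 200e6) / 1000, "kbit/s")
```

Under the assumed 1.5n multiplications per exponentiation, the estimate for n = 512 at 200 MHz comes out close to the 129 kbit/s quoted in the abstract, which suggests the quoted rate was derived along these lines, although the abstract does not say so.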