Thread: The market of ASICs (One GigaKey / Second?)




2006-10-19 04:48:58
As you might note from Martin's post, this generator easily produces
Verilog with minor changes.

Have fun!
John


	Date: Sun, 8 Aug 2004 21:58:10 -0600
	From: jbassdmsd.com
	To: hardwarelists.distributed.net
	Subject: Re: [Hardware] The market of ASICs (One GigaKey / Second?)

	Elektron <elektron_rc5yahoo.ca> writes:
	> Imagine loops are unrolled, since the % is incredibly wasteful. The
	> B=L[0] line may be able to have a hard-coded value of A in some systems
	> for a marginal speed increase (not PPC though). We don't decrypt. We
	> encrypt the plaintext to check that it matches the ciphertext (most
	> cores do this, and either way it's just as fast, except as mentioned,
	> the last round is faster).
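
For comparison, the loop form that the quoted text describes (key expansion with the % indexing, then encrypting the plaintext and comparing against the ciphertext) might look roughly like the sketch below. It is an illustration rather than code from this thread. It assumes RC5-32/12 with a 64-bit key held as two 32-bit words, key[0] being L[0], and the names rotl32, rc5_expand, rc5_encrypt and rc5_check_key are hypothetical.

```c
#include <stdint.h>

#define RC5_ROUNDS 12                  /* r = 12 */
#define RC5_T (2 * (RC5_ROUNDS + 1))   /* 26 key-table words */
#define RC5_C 2                        /* 64-bit key = two 32-bit words */

/* Rotate left by the low 5 bits of n; safe when that count is 0. */
uint32_t rotl32(uint32_t x, uint32_t n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Expand the two key words into the 26-word table S. */
void rc5_expand(const uint32_t key[RC5_C], uint32_t S[RC5_T])
{
    uint32_t L[RC5_C] = { key[0], key[1] };
    uint32_t A = 0, B = 0;
    int i, j, k;

    S[0] = 0xB7E15163u;                /* P */
    for (i = 1; i < RC5_T; i++)
        S[i] = S[i - 1] + 0x9E3779B9u; /* + Q */

    /* three passes over S; unrolling removes both % operations */
    for (i = j = k = 0; k < 3 * RC5_T; k++) {
        A = S[i] = rotl32(S[i] + A + B, 3);
        B = L[j] = rotl32(L[j] + A + B, A + B);
        i = (i + 1) % RC5_T;
        j = (j + 1) % RC5_C;
    }
}

/* Encrypt one 64-bit block (two 32-bit words) with an expanded key. */
void rc5_encrypt(const uint32_t S[RC5_T], const uint32_t pt[2], uint32_t ct[2])
{
    uint32_t A = pt[0] + S[0];
    uint32_t B = pt[1] + S[1];
    int i;

    for (i = 1; i <= RC5_ROUNDS; i++) {
        A = rotl32(A ^ B, B) + S[2 * i];
        B = rotl32(B ^ A, A) + S[2 * i + 1];
    }
    ct[0] = A;
    ct[1] = B;
}

/* Return 1 if encrypting pt under key reproduces ct, else 0. */
int rc5_check_key(const uint32_t key[2], const uint32_t pt[2], const uint32_t ct[2])
{
    uint32_t S[RC5_T], out[2];

    rc5_expand(key, S);
    rc5_encrypt(S, pt, out);
    return out[0] == ct[0] && out[1] == ct[1];
}
```

A key search would call rc5_check_key once per candidate key. An unrolled core computes the same values but names every intermediate (S26, S27, ...) so that nothing has to be written back to an array.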

	That's pretty much what I did several years back, while trying to construct
	an FPGA-based RC5 engine. The starting point was a C-based code generator
	to build C and VHDL RC5 engines.

	I still haven't found the VHDL RC5 tools, but this is a prototype of the
	C code RC5 core generator. It still isn't perfect, but it gives the C compiler
	lots of locality hints to pull all the terms into registers and discard
	the writes, which can be a bottleneck on small write-through caches.
	Later gcc versions nearly get it perfect with -O3.
	John

	/*
	 * Construct unrolled RC5 key check core
	 * John L. Bass, Copyright 2001
	 */
	#include <stdio.h>

	int main(void)
	{
		const unsigned int P = 0xB7E15163;
		const unsigned int Q = 0x9E3779B9;

		unsigned int S[26];
		int s, l;

		/* round 0: initial key table constants, S[i] = P + i*Q */
		S[0]= P;
		for (s=1;s<26;s++)
			S[s]= S[s-1] + Q;
		l=0;
		printf("\t\tL%d = key0 = count;\n",l);
		printf("\t\tL%d = key1 = count>>32;\n",l+1);

		/* first mixing step (s == 26 here): A and B start at zero */
		printf("\t\tS%d = ROTL3(0x%08x);\n",
			s,S[s-26]);
		printf("\t\tL0 = ROTL(L0 + S%d, S%d);\n",
			s,s);
		s++;

		/* rounds 0,1: rest of the first pass, still using the constants */
		for (;s<(2*26);s++,l^=1)
		{
			printf("\t\tS%d = ROTL3(0x%08x + S%d + %s);\n",
				s,S[s-26],s-1,l?"L1":"L0");
			printf("\t\t%s = ROTL(%s + S%d + %s, S%d + %s);\n",
				l?"L0":"L1",l?"L0":"L1",s,l?"L1":"L0",s,l?"L1":"L0");
		}

		/* rounds 2,3: second and third passes read the previous pass */
		for (;s<(4*26);s++,l^=1)
		{
			printf("\t\tS%d = ROTL3(S%d + S%d + %s);\n",
				s,s-26,s-1,l?"L1":"L0");
			/* the last L update (s == 103) would never be read */
			if(s != 103)
				printf("\t\t%s = ROTL(%s + S%d + %s, S%d + %s);\n",
					l?"L0":"L1",l?"L0":"L1",s,l?"L1":"L0",s,l?"L1":"L0");
			/* encryption is interleaved as soon as each subkey exists */
			if(s == 78)
				printf("\t\tE0 = plain0 + S%d;\n",s);
			if(s == 79)
				printf("\t\tE1 = plain1 + S%d;\n",s);
			if(s > 79)
				printf("\t\tE%d = ROTL(E0 ^ E1, E%d) + S%d;\n",
					(s)%2,(s+1)%2,s);
		}
		return 0;
	}
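
The statements this generator prints refer to ROTL, ROTL3, count, plain0 and plain1, none of which are defined in the post. The sketch below shows one way the surrounding code could supply the rotate helpers, followed by the first four generated statements pasted into a function. The macro definitions and the function name first_steps are assumptions for illustration, and count is assumed to be a 64-bit key counter.

```c
#include <stdint.h>

/* Hypothetical helpers assumed by the generated statements.
 * The rotate count is reduced mod 32 so a count of 0 is well defined.
 * Arguments are evaluated more than once, so they must be side-effect free. */
#define ROTL(x, n) \
    ((uint32_t)(((uint32_t)(x) << ((n) & 31)) | \
                ((uint32_t)(x) >> ((32 - ((n) & 31)) & 31))))
#define ROTL3(x) ROTL((x), 3)

/* First generated statements; results are returned as
 * out[0] = S26, out[1] = L0, out[2] = S27, out[3] = L1. */
void first_steps(uint64_t count, uint32_t out[4])
{
    uint32_t L0, L1, key0, key1, S26, S27;

    L0 = key0 = (uint32_t)count;
    L1 = key1 = (uint32_t)(count >> 32);
    S26 = ROTL3(0xb7e15163);
    L0 = ROTL(L0 + S26, S26);
    S27 = ROTL3(0x5618cb1c + S26 + L0);
    L1 = ROTL(L1 + S27 + L0, S27 + L0);

    (void)key0;  /* key0 and key1 only mirror the generated text */
    (void)key1;
    out[0] = S26;
    out[1] = L0;
    out[2] = S27;
    out[3] = L1;
}
```

Because S26 depends only on the constant P, a compiler can fold it at build time. That is the kind of hard-coded value the quoted text mentions.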
_______________________________________________
Hardware mailing list
Hardwarelists.distributed.net
http://lists.distributed.net/mailman/listinfo/hardware

2006-10-19 13:31:33
John,
I didn't use your code generator. It was all manual labor, unfortunately.
--
Martin K

John L. Bass wrote:
> As you might note from Martin's post, this easily generates verilog with minor changes.
>
> Have fun!
> John
