Neohapsis is currently accepting applications for employment. For more information, please visit our website www.neohapsis.com or email email@example.com
Date: Mon Apr 08 2002 - 23:10:57 CDT
On Mon, 08 Apr 2002 23:08:02 EDT, Michael Greenberg said:
> > 1) "do not contain null" - that's *different* than "A..Z". There's a
> > *lot* of instructions that contain non-printable non-null characters.
> > Most of them, in fact. For a 32-bit instruction, there are 2^32 (or
> > 4,294,967,296, of which 4,228,250,625 (or 255^4) do *not* contain nulls.
> > Only 66,716,671 (or about 1.5%) of all possible 32-bit instructions *DO*
> > contain a null.
> While your other comments are insightful (and incredibly interesting),
> I believe KF intended to simply insert all of these generated sequences
> into a call to __asm__() or a similar function and then 'sort the rest
> out in gdb', that is, see what calls were valid. While your method
> will get you better results, isn't a kludge, and will probably work
> while his will not, that's not quite what he was asking.
Well.. Hmm.. 4 billion or so possibilities, at 4 bytes a pop - that's
16G for the .o file - and the .c/.s will be bigger (all those "__asm__()"
chew it up - plus any dorking around you need to get gcc, xlc, /bin/as
or whatever to *create* a 16G .o file (hint - most Power chips are 32-bit ;)
I hope you have a *big* /tmp and lots of time... ;)
But you already *know* that of the 4G entries, only 66M or so are
"interesting" in that they contain a null - so turning it around and
making a list of "everything that DOES have a null" and saying "If
it's not in this list, it doesn't have a null" is a *lot* faster....
What *might* be interesting is generating *just those 66M* bit
patterns, and getting gdb to disassemble them, so you have a list of
66M instructions to avoid... and that's almost plausible to do.
print "\0$a$b$c $a\0$b$c $a$b\0$c $a$b$c\0";
Yes, this generates some duplicates. Deal with it - /bin/uniq ;)
My analysis of instruction formats took me all of 15 minutes with 2
IBM AIX manuals - and hopefully produced a lot more insight into the
problem than simply brute-forcing all 4 billion possibilities For
instance, all the 'X' format instructions have primary opcode 31 or 63
- and the ones with 63 tend be the floating-point instructions while
31 tends to be integer and memory references. (Hmm.. I smell a
5-input AND gate here ;)
4 billion instructions - or 11 instruction formats (of which 3 have only
one instruction of that format), and for each format it's pretty easy
to figure it out. Work smarter, not harder. ;)
-- Valdis Kletnieks Computer Systems Senior Engineer Virginia Tech
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: Exmh version 2.5 07/13/2001
iD8DBQE8smnRcC3lWbTT17ARAi3gAJwK/ACj3S5Q1GXUcT/cuTbuZ70jOQCgtSLG J97h/IqmDjpE7TFbxAcRh6M= =u9tf -----END PGP SIGNATURE-----