Tuesday, August 31, 2004
A little rant about Microsoft Internet Explorer's color parsing
A warning to all readers: things get strange and very geeky from here on in. Enjoy.
As my profile says, I am one of POPFile's developers. The POPFile team is committed to finding new spammer "tricks" and making sure POPFile is capable of decoding them. John Graham-Cumming, author and lead developer of POPFile, also maintains the spammer's compendium, a catalogue of such spammer tricks.
The newest trick in the compendium is "Flex Hex", which was reported to John in July. POPFile's CVS code learned how to handle this trick shortly thereafter.
The essence of Flex Hex is that IE is very flexible in how it will interpret hexadecimal RGB values in any HTML attribute (I'm not sure about CSS) that expects color data. John sums it up well in the spammer's compendium:
Missing digits are treated as 0[...]. An incorrect digit is simply interpreted as 0. For example the values #F0F0F0, F0F0F0, F0F0F, #FxFxFx and FxFxFx are all the same.
Though the above generalization would have been good enough in 99% of cases, we found some cases where IE deviated from the fairly simple approach of zero-padding the field and zeroing invalid hex characters.
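The simple approach from the compendium quote can be sketched in a few lines of Python. This is only an illustration of that rule, not POPFile's code or IE's. The function name `naive_flex_hex` is made up for the example, and it only makes sense for values of up to 6 digits.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")

def naive_flex_hex(value):
    """Zero out invalid digits, then zero-pad on the right to 6 digits.

    Hypothetical helper illustrating the simple rule only. It does not
    handle the odd cases (very short or very long strings).
    """
    # Drop a leading hash mark, if there is one.
    if value.startswith("#"):
        value = value[1:]
    # An incorrect digit is simply interpreted as 0.
    digits = "".join(c if c in HEX_DIGITS else "0" for c in value)
    # Missing digits are treated as 0. Lowercasing is just for easy comparison.
    return digits.ljust(6, "0").lower()

# The five values from the quote all come out the same.
for sample in ["#F0F0F0", "F0F0F0", "F0F0F", "#FxFxFx", "FxFxFx"]:
    print(sample, "->", naive_flex_hex(sample))
```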
When color strings longer than 8 characters or shorter than 4 characters are used, things start to get strange. Things getting strange, particularly where undocumented, is always to a spammer's advantage. I will lay out here what IE does with unusual, unpredictable, or invalid color data.
If email filtering software isn't aware of how common HTML-enabled email readers will display HTML, malformed or otherwise, it becomes much easier for spammers to hide text within emails in a way that may fool statistical filters or otherwise evade filters.
As an interesting note, IE does this unusual parsing regardless of the doctype declaration, ignoring "standards mode". Mozilla performs similar parsing, differing only in how long strings are handled. However, in standards mode, invalid color notation is completely ignored by Mozilla and the default or parent color is allowed to set the color of the element.
The iframe below contains a slight variation on the DHTML page I used while determining how IE parses colors. I have gone out of my way to make it cross-platform, so other browsers can be tested with it. The two fields can be used to set the foreground and background colors of some text, and then the DOM of the page is sniffed to display the colors, as interpreted by the browser.
Throughout this explanation I will use a notation similar to CSS's RGB( RR, GG, BB) syntax to show how a value is split into red, green, and blue components. This isn't correct CSS RGB() syntax, but I am using it for clarity.
IE's non-CSS color parsing algorithm appears to behave as follows, in order to get to a 6 digit hexadecimal value from any string:
These steps may not be performed in the same order or using exactly the same criteria as IE, but the end result is identical as far as I can tell.
First, remove any hash marks, then replace every character that is not a hexadecimal digit (0-9, a-f) with a 0.
Eg: #zqbttv becomes 00b000.
For lengths 1-2, right pad to 3 characters with 0's.
Eg: "0F" becomes "0F0", "F" becomes "F00".
For length 3, take each digit as a value for red, green, or blue, and prepend a 0 to that value.
Eg: "0F0" becomes RGB( 0, F, 0), which becomes RGB( 00, 0F, 00) or 000F00.
Any value that started out shorter than 4 digits is done at this point.
For lengths 4 and longer, the field is right-padded with 0's to the next full multiple of 3. This step is important for longer fields.
Eg: "0F0F" becomes "0F0F00", "0F0F0F0" becomes "0F0F0F000" and "00FF00FF00FF00FF" becomes "00FF00FF00FF00FF00"
Next, the string is broken into three even parts, representing red, green and blue, from left to right.
"0F0F00" behaves as expected, becoming RGB(0F, 0F, 00). Any string of 6 characters is done at this point.
Longer strings, such as "1234567890ABCDE" become RGB(12345, 67890, ABCDE). Extremely long strings are split similarly. "1234567890ABCDE1234567890ABCDE" becomes RGB( 1234567890, ABCDE12345, 67890ABCDE).
At this point, the RGB values are truncated individually.
If the individual RGB values are over 8 characters long, they are truncated to 8 characters by removing characters from the left. This, in particular, was unexpected.
RGB( 1234567890, ABCDE12345, 67890ABCDE) becomes RGB( 34567890, CDE12345, 890ABCDE), and so forth.
Once the individual RGB values are 8 characters long or shorter, they are truncated to 2 characters by removing characters from the right.
RGB( 34567890, CDE12345, 890ABCDE) becomes RGB( 34, CD, 89) or #34CD89, in more traditional notation.
Any string should be transformed into a 6-digit hexadecimal color by the above steps.
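Here is a rough Python sketch of the steps above, for anyone who wants to experiment outside a browser. It follows my description, not IE's actual code, so treat it as an approximation. The name `ie_flex_hex` is made up, returning lowercase is my own choice, and the handling of an empty string is an assumption, since I only describe lengths of 1 and up.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")

def ie_flex_hex(value):
    """Reduce any color string to 6 hex digits, following the steps above."""
    # Step 1: remove hash marks, replace non-hex characters with 0.
    digits = "".join(
        c if c in HEX_DIGITS else "0" for c in value.replace("#", "")
    )

    # Lengths 1-2: right-pad to 3 characters with 0's.
    # (An empty string also lands here; that case is an assumption.)
    if len(digits) < 3:
        digits = digits.ljust(3, "0")

    # Length 3: one digit per component, each with a 0 prepended.
    if len(digits) == 3:
        return "".join("0" + d for d in digits).lower()

    # Lengths 4 and up: right-pad with 0's to the next multiple of 3.
    digits += "0" * (-len(digits) % 3)

    # Break into three even parts: red, green, blue.
    size = len(digits) // 3
    parts = [digits[i * size:(i + 1) * size] for i in range(3)]

    # Keep at most the last 8 characters of each part (trim from the left),
    # then keep the first 2 of what remains (trim from the right).
    parts = [part[-8:][:2] for part in parts]
    return "".join(parts).lower()

print(ie_flex_hex("#zqbttv"))                         # 00b000
print(ie_flex_hex("0F"))                              # 000f00
print(ie_flex_hex("1234567890ABCDE1234567890ABCDE"))  # 34cd89
```

The left-trim only does anything once a component is longer than 8 characters, which means the whole string has to be longer than 24 characters before it kicks in.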
For instance, <font color="6db6ec49efd278cd0bc92d1e5e072d68"> (yes, that is random hexadecimal data) will result in IE displaying text in the color "6ecde0", a rather pleasant light blue. This isn't at all what I would have expected before studying IE's behavior. I might have expected a truncation to "6db6ec" or to "072d68" (both also blues, coincidentally). However, if you look closely inside the random hexadecimal string, the components that make up the final RGB value are present, and in sequential order: "6db6ec49efd278cd0bc92d1e5e072d68"
To decode this value, it first needs to be padded to a multiple of 3 characters (here, from 32 to 33):
6db6ec49efd278cd0bc92d1e5e072d680
Then split into three even parts:
RGB( 6db6ec49efd, 278cd0bc92d, 1e5e072d680)
Then those parts are left-trimmed to 8 digits:
RGB( 6ec49efd, cd0bc92d, e072d680)
Then right-trimmed to the 2 most significant digits:
RGB( 6e, cd, e0)
And there you have it, the same color that IE will display if you enter 6db6ec49efd278cd0bc92d1e5e072d68 into one of the fields in the test applet above.
Monday, August 30, 2004
This looks like a very powerful and intuitive way to supplement your Windows desktop. It places your monitor in the center of a virtual sphere and lets you move and store windows within that sphere. Windows can be moved farther from or closer to the user, or brought right to the front. I'm not sure how usable it really is, but the videos of it in use present a fairly different UI experience.
Sunday, August 29, 2004
An Illustrated Guide to Cryptographic Hashes
GmailFS - Gmail Filesystem
GmailFS provides a mountable Linux filesystem which uses your Gmail account as its storage medium. GmailFS is a Python application and uses the FUSE userland filesystem infrastructure to help provide the filesystem, and libgmail to communicate with Gmail.
Thursday, August 26, 2004
How to Chat Via Netcat
I'm not sure how useful it would be, given the prevalence of things like IM, SSH, and telnet, but printing to a remote user's terminal and allowing them to print back strikes a minimalist bone somewhere in my body.
I'm sure this is possible on *nix (nc originated there), but the poster has limited his instructions to Windows. I think this will work on any platform if you substitute "console" or "terminal" for every mention of DOS and give it a try.
-for Windows platforms
Requires Netcat (NC.EXE).
Parties at both ends must know one another's IP address.
Open two (2) command (DOS Prompt) windows. Place one above the other on
Select the upper window. Type this command:
nc [-u] -l -p port
where port = listening port number, known to your remote friend.
(Square brackets indicate an optional parameter. The "-u" switch is
optional. It causes Netcat to use the UDP protocol, which is sometimes
desirable for its relative "stealth," as when tunneling thru a
Now select the lower window. Type this command:
nc [-u] ipaddress port
where ipaddress = the remote IP address
port = the remote port number
(The -u switch is necessary if your chat partner is using UDP.)
The same commands may be entered in the Start...Run dialog, which will
cause the DOS windows to appear. They can also be created as shortcuts.
Assuming your chat partner has opened corresponding DOS windows and used
corresponding commands, you may now each type text in the lower window.
You will each see the other's transmissions in the upper window. Lines
of text are transmitted each time the
When sending via UDP, I've found that the sender must hit Enter once
before display begins at the other end.
Ctrl-C will close the connection and exit Netcat. If you've run the
command from the Run dialog or a shortcut, the DOS window will close
If you're using TCP and close your sending or receiving Netcat, the
corresponding Netcat at the remote end will close also. UDP doesn't
have this effect, being "connectionless."
If you're behind a NAT router, you must first set up tunneling,
sometimes termed a "virtual server"; directing the desired port/protocol
on the WAN to your own machine's IP/port on the LAN.
You can make a connection to your own machine by using your own IP
address. Handy for testing.
This also works between machines on the same LAN.
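For anyone curious what the two Netcat commands are doing, here is a rough Python sketch of one direction of the TCP case. It is not how Netcat is implemented, just the same idea using the standard `socket` module. The names `listen_once` and `send_lines` are made up for the example, and it does not cover the UDP (-u) variant.

```python
import socket
import threading

def listen_once(server_sock):
    """Accept one connection and collect its lines, roughly `nc -l -p port`."""
    conn, _addr = server_sock.accept()
    with conn:
        data = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the sender closed the connection
                break
            data += chunk
    return data.decode("utf-8", errors="replace").splitlines()

def send_lines(host, port, lines):
    """Connect and send each line, roughly `nc ipaddress port`."""
    with socket.create_connection((host, port)) as sock:
        for line in lines:
            sock.sendall((line + "\n").encode("utf-8"))

if __name__ == "__main__":
    # Loopback demo: talk to your own machine, as suggested above.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    received = []
    listener = threading.Thread(target=lambda: received.extend(listen_once(server)))
    listener.start()
    send_lines("127.0.0.1", port, ["hello", "anyone there?"])
    listener.join()
    server.close()
    print(received)
```

A real chat needs this running in both directions at once, which is exactly why the instructions call for two windows on each end.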
Wednesday, August 18, 2004
Study: Unpatched PCs compromised in 20 minutes | CNET News.com
If you are out there with a new computer, please don't ever connect it to the internet without being behind a NAT router or enabling your OS's firewall. Even if it is to just download a 'quick' patch.
Apparently I am Amiga OS
The geek milkshake
My web log brings all the nerds to the yard,
and I'm like: "mine's better than yours".
Damn right, it's better than yours!
I can link you, but I have to charge!
And continues here
I think it's pretty funny.
Tuesday, August 17, 2004
Study: Spammers, Virus Writers Getting Chummy