Test: Win IE displays four different 3-byte sequences, three of which are not legal UTF-8, as if they all encoded the same Unicode character.
A possible cause is that the UTF-8 parser does not check that the sequences are valid UTF-8 and keeps only the low 6 bits of each of the last two bytes, ignoring whether their top two bits are the required 10.
Bytes E1 FC D0 (no char)
Bytes E1 BC D0 (no char)
Bytes E1 FC 90 (no char)
Bytes E1 BC 90 (U+1F10, Greek small letter epsilon with psili): ἐ
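The suspected cause can be illustrated with a small sketch. The function lax_decode_3byte below is hypothetical: it is an assumption about how such a parser might behave, not IE's actual code. It trusts the lead byte and masks each trailing byte to its low 6 bits without checking the 10xxxxxx marker. Python's built-in strict decoder is shown alongside for comparison.

```python
def lax_decode_3byte(seq: bytes) -> str:
    """Hypothetical lax parser: no validation of the trailing bytes."""
    b0, b1, b2 = seq
    # Keep 4 bits of the lead byte and the low 6 bits of each trailing byte.
    code_point = ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)
    return chr(code_point)


def strict_decode(seq: bytes):
    """Validating decode: returns None for ill-formed UTF-8."""
    try:
        return seq.decode("utf-8")
    except UnicodeDecodeError:
        return None


sequences = [bytes.fromhex(h) for h in ("E1FCD0", "E1BCD0", "E1FC90", "E1BC90")]

for seq in sequences:
    lax = lax_decode_3byte(seq)
    strict = strict_decode(seq)
    print(seq.hex(" ").upper(), "lax: U+%04X" % ord(lax), "strict:", repr(strict))
```

Under this assumption all four sequences collapse to U+1F10, because FC and BC share the same low 6 bits, as do D0 and 90. The strict decoder accepts only E1 BC 90.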
Another test: when Win IE reads Latin-1 text with the encoding set to UTF-8, it can generate Chinese characters instead of the question mark or other "illegal character" symbol that should appear:
Bytes E9 20 71, found for example in the Latin-1 French sequence "e-acute, space, q", are read as if they were E9 A0 B1 (U+9831, le4).
pensé que
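The arithmetic behind this second case can be checked with a short sketch. It assumes the sample input is the French phrase "pensé que" encoded as Latin-1, and it reuses the same hypothetical lax parser model (lax_decode_3byte is an illustrative name, not IE's code): the lead byte E9 is trusted, and the space (20) and "q" (71) are masked to their low 6 bits as if they were continuation bytes.

```python
def lax_decode_3byte(seq: bytes) -> str:
    """Hypothetical lax parser: no validation of the trailing bytes."""
    b0, b1, b2 = seq
    return chr(((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F))


# Assumed sample input: Latin-1 bytes 70 65 6E 73 E9 20 71 75 65
latin1_bytes = "pensé que".encode("latin-1")

# The e-acute, space, q triple that the lax parser swallows.
triple = latin1_bytes[4:7]

misread = lax_decode_3byte(triple)       # character the lax model produces
well_formed = misread.encode("utf-8")    # the legal encoding of that character

# A validating decoder substitutes U+FFFD for the stray E9 and keeps " que".
strict = latin1_bytes.decode("utf-8", errors="replace")

print(triple.hex(" ").upper(), "-> U+%04X" % ord(misread))
print("well-formed form:", well_formed.hex(" ").upper())
print("strict decode:", repr(strict))
```

In this model the space and the "q" are consumed along with the e-acute, so the text after the misread character resumes at "ue". A validating decoder loses only the single E9 byte.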
(Note: I understand this behavior has been fixed in IE7b3.)
tom@bluesky.org