December 03, 2004

ASCII Spam

Boing Boing has something up about ASCII spam.

The general idea is that if images are going to be filtered out and Bayesian analysis is getting good at blocking text, then the next step must be ASCII art.

Off the top of my head, I would guess that Bayesian filtering software would eventually learn to weight "pre" tags toward spam, along with large runs of spaces. The spaces are actually the more important signal here; the "pre" tag is just what lets them survive on screen in web browsers.
So while Boing Boing mentions that this kind of spam is hard to block, I would bet that filters that can learn actually do fairly well at blocking it (assuming they don't compress spaces before analysis).
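
To make that guess a bit more concrete, here is a rough Python sketch of what I have in mind. It is not how any particular spam filter works; the token names (TAG:pre, SPACES:long), the 4 and 8 space thresholds, and the tiny training set are all assumptions made up for illustration. The point is only that a tokenizer which keeps tags and whitespace runs as features gives a naive Bayes scorer something to learn from, and that collapsing spaces throws that signal away.

```python
import math
import re
from collections import Counter

# Hypothetical token pattern: an HTML tag, a run of 4+ spaces, or a word.
TOKEN_RE = re.compile(r"<\s*/?\s*(\w+)[^>]*>|( {4,})|([A-Za-z']+)")


def tokenize(text):
    """Split a message into features without collapsing whitespace."""
    tokens = []
    for tag, spaces, word in TOKEN_RE.findall(text):
        if tag:
            tokens.append("TAG:" + tag.lower())
        elif spaces:
            # Bucket run lengths so 8 spaces and 9 spaces share one feature.
            tokens.append("SPACES:long" if len(spaces) >= 8 else "SPACES:short")
        else:
            tokens.append(word.lower())
    return tokens


class NaiveBayes:
    """Toy two-class naive Bayes scorer with add-one smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        # Assumes at least one message of each class has been trained.
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        log_odds = math.log(self.messages["spam"] / self.messages["ham"])
        for token in tokenize(text):
            likelihood = {}
            for label in ("spam", "ham"):
                total = sum(self.counts[label].values())
                # Add-one smoothing so unseen tokens never give zero.
                likelihood[label] = (self.counts[label][token] + 1) / (
                    total + len(vocab) + 1
                )
            log_odds += math.log(likelihood["spam"] / likelihood["ham"])
        return 1 / (1 + math.exp(-log_odds))


# Made-up training data, purely for illustration.
nb = NaiveBayes()
nb.train("<pre>V        I        A        G\n    R        A</pre>", "spam")
nb.train("<pre>C        H        E        A        P</pre>", "spam")
nb.train("Are we still meeting for lunch on Friday?", "ham")
nb.train("Here are the notes from the meeting.", "ham")

ascii_spam = "<pre>P        I        L        L        S</pre>"
collapsed = re.sub(r" +", " ", ascii_spam)  # what a space-compressing filter sees

print(tokenize(ascii_spam))
print(tokenize(collapsed))
print(nb.spam_probability(ascii_spam))
```

With the spaces intact, the new message shares the TAG:pre and SPACES:long features with the training spam even though none of its letters spell a known spam word. Once the spaces are collapsed, only the tag is left to go on.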

Posted by Eric at December 3, 2004 01:32 AM | TrackBack

Comments





TrackBack: http://www.spamblogging.com/mt/mt-tb.cgi/474

Listed below are links to weblogs that reference 'ASCII Spam' from spamblogging.
ASCII trackback spam.
Excerpt: It wasn't the same as ASCII spam, mind you. The bastard kept the consonants, like "-nl-n- p-k-r," but &#111;'d, &#105;'d and &#101;'d all the vowels. One of these days they'll mess around and start...
Weblog: ALLABOUTGEORGE.com
Tracked: February 2, 2005 09:39 AM