Alan Thwaits at eChannelLine discusses RPD (Recurrent Pattern Detection) and its potential to stop spam (or I should say "reduce" spam, since the goal on their part is to reduce it by 95%).
The report identified three main criteria for anti-spam solutions: the need to block 94 to 95 per cent of spam; the ability to trace spam in the first few minutes of an outbreak; and the ability to block spam regardless of the language or dialects used. Commtouch's RPD technology wins on all three counts, according to Oren Drori, the company's director of product marketing.
"Instead of looking at the content of each message, like other anti-spam technologies, RPD looks at the Internet as a whole," he explained. "It looks for recurrences of patterns in huge amounts of mail. If you can trace the repetitive pattern, you can block the spam."
Commtouch's anti-spam detection center monitors approximately 30 to 40 million messages a day, which Drori said is a representative sample of the Internet itself.
It sounds as if it works by monitoring a distributed selection of mail from across the net, regardless of location or language (I would imagine ideally spanning many countries and areas), for patterns that show up in some number N of messages. Once N is sufficiently large, it marks that signature as "spam", and all clients that see that signature react accordingly.
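To make my guess concrete, here is a minimal sketch of that counting idea in Python. It is not Commtouch's actual method, which I haven't seen documented. The `signature` function (a hash of the normalized body), the `RecurrenceDetector` class, and the threshold value are all hypothetical names and choices made up for illustration.

```python
import hashlib
from collections import Counter


def signature(body: str) -> str:
    """Hypothetical signature: hash of the lowercased, whitespace-collapsed body."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class RecurrenceDetector:
    """Counts how often each signature is seen across the sampled mail."""

    def __init__(self, threshold: int):
        # threshold plays the role of N: sightings needed before a
        # signature is treated as spam (an assumed, tunable value).
        self.threshold = threshold
        self.counts = Counter()

    def observe(self, body: str) -> None:
        """Record one sampled message."""
        self.counts[signature(body)] += 1

    def is_spam(self, body: str) -> bool:
        """True once this message's signature has recurred at least N times."""
        return self.counts[signature(body)] >= self.threshold


detector = RecurrenceDetector(threshold=3)
for _ in range(3):
    detector.observe("Buy   cheap pills NOW")
detector.observe("Lunch on Friday?")

print(detector.is_spam("buy cheap pills now"))  # seen 3 times after normalization
print(detector.is_spam("Lunch on Friday?"))     # seen only once
```

An exact hash like this would be trivially defeated by a single changed character, so a real system would presumably need a fuzzier notion of "pattern" than the one assumed here.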
I could be misunderstanding the technology and/or the methods used, but it sounds an awful lot like CloudMark's SpamNet, which in my experience is slow to catch on.
Perhaps this is different enough to actually work quite well; I don't know.
Posted by Eric at May 9, 2004 10:17 PM
Hey, I just report what I see - I didn't say I thought it worked well.
I would imagine they likely look at the headers and then look for X% of the e-mail being recurring - not so little so as to be triggered by a signature, and not so much so as to be fooled by the random text.
I would assume (hope) that there is more to it than that, but I haven't seen a full white paper on it - not sure if one is freely available since it is a commercial product.
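A rough sketch of that "X% recurring" guess, in Python. This is speculation rather than anything from Commtouch: the word-shingle approach, the function names, and the 0.5 threshold are all assumptions picked for illustration.

```python
def shingles(text: str, k: int = 3) -> set:
    """Set of overlapping k-word runs in the text (lowercased)."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def recurring_fraction(message: str, known: set, k: int = 3) -> float:
    """Share of the message's shingles already seen in bulk mail."""
    own = shingles(message, k)
    if not own:
        return 0.0
    return len(own & known) / len(own)


def looks_recurrent(message: str, known: set, x: float = 0.5) -> bool:
    """Flag the message when at least fraction x of it is recurring."""
    return recurring_fraction(message, known) >= x


spam = "cheap pills available now order today and save big money fast"
# Pretend both the spam body and a common signature line recur widely.
known = shingles(spam) | shingles("sent from my phone")

padded_spam = spam + " qz xv wk"  # random text appended by the spammer
legit = ("hi bob lunch on friday works for me see you at noon "
         "sent from my phone")

print(looks_recurrent(padded_spam, known))  # most of it still recurs
print(looks_recurrent(legit, known))        # only the signature recurs
```

With the threshold somewhere in the middle, a shared signature alone stays under it, while spam padded with a little random text stays over it.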
Posted by: Eric at May 10, 2004 09:01 PM
Hallo friends! Really nice place here. I found a lot of interesting stuff all around. Just what I was looking for. Great joy!
I hate spam!!!
Posted by: Soey Farina at September 3, 2004 05:32 AM
i thought that a lot of anti-spam technologies on the ISP end did this anyway?
a lot of spam emails just append random content and gibberish to the end of the message, or add extra punctuation and letters, to get past the filters.
"It looks for recurrences of patterns in huge amounts of mail"
so it still looks at the content of each message. but what if an individual quoted from one of those emails? would their email fail to get through and be classed as spam?
if it's looking at such a large area, then couldn't their results get diluted by the vast amounts of varying letters in each spam message? i suppose they would check the source also?
Posted by: flump at May 10, 2004 07:48 PM