High increase in spammy blog comments

At the beginning of January, I tried adding as many IP address blocks as possible to the blacklist and filtering more aggressively on unwanted keywords, unfortunately with limited results. The situation improved dramatically once I implemented a custom spam filter based on the following observations:

IP address ranges were widely distributed, and while some recurrence could be seen, less than half of the spam comments were caught by this list.
The text seems to be generated from highly adaptable templates, so blacklisting specific words does not work, e.g.
“{Hello|Hi} there, {simply|just} {turned into|became|was|become|changed into} {aware of|alert to} your {blog|weblog} {thru|through|via} Google, {and found|and located} that {it is|it's} {really|truly} informative. {I'm|I am} {gonna|going to} {watch out|be careful} for brussels. {I will|I'll} {appreciate|be grateful} {if you|should you|when you|in the event you|in case you|for those who|if you happen to} {continue|proceed} this {in future}. […]”
A review of the Apache logs did not yield any further distinctive keywords (e.g. in the user agent).
The only interesting field was the email address provided, which almost always followed this pattern: “Word 1 starting with a capital letter” + “Word 2 starting with a capital letter” + “number between 10 and 9999” (at) “one of a small list of major free email providers”, e.g. MailletQuijas95@yahoomail.com
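To illustrate the second observation, here is a minimal sketch of how such a spinning template can be expanded. It is only my guess at what the spammers' tooling does. The functions spinTemplate() and countVariants() are hypothetical helpers, not part of the plugin, and the template is a shortened excerpt of the one quoted above.

```php
<?php
// Hypothetical helper: picks one random option from every "{a|b|c}" group.
// Assumes groups are not nested, as in the quoted spam template.
function spinTemplate($template)
{
    return preg_replace_callback(
        '/\{([^{}]*)\}/',
        function ($m) {
            $options = explode('|', $m[1]);
            return $options[array_rand($options)];
        },
        $template
    );
}

// Hypothetical helper: counts how many distinct texts a template can produce
// by multiplying the number of options of each group.
function countVariants($template)
{
    preg_match_all('/\{([^{}]*)\}/', $template, $matches);
    $total = 1;
    foreach ($matches[1] as $group) {
        $total *= count(explode('|', $group));
    }
    return $total;
}

// Shortened excerpt of the template quoted above
$template = '{Hello|Hi} there, {simply|just} {turned into|became|was} '
          . '{aware of|alert to} your {blog|weblog}';

echo spinTemplate($template), "\n";                    // one random variant
echo countVariants($template), " possible variants\n"; // 48 possible variants
```

Even this short excerpt yields 48 different wordings, which is why blacklisting individual words in the comment body caught so little.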

This last point is exactly the logic implemented in dcCustomSpamFilter, using the following regular expression, with a great success rate:

    public $regexEmail = '([A-Z][a-z]+){2}([0-9]{2,4})@(123mail\.net|aol\.com|googlemail\.com|gnumail\.com|yahoomail\.com|hotmail\.com|mail\.com|gmail\.com|aim\.com)';
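The expression can be tried outside DotClear with a few lines of plain PHP. This is only a sketch: looksGenerated() is a hypothetical helper, and apart from the first address (quoted above) the sample addresses are made up for illustration.

```php
<?php
// Same expression as above; the '/' delimiters are added at match time,
// just as the filter does.
$regexEmail = '([A-Z][a-z]+){2}([0-9]{2,4})@(123mail\.net|aol\.com|googlemail\.com|gnumail\.com|yahoomail\.com|hotmail\.com|mail\.com|gmail\.com|aim\.com)';

// Hypothetical helper: true when the address follows the generated pattern.
function looksGenerated($email, $regexEmail)
{
    return preg_match('/'.$regexEmail.'/', $email) === 1;
}

$samples = array(
    'MailletQuijas95@yahoomail.com', // two capitalised words + number + listed provider
    'john.smith@gmail.com',          // no capitalised words, no number
    'JohnSmith@gmail.com',           // no trailing number
    'JohnSmith2014@example.org',     // provider not in the list
);

foreach ($samples as $email) {
    echo str_pad($email, 32),
         looksGenerated($email, $regexEmail) ? 'spam pattern' : 'no match',
         "\n";
}
```

Only the first address is reported as matching the spam pattern. Note that the expression is not anchored with ^ and $, so it also matches when the pattern appears inside a longer string, and a real visitor whose address happens to follow the same shape would be filtered as well.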

The complete code for this custom DotClear spam filter is below; it was placed in a newly created folder [DotClearRoot]/plugins/custom_antispam/:

_define.php

<?php
if (!defined('DC_RC_PATH')) { return; }

$this->registerModule(
    /* Name */        "Custom_antispam",
    /* Description */ "Custom Anti Spam Filter",
    /* Author */      "www.ness.ch/misc/",
    /* Version */     '0.1',
    /* Permissions */ 'usage,contentadmin',
    /* Priority */    200
);
?>

_prepend.php

<?php
if (!defined('DC_RC_PATH')) { return; }

global $__autoload, $core;

$__autoload['dcCustomSpamFilter'] = dirname(__FILE__).'/class.dc.filter.custom.antispam.php';
$core->spamfilters[] = 'dcCustomSpamFilter';
?>

class.dc.filter.custom.antispam.php

<?php    
//Source: http://fr.dotclear.org/documentation/2.0/resources/plugins/antispam
class dcCustomSpamFilter extends dcSpamFilter    
{     
    public $name = 'Custom anti spam Filter';
    public $has_gui = false;     
    public $regexEmail = '([A-Z][a-z]+){2}([0-9]{2,4})@(123mail\.net|aol\.com|googlemail\.com|gnumail\.com|yahoomail\.com|hotmail\.com|mail\.com|gmail\.com|aim\.com)';     
  
    protected function setInfo()     
    {     
        $this->description = __('My custom anti spam filter');     
    }
    
    /*     
This method takes the following parameters:
$type: the comment type (comment or trackback)
$author: the author's name
$email: the author's email address
$site: the URL of the author's site
$ip: the author's IP address
$content: the content of the comment
$post_id: the ID of the post on which the comment was posted
The last variable, $status, must be declared by reference (&$status), since it is used to pass back the comment status when the comment is marked as spam.
This method must return true if the message is spam and null if it cannot tell.
    */     
    
    public function isSpam($type,$author,$email,$site,$ip, $content,$post_id,&$status)     
    {     
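        // Flag the comment as spam when the email matches the generated-address pattern
        if (preg_match('/'.$this->regexEmail.'/',$email)) {
            $status = 'Filtered';
            return true;
        }
        // No match: return null ("don't know")
        return null;
    }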
    
    public function getStatusMessage($status,$comment_id)     
    {     
        return sprintf(__('Filtered by %s. - generated email match'),$this->guiLink());     
    }     
}     
?>
