How I block comment spam

Updated on November 14, 2023
By Pete Freitag

You would think that with custom-written blogging software (only two other blogs out there use this code) and no HTML allowed in comments, comment spammers would not waste their time on me. But they do.

Why do they bother with me?

Even though their URL will not be hyperlinked on my blog, and so earns them no PageRank, they still spam because they are hoping for the following:

They want me to click on the link before I delete it.
They are hoping that people subscribed to the comment thread will click on the link.
They are planting keywords on my pages so that someone searching for the term on Google may find my page and copy and paste the URL.

When I was at the bloggers BOF at CFUNITED, people mentioned that when using Ray Camden's BlogCFC software they got little, if any, comment spam. I think that is because the comment form is located in a popup window launched by JavaScript, so it's more of a hassle for spammers to spam them. I, however, would rather keep my comment form on my entry page, so it's easier for readers to post comments.

What I do to block comment spam

Here's what I do to block comment spam on this blog:

Check the HTTP Referer header to make sure the request is coming from my site. I know some people like to turn this off in their browser, but they won't be able to post comments unless they turn it on.
If the comment contains an HTML link, I reject it, giving the user a detailed message that tells them to just post the URL.
Check for a set of bad words. My list is very small, only about 10 words currently.
Check for [url]. A lot of comment spammers try to pass the links as [url]http://foo[/url]
Look for 5 or more URLs in the comment. Comment spammers often try to post 10-20 URLs at a time, so I just reject them. I use this regular expression: REFindNoCase("(http:.*){5,}", form.comment)
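
The five checks above could be sketched roughly as follows. This is a Python illustration, not the CFML this blog actually runs on. The host name, the bad-word list, the function name, and the rejection messages are all made up for the example, and the URL counter uses `https?://` rather than the `http:` pattern quoted above, so that `https` links are counted too.

```python
import re
from urllib.parse import urlparse

# Hypothetical values: the post gives neither the site host nor the word list.
SITE_HOST = "example.com"
BAD_WORDS = {"casino", "poker"}
URL_LIMIT = 5  # reject at five or more URLs, like the {5,} quantifier

HTML_LINK = re.compile(r"<a\s[^>]*href", re.IGNORECASE)
BBCODE_URL = re.compile(r"\[url", re.IGNORECASE)
URL = re.compile(r"https?://", re.IGNORECASE)


def check_comment(referer, comment):
    """Return a rejection reason, or None if the comment passes every check."""
    # 1. The Referer header must point at our own site.
    try:
        host = urlparse(referer or "").hostname
    except ValueError:
        host = None
    if host != SITE_HOST:
        return "bad referer"

    # 2. No HTML links; the real form would ask the user to post a bare URL.
    if HTML_LINK.search(comment):
        return "html link"

    # 3. Small bad-word list, matched on whole lowercase words.
    words = set(re.findall(r"[a-z0-9']+", comment.lower()))
    if words & BAD_WORDS:
        return "bad word"

    # 4. BBCode-style [url]...[/url] links.
    if BBCODE_URL.search(comment):
        return "bbcode url"

    # 5. Too many URLs in one comment.
    if len(URL.findall(comment)) >= URL_LIMIT:
        return "too many urls"

    return None
```

Counting matches with `findall` avoids the backtracking that a repeated group such as `(http:.*){5,}` can cause on long comments, though either approach flags the same kind of input.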


How I block comment spam was first published on July 19, 2005.


Comments

I was having the same problem (using my own custom blog software). I fixed it by building a simple spam filter. When I get a piece of spam, I move it to a spam filter table. All my spam was coming in from basically the same e-mail address and the same URLs, so my filter basically checks whether a new comment matches a bad one and, if so, silently rejects it.

While I was getting a handful of spam messages a week, I've dropped down to zero.

by Dan G. Switzer, II on 07/19/2005 at 7:31:03 PM UTC

A small modification to your regex to deal with https and to make it non-greedy.

REFindNoCase("(https?://.*?){5,}", form.comment)

by Michael Dinowitz on 07/19/2005 at 11:51:08 PM UTC