The Joys of Being a Webmaster
For the past few weeks I've been inundated with spam feedback on a couple of my sites. They seem to be from the same source, trying to promote online poker and phentermine. Here's an example that I've left un-deleted. The idea seems to be to raise the PageRank of certain sites by piggy-backing on others.
It's annoying for lots of reasons but mainly because it makes a mess of the site and makes finding information even harder than it already is. Until this morning I have been patiently deleting them as they pour in. Now I've made a change to stop it happening.
I can only guess at how they actually work, but I assume it's a "bot" that trawls the web looking for certain forms it can post information to. How much human intervention is required I do not know. To stop it happening I've made a change to the comment form on articles. It now requires that human touch - literally. There's a hidden field with no value. When you click the submit button it gets given a value. If the form is posted without a value in this field the document won't save. Simple but effective.
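The hidden-field check described above could be sketched roughly as follows. This is only an illustration under assumptions: the field name `humanCheck` and both function names are hypothetical, and the server-side check is written in JavaScript purely to show the logic, not because that is how the site implements it.

```javascript
// Hypothetical field name; the post does not say what the real field is called.
const HUMAN_FIELD = "humanCheck";

// Client side: call this from the submit button's click handler, e.g.
//   <input type="hidden" name="humanCheck" value="">
//   <input type="submit" onclick="markAsHuman(this.form)">
function markAsHuman(form) {
  form.elements[HUMAN_FIELD].value = "clicked";
}

// Server side (illustration only): refuse to save the document when the
// field is missing or empty, which is what a post without a click looks like.
function shouldSave(postedFields) {
  const value = postedFields[HUMAN_FIELD];
  return typeof value === "string" && value.trim() !== "";
}
```

A bot that posts the form without ever clicking the button sends the field empty, or not at all, so `shouldSave` returns false for it.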
So now that you're settled into your new home (congrats, by the way)... when can we expect a new article?
Thanks,
Brandon
Hey Jake,
If that is true, it might imply that a bot could be programmed to include the field in the post... admittedly going a bit far for a little page rank tweaking, but the door remains open. You might consider rotating the key for the field.
When the form is loaded, today's key is used to name the field. When the document is saved, the key is validated against today's key, stored somewhere handy like the ini file. Updating the key daily would mean the bot programmer would have to either update his bot daily (making your site too much trouble to bother with) or automate the key field extraction. That would require parsing a page from your site every time he wants to post some spam, which would slow down his bot quite a bit.
Some random thoughts based in no actual practical knowledge of how this is done. ;-)
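The rotating-key idea in the comment above might look something like this. It is a minimal sketch with hypothetical function names; how the key is generated, stored (the comment suggests the ini file) and rotated each day is deliberately left out, and the key is assumed to be plain alphanumeric text so it needs no escaping.

```javascript
// Build the hidden input whose *name* is today's key.
// Assumes the key is plain alphanumeric text.
function renderKeyField(todaysKey) {
  return '<input type="hidden" name="' + todaysKey + '" value="1">';
}

// Accept the post only if it carries a field named after today's key.
function isValidPost(postedFields, todaysKey) {
  return Object.prototype.hasOwnProperty.call(postedFields, todaysKey);
}
```

One edge case to bear in mind: a form loaded just before the key changes and submitted just after would be rejected, so a real version might also accept the previous key.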
Hi Jake
You should try the W3C html validator.
It shows that this page has 108 errors in its html syntax!
Try this link:
{Link}
Regards BoFrede
Sorry for the above blank comments, was testing a theory...
Remember a form post is simply a URL that accepts data, so you don't even need a web browser, just something to make an HTTP post to the URL. The blank comments were made by entering the post?CreateDocument&ParentUNID=[unid] URL into the browser address field. I was trying to add the fields by appending &BlogDateKey=xxxxx etc., but it didn't work for some reason.
@BoFrede - I am going to say this as nicely as I can because English isn't your first language, and you may have not meant to be as short as you were. Jake has done more to advance web standards among Domino developers than anyone I know. To suggest that he needs to validate his HTML or that he is unaware of the w3c as a standards body is a laughable commentary on your own puffed up arrogance. Your web site looks promising, but we'll see how well it validates after 5 or 6 years with all of the content still intact. All that said, if your comment was meant only to drive traffic to your own web site in some sort of non-automated comment spam way, then I guess it worked on me.
Brandon. Soon. I hope!
Jerry. I thought of that, but considered it overkill until they show they are that clever.
BoFrede. I'm sure it's concern that makes you point this out, but I am well past caring whether or not this site validates. With Domino it's just not worth the effort. It's not that I don't care. I do. Check validation on another of my sites {Link}
Andrew. No worries. It wasn't this form that I used the new code on though. It was the comments for articles, rather than blogs.
Chris. Thanks ;-)
Jake. Yes it is concern, and a hope that you would write a piece on how you fixed the HTML. It would inspire others to take on the same task. I totally agree that Domino has some problems of its own generating valid HTML, but with Domino 6 and beyond it has gotten a lot better.
Keep inspiring ;-)
BoFrede.
I found the easiest way to block out spam bots is to add some text to the end of the post url after the submit - then there is not a field for the bot to search out (which some do).
ie:
Have some JavaScript run on the "onsubmit" of the comment form and add some text to the post URL:
eg document.forms['_DominoForm'].action=document.forms['_DominoForm'].action+'&chk=sometext'
Then, when the server receives the document, check that the URL contains the text you added before accepting the comment. This is undetectable by the bots (they rescan your site every visit - they only remember the site URL, not the post URL), so you can say goodbye to them. It's one of the methods I use to block about 500 of these b*ggers a week.
steve
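Both halves of the approach in the comment above could be sketched like this. The parameter name `chk` and value `sometext` come from the comment itself; the function names are hypothetical, and the server-side half is shown in JavaScript only to illustrate the check, not as the commenter's actual code.

```javascript
const CHECK_PARAM = "chk";      // parameter name from the comment above
const CHECK_VALUE = "sometext"; // placeholder value from the comment above

// Client side: call from the form's onsubmit handler.
// The guard stops the text being appended twice on a repeated submit.
function addCheck(form) {
  if (form.action.indexOf("&" + CHECK_PARAM + "=") === -1) {
    form.action += "&" + CHECK_PARAM + "=" + encodeURIComponent(CHECK_VALUE);
  }
}

// Server side (illustration only): accept the comment only if the
// request URL carries the extra text.
function hasCheck(requestUrl) {
  const query = requestUrl.split("?").slice(1).join("?");
  return query.split("&").includes(CHECK_PARAM + "=" + encodeURIComponent(CHECK_VALUE));
}
```

For example, `addCheck(document.forms['_DominoForm'])` would turn a post URL ending in `?CreateDocument` into one ending in `?CreateDocument&chk=sometext`.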
{Link}
Maybe a little off-topic, but this is an article about promotion with a blog. (I hope you can read German!)
Steve,
How do you track the blog spam hits if you're blocking them, or is the 500 number based on garbage not received as compared to earlier onslaughts?
And your method is interesting and has other implications. I used that trick once to solve a URL caching issue. A meta tag wasn't doing it for some reason (probably my bad use of it :-) ) so I used a simple JavaScript random bit of text and that made IE think each URL was unique and hence reloaded every time.
Interesting bit of quirkiness, all the same.
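The caching trick mentioned above, adding a random bit of text so that each URL looks unique, might be sketched as below. The parameter name `nocache` and the function name are assumptions made for illustration; the comment does not say what was actually used.

```javascript
// Append a throwaway parameter so the browser treats every URL as new.
// The token defaults to a timestamp plus random text; it can be passed in
// explicitly, which is mainly useful for testing.
function uniqueUrl(url, token = Date.now().toString(36) + Math.random().toString(36).slice(2)) {
  const separator = url.includes("?") ? "&" : "?";
  return url + separator + "nocache=" + token;
}
```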
Jerry - it allows the document to be saved but sets a marker to not be used - then I have a view where these comments can be approved (there are other block methods as well) if I wish or removed and automatically added to an IP block list.
That's how I can see what I get through - shows me the method works as well.
Maybe it's better to do it with HTTP_REFERER in an agent?
The agent checks whether the Referer starts with your site name and, if it doesn't, ignores the post.
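The Referer check suggested above could be sketched as follows, again in JavaScript for illustration rather than as real agent code. The function name and the `example.com` host are hypothetical. Note that the Referer header is supplied by the client, so a bot can forge it; this raises the bar rather than closing the door.

```javascript
// Return true only when the Referer points at a page on our own site.
// Matching on "host/" (or the bare host) avoids being fooled by look-alike
// hosts such as example.com.evil.test.
function refererIsOwnSite(referer, siteName) {
  if (typeof referer !== "string") return false;
  const lower = referer.toLowerCase();
  return ["http://", "https://"].some(function (scheme) {
    const prefix = scheme + siteName.toLowerCase();
    return lower === prefix || lower.startsWith(prefix + "/");
  });
}
```

An agent using this logic would simply ignore any post for which `refererIsOwnSite(referer, "example.com")` is false.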
BoFrede,
young Mike Golding {Link} has written exhaustively on this subject (X/D/HTML validation), and many others (e.g. Javascript) for a Domino audience.
(Oops, did I just bump up Mike's PageRank? To be fair I'll have to go and bump yours up from his site, Jake ;-o)