<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>WSI Blog &#187; Security</title>
	<atom:link href="http://www.wsi-ebizsolutions.biz/blog/category/security/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.wsi-ebizsolutions.biz/blog</link>
	<description>Website Development and Internet Marketing Blog</description>
	<lastBuildDate>Fri, 23 Jul 2010 13:12:14 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Propagation of Misconceptions About IT in the Media</title>
		<link>http://www.wsi-ebizsolutions.biz/blog/propagation-misconceptions-media/2010/03/</link>
		<comments>http://www.wsi-ebizsolutions.biz/blog/propagation-misconceptions-media/2010/03/#comments</comments>
		<pubDate>Tue, 09 Mar 2010 16:18:54 +0000</pubDate>
		<dc:creator>Barnaby Knowles</dc:creator>
				<category><![CDATA[Security]]></category>
		<category><![CDATA[Website Development]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.wsi-ebizsolutions.biz/blog/?p=782</guid>
		<description><![CDATA[IT and technology-related issues are frequently reported in the media. Many times articles contain factual inaccuracies. The problem is that reporters are not "techies" and often provide misleading or erroneous analyses.]]></description>
			<content:encoded><![CDATA[<p>IT and technology-related issues are frequently reported in the media. A well-known website is hacked, a new technology is unveiled, user feedback is discussed&#8230; Many times articles contain factual inaccuracies. The problem is that reporters are not &#8220;techies&#8221; and often provide misleading or erroneous analyses.</p>
<p><span id="more-782"></span></p>
<h2>Fake drug scam hijacks UK college websites</h2>
<p>The <a href="http://news.bbc.co.uk/1/hi/technology/8550219.stm" rel="external nofollow">BBC</a> recently reported that &#8220;UK academic institutions have unwittingly become the accomplices of criminals selling fake drugs online.&#8221; The article went on to state that this had happened because spammers had &#8220;exploited vulnerabilities&#8221; in the PHP scripting language. As a PHP programmer I take exception to this claim, as should website owners and web design agencies, for reasons that I will explain later.</p>
<p>The article reported how academic websites with the .ac.uk domain extension were unwittingly forwarding visitors to websites selling fake drugs online. Without going into much detail, it explained that spammers had injected code into the web pages, seemingly exploiting vulnerabilities in PHP, to make Google and other search engines believe that the pages were relevant for searches related to prescription drugs such as Viagra. When a user searched for those terms and visited the website via the link on the search engine results page (SERP), the injected code would detect this and redirect them to the online pharmacy. When a user visited the website by typing in the URL directly or via a non-drug-related search, the normal page was displayed.</p>
<h2>Deliberately Targeted</h2>
<p>This is not a random attack; the websites had been specifically targeted. Academic institutions rank very well in search engines because, put simply, they are inherently trusted and as such, the .ac.uk domain extension carries a lot of weight.</p>
<p>An attack like this is also clever because it doesn&#8217;t place visible links to spammers&#8217; websites or make any obvious changes to the web page that has been compromised. Visitors only get redirected to the online pharmacies if they are actually searching for specific terms. This way the website administrator may never realise that their scripts have been compromised, and so never remove the spammers&#8217; code!</p>
<p>One such website that has been affected is <a href="http://www.rave.ac.uk/" rel="external nofollow">Ravensbourne College of Design and Communication</a>. Amazingly, four days after the BBC reported that their website had been compromised (and presumably even longer since they found out), the injection is still in place! If you visit <a href="http://www.rave.ac.uk/" rel="external nofollow">www.rave.ac.uk</a> you will see the college&#8217;s official website. If you <a href="http://www.google.com/search?q=Ravensbourne+College+of+Design+and+Communication">search for the college</a> and follow the link in the SERP you will also see the college&#8217;s official website. However, if you <a href="http://www.google.com/search?q=viagra+site%3Arave.ac.uk">search for Viagra</a> and follow the link in the SERP you will not end up on the college&#8217;s website at all, but at a &#8220;Canadian online pharmacy&#8221;!</p>
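<p>The manual check described above can be roughly automated. The following is a minimal Python sketch, not a description of how the real injection works: it assumes the injected code keys on the HTTP <em>Referer</em> header and answers with an ordinary HTTP redirect, and the function names <em>final_host</em> and <em>looks_cloaked</em> are invented for illustration.</p>

```python
import urllib.request
from urllib.parse import urlparse


def final_host(url, referer=None, timeout=10):
    """Follow any HTTP redirects and return the host name we finally land on."""
    headers = {"User-Agent": "Mozilla/5.0 (cloaking-check sketch)"}
    if referer:
        # Assumption: the injected code inspects the Referer header.
        headers["Referer"] = referer
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return urlparse(response.geturl()).hostname


def looks_cloaked(url, search_referer, fetch=final_host):
    """True if a visit that claims to come from a search page ends on another host."""
    direct = fetch(url)
    via_search = fetch(url, search_referer)
    return direct != via_search
```

<p>Calling <em>looks_cloaked</em> with a page URL and a search results URL as the referrer compares the two visits. A redirect performed in JavaScript, or one that depends on the visitor&#8217;s user agent or IP address, would not be caught by this sketch.</p>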
<h2>What Cost?</h2>
<p>This is all very unfortunate for the college. The negative publicity alone would be bad enough, but they will also have to spend time and money removing the injected code and then plugging the holes that allowed an exploit of this type to happen in the first place.</p>
<p>I mentioned earlier that I would object to the reporting of this as a PHP exploit. I believe that this is inaccurate and could lead people to believe that PHP is inherently less secure than other scripting languages. In fact I would not call this a PHP exploit at all &#8211; it&#8217;s slack coding that could have resulted in the same thing happening no matter what scripting language the website was developed in.</p>
<p>From the scant technical details offered in the original article it would appear that the affected websites do not properly validate and filter user input. Of course a website developed in PHP would be vulnerable to rogue code injection attacks if user input is not validated correctly. But for that matter so would a website developed in any other scripting language!</p>
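<p>To show that input validation is a coding discipline rather than a feature of one language, here is a minimal sketch in Python. The page names, the character rule and the function names are all hypothetical; the point is only that untrusted input is checked against what the site expects before it is used.</p>

```python
import re

# Hypothetical allow-list: the only page names this site is willing to serve.
ALLOWED_PAGES = {"home", "courses", "contact"}

# Hypothetical rule for a free-text field: letters, digits, spaces, basic punctuation.
SAFE_TEXT = re.compile(r"[A-Za-z0-9 .,'!?-]{1,200}")


def resolve_page(requested):
    """Map untrusted input onto a known page; never build a file path from it."""
    return requested if requested in ALLOWED_PAGES else "home"


def is_safe_text(value):
    """Accept only values made up entirely of expected characters."""
    return SAFE_TEXT.fullmatch(value) is not None
```

<p>With this approach a request for an unexpected page name falls back to the home page, and a field containing markup or script fragments is rejected outright instead of being passed on to the rest of the application.</p>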
<p>Aside from my personal objections to the labelling of this incident as a vulnerability within PHP, website owners and web design agencies should also consider how media reports like this affect them. If a potential client reads the article and takes from it that &#8220;PHP is not secure&#8221; or &#8220;PHP websites get hacked easily&#8221; and then you pitch a PHP-based website to them, how will that affect your chances of winning the contract? And to a lesser extent, how do stories like this affect the public&#8217;s perception of your current website? Would they feel safe buying from you online when such high-profile PHP websites fall victim to hacking?</p>
<h2>Conclusion</h2>
<p>But what can we in the IT community do? It&#8217;s not realistic to expect journalists to understand that this was poor coding rather than an insecure scripting language. The best that we can do is be aware of what stories are floating around in the news and be sure that we understand and can explain the real issues!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.wsi-ebizsolutions.biz/blog/propagation-misconceptions-media/2010/03/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Password Protecting Websites with Apache .htaccess</title>
		<link>http://www.wsi-ebizsolutions.biz/blog/password-protecting-websites-apache-htaccess/2010/01/</link>
		<comments>http://www.wsi-ebizsolutions.biz/blog/password-protecting-websites-apache-htaccess/2010/01/#comments</comments>
		<pubDate>Wed, 27 Jan 2010 12:03:48 +0000</pubDate>
		<dc:creator>Barnaby Knowles</dc:creator>
				<category><![CDATA[Security]]></category>
		<category><![CDATA[Website Development]]></category>
		<category><![CDATA[.htaccess]]></category>
		<category><![CDATA[Apache]]></category>
		<category><![CDATA[password]]></category>

		<guid isPermaLink="false">http://www.wsi-ebizsolutions.biz/blog/?p=656</guid>
		<description><![CDATA[There are likely to be areas of your website that you don't want others to be able to access, such as admin areas. Or sometimes you might want to do some quick updates to your code without the website being accessible to the public. If you're using the Apache web server the hypertext access (.htaccess) file lets you add password protection in a flash!]]></description>
			<content:encoded><![CDATA[<h2>Keep Out!</h2>
<p>There are likely to be areas of your website that you don&#8217;t want others to be able to access, such as admin areas. Or sometimes you might want to do some quick updates to your code without the website being accessible to the public. If you&#8217;re using the Apache web server the hypertext access (.htaccess) file lets you add password protection in a flash!</p>
<p><span id="more-656"></span></p>
<h2>.htaccess</h2>
<p>Apache has a built-in way of protecting entire directories (and sub-directories) from unauthorised users. Let&#8217;s assume that you are protecting your /admin/ directory.</p>
<p>If you don&#8217;t have an .htaccess file in the admin directory you will need to create one. The .htaccess file then needs just 4 lines of code to turn on password protection:</p>
<pre style="margin: 20px; padding: 10px; background-color: #E4E4E4; border-left: 3px #C0C0C0 solid;">
AuthType Basic
AuthName "Admin Area"
AuthUserFile /system/path/to/.htpasswd
Require valid-user
</pre>
<h3>AuthType</h3>
<p>The <em>AuthType</em> directive selects the method that is used to authenticate the user. &#8220;Basic&#8221; is the most common method and is fine for what we are trying to accomplish.</p>
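<p>It helps to know what &#8220;Basic&#8221; actually puts on the wire: the browser joins the username and password with a colon, Base64-encodes the result and sends it in the <em>Authorization</em> header. The Python sketch below illustrates that encoding (the function names are made up for this example). Because Base64 is trivially reversible, Basic authentication is best combined with HTTPS.</p>

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value a browser sends for Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value):
    """Reverse the encoding, showing that Base64 hides nothing."""
    scheme, token = header_value.split(" ", 1)
    if scheme != "Basic":
        raise ValueError("not a Basic credential")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password
```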
<h3>AuthName</h3>
<p>The <em>AuthName</em> directive sets the <em>Realm</em> that will be used during authentication. The <em>Realm</em> has two uses. Firstly, the web browser often presents this information to the user as part of the password dialogue box. Secondly, it is used by the web browser to determine which password to send for a given authenticated area. Once a user has authenticated in one <em>Realm</em>, the web browser will automatically retry the same password for any area on the same server that is marked with the same <em>Realm</em>. This means that a user will not be prompted for a password more than once if multiple restricted areas share the same <em>Realm</em>.</p>
<h3>AuthUserFile</h3>
<p>The <em>AuthUserFile</em> directive sets the path to the password file that stores usernames and encrypted passwords for your users. This is the absolute path on the server&#8217;s file system, not the path within your web space.</p>
<h3>Require</h3>
<p>The <em>Require</em> directive provides the authorisation part of the process by specifying which users are allowed to access this area. You can name a single user, or use &#8220;valid-user&#8221; to allow in anyone who is listed in the password file and who correctly enters their password.</p>
<h2>Set Up A .htpasswd User List</h2>
<p>Once the <em>.htaccess</em> file has been set up, we need to create the <em>.htpasswd</em> file so that Apache knows which users should be granted access to our admin area.</p>
<p>.htpasswd files are text files that list each user and their encrypted password on a new line like so:</p>
<pre style="margin: 20px; padding: 10px; background-color: #E4E4E4; border-left: 3px #C0C0C0 solid;">
admin:PAzNeZcFJV3Vk
bob:oRCu8rlaPEaTs
frank:1VhSkx7Q37ZYQ
</pre>
<p>(Passwords are encrypted with a one-way algorithm, strictly speaking a hash, so you can&#8217;t decrypt the password even if you know the encrypted value.)</p>
<p>Apache comes with a utility that will generate your .htpasswd file and add users with encrypted passwords that you specify. However, we will assume that you don&#8217;t have shell access, which is needed to use Apache&#8217;s <em>htpasswd</em> utility.</p>
<p>An easier way to generate your .htpasswd file is to use one of the many online .htaccess password generators that will pretty much do everything for you.</p>
<p>One such website is Dynamic Drive&#8217;s <a href="http://tools.dynamicdrive.com/password/" target="_blank">.htaccess password generator</a>, which provides the code needed in both your .htaccess and .htpasswd files.</p>
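<p>If you would rather not type a password into a third-party website, a line for the .htpasswd file can also be produced locally. The Python sketch below uses the <em>{SHA}</em> format that Apache accepts (Base64 of the SHA-1 digest), chosen here only because it needs nothing beyond the standard library. It is a different format from the sample entries shown earlier, it is unsalted and therefore weak, and the function name is hypothetical. Where shell access is available, Apache&#8217;s own <em>htpasswd</em> utility with a stronger algorithm is the better choice.</p>

```python
import base64
import hashlib


def htpasswd_sha_line(username, password):
    """Return one .htpasswd line in Apache's {SHA} format."""
    if ":" in username:
        raise ValueError("usernames in .htpasswd cannot contain a colon")
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return username + ":{SHA}" + encoded
```

<p>Each returned line goes on its own row of the .htpasswd file, in the same user:value layout as the examples above.</p>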
<h2>Turn Password Protection On</h2>
<p>Once you have generated your <em>.htaccess</em> and your <em>.htpasswd</em> files you simply upload them to the directory that you wish to protect. Take care not to overwrite an existing <em>.htaccess</em> file, or you will lose the functionality that it added. If an <em>.htaccess</em> file already exists you should simply add your new code to it.</p>
<h2>Conclusion</h2>
<p>Apache comes with methods of setting up basic authentication in minutes, so you no longer need to worry about unauthorised users accessing parts of your website that you want to keep private!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.wsi-ebizsolutions.biz/blog/password-protecting-websites-apache-htaccess/2010/01/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Reducing Form Spam Without the Use of a CAPTCHA</title>
		<link>http://www.wsi-ebizsolutions.biz/blog/reducing-form-spam-without-captcha/2009/11/</link>
		<comments>http://www.wsi-ebizsolutions.biz/blog/reducing-form-spam-without-captcha/2009/11/#comments</comments>
		<pubDate>Mon, 16 Nov 2009 12:59:38 +0000</pubDate>
		<dc:creator>Barnaby Knowles</dc:creator>
				<category><![CDATA[Security]]></category>
		<category><![CDATA[Website Development]]></category>

		<guid isPermaLink="false">http://www.wsi-ebizsolutions.biz/blog/?p=365</guid>
		<description><![CDATA[Form spam is a growing problem for webmasters. Using these 9 simple tips it's easy to greatly reduce the amount of form spam that you receive without the use of a complicated CAPTCHA.]]></description>
			<content:encoded><![CDATA[<h2>The problem of form spam</h2>
<p>Form spam is a growing problem for webmasters. Through our &#8220;contact us&#8221; feedback forms we&#8217;ve all received the ubiquitous emails advertising everything from the little blue pill to cut-price designer timepieces. Bloggers will also be used to receiving lots of comments linking back to the poster&#8217;s own website or advertising various wares. The vast majority of this form spam is automated, meaning that a bot comes along and submits the form rather than a human being.<br />
<span id="more-365"></span></p>
<h2>Blocking spambots with a CAPTCHA</h2>
<p>Probably the most popular way to protect feedback forms from being spammed is to install a CAPTCHA. A CAPTCHA is a <strong>C</strong>ompletely <strong>A</strong>utomated <strong>P</strong>ublic <strong>T</strong>uring test to tell <strong>C</strong>omputers and <strong>H</strong>umans <strong>A</strong>part. There are many different types of CAPTCHAs, but the most common is a distorted image containing a word or phrase that the visitor submitting the form must correctly enter into one of the form fields. Bots cannot &#8220;see&#8221; the image so they cannot enter the correct word or phrase and submit the form.</p>
<p>Hate them as much as you want, but spammers aren&#8217;t stupid. As the use of CAPTCHA images increased, spammers started to defeat them by employing optical character recognition (OCR) technology that could actually &#8220;read&#8221; the images (much as OCR software turns a scanned document back into editable text). Unless a spammer <strong>really</strong> wants to submit your web form, it&#8217;s unlikely that a bot of this sophistication will be unleashed on your website. So whilst CAPTCHA images are still a good deterrent, they aren&#8217;t the last word in form spam prevention.</p>
<p>There are also other reasons why an image CAPTCHA might not be suitable for your website:</p>
<ul>
<li>CAPTCHA images may make it harder for a visually impaired visitor to contact you.</li>
<li>Visitors don&#8217;t like having to decipher CAPTCHA images!</li>
<li>CAPTCHA images won&#8217;t stop human spammers filling your form in manually.</li>
</ul>
<p>That&#8217;s not a definitive list of reasons why you might not want to use a CAPTCHA, but the point is that reasons do exist!</p>
<h2>How to block spam without using a CAPTCHA</h2>
<p>I have had a great deal of success in blocking feedback form spam by filtering user input to identify spam. Spambots appear to operate in similar ways, and patterns can be identified and used to block their form submissions. By scanning input for certain words, phrases or patterns you should be able to virtually eliminate feedback form spam without inconveniencing genuine visitors.</p>
<p>These tips should work in any programming language, whether your website is programmed in PHP, ASP, ColdFusion, etc., as most (if not all) have functions to find a text string within a larger text string.</p>
<h2>Things to look for in user input</h2>
<p>Spambots change their behaviour all the time. The items below do not constitute a definitive list of things to check for, but if you look for these you should greatly reduce the amount of form spam that you receive.</p>
<h3>PHP-specific hijacking</h3>
<p>PHP has the mail() function that allows the webmaster to send email through his website. It is possible for a spammer to craft his form input so as to inject additional headers into the webmaster&#8217;s email and thereby add new recipients to the message. If he successfully accomplishes this he can send large volumes of spam through the victim&#8217;s website. Many times this type of hijack will contain the phrase <strong>MIME-Version:</strong> and/or <strong>Content-Type</strong>. As these are not phrases that genuine visitors are likely to be using, we can assume that any input that includes these phrases is spam.</p>
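<p>As a rough, language-neutral sketch of this check (written in Python rather than PHP), the function below flags any submitted field that contains one of the two phrases. The function name and the case-insensitive comparison are assumptions made for illustration; they are not part of any particular form script.</p>

```python
# Hypothetical helper: flags input containing the mail-header phrases named above.
HEADER_PHRASES = ("mime-version:", "content-type")


def looks_like_header_injection(*fields):
    """Return True if any submitted field contains a suspicious header phrase."""
    for value in fields:
        lowered = value.lower()  # assumption: compare case-insensitively
        if any(phrase in lowered for phrase in HEADER_PHRASES):
            return True
    return False
```

<p>Every field that ends up in the email (name, email address, subject and message) can be passed to the same function before the message is sent.</p>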
<h3>Email addresses at your website</h3>
<p>A lot of the time spambots will use an email address at your website when filling out your form. So if the visitor has used an email address at your website (e.g. sales@your-domain.com) when submitting the form, you can assume that it is spam. Look for <strong>@your-domain.com</strong> (replacing your-domain.com with your own domain!).</p>
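<p>A minimal sketch of this test in Python might look like the following. The function name is hypothetical, and matching only at the end of the address (rather than anywhere in it) is an assumption intended to avoid flagging addresses at unrelated domains that merely start with the same text.</p>

```python
OWN_DOMAIN = "your-domain.com"  # placeholder from the text; replace with your own domain


def uses_own_domain(email, own_domain=OWN_DOMAIN):
    """Return True if the submitted email address is at the website's own domain."""
    cleaned = email.strip().lower()
    return cleaned.endswith("@" + own_domain.lower())
```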
<h3>HTML links/code</h3>
<p>Spammers often try to submit lots of HTML links in the hope that your form sends you an HTML formatted email and you&#8217;ll visit their links. Unless you&#8217;re expecting your visitors to be sending you HTML code you can filter out any messages containing <strong>a href=</strong> as spam.</p>
<h3>BBCode</h3>
<p>Similarly, spammers often try to submit lots of BBCode links. So unless you&#8217;re expecting your visitors to be sending you BBCode you can filter out any messages containing <strong>[url</strong> as spam.</p>
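<p>The HTML and BBCode checks are both simple substring searches, so they can share one function. The Python sketch below is only an illustration; the function name is made up, and lower-casing the message first is an assumption so that <strong>A HREF=</strong> and <strong>[URL</strong> are caught as well.</p>

```python
# Markers taken from the two sections above: HTML links and BBCode links.
LINK_MARKERS = ("a href=", "[url")


def contains_link_markup(message):
    """Return True if the message contains HTML or BBCode link markup."""
    lowered = message.lower()
    return any(marker in lowered for marker in LINK_MARKERS)
```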
<h3>URLs</h3>
<p>Along the same lines as the two points above, spambots often try to submit lots of plain URLs. This is a little more complicated than the former two examples of spam because you might want to allow genuine visitors to include URLs in their message. My approach has been to count the number of times <strong>http://</strong> appears in the user input and flag any message with more than 2 URLs as spam.</p>
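<p>Counting URLs could be sketched in Python as follows. The names are hypothetical, and also counting <strong>https://</strong> is an extension assumed here for illustration; the approach described above counts <strong>http://</strong> only.</p>

```python
import re

# Matches "http://" and, as an assumed extension, "https://", in any letter case.
URL_PREFIX = re.compile(r"https?://", re.IGNORECASE)

MAX_URLS = 2  # limit taken from the text


def count_urls(message):
    """Count how many URL prefixes appear in the message."""
    return len(URL_PREFIX.findall(message))


def has_too_many_urls(message, max_urls=MAX_URLS):
    """Return True if the message contains more URLs than the allowed limit."""
    return count_urls(message) > max_urls
```

<p>A message with one or two links passes, so a genuine visitor can still point you to a page they are asking about.</p>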
<h3>Short messages</h3>
<p>Spammers will sometimes test your form for things that they can exploit. Typically they&#8217;ll just enter a short message such as &#8220;Nice site!&#8221; or something similar. You can check the length of the message and flag messages shorter than 11 characters as spam. After all, what genuine visitor would send a worthwhile message that contained fewer than 11 characters?</p>
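<p>The length check is a one-liner in most languages. In this Python sketch the function name is hypothetical, and trimming surrounding whitespace before measuring is an assumption, made so that padding a message with spaces does not get it past the filter.</p>

```python
MIN_MESSAGE_LENGTH = 11  # threshold taken from the text


def is_too_short(message, min_length=MIN_MESSAGE_LENGTH):
    """Return True if the trimmed message has fewer than min_length characters."""
    return len(message.strip()) < min_length
```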
<h3>Common spam email subjects</h3>
<p>Spammers often use the same subject line when completing web forms. I have seen a lot of form spam with the subject &#8220;<strong>some sites</strong>&#8221;. It&#8217;s not a subject that I would expect any genuine visitors to be using so any form submissions with that subject can be marked as spam.</p>
<h3>Source email address</h3>
<p>I have also seen a large number of spambots use <strong>hotmail.ru</strong> email addresses. Unless you expect any Russian visitors to be contacting you, you can flag any form submissions using this domain for the email address as spam.</p>
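<p>Both the subject check and the source address check come down to comparing input against a short blocklist. The Python sketch below uses the two examples mentioned above; the set and function names are hypothetical, and you would extend the lists with whatever patterns turn up in your own spam.</p>

```python
# Example entries taken from the text; extend these with your own observations.
BLOCKED_SUBJECTS = {"some sites"}
BLOCKED_EMAIL_DOMAINS = {"hotmail.ru"}


def has_blocked_subject(subject):
    """Return True if the whole subject line matches a known spam subject."""
    return subject.strip().lower() in BLOCKED_SUBJECTS


def has_blocked_email_domain(email):
    """Return True if the part after the last '@' is a blocked domain."""
    domain = email.strip().lower().rpartition("@")[2]
    return domain in BLOCKED_EMAIL_DOMAINS
```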
<h3>Random spam email subjects</h3>
<p>Something that I have started to see more of is the subject containing a random string such as <strong>fXNmOtGchIdBGvA</strong>. OK so if every spam submission subject is random, how do you block it? Well that string is 15 characters long. How many times would the subject of a genuine form submission be 15 characters long with no spaces? That would require the genuine visitor to be using a single word of 15 characters or more, which seems highly unlikely to me.</p>
<p>Filter out this form of spam by checking the length of the subject line. If it&#8217;s 15 characters or more, check for the existence of a space. If none exists then it&#8217;s likely that the message is spam.</p>
<p>This type of spam is usually accompanied by one or more URLs in the message. So if we want to be careful not to block legitimate visitors&#8217; messages, we can check the subject line and then also check whether the message contains a URL. If both conditions are fulfilled then it&#8217;s a pretty safe bet that the message is spam.</p>
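<p>Put together, the random-subject filter might be sketched like this in Python. The function and parameter names are assumptions for illustration; the two conditions (a subject of 15 or more characters with no space, plus a URL in the message) are the ones described above.</p>

```python
def looks_like_random_subject(subject, message, min_length=15):
    """Flag a long, space-free subject only when the message also contains a URL."""
    subject = subject.strip()
    long_single_word = len(subject) >= min_length and " " not in subject
    has_url = "http://" in message.lower()
    return long_single_word and has_url
```

<p>Requiring both conditions means that a visitor who really does use one very long word as a subject is only blocked if they also include a link.</p>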
<h2>Safeguards</h2>
<p>Although unlikely, it is possible that a genuine user might trigger one of the spam filters above. Whenever I implement these measures I also assign a useful error message to each filter so that when one is tripped, the user is told exactly why their message has not been accepted. They can then change the offending text. Perhaps you won&#8217;t want to reveal your exact phrases or limits just in case a human spammer is accessing your web form, but providing users with an explanation of how to amend their input to pass your filters is a good idea.</p>
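<p>One way to pair filters with error messages is to collect a hint for every filter that trips and show the list back to the visitor. The Python sketch below is only an illustration: the function name and the wording of the messages are made up, and the messages deliberately avoid stating the exact limits.</p>

```python
def check_submission(subject, message):
    """Return a list of user-facing hints; an empty list means the input passed."""
    problems = []
    lowered = message.lower()
    # Short-message filter (fewer than 11 characters after trimming).
    if len(message.strip()) < 11:
        problems.append("Your message is too short. Please add a little more detail.")
    # Plain-URL filter (more than 2 occurrences of http://).
    if lowered.count("http://") > 2:
        problems.append("Your message contains too many links. Please remove some.")
    # HTML and BBCode link filters.
    if "a href=" in lowered or "[url" in lowered:
        problems.append("Please send plain text only, without HTML or BBCode.")
    return problems
```

<p>If the returned list is empty the form can be processed as normal; otherwise the form is redisplayed with the hints so that a genuine visitor can amend their input.</p>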
<h2>Conclusion</h2>
<p>By filtering visitor input using these 9 tips you should be able to virtually eliminate form spam. However, as webmasters find new ways to stop spambots, the spammers find new ways to get past our filters. As such, user input filtering is an ongoing process and more filters must be added over time. Luckily for the webmaster, spambots&#8217; submissions usually have some discernible pattern that a human can identify and filter out.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.wsi-ebizsolutions.biz/blog/reducing-form-spam-without-captcha/2009/11/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Banning Bad Bots Using The global.asa File In Classic ASP</title>
		<link>http://www.wsi-ebizsolutions.biz/blog/ban-bad-bots-global_asa-classic-asp/2009/10/</link>
		<comments>http://www.wsi-ebizsolutions.biz/blog/ban-bad-bots-global_asa-classic-asp/2009/10/#comments</comments>
		<pubDate>Fri, 23 Oct 2009 13:02:10 +0000</pubDate>
		<dc:creator>Barnaby Knowles</dc:creator>
				<category><![CDATA[Security]]></category>
		<category><![CDATA[Website Development]]></category>
		<category><![CDATA[asp]]></category>
		<category><![CDATA[ban]]></category>
		<category><![CDATA[bot]]></category>
		<category><![CDATA[global.asa]]></category>

		<guid isPermaLink="false">http://www.wsi-ebizsolutions.biz/blog/?p=196</guid>
		<description><![CDATA[Use the global.asa file in classic ASP to ban bad bots by identifying their user-agent strings.]]></description>
			<content:encoded><![CDATA[<p>Bad bots can cause problems for your website. They can submit spam to your forum or blog, spam your contact form, or just use up your valuable resources such as bandwidth and CPU. If you use Classic ASP this article will show you how to ban bad bots from your entire website using the global.asa file.<br />
<span id="more-196"></span><br />
WSI was recently asked by one of our longstanding clients to investigate why their website had started using a much greater amount of bandwidth than expected. Our client was already utilising our <a href="http://www.wsi-ebizsolutions.biz/web-analytics_18.html">web analytics</a> services, so our first port of call was their Google Analytics account so that we could investigate their web traffic.</p>
<h2>Identify the traffic source</h2>
<p>We soon identified that large numbers of page views were being generated from a <a href="http://www.wsi-ebizsolutions.biz/google-ppc_55.html">Google AdWords</a> campaign that was no longer active. Naturally this aroused our suspicion, because an inactive <a href="http://www.wsi-ebizsolutions.biz/ppc-management_48.html">pay per click (PPC)</a> campaign should not be generating any traffic at all!</p>
<p>Our next course of action was to analyse the website&#8217;s server logs in order to further identify the source of the traffic. We quickly isolated the page views as being generated by a bot. A bot is a software application that runs automated tasks over the Internet. The largest use of bots is in web spidering by search engines and the like.</p>
<h2>The bad bot</h2>
<p>This is exactly what the bot in question was doing: loading our client&#8217;s page an average of once every 10 minutes, 24 hours a day. A page load every 10 minutes isn&#8217;t a problem for server load; however, this bot was also requesting all of the page media, such as images and JavaScript files, and was therefore wasting our client&#8217;s resources.</p>
<p>We then had to choose the most appropriate way to ban the bot.</p>
<h2>Banning the bad bot</h2>
<p>Our client&#8217;s website is built in classic ASP and hosted on a Microsoft IIS/6.0 machine, so that dictated the methods that we could use to ban the nuisance bot.</p>
<h3>robots.txt</h3>
<p>We assumed from the start that the bot would not obey any exclusion set up via the robots.txt file, so we ignored that option completely.</p>
<h3>ASP code at page level</h3>
<p>It would be possible to insert ASP code into individual pages in order to stop the pages loading if they were requested by the bot in question. However, this would mean editing multiple existing pages and then monitoring the website to check whether further pages started being affected by the bot. There are more suitable ways of achieving our goal using the IIS/6.0 web server itself.</p>
<h3>global.asa</h3>
<p>IIS/6.0 allows for an optional file called <em>global.asa</em> that can include scripts that can be accessed by every page in an ASP application. If we edit our <em>global.asa</em> file to block the bot then it&#8217;ll be immediately banned from every page on the website. (Note that the ban does not cover content such as images or JavaScript files.)</p>
<h2>Identify the bad bot</h2>
<p>There are two main ways in which a bot can be identified: by its IP address and its user-agent. (A user-agent is a text string that identifies the client application making the request.) Normally it&#8217;s considered best practice to identify and ban a bot based on its IP address rather than its user-agent, because a user-agent can be easily spoofed. Bad bots often originate from one or a small number of IPs or networks and identifying these is usually the preferred method of blocking.</p>
<p>However, in this case our client&#8217;s website logs showed us that the bot came from a wide range of IP addresses on different networks, but as far as we could see it did always identify itself correctly with a legitimate user-agent string. We therefore decided that the best way to combat this particular bot would be to ban it based on its user-agent.</p>
<h2>Code to ban the bad bot using global.asa</h2>
<p>Banning the bad bot using the <em>global.asa</em> file is fairly straightforward. The <em>global.asa</em> file has 4 events: <em>Application_OnStart</em>, <em>Session_OnStart</em>, <em>Session_OnEnd</em> and <em>Application_OnEnd</em>. By adding bot blocking code to the <em>Session_OnStart</em> event it will be executed whenever a user (including the bot) starts a session. We used the following code:</p>
<div style="padding: 10px; background-color: #ffffdd; margin-bottom: 10px;">
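<pre>' Placed inside the Session_OnStart event in global.asa.
If Request.ServerVariables("HTTP_USER_AGENT") = "insert bad bot user-agent here" Then
    ' Drop the session so that the bot's next request runs this check again.
    Session.Abandon
    ' Stop script execution so that no HTML is sent to the bot.
    Response.End()
End If</pre>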
</div>
<p>What this code does is identify the bot based on its user-agent, then drop the session with Session.Abandon and stop the script execution with Response.End().</p>
<p>If we didn&#8217;t drop the session then the bot could accept the session cookie and request the page again. If it did this the page would be served normally, as the bot blocking code is only executed when a <em>new</em> session is started. By dropping the session for the bot we ensure that a new session is started every time that it requests a page.</p>
<p>Using Response.End() means that no HTML ever reaches the bot. This serves two purposes: it reduces bandwidth by not sending unnecessary HTML code; and it means that the bot does not receive the locations of other resources within the HTML response.</p>
<p>As mentioned earlier, images, JavaScript files, etc. can still be requested. However, if the bot receives no HTML from its initial request it shouldn&#8217;t know <em>where</em> to find those resources.</p>
<h2>Conclusions</h2>
<p>Initial indications show that the bot ban has worked very well, with our client&#8217;s bandwidth coming back down to normal levels. The bot banning code could be further refined and extended:</p>
<ul style="margin-bottom: 10px;">
<li>We could look for specific keywords or phrases within the user-agent string rather than checking the whole string</li>
<li>We could send a &#8220;403 Forbidden&#8221; HTTP response rather than &#8220;200 OK&#8221;, which would be the technically correct thing to do when banning a visitor</li>
<li>We could have a list of bad bots that should be banned, rather than just identifying one bot</li>
<li>We could ban bots based on both their user-agents and IP addresses</li>
</ul>
<p>Perhaps we&#8217;ll investigate these possibilities in another post!</p>
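<p>In the meantime, here is a rough, language-neutral sketch (in Python, not classic ASP) of the first and third ideas in the list above: matching keywords within the user-agent string and keeping a list of bad bots. The bot names are invented for illustration and the function name is hypothetical; in VBScript the same substring test would be written with InStr.</p>

```python
# Hypothetical keywords; replace with fragments of the user-agents seen in your logs.
BAD_BOT_KEYWORDS = ("examplebot", "badcrawler")


def is_bad_bot(user_agent, keywords=BAD_BOT_KEYWORDS):
    """Return True if any listed keyword appears anywhere in the user-agent string."""
    lowered = user_agent.lower()
    return any(keyword in lowered for keyword in keywords)
```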
]]></content:encoded>
			<wfw:commentRss>http://www.wsi-ebizsolutions.biz/blog/ban-bad-bots-global_asa-classic-asp/2009/10/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
