How to Deal With Internet Scum, Part 1: Splogs
March 27th, 2011
When it comes to your blog and creating content online, the question isn’t what to do if your content is stolen, but what to do when it’s stolen. It’s just part of being on the Internet, unfortunately. For this post, I’m focusing on splogs and what to do when one takes your content.
This week alone, a few FBFF and FFB members had their content stolen, and my hope is to help others going through the same frustrating situation.
What is a splog?
A “splog” is basically spam+blog and is usually composed of plagiarized articles from around the web. Some splogs are automated and scrape content from feeds with bots, and some are manually run, with a site owner copy/pasting whole posts as their own. Some splogs may attempt to get around things by stating that “all posts are copyright of their original authors,” but it still doesn’t make it right. The main purpose of splogs is to make money via Google AdSense or other ads.
Something to note off the bat is that a splog will sometimes take only small excerpts of your content while keeping a link back to your site. It’s rare, but if this happens, save yourself the time and pain and let that splog link back to your site. Technically they’re not violating your rights. Sure, what they’re doing may not be ethical, since they’re making money, but you win some and you lose some. Look on the bright side: it’ll diversify your link portfolio.
If the splog is a more serious offender, stealing your content outright with no link in return, please read on.
How to Spot a Splog
Splogs are usually pretty easy to spot off the bat, but here are a few tell-tale signs that what you’re looking at is a splog:
- Things just don’t make sense. Some posts are complete, but others are mix-and-match Frankenstein posts of random words and sentences from different sources.
- Things don’t add up. Take a peek at the archives: you may see an inhuman number of posts, like 6,402 in one month.
- You might see excerpts from feeds, some with backlinks, some not.
What To Do

1. Contact the splog owner.
If you find that your content is being plagiarized, you should definitely attempt to contact the site owner. Many times contact info can be found in a variety of places, so be sure to check the About page, Contact page, site footer, email forms, author profile pages, etc.
Another neat trick is to check the page’s source code. Sometimes contact info can be in the meta tags or even hidden in a link somewhere (search for “mailto”). You can view a web page’s source code in every browser, and it’s very similar for each:
- Chrome: Click on “View” –> “Developer” –> “View Source”
- Firefox: Click on “View” –> “Page Source”
- Safari: Click on “View” –> “View Source”
- Internet Explorer: Click on “View” –> “Source”
- Opera: Click on “View” –> “Source”
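If you’d rather not eyeball the source, the “search for mailto” trick can be sketched in Python. This is only a rough illustration using the standard library’s HTML parser; the sample page and the addresses in it are made up, and real pages may hide contact info in ways this won’t catch.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class MailtoFinder(HTMLParser):
    """Collects e-mail addresses from mailto: links and meta tags."""

    def __init__(self):
        super().__init__()
        self.addresses = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        if tag == "a" and href.lower().startswith("mailto:"):
            # Drop the scheme and any ?subject=... query string.
            self._add(unquote(href[len("mailto:"):].split("?")[0]))
        # Some sites list an address in a meta tag's content attribute.
        content = attrs.get("content") or ""
        if tag == "meta" and "@" in content:
            self._add(content.strip())

    def _add(self, address):
        if address and address not in self.addresses:
            self.addresses.append(address)


def find_contact_addresses(page_source):
    finder = MailtoFinder()
    finder.feed(page_source)
    return finder.addresses


# Hypothetical page source, pasted in from "View Source".
sample = """
<html><head><meta name="author" content="owner@example.com"></head>
<body><a href="mailto:hello@example.com?subject=Hi">Contact</a></body></html>
"""
print(find_contact_addresses(sample))
```

Paste the page source you copied from your browser in place of `sample` to try it on a real page.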
If all else fails, try finding their contact info via a WHOIS search; Network Solutions and DomainTools are both great. Simply type the root domain (http://evilbutthole.com rather than http://super.evilbutthole.com) into either of these searches and you’ll hopefully get back lots of info! This is a good example of what you’ll see if they haven’t blocked any info (this is Tumblr’s WHOIS info, for example):

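Picking the root domain out of a longer splog URL can be sketched in a few lines of Python. This is a naive guess that just keeps the last two parts of the host name, so it works for a .com like the example above but would get something like a .co.uk address wrong; the function name is my own.

```python
from urllib.parse import urlparse


def root_domain(url):
    """Naive guess at the root domain: the last two labels of the host."""
    # urlparse only fills in .hostname when the URL has a scheme or leading //.
    if "//" not in url:
        url = "//" + url
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    return ".".join(labels[-2:])


print(root_domain("http://super.evilbutthole.com/some-post"))  # evilbutthole.com
```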
Here’s a starter template of a note you can send to the site owner once you’ve found their contact info:
Dear [Name of Blog Owner Violating Copyright]:
This is to advise you that you are using copyrighted and protected material on your website/blog. Your illegal use of various articles including [Article Title] at [URL of blog stealing your content] is originally from my website/blog called [Website/Blog Title] at [URL of your blog post]. This is original content and I am the author and copyright holder. Use of copyright protected material without permission is illegal under copyright laws.
Please take one or more of the following actions immediately:
• Rewrite the post to include excerpts with a link to the original content.
• Credit the material specifically to me, as author, and my website (be very specific about this).
• Provide compensation for use of my copyright protected material of $[amount] USD for each article paid via PayPal. That’s $[amount] total.
• Remove the plagiarized material immediately.
I expect a response within 5 days to this issue. Thank you for your immediate action on this matter.
In addition, I am sending this report to Google via mail/fax per Google regulations to report the copyright violations. See http://www.google.com/blogger_dmca.html.
Many times there is no contact info for the site owner so you’ll need to move on…
2. Contact the scraper’s host.
So you’ve attempted to find the splogger’s contact info on their page, in their page’s source code, and through a WHOIS service, but to no avail. Many of them don’t disclose a contact email, and for good reason. What then? You can directly contact their hosting service provider. Almost all hosting services would respond positively by shutting them down.
If the splog is hosted by WordPress.com, visit WordPress.com’s DMCA page, where you can find instructions on how to report the violation. Be sure to follow their instructions to a T; they’re very specific about what to include.
If the splog is anywhere else, especially on Blogger, fill out Google’s copyright infringement form to report the incident.
If they’re using Google AdSense to profit from copyrighted content, fill out Google’s copyright infringement form for AdSense.
Be sure you’re VERY specific about the copyrighted content in question. When you report violations, you need to give exact examples with the URL, so the reports are processed. Google or the folks at WordPress.com aren’t going to sit there comparing the two sites.
3. Request a ban from search engines.
- Google’s DMCA Statement and Policy
- Yahoo!’s Policy and Instructions
- SEO Logic published this article on DMCA notifications for other search engines
4. Cease and Desist Order
Sometimes it comes down to this. A Cease and Desist Order is a “SRSLY, stop it” in legalese, and it tends to scare most people. Well, it’s supposed to, anyway. If you decide to write one yourself (as opposed to having a lawyer do it for you), there are many templates and example letters around the web, so find one that suits you and customize it to this situation.
Keep the letter looking professional, even if emailing it (whether to the site owner or host). No colorful fonts or cartoony logos! A few good places for Cease and Desist Order resources include this example letter and an interesting article on the mechanics of ethical and effective cease and desist letters.
Prevention

Post Feed Footer for Blogger Users
There’s a nifty little setting in Blogger to tag the bottom of each post in your feed with something unique, like a link back to your site or a statement like, “This is content from http://poopyloops.com, licensed under CC BY-NC-SA 3.0” or whatever. You can even paste in your copyright info or your Creative Commons badge. The reasoning behind this: If a content scraper is simply scraping everything from your feed, well, he’ll be grabbing this little gem, too.

Post Feed Footer Plugin for WordPress.org
If you’d like to add text to the footer of your feed posts and you’re not on Blogger, there is a WordPress plugin that can help you with that, too — check out ©Feed. You can configure the copyright message, use html, and even add a trusted domain to a whitelist so they don’t see the message. There are just tons of options for this plugin.
Feed cloaking
This is the same technique that the WordPress plugin above follows, only this is manual. First, you’ll need to find the IP address of the offending website. There are a few tools to do this, and I’ve found Network Solutions and DomainTools to work best.
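If you’d like to look up the IP address yourself, here is a minimal Python sketch using the standard `socket` module. One caveat, and it is an assumption on my part about how a given splog is set up: this returns the address the splog’s domain points to, which may not be the same address its bot uses when it fetches your feed, so compare it against your server logs if you can. The function name and the `resolver` parameter are my own.

```python
import socket


def lookup_ip(hostname, resolver=socket.gethostbyname):
    """Return the IPv4 address a hostname currently resolves to, or None."""
    try:
        return resolver(hostname)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when lookup fails.
        return None


# Usage (needs a network connection):
#   lookup_ip("evilbutthole.com")
```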
For WordPress.org: Once you’ve found the plagiarist, you need to do the following with your wp-rss2.php template. First, find this section of code in the template:
<?php the_category_rss() ?>
<guid isPermaLink="false"><?php the_guid(); ?></guid>
Then, add this after, replacing the X’s with the IP address of the scraper, the Y’s with a fake description and the Z’s with fake content:
<?php if ($_SERVER['REMOTE_ADDR'] == "XXX.XXX.X.X") : ?>
<description><![CDATA[YYYYY]]></description>
<content:encoded><![CDATA[ZZZZZ]]></content:encoded>
<?php else : ?>
Lastly, nine lines down, add the closing tag for the “if” statement:
<?php endif; ?>
If done correctly, this should feed the plagiarist with fake content! Sweet! As for the content? Put in whatever you want. A suggestion would be something along the lines of, “If you are reading this, the site you are at is a scraper and is attempting to use my content illegally.” Personally, I’d put in a picture of goatse, but that’s just me. (If you don’t know what the goatse meme is, well… don’t look it up. It’s really NSFW.) To test your new feed cloak out, you can put your own IP address into the field and visit your feed to see if the fake content shows up.
For non-WordPress.org users with access to their .htaccess file: You follow the same steps with finding the offender’s IP address, but place this snippet of code into your .htaccess file:
RewriteEngine on
# Keep the backslashes: an unescaped dot matches any character.
RewriteCond %{REMOTE_ADDR} ^XXX\.XXX\.XXX\.XXX$
RewriteRule ^(.*)$ http://newfeedurl.com/feed [R=302,L]
Again, replace the X’s with their IP address and http://newfeedurl.com/feed with the fake feed or content.
Limitations of Feed Cloaking
- Cannot be used, yet, by sites like Blogger, WordPress.com, or other free blogs not hosted by the owner.
- You should be comfortable with accessing and editing things in PHP or .htaccess files.
- There’s nothing stopping a scraper from moving to a new IP address or domain.
- Doesn’t work with Feedburner as far as I know. If you use Feedburner like I do, check on your “Uncommon Uses” section. Google does a pretty good job of finding different places where your feed is being used, and a lot of the time they’re not splogs. My “Uncommon Uses” section is below. You can see that flavors.me and my IFB profile fell under these, but all I need to do now is check the box marked as “known” so they don’t show up here again.

.htaccess Ban
Instead of trying to fool a scraper by cloaking your feed, you can straight up block the scraper’s IP address altogether from viewing your site. Keep in mind, again, that you can only do this if you have access to your .htaccess file, usually if you’re self-hosted. Simply add these lines to the file, replacing the X’s with the IP address:
order allow,deny
deny from XXX.XXX.X.X
allow from all
If you’d like to ban multiple IP addresses, simply add multiple IP addresses like so:
order allow,deny
deny from XXX.XXX.X.X
deny from XXX.XXX.X.X
deny from XXX.XXX.X.X
allow from all
The only con to doing this is that it has an umbrella effect. It could affect other potentially good blogs on shared hosting, like HostGator, where several hundred blogs can sit on one IP address.
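If your ban list starts to grow, a small Python sketch like this one could build the block for you and catch typos in the addresses before they land in your .htaccess file. It’s only an illustration: the function name is my own, the addresses below are placeholder examples, and it writes the same older-style “order/deny/allow” directives shown above.

```python
import ipaddress


def build_deny_block(addresses):
    """Return .htaccess lines that block each valid IP address."""
    lines = ["order allow,deny"]
    for raw in addresses:
        # ip_address() raises ValueError on anything that is not a valid IP.
        ip = ipaddress.ip_address(raw.strip())
        lines.append(f"deny from {ip}")
    lines.append("allow from all")
    return "\n".join(lines)


# Placeholder addresses; swap in the scrapers' real ones.
print(build_deny_block(["203.0.113.7", "198.51.100.24"]))
```

A mistyped address such as 999.1.1.1 raises an error instead of quietly ending up in the file.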
Tips & Tools

- Use Google Alerts. This service is amazing. It’s not only useful to keep track of mentions of your brand or blog elsewhere, but it comes in handy for catching plagiarists! Set up alerts for your blog name and personal name. You can have the service email you as soon as it spots a mention of either.
- Place some kind of copyright notice on your blog somewhere; the footer is usually a good spot.
- Visit Creative Commons to create your own Creative Commons license to place on your blog; they have a great wizard to help you choose a license that’s right for you. I’m currently using the Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) license, which basically means: People are free to share (copy, distribute and transmit the work) and remix (adapt the work), AS LONG AS it’s attributed according to my wishes, used for noncommercial purposes, and any remix is shared under the same license.
- Check out these great articles: 10 Big Myths About Copyright Explained and this Introduction to Copyrights.
- Chilling Effects also has a great FAQ on copyright.
- KEEP THE PAPERTRAIL! Well, e-papertrail I guess. Save any email or other contact you make with the site owner and host if you’re dealing with stolen content.
- Use Copyscape to search the web for any plagiarized content.
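Once you’ve found a suspicious page, you can get a rough idea of how much of your post it lifted with a sketch like the one below. To be clear, this is not how Copyscape works (I don’t know its internals); it’s just a simple comparison of overlapping five-word runs, and the function names and sample text are my own.

```python
import re


def shingles(text, size=5):
    """Return the set of consecutive word runs ("shingles") in a text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect."""
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(suspect, size)) / len(source)


mine = "A splog is basically spam plus blog and is usually composed of plagiarized articles"
theirs = "Welcome! A splog is basically spam plus blog and is usually composed of plagiarized articles. Buy now!"
print(copied_fraction(mine, theirs))  # 1.0
```

A result near 1.0 means nearly all of your post shows up word for word; a result near 0 means little or none of it does. Paste in plain text rather than HTML for a fair comparison.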
