Drupal's impact on the environment?

This evening, while solving a duplicate content issue on a WordPress site, I stopped to think about the impact duplicate content could have within Drupal, especially with everyone talking about carbon footprints and the recent Times Online article discussing Google's carbon footprint from just two searches.

When I first wrote about duplicate content issues within Drupal, it brought to my attention just how easy it is to fumble your way into lots of duplicate content on a site. As many more experienced users pointed out then, not using the blog module and using story content instead would have eliminated the duplicate content issues entirely.

I’ve been told in the past by some that they do not worry about duplicate content, as the search engines will sort it out and display what they feel is the most relevant content. While that may be true, what does the future hold for us?

As all the major search engines try to reduce their carbon footprint, what does that mean for site owners with hundreds or thousands of duplicate content pages? Will they determine a ratio and, once you cross that threshold, stop indexing your site altogether?

And if you link to a site they have determined has too much duplicate content, will that be considered a bad neighborhood in their eyes? What made me think of Drupal in this respect was when I finally re-enabled the breadcrumbs.

While the taxonomy shown in the user's view was correct, the underlying links were now pointing to category/1/2, once again creating what appears to be duplicate content, which the robots pick up, retrieve, and index!
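
To make the overlap concrete, here is a rough sketch with made-up paths and IDs: the same set of posts ends up reachable both through the navigation path and through the taxonomy path the breadcrumb was generating, and a crawler treats each one as a separate page.

    http://example.com/blog/1          <- the blog listing the navigation links to
    http://example.com/category/1/2    <- the taxonomy listing the breadcrumb linked to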

Custom Breadcrumbs to the rescue

I downloaded and installed the Custom Breadcrumbs module, enabled it, and after a lot of reading finally got the output I was looking for: the breadcrumbs now matched the navigation links.

For both the blog and story content types, these were the settings that gave me the desired results:

[Screenshots: Custom Breadcrumbs settings (custom-breadcrumb-1.jpg, custom-breadcrumb-2.jpg, custom-breadcrumb-3.jpg)]
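
In rough outline (the exact titles and paths below are illustrative, not copied from the screenshots), Custom Breadcrumbs takes one breadcrumb title per line and a matching path per line for each content type, so the crumb trail mirrors the navigation:

    Content type: Blog entry
      Titles (one per line):
        Blogs
      Paths (one per line):
        blog

    Content type: Story
      Titles (one per line):
        Articles
      Paths (one per line):
        articles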

Now the only remaining issue is that when you come in through the blogs link, the breadcrumb links back to the main blog listing instead of the single user's blog. Of course, you could always create a pseudonym to post with!

Still, whether you believe in global warming or not, many do and will pursue it relentlessly. What part can we play in educating people who use a CMS like Drupal to reduce their carbon footprint by keeping everything nice, clean, and duplicate free?

After all, it is not only the various robots crawling the duplicate content; the site's server has to deliver that content as well. I would appreciate your thoughts on this subject.

Comments on this entry are closed.

  • drawk Jan 15, 2009 @ 8:20

    FWIW, regarding the Times article about Google’s carbon footprint:

    How The Times Got Confused About Google and The Tea Kettle

    Most notably, Alex Wissner-Gross (the physicist to whom the Times attributed the claim) clarifies his position.

    • Glenn Jan 15, 2009 @ 8:38

      Data can always be manipulated; thanks for bringing information that will clarify it.

       

      Glenn

  • dalin Jan 15, 2009 @ 21:14

    You can also use robots.txt to block the crawlers from indexing your taxonomy pages.
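
    For example, something along these lines (adjust the paths to match whatever your taxonomy/category URLs actually look like; the ?q= variant covers non-clean URLs):

        User-agent: *
        Disallow: /taxonomy/term/
        Disallow: /category/
        Disallow: /?q=taxonomy/term/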

    • Glenn Jan 16, 2009 @ 20:17

      In an ideal world that would be true, yet the many different robots do not always obey what is found in the robots.txt file. When I'm really bored, which fortunately is not too often, I will watch the live traffic on my server.

      They still crawl the pages they should not; they just don't index them. That still puts an extra workload on the server to serve up those pages.