Information Architecture, Faceted Navigation & Duplicate Content (Oh My!)

Posted by Hannah Smith

Hello there. You look lovely. I’m Hannah and I’m an SEO Consultant for Distilled. I’m British, which means I spell things strangely sometimes; we like to make things more complicated than they really need to be here. This is my first post for SEOmoz and I hope you find it useful.


Whenever I kick off a new project with a client, they are typically very interested in how I might be able to get them some lovely links. They’re also pretty keen for me to create them some lovely shiny content. Sadly, most aren’t too interested in information architecture. Many don’t realise how important it is.
 
To be honest, up until fairly recently I was one of those people. Most of the sites which I had worked on previously were in the insurance niche. Now typically these sorts of sites don’t really have duplicate content issues. Likewise I had never encountered any problems with indexation. I secretly wondered what those other SEOs were whining about (bunch of big girl’s blouses).
 
But then… A rude awakening.
 
I’ll not name names (that’s just not nice) but I had a client who were part-way through a brand new site build. I figured the technical part of the project would be pretty straightforward; after all, when someone’s building a brand new site they’re bound to have given some serious thought to information architecture, right? …Right? …Bueller? …Bueller? …Anyone?
 
Sadly not. The proposed architecture was riddled with so many issues it made my head spin. They would either have a lot of duplicate content or perhaps little or no content – it wasn’t quite clear which (and neither scenario made me jump for joy). They were likely to struggle with indexing. There were gaps you could drive a bus through in their landing page strategy. Their site was going to be a big old mess.
 
 
There was much lamenting, wailing, tearing of hair and gnashing of teeth… Then I calmed down.
 
What follows is a collection of the challenges I faced and how I dealt with them, plus definitions and explanations which I found useful when trying to fix these issues… Hopefully it’ll save you some pain. Once more unto the breach, dear friends…
 
The Challenge… No one cares but me
Yep, I came up against a whole heap of resistance when trying to fix these issues. No one really understood or cared about the situation. There was a lot of talk about how important the customer journey was; there was a lot of talk about brand experience – but SEO? Hmmm, well it wasn’t really getting much of a look in. The CMS being used for the build was apparently ‘SEO-friendly’ and there would be a sitemap, so the general consensus seemed to be that we were ‘all good’ for SEO thanks.
 
The Counter-Challenge – Education & Myth Busting
In my experience if you want to facilitate change, you’ll need to be prepared to do some serious ‘selling in’ of your ideas. But, the first step is to help people understand what the issues are, and as such, education is key. So, why should people care about information architecture?
 
Here’s what I went with…
Information architecture (or how the information on the site is organised) is important from a search perspective in two key ways:
  1. It enables the search engines to index all of the pages on the site
  2. It provides suitable landing pages for all of the keywords (or search phrases) that you might wish to rank for 
Without sound information architecture your site may not get indexed properly, and if a site isn’t indexed, then clearly you’ll have no chance whatsoever of ranking. Likewise, without suitable pages to rank for your selected key phrases, again, you’ll struggle to rank for those keywords.
 
From an SEO perspective we’re also seeking to ensure that we’re not creating duplicate content (i.e. the same content available via more than one URL) – as ultimately this causes issues with ranking as you have more than one page from your site competing for the same search result.
 
Finally, as links equal strength when it comes to SEO we’re also looking to ensure that we have strong internal linking within the site in order to maximise the strength of our most important pages (i.e. the pages which we really want to rank). Of course, external links will play a major part here, but ensuring we’re passing internal ‘link juice’ is also important.
 
I also had to do a little myth busting. The most pervasive myth was the mythical power of the sitemap. There was a strong belief that the sitemap would cure all ills: provided it included all the pages they wanted to get indexed, they’d duly get indexed and everything would be golden. I’m sure I don’t need to tell you that this isn’t the case. Sure, sitemaps are helpful, but they aren’t a cure-all and I certainly wouldn’t recommend that anyone rely on a sitemap to get their content indexed. More importantly, even if the sitemap assists with indexation, there was still the issue of providing suitable landing pages for all of the keywords which they wanted to rank for.
 
Key Takeaways
  1. If the search engines can’t index your content you will not rank.
  2. If you don’t have a page for each keyword (or at least each sub-set of keywords – you can of course target more than one keyword per page), again you’ll struggle to rank.
  3. A lack of rankings means a lack of traffic. A lack of traffic will likely mean a lack of revenue.
  4. A sitemap will not fix this. 
So, by this point they were finally pretty much onboard with why this was important. Yay! Time to sell in the solution (cue fanfare) – Faceted Navigation!
 
…Wait, what? What is that?
 
Faceted Navigation
A faceted navigation allows users to select and de-select various facets in order to search / browse for what they are looking for. As such, it allows visitors to utilise multiple navigational paths to reach their desired end goal.
 
Whilst that’s a fairly useful definition it’s probably easier to understand via an illustrated example: 
 
Let’s imagine that you’re shopping for a t-shirt. You might want to browse t-shirts by size (i.e. only those in your size), by colour, by designer, by price etc. To find the t-shirt you want it would be really handy if the website you were browsing allowed you to narrow down your search using some or all of those facets. It might look a little something like this:
 

 

Now I think this is pretty darn lovely from a user’s perspective. Additionally, the flexibility this sort of structure gives you helps you to solve the ‘page for each keyword / sub-set of keywords you want to target’ issue. Whilst it may look fairly simple on paper there are quite a few things to think about when tackling this. Here are some of the things I came up against, and how I dealt with them…
 

1. How many facets do you need in order to get everything indexed?

Ideally your deepest facet should contain no more than 100 products. This will assist you greatly in getting all of your products indexed. (NB whilst most SEOs are comfortable that the search engines will crawl more than 100 links on any given page, I prefer to stick with 100 product links as most websites will have a number of navigation links on every page in any case. Sticking to a maximum of 100 product links will help keep the total number of links on any given page at a sensible level).
 
By ‘deepest’ I mean however many folders down you decide to go. Let’s stick with hannahstshirts.com as an example – here you may decide to use the following facets:
  •  Womens
  •  T-Shirt Type
  •  Designer
An example deep facet page: hannahstshirts.com/womens/v-neck/a-wear/ – on this page, visitors would see all women’s v neck t-shirts from A Wear.
Now this type of page should have no more than 100 products on it, so provided that none of your designers offer more than 100 of a particular style of t-shirt then this is as deep as you need to go. If this isn’t the case you’ll need to add in another facet – e.g. colour.
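One way to sanity-check facet depth before the build is to count products per deepest facet page. What follows is only a minimal sketch: the product records, the field names (`gender`, `style`, `designer`, `colour`) and the helper `oversized_facet_pages` are all made up for illustration and don’t come from any particular CMS.

```python
from collections import Counter

MAX_PRODUCTS_PER_PAGE = 100  # the limit suggested above


def oversized_facet_pages(products, facets, limit=MAX_PRODUCTS_PER_PAGE):
    """Return {facet path: product count} for deepest-facet pages over the limit.

    `products` is a list of dicts and `facets` is the ordered list of dict
    keys used as facets (both are assumptions made for this example).
    """
    counts = Counter(tuple(p[f] for f in facets) for p in products)
    return {"/".join(path): n for path, n in counts.items() if n > limit}


# Hypothetical catalogue: 120 women's v-necks from A Wear, 40 from Bench.
products = (
    [{"gender": "womens", "style": "v-neck", "designer": "a-wear", "colour": "white"}] * 70
    + [{"gender": "womens", "style": "v-neck", "designer": "a-wear", "colour": "black"}] * 50
    + [{"gender": "womens", "style": "v-neck", "designer": "bench", "colour": "white"}] * 40
)

# Three facets: one page is too big.
print(oversized_facet_pages(products, ["gender", "style", "designer"]))

# Adding colour as a fourth facet brings every page under the limit.
print(oversized_facet_pages(products, ["gender", "style", "designer", "colour"]))
```

With this sample data the three-facet structure flags womens/v-neck/a-wear at 120 products, and adding colour as a fourth facet clears it.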

2. Facets versus filters

There will probably be further search / browse options which you want to offer visitors to your site that you don’t really care about from a search perspective. For example – it’s really useful for visitors to be able to browse only items which are available in their size; but you may decide that you’re not particularly worried about the search engines indexing these pages. That’s where filters come in. These filters should be implemented using JavaScript or no-indexed to prevent these pages from getting indexed.
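If you go the no-index route, the logic can be as simple as the sketch below. It assumes (and this is purely an assumption for the example) that filters are passed as query-string parameters such as `?size=12`, whilst facets live in the URL path. The parameter name `size` and the helper `robots_meta_for` are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical filter parameters that should not produce indexable pages.
FILTER_PARAMS = {"size"}


def robots_meta_for(url, filter_params=FILTER_PARAMS):
    """Return a robots meta tag for the page at `url`.

    Pages reached through a filter get "noindex, follow", so the page stays
    out of the index whilst its links can still be crawled. Everything else
    is left indexable.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if any(name in query for name in filter_params):
        return '<meta name="robots" content="noindex, follow">'
    return '<meta name="robots" content="index, follow">'


print(robots_meta_for("https://www.hannahstshirts.com/womens/v-neck/a-wear/?size=12"))
print(robots_meta_for("https://www.hannahstshirts.com/womens/v-neck/a-wear/"))
```

The first URL is a filtered view and gets the noindex tag; the second is a plain facet page and stays indexable.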
 

3. Do you have pages to enable you to rank for all of the keywords that are important to you?

This is really linked to the previous two points. Again using the example above – if your facets were Womens, T-Shirt Type and Designer; but you had a burning desire to rank for the term ‘white women’s t-shirts’ – then bad news, friend. As colour is a filter rather than a facet you don’t have an indexable page for that phrase. If you want to rank for these sorts of keywords you’ll need to make colour a facet, not a filter. 
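You can run this check on paper, or as a rough sketch like the one below. It assumes any combination of facets produces an indexable page, and that you have already worked out which attributes each target keyword needs. The keyword list, the attribute names and the helper `uncovered_keywords` are invented for the example.

```python
# Attributes implemented as facets, i.e. those that produce indexable pages
# (hypothetical, mirroring the example above).
FACETS = {"gender", "style", "designer"}

# Each target keyword mapped to the attributes its landing page would need.
TARGET_KEYWORDS = {
    "women's v-neck t-shirts": {"gender", "style"},
    "a wear women's t-shirts": {"gender", "designer"},
    "white women's t-shirts": {"gender", "colour"},
}


def uncovered_keywords(keywords, facets):
    """Return {keyword: missing attributes} for keywords with no indexable page."""
    return {
        keyword: sorted(needed - facets)
        for keyword, needed in keywords.items()
        if not needed <= facets
    }


# Colour is only a filter here, so the 'white' keyword has nowhere to land.
print(uncovered_keywords(TARGET_KEYWORDS, FACETS))

# Promote colour to a facet and every keyword has a page.
print(uncovered_keywords(TARGET_KEYWORDS, FACETS | {"colour"}))
```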
 

4. Pagination

At the top level (e.g. ‘Womens’) you’ll return a number of pages of results. Now really you don’t want these pages indexed. Page 2 onwards of a given set of results is rarely an awesome result for a user; plus of course you’ll effectively be having more than one indexed page competing for the same keyword in the SERPs. It’s bad all round. Therefore use Ajax or JavaScript to display page two and onwards.
 

5. Sorting

Likewise, you may decide to offer sorting options – e.g. sort by price, sort by rating etc. These are great for users, but a potential duplicate content love fest for search. You don’t want the various sorted versions of the same page being indexed separately, so use JavaScript or Ajax.
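If you want to check whether any sorted or paginated variants have slipped through as separate URLs, a crawl export can be audited with something like the sketch below. To be clear, this is an audit aid and not the fix itself (the fix is the Ajax / JavaScript approach above). The parameter names `sort` and `page` and both helper functions are assumptions for the example.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names for sorting and pagination.
PRESENTATION_PARAMS = {"sort", "page"}


def base_url(url, ignored=PRESENTATION_PARAMS):
    """Strip sort/page parameters so variants of one listing compare equal."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in ignored]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def duplicate_groups(urls):
    """Group crawled URLs that differ only by sort/page parameters."""
    groups = defaultdict(list)
    for url in urls:
        groups[base_url(url)].append(url)
    return {base: found for base, found in groups.items() if len(found) > 1}


# Hypothetical crawl export.
crawled = [
    "https://www.hannahstshirts.com/womens/",
    "https://www.hannahstshirts.com/womens/?sort=price",
    "https://www.hannahstshirts.com/womens/?page=2",
    "https://www.hannahstshirts.com/womens/v-neck/",
]

for base, variants in duplicate_groups(crawled).items():
    print(base, "has", len(variants), "crawlable variants")
```

Here the three /womens/ URLs are grouped together as variants of one listing, whilst /womens/v-neck/ is left alone.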
 

6. Duplicate Content

Ok, so we’ve dealt with pagination and sorting options but we’ve still got duplicate content issues? Why?
Because there are multiple navigational paths a user can take, if you’re not careful there will be duplicate URLs for the same content. For example, if you wanted to see all of the women’s v-neck t-shirts by Bench you could go via:
 
www.hannahstshirts.com/womens/v-neck/bench
www.hannahstshirts.com/womens/bench/v-neck
 
Plus, depending on your site structure you might also be able to go via:
www.hannahstshirts.com/bench/womens/v-neck
www.hannahstshirts.com/bench/v-neck/womens
www.hannahstshirts.com/v-neck/bench/womens
www.hannahstshirts.com/v-neck/womens/bench
 
Uh oh. Imagine how many permutations of this you’ll have across the site. Bad times. You’ll need to make sure that no matter which route a user takes to reach a particular page, there is only one indexable URL. Now hopefully, you’ll either be custom building something awesome, or be using a CMS which will allow you to do this. If not? You’ll have to 301 all the variants back to one indexable URL.
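Here’s a rough sketch of the ‘one indexable URL’ idea: pick a fixed facet order, rebuild every incoming path in that order, and 301 anything that doesn’t already match. The facet vocabulary, the chosen order and the helpers `canonical_path` and `redirect_for` are assumptions for illustration; a real site would pull the vocabulary from its catalogue and wire the redirect into whatever framework or server it uses.

```python
from itertools import permutations

# Hypothetical facet vocabulary; in practice this would come from the catalogue.
FACET_VALUES = {
    "gender": {"womens", "mens"},
    "style": {"v-neck", "crew-neck"},
    "designer": {"bench", "a-wear"},
}
# The single order in which facets may appear in an indexable URL.
CANONICAL_ORDER = ["gender", "style", "designer"]


def canonical_path(path):
    """Reorder the facet segments of `path` into the canonical order.

    Raises ValueError for an unknown segment or a facet used twice.
    The result always ends with a trailing slash.
    """
    chosen = {}
    for segment in path.strip("/").split("/"):
        facet = next((f for f, vals in FACET_VALUES.items() if segment in vals), None)
        if facet is None or facet in chosen:
            raise ValueError(f"unexpected segment: {segment!r}")
        chosen[facet] = segment
    return "/" + "/".join(chosen[f] for f in CANONICAL_ORDER if f in chosen) + "/"


def redirect_for(path):
    """Return (301, target) if `path` is a non-canonical variant, else None."""
    target = canonical_path(path)
    return None if path == target else (301, target)


# All six orderings from the example above collapse to one URL.
variants = ["/" + "/".join(p) for p in permutations(["womens", "v-neck", "bench"])]
print({canonical_path(v) for v in variants})

print(redirect_for("/bench/womens/v-neck"))   # non-canonical order
print(redirect_for("/womens/v-neck/bench/"))  # already canonical
```

A side effect of comparing against the rebuilt path is that a missing trailing slash is also treated as a variant and redirected, which keeps that form of duplication in check too.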
 
Right, we’re nearly there, I promise. If you’re still reading then you definitely deserve a cookie. Possibly two.
 
Content’s Still King (well, nearly)
So, let’s imagine that you’ve finally got there. You’ve got a lovely looking faceted navigation. You’ve got all of the keyword targeted pages you need. You’ve defeated the duplicate content demons. You are made of win.
 
Don’t stumble at the final hurdle. Despite your best intentions, you still have a site with a lot of pages which look quite similar. Lists of products which are available on a variety of other pages. Doesn’t feel all that unique, huh? You’ll need to create some unique content for each of these pages, and the more important the page is to you, the more awesome this content needs to be.
 
Key Takeaways
  1. Use as many facets as you need to ensure that your deepest faceted pages contain 100 products or fewer AND to ensure you have all the pages you need to target the keywords you want to rank for.
  2. Pagination and sorting options can cause duplicate content – use Ajax / JavaScript to avoid this.
  3. No matter which route a user takes to reach a particular page there can be only one (think Highlander) indexable URL.
  4. Remember to create unique content for each page – the more important the page, the more awesome the content.
More Helpful Stuff…
If you’re wrestling with faceted navigation right now, you might find our handy cheat sheet useful. It was distributed after the Pro SEO conference in October, and you can download the PDF here.
 
Plus, you might also like to check out Rand’s Whiteboard Friday on Faceted Navigation.
 
 
