Ecommerce SEO

CHAPTER 1:

Introduction

Length: 1,847 words

Estimated reading time: 10 minutes

This ecommerce SEO guide has almost 400 pages of advanced, actionable insights into on-page SEO for ecommerce. This is the first of 8 sections.

Written by an ecommerce SEO consultant with 20 years of research and practical experience, this comprehensive SEO resource will teach you how to identify and address all of the SEO issues specific to ecommerce websites, in one place.

The strategies and tactics described in this guide have been successfully implemented on top 10 online retailers, small & medium businesses, and mom and pop stores.

Please share and link to this guide if you liked it.

About this SEO course

Hello everybody, and welcome to my ecommerce SEO course. My name is Traian, and I will be your host for the next few hours.

If you are involved with ecommerce in one way or another, or if your work touches even a bit on SEO, the course you are about to embark on is full of actionable SEO tactics to help you optimize your website the right way.

This guide is based on the contents of my book – “Ecommerce SEO”. Those of you who have already read the book might remember I mentioned that it would evolve into an easier-to-update medium – this online guide is now that medium. I would also like to thank everyone who purchased the book and left these five-star reviews:

Before proceeding further, I would like to take a few moments to thank several industry leaders who supported me with this initiative. I am humbled to have received reviews from them:

I would also like to thank everyone else in the SEO industry who supported me but are not listed here. Thanks a lot to all of you who left raving reviews on social media as well. It is much appreciated.

Please allow me to introduce myself.

I started playing with SEO in 1998 with a small website that I built to make some cash. I approached the business model with the “build it, and they will come” expectation, just like everyone else back in the day. You guessed right; it did not happen! Searchers could not find my website because it did not rank on the first page of Yahoo!, and back in ’98, it was all about Yahoo, not Google. So, I started learning how to attract organic traffic. I read all the SEO books I could get my hands on; I read forums, articles, and newsletters, and then I tested, tested and tested again. I finally made it to the top five in a few months, and then I discovered my passion: SEO. A few years later, I became attracted to more complex SEO issues specific to ecommerce websites.

Fast forward to today, I am still passionate about SEO and ecommerce. Meanwhile, adding knowledge from different fields helps me approach SEO with consideration for information architecture, usability, site performance, and user experience.

In the past, I worked with companies large and small, ranging from spas around the corner to multi-million ecommerce websites. I helped others like you succeed in extremely competitive niches such as jewelry, hotel bookings, and online pharma. I am hoping that by creating this course, I will be able to help many more of you than I could by offering one-to-one consulting to just a few businesses.

Feel free to get in touch with me either by contacting me on the website or by connecting using LinkedIn and Twitter.

This ecommerce SEO guide evolved from the desire to offer those involved in ecommerce access to SEO advice in a single place. The Internet contains a large amount of information on this subject matter, and the online SEO community is amazing. However, the SEO resources that ecommerce professionals need are widely scattered.

So, I decided to put everything I have researched, learned and practiced about SEO into a single resource: this course.

We will start with the foundation, which is the architecture of the website, and then we will continue with keyword research, which is essential for determining your content strategy. Next, we will learn how to guide crawlers and how to avoid search engine bots’ traps, and then we will explore using internal linking to improve relevance and create strong topical themes.

We will continue by deconstructing the most important pages for ecommerce websites—the home page, listing pages, and product detail pages—each in separate sections.

For you to get the most out of this course it is better to go through the lectures in the order they are listed, without skipping lessons. That is because I will often be making references to concepts and tactics described in previous lectures.

Who should take this course?

If you are a small or medium business owner who runs an ecommerce website, then this course is for you. You have probably realized by now that running an ecommerce business requires many skills. Depending on your educational background, you are either putting a lot of time and work into learning various disciplines such as programming, design, usability and copywriting, or you are contracting qualified help.

This course will help you realize how complex SEO is, and it should help you set realistic expectations. More importantly, don’t expect organic traffic to be a silver bullet. Business-wise it is a good idea to diversify your acquisition channels to email, social, referral and more while working your way up in organic results.

If you are an ecommerce executive, take this course to understand how almost any decision you make regarding the website will affect its search visibility. The course will show you what needs to be done to have an SEO-friendly ecommerce website, but it is up to you to prioritize based on your current situation and objectives. This course will also help you have more educated conversations with your search engine optimizer(s).

Even if you work in a medium-sized business, you may realize that you do not have all the expertise or resources in-house, so you will have to hire outside talent. This course should help you understand what to look for when hiring that talent. As an executive, your time is probably at a premium, so if you do not feel like learning SEO stuff, at least let your web dev, marketing or production department know about this course.

If you are a search engine optimizer, I hope you will find this course helpful not only because it presents most of the SEO issues encountered by ecommerce websites in a single resource, but also because it provides advice and options for addressing those issues. Let your manager know about this course. Going through this course will help them understand that ecommerce SEO cannot be addressed overnight. Also, ecommerce SEO does not have strict recipes for success, because SEO is part tech, part marketing, and part art.

This course is also very valuable for web developers involved with ecommerce. The course discusses on-page SEO issues and proposes solutions. However, it does not detail how to write the code to address the problems. While working on an issue, you, as a developer, should decide which approach is best, given your particular technical setup.

For example, sometimes a 301 redirect is not possible, whereas a rel="canonical" is. While I may recommend one approach over another, you will have to decide whether it is possible to implement the recommended method.

What type of websites is this course for?

This course is for websites that face complex issues such as faceted navigation, sorting, pagination or crawl traps, to name just a few. However, keep in mind that a website’s complexity is not directly tied to how big a business is in terms of revenue. Start-ups, SMBs and enterprise websites can be complex no matter their revenue. This course is therefore just as useful for large websites (e.g., sites with tens of thousands of items), as it is for small and medium websites (e.g., with tens to thousands of items).

Ecommerce extends across a multitude of segments, such as travel, where you can sell air tickets, railway tickets, hotel bookings, tour packages, etc. It also extends to retail, financial services, digital goods and services, consumer packaged goods (CPG), and many others. While the vast majority of the examples presented in the course are for retail and CPG, the SEO principles discussed here also apply to all other ecommerce segments. These principles also apply to non-ecommerce websites with complex structure and navigation.

Throughout this course, I will use the terms item and product to refer to a physical good, but an item can also be a digital product, such as a game or a song. The term item will have a different meaning depending on each business. For an online hotel reservation website, the item will be a hotel, and it will be presented on the hotel description page; for a paid content publisher the item may be a journal; for a real estate listings website, the item will be a real estate property, and so on. Also, I will refer to item and product interchangeably.

On-page SEO issues only

This course addresses on-page SEO issues only. Link development is a big part of the SEO equation and requires a course of its own. However, while links have been the main target of SEOs for a very long time, you should optimize your website by putting people and content first.

Level of expertise

This course contains intermediate to advanced SEO tactics, but it’s a good start for newbies as well. If you are one of them, this course should set you on the right SEO mindset.

During the course, you may find that I give particular advice or opinion about a topic, without getting into detail. That may be because that topic is discussed in detail in a referenced work. If you want to know more about those topics, or if you are a total newbie to the SEO field, check out those resources.

I have been asked very often, what’s the best piece of SEO advice I can give to those who do SEO for ecommerce. Here it is:

Optimize for users, without chasing the algorithm. Your ultimate goal is the long click, AKA fulfilling or terminating the query. We will discuss the “long click”, an internal metric used by Google, later in the course. In a nutshell, it means that a searcher Googles something, finds your website at the top, and, after landing on it, finds whatever they are looking for without needing to go back to the search results.

Before ending this intro, I would like to tell you that you can make it on the first page of Google, Bing or any other big search engine, given that you have the necessary knowledge, you keep up to date with how algorithm changes affect ecommerce websites, and you work hard to achieve it.

I would love to hear your success stories, so don’t hesitate to get in touch with me.

However, for now, bring some paper and a pencil, and let’s start the course.

See you at the top!

CHAPTER 2

Website Architecture

Length: 10,291 words

Estimated reading time: 1 hour, 10 minutes

In this chapter, we will explore the concepts behind building optimized ecommerce website architectures.

Having a great site architecture means making products and categories findable on your website, in a way that users and search engines can reach them as efficiently as possible.

There are two concepts you should be aware of regarding site architecture:

  • Efficient crawling and indexing. This refers to the technical architecture or TA.
  • Classifying, labeling and organizing content. This refers to information architecture or IA.

Together, information and technical architecture form the site architecture (SA). A good understanding of these two concepts will help you build websites that are both search-engine-friendly and user-friendly.

It is important to differentiate between information and technical architecture:

  • Information architecture is the process of classifying and organizing content on a website while providing user-friendly access to that content, via navigation. This process is done (or should be done) by information architects.
  • Technical architecture is the process of designing the technical and functional aspects of a site. This is mostly done by web developers.

Keep in mind that SEO involves both information and technical architecture knowledge.

Information architecture

The Information Architecture Institute’s definition of IA is:

  • The structural design of shared information environments.
  • The art and science of organizing and labeling websites, intranets, online communities and software to support usability and findability.
  • An emerging community of practice focused on bringing principles of design and architecture to the digital landscape.

This definition shows that information architecture goes beyond websites, and it hints at its complexity. It also reveals how flexible and theoretical information architecture is.

From an ecommerce standpoint, let’s oversimplify the definition of information architecture to this single sentence:

The classification and organization of content and online inventory.

You should be familiar with two other important information architecture concepts: taxonomy and ontology. While these names might be intimidating, the concepts are easy to understand.

Taxonomy is the classification of topics into a hierarchical structure. For ecommerce, this translates into assigning items to one or more categories. Ecommerce taxonomies are usually vertical, “tree-like” structures. A website’s taxonomy is often referred to as its hierarchy. To visualize a taxonomy, think of breadcrumbs.

Notice how the breadcrumbs above mimic the website taxonomy. In our examples, one branch of the taxonomy “tree” leads to Duvet & Comforter Covers, and the other to Aloe Vera Gels.

The structures depicted in these two screencaps are ordered using a parent-child relationship, from broader to narrower topics, and they are called taxonomies. One way to create ecommerce taxonomies is to use a controlled vocabulary, which is a restricted list of terms, names, labels, and categories. Usually, it is the information architects who develop these vocabularies.

In terms of SEO, you should use semantic markup to help search engines understand taxonomies. One such markup can be applied to your site breadcrumbs.

Search engines use Microdata or RDFa markup to generate breadcrumb rich snippets similar to this one:

Search engines can sometimes display the website taxonomy directly in the search engine results pages (aka SERPs).

We will discuss breadcrumbs in detail later in this book, but briefly, this is what the source code for the previous rich snippet example looks like:

Figure 3 – The highlighted text shows the Breadcrumb vocabulary markup.
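To make the idea of breadcrumb markup more concrete, here is a minimal Python sketch that builds schema.org BreadcrumbList data for a breadcrumb trail. Note that it outputs JSON-LD, which is a different syntax from Microdata and RDFa, although it relies on the same schema.org vocabulary. The example.com URLs and the intermediate Bedding category are made up for illustration; only the Duvet & Comforter Covers label comes from the example above.

```python
import json


def breadcrumb_jsonld(trail):
    """Build a schema.org BreadcrumbList from (name, url) pairs.

    `trail` is ordered from the broadest page (home) to the narrowest one.
    """
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Hypothetical breadcrumb trail; the URLs and the "Bedding" level are assumptions.
trail = [
    ("Home", "https://www.example.com/"),
    ("Bedding", "https://www.example.com/bedding/"),
    ("Duvet & Comforter Covers", "https://www.example.com/bedding/duvet-comforter-covers/"),
]

markup = breadcrumb_jsonld(trail)

# Print the block as it could be embedded in the page's HTML.
print('<script type="application/ld+json">')
print(json.dumps(markup, indent=2))
print("</script>")
```

Because the positions are derived from the order of the trail, the markup follows the taxonomy from broader to narrower topics, just like the visible breadcrumbs do.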

The second information architecture concept you need to be aware of is ontology. It means the relationships between taxonomies.

If an ecommerce hierarchy can be visualized as an inverted tree, with the home page at the top, then an ontology is the forest showing relationships between trees. An ontology might encompass various taxonomies, with each taxonomy organizing topics into a particular hierarchy.

Simply put, an ontology is a more complex type of taxonomy, containing richer information about the content and the items on a website. We are just at the beginning of building ontology-driven sites, and one standard ontology vocabulary for ecommerce is GoodRelations.

The Semantic Web aims to help artificial intelligence agents, such as search engine bots, crawl through and categorize information more efficiently. It is also designed to help identify relationships between items and categories (e.g., relationships between manufacturers, dealers, and prices).

Figure 4 – Related Categories or Related Products can be considered a form of ontology.

If you are not an information architect or a business analyst, you probably will not be involved in identifying related categories and products, but it is important to know these terms in your discussions with information architects.

Sometimes, related categories and products are automatically identified by the ecommerce platform, or by specialized software.

Why is information architecture important for search engines?

A correctly designed information architecture will result in tiered website architecture. A good architecture has an internal linking structure that will allow child pages (pages that can link upwards in the hierarchy, such as product detail pages or blog posts) to support the more important parent pages (upper-level pages that link down in the vertical hierarchy, such as category and subcategory pages).

Figure 5 – Pages that link to each other at the same level of the hierarchy are called siblings. They share the same parent.

With correct internal linking a blog article, for example, “Top 5 New Features of Canon Rebel T5i DSLR” will support the product detail page Canon Rebel T5i DSLR. Canon Rebel T5i DSLR will support the category Digital Cameras, which will further support the top-level category Electronics.

Figure 6 – This pyramid-like structure is a very common architecture for ecommerce.

One of the questions that often comes up when deciding on the hierarchy is “What is the best number of levels to reach a product detail page?”

The famous three-click rule, which suggests that every page on a website should take no more than three clicks to access, is OK to use as a guide, but do not get stuck on it. If you need a fourth level in the hierarchy, that is perfectly fine.

Information architects, business analysts or merchandising teams can help identify relationships between categories, subcategories, and products. Based on these findings, you will decide on rules for an internal linking strategy. Such rules can include:

  • Only highly related categories will interlink.
  • Categories will link only to their parents.
  • Subcategories will link to related subcategories or categories.
  • Product pages will only link to related products in the same category, and parent categories.

A proper website architecture will help your website rank for the so-called head terms. For ecommerce websites, these are usually the category pages, at all levels of the hierarchy. However, internal linking is not enough for a subcategory page to reach the top of the search engine results pages for category-related search queries.

Because head terms are usually competitive, a page targeting such terms should also include:

  • Relevant and useful content. This means that your listing pages should display more than just a list of items. For product detail pages you need to present more than just product pictures and pricing.
  • Backlinks from related trusted external websites.

Additionally, proper information architecture means good usability. Great usability and content create an excellent user experience, which then leads to an increased dwell time (which is good for SEO).

Dwell time is the amount of time that a searcher spends on a page before returning to the SERPs. The longer this time is, the better.

Pogo-sticking means going back and forth between a SERP and the web pages listed in the results. For example, you search for something, click on the first result, you are not happy, you go back to the SERP. Then you click on the second result, you are still not happy, and you go back to the SERP again, and so forth, until you find what you are looking for, or until you refine your search query.

A SERP bounce happens when a search engine user clicks on your page in the SERPs and then goes back to the results without interacting with any page elements.

Note that a high SERP bounce rate is not inherently bad for SEO, but a low dwell time might be. An increased dwell time sends quality signals because it hints to search engines that your page is relevant for a given search query.

Navigation, such as primary, secondary, breadcrumbs or contextual, is also one of the critical components of website architecture. Navigation is jointly crafted by various members of the business, led by the information architect. Given that the primary navigation will be present on almost every page of the website, it influences how authority and link signals (i.e., PageRank and anchor text) are passed to other pages.

Fortunately, there are ways to give users what they want (findability, discoverability, and usability), and at the same time, guide search engine bots towards what you want them to discover, crawl and index.

How can SEO add value to IA?

Remember, information architecture is not about technical issues, but about organizing digital inventory and content. So, while SEO has a key role to play in information architecture, it should not dictate how information is labeled and organized. Information architecture is about making content easy to find and helpful for users. However, because most SEOs are biased towards marketing and technology rather than user experience and usability, it is advisable to involve both an information architect and an SEO consultant, when working out the information architecture.

Try to involve the SEO person from the initial stages of the information architecture process, to provide suggestions and feedback from a search engine standpoint, and to contribute to the overall site architecture discussion. Once the information architect designing the draft information architecture listens to what the SEO has to say, he or she can brainstorm with the other teams about how to implement the SEO recommendations with minimal changes to the initial information architecture format.

Many times, technology and marketing teams will dismiss a certain information architecture just because it does not have traffic potential. Do not make that mistake. When optimizing for search engines and their users, you should listen to what other teams in the business have to say and only then suggest solutions.

As mentioned, SEO’s role is to provide consultancy from the perspective of search engines. Let’s look at a few areas where SEO input is valuable.

The concept of flat architecture

In a flat architecture, deep pages – which are pages at the lower levels of the website hierarchy (usually the product detail pages) – are accessible to users and search engine bots within a balanced number of clicks for users (or hops, for bots).

Figure 7 – This figure depicts what flat website architecture looks like.

The opposite of flat architecture is the so-called deep architecture, and it may look like the diagram below:

Figure 8 – In a deep-architecture model, pages are mostly linked in a vertical structure.

We will use math to illustrate the concept of flat architecture:

  • At level 0 (home page), you link to 100 category pages; 100^1=100 pages linked.
  • From each page at level 1 (the category pages), you link to 100 subcategory pages; 100^2=10,000 subcategory URLs.
  • From each page at level 2 (the subcategory pages), you link to 100 product pages; 100^3=one million product page URLs.

In three “clicks” search engines can reach and crawl (and eventually index) one million pages.

Note: the 100 links-per-page example was used as a guide only. In practice you can have more or fewer links, depending on your site authority.
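The arithmetic above can be checked with a few lines of Python. This is only a sketch of the simplified model used in this example: it assumes every page links to the same number of distinct pages one level down, which real websites rarely do. The function name is my own.

```python
def pages_at_level(links_per_page: int, level: int) -> int:
    """Pages reachable at exactly `level` clicks from the home page.

    Simplified model: every page links to `links_per_page` distinct
    pages one level down, with no overlap between branches.
    """
    return links_per_page ** level


for level in range(1, 4):
    print(f"level {level}: {pages_at_level(100, level):,} pages")
```

With 100 links per page, the third level holds one million URLs. If you change the first argument, you will see how quickly the number of reachable pages shrinks or grows with the number of links per page.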

Let’s look at the scenario of a direct visit to your homepage. To reach a product detail page from the home page, a user will have to perform the following actions:

  • The first click on the Cosmetics category page.
  • The second click on the Eye subcategory page.
  • The third click on the product details page.

If no external links point directly to that product detail page (abbreviated as PDP), search engines will find the PDP URL in a similar way to users. The bot will crawl from an entry page and will, eventually, find its way to the product detail page. Keep in mind that search engines will enter your website through a multitude of URLs, not only through the home page.

In our scenario, it took only three clicks to reach the PDP, but if the website is structured using deep information architecture, it might take users and search engines more clicks or hops.
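If you have a list of your internal links (for example, exported from a crawler), you can measure how many clicks separate each page from the home page. Here is a minimal sketch that uses a breadth-first search over a link graph. The URLs are hypothetical, and the `links` dictionary simply stands in for whatever crawl data you have.

```python
from collections import deque


def click_depths(links, start):
    """Return the minimum number of clicks from `start` to each reachable page.

    `links` maps a page URL to the list of URLs it links to.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical link graph mirroring the Cosmetics > Eye > product scenario.
links = {
    "/": ["/cosmetics/"],
    "/cosmetics/": ["/cosmetics/eye/"],
    "/cosmetics/eye/": ["/mascara.html"],
}

depths = click_depths(links, "/")
print(depths["/mascara.html"])  # 3
```

Pages that never show up in the result are not reachable from the starting page at all, which is usually worth investigating. To model a bot that enters through a different URL, pass that URL as `start`.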

But how and why did we adopt flat architecture?

The concept of flat website architecture seems to have its roots in web design, and it started with the three-click rule becoming a best practice around the year 2000.

However, when usability experts tested this rule, they found that it was not necessarily working for users, as expected. As a matter of fact:

“Users’ ability to find products on an ecommerce website increased by 600 percent after the design was changed so that products were four clicks from the homepage instead of three” (p. 322).

Then smart SEOs jumped in, thinking that if the rule was good for users, then it should be suitable for search engines as well. SEOs found a way to funnel more PageRank to deeper levels and optimize crawling by providing shorter paths for search engines. However, the initial goal was to avoid ending up with pages in the supplemental index because of their very low PageRank; it was not to flatten the site architecture.

Here are a few important pointers about flat architecture:

  • Unless you sell a limited number of products (e.g., just ten dietary supplement pills) or unless you have a very limited number of pages on the site, do not flatten to the extreme. That means do not link from the home page to hundreds of product detail pages, just to build a flat architecture.
  • Flat architecture is about the distance between pages in terms of clicks, not about the number of directories in the URL. For example, you can link from the home page directly to a subcategory URL at the fourth level of the hierarchy (e.g., mysite.com/Home-Garden/Furniture/Living-Room-Furniture/Recliners/) to promote a subcategory that generates high profits. In this example, the Recliners page is only one click away from the home page (which fits the flat architecture concept), but it is four levels down in the directory hierarchy (which matches the deep architecture concept).
  • If you have already organized your hierarchy using directories in URLs, do not remove them just for the sake of flattening.

As long as the directories do not generate super-long URLs, they have advantages such as:

  • Facilitating easier website “theming” (we will talk about this in the Siloing section).
  • Presenting users with a clear delineation of the categories on your website.
  • Allowing for easier SEO, information architecture and web analysis (e.g., you can use site:domain.com/directory/ to troubleshoot indexation problems).
  • Google and other search engines may use your directory structure to create rich-snippet breadcrumbs.

Figure 9 – SERP breadcrumbs will show up only if the directory hierarchy is clear to search engines.

In this screenshot, you can see how Google displays breadcrumbs directly in SERPs. However, such rich snippets will show up only if the directory hierarchy is structured, or if you mark up your breadcrumbs with Schema vocabulary.

It is not mandatory for URLs to replicate the exact website taxonomy. If you want, you can keep the URL structure under two directories deep. Here’s an example.

On hotel reservations websites, it is common to have a taxonomy based on hotel geo-locations:

Taxonomy: Home > Europe > France > Ile-de-France > Paris

URL: domain.com/europe/france/ile-de-france/paris/

Even though the URL reflects the hierarchical taxonomy, it is too long and too difficult to type in or to remember.

If the website sells only hotel rooms, the alternative URL might look like:

domain.com/france/paris/

If the website offers other travel services such as air tickets or car rentals, then the alternative URL will include the type of service, and it might look like:

domain.com/car-rentals/france/paris/

Regarding the directory structure for hotel booking websites, it is worth noting that hotels are a special ecommerce case because you cannot re-categorize hotels from one city to another. However, for online retailers, product re-categorization happens frequently.

To avoid issues associated with moving products from one category to another, or issues related to poly-hierarchies (items that are assigned to multiple categories), keep the PDP URLs free of categories whenever possible.

For example, to reach the product page 3-Level Carousel Media Center, a user will navigate through:

homepage – mysite.com/

category page – mysite.com/office-furniture/

subcategory page – mysite.com/office-furniture/storage/

sub-subcategory page – mysite.com/office-furniture/storage/media-storage/

However, once the searcher reaches the product detail page, the URL is free of categories and subcategories:

mysite.com/3-level-carrousel-media-center.html

Tip: setting product names in stone is also a good idea.

Notice a couple of things about the previous URLs:

  • The product page URL is free of category, subcategory or sub-subcategory names.
  • The category and subcategory URLs include the trailing forward slash (/) at the very end. That hints to search engines that the URLs are directories and there is more content to be found on those pages.

Figure 10 – This is how Google treats trailing slashes in URLs.

  • The product page has a .html file extension. The presence of the file extension hints to search engines that the document is an HTML page (an item page) and not a directory. The file extension can be anything, e.g., .php or .aspx, because the specific file extension does not matter to search engines.
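As an illustration of category-free product URLs, here is a small Python sketch that derives a PDP URL from the product name alone. The slug rules (lowercase, hyphens between words) and both function names are my own assumptions; your ecommerce platform may generate slugs differently.

```python
import re


def product_slug(name: str) -> str:
    """Lowercase the name and turn each run of other characters into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def product_url(domain: str, name: str) -> str:
    """Build a product URL that contains no category or subcategory names."""
    return f"{domain}/{product_slug(name)}.html"


print(product_url("mysite.com", "3-Level Carousel Media Center"))
```

Because the URL depends only on the product name, moving the product to another category does not change its URL. This is also why keeping product names stable matters: renaming the product would change the slug.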

Removing category names from URLs is a trade-off with your data analysis, as it will make the web analysis a little bit more challenging. However, this difficulty is surmountable. For example, you can group pages in your analytics tool, or you can mark up the HTML code with different strings to group pages based on your own rules.

At the same time make sure your web analytics tool is set up to easily group pages for analysis. Without unique identifiers for URLs, it is more difficult to segment data. You can also use tag managers such as Google Tag Manager to create content groups, using Data Layers.
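To show what grouping pages by your own rules might look like, here is a rough Python sketch that assigns URLs to groups using the conventions from this section: a trailing slash for listing pages and a .html extension for product pages. The group labels are hypothetical, and in practice you would configure equivalent rules in your analytics tool or tag manager rather than in a script.

```python
from collections import Counter
from urllib.parse import urlparse


def page_group(url: str) -> str:
    """Assign a URL to a page group based on the shape of its path."""
    path = urlparse(url).path
    if path in ("", "/"):
        return "home"
    if path.endswith(".html"):
        return "product"
    if path.endswith("/"):
        # Group listing pages by their top-level directory.
        return "listing:" + path.strip("/").split("/")[0]
    return "other"


# Hypothetical page views, using the example URLs from this section.
urls = [
    "https://mysite.com/",
    "https://mysite.com/office-furniture/",
    "https://mysite.com/office-furniture/storage/media-storage/",
    "https://mysite.com/3-level-carrousel-media-center.html",
]

print(Counter(page_group(url) for url in urls))
```

Notice that the product page ends up in a generic product group: without the category in its URL, the script cannot tell which category it belongs to. That is exactly the gap that custom strings in the HTML or a data layer need to fill.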

Figure 11 – The flat architecture concept on an ecommerce site.

Siloing

In the simplest terms, siloing means creating a site architecture that allows users to find information in a structured manner while linking pages using a controlled pattern to guide how search engine bots crawl the website. Usually, this structure is a vertical taxonomy.

Siloing sounds like a fancy term, but it is just good information architecture because siloing is one-part website hierarchy and one-part navigation (using internal linking).

Figure 12 – At a basic level, siloing means that pages in a taxonomy branch/category (i.e., PDPs) should not link to pages in a different branch/category.

In strict hierarchy patterns, child pages are only linked to and from their respective parent pages. This is not possible without a strictly controlled internal linking, and it is challenging to create such a strict internal linking pattern mainly because:

  • Primary navigation is present on all pages, so cross-linking will happen naturally.
  • Poly-hierarchies, which means multi-categorizations for products or subcategories. For example, the Office Furniture category can be categorized under Office Products and Furniture.
  • Subcategory cross-linking and crossover products. For instance, you may have to link from a product that is categorized under Home Theater, to another product made by the same brand, but categorized under Audio.

Because ecommerce websites are complex, they are most likely to have a hierarchy that frequently interlinks silos. In practice, it is complicated, and sometimes, not even advisable to prevent internal linking between silos.

Figure 13 – Cross-linking happens naturally

The internal linking architecture can be very cumbersome and difficult to control, even for ecommerce websites with just a few hundred products, as you can see in this graph:

Figure 14 – This node graph shows how complex internal linking can be.

In this example, a website with just a few thousand pages generated more than 250,000 internal links.

The siloing method

Conceptually, siloing is done by identifying the main themes of the website (for ecommerce, those will be departments, categories, and subcategories) and then interlinking only pages belonging to the same silo (for example, linking only within the same category).

The good part is that ecommerce websites are usually developed using a similar architecture, with separate hierarchies (themes) for each department or category.

The idea is that by siloing the website into themes, you will be able to rank high for semantically unrelated keywords with the same site, even though the themes are entirely different, e.g., “hard drives” and “red wines”.

You can achieve silos with directories or with internal linking.

Directories

Information architects create hierarchies using user research, user testing, keyword research, and by analyzing your web traffic. The labels used to describe these hierarchies will be present in the URL structure. Your silos will be the directories in the URLs.

Whenever possible, use a hierarchy created with directories.

Internal links

With internal linking, you create virtual silos, as pages in the same silo do not need to be placed in the same directory. You achieve virtual silos by controlling internal links in such a way that search engine bots will only find links to pages in the same silo. This is a very similar concept to bot herding or PageRank sculpting, with subtle differences in meaning and application.

Siloing with directories

Siloing with directories is the easiest to implement on new websites during the information architecture process. From a user experience perspective, creating the website hierarchy with directories is the best way to go.

But in the end, siloing with directories is nothing more than creating good vertical hierarchies, which the URLs reflect. Many online retailers create them naturally by branching out all categories, without overthinking SEO and without being obsessed with keywords in the anchor text or with internal linking patterns.

A sample silo with directories would look like:

mysite.com/category1/subcategory1/

mysite.com/category1/subcategory2/

mysite.com/category1/subcategory3/

….

mysite.com/category1/subcategoryN/

Does this type of siloing look familiar to you? It should if you use directories in your URLs. Moreover, this is nothing more than a proper hierarchy. So, if you design your website hierarchy correctly, you do not even need to worry about siloing with other methods.

Keep in mind that it is best practice to keep the directory depth low, ideally fewer than four or five levels.
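
If you want to audit an existing URL list against these ideas, a minimal sketch like the one below can help. It reads the silo (the first directory) and the directory depth from each URL. The function names and the max_depth default of 4 are my assumptions for illustration, not fixed rules.

```python
from urllib.parse import urlparse


def directory_levels(url):
    """Return the directory segments of a URL path, e.g. ['category1', 'subcategory1']."""
    return [segment for segment in urlparse(url).path.split("/") if segment]


def silo_of(url):
    """Return the top-level directory (the silo), or None for the home page."""
    levels = directory_levels(url)
    return levels[0] if levels else None


def is_too_deep(url, max_depth=4):
    """Flag URLs whose directory depth exceeds max_depth (an assumed threshold)."""
    return len(directory_levels(url)) > max_depth
```

Note that the URLs must include the scheme (https://) for the path to be parsed correctly.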

Siloing with internal linking

Siloing with directories may not always be possible, for example, if you wish to change an existing hierarchy on an established website. In this case, you will create virtual silos using carefully controlled internal linking.

Usually, pages in a silo need to pass authority (PageRank) and relevance (anchor text) only to other pages within the same silo. This prevents the dilution of the silo’s theme and sends the maximum power to the main silo pages.

Here are some rules for linking within and between silos. A page in a silo:

  • Should link to parents.
  • It can link to siblings, if appropriate. Siblings are pages at the same level in the hierarchy.
  • It should not link to cousins.
  • It could, eventually, link to uncles or aunties. Uncles and aunties are siblings of the node’s parent.

Figure 15 – An over-simplified siloing diagram.

In this simplified siloing example, sibling number one could eventually link to uncles, who are siblings of that node’s parent. That means that if you have to link two related supporting pages found in separate silos (which are called cousins), you should link only to the silo’s uncles.
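
The family metaphor can be expressed as a small sketch. Assuming the hierarchy is stored as a child-to-parent mapping (the category names below are hypothetical), this function labels the relationship between a source page and a link target, so you can check a proposed link against the rules above.

```python
# Hypothetical hierarchy stored as child -> parent.
PARENT = {
    "home-theater": "electronics",
    "audio": "electronics",
    "soundbars": "home-theater",
    "projectors": "home-theater",
    "headphones": "audio",
}


def relationship(source, target, parent=PARENT):
    """Classify how target relates to source in the hierarchy."""
    src_parent = parent.get(source)
    tgt_parent = parent.get(target)
    if src_parent is not None and target == src_parent:
        return "parent"
    if src_parent is not None and src_parent == tgt_parent:
        return "sibling"
    src_grandparent = parent.get(src_parent)
    if src_grandparent is not None and tgt_parent == src_grandparent:
        # The target is a sibling of the source's parent.
        return "uncle_or_aunt"
    if src_grandparent is not None and parent.get(tgt_parent) == src_grandparent:
        # The target is a child of one of the source's uncles or aunts.
        return "cousin"
    return "other"
```

In this sketch, a link from soundbars to headphones is flagged as a cousin link, while a link from soundbars to audio is an uncle or aunt link.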

If you need to link to pages outside the silo, you can block those links from being accessible to search engines (e.g., using AJAX – Asynchronous JavaScript and XML, iframes, or JavaScript files blocked with robots.txt). Note that there is a fine line between white hat and gray hat SEO, and such linking may cross that line. This is because Google’s definition of manipulative techniques lies in the answer to the question: “Would you do it if search engines did not exist?”

The goal is not to take siloing to the extreme. If a page is relevant and you want to link to it, then do so even if it is in a different silo or theme.

Siloing with internal links is a powerful advanced SEO technique, especially for large websites with multiple departments, themes, or categories that are not semantically related, e.g., groceries and mobile phones. However, it is important to know that siloing is not easily achieved, and it pays to be aware of the existing dangers.

If you want to silo with internal linking, know that:

  • PageRank sculpting with rel=”nofollow” is not recommended.
  • Virtual siloing means that you somehow have to “hide” internal links from search engine bots; doing so may fall outside search engines’ guidelines.
  • Hiding internal links from search engines using iframes, AJAX, JavaScript or other similar techniques can qualify as cloaking since you show different content to users than to search engines; this could result in penalties.
  • If you want to obfuscate links with AJAX or JavaScript for SEO reasons, first identify the percentage of users with JavaScript turned off. If that is a significant segment of your total visitors, make sure your website works correctly without JavaScript. Non-JavaScript users should be able to finish all micro and macro conversions on your site. An example of a micro-conversion is an “add to cart” event, while a macro-conversion is a completed order.
  • Trading away too much for SEO at the expense of usability and accessibility is not the right way to go.
  • Siloing may require hiding entire navigation elements, such as facets and filters, from search engines. There are risks associated with such bold tactics.

Figure 16 – The nofollow links are marked with the red-dotted border

In the image above, you can see how only the top-level categories (Women, Men, Baby, etc.) and the immediate next hierarchy level of subcategories (Clothing, Shoes, Accessories, etc.) pass authority through links. Category links are nofollow. This is a very bold (most likely bad) SEO approach to handling primary navigation menus.

Proper internal cross-linking is helpful and necessary for good rankings, and we will discuss this in detail in the Internal Linking section, but remember that internal linking must be built for users first, and only then for search engines. You must link consistently, thematically and wisely (using synonyms, stems, plurals and singulars, and so on) to support rankings for categories and subcategories.

You should not remove navigation elements just for SEO purposes. Keep the links that are useful for users in the interface, and if you want to remove links for SEO reasons do it by blocking those links with AJAX or JavaScript.

Another theming method is to evolve taxonomies into ontologies. Instead of linking based strictly on a vertical taxonomy, interlink items that are conceptually related. For example, you can interlink a particular fragrance with the sunglasses manufactured by the same brand. This type of interlinking requires defining semantic and conceptual relationships between categories and items and then deciding on the internal linking based on predefined business rules.

One such business rule is crowd-sourced recommendations (AKA Customers Who Bought This Item Also Bought…). Do users often buy certain products together? If yes, then cross-link those product detail pages, even if they are in different silos.

If this type of linkage generates too many internal links on some pages, you can always block the less important links (you will have to define how many links is too many for your particular situation). However, for the sake of users, interlink whenever necessary, without being concerned about siloing.

If the business rules are based on data, then you will not be linking from adult toys to children’s books. Also, you will not link to hundreds of related products, but just to a few highly related items.
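
To show what a data-based business rule could look like, here is a minimal sketch that counts how often two products appear in the same order and returns only the few strongest matches for a given product. The order format, the min_count and limit defaults, and the function names are assumptions for illustration; real recommendation systems are usually more sophisticated.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(orders):
    """Count how many orders contain each unordered pair of products."""
    counts = Counter()
    for order in orders:
        # sorted(set(...)) removes duplicates and gives each pair a stable key.
        for a, b in combinations(sorted(set(order)), 2):
            counts[(a, b)] += 1
    return counts


def related_products(product, orders, min_count=2, limit=3):
    """Return up to `limit` products bought with `product` at least `min_count` times."""
    scores = Counter()
    for (a, b), n in co_purchase_counts(orders).items():
        if n < min_count:
            continue
        if a == product:
            scores[b] = n
        elif b == product:
            scores[a] = n
    return [name for name, _ in scores.most_common(limit)]
```

The min_count threshold keeps one-off coincidences out of the cross-links, and limit keeps the number of related items small.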

Here’s what Google has to say about the subject of theming an internal architecture, in a post on their official blog:

Q: Let’s say my website is about my favorite hobbies: biking and camping. Should I keep my internal linking architecture “themed” and not cross-link between the two?

A: We haven’t found a case where a webmaster would benefit by intentionally “theming” their link architecture for search engines. And, keep-in-mind, if a visitor to one part of your website cannot easily reach other parts of your site, that may be a problem for search engines as well.

This is a reminder not to take siloing to the extreme. However, siloing with directories is natural, and the resulting internal linking is also great for users and search engine bots.

I lean towards a hybrid siloing concept combining the following:

Generate content ideas

It is widely known that keyword research can help with generating content ideas. Keyword research also enables you to expand from a relatively narrow set of head keywords (category and subcategory keywords) to a large number of torso and long tail keywords. These long tail keywords can then be used to generate content ideas, identify product attributes, and improve product descriptions.

Based on the initial taxonomy created by the information architect, you can identify keyword patterns, tag user intent, group keywords according to buying stages, and find search volumes; I will cover these tactics in the Keyword Research section.

This type of research provides excellent insights that are usually overlooked by the other teams in an ecommerce business.

If you want to consistently publish content that your target market will find relevant, you will have to know not only the queries used by searchers but, more importantly, the type of content they are looking for. Are they looking for general information about your products? If so, you would do well to put more emphasis on review-type content and how-to articles. Are they searching for products to buy? If so, you could improve the content on a certain product detail page.

You can better address your target market’s needs once you have gained an understanding of what they want, by discovering the user intent behind the search query. When you do so, you will be better able to address their needs on your landing pages. And when your landing pages address people’s needs, conversion rates tend to rise and rankings may improve (as an indirect quality signal).

Here are some interesting facts about search queries:

Figure 17 – The search demand curve, as explained by Moz. Notice how the long tail of keywords and the chunky middle make up more than 80% of the keywords.

Why did I mention these search query facts?

It is because the correct way to start keyword research and build a great website architecture is by recognizing that only a small fraction of your target market is ready to buy, at any given moment. Many ecommerce websites mistakenly focus on targeting keywords such as department, category or subcategory names while completely ignoring a large number of informational search queries (and even navigational). I will detail a keyword research process in the Keyword Research section of the book.

Let’s look again at a typical ecommerce website architecture:

Figure 18 – Under this sample hierarchy, product detail pages are not supported by any other content-heavy level, below the PDP level.

Including the home page, there are four levels in this example: The first level is the home page, which is supported by categories (second level), subcategories (third level) and product detail pages (4th level). The subcategory and product detail pages support the category level; product detail pages then support subcategory pages. However, the product pages are the “leaves” in this example – they are the last level of the hierarchy.

When an ecommerce website does not support important pages (i.e., categories or PDPs) with an additional content-heavy level in the hierarchy, it can miss a considerable amount of organic traffic coming from informational search queries. It will also miss out on the ability to create useful contextual links to product, subcategory and category pages.

In our example, you can overcome these challenges by creating a 5th level in the hierarchy. This level can be a blog, a learning center or a projects section on the website, to name just a few ideas. This content-rich section can also be outside of your existing hierarchy.

On Victoria’s Secret website, I was not able to find a single reference to their blog. This is bad for them, but it is good news for the small guys competing in their niche.

Figure 19 – Only five pages on this website contained the word “blog”, and none of them were part of a real blog.

Here are two ideas for you:

  • Add a new layer of support for all pages on the website, especially for product and subcategory pages. As I mentioned, this layer can be a blog, a forum, expert Q&As, how-to guides, buying guides, white papers, workshops and so on. This layer will generate additional organic traffic and will provide support for contextual, internal linking. Additionally, it may help build a community around your brand, which is always a great thing to do.
  • Conduct keyword research with this new level in mind, which means you will not dismiss informational keywords. Categorize such keywords into the Informational bucket in your spreadsheets, and plan content based on them — more about this process in the Keyword Research section.
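
As a starting point for bucketing keywords in bulk, here is a minimal sketch that tags a keyword by intent based on simple modifier words. The modifier lists and bucket names are hypothetical examples; you would tune them to your niche and review the output manually.

```python
# Hypothetical modifier lists; adjust them for your own niche.
INFORMATIONAL_MODIFIERS = ("how to", "what is", "guide", "ideas", "diy", "vs")
TRANSACTIONAL_MODIFIERS = ("buy", "price", "cheap", "sale", "coupon")


def intent_bucket(keyword):
    """Return a rough intent bucket for a keyword based on whole-word modifiers."""
    # Pad with spaces so that modifiers only match whole words or phrases.
    text = f" {keyword.lower().strip()} "
    if any(f" {modifier} " in text for modifier in INFORMATIONAL_MODIFIERS):
        return "Informational"
    if any(f" {modifier} " in text for modifier in TRANSACTIONAL_MODIFIERS):
        return "Transactional"
    return "Unclassified"
```

Keywords that land in the Unclassified bucket still need a human decision; the point of the sketch is only to avoid dismissing informational keywords by default.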

Let’s say that you sell home improvement items, and you want more people to come to your website and buy them. However, a lot of searchers in this niche are DIYers, and they use keywords specific to the awareness and research stages. Then why not create a series of DIY home improvement projects and publish them in a content-heavy section of the website?

Take a look at the following inspiring piece of content from Home Depot’s blog. Home Depot is not at all into selling instructional DIY DVDs, but they are attracting the target market with highly related content. Home Depot has an entire DIY section on their website.

Figure 20 – Notice how this page supports category and product pages, by linking to them.

When you add a new content-rich layer in the hierarchy, you:

  • Expose your brand to your target market, in the early stages of the buying funnel.
  • Add a new way to generate more traffic.
  • Give visitors more reasons to buy from you.
  • Reinforce product and category pages with better internal linking.

Let’s see how SEO could help regarding information architecture.

Evaluate the information architect’s input

Planning an ecommerce architecture starts with information architects identifying the navigation labels such as departments, categories or subcategories.

In many cases, information architects do not associate this process with the keyword research process, which is good, because navigation has to serve the users, not the bots. However, you should evaluate the architect’s input, from a search engine perspective.

Here’s an example of how to do that using Google Trends. If the information architect wants to label one of the categories in the primary navigation as mp3 players, the following search trend comparison data might change his or her mind.

Figure 21 – The trend for “iPod” is downwards, but it is still a few times more than the one for “mp3 player”.

Indeed, iPod can be a child of the mp3 player parent, but you should brainstorm with others in the team to decide whether making the iPod category easier to find would be more beneficial for users, which may mean displaying it directly in the primary navigation.

Many times, the search volume for a parent category is higher than the search volume for a child category, but as you can see in this example, this rule is not definitive.

Also, note that Google Trends displays normalized data on a scale of 0 to 100, where 100 marks the peak search interest for the terms, region and time range you selected. Google Trends does not present absolute search volumes.
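
To make the idea of normalized data concrete, here is a minimal sketch of relative scaling, where the largest value in a series becomes 100 and everything else is expressed relative to it. The numbers are made up, and this is only an illustration of the concept, not Google’s actual calculation.

```python
def normalize_to_100(values):
    """Scale a series so that its largest value becomes 100."""
    peak = max(values)
    if peak == 0:
        return [0 for _ in values]
    return [round(value / peak * 100) for value in values]


# Made-up interest figures for three time periods.
print(normalize_to_100([200, 400, 100]))
```

Two series with very different absolute volumes can produce the same normalized curve, which is why you cannot read search volumes from this kind of index.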

All ecommerce websites will have primary navigation (aka global or main navigation), secondary navigation (aka local navigation) and some contextual navigation. Another form of navigation specific to ecommerce websites is faceted navigation.

Primary and secondary navigation

Primary navigation is for the content that most users are interested in, but keep in mind that importance is relative (something important for your business may not be as important for another business). In general, on e-commerce websites, primary navigation displays departments, categories or market segments (e.g., men, women, kids, etc.).

Primary navigation is the easiest type for most users to identify. It allows direct access to the website’s hierarchy and is present on almost every page of a website.

Figure 22 – A sample primary navigation on Kohl’s website.

On a side note, it will be difficult for Kohl’s to rank for top-level category keywords (e.g., Home, Bed & Bath, Furniture, Outerwear, etc.) since they will have to compete with niche-specific websites that are laser-focused on a single segment—for example, a company that sells just furniture. It is not impossible for Kohl’s to achieve good rankings, but it will require significant work including onsite SEO and quality backlink development.

Regarding secondary navigation, even information architecture experts like Steve Krug, Jesse James Garrett, and Jakob Nielsen cannot agree on a single definition.

Secondary navigation stands for content that is of secondary interest and importance to users. Again, importance is relative to each business.

Strongly connected with navigational links, there is an SEO best practice that recommends keeping the number of links on a page under 100. However, this is an obsolete rule; you can list more than 100 links on your pages, depending on the authority of your website.

You will see high authority websites like Walmart listing hundreds of internal and external links:

Figure 23 – There are 633 links on this page. This may be too many unless you have an excellent site authority.

Walmart’s large number of links results from the use of the so-called fly-out mega menus in the primary navigation, for usability reasons. This type of menu makes deeper sections of the website easily accessible to users.

Mega menus allow direct linking to subcategories and even to products, but you must be careful to keep the number of links to a reasonable limit. Since the primary navigation is present on most of the pages on a website, it has a pretty significant influence on how authority moves back and forth between pages.

Consolidating a long list of departments into a single place has to do with design considerations (limited screen real estate) and user experience (too many options to skim at once). However, it also affects the PageRank passed to the other pages.

Figure 24 – Design limitations forced Walmart to reduce the number of links in the navigation. Notice the “See All Departments” link at the bottom of the primary navigation.

However, Walmart has a separate page for the complete list of their departments (e.g., health) and categories (e.g., vitamins):

Figure 25 – The “All Departments” consolidation is a clever idea because this page will act as a sitemap for both people and search engine bots.

SEO can help information architects decide which categories are the most important for users and therefore should be listed in the primary navigation. Use web analytics tools to identify metrics such as the most searched terms on the website, the most viewed pages, and the keywords with the highest search volume in pay-per-click campaigns.

Figure 26 – The keyword with the highest number of internal site searches could eventually be placed in the navigation if it makes sense or, it can be placed near the search field.

Contextual navigation

Contextual navigation refers to the navigation present in the main section of web pages. It excludes boilerplate navigation items such as those found in headers, sidebars or footers.

Some examples of contextual navigation on ecommerce websites include sections such as:

Figure 27 – Customers who viewed this item also viewed

Figure 28 – Best Sellers

Figure 29 – Contextual text links in the main content (MC) areas.

Figure 30 – Links in Recommended Products carousels.

You will need to discuss contextual navigation with the information architect to identify relevant relationships between categories, subcategories, and products and to plan the internal linking accordingly.

Prioritization

SEO can help with the prioritization of labels in the navigation.

It is helpful to know how many pages will be linked from structural sections of the website (primary, secondary and footer links) on each page template. This is important to estimate because you need to determine how many links you can display in the contextual navigation (only if you need to limit the number of links on pages).

This is not a definitive rule, but if you start a new website, it is a good idea to keep the number of links on each page to a maximum of 200. This is because you will have only a small amount of authority to pass along to lower levels in the beginning.
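
If you want a quick way to check how many links a page template produces, here is a minimal sketch that counts the anchor tags with an href attribute in an HTML string, using only Python’s standard library. The class and function names are mine, and the sketch ignores links injected by JavaScript after the page loads.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Collect the href value of every <a> tag found in the HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def count_links(html):
    """Return the number of <a> tags that carry a non-empty href."""
    parser = LinkCounter()
    parser.feed(html)
    return len(parser.hrefs)
```

Run it on the saved HTML of each page template (home, category, product detail page) to see how much room is left for contextual links.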

Here are some prioritization guidelines:

  • Keep the number of top-level categories or departments in the primary navigation low, to avoid the paradox of choice. Research has established that having too many options is bad for decision making.
  • The short-term memory “rule of seven items” does not apply to primary navigation, as users do not need to remember the labels.
  • You can list more categories on a “view-all departments” or “view-all categories” page.

Figure 31 – In a horizontal design, the primary navigation is constrained by design space.

As you can see in the examples above, in a horizontal design, the primary navigation is constrained by design space. Notice how short the category names must be. Macy’s displays eleven labels in the primary navigation, the same as BackCountry, while Office Depot lists only nine.

  • Vertical primary navigation placement allows for more categories to be listed:

Figure 32 – Costco displays 18 categories in the menu (the same as Sears) while Walmart displays only 13.

Specialty retailers will probably have fewer than two or three departments (sometimes none), in which case they may not even list departments in the menu, but categories. General department stores can have up to 20 departments.

  • You can break each category level into 20 to 40 subcategories, depending on how extensive your inventory is.
  • If a parent category needs more than 40 subcategories, consider adding a new parent category or implementing faceted subcategories.
  • Ideally, the hierarchy depth to reach a product detail page should be four levels or fewer:
    • Two levels deep: home, category, product detail page (this is suitable for niche retailers).
    • Three levels deep: home, category, subcategory, product detail page (this is the most common setup for medium-sized ecommerce websites).
    • Four levels deep: home, department, category, subcategory, product detail page OR home, category, subcategory, sub-subcategory, product detail page. This setup is specific to marketplaces, large department stores or websites with extensive inventories.
  • If the hierarchy has more than four or five levels, use faceted navigation to allow filtering by product attributes.
  • To improve the authority (PageRank) and the relevance (anchor text) of product detail pages, add a content layer (e.g., blog, community forums, user reviews and so on) in the hierarchy just below the product detail page level and link to relevant items from there.
  • Ordering categories (or items) alphabetically is not always the best option. You should prioritize based on popularity and logic whenever possible and, if needed, complement with alpha navigation if user testing proves that this type of navigation is indeed useful.

Figure 33 – An older version of primary navigation on OfficeMax, featuring alpha navigation.

Figure 34 – Newer screenshot after OfficeMax tested the alpha navigation and reverted to category name navigation.

  • If a category has too few items, consider moving them to an existing category with more items but do this only if the new categorization makes sense for users.
  • If a category has too many items (i.e., thousands), it may generate information overload. In this case, you can break the category into smaller subcategories. Additionally, create a user experience that allows better scope selection, before displaying a list of items.

Keyword variations

Planning a categorized product hierarchy is not easy. At the top category level, the labels in the primary navigation must be intuitive, must have the appropriate search volumes and must be concise enough to support menu-based navigation. It is worth repeating that determining the hierarchy of an ecommerce website based solely on keyword research is neither ideal nor recommended. However, keyword research should complement and support information architecture.

One common question regarding keywords is how to handle misspellings, synonyms, stemming or keyword variations for a category. Where do you place them in the website’s information architecture?

For your internal site search, this should be easy to handle: you must associate each keyword variation, misspelling, etc. with an existing product or category and redirect users to the respective canonical product or category page. If there is no exact match between the keyword variation, misspelling or synonym and a category on your site, then send users to an internal site search result page.

For example, when someone searches for “tees”, “tee shirts” or “tshirts”, you return results for “t-shirts”. If there is an exact match between the search query and the category name you can also redirect the searcher to the t-shirts category listing page.
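
The logic described above can be sketched in a few lines. The synonym table, the category URL and the /search?q= pattern below are hypothetical; your ecommerce platform will have its own way of configuring them.

```python
import re
from urllib.parse import quote_plus

# Hypothetical mapping of keyword variations to the canonical term.
SYNONYMS = {
    "tees": "t-shirts",
    "tee shirts": "t-shirts",
    "tshirts": "t-shirts",
    "t shirts": "t-shirts",
}

# Hypothetical mapping of canonical terms to category listing pages.
CATEGORY_URLS = {"t-shirts": "/t-shirts/"}


def normalize_query(query):
    """Lowercase, collapse whitespace, and map known variations to the canonical term."""
    cleaned = re.sub(r"\s+", " ", query.strip().lower())
    return SYNONYMS.get(cleaned, cleaned)


def search_destination(query):
    """Return the category URL on an exact match, else a site search results URL."""
    term = normalize_query(query)
    if term in CATEGORY_URLS:
        return CATEGORY_URLS[term]
    return "/search?q=" + quote_plus(term)
```

With this sketch, a search for "Tee Shirts" lands on the t-shirts listing page, while a query with no matching category falls back to the search results page.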

Figure 35 – Make sure that your internal site search works appropriately and does not return wrong products, as in this example (a search for “t-shirt” returned bras).

In this screenshot, I wanted to highlight the improper handling of internal site search results: returning bras when someone searches for “t-shirts”.

Handling keyword variations for external search engines is a bit more complicated. Commercial search engines like Google and Yahoo! need to understand and connect keyword variations with the right content on your website.

In the past, you would’ve created individual pages to target keyword variations (or a group of keyword variations). However, Google shifted to ranking topics instead of individual keywords. Therefore, it is important that your pages include the searched keyword and semantically-related words (e.g., synonyms, plurals).

Just make sure you are not overdoing it; including all 20 possible variations of a keyword on a single page is spammy.

Here are some ideas for you to consider:

Target the most common variations in the title, the description, or both.

Figure 36 – Gap targets keyword variations in the description, while Sears uses the title tag.

Use product and category descriptions

One option is to use category or product description sections to add keyword variations in the copy. The bottom of the image below highlights how this website uses two keyword variations for “t-shirts”.

Figure 37 – This retailer uses the words “tees” and “t shirts” in the category description copy, to capture traffic for those keyword variations.

Take advantage of related searches

This approach requires displaying a related searches section on your pages. This section may contain several of the most used keyword variations:

Figure 38 – Keep in mind that Related Searches sections should be useful for users, in the first place, and only then for search engine bots.

Identify possible information architecture problems

You can perform the site: query on Google, for example, “category_name site:mysite.com” (without quotes) to see whether search engines list the right page at the top. You can also use products and subcategories in the site: query. For example, you can search for:

site:www.costco.ca/ gourmet products

site:www.costco.ca/ “gourmet products”

If the page you optimized for that query does not show up at the top of the results, various reasons are possible, such as:

  • Improper internal linking. This happens when the internal linking architecture does not support the correct page.
  • Thin content, no content or inaccessible content (e.g., JavaScript reviews) on the right page.
  • External links point to the wrong page(s), diluting and reducing the relevance of the correct pages. If people are linking to the wrong pages, you must ask yourself why. Maybe those other pages are more relevant to them?
  • Page-specific penalties.

Of course, an in-depth analysis is required to identify the cause of these issues. When attempting to determine the cause of such problems, it is important to understand how the targeted page (the page you want to rank at the top of the SERPs) is linked internally from other pages on your website, and from external sites.

One of the tools for analyzing this is Google Search Console:

Figure 39 – The Internal Links report will display the most important internal links, but only for the most important pages on the website.

This report is basic, but it can provide some immediate insights. Look for signals such as:

  • Are there more internal links to the wrong page(s) than to the desired page?
  • Is the desired page linked from the parent pages (pages higher in the hierarchy)?
  • Is the desired page linked from pages with high authority?
  • Is the desired page linked with the proper anchor text?

If there are issues like these, it is time to restructure your internal linking. Keep in mind that Google will not allow you to download the complete list of links, only the top ones.

Another useful method to assess the internal linking is to run a crawl on your website using tools like Xenu Link Sleuth or Screaming Frog and export the results to Excel.
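
If you prefer a script to a spreadsheet, here is a minimal sketch that summarizes a crawl export. It assumes a CSV with the columns source, target and anchor; these column names are hypothetical, so rename them to match whatever your crawler actually exports.

```python
import csv
import io
from collections import Counter, defaultdict


def summarize_inlinks(csv_text):
    """Count internal links and anchor texts per target URL from a crawl export."""
    inlinks = Counter()
    anchors = defaultdict(Counter)
    for row in csv.DictReader(io.StringIO(csv_text)):
        target = row["target"]
        inlinks[target] += 1
        # Normalize case and whitespace so "Gourmet" and "gourmet " count together.
        anchors[target][row["anchor"].strip().lower()] += 1
    return inlinks, anchors
```

Comparing the counts for the desired page with the counts for the page that currently ranks helps answer the questions listed above: which page receives more internal links, and with which anchor text.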

It is also a good idea to run the most important terms through your internal site search to check whether there is a match between the URL returned by your internal site search and the URL returned by search engines.

For instance, let’s say that Google returns the Gourmet Products category URL in the first position when you search for “site:costco.ca gourmet products”. If you were to click on the result, the Gourmet Products page opens:

Figure 40 – Costco’s organic search landing page, pointing visitors to the right category page.

However, Costco’s internal site search returns a different page, which is a search results page. This is not the best approach from a usability point of view or for search engines, because Google does not want to list other search results pages in its SERPs.

Figure 41 – In Costco’s case, this mismatch may happen because of the setup of the internal site search rules.

In many cases, when there is an exact match between a user’s query and a category name, it is preferable to redirect the user to the listing page instead of to a search results page.

Labeling

Labeling, which means choosing the names of the links in the navigation, is an area where information architecture and search engine optimization overlap. SEOs and information architects must understand the user’s mental model to label the navigation correctly. Labeling is not easy and presents a real challenge for very large ecommerce websites. Research from eBay shows how complicated it can get.

While most ecommerce taxonomies can be architected based on a predefined vocabulary, SEO can assist in the labeling process.

Let’s say you sell toys. Start by searching for the category name (“toys”) using Google’s Keyword Planner:

Figure 42 – Do not forget to set up the targeting options based on your target market.

Download the list generated by Keyword Planner and open it with Excel. Then, categorize keywords into “buckets” by mapping each keyword to either its category, attribute or filter name:

Figure 43 – Categorize keywords into “buckets”.

Insert a pivot table that counts the occurrences of Category:

Figure 44 – Sort by Count of Category.

If you sort by Count of Category, you can get an idea of what needs to be present in the navigation. You can also identify filter values that can be used in the faceted navigation.
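If you prefer a script to a spreadsheet, the bucketing and the pivot count can be sketched as follows. The term-to-bucket mapping is a hypothetical sample that you would build by hand while tagging keywords, and the substring matching is deliberately naive.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical tagging rules: a term found in the keyword -> its bucket.
TERM_TO_BUCKET: Dict[str, str] = {
    "lego": "brand",
    "fisher price": "brand",
    "toddlers": "age",
    "elmo": "character",
    "pirate": "theme",
}


def tag_keyword(keyword: str) -> List[str]:
    """Return the buckets whose terms appear in the keyword (possibly none)."""
    text = keyword.lower()
    return sorted({bucket for term, bucket in TERM_TO_BUCKET.items() if term in text})


def count_buckets(keywords: List[str]) -> List[Tuple[str, int]]:
    """Count bucket occurrences, most frequent first (like the pivot table)."""
    counts: Counter = Counter()
    for keyword in keywords:
        counts.update(tag_keyword(keyword))
    return counts.most_common()


sample = [
    "lego toys",
    "fisher price toys for toddlers",
    "elmo toys",
    "lego pirate toys",
]
bucket_counts = count_buckets(sample)
```

In this small sample, brand comes out on top, which matches the kind of signal you would look for in the sorted pivot table.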

Some navigation labels will be easy to identify after tagging fewer than a hundred keywords. For instance, in our example, it seems clear that brand should be a primary or secondary navigation label and users should be able to navigate and filter items by brands. Other possible candidates in this example are age, theme, and character.

Take the findings from this type of research and discuss them with the information architect.

Another thing you should do with the keyword list generated by Keyword Planner is to get the individual word frequency, using tools such as wordle.net:

Figure 45 – Words sorted by frequency.

Visually, this is what the word frequency looks like for our previous example:

Figure 46 – The “word cloud” for a list of keywords.

The image above is what we call the “word cloud,” and in our example I excluded the words “toys” and “toy”, to make the other words stand out.

The frequency of the word “kids” is particularly interesting. If you sell toys only for kids (and not for any other age group, such as adults), then you should probably exclude the word “kids” from your analysis.
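The same word count can be produced with a few lines of code instead of a word cloud tool. This is a minimal sketch; the function name and the sample keywords are illustrative, and the exclusion list plays the role of the words I removed from the cloud.

```python
import re
from collections import Counter
from typing import Iterable


def word_frequency(keywords: Iterable[str], exclude: Iterable[str] = ()) -> Counter:
    """Count individual words across all keywords, skipping excluded words."""
    excluded = {word.lower() for word in exclude}
    counts: Counter = Counter()
    for keyword in keywords:
        # Split on anything that is not a letter, digit or apostrophe.
        for word in re.findall(r"[a-z0-9']+", keyword.lower()):
            if word not in excluded:
                counts[word] += 1
    return counts


freq = word_frequency(
    ["toys for kids", "kids toys", "lego toy"],
    exclude=("toys", "toy"),
)
```

Calling `freq.most_common()` returns the words sorted by frequency. If “kids” adds nothing for your store, add it to the exclusion list as well.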

If you are in this niche, you may notice that a few essential segments/labels are missing from this keyword list:

  • One is the gender label (girls and boys).

  • Is your target market price sensitive? Then pricing might be another segmentation/label (shop by price).

Insights like the ones above cannot be discovered using keyword tools. So, how do you identify these “hidden” labels? By conducting user research, user testing, creating consumer personas and scenarios, user flows, website maps and wireframes.

Keep in mind that from an information architecture perspective, labeling does not stop with the text used for links and navigation. There are different types of labels as well, such as:

Document labels

  • URLs (whenever possible, URLs should contain keywords that make sense to searchers and to search engines).
  • File names (having relevant keywords in file names is important for SEO and users).

Content labels

  • Page titles should make sense to searchers and search engines. When there is a partial match between the keywords in the HTML title element and the search query, search engines will emphasize (bold) the matched keyword(s), which may help with SERP click-through rates (CTR).
  • Headings and sub-headings. Headings use large fonts and attract the eyes almost immediately. Putting keywords in headings assures users that they are in the right place and helps with dwell time and bounce rates.

Other types of navigation labels

  • Breadcrumbs. Keep in mind that since search engines became so popular, home pages are not the only entry points to websites. Therefore, use breadcrumbs to easily and quickly communicate the hierarchy of your site to searchers.
  • Contextual text links. Using keyword-rich anchor text placed in a sentence or paragraph is one of the best ways to interlink pages, either vertically or horizontally.
  • Footers are also a type of navigational label.
    A quick note on this type of navigation: this is probably the place people spam the most, by creating tens of keyword-rich internal links.

Figure 47 – The screenshot depicts a footer that makes this website a good candidate for an over-optimization filter.

This footer is mainly boilerplate text, meaning that search engines will most likely ignore it when assessing this page’s content and the anchor text relevance.

It does not help to repeat “men’s {category name}” across a million pages since search engines can exclude boilerplate text pretty well when computing relevance.

Figure 48 – An excerpt from Google’s webmaster guidelines regarding boilerplate repetition.

It is funny how SEOs refer to the concepts discussed in this section of the course as on-page SEO factors, while information architects refer to the same as labels. It seems that SEOs and information architects work with similar and related concepts, but they still cannot easily come to agreements when it comes to optimizing websites for both searchers and search engines.

Poly-hierarchies

SEO can help information architects with canonicalizing poly-hierarchies.

Very often, multiple suitable hierarchies could be appropriate for a given item. It is important to help the information architect choose the best fit as the canonical hierarchy and to stick to it. From the primary or secondary navigation, you should link only to the canonical hierarchies.

Ideally, all links on the website should point to only one canonical hierarchy.

You can keep as many logical hierarchies as are helpful to users, but to avoid confusing search engines, link to the canonical hierarchy as well.

For example, the Elmo category can be found under:

Toys > Stuffed Animals > Elmo (URL: mysite.com/toys/stuffed-animals/elmo/)

Gifts > Holidays > Christmas > Elmo (URL: mysite.com/gifts/holidays/christmas/elmo/)

If you decide that the first hierarchy is the canonical one (usually canonical hierarchies are the shortest), then whenever you link internally to the Elmo category, use the URL mysite.com/toys/stuffed-animals/elmo/
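Here is a minimal sketch of that rule of thumb. It picks the path with the fewest directory levels as the canonical one and maps every alternative path to it. The function names are illustrative, and "shortest wins" is only a default; the final choice remains an editorial decision.

```python
from typing import Dict, List


def depth(path: str) -> int:
    """Number of non-empty directory levels in a URL path."""
    return len([segment for segment in path.split("/") if segment])


def pick_canonical(paths: List[str]) -> str:
    """Pick the shallowest path; ties go to the first one listed."""
    return min(paths, key=depth)


def build_link_map(paths: List[str]) -> Dict[str, str]:
    """Map every hierarchy path to the canonical path used for internal links."""
    canonical = pick_canonical(paths)
    return {path: canonical for path in paths}


elmo_paths = [
    "/gifts/holidays/christmas/elmo/",
    "/toys/stuffed-animals/elmo/",
]
elmo_links = build_link_map(elmo_paths)
```

A lookup table like `elmo_links` can then be used by templates so that every internal link to the Elmo category uses the same URL.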

You can use your web analytics tool to see how most users reached a given page. For example, look at the Navigation Summary report generated using Google Analytics (under Behavior –> All Pages), and see how most people reached the Elmo page:

Figure 49 – To get this report, follow the steps illustrated in this screenshot. If you want a more detailed analysis, use the Visitors Flow report under the Audience tab.

Additionally, you can look at the Refined Keywords dimension in the Behavior –> Search Terms section to understand what keyword refinements were made after a search for “Elmo”. The Refined Keyword report can also be a source of keyword variations as you can see in the following screenshot:

Figure 50 – The Refined Keyword report can be a source for keyword variations.

Remember that there is no wrong or right way to classify a product into certain taxonomies, as long as you refine them over time when needed. However, once you have decided on a canonical hierarchy, it is a good idea to set that in stone.

Here are some other SEO tips for ecommerce information architecture:

  • If you use Google Analytics (or any other web analysis tool), activate the Site Search Tracking option. Analyze what users search for and use that information to decide on the website’s hierarchy. However, do not rely solely on your web analytics data, because you will miss a lot of data that is sourced outside your site.
  • Use keyword research tools to identify keyword variations and suggestions for the terms you have in mind or for those generated with user research and card sorting.
    • Google Keyword Tool
    • Search Term/Query Reports
    • Wordstream
    • Ubersuggest
    • Keywordtool.io
    • Google Suggest
    • Google Correlate
    • SEMRush
    • SpyFu
  • Analyze your competitors’ website architecture and navigation, but do not copy blindly. Use their information for inspiration but create your own site architecture in the end.
  • Use a crawler on your competitors’ websites and sort their URLs alphabetically. You may need to crawl a large number of URLs (e.g., 250k+ URLs) for this to work.
  • Find your competitors’ sitemaps (both the HTML sitemap and the XML Sitemap) and analyze them in Excel.

Figure 51 – Sorting URLs alphabetically can reveal the website structure.
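Sorting in Excel works well, but the same idea can be sketched in code by grouping crawled URLs by their leading directories. The function name and the example.com URLs below are placeholders; the input would be the URL list exported from your crawler.

```python
from collections import Counter
from typing import Iterable, List, Tuple
from urllib.parse import urlparse


def sections(urls: Iterable[str], depth: int = 1) -> List[Tuple[str, int]]:
    """Count URLs per leading directory, sorted alphabetically by directory."""
    counts: Counter = Counter()
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # Keep only the first `depth` directory levels; the home page is "/".
        key = "/" + "/".join(segments[:depth]) if segments else "/"
        counts[key] += 1
    return sorted(counts.items())


crawl = [
    "https://example.com/toys/lego/",
    "https://example.com/toys/elmo/",
    "https://example.com/gifts/",
    "https://example.com/",
]
top_level = sections(crawl)
```

Increasing `depth` reveals the subcategory level, which is roughly what you see when you scroll through an alphabetically sorted URL list.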

  • Download the DMOZ taxonomy and look at the shopping categorization.
  • When choosing category names, use Google Trends to check whether there is a steep drop in what people search online, over time.

Figure 52 – Notice how the interest in “digital cameras” trends downwards. Maybe this has to do with mobile phones that yield increasingly better pictures.

  • Do not create the website hierarchy solely on keyword research data; validate with card sorting and user interviews. Nowadays, you can quickly do that online.
  • Perform simple navigational queries, like “contact {your_brand}” and make sure the contact URLs, and all other important URLs, are user-friendly.

Figure 53 – This is a not-so-friendly “contact us” URL.

Remember, labeling applies to URLs too, not only to links. In this example, the URL is not optimized for users (nor for search engines). This may be limited by the CMS, in which case it may be time to ditch the old CMS for a new one.

A friendly URL will read www.jcpenney.com/contact-us OR jcpenney.com/contact-us
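If your CMS lets you control how URLs are generated, a friendly URL segment can be derived from the page label. The sketch below shows one common approach (lowercase, hyphen-separated, ASCII only). The function name and the exact rules are my assumptions, not a standard.

```python
import re
import unicodedata


def slugify(label: str) -> str:
    """Turn a page or category label into a lowercase, hyphenated URL segment."""
    # Drop straight apostrophes so that "Men's" becomes "mens", not "men-s".
    text = label.replace("'", "")
    # Decompose accented characters, then drop anything that is not ASCII.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Replace every run of non-alphanumeric characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower())
    return slug.strip("-")
```

For example, the label “Contact Us” becomes `contact-us`, which is the kind of URL segment shown above.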

  • If you need to categorize large volumes of items, you can use the power of folksonomy, which is an academic term for what we commonly call crowdsourcing. Services such as Mturk from Amazon will allow you to categorize products quickly, and even create relationships between products using real people. However, you need to be careful about how you select participants and what instructions you give them.
  • When card sorting tests are in progress, it is more important to listen and observe than to put words in your users’ mouths.
  • When you remove/update categories from your website (at all levels), make sure that the URLs belonging to the updated categories redirect to the most appropriate working page.
  • When you develop or update the website, create a checklist of SEO requirements for the information architect (e.g., directory and file name conventions, canonicalization rules, lower casing all URLs, data quality rules for data input teams, seasonality and expired content handling, parameter handling, and so on). I will not provide an extensive checklist here, because people tend to limit themselves to the pointers in the list while missing others. After reading this book, you should be able to come up with your own list.
  • Send email alerts to the search engine optimizer when someone removes or updates categories, subcategories or products so that he or she can check the header responses for the new and old URLs. This task can be easily automated.
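The header check mentioned in the last tip can be sketched as follows. To keep the example self-contained, the HTTP request is abstracted behind a `fetch` function that returns the status code and the Location header. Everything here is an assumption for illustration: the names, the expectation of a 301 (permanent) redirect, and the sample URLs. In a real setup, `fetch` would issue a request without following redirects.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A fetcher takes a URL and returns (status_code, location_header_or_None).
Fetcher = Callable[[str], Tuple[int, Optional[str]]]


def check_redirects(expected: Dict[str, str], fetch: Fetcher) -> List[str]:
    """Return a list of problems found for old URL -> new URL pairs."""
    problems: List[str] = []
    for old_url, new_url in expected.items():
        status, location = fetch(old_url)
        if status != 301:
            problems.append(f"{old_url}: expected 301, got {status}")
        elif location != new_url:
            problems.append(f"{old_url}: redirects to {location}, expected {new_url}")
        else:
            # The redirect is in place; make sure the target is a working page.
            target_status, _ = fetch(new_url)
            if target_status != 200:
                problems.append(f"{new_url}: expected 200, got {target_status}")
    return problems


# Fake responses standing in for real HTTP requests.
FAKE_RESPONSES = {
    "/old-toys/": (301, "/toys/"),
    "/toys/": (200, None),
    "/old-gifts/": (404, None),
    "/old-games/": (301, "/games/"),
    "/games/": (404, None),
}


def fake_fetch(url: str) -> Tuple[int, Optional[str]]:
    return FAKE_RESPONSES.get(url, (404, None))


report = check_redirects(
    {"/old-toys/": "/toys/", "/old-gifts/": "/gifts/", "/old-games/": "/games/"},
    fake_fetch,
)
```

The resulting `report` is what you would put in the email alert: one line per URL that does not behave as expected.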

Technical architecture

At the beginning of this section, I mentioned that site architecture (SA) is made of information architecture (IA) and technical architecture (TA). We then looked at several information architecture topics. Now it is time to discuss technical architecture.

While duplicate content and crawlability issues are well-known SEO headaches, many search engine optimizers categorize them under the information architecture umbrella. However, they are in fact technical issues. Most of the SEO tips you will learn during the next chapters address technical architecture issues.

CHAPTER 3

Keyword Research

Length: 10,599 words

Estimated reading time: 1 hour, 10 minutes

In this section, I will explain the keyword research process, which is a core concept of ecommerce SEO. Keywords are important for users and are still crucial for search engines. I will explain how to perform the research with users in mind, and I will show you how to tie the buying funnel and the user intent to this research. I will also touch on developing personas that can be later used to create a content marketing strategy by mapping user intent to content. Towards the end of the section, I will share a few less known keyword strategies specific to ecommerce.

While SEO is the abbreviation for “search engine optimization”, SEO experts do not improve how search engines work; they optimize for search engines. And because the primary purpose of search engines is to be helpful to the people who use them, SEO would be better thought of as optimizing your website for search engine users. Search engine users are also referred to as searchers.

The search trifecta includes three entities:

  • The user
  • The search engine
  • The website

A common search experience looks like this: The user performs a search on the search engine, which leads searchers to the website, which should (in an ideal state) fulfill the user’s query.

Figure 54 – The search trifecta.

When performing keyword research, we often skip the user and jump straight to the search engine. This section describes what I believe is a better approach to keyword research: start with the user, continue with the website, and finally, consider the search engine.

Throughout this guide, I will refer to keywords and queries interchangeably, but there is a subtle difference between them.[1] A search query is a series of words users type into a search engine. A keyword is an abstract concept within a search query.

Figure 55 – Keywords are abstract concepts.

For example, on ecommerce websites, keywords are represented by department, category or subcategory names. A search term that contains several words, including the keyword, is a search query.

Good information architecture and keyword research are at the foundation of great ecommerce websites that perform the best in search engines and convert at high rates. In the Information Architecture section, we found that deciding on primary and secondary navigation labels (or category and subcategory labels) based solely on keyword research is not optimal; it should be complemented by user testing and research, or by using controlled and custom vocabularies. That is because the user intent is not always reflected in what he or she types in a search engine. That is also why it is difficult to estimate user intent purely by analyzing keywords or search queries.

Keyword or query research is a core concept for ecommerce websites because it is important to map keywords with the right type of content, for both users and search engines. Discussing search engines and keywords outside the context of users is not the correct SEO approach.

In terms of marketing, research means collecting all the raw data that you will later use to perform an analysis. In reference to keywords, research means collecting keyword data from different sources. Here are some data and metrics that you might collect:

  • The keywords or the search queries used by searchers on search engines.
  • Their associated search volumes.
  • The current rankings (keep in mind that rankings are difficult to measure accurately due to personalization and geo-location).
  • Competitiveness data such as the average Domain Authority of the top 10 ranking domains.

You will collect the above keyword data directly from search engines or by using third parties.

Gathering keywords

Collecting the initial set of keywords is straightforward, but the number of potential sources is overwhelming. You can use:

  • Google’s Keyword Planner.
  • Google’s Display Planner.
  • Google’s autosuggest feature (crank it up with Ubersuggest or Keyword Snatcher).
  • Google and Bing related searches.
  • Bing’s Keyword Research feature within Bing Webmaster.
  • Brainstorming sessions with various internal departments.
  • Your existing Google Ads campaigns.
  • Google Search Console data.
  • Social media sources (Twitter, Facebook, LinkedIn, etc.).

Other less known sources for collecting keywords are:

  • Internal site search data, using Google Analytics or other web analysis tools.
  • Voice-of-the-customer surveys and research.
  • User testing.
  • The anchor text of the natural links to your pages.
  • Competitor analysis.

Even though there is a plethora of keyword tools, the most extensive set of keywords and search queries and the most accurate search volumes can be extracted from pay-per-click (PPC) advertising platforms such as Google Ads.

I recommend collecting keyword data using an active Google Ads campaign, rather than using the data Google Ads provides without a live campaign. This is because when you run a live campaign, the Google Ads data goes beyond the keyword suggestions within the Keyword Planner. A live campaign can generate a handy list of long-tail keywords (use the Search Terms Report in Google Ads), and in my experience, that list is impossible to capture with any other tool.

Besides Text Ads, you should run Product Listing Ads in Google Ads (via the Merchant Center) and Dynamic Search Ads, and then use the Search Query Report to get a fantastic number of relevant keywords. Many of those keywords will be long-tail keywords.

Figure 56 – It is easier to rank for a search query that has more words in it (long-tail keyword) because the search query is usually less competitive.
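To pull long-tail candidates out of a Search Terms Report export, a simple word-count split is a reasonable first pass. The sketch below is only a proxy: the four-word threshold and the function name are my assumptions, and word count does not replace search volume or competitiveness data.

```python
from typing import Dict, List


def split_by_length(queries: List[str], long_tail_min_words: int = 4) -> Dict[str, List[str]]:
    """Split queries into head terms and long-tail candidates by word count."""
    groups: Dict[str, List[str]] = {"head": [], "long_tail": []}
    for query in queries:
        word_count = len(query.split())
        bucket = "long_tail" if word_count >= long_tail_min_words else "head"
        groups[bucket].append(query)
    return groups


groups = split_by_length([
    "commuter bikes",
    "what bike size do I need",
    "lightweight commuter bikes",
])
```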

Unfortunately, many people stop their keyword research after collecting just quantitative data, such as search volumes. I call this the traditional keyword research approach.

In the digital marketing world, the following is a typical scenario:

“We identified that these keywords have the highest search volumes, so we should target them. We will change page titles, we will go with a 3% keyword density, and we will build a bunch of backlinks to the pages targeting those keywords”.

Alternatively, if the marketing person or agency is more knowledgeable, the scenario may sound like this:

“These keywords have a decent amount of traffic and have good conversion rates, as per your analytics data. They are competitive, and that is why we should optimize the internal linking and build backlinks to the most appropriate SEO’d pages”.

Yes, search volume data research is necessary, but you need to go much deeper than this if you want to increase organic traffic. Search volumes are just the starting point. You must think of your users and their concerns, questions and FUDs (fears, uncertainties, and doubts). All of these affect their purchasing decision. Once you have identified those pain points, create content that attracts qualified organic traffic and generates sales.

Seasoned marketers call this concept Intent to Content.

Creating Personas

One of the best ways to map intent-to-content is by creating personas. A definition of persona is “a quasi-imaginary representation of your customer based on market research and real data about your existing customers. A persona includes demographics, behaviors, motivations, and goals”.

Ecommerce businesses, especially in the B2B space, need to go above and beyond and develop well-researched buyer personas to attract people in the early stages of the buying funnel. You will have to create and market content for every stage of the buying funnel.

Let’s say that you sell promotional products to businesses. Here’s what an oversimplified persona creation could be like:

Start by identifying the segments you need to market to and by giving them names, for example:

  • Vera, the Marketer.
  • Chris, the IT Geek.
  • Brad, the Economic Buyer.

If you sell promotional products, your focus will be on Vera, the Marketer.

Creating Vera’s buyer profile should be comprehensive. Everyone involved with marketing and sales should contribute to a Persona Questionnaire, based on their experience, knowledge and online research of Vera. Additionally, you can interview the existing customers who share Vera’s profile to find commonalities. This questionnaire should be developed by a joint marketing and sales team and should include questions such as:

  • Where does Vera read online?
  • Where does she go to ask for help?
  • What kind of wording does she use online?
  • What challenges does Vera face right now?
  • What are her goals?
  • What does her career path look like?
  • What motivates her to select a competitor?
  • How does she make decisions?

Additionally, you can collect and analyze public résumés to identify career paths for people involved in marketing decisions, like Vera. Here’s what the word cloud for marketing managers’ accountabilities may look like:

Figure 57 – Responsibilities for marketing managers.

Some of the most important facts you need to uncover about Vera are her pain points and how she makes decisions. You will reveal such data by engaging on websites where she goes to read, to educate herself or to ask for help.

Once you have identified Vera’s pain points and top challenges related to your vertical (in our example, promotional products), bucket them in different content types and rank them based on the most severe problems. Then, prepare content to address each pain point or challenge (e.g., case studies, how-to’s, extensive guides, etc.).

One type of content for the upper buying funnel can be targeted to raise awareness about a given challenge. For the mid-funnel, the content might be a guide on how to address the same challenge, with examples of how you helped other businesses deal with the problem. For those who are ready to make a vendor selection, the content can be a case study. However, none of these should be salesy, just excellent and useful information.

Simply put, you will identify Vera’s most important problems, educate her and then prove that you have the products she needs.

Note: usually, it is a bad idea to “gate” upper and mid-funnel content, for example, by asking for email and contact details to access the content.

The Intent-to-Content concept became more relevant and prominent after the Hummingbird algorithm update. The focus of this update was on processing conversational queries, which are longer, question-like queries including modifiers such as how to, where is or where can. The focus shifted away from the traditional word-parsing approach.

Another objective of Hummingbird was to match the user intent so that Google can provide answers rather than just search results. After the introduction of the so-called “position zero”, which is also known as the quick answer box, the intent-to-content concept became even more important.

To map keywords-to-intent and then intent-to-content, we need to discuss two concepts: the buying funnel, and the user intent.

The buying funnel

In the U.S., ecommerce conversion rates are still at about 3%,[2] because online retailers focus mostly on converting branded traffic and on marketing to consumers in the late stages of the buying funnel. Also, ecommerce websites usually concentrate their link-building campaigns on category and subcategory anchor text.

Keyword research and web analytics tools, PPC data and other similar sources provide insights on which keywords the target audience searches for, when they search, and where they search from. However, savvy online retailers must understand searcher context and create a content plan accordingly. They try to answer the why question. User testing is one great way to gauge the why, but it has limits.

Users do not just turn their computers on, navigate to your website and buy your products. First, they realize they have a need; next, they research online,[3] decide what’s right for them and—only then—purchase.

This journey is called the buying funnel.

Figure 58 – The stages of the buying funnel.

In the image above, you can notice how keywords in the awareness stage are generic and broad in nature, and gradually become more specific, until they finally become the exact product the searcher wants to purchase.

Although the keyword categorization in this example may seem straightforward and logical, in practice the keywords used by consumers will belong to multiple categories and will be found in various buying stages. That is why you do not need to be too particular about where to bucket a keyword.

Here are some great insights from one study that mapped 40,000 PPC keywords to the buying funnel:[4]

  • Targeting only keywords in the Purchase and Decision stages of the buying funnel for ecommerce websites can, theoretically, lead to 79% less organic traffic.
  • The buying funnel is representative of actual online consumer behavior, at least at the individual query level (6. Discussion and Implications, p. 11).
  • Advertisers can use the model to organize separate campaigns targeting various consumer touch points (6.3. Practical Implications, p. 14).
  • The implication is clear: do not ignore Awareness key phrases (6.3. Practical Implications, p. 14).

The researchers analyzed data from about seven million keywords from a large retail chain having a brick-and-mortar and online presence:

Figure 59 – The stages of the buying funnel[5].

As you can see in this screenshot from the study, the awareness and research keywords, which are mostly informational, made up almost 80% of the total keywords. The same research indicates high PPC advertising costs for the Awareness (25%) and Research (57%) stages. A staggering $4.6 million (57% of $8 million) could have been saved with a proper keyword-to-content strategy.
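The arithmetic behind these figures is easy to verify. The snippet below only reproduces the numbers quoted above ($8 million in total spend, 25% and 57% cost shares); the variable and function names are illustrative.

```python
# Figures quoted above: total PPC spend and the share of cost per funnel stage.
TOTAL_SPEND = 8_000_000                          # dollars
COST_SHARE = {"awareness": 25, "research": 57}   # percent of total spend


def stage_cost(stage: str) -> int:
    """Dollar cost of a stage, using integer arithmetic to avoid float rounding."""
    return TOTAL_SPEND * COST_SHARE[stage] // 100


research_cost = stage_cost("research")     # 4,560,000, i.e. about $4.6 million
awareness_cost = stage_cost("awareness")   # 2,000,000
```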

When you map keywords to buying stages and to user intent, and when you develop content accordingly, you will:

  • Generate content that attracts organic traffic.
  • Create content that can be linked-to more easily.
  • Support pages in the vertical silos up to top-level categories.
  • Reduce advertising costs.

The awareness stage

This is the first stage in the buying funnel. Your customers realize that they have a need or a problem, and they start researching general information about what would help them fill that need or fix that problem. They want to know what types of products or services are available on the market.

For an ecommerce website, the queries that can be associated with the awareness stage are the broadest, most generic terms, such as department, category or subcategory names (e.g., “commuter bikes”, “winter jackets”, “car racks”, “running shoes”, “cruise deals”, “diamonds”, and so on).

However, longer queries and natural language queries are also found at the awareness stage. For example, a searcher wants to know how to save on his daily commute. He starts by typing “best ways to save on commuting”, then reads an article about commuter bikes.

At this stage, consumers do not know yet what will address their needs and are still seeking information, so awareness queries usually contain neither brand nor specific full product names. They can include an action or a problem that needs to be solved—e.g., “removing wine stains”.

According to the same study cited earlier,[6] an awareness search query:

  • Does not contain a brand name.
  • Could contain the partial product name/type.
  • Could contain the problem to be solved.

At this stage, the user intent is mostly informational.

Tactics for this stage

The search queries associated with the awareness stage are what most ecommerce websites target, for example, head terms such as department, category, subcategory or sub-subcategory names. These search terms are usually super-competitive, and realistically, ranking for such keywords will not happen unless your website has a significant amount of authority and reputation in the industry (including backlinks). You will also need to have a lot of great content that clearly establishes your website theme, and that makes you the go-to resource for a subject matter or theme (e.g., wines).

Some tactics associated with attracting traffic for these keywords are:

  • Creating content such as community pages, how-to pieces, blog articles, educational content, and linking vertically from such content. You will need a significant amount of content and consistent linking to support category pages.
  • Siloing the website with directories and internal linking.
  • Building themed backlinks to the content-rich pages and articles.
  • Showcasing instructional videos by featuring them on your website and through social media.

The research stage

At this point, the consumer has identified the type of product or service that could help. The possible customer can now recognize brands in your industry or niche but has not yet decided on a definitive brand. The customer still needs to refine his or her knowledge before making a purchasing decision.

While the search queries are still broad, instead of generic searches the consumer now uses more specific terms, including keyword modifiers such as brand names or geo-locations.

The queries may look like “lightweight commuter bikes”, “insulated winter jackets”, “rear mounted car racks”, “cross training running shoes”, “European cruise deals”, or “4-carat wedding ring”. The queries can be subcategories, sub-subcategories or product attributes.

Long-tail queries can be found at this stage as well. In our biking example, the consumer may type: “what are the best brands for commuter bikes”, “which brand is more reliable”, “compare {brand1} with {brand2}”, “what bike size do I need”, “what is the cost of an electric bike”, “how much will it cost to maintain a bike”, and so on.

At this stage, the user intent is still mostly informational, but transactional intent may be there too.

Tactics for this stage

  • Write product reviews and product comparisons, as well as plenty of articles to answer your target market’s questions.
  • Write extensive user guides (e.g., How to Select an Electric Bike or How to Choose a Commuter Bike in 10 Easy Steps). This type of content is an organic traffic driver and has the potential to become a real link magnet as well. Then, promote this content socially and with influencers in your industry to generate buzz, and hopefully, backlinks. If you are a small or medium business that focuses on a line of products or niche, there is some encouraging news for you. Because the authoritative websites you compete with on Google have a lot of inventory and themes to create content for, you may have an advantage if you are focused.
  • Create buyer personas to identify a) where the target market goes to read information and b) what questions they have. Once identified, group them into topics, and write articles to answer the questions. If you are interested in learning more about personas, read this leaked document that talks about BestBuy’s buyer personas.[7]
  • Keyword-rich internal linking is also crucial at this stage. As a matter of fact, internal linking is essential at all stages of the buying funnel, so make sure you correctly cross-link from informational pages to category and product details pages.

The decision stage

Now, your prospective customer has an idea about which solution is good for him or her. The prospect will research the best store to buy from and will try to get the most value. Logic and emotion will favor a particular brand. The prospect is now much closer to making a purchase decision.

At this stage, the consumer has chosen a product and a brand, but not the exact model number or version of the product. In our bicycle example, the searcher knows he wants the Ridgeback brand, and he knows he needs a commuter bike.

The Decision stage is where comparison shopping occurs, so search queries often include brand names and technical specifications. At this point, his queries will be more focused than in the previous two stages and can consist of very strong commercial intent keyword modifiers like “sale”, “discount”, “coupon”, “buy” or “buy online”.

Going back to our example, the searchers’ keywords can be “ridgeback coupons”, “ridgeback commuter bikes deals”, “ridgeback free shipping”, “ridgeback bike size guide”, “ridgeback commuter bikes comparison”, and so on.

At this stage, the user intent is mostly transactional, with some commercial intent. Some navigational queries may occur when consumers check the manufacturers’ websites directly.

Tactics for this stage

  • Make sure that your website shows up for branded search queries such as “{brandname} reviews”.

Your website will claim the first positions if you have pages that target reviews-related keywords and if you build just a couple of good backlinks from external sites to those pages.

Having a dedicated template page for “{brandname} reviews” will allow you to publish all the reviews for any given product.

Figure 60 – This online retailer has a Reviews and News template for each brand.

Here’s how the website above ranks for “ridgeback reviews”. They got the #1 and #2 positions:

Figure 61 – If you are the owner of the brand, your site should easily come at the top even without backlinks.

Other tactics that you may consider are:

  • Distribute coupons to build links and brand awareness.
  • Write how-to content, user guides, and product comparison pages.
  • Optimize your brand pages and product descriptions to include reassurances, shipping estimates, refund policies and so on. Think in terms of optimizing your content for conversions rather than SEO.
  • Have a Promotions/Coupons/Reviews page targeting your own brand terms.

Figure 62 – SERP results for “Macy’s coupons”.

In the image above you can see the results for “Macy’s coupons”. This is a great keyword to rank for, and Macy’s is ranked #2 for its brand name plus “coupons”. By creating this Coupons page on their website, they are taking away traffic from coupon websites.

  • If you accept coupon codes at checkout, make no mistake; consumers will leave the checkout process to find your coupons. Instead of allowing users to leave the checkout to find current promotions outside your website, use a pop-up window or open a page in a new tab to list all of your current promotions and coupon codes.
  • Create interactive tools for finding, comparing or visualizing products (e.g., virtual eyewear, try before you buy tools, see the painting in your room, etc.).

The purchase stage

This is the stage at which consumers know either exactly what they want to buy or at least the brand they want to buy from. The queries contain specific product names and the exact model number or version of the product (e.g., Ridgeback Meteor ’14). The keywords are the most focused at this stage. These are probably easier keywords to classify because they often contain the product name or the brand name. For ecommerce websites, the landing pages most associated with these queries are the product detail pages.

At this stage, the user intent is mostly transactional, with some navigational intent as well (for example, typing “amazon” in a search engine to buy a book, or purchasing directly from the manufacturer’s website).

Tactics for this stage

  • Engage appropriate influencers for product reviews and to send qualified traffic to your website.
  • Develop backlinks to product detail pages.
  • Optimize product detail pages to include detailed product specs, persuasive descriptions, great images, questions and answers, and so on.
  • Offer coupons.

The purchasing stage is the last stage in our buying funnel model. Some marketers and sales professionals have gone into greater depth and broken it down into even more detailed steps. However, if you start breaking down the funnel into just four stages and you begin developing content based on these stages, you will see traffic and sales increase nicely over time.

Keep in mind that a purchasing decision is never going to be linear. You will have prospective customers who start their journey in the middle or at the end of the funnel. Regardless of where the journey begins, you will be able to capture consumers at any stage if your content is well planned.

Knowing about the buying funnel stages is important for understanding another keyword research concept, the user intent.

The relationship between these two concepts is pretty tight. Usually, a consumer who is in the Awareness stage is going to use informational search queries. When he is in the Purchasing stage, he will mainly use transactional intent keywords that have strong commercial intent.

The user intent

When users go to search engines and type queries, they are trying to accomplish something. This can be:

  • Finding a business, which can be located either online or offline.
  • Getting more information about a product or a service.
  • Purchasing an item, which also can happen either online or offline.

Searchers have a goal in mind, and that goal is called the user intent. Search engine users type in phrases that represent their intents, and Google tries to match those intents with the most relevant results. If you understand this concept, then you understand the importance of mapping keywords to intents and developing content accordingly.

Figure 63 – Three types of user intent keywords.

The specialty literature[8] breaks down the user intent into three categories:

  • Navigational, when searchers use a search engine to navigate to a specific website.
  • Informational, when searchers want to find content and info about a specific topic.
  • Transactional, when searchers want to engage in an activity, such as buying a product online, downloading or playing a game, seeing pictures, viewing a video, and so on. Transactional intent does not necessarily involve a purchase.

Google’s guidelines for quality raters, which are the human evaluators[9] who assist with quality control of the SERPs, refer to the same categories as Navigation, Informational, and Action.

When discussing user intent types, it is worth mentioning commercial intent. Commercial intent is rather an independent dimension that can apply to all three types of user intent, with transactional queries probably carrying a higher commercial intent than the other two. A Microsoft Research study found that 38% of the queries have commercial intent, and the rest are non-commercial.[10]

Figure 64 – Navigational and informational keywords can have commercial intent, too.

For example, when a consumer wants to buy a car, he will perform the research online, but he will seal the deal in a dealership. His queries, whether informational, transactional or navigational, will all have some commercial intent because his final goal is to purchase a car.

Mapping keywords to intent is not an easy task. Even search engines are not able to accurately classify user intent in general, let alone commercial intent. So, map keywords to user intent as best you can. As long as you start categorizing based on intent, you will begin generating ideas for content that matches the intent, and that is very relevant to users. This is the best SEO approach to stand the test of continuous algorithm updates.

Below are some guidelines for classifying user intent but remember that many keywords can be placed into multiple intent buckets.

Navigational intent queries, or the “go” queries

These are queries containing:

  • Companies, brands, organizations or people’s names.
  • Parts or full domain names.
  • The word “website” or “web site”.

Navigational queries are the easiest to spot during keyword research. For this type of query, make sure that you show up at the very top of the SERPs for your brand and your domain name search queries. If you are not showing up at the top for such queries, you might have a more significant problem than mapping user intent. You might have a site-wide penalty.

Figure 65 – Best Buy pays for its brand name to appear on Google Ads. That is because they deemed the branded keywords very important.

If you sell someone else’s brands, it is not a good idea to put effort into ranking for keywords made of brand names only, because this means competing directly with the brand owners and their social media profiles. Overtaking them in rankings is not possible (unless the brand sucks in terms of SEO), and even then, it is going to require significant effort.

If you own the brand or if you are a manufacturer that sells your own products, make sure that your website shows up for possible keywords that contain your brand name plus product names. For example, if you manufacture and sell computer RAM, your site should rank at the top for brand queries including the products you sell (e.g., “Kingston 1Gb RAM” and “Kingston 1 Gb RAM”).

Informational intent queries, or the “know” queries

These are queries containing:

  • Question words (e.g., ways to, how to, what is, etc.).
  • Informational terms (e.g., list, top, playlist, etc.).
  • Anti-commercial queries (e.g., DIY, do it yourself, plans, tutorial, guide, etc.).
  • Words like instructions, information, specs.
  • Words like help, resources, FAQ.
  • A category or subcategories (e.g., digital cameras, raincoats, etc.).

If you have difficulty classifying keywords based on intent, one trick is to find the navigational and transactional intent queries first, then assume that the rest are informational.

If you want to learn how Google teaches its search quality raters to classify the search queries, I recommend reading Google’s “Search Quality Rating Guidelines”, especially Section Two.

Informational intent is the type of intent ecommerce websites should start shifting their attention to because informational keywords provide the chance to get in front of the target market in the early stages of the buying funnel. The earlier your audience is exposed to your brand, the higher the chances of closing a sale.

The content that addresses informational queries encompasses all types of media such as text, video, audio, etc. It includes all kinds of content such as product descriptions, technical specs, expert reviews, infographics, instruct-o-graphics, blog posts, how-to guides, etc.

When creating content for this type of intent, your goal is not to sell your products, but rather to position yourself as the authoritative source in your space. You need to become a publisher of reliable and useful content. Informational queries are perfect for this because they represent an excellent opportunity to increase brand awareness and show expertise.

The fact that 80% of search queries are informational[11] represents a massive opportunity for those who plan for long-term gains. These queries can be very generic, for example, head terms such as category and subcategory names (e.g., “cars” or “insurance brokers”), but they can be long-tail keywords as well. For example, search queries like “what is the most fuel-efficient car on the market?” or “life insurance brokers in New Westminster, BC” are informational. Note that either of these two example queries could have a transactional intent as well.

To cover as many informational queries as possible, you will need to create educational content for consumers who are not yet ready to buy or for those who do not even know what they need to buy. Your goal is to provide searchers with content that answers their questions and fills their need for information. Also, your content must help nudge searchers further down the buying funnel.

Informational intent queries appear at the Awareness, Research, and Decision stages. You must gradually guide consumers towards content that is more transactional, which will eventually lead to conversions. After all, a macro-conversion (e.g., a web purchase) happens only at the end of several micro-conversions, such as reading an article about a problem, finding the right product, adding it to the shopping cart, clicking Proceed to Checkout, etc.

One way to check whether there is a disconnect between user intent and the content on your website is by looking at the ecommerce transactions (and conversion rates) for your keywords (if the keywords data is available):

Figure 66 – This Google Analytics report became almost useless after Google stripped the search query data from the referring URLs (now showing as “not provided”).

Using Google Analytics, you can look at which pages or keywords perform poorly. In this example, getting 3.3k organic visits from a single keyword and ending with just one conversion is a sign that something is wrong. It may be because the searchers land on an improper landing page, or maybe because the landing page is attracting the wrong keywords. It may also be related to conversion frictions such as your pricing being higher than competitors’.
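The check described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the rows, the field names, and both thresholds are hypothetical, and the data would come from whatever keyword-level report you still have access to.

```python
# Hypothetical keyword-level rows exported from a web analytics tool.
keyword_rows = [
    {"keyword": "example keyword a", "visits": 3300, "transactions": 1},
    {"keyword": "example keyword b", "visits": 800, "transactions": 24},
    {"keyword": "example keyword c", "visits": 40, "transactions": 0},
]


def flag_disconnects(rows, min_visits=500, max_conversion_rate=0.005):
    """Return keywords with plenty of visits but very few transactions.

    Both thresholds are illustrative assumptions, not benchmarks.
    """
    flagged = []
    for row in rows:
        if row["visits"] < min_visits:
            continue  # too little traffic to judge
        rate = row["transactions"] / row["visits"]
        if rate <= max_conversion_rate:
            flagged.append((row["keyword"], rate))
    return flagged


for keyword, rate in flag_disconnects(keyword_rows):
    print(f"{keyword}: conversion rate {rate:.3%}")
```

A flagged keyword is only a prompt to investigate the landing page, the query, or the pricing; it does not say which of the three is at fault.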

Another way of finding this disconnect is by analyzing the keywords’ bounce rate (again, whenever you can get this kind of information):

Figure 67 – Whenever you can identify a keyword that drove traffic to the website, look at its bounce rate.

A high SERP bounce rate is usually a bad thing because it shows that searchers landed on your website and did not find what they expected. However, blog pages typically have a high bounce rate since visitors might find the answer they are looking for in the article itself, and then leave.

As you start looking at keywords through the user intent prism, as opposed to just words and numbers, and as you try to solve high bounce rates, low conversion rates, and low transaction numbers, you will start learning more about your visitors. This will help not only with organic traffic but with everything marketing and sales.

When analyzing the performance of the informational queries, keep in mind that such keywords will most likely not convert at the first visit.

The transactional intent or the “do” queries

These are queries containing:

  • Calls to action (subscribe, purchase, pay, play, send, download, buy, listen, view, watch, find, get, compare, shop, search, sell, etc.).
  • Entertainment terms (pictures, movies, games, and so on).
  • Promotional terms (coupons, deal, discounts, for sale, quotes).
  • Complete product names.
  • Comparison terms (where to buy, prices, pricing, compare prices for).
  • Terms related to shipping (next day shipping, same day shipping, and free shipping).

However, not all transactional queries contain verbs. For example, the search query “Dell Vostro 1700” can be both transactional and informational, because the user either wants to read more about this product or wants to buy it. Also, transactional queries do not necessarily have to involve money or purchases. They reflect only the desire to perform some action on the Internet.
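The guidelines for the three intent types can be turned into a rough first-pass classifier. The sketch below is an assumption-laden illustration, not how any search engine classifies queries: the term lists are a subset of the modifiers listed in this section, the exact-brand-match rule and the domain pattern are simplifications of mine, and the fallback follows the trick of treating everything that is neither navigational nor transactional as informational.

```python
import re

# Subsets of the modifiers listed in this section; extend for your niche.
TRANSACTIONAL_TERMS = [
    "buy", "purchase", "subscribe", "download", "compare", "shop",
    "coupon", "coupons", "deal", "deals", "discount", "discounts",
    "for sale", "where to buy", "prices", "pricing", "free shipping",
    "next day shipping", "same day shipping",
]
INFORMATIONAL_TERMS = [
    "how to", "what is", "ways to", "list", "top", "diy", "tutorial",
    "guide", "instructions", "information", "specs", "help", "faq",
]
NAVIGATIONAL_TERMS = ["website", "web site"]
# Simplified domain detector (assumption): only a few common TLDs.
DOMAIN_PATTERN = re.compile(r"\w\.(com|net|org)\b")


def contains_term(query, terms):
    """True if any term appears in the query as a whole word or phrase."""
    return any(
        re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", query)
        for term in terms
    )


def classify_intent(query, brand_names=()):
    """Return the set of intent buckets a query may belong to."""
    q = query.lower().strip()
    brands = {b.lower() for b in brand_names}
    intents = set()
    # Assumption: a query that is exactly a brand name is navigational.
    if q in brands or DOMAIN_PATTERN.search(q) or contains_term(q, NAVIGATIONAL_TERMS):
        intents.add("navigational")
    if contains_term(q, TRANSACTIONAL_TERMS):
        intents.add("transactional")
    if contains_term(q, INFORMATIONAL_TERMS):
        intents.add("informational")
    if not intents:
        # Fallback: neither navigational nor transactional.
        intents.add("informational")
    return intents


for example in ["ridgeback coupons", "how to choose a mattress", "ridgeback"]:
    print(example, "->", sorted(classify_intent(example, brand_names=["Ridgeback"])))
```

Because the function returns a set, a query can land in several buckets at once. Product-name queries such as “Dell Vostro 1700” fall through to the informational fallback here; recognizing them as transactional would require matching against your own product catalog, which this sketch does not attempt.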

Transactional queries with commercial intent occur more frequently in the decision and purchasing stages. Such keywords should land visitors on category and product detail pages, or on landing pages that have been built to funnel visitors to a page where a commercial transaction occurs (e.g., a product comparison tool or a finder tool).

Transactional queries are most likely to generate the highest return on investment (ROI) for pay-per-click campaigns, and that is why their cost-per-click can be high. However, the ROI would be even better if you had previously “touched” the pay-per-click searcher with an organic result. Your brand might be recognized upon landing on your website from the PPC ad, and that might have a positive effect on conversions.

A possible way to connect user intent with search queries is using user surveys sourced from your organic traffic. You can implement a modal window or a pop-up that tracks the search query used by visitors. The downside is that search engines no longer pass search query data in the URL string, so you will get only a fraction of the queries.

However, when you can identify the keyword, trigger the modal window and ask a simple question, such as “What is the goal of your visit to our website today?”. Provide two possible options:

  1. I am shopping for something to buy now or in the near future.
  2. I am looking for more information about some products/services.

Mapping intent to content

Your content strategy should be created by understanding where in the buying funnel the user is when he types in a search query, by mapping his user intent, and by bucketing the searcher into the right persona he belongs to.

But why is user intent so important? It is because there is an algorithm that matches user intent with search queries: Hummingbird.

One of the metrics used by search engines to measure the match between the user intent, the search query, and the perfect result is SERP user engagement.

However, the ultimate metric that Google uses to quantify if the content on a page matches the user intent behind a search query is called The Long Click.

The following is a quote from the book “In the Plex: How Google Thinks, Works, and Shapes Our Lives” and describes the long click:

“On the most basic level, Google could see how satisfied users were. To paraphrase Tolstoy, happy users were all the same. The best sign of their happiness was the “Long Click” — This occurred when someone went to a search result, ideally the top one, and did not return. That meant Google has successfully fulfilled the query”.

Let’s start the keyword mapping process.

“Tired Jamie” is a persona that has been developed for the scenario in which a buyer wants to purchase a mattress.

Scenario: Jamie cannot sleep at night, and she wants to improve her sleep. She finds that old mattresses can cause poor sleep, and she decides it is time to buy a new one. She starts looking for information on how to choose a mattress that can provide the best night’s sleep. She discovers a useful mattress finder tool that recommends foam mattresses, based on her inputs. Next, she starts researching which brands are selling foam mattresses and which of their products have the best reviews. She finds Tempur-Pedic®, which seems to be a brand trusted by many people, so she investigates their various types of mattresses. Finally, she knows what she wants, and now she is actively looking for that product.

First, do your best to map her keyword journey by sorting keywords top to bottom based on the buying funnel.

Figure 68 – This is the beginning of the keyword mapping process.

In this example, Jamie starts with a broad search, “tossing all night”, then she refines it to “how to improve my sleep”. After finding out that mattresses can be the cause of her poor sleep, she refines her search to “how to choose a mattress”. Once she has found the type of mattress that seems to solve her problem, she starts investigating “foam mattresses brands”.

Once she has found a brand that she trusts, she will look for its products by searching for “tempur pedic mattresses”; this search query contains the brand and the category of products. Finally, she looks for the specific product she intends to buy, “tempur pedic cloud luxe breeze”.

In the second column, tag each keyword by the most appropriate theme or silo it belongs to. For example, all keywords containing the word “mattress” will be bucketed in the mattresses silo. “Tossing all night” and “how to improve my sleep” do not belong to a specific category of products, so you can assign them to a generic “resources” silo.

Next, map the keywords to the user intent, while remembering that a query can have multiple intents. In our example, the first four keywords are informational, and the last two are transactional.

Then, add the type of content that fits the intent and the search query (e.g., for the search query “how to choose a mattress” you can build a mattress finder tool).

Now, you need to add more details.

Figure 69 – We will keep adding more data to this table.

The URL represents the page you want to rank in the SERPs. This is the page that you deem to be the most appropriate to rank with. In our example, for the keyword “tempur pedic cloud luxe breeze” you will want to rank with this product page:

/tempur-pedic-cloud-luxe-breeze.html

In the Anchor Text(s) column you will list the internal anchor text used to link to the targeted URLs (this can be used as anchor text for your backlinks as well).
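If you prefer to keep the mapping in code rather than in a spreadsheet, the table can be represented as a list of records. The sketch below fills in only the values stated in this section; the class name, the field names, the silo assigned to the product keyword, and the "product detail page" label are my assumptions, and None marks cells the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeywordMapping:
    keyword: str
    silo: str
    intent: str
    content_type: Optional[str] = None  # None = not specified in the text
    url: Optional[str] = None           # page you want to rank


# Jamie's journey, sorted top to bottom by buying funnel stage.
journey = [
    KeywordMapping("tossing all night", "resources", "informational"),
    KeywordMapping("how to improve my sleep", "resources", "informational"),
    KeywordMapping("how to choose a mattress", "mattresses", "informational",
                   content_type="mattress finder tool"),
    KeywordMapping("foam mattresses brands", "mattresses", "informational"),
    KeywordMapping("tempur pedic mattresses", "mattresses", "transactional"),
    # Silo and content type for this row are assumptions.
    KeywordMapping("tempur pedic cloud luxe breeze", "mattresses", "transactional",
                   content_type="product detail page",
                   url="/tempur-pedic-cloud-luxe-breeze.html"),
]


def missing_targets(rows):
    """Keywords that do not yet have a target URL assigned."""
    return [row.keyword for row in rows if row.url is None]


print(missing_targets(journey))
```

A helper like missing_targets makes it easy to see which keywords still lack a page to rank with.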

Note: failing to establish contact and presence with consumers who perform informational queries is one cause of single-digit conversion rates. Too often, ecommerce websites try to sell too early.

If you develop content for all the buying stages and intents, you can land the prospective customers on your website, at the research stage. Then, gradually nudge them to the purchasing stage, without having them exit your website to find answers from competitors. If one of your competitors becomes the trusted source of advice for that potential customer, you have lost the sale.

So, make sure that you optimize the right pages for the right queries. If the query is informational, you want to rank a page that provides informational or educational content. Likewise, if the query is transactional, you need to optimize and rank with pages that have transactional intent.

Prioritization

Keyword prioritization is difficult because:

  • You need to consider many different metrics.
  • The SERP ranking factors are not publicly available.
  • Several metrics, such as competitiveness, come from third-party sources (not directly from the search engines).

Therefore, any keyword evaluation model based on ranking factors and competitiveness metrics is subjective. Prioritization methodologies are usually based on factors such as keyword difficulty, search volumes, business goals, margins and profits, conversion rates, or a combination of these.

One lesser-known method for keyword prioritization is based on the revenue opportunity of forecasted rankings. This evaluation model determines a monetary value for the top 10 rankings, using the average SERP CTRs.

Note that this model is meant only as a tool to help you identify the lowest-hanging opportunities.

Figure 70 – We’re adding search and business metrics to the process.

In this table, you can get the Search Volume data with Keyword Planner, the Current Ranking data with the rank tracking tool of your choice, and the Organic Visits data using your web analytics tool. The Revenue data is also collected from your web analytics tool. You will generate the Per Visit Value by dividing Revenue by Organic Visits.

Note that I excluded metrics such as the conversion rate or the number of conversions, on purpose. That is because ecommerce websites have multiple micro-conversions and macro-conversions (e.g., a web sale, a newsletter subscription, reaching a critical page, submitting a form, etc.), and this evaluation method is based solely on revenue, not conversion rates.

If you want to dig into details and evaluate based on each type of conversion (for example, prioritize keywords that generate more email subscriptions), then use only the newsletter revenue data for each keyword and prioritize accordingly.

The columns named “Rev. if ranked 1…10” represent the revenue opportunity for various positions if you were to rank organically at those positions. Looking at the “Rev. if ranked #1” column, you can see that although the keyword “tempur pedic cloud luxe breeze” is a transactional keyword and has the highest Per Visit Value ($25), it is not the keyword with the highest potential to increase revenue. That keyword would be “how to choose a mattress”.
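One plausible way to compute these columns is search volume × CTR at the target position × Per Visit Value. The sketch below assumes that reading of the model. The CTR values are placeholders I made up for illustration (they are not the Optify figures), and the two sets of inputs are hypothetical apart from the $25 Per Visit Value mentioned above.

```python
# Placeholder click-through rates by organic position (assumed values,
# NOT the Optify figures); replace them with the CTR study you trust.
ASSUMED_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05,
               6: 0.04, 7: 0.03, 8: 0.03, 9: 0.02, 10: 0.02}


def per_visit_value(revenue, organic_visits):
    """Revenue divided by organic visits; 0 when there are no visits."""
    return revenue / organic_visits if organic_visits else 0.0


def revenue_if_ranked(search_volume, visit_value, position, ctr=ASSUMED_CTR):
    """Forecast monthly revenue at a given position (top 10 only)."""
    return search_volume * ctr[position] * visit_value


# Hypothetical inputs: a product keyword with a high per-visit value and
# low volume versus an informational keyword with the opposite profile.
product = revenue_if_ranked(1_000, per_visit_value(2_500, 100), position=1)
guide = revenue_if_ranked(40_000, per_visit_value(1_200, 600), position=1)
print(f"product keyword: ${product:,.0f}  informational keyword: ${guide:,.0f}")
```

With these made-up inputs, the high-volume informational keyword forecasts more revenue at position 1 than the product keyword, even though its value per visit is far lower, which is the same pattern the table illustrates.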

Figure 71 – SERP CTRs from Optify.

For this forecasting method, I used the organic SERP CTRs based on research done by Optify.[12]

The next step is adding keyword competitiveness data. There are a few different methods for assessing the competitiveness of a keyword. The easier ones are:

  • The average Domain Authority (DA) or Page Authority (PA) of the top 10 ranking pages and root domains. Note that the table below includes the average PageRank, but this data is not publicly available anymore.
  • The keyword difficulty score computed by Moz or the CI index from serpIQ (now defunct).

Figure 72 – We’re adding competitiveness data.

Now that you have some quantitative information, you can slice and dice the keyword data any way you like. I suggest analyzing data in sets or themes only. If you mix keywords related to mattresses with keywords about dressers, your data will be skewed.

Also, it is important to find a balance between the forecasted revenue and the costs associated with obtaining the necessary rankings to achieve that revenue. Keep in mind that you will need to produce content, promote it through various marketing channels and build backlinks to it. All these actions have a cost.

Figure 73 – We’re adding costs related to producing and promoting content.

Content Creation Cost is an estimate of how much it will cost to create the content necessary to promote the keyword. Each keyword might have a different cost depending on the type of content you need to create; creating an article is less expensive than creating a video, which in turn is less costly than creating an interactive tool or a mobile app. The Cost per Link is an estimate of how much it will cost to build one link to that piece of content.

This is the Costs formula:

Costs = content creation cost + (cost per link * (average DA / 10) * 2)

The number 2 in this formula is a cost coefficient tied to your domain authority. The lower the domain authority, the higher the coefficient. You can use the following brackets as guidelines for adjusting the coefficient, based on your DA:

  • DA 0–20, coefficient=5
  • DA 21–40, coefficient=4
  • DA 41–60, coefficient=3
  • DA 61–80, coefficient=2
  • DA 81–100, coefficient=1

The Costs formula indicates that the lower your DA, the more links you need to build to achieve first-page rankings. For example, for the keyword “tempur pedic mattresses”, if your website DA is 65, the coefficient is 2, and the Costs formula is:

Costs = $250 + ($200 * (45 / 10) * 2) = $250 + ($200 * 9) = $2,050, where 45 is the average DA used in the formula and 9 means you will need to build nine good-quality links.
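The Costs formula and the coefficient brackets translate directly into code. The following is a minimal sketch; the function and parameter names are my own, and the boundaries follow the brackets listed above.

```python
def cost_coefficient(site_da):
    """Map your own site's Domain Authority to the cost coefficient."""
    if not 0 <= site_da <= 100:
        raise ValueError("DA must be between 0 and 100")
    if site_da <= 20:
        return 5
    if site_da <= 40:
        return 4
    if site_da <= 60:
        return 3
    if site_da <= 80:
        return 2
    return 1


def links_needed(average_da, site_da):
    """Estimated number of good-quality links: (average DA / 10) * coefficient."""
    return (average_da / 10) * cost_coefficient(site_da)


def keyword_cost(content_cost, cost_per_link, average_da, site_da):
    """Costs = content creation cost + cost per link * links needed."""
    return content_cost + cost_per_link * links_needed(average_da, site_da)


# Inputs from the "tempur pedic mattresses" example: $250 content cost,
# $200 per link, average DA of 45, and a site DA of 65.
print(links_needed(45, 65), keyword_cost(250, 200, 45, 65))
```

Keeping the coefficient in its own function makes it easy to rerun the whole keyword list if your site's DA moves into a different bracket.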

You can download the Excel file containing the example formulas here.

Going back to user intent, remember that the goal of search engines is to provide straight answers for search queries and to provide the best possible results for keyword searches. If search engines fail at this, they will lose users, market share, and advertising revenue. It is, therefore, crucial for search engines to identify user intent as best as they can. Remember, Google changed its algorithm to focus on this, with the release of Hummingbird. Microsoft used to have a publicly available commercial intent detector tool, but unfortunately, it has been discontinued. So, you will have to classify the results manually.

Whenever you are in doubt about how search engines map keywords to intent, use Google’s help to assess what type of pages it returns for a specific keyword. First, log out of all your Google accounts. Then, clean out the browser cookies, open an incognito session and type in the keyword you want to research.

For example, let’s look at the results for the keyword “digital camera” (see Figure 74). For this keyword, seven out of the eleven listings are informational or educational resources (non-commercial intent such as reviews, news, images, tips, wiki, etc.) and four are online retailers (commercial intent). Keep in mind that I counted image results as just one single result. If you sell digital cameras and want to rank for this keyword, you have to create great educational resources on your website and promote them heavily, on both your site and external websites.

Given the number of informational results for “digital camera”, it seems like Google does not assign a strong commercial intent to this keyword. Then why do ecommerce websites try to rank their Digital Cameras category URL rather than a Digital Camera page dedicated to educational content and tools? Wouldn’t a category page create a disconnect between the user intent and the content on that page?

Figure 74 – SERP for “digital camera”.

Imagine you walk into a store to get information about which digital camera best suits your needs, only to encounter a pushy salesperson who tries to sell you items he wants you to buy, rather than what you think you need. You will probably thank him nicely and leave without buying. The same applies to online experiences; if searchers land on a page that does not fit their intent, they will bounce.

Creating content based on keyword research must address not only the possible buyers of your products but also those who will link to your content. That is because most of the people who buy from you will not link to a product or category page. It is possible that customers will share the purchase socially, but back-linking is going to happen only from people who believe the content they link to is valuable. Buyers think in terms of the value they get by buying the product from you, but those who link to you think in terms of the value they offer to their audience.

Product attributes and keyword variations

Online retailers often find themselves selling products with a multitude of similar attributes, and therefore would like to rank for a large number of keywords and product variations. For example, you sell a sweater that comes in red, blue, and green, and in three different sizes (extra-large, large, and small). This matrix will generate nine variations: small red sweater, large red sweater, extra-large red sweater; small blue sweater, large blue sweater, extra-large blue sweater; small green sweater, large green sweater, extra-large green sweater.
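To see how quickly attribute combinations multiply, the matrix can be generated with a Cartesian product. This is a small illustrative sketch using the colors and sizes from the sweater example; the variable names are arbitrary.

```python
from itertools import product

colors = ["red", "blue", "green"]
sizes = ["small", "large", "extra-large"]

# Every color paired with every size: 3 x 3 = 9 keyword variations.
variations = [f"{size} {color} sweater" for color, size in product(colors, sizes)]

print(len(variations))
for variation in variations:
    print(variation)
```

Adding a third attribute with, say, four values would multiply the list again to 36 variations, which is why writing unique content for each one rarely scales.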

In the section dedicated to product detail pages, we will discuss how you should approach product variations, but for now, let’s say that creating unique product descriptions for each product variation may not be the best idea unless you have a large budget for content creation. Instead, you should handle product variations in the interface, without reloading the entire content. To achieve this, you can use dropdowns to allow users to pick a color and AJAX to load the content specific to that color variation.

If you already have unique URLs for product variations, then choose a canonical URL and point all product variation URLs to it (be careful what you choose as the canonical version). However, if the URLs are clean (i.e., they do not have too many URL parameters), do not change the URL structure without consulting an SEO expert.

Figure 75 – A simple decision chart inspired by Moz.[13]

Keyword strategies

In this section, I will discuss a couple of less talked about keyword strategies for ecommerce websites.

Target low-hanging fruits

When you run an ecommerce website, the number of keywords you want to rank for is enormous, so it is not economically feasible to target all of them with link-building campaigns. You can rank organically for many long-tail keywords just by supporting them with content. Other keywords (usually the more competitive terms such as category and subcategory names) will only rank if you support them with content-rich sections of the website and with links from external websites.

An often-overlooked keyword strategy is to focus on keywords ranked on page 2, especially those ranking between positions 11 and 15. Moving a keyword from the second page to the first page usually takes less effort than moving the same keyword four spots up, from 5 to 1. In the same way, moving from position 21 to 17 (also four positions up) is not going to generate a substantial increase in visits.

Let’s illustrate this concept with a keyword that has 1.2 million searches a month, “wedding dresses”. Moving this keyword from position 11, where it gets less than 2.6% of the clicks (about 31,200 visits), to position 6, where it gets 4.1% of the clicks (about 49,200 visits), represents a traffic increase of roughly 58% (traffic grows to about 158% of its previous level). Moving the same keyword from position 21 to 16 will generate a minimal rise in visits.

Figure 76 – Keywords ranking at the top of page 2 can be a good target for link development

The idea is that by building links to keywords ranking on the second page, you gradually increase your website’s authority, and at the same time, generate more organic traffic. In time, these links will support the link-building campaign for more competitive terms.

Of course, you should not focus solely on keywords ranking on the second page. A thorough analysis will identify keywords that rank on the first page and do not have much competition—it makes sense to target those as well.
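The page-two idea above can be sketched as a filter over your rank-tracking data plus a simple traffic estimate. The search volume and the two click shares come from the “wedding dresses” example; the other keywords, their positions, and the function names are hypothetical.

```python
def estimated_visits(search_volume, ctr):
    """Monthly visits = monthly searches * click-through rate."""
    return search_volume * ctr


def page_two_targets(rankings, first=11, last=15):
    """Keep keywords whose current position sits at the top of page 2."""
    return [kw for kw, pos in rankings.items() if first <= pos <= last]


# "wedding dresses": 1.2 million searches, 2.6% of clicks at
# position 11 and 4.1% at position 6.
before = estimated_visits(1_200_000, 0.026)
after = estimated_visits(1_200_000, 0.041)
uplift = (after - before) / before

# Hypothetical rank-tracker export: keyword -> current position.
rankings = {"wedding dresses": 11, "bridesmaid dresses": 4, "wedding veils": 23}

print(page_two_targets(rankings))
print(round(before), round(after), f"{uplift:.0%}")
```

Running the same estimate for every keyword returned by page_two_targets gives a quick shortlist of terms where a handful of links could move the needle.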

Target holidays and retail days search queries
Holidays such as Christmas, Hanukkah, Thanksgiving, and Easter, and major retail events such as Back to School, Halloween, Cyber Monday, Black Friday, and Boxing Day represent significant traffic and revenue opportunities for all ecommerce websites, and for online retailers in particular. Shoppers are more open to spending during holidays. However, their search patterns change around these special shopping days.

The ecommerce shopping days calendar created by Shopify[14] shows that there is not a single month without a major shopping event. Promotions change very rapidly, and shoppers will shift their search queries very quickly as well. Smart ecommerce websites have to adapt to and capitalize on such shifts. However, many websites do not have the agile SEO abilities to take advantage of shopping day opportunities.

Here are some common mistakes online retailers make when targeting shopping events:

  • Not updating page titles, descriptions and headings to include event-related modifiers at all.
  • Not updating page titles, descriptions and headings to include event-related modifiers until just a few days before that event. This is too late from a business point of view and for SEO. Google Insights research[15] suggests that Black Friday searches can come as early as July.

Figure 77 – 30% of shoppers plan their Christmas shopping list before Halloween.

  • Creating year-specific pages (e.g., Christmas 2018) and removing them without proper redirects once the holiday or event has ended.
  • Not planning a “flash” link building campaign to target event-specific modifiers. In terms of link building, a flash campaign means starting two to three months in advance.
  • Not targeting last-minute buyers by adding “free” or “next-day shipping” in the page titles.

Figure 78 – 55% of consumers expect free shipping.[16]

Additionally, very few ecommerce websites create content (e.g., guides, ideas, how-to’s) specifically targeting such retail dates. That is a shame, since this type of content can capture potential customers during their research stage, when they use informational search queries like “Halloween costumes ideas”, “Christmas gift guides” or “Easter egg decorating pictures”.

Use keyword modifiers to update titles, descriptions, and headings
The way consumers search online before and during shopping events is different from how they do so the rest of the year. They add keyword modifiers to their usual search queries to better define their intent. Event modifiers are words like “Christmas”, “Cyber Monday” or “Boxing Day”, but also “same day shipping”, “next day shipping” and even “gifts”.

Look at the spikes in search volume associated with the “next day shipping” keyword modifier. The peaks reach the maximum a few days before Christmas. You should be fast enough to capitalize on this search pattern change.

Figure 79 – Shipping-related queries increase significantly around Christmas.

Figure 80 – Adding “same-day shipping” or “next-day shipping” to your titles by December 15 may prove wise.

Let’s say you want to capitalize on searches that contain the keyword “Christmas”. Add the word “Christmas” to the title of the category or product detail pages immediately after Cyber Monday is over. You can also consider altering meta descriptions and page content. Be sure to check the rankings associated with these pages a couple of days after you make the updates (and regularly after that), to see whether there is a traffic drop. You can expect some fluctuations, but as you get closer to Christmas, you should see an increase in rankings and traffic.

If there is a drop, revert to the usual titles. If there is an increase, change all category, subcategory and product detail page titles. As you get closer to Christmas (e.g., December 15), change the title to “Free same-day Christmas shipping” since “free shipping” tops the list of the strongest incentives for visitors to buy goods online.

Get any page crawled and indexed by Google in less than one minute
Use the Fetch as Google feature in Google Search Console to achieve this. In the new GSC, use the URL Inspection tool: run Test Live URL, then click Request Indexing.

Figure 81 – Once you hit the Fetch button, Googlebot will crawl the submitted URL. If the page passes Google’s filters, it will be indexed in minutes.

Figure 82 – The number of fetch requests in Google Search Console is limited, so use your quota wisely.

Once Christmas is over, change the titles to target the next Holiday, e.g., Boxing Day. If there is a gap of more than three to four weeks between shopping events you can default to the usual titles.

You may want to use an automated system that allows event-specific titles, descriptions, and headings to be updated on specific dates. If that is not possible, then at least set up calendar reminders a month before the less important shopping events, and two months before, for the most important ones. You can refer to this article[17] for consumer trend data and the importance of each consumer holiday.
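
A minimal sketch of such an automated system is shown below, assuming a simple date-range schedule. The SCHEDULE entries, the title format, and the title_for function are assumptions made for illustration; a real implementation would live in your ecommerce platform's templating layer.

```python
from datetime import date

# Hypothetical schedule: (start, end, modifier). Dates and wording are
# assumptions for illustration only.
SCHEDULE = [
    (date(2018, 11, 27), date(2018, 12, 14), "Christmas"),
    (date(2018, 12, 15), date(2018, 12, 25), "Free Same-Day Christmas Shipping"),
    (date(2018, 12, 26), date(2018, 12, 31), "Boxing Day"),
]


def title_for(base_title: str, today: date) -> str:
    """Append the active event modifier, or fall back to the usual title."""
    for start, end, modifier in SCHEDULE:
        if start <= today <= end:
            return f"{base_title} | {modifier}"
    return base_title


print(title_for("Digital Cameras", date(2018, 12, 16)))
print(title_for("Digital Cameras", date(2018, 7, 1)))
```

Keeping the schedule as data rather than code makes it easy to review next year's dates without touching the page templates.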

Create holiday-specific landing pages
Marketers create holiday or promotion-specific landing pages to drive targeted traffic with PPC, email, and catalogs. During the year, they will create pages for “Christmas 2018 Promotion”, “Father’s Day Specials” or “Valentine’s Day Two-for-One Deals”. You have probably noticed this implemented by big brands or small, but smart, competitors.
By creating these landing pages, which are visually themed in accordance with the event they target, marketers make ecommerce websites more attractive to visitors. These pages can get natural links from deals or coupon websites if your brand is recognizable or if you push the pages with an outreach campaign.

Usually, when ecommerce websites use specific event or holiday landing pages, such pages will have their own URLs (e.g., mysite.com/Boxing-Day-Sale). However, improper redirect handling, such as no 301 redirects or 301 redirects to the wrong pages, may lead to PageRank loss once the event is over. This typically happens when the pages are simply removed from the website after the event.

Here are a couple of tips if you use separate URLs for holiday or shopping events:

  • Do not include years or any other time or date indicators in URLs. It is okay to include time indicators in page titles, descriptions, headings and main content.
  • When the event is over, redirect the event pages to the most appropriate sections of the website, or keep the URLs alive (but with changed content).
  • In the following years, you can “revive” the promotion-specific URLs a couple of weeks before each event.
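
The tips above can be sketched as two small checks: one that flags event URLs containing a year, and one that answers requests for expired event URLs with a 301 to the most appropriate section. The regular expression, the REDIRECTS mapping, and the target path /Deals/ are hypothetical placeholders.

```python
import re

# Matches a standalone four-digit year such as 2018 anywhere in the path.
YEAR_IN_PATH = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def has_year(url_path: str) -> bool:
    """True if the path contains a four-digit year."""
    return bool(YEAR_IN_PATH.search(url_path))


# Hypothetical mapping of expired event URLs to their 301 targets.
REDIRECTS = {"/Boxing-Day-Sale": "/Deals/"}


def resolve(path: str):
    """Return (status, location): 301 for expired event URLs, else 200."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]
    return 200, path


print(has_year("/Christmas-2018-Deals"))
print(resolve("/Boxing-Day-Sale"))
```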

A mixed approach
Whenever possible, I like to implement another tactic: update titles, descriptions, and headings while customizing the look and feel of the existing landing pages. Instead of having separate URLs for each event, your current landing pages (for example, your category pages) will become the event landing page.

Choose the most important categories on your website or choose the categories that will be promoted during a specific consumer holiday and customize their look and feel to match the event. Customization can be as simple as displaying a banner at the top of the main content area or adding a background image for the entire page, or it can be as complex as creating an entirely new event-themed layout.

You will not publish this new layout under a different URL. Instead, this themed look, feel, and messaging will be released on the regular category page URLs. For example, let’s say your Christmas 2018 promotion includes a 25% discount on all Cleansing products. Rather than creating the URL mysite.com/Christmas-Deals for this holiday, you will use its usual URL, mysite.com/Cleansing/. However, this page will be themed with a Christmas look and feel.

The main benefit of customizing the existing category pages for shopping events is that you will be able to build backlinks to category pages, more easily. Another benefit is that there will be no future redirect headaches. Also, if other websites want to link to your promotions, they will link to your category pages.

Once the holiday is over, go back to the usual, non-themed layout. You will also have to update titles, descriptions and page copy (a bit).

Tip: If your website gets image-rich snippets, you can “theme” the image thumbnails with an event or holiday-specific icon. Once the holiday/event is over, revert to the usual images. For instance, if you sell cameras, instead of this video thumbnail:

Figure 83 – Video listing in search results.

use a themed image:

Figure 84 – Personalizing the video thumbnails can lead to a better CTR.

Optimize for “gift card” related keywords
Last-minute shoppers often choose to buy e-gift cards instead of real products, to avoid shipping delays. If you are still debating using gift cards, consider the following:

  • 26.7% of the gift cards sold during December 2011 were sold between December 21 and 24, according to Giftango Corp.[18]
  • 57.3% of shoppers planned to buy a gift card in 2011.[19]
  • Gift cards were the most requested gift in 2012, with 59.8% of US shoppers wanting one.[20]
  • E-gift cards reach their recipients instantly (no delays, no shipping, and no hassles).

It makes sense to offer both e-gift cards (which are perfect for last-minute shoppers) and gift cards (great for those who do not know what to buy as gifts).

Target long-tail keywords
For ecommerce websites (especially those new on the market) it is more viable to start by targeting long-tail search queries and gradually progress towards more competitive head terms. Usually, carefully chosen torso and long-tail search queries tend to generate more qualified traffic and have less competition. However, keywords containing brand names may prove as competitive as head terms.

Targeting search queries that assist with conversion is a good tactic. Often, such search queries require content (interactive tools, comprehensive guides, etc.), but think of this content as a long-term investment. For example, targeting the search query “how to choose a digital camera” may require creating a camera finder tool. If you target “how to choose shaving cream” you will need to create an extensive (eventually interactive) and visually appealing resource specifically for that.

Here are just a few benefits of targeting long-tail keywords:

  • You will achieve organic search results fast.
  • It helps you gather insights about your customers.
  • It assists with improved paid search results (through better Quality Scores).

Figure 85 – This is the SERP for the query “how to choose shaving cream”. None of the top 10 results has a product finder or a product wizard. If you are in this niche, that is your opportunity.

In addition to focusing on long-tail search queries, you may need to avoid targeting head terms with very vague user intent. For example, let’s say you sell greeting cards. Would it be useful to rank for a keyword like “greeting” or “cards”? No, because you will not be able to identify the user intent behind these keywords. You will invest a lot to brand your business for those terms, and you will get a ton of traffic if you manage to rank at the top, but generic terms generate very few conversions, at a very high cost per conversion. Instead, you can start targeting keywords like “40th birthday greeting cards for dads”, perhaps on a blog post, or, if it is a popular search query, with a content-rich subcategory page.

As you can see, keyword research is far from being simple or fast. It is a process that cannot be fully automated, and human review is irreplaceable, especially when bucketing keywords for relevance to your business. After going through this section, you hopefully understand that performing keyword research without considering user intent is a bad idea.

During the next sections, we will find that keywords are part of almost every on-page SEO factor, from page titles to URLs and internal anchor text, and to product copy. However, for search engines to find and analyze keywords, they first have to find and reach the pages where those keywords are featured. Since ecommerce websites are a challenging crawling task for search engines, you need to optimize how search bots discover relevant URLs. This process is called crawl optimization, and it is the subject of the next section of the guide.

CHAPTER 4

Crawl Optimization

Length: 6,918 words

Estimated reading time: 50 minutes

Crawl optimization is aimed at helping search engines discover URLs in the most efficient manner. Relevant pages should be easy to reach, while less important pages should not waste the so-called “crawl budget” and should not create crawl traps. Crawl budget is defined as the number of URLs search engines can and want to crawl.

Search engines assign a crawl budget to each website, depending on the authority of the website. Generally, the authority of a site is roughly proportional to its PageRank.

The concept of crawl budget is essential for ecommerce websites because they usually consist of a vast number of URLs, from tens of thousands to millions.

If the technical architecture puts the search engine crawlers (also known as robots, bots or spiders) in infinite loops or traps, the crawl budget will be wasted on pages that are not important for users or search engines. This waste may lead to important pages being left out of search engines’ indices.

Additionally, crawl optimization is where very large websites can take advantage of the opportunity to have more critical pages indexed and low PageRank pages crawled more frequently.[1]

The number of URLs Google can index increased dramatically after the introduction of their Percolator[2] architecture, with the “Caffeine” update.[3] However, it is still important to check what resources search engine bots request on your website and to prioritize crawling accordingly.

Before we begin, it is important to understand that crawling and indexing are two different processes. Crawling means just fetching files from websites. Indexing means analyzing the files and deciding whether they are worthy of inclusion. So, even if search engines crawl a page, they will not necessarily index it.

Crawling is influenced by several factors such as the website’s structure, internal linking, domain authority, URL accessibility, content freshness, update frequency, and the crawl rate settings in webmaster tools accounts.
Before detailing these factors, let’s talk about tracking and monitoring search engine bots.

Tracking and monitoring bots

Googlebot, Yahoo! Slurp, and Bingbot are polite bots,[4] which means that they will first obey the crawling directives found in robots.txt files, before requesting resources from your website. Polite bots will identify themselves to the web server, so you can control them as you wish. The requests made by bots are stored in your log files and are available for analysis.

Webmaster tools, such as the ones provided by Google and Bing, only uncover a small part of what bots do on your website—e.g., how many pages they crawl or bandwidth usage data. That is useful in some ways but is not enough.

For really useful insights, you have to analyze the traffic log files. From these, you will be able to extract information that can help identify large-scale issues.

Traditionally, log file analysis was performed using the grep command line with regular expressions. But, lately, there are also desktop and web-based solutions that will make this type of geek analysis easier and more accessible to marketers.

On ecommerce websites, monthly log files are usually huge—gigabytes or even terabytes of data. However, you do not need all the data inside the log files to be able to track and monitor search engine bots. You need just the lines generated by bot requests. This way you can significantly reduce the size of the log files from gigabytes to megabytes.

Using the following Linux command line (case sensitive) will extract just the lines containing “Googlebot”, from one log file (access_log.processed) to another (googlebot.log):
grep "Googlebot" access_log.processed > googlebot.log

To extract similar data for Bing and other search engines, replace “Googlebot” with other bot names.
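
If you prefer a script to repeated grep runs, the same filtering can be sketched in Python for several bots at once. The user-agent substrings in BOT_NAMES and the sample log lines are assumptions for illustration; check the exact strings in your own logs, and remember that user agents can be spoofed.

```python
BOT_NAMES = ("Googlebot", "bingbot", "Slurp")   # user-agent substrings (assumed)


def split_by_bot(lines):
    """Group log lines by the first bot name found in them (case sensitive)."""
    buckets = {name: [] for name in BOT_NAMES}
    for line in lines:
        for name in BOT_NAMES:
            if name in line:
                buckets[name].append(line)
                break
    return buckets


# Hypothetical log lines, for illustration only.
sample = [
    '66.249.66.1 - - [10/Dec/2018:06:25:14 +0000] "GET /bracelets/ HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '157.55.39.1 - - [10/Dec/2018:06:26:02 +0000] "GET /rings/ HTTP/1.1" 200 4987 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '203.0.113.7 - - [10/Dec/2018:06:27:40 +0000] "GET /rings/ HTTP/1.1" 200 4987 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

buckets = split_by_bot(sample)
for name, bot_lines in buckets.items():
    print(name, len(bot_lines))
```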

Figure 86 – The log file was reduced from 162.5 MB to 1.4 MB.

Open the bot-specific log file with Excel, go to Data –> Text to Columns, and use Delimited with Space to enter the log file data into a table format like this one:

Figure 87 – The data is filtered by Status, to get a list of all 404 Not Found errors encountered by Googlebot.

Note: you can import only up to one million rows in Excel; if you need to import more, use MS Access or Notepad++.
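
For log files too large for a spreadsheet, the same table can be built with a short script. The sketch below assumes the widely used "combined" access-log layout; the regular expression, the function names, and the sample lines are illustrative and may need adjusting to your server's log format.

```python
import re

# Pattern for the common "combined" access-log layout (an assumption; adjust
# it if your server logs use a different format).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse(line: str):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def not_found_urls(lines):
    """URLs for which the bot received a 404 Not Found."""
    rows = (parse(line) for line in lines)
    return [row["url"] for row in rows if row and row["status"] == "404"]


# Hypothetical Googlebot log lines, for illustration only.
sample = [
    '66.249.66.1 - - [10/Dec/2018:06:25:14 +0000] "GET /bracelets/ HTTP/1.1" 200 5123 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Dec/2018:06:25:20 +0000] "GET /old-product HTTP/1.1" 404 312 "-" "Googlebot/2.1"',
]

print(not_found_urls(sample))
```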

To quickly identify crawling issues at category page levels, chart the Googlebot hits for each category. This is where the advantage of category-based navigation and URL structure comes in handy.

Figure 88 – It looks like the /bracelets/ directory needs some investigation because there are too few bot requests compared to the other directories.
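
The per-category chart can be prepared by counting bot hits per top-level directory. This is a minimal sketch; the function name and the sample URLs are hypothetical, and it assumes a category-based URL structure.

```python
from collections import Counter


def top_directory(url: str) -> str:
    """Map '/bracelets/gold-cuff?color=red' to '/bracelets/'; root files to '/'."""
    path = url.split("?", 1)[0]
    parts = [p for p in path.split("/") if p]
    # A single segment without a trailing slash (e.g. /about.html) is a root-level file.
    if not parts or (len(parts) == 1 and not path.endswith("/")):
        return "/"
    return f"/{parts[0]}/"


# Hypothetical URLs taken from a Googlebot-only log.
crawled = [
    "/rings/", "/rings/gold-band", "/rings/silver-band?size=7",
    "/necklaces/pearl", "/necklaces/", "/bracelets/cuff", "/about.html",
]

hits = Counter(top_directory(url) for url in crawled)
for directory, count in hits.most_common():
    print(f"{directory:<14}{count}")
```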

By pivoting the log file data by URLs and crawl date, you can identify content that gets crawled less often:

Figure 89 – The dates the URLs have been fetched.

In this pivot table, you can see that although the three URLs are positioned at the same level in the hierarchy, URL number three gets crawled much more often than the other two. This is a sign that URL #3 is deemed more important.

Figure 90 – More external backlinks and social media mentions may result in an increased crawl frequency.
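
The URL-by-date pivot described above can also be produced without a spreadsheet. The sketch below assumes timestamps in the usual access-log form (e.g., 03/Dec/2018:04:10:00 +0000); the function name and the sample hits are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def crawl_dates(hits):
    """Map each URL to the sorted list of distinct dates on which it was fetched."""
    dates = defaultdict(set)
    for url, timestamp in hits:
        day = datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z").date()
        dates[url].add(day)
    return {url: sorted(days) for url, days in dates.items()}


# Hypothetical (URL, timestamp) pairs extracted from a bot-only log.
hits = [
    ("/rings/gold", "03/Dec/2018:04:10:00 +0000"),
    ("/rings/silver", "03/Dec/2018:04:11:30 +0000"),
    ("/rings/ruby", "03/Dec/2018:04:12:10 +0000"),
    ("/rings/ruby", "05/Dec/2018:09:40:55 +0000"),
    ("/rings/ruby", "05/Dec/2018:17:02:13 +0000"),
    ("/rings/ruby", "08/Dec/2018:01:15:42 +0000"),
]

for url, days in crawl_dates(hits).items():
    print(url, [d.isoformat() for d in days])
```

URLs with noticeably fewer crawl dates than their siblings are the ones worth investigating first.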

Here are some issues and ideas you should consider when analyzing bot behavior using log files:

  • Analyze server response errors and identify what generates those errors.
  • Discover unnecessarily crawled pages and crawling traps.
  • Correlate days since the last crawl with rankings; when you make changes on a page, make sure it gets re-crawled; otherwise the updates won’t be considered for rankings.
  • Discover whether products listed at the top of listings are crawled more often than products listed on component pages (paginated listings). Consider moving the most important products to the first page, rather than having them on component pages.
  • Check the frequency and depth of the crawl.

The goal of tracking bots is to:

  • Establish where the crawl budget is used.
  • Identify unnecessary requests (e.g., “Write a Review” links that open pages with exactly the same content except for the product name, e.g., mysite.com/review.php?pid=1, mysite.com/review.php?pid=2 and so on).
  • Fix the leaks.

Instead of wasting budget on unwanted URLs (e.g., duplicate content URLs), focus on sending crawlers to pages that matter for you and your users.

Another useful application of log files is to evaluate the quality of backlinks. Rent links from various external websites and point them at pages with no other backlinks (product detail pages or pages that support product detail pages). Then, analyze the spider activity on those pages. If the crawl frequency increases, then that link is more valuable than a link that does not increase spider activity at all. An increase in crawling frequency on your pages suggests that the page you got the link from is also crawled often, which means that the linking page has good authority. Once you have identified good opportunities, work to get natural links from those websites.

Flat website structure

If there are no other technical impediments to crawling large websites (e.g., crawlable facets or infinite spaces[5]), a flat website architecture can help crawling by allowing search engines to reach deep pages in very few hops, therefore using the crawl budget very efficiently.

Pagination—or, to be more specific, de-pagination—is one way to flatten your website architecture. We will discuss pagination later, in the Listing Pages section.

For more information on flat website architecture, please refer to the section titled The Concept of Flat Architecture in the Site Architecture section.

Accessibility

I will refer to accessibility in terms of optimization for search engines rather than optimization for users.

Accessibility is probably a critical factor for crawling. Your crawl budget is dictated by how the server responds to bot traffic. If the technical architecture of your website makes it impossible for search engine bots to access URLs, then those URLs will not be indexed. URLs that are already indexed but are not accessible after a few unsuccessful attempts may be removed from search engine indices.
Google crawls new websites at a low rate, then gradually increases the rate up to a level that does not create accessibility issues for your users or your server.

So, what prevents URLs and content from being accessible?

DNS and connectivity issues
Use http://www.intodns.com/ to check for DNS issues. Everything that comes in red and yellow needs your attention (even if it is an MX record).

Figure 91 – Report from intodns.com.

Using Google and Bing webmaster accounts, fix all the issues related to DNS and connectivity:

Figure 92 – Bing’s Crawl Information report.

Figure 93 – Google’s Site Errors report in the old GSC.[6]

One DNS issue you may want to pay attention to is related to wildcard DNS records, which means the web server responds with a 200 OK code for any subdomain request, even for ones that do not exist. An even more severe problem related to DNS is unrecognizable hostnames, which means the DNS lookup fails when trying to resolve the domain name.

One large retailer had another misconfiguration. Two of its country-specific domains, the US site (.com, strictly speaking a generic TLD) and the UK site (.co.uk), resolved to the same IP. If you have multiple country code top-level domains (ccTLDs), host them on different IPs (ideally from within the country you target with the ccTLD), and check how the domain names resolve.

Needless to say, if your web servers are down, no one will be able to access the website (including search engine bots). You can keep an eye on the availability of your site using server monitoring tools like Monitor.Us, Scoutt or Site24x7.

Host load
Host load represents the maximum number of simultaneous connections a web server can handle. Every page load request from Googlebot, Yahoo! Slurp, or Bingbot generates a connection with your web server. Since search engines use distributed crawling from multiple machines at the same time, you can theoretically reach the limits of the connections, and your website will crash (especially if you are on a shared hosting plan).

Use tools such as the one found at loadimpact.com to check how many connections your website can handle. Be careful though; your site can become unavailable or even crash during this test.

Figure 94 – If your website loads under two seconds when used by a large number of visitors, you should be fine – graph generated by loadimpact.com.

Page load time
Page load time is not only a crawling factor but also a ranking and usability factor. Amazon reportedly increased its revenue by 1% for every 100ms of load time improvement,[7] and Shopzilla increased revenue by 7% to 12% by decreasing the page load time by five seconds.[8]

There are plenty of articles about page load speed optimization, and they can get pretty technical. Here are a few pointers to summarize how you can optimize load times:

  • Defer loading of images until needed for display in the browser.
  • Use CSS sprites.
  • Use the HTTP/2 protocol.

Figure 95 – Amazon uses CSS sprites to minimize the number of requests to their server.

Figure 96 – Apple used sprites for their main navigation.

  • Use content delivery networks for media files and other files that do not update too often.
  • Implement database and cache (server-side caching) optimization.
  • Enable HTTP compression and implement conditional GET.
  • Optimize images.
  • Use expires headers.[9]
  • Ensure fast and responsive design to decrease the time to first byte (TTFB). Use http://webpagetest.org/ to measure TTFB. There seems to be a clear correlation between lower rankings and increased TTFB.[10]

If your URLs load slowly, search engines may interpret this as a connectivity issue, meaning they will give up crawling the troubled URLs.

The time Google spends downloading a page seems to influence the number of pages it crawls. The less time it takes to download a page, the more pages are crawled.

Figure 97 – The correlation between the time spent downloading a page and the pages crawled per day seems apparent in this graph.

Broken links
This is a no-brainer. When your internal links are broken, crawlers will not be able to find the correct pages. Run a full crawl on the entire website with the crawling tool of your choice and fix all broken URLs. Also, use the webmaster tools provided by search engines to find broken URLs.

HTTP caching with Last-Modified/If-Modified-Since and ETag headers
In reference to crawling optimization, the term “cache” refers to a stored page in a search engine index. Note that caching is a highly technical issue, and improper caching settings may make search engines crawl and index a website chaotically.

When a search engine requests a resource on your website, it first asks your web server for the status of that resource. The server will reply with a header response. Based on the header response, search engines will decide to download the resource or to skip it.

Many search engines check whether the resource they request has changed since they last crawled it. If it has, they will fetch it again—if not, they will skip it. This mechanism is referred to as conditional GET. Bing confirmed that it uses the If-Modified-Since header,[11] and Google does as well.[12]

Below is the header response for a newly discovered page that supports the If-Modified-Since header when a request is made to access it.

Figure 98 – Use the curl command to get the last modified date.

When the bot requests the same URL the next time, it will add an If-Modified-Since request header. If the document has not been modified, the server will respond with a 304 status code (Not Modified):

Figure 99 – A 304 response header

A request that carries If-Modified-Since will get a 304 Not Modified response if the page has not been changed. If it has been modified, the header response will be 200 OK, and the search engine will fetch the page again.
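
The decision a server makes for a conditional GET can be sketched in a few lines. This is a simplified illustration, not how any particular web server or ecommerce platform implements it; the respond function and the sample dates are assumptions.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def respond(last_modified: datetime, if_modified_since: Optional[str]):
    """Return (status, headers) the way a server supporting conditional GET might."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None                      # ignore an unparsable header
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds first.
        if since is not None and last_modified.replace(microsecond=0) <= since:
            return 304, headers               # not modified: no body is sent
    return 200, headers                       # modified or first visit: send the page


page_changed = datetime(2018, 12, 10, 8, 0, 0, tzinfo=timezone.utc)

status, headers = respond(page_changed, None)              # first crawl
print(status, headers["Last-Modified"])
print(respond(page_changed, headers["Last-Modified"])[0])  # repeat crawl
```

The bot simply echoes back the Last-Modified value it received on the previous crawl, which is why the server must send that header in the first place.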

The ETag header works similarly but is more complicated to handle.

If your ecommerce platform uses personalization, or if the content on each page changes frequently, it may be more challenging to implement HTTP caching, but even dynamic pages can support If-Modified-Since.[13]

Sitemaps

There are two major types of sitemaps:

  • HTML sitemaps
  • XML Sitemaps

You can also submit Sitemaps in the following formats: plain text files, RSS, or mRSS.
If you experience crawling and indexing issues, keep in mind that sitemaps are just a patch for more severe problems such as duplicate content, thin content or improper internal linking. Creating sitemaps is a good idea, but it will not fix those issues.

HTML sitemaps

HTML sitemaps are a form of secondary navigation. They are usually accessible to people and bots through a link placed at the bottom of the website, in the footer.

A usability study on a mix of websites, including ecommerce websites, found that people rarely use HTML sitemaps. In 2008, only 7% of the users turned to the sitemap when asked to learn about a site’s structure,[14] down from 27% in 2002. Nowadays, the percentage is probably even lower.

Still, HTML sitemaps are handy for sending crawlers to pages at the lower levels of the website taxonomy and for creating flat internal linking.

Figure 100 – Sample flat architecture.

Here are some optimization tips for HTML sitemaps:

Use segmented sitemaps
When optimizing HTML sitemaps for crawling, it is important to remember that PageRank is divided between all the links on a page. Splitting the HTML sitemap into multiple smaller parts is a good way to create more user and search engine friendly pages for large websites, such as ecommerce websites.

Instead of a huge sitemap page that links to almost every page on your website, create a main sitemap index page (e.g., sitemap.html) and link from it to smaller sitemap component pages (sitemap-1.html, sitemap-2.html, etc.).

You can split the HTML sitemaps based on topics, categories, departments, or brands. Start by listing your top categories on the index page. The way you split the pages depends on the number of categories, subcategories, and products in your catalog. You can use the “100 links per page” rule below as a guideline, but do not get stuck on this number, especially if your website has good authority.

If you have more than 100 top-level categories, you should display the first 100 of them on the sitemap index page and the rest on additional sitemap pages. You can allow users and search engines to navigate the sitemap using previous and next links (e.g., “see more categories”).

If you have fewer than 100 top-level categories in the catalog, you will have room to list several important subcategories as well, as depicted below:

Figure 101 – A clean HTML sitemap example.

The top-level categories in this sitemap are Photography, Computers & Solutions and Pro Audio. Since this business has a limited number of top-level categories, there is room for several subcategories (Digital Cameras, Laptops, Recording).
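
Splitting a long list of category links into sitemap pages can be sketched as follows. The 100-link page size follows the guideline mentioned earlier, the file names mirror the sitemap.html and sitemap-1.html pattern, and the generated category paths are placeholders.

```python
def paginate(links, per_page=100):
    """Split a list of links into sitemap pages of at most `per_page` links."""
    return [links[i:i + per_page] for i in range(0, len(links), per_page)]


def page_name(index: int) -> str:
    """The first page is the sitemap index; the rest are numbered component pages."""
    return "sitemap.html" if index == 0 else f"sitemap-{index}.html"


categories = [f"/category-{n}/" for n in range(1, 251)]   # hypothetical catalog
pages = paginate(categories)

for i, page in enumerate(pages):
    print(page_name(i), len(page))
```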

Do not link to redirects
The URLs linked from sitemap pages should land crawlers on the final URLs, rather than going through URL redirects.

Enrich the sitemaps
Adding a bit of extra data by annotating links with info is good for users and can provide some context for search engines as well. You can add data such as product thumbnails, customer ratings, manufacturer names, and so on.

These are just some suggestions for HTML sitemaps so that you can make the pages easier for people to read and very lightly linked for crawlers. However, the best way to help search engines discover content on your website is to feed them a list of URLs in different file formats. One such file format is XML.

XML Sitemaps

Modern ecommerce platforms should auto-generate XML Sitemaps, but many times the default output file is not optimized for crawling and analysis. It is therefore important to manually review and optimize the automated output or generate the Sitemaps based on your own rules.

Unless you have concerns about competitors spying on your URL structure, it is preferable to include the path of the XML Sitemap file within the robots.txt file.

Robots.txt is requested by search engines every time they start a new crawling session on your website. It is analyzed to see if it was modified since the last crawl. If it wasn’t modified, then search engines will use the existing robots.txt cached file to determine which URLs can be crawled.

If you do not specify the location of your XML Sitemap inside robots.txt, then search engines will not know where to find it (unless you submitted it within the webmaster accounts). Submitting to Google Search Console or Bing Webmaster Tools gives you access to more insights, such as how many URLs have been submitted, how many are indexed, and what errors, if any, are present in the Sitemap.
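
The Sitemap location is declared in robots.txt with a Sitemap: line. The sketch below uses Python's standard urllib.robotparser module to confirm that a robots.txt file exposes it; the file contents, the domain, and the paths are hypothetical placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the domain and paths are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Sitemap: https://www.mysite.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

sitemaps = parser.site_maps()   # Sitemap URLs declared in the file (Python 3.8+)
print(sitemaps)
print(parser.can_fetch("Googlebot", "https://www.mysite.com/checkout/cart"))
```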

Figure 102 – If you have an almost 100% indexation rate you probably do not need to worry about crawl optimization.

Using XML Sitemaps seems to have an accelerating effect on the crawl rate:

“At first, the number of visits was stabilized at a rate of 20 to 30 pages per hour. As soon as the sitemap was uploaded through Webmaster Central, the crawler accelerated to approximately 500 pages per hour. In just a few days it reached a peak of 2,224 pages per hour. Where at first the crawler visited 26.59 pages per hour on average, it grew to an average of 1,257.78 pages per hour which is an increase of no less than 4,630.27%”.[15]

Here are some tips for optimizing XML Sitemaps for large websites:

  • Add only URLs that respond with 200 OK. Too many errors and search engines will stop trusting your Sitemaps. Bing has,

“a 1% allowance for dirt in a Sitemap. Examples of dirt are if we click on a URL and we see a redirect, a 404 or a 500 code. If we see more than a 1% level of dirt, we begin losing trust in the Sitemap”.[16]

Google is less stringent than Bing; they do not care about the errors in the Sitemap.

  • Have no links to duplicate content and no URLs that canonicalize to different URLs—only to “end state” URLs.
  • Place videos, images, news, and mobile URLs in separate Sitemaps. For videos, you can use video sitemaps, but mRSS formatting is supported as well.
  • Segment the Sitemaps by topic or category, and by subtopic or subcategory. For example, you can have a sitemap for your camping category – sitemap_camping.xml, another one for your Bicycles category – sitemap_cycle.xml, and another one for the Running Shoes category – sitemap_run.xml. This segmentation does not directly improve organic rankings, but it will help identify indexation issues at granular levels.
  • Create separate Sitemap files for product pages — segment by the lowest level of categorization.
  • Fix Sitemap errors before submitting your files to search engines. You can do this within your Google Search Console account, using the Test Sitemap feature:

Figure 103 – The Test Sitemap feature in Google Search Console.

  • Keep language-specific URLs in separate Sitemaps.
  • Do not assign the same weight to all pages (your scoring can be based on update frequency or other business rules).
  • Auto-update the Sitemaps whenever important URLs are created.
  • Include only URLs that contain essential and important filters (see section Product Detail Pages).

You probably noticed a commonality within these tips: segmentation. It is a good idea to split your XML files as much as you can without overdoing it (e.g., just 10 URLs per file), so you can identify and fix indexation issues more easily.[17]
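The segmentation idea can be sketched in a few lines of Python. The example below is only an illustration: the catalog dictionary, the URLs, and the function name are hypothetical, and a real implementation would pull the URLs from your platform's database and verify that each one returns 200 OK before including it.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical catalog: segment name -> URLs that belong to that category.
CATALOG = {
    "camping": [
        "https://www.mysite.com/camping/tents/",
        "https://www.mysite.com/camping/sleeping-bags/",
    ],
    "cycle": ["https://www.mysite.com/bicycles/road-bikes/"],
    "run": ["https://www.mysite.com/running-shoes/trail/"],
}


def build_segmented_sitemaps(urls_by_segment):
    """Return {file name: XML text}, with one Sitemap file per segment."""
    files = {}
    for segment, urls in urls_by_segment.items():
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls:
            entry = ET.SubElement(urlset, "url")
            ET.SubElement(entry, "loc").text = url
        xml_body = ET.tostring(urlset, encoding="unicode")
        files[f"sitemap_{segment}.xml"] = (
            '<?xml version="1.0" encoding="UTF-8"?>\n' + xml_body
        )
    return files


if __name__ == "__main__":
    for name, xml_text in build_segmented_sitemaps(CATALOG).items():
        print(name, len(xml_text), "characters")
```

With one file per category, the submitted versus indexed counts reported in the webmaster accounts point directly at the category that has a problem.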

Keep in mind that sitemaps, either XML or HTML, should not be used as a substitute for poor website architecture or other crawlability issues, but only as a backup. Make sure that there are other paths for crawlers to reach all important pages on your website (e.g., internal contextual links).

Here are some factors that can influence the crawl budget:
Popularity
Crawlers will request pages more frequently if they find more external and internal links pointing to them. Most ecommerce websites experience challenges building links to category and product detail pages, but this has to be done. Guest posting, giveaways, link bait, evergreen content, outright link requests within confirmation emails, ambassador programs, and perpetual holiday category pages are just some of the tactics that can help with link development.

Crawl rate settings
You can alter (usually decrease) the crawl rate of Googlebot using your Google Search Console account. However, changing the rate is not advisable unless the crawler slows down your web server.
With Bing’s Crawl Control feature you can even set up day parting.

Figure 104 – Bing’s Crawl Control Interface.

Fresh content
Updating content on pages and then pinging search engines (i.e., by creating feeds for product and category pages) should get the crawlers to the updated content relatively quickly.

If you update fewer than 300 URLs per month, you can use the Fetch as Google feature inside your Google Search Console account to get the updated URLs re-crawled in a snap. Also, you can regularly (e.g., weekly) create and submit a new XML Sitemap just for the updated or for the new pages.

There are several ways to keep your content fresh. For example, you can include an excerpt of about 100 words from related blog posts on product detail pages. Ideally, the excerpt should include the product name and links to parent category pages. Every time you mention a product in a new blog post, update the excerpt on the product detail page as well.

You can even include excerpts from articles that do not directly mention the product name if the article is related to the category in which the product can be classified.

Figure 105 – The “From Our Blog” section keeps this page updated and fresh.

Another great tactic to keep the content fresh is to continuously generate user reviews, product questions and answers, or other forms of user-generated content.

Figure 106 – Ratings and reviews are a smart way to keep pages updated, especially for products in high demand.

Domain authority
The higher your website’s domain authority, the more visits search engine crawlers will pay. Your domain authority increases as more external links point to your website, which is a lot easier said than done.

RSS feeds
RSS feeds are one of the fastest ways to notify search engines of new products, categories, or other types of fresh content on your website. Here’s what Duane Forrester (Bing’s former Webmaster senior product manager) said in the past about RSS feeds:

“Things like RSS are going to become a desired way for us to find content … It is a dramatic cost savings for us”.[18]

You can get search engines to crawl the new content within minutes of publication with the help of RSS. For example, if you write content that supports category and product detail pages and if you link smartly from these supporting pages, search engines will request and crawl the linked-to product and category URLs as well.

Figure 107 – Zappos has an RSS feed for brand pages. Users (and search engines) are instantly notified every time Zappos adds a new product from a brand.
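A feed for new products can be generated with very little code. The Python sketch below builds a basic RSS 2.0 document; the brand name, URLs, dates, and function name are hypothetical, and your platform may already provide a feed generator that you should prefer.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from xml.etree import ElementTree as ET


def build_rss(title, link, items):
    """Build a minimal RSS 2.0 feed; items are dicts with title, url, published."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"New products: {title}"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["url"]
        # RSS expects RFC 822 style dates, which format_datetime() produces.
        ET.SubElement(node, "pubDate").text = format_datetime(item["published"])
    return ET.tostring(rss, encoding="unicode")


# Hypothetical new products for a brand page feed.
NEW_PRODUCTS = [
    {
        "title": "Trail Runner 2",
        "url": "https://www.mysite.com/running-shoes/trail-runner-2/",
        "published": datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc),
    },
    {
        "title": "Road Racer",
        "url": "https://www.mysite.com/running-shoes/road-racer/",
        "published": datetime(2024, 1, 16, 9, 0, tzinfo=timezone.utc),
    },
]

if __name__ == "__main__":
    print(build_rss("Acme Shoes", "https://www.mysite.com/brands/acme/", NEW_PRODUCTS))
```

Regenerating the feed whenever a product is added keeps the newest URLs at the top, where crawlers that poll the feed will find them first.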

Guiding crawlers

The best way to avoid wasting crawl budget on low-value-add URLs is to avoid creating links to those URLs, in the first place. However, that is not always an option. For example, you have to allow people to filter products based on three or more product attributes. Alternatively, you may want to allow users to email a friend from product detail pages. Or, you have to give users the option to write product reviews. If you create unique URLs for “Email to a Friend” links, for example, you may end up creating duplicate content.

Figure 108 – The URLs in the image above are near-duplicates. However, these URLs do not have to be accessible to search engines. Block the email-friend.php file in robots.txt

These “Email to a Friend” URLs will most likely lead to the same web form, and search engines will unnecessarily request and crawl hundreds or thousands of such links, depending on the size of your catalog. You will waste the crawl budget by allowing search engines to discover and crawl these URLs.

You should control which links are discoverable by search engine crawlers and which are not. The more unnecessary requests a crawler makes for junk pages, the fewer chances it has to reach more important URLs.

Crawler directives can be defined at various levels, in this priority:

  • Site-level, using robots.txt.
  • Page-level, with the noindex meta tag and with HTTP headers.
  • Element-level, using the nofollow microformat.

Site-level directives overrule page-level directives, and page-level directives overrule element-level directives. It is important to understand this priority because for a page-level directive to be discovered and followed, the site-level directives should allow access to that page. The same applies to element-level and page-level directives.

On a side note, if you want to keep content as private as possible, one of the best ways is to use server-side authentication to protect areas.

Robots.txt

Although robots.txt files can be used to control crawler access, the URLs disallowed with robots.txt may still end up in search engines’ indices because of external backlinks pointing to the “robotted” URLs. This suggests that URLs blocked with robots.txt can accumulate PageRank. However, URLs blocked with robots.txt will not pass PageRank, since search engines cannot crawl and index the content and the links on such pages. The exception is if the URLs were previously indexed, in which case they will pass PageRank.

It is interesting to note that pages with Google+ buttons may be visited by Google when someone clicks on the plus button, ignoring the robots.txt directives.[19]

One of the biggest misconceptions about robots.txt is that it can be used to control duplicate content. The fact is, there are better methods for controlling duplicate content, and robots.txt should only be used to control crawler access. That being said, there may be cases where one does not have control over how the content management system generates the content, or cases when one cannot make changes to pages generated on the fly. In such situations, one can try as a last resort to control duplicate content with robots.txt.

Every ecommerce website is unique, with its own specific business needs and requirements, so there is no general rule for what should be crawled and what should not. Regardless of your website’s particularities, you will need to manage duplicate content by using either rel=“canonical” or HTTP headers.

While tier-one search engines will not attempt to “add to cart” and will not start a checkout process or a newsletter sign-up on purpose, coding glitches may trigger them to attempt to access unwanted URLs. Considering this, here are some common types of URLs you can block access to:

Shopping cart and checkout pages
Add to Cart, View Cart, and other checkout URLs can safely be added to robots.txt.

If the View Cart URL is mysite.com/viewcart.aspx, you can use the following commands to disallow crawling:

User-agent: *
# Do not crawl view cart URLs
Disallow: /*viewcart.aspx
# Do not crawl add to cart URLs
Disallow: /*addtocart.aspx
# Do not crawl checkout URLs
Disallow: /checkout/

The above directives mean that all bots are forbidden to crawl any URL that contains viewcart.aspx or addtocart.aspx. Also, all the URLs under the /checkout/ directory are off-limits.

Robots.txt supports limited pattern matching rather than full regular expressions, which is still enough for your programmers to cover a large spectrum of URLs. The star symbol means “anything” and the dollar sign means “ends with”. There is no caret sign for “starts with”; every pattern is already matched from the beginning of the URL path.
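To make the pattern rules concrete, here is a small Python sketch that translates a robots.txt path pattern into a regular expression and checks a URL path against a list of Disallow patterns. It is a simplification for testing purposes, not a full robots.txt parser: it handles only the star and the dollar sign, anchors every pattern at the start of the path, and ignores Allow rules and rule precedence. The function names are my own.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into an anchored regular expression."""
    ends_here = pattern.endswith("$")
    if ends_here:
        pattern = pattern[:-1]
    # Escape the literal parts, then join them with "match anything".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if ends_here else ""))


def is_blocked(path, disallow_patterns):
    """Return True if any Disallow pattern matches the URL path (plus query)."""
    return any(robots_pattern_to_regex(p).search(path) for p in disallow_patterns)


if __name__ == "__main__":
    rules = ["*viewcart.aspx", "*addtocart.aspx", "/checkout/"]
    for path in ["/viewcart.aspx?item=1", "/checkout/shipping", "/camping/tents/"]:
        print(path, "blocked" if is_blocked(path, rules) else "allowed")
```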

User account pages
Account URLs such as Account Login can be blocked as well:

User-agent: *
# Do not crawl login URLs
Disallow: /store/account/*.aspx$

The above directive means that URLs under the /store/account/ directory that end with .aspx will not be crawled.

Below are some other types of URLs that you can consider blocking.

Figure 109 – These are some other types of pages that you can consider blocking.

A couple of notes about the resources highlighted in yellow:

  • If you are running an ecommerce website on WordPress, you may want to let search engine bots crawl the URLs under the tag directory; there were times when you had to block the tag pages, but not anymore.
  • The /includes/ directory should not contain scripts that are used for rendering content on pages. Block it only if you host the scripts necessary to create the undiscoverable links inside /includes/.
  • The same for the /scripts/ and /libs/ directories – do not block them if they contain resources necessary for rendering content.

Duplicate or near-duplicate content issues such as pagination and sorting are not optimally addressed with robots.txt.

Before you upload the robots.txt file, I recommend testing it against your existing URLs. First, generate the list of URLs on your website using one of the following methods:

  • Ask for help from your programmers.
  • Crawl the entire website with your favorite crawler.
  • Use weblog files.

Then, open this list in a text editor that allows searching by regular expressions. Software like RegexBuddy, RegexPal or Notepad++ are good choices. You can test the patterns you used in the robots.txt file using these tools, but keep in mind that you might need to slightly rewrite the regex pattern you used in the robots.txt, depending on the software you use.

Let’s say that you want to block crawlers’ access to email landing pages, which are all located under the /ads/ directory. Your robots.txt will include these lines:

User-agent: *
# Do not crawl email landing pages
Disallow: /ads/

Using RegexPal, you can test the URL list using this simple regex: /ads/

Figure 110 – RegexPal automatically highlights the matched pattern.

If you work with large files that contain hundreds of thousands of URLs, use Notepad++ to match URLs with regular expressions, because Notepad++ can easily handle large files.

For example, let’s say that you want to block all URLs that end with .js. The robots.txt will include this line:

Disallow: /*.js$

To find which URLs in your list match the robots.txt directives using Notepad++, input “\.js$” in the “Find what” field and then select the Regular expression Search Mode:

Figure 111 – Regular expression search mode in Notepad++

Skimming through the highlighted matching URLs marked with yellow can clear doubts about which URLs will be excluded with robots.txt.
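If you prefer a script over a text editor, the same check can be done in Python. In the sketch below, the regular expressions are hand-written equivalents of the two robots.txt rules discussed above, and the URL list is made up; in practice you would load the list exported from your crawler or log files.

```python
import re
from urllib.parse import urlsplit

# Hand-written regex equivalents of two robots.txt rules (an assumption of
# this sketch; adjust them to the rules you actually plan to upload).
RULES = {
    "Disallow: /ads/": re.compile(r"^/ads/"),
    "Disallow: /*.js$": re.compile(r"\.js$"),
}

# Hypothetical URL list.
URLS = [
    "https://www.mysite.com/ads/summer-sale",
    "https://www.mysite.com/scripts/app.js",
    "https://www.mysite.com/data/products.json",
    "https://www.mysite.com/blog/ads/",
]


def path_and_query(url):
    """Return the part of the URL that robots.txt rules are matched against."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


def matches_by_rule(urls, rules):
    """Return {rule: [URLs that the rule would exclude]}."""
    report = {rule: [] for rule in rules}
    for url in urls:
        target = path_and_query(url)
        for rule, regex in rules.items():
            if regex.search(target):
                report[rule].append(url)
    return report


if __name__ == "__main__":
    for rule, matched in matches_by_rule(URLS, RULES).items():
        print(rule, "->", len(matched), "URLs")
```

Reviewing the matched URLs per rule makes it easier to spot a pattern that blocks more than you intended.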

When you need to keep media such as videos, images, or .pdf files out of search results, use the X-Robots-Tag HTTP header[20] instead of the robots.txt file.
However, if you want to address duplicate content issues for non-HTML documents, use rel=“canonical” HTTP headers.[21]
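As a rough sketch, the two headers could be produced by helper functions like the ones below. The file extensions, paths, and function names are assumptions for illustration; in most setups you would configure these headers in the web server or CDN rather than in application code.

```python
# File types that should stay out of the index (an assumption for this sketch).
NOINDEX_EXTENSIONS = (".pdf", ".mp4")


def robots_headers(path):
    """Return an X-Robots-Tag header for media files that should not be indexed."""
    if path.lower().endswith(NOINDEX_EXTENSIONS):
        return [("X-Robots-Tag", "noindex")]
    return []


def canonical_link_header(canonical_url):
    """Return a rel="canonical" HTTP header pointing at the preferred URL."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


if __name__ == "__main__":
    print(robots_headers("/manuals/tent-setup.pdf"))
    print(canonical_link_header("https://www.mysite.com/manuals/tent-setup.pdf"))
```

Use one or the other for a given file: a document that is excluded with noindex does not need a canonical, and a duplicate that points to a canonical should remain crawlable.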

The exclusion parameter

With this technique, you selectively add a parameter (e.g., crawler=no) or a string (e.g., ABCD-9) to the URLs that you want to be inaccessible, and then you block that parameter or string with robots.txt.

First, decide which URLs you want to block.

Let’s say that you want to control the crawling of the faceted navigation by not allowing search engines to crawl URLs generated when applying more than one filter value within the same filter (also known as multi-select). In this case, you will add the crawler=no parameter to all URLs generated when a second filter value is selected on the same filter.

If you want to block bots when they try to crawl a URL generated by applying more than two filter values on different filters, you will add the crawler=no parameter to all URLs generated when a third filter value is selected, no matter which options were chosen, nor the order they were chosen. Here’s a scenario for this example:

The crawler is on the Battery Chargers subcategory page.
The hierarchy is: Home > Accessories > Battery Chargers
The page URL is: mysite.com/accessories/motorcycle-battery-chargers/

Then, the crawler “checks” one of the Brands filter values, Noco. This is the first filter value, and therefore you will let the crawler fetch that page.
The URL for this selection does not contain the exclusion parameter:
mysite.com/accessories/motorcycle-battery-chargers?brand=noco

The crawler now checks one of the Style filter values, cables. Since this is the second filter value applied, you will still let the crawler access the URL.
The URL still does not contain the exclusion parameter. It contains just the brand and style parameters:
mysite.com/accessories/motorcycle-battery-chargers?brand=noco&style=cables

Now, the crawler “selects” one of the Pricing filter values, the number 1. Since this is the third filter value, you will append the crawler=no to the URL.
The URL becomes:
mysite.com/accessories/motorcycle-battery-chargers?brand=noco&style=cables&pricing=1&crawler=no

If you want to block the URL above, the robots.txt file will contain:

User-agent: *
Disallow: /*crawler=no

The method described above prevents the crawling of facet URLs when more than two filter values have been applied, but it does not allow specific control over which filters are going to be crawled and which ones are not. For example, if the crawler “checks” the Pricing options first, the URL containing the pricing parameter will be crawled. We will discuss faceted navigation in detail later on.
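The scenario above can be sketched as a small URL builder. The threshold constant and function name are hypothetical, and the sketch covers only the "more than two filter values" rule, not the multi-select case.

```python
from urllib.parse import urlencode

# Filter values a crawler may apply before the exclusion parameter is added
# (two, following the scenario above).
MAX_CRAWLABLE_FILTERS = 2


def facet_url(base_path, selected_filters):
    """Build a facet URL, appending crawler=no from the third filter value on."""
    params = list(selected_filters)
    if len(params) > MAX_CRAWLABLE_FILTERS:
        params.append(("crawler", "no"))
    return base_path + ("?" + urlencode(params) if params else "")


if __name__ == "__main__":
    base = "/accessories/motorcycle-battery-chargers"
    print(facet_url(base, [("brand", "noco")]))
    print(facet_url(base, [("brand", "noco"), ("style", "cables")]))
    print(facet_url(base, [("brand", "noco"), ("style", "cables"), ("pricing", "1")]))
```

Because the parameter is appended when the link is generated, the robots.txt rule stays a single line no matter how many filters the catalog has.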

URL parameters handling

URL parameters can cause crawl efficiency problems as well as duplicate content issues. For example, if you implement sorting, filtering, and pagination with parameters, then you are likely to end up with a large number of URLs, which will waste crawl budget. In a video about parameters handling, Google shows[22] how 158 products on googlestore.com generated an astonishing 380,000 URLs for crawlers.

Controlling URL parameters within Google Search Console and Bing Webmaster Tools can improve crawl efficiency, but it will not address the causes of duplicate content. You will still need to fix canonicalization issues, at the source. However, since ecommerce websites use multiple URL parameters, controlling them correctly with webmaster tools may prove tricky and risky. Unless you know what you are doing, you are better off using either a conservative setup or the default settings.

URL parameters handling is mostly used for deciding which pages to index and which page to canonicalize to.

One advantage of handling URL parameters within webmaster accounts is that page-level directives (i.e., rel=“canonical” or meta noindex) will still apply as long as the pages containing such directives are not blocked with robots.txt or with other methods. However, while it is possible to use limited regular expressions within robots.txt to prevent the crawling of URLs with parameters, robots.txt will overrule page-level and element-level directives.

Figure 112 – A Google Search Console notification regarding URL parameters.

Sometimes there are cases where you do not have to play with the URL parameters settings. In this screenshot, you can see a message saying that Google has no issues with categorizing your URL parameters. If Google can crawl the entire website without difficulty, you can leave the default settings as they are. If you want to set up the parameters, click on the Configure URL parameters link.

Figure 113 – This screenshot is for an ecommerce website with fewer than 1,000 SKUs. You can see how the left navigation generated millions of URLs.

In the previous screenshot, the limit key (used for changing the number of items listed on the category listing page) generated 6.6 million URLs when combined with other possible parameters. However, because this website has strong authority, it gets a lot of attention and love from Googlebot, and it does not have crawling or indexing issues.

When handling parameters, the first thing you want to decide is which parameters change the content (active parameters) and which ones do not (passive parameters). It is best to do this with your programmers, because they know best how the parameters are used. Parameters that do not affect how content is displayed on a page (e.g., user tracking parameters) are a safe target for exclusion.

Although Google by itself does a good job at identifying parameters that do not change content, it is still worthwhile to set them manually.

To change the settings for such parameters, click Edit:

Figure 114 – Controlling URL parameters within Google Search Console.

In our example, the parameter utm_campaign was used to track the performance of internal promotions, and it does not change the content on the page. In this scenario, choose “No: Does not affect page content (ex: track usage)”.

Figure 115 – Urchin Tracking Module parameters (widely known as UTMs), can safely be consolidated to the representative URLs.

To make sure you are not blocking the wrong parameters, test the sample URLs by loading them in the browser. Load the URL and see what happens if you remove the tracking parameters. If the content does not change, then it can be safely excluded.
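When you have many sample URLs to check, a short script can produce the stripped version of each URL for you to compare against the original. The Python sketch below removes the UTM tracking parameters and leaves every other parameter untouched; the sample URL and function name are made up, and the comparison of the two pages is still up to you.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_term", "utm_content", "utm_campaign",
}


def strip_tracking(url):
    """Return the URL without UTM tracking parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    sample = "https://www.mysite.com/camping/tents/?utm_campaign=spring&color=red"
    print(sample)
    print(strip_tracking(sample))
```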

On a side note, tracking internal promotions with UTM parameters is not ideal. UTM parameters are designed for tracking campaigns outside your website. If you want to track the performance of your internal marketing banners, then use other parameter names or use event tracking.

Some other common parameters that you may consider for exclusion are session IDs, UTM tracking parameters (utm_source, utm_medium, utm_term, utm_content, and utm_campaign) and affiliate IDs.

A word of caution is necessary here, and this recommendation comes straight from Google.[23]

“Configuring site-wide parameters may have severe, unintended effects on how Google crawls and indexes your pages. For example, imagine an ecommerce website that uses storeID in both the store locator and to look up a product’s availability in a store:
/store-locator?storeID=123
/product/foo-widget?storeID=123
If you configure storeID to not be crawled, both the /store-locator and /foo-widget paths will be affected. As a result, Google may not be able to index both kind of URLs, nor show them in our search results. If these parameters are used for different purposes, we recommend using different parameter names”.

In the scenario above, you can keep the store location in a cookie.

Things get more complicated when parameters change how the content is displayed on a page.

One safe setup for content-changing parameters is to suggest to Google how the parameter affects the page (e.g., sorts, narrows/filters, specifies, translates, paginates, others), and use the default option Let Google decide. This approach will allow Google to crawl all the URLs that include the targeted parameter.

Figure 116 – A safe setup is to let Google know that a parameter changes the content, and let Google decide what to do with the parameter.

In the previous example, I knew that the mid parameter changes the content on the page, so I pointed out to Google that the parameter sorts items. However, when it came to deciding which URLs to crawl, I let Google do it.

I recommend letting Google decide because of the way Google chooses canonical URLs: it groups duplicate content URLs into clusters based on internal linking (PageRank), external link popularity, and content. Then Google finds the best URL to surface in search results for each cluster of duplicate content. Since Google does not share the complete link graph of your website, you will not know which URLs are linked the most, so you may not always be able to choose the right URL to canonicalize to.

  1. Google Patent On Anchor Text And Different Crawling Rates, http://www.seobythesea.com/2007/12/google-patent-on-anchor-text-and-different-crawling-rates/
  2. Large-scale Incremental Processing Using Distributed Transactions and Notifications, http://research.google.com/pubs/pub36726.html
  3. Our new search index: Caffeine, http://googleblog.blogspot.ca/2010/06/our-new-search-index-caffeine.html
  4. Web crawler, http://en.wikipedia.org/wiki/Web_crawler#Politeness_policy
  5. To infinity and beyond? No!, http://googlewebmastercentral.blogspot.ca/2008/08/to-infinity-and-beyond-no.html
  6. Crawl Errors: The Next Generation, http://googlewebmastercentral.blogspot.ca/2012/03/crawl-errors-next-generation.html
  7. Make Data Useful, http://www.scribd.com/doc/4970486/Make-Data-Useful-by-Greg-Linden-Amazon-com
  8. Shopzilla’s Site Redo – You Get What You Measure, http://www.scribd.com/doc/16877317/Shopzilla-s-Site-Redo-You-Get-What-You-Measure
  9. Expires Headers for SEO: Why You Should Think Twice Before Using Them, http://moz.com/ugc/expires-headers-for-seo-why-you-should-think-twice-before-using-them
  10. How Website Speed Actually Impacts Search Ranking, http://moz.com/blog/how-website-speed-actually-impacts-search-ranking
  11. Optimizing your very large site for search — Part 2, http://web.archive.org/web/20140527160343/http://www.bing.com/blogs/site_blogs/b/webmaster/archive/2009/01/27/optimizing-your-very-large-site-for-search-part-2.aspx
  12. Matt Cutts Interviewed by Eric Enge, http://www.stonetemple.com/articles/interview-matt-cutts-012510.shtml
  13. Save bandwidth costs: Dynamic pages can support If-Modified-Since too, http://sebastians-pamphlets.com/dynamic-pages-can-support-if-modified-since-too/
  14. Site Map Usability, http://www.nngroup.com/articles/site-map-usability/
  15. New Insights into Googlebot, http://moz.com/blog/googlebot-new-insights
  16. How Bing Uses CTR in Ranking, and more with Duane Forrester, http://www.stonetemple.com/search-algorithms-and-bing-webmaster-tools-with-duane-forrester/
  17. Multiple XML Sitemaps: Increased Indexation and Traffic, http://moz.com/blog/multiple-xml-sitemaps-increased-indexation-and-traffic
  18. How Bing Uses CTR in Ranking, and more with Duane Forrester, http://www.stonetemple.com/search-algorithms-and-bing-webmaster-tools-with-duane-forrester/
  19. How does Google treat +1 against robots.txt, meta noindex or redirected URL, https://productforums.google.com/forum/#!msg/webmasters/ck15w-1UHSk/0jpaBsaEG3EJ
  20. Robots meta tag and X-Robots-Tag HTTP header specifications, https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag
  21. Supporting rel=”canonical” HTTP Headers, http://googlewebmastercentral.blogspot.ca/2011/06/supporting-relcanonical-http-headers.html
  22. Configuring URL Parameters in Webmaster Tools, https://www.youtube.com/watch?v=DiEYcBZ36po&feature=youtu.be&t=1m50s
  23. URL parameters, https://support.google.com/webmasters/answer/1235687?hl=en

CHAPTER 5

Internal Linking Optimization

Length: 11,404 words

Estimated reading time: 1 hour, 20 minutes

The importance of external links for rankings is a well-documented SEO fact and part of conventional SEO wisdom. However, internal links can also impact rankings.

Links, either internal or from external websites, are the primary way for site visitors and search engines to discover content.

If a page does not have incoming internal links, not only may that page not be accessible to search engines for crawling and indexing, but even if it gets indexed, the page will be deemed as less valuable (unless a lot of external links point to it). Examples of pages without internal links are product detail pages that are accessible only after an internal site search, or entire catalogs that are available only to logged-in members.

On the other hand, if a page does not link out to other internal pages, you send search engine robots into a dead-end.

In between these two extremes, internal links can lead crawlers into traps or to unwanted URLs that contain “thin”, duplicate, or near-duplicate content. Internal links can also put crawlers into circular referencing.

When you optimize the internal linking, remember that websites are for users, not for search engines. The reason links exist in the first place is to help users navigate and find what they want, quickly and easily. Therefore, consider an approach that balances links available only to users with links that are available to bots. Build the internal linking for users, and then accommodate the search engines.

Ecommerce websites come with an interesting advantage: a large number of pages creates a large number of internal links. The larger the website and the more links pointing to a page, the more influential that page is. Strangely enough, although SEOs typically know that the more links you point to a page, the more authority the page receives, many SEOs still focus on getting links from external websites first.

However, why not optimize the lowest hanging fruit first, the internal links? When you optimize your internal linking architecture, you do not need to hunt for external backlinks. You need to increase the relevance and authority of key pages on your website by creating quality content that attracts organic traffic and links, and by interlinking pages thematically.

Let’s see how ecommerce websites can take advantage of internal linking to boost relevance, avoid or mitigate duplicate content issues, and build long-tail anchor text to rank for natural language search queries.

Crawlable and uncrawlable links

Before we move forward, a quick and important note. Do not blindly implement any of the techniques discussed in this section. Decide which solution best suits your website based on your business needs and specific situation. If you are in doubt, get help from an experienced consultant before attempting to make changes.

A crawlable link is a link that is accessible to search engine crawlers when they request a web resource from your web server.

An uncrawlable link (undiscoverable link) is a link that search engines cannot discover after parsing the HTML code and after rendering the page. However, that uncrawlable link is still accessible to users, in the browser.

Uncrawlable links can be created client-side (in the browser) using JavaScript or AJAX and by blocking access to the resources required to generate the URLs, using robots.txt. Uncrawlable links are created on purpose and are not the same thing as broken links, which occur accidentally. Also, uncrawlable links are not hidden links (e.g., off-screen text positioned with CSS or white text on white background).

Because the main goal of ecommerce websites is to sell online, they must be useful and present information in an easy to find manner. Imagine an ecommerce website that does not allow users to sort or filter 3,000 items in a single category. However, this sorting and filtering generates URLs that present no value for search engines and, in some cases, limited value for users. Since the current crawling and browsing technologies are dependent on clicks and links, these issues are here to stay for a while.

However, why do ecommerce websites generate overhead URLs, and why are search engines able to access such URLs? There are plenty of reasons:

  • URLs with tracking parameters are needed for personalization or web analysis.
  • Faceted navigation can generate a large number of overhead URLs if it is not properly controlled.
  • A/B testing can also create overhead URLs.
  • If the order of URL parameters is not enforced, you will generate overhead URLs.
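The last point is one of the easiest to fix at the source. As a minimal sketch, the Python function below enforces a single parameter order by sorting the parameters alphabetically before a link is generated; alphabetical order is just one possible convention, and the sample URLs and function name are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_param_order(url):
    """Return the URL with its query parameters in one fixed (sorted) order."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # Both variants collapse to the same URL once the order is enforced.
    print(normalize_param_order("/bedding?size=queen&color=blue"))
    print(normalize_param_order("/bedding?color=blue&size=queen"))
```

If every internal link is passed through a function like this, two users applying the same filters in a different order end up on the same URL instead of two duplicates.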

So, how do you approach overhead URLs?

A compromise for offering a great user experience while helping search engines crawl complex websites is to make the overhead links undiscoverable for robots, while the links are still available to users, in the browser.
For example, a link that is important for users but not important for search engines can be created with uncrawlable JavaScript.

Before we look at some examples, keep the following in mind:

  • Decide whether there is an indexing or crawling issue to be addressed in the first place. Are 90%+ of your URLs indexed? If yes, then maybe you just need to build some links to the other 10% of pages. Or maybe you can add more content to get those 10% pages indexed.
  • Would you hinder user experience by blocking access to content with JavaScript?
  • Hiding links from robots may qualify as cloaking, depending on the reason for the implementation. Here’s a quote from Google:

“If the reason is for spamming malicious, or deceptive behavior—or even showing different content to users than to Googlebot—then this is high-risk”.[1]

Please note that from an SEO perspective, I advocate using uncrawlable links only for the following reasons:

  • To create better crawl paths for helping search engines reach important pages on your website.
  • To preserve crawl budget and other resources (e.g., bandwidth).
  • To avoid internal linking traps (i.e., infinite loops).

I do not endorse this tactic if you want to spam or to mislead unsuspecting visitors. There are a couple of methods for keeping crawlers away from overhead URLs.

iframes

Let’s say you do not want any links generated by the faceted navigation to be visible to search engines. In this scenario, you will embed the faceted navigation in an <iframe> and block the bots’ access to that iframe using robots.txt.

The advantage of using iframes is that it is fast to implement and remove if the results are not satisfactory. One disadvantage is that you cannot granularly control which facets can be indexed; once the iframe source is blocked with robots.txt, no facet will be crawled.

Figure 117 – This screenshot highlights a classic implementation of faceted navigation (left-hand navigation). This type of navigation often creates bot traps.

Intermediary directory/file

The directory implementation requires including a directory in the URL structure and then blocking that directory in robots.txt.

Let’s say that the original facet URL is:

/bedding?CatalogId=110504+423949610&CatRefId=105024&pagSort=0

Instead of linking to the URL above, you will link through an intermediary directory, which is then disallowed by robots.txt. The URL contains the /facets/ directory:

/facets/bedding?CatalogId=110504+423949610&CatRefId=105024&pagSort=0

Your robots.txt will disallow everything placed under this directory:

User-agent: *
# Do not crawl facet URLs
Disallow: /facets/
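
Before deploying a rule like this, it is worth checking that it blocks what you expect and nothing more. Here is a minimal Python sketch that uses the standard library's urllib.robotparser to test the two bedding URLs from this example against the rule above. The is_crawlable helper and the "Googlebot" user-agent string are my own illustrative choices, and the standard-library parser is only an approximation of how a real search engine reads robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rule from the example above.
ROBOTS_TXT = """\
User-agent: *
# Do not crawl facet URLs
Disallow: /facets/
"""


def is_crawlable(path, user_agent="Googlebot"):
    """Return True if the rules above allow user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


QUERY = "?CatalogId=110504+423949610&CatRefId=105024&pagSort=0"

# Facet link routed through the intermediary /facets/ directory.
print(is_crawlable("/facets/bedding" + QUERY))
# Original facet URL, which the rule does not cover.
print(is_crawlable("/bedding" + QUERY))
```

The first call should report the /facets/ URL as blocked and the second should report the original URL as still crawlable, which is exactly why the links on the page must point to the /facets/ version.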

Instead of a directory, you can also use a file in the URL. The controlled URL will include the facets.php file, which will be blocked in robots.txt.

If this was the original facet URL:

/bedding?CatalogId=110504+423949610&CatRefId=105024&pagSort=0

Using the robotted file, this is what the new URL will look like:

/bedding?facets.php&CatalogId=110504+423949610&CatRefId=105024&pagSort=0

User-agent: *
# Do not crawl faceted URLs
Disallow: *facets.php*
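
Wildcard rules are easy to get wrong, so here is a rough Python sketch for testing a pattern such as *facets.php* against sample URLs. It assumes the wildcard semantics documented by the major search engines (* matches any sequence of characters, and a trailing $ anchors the end of the URL), and it deliberately ignores Allow rules and rule precedence. The function names are hypothetical.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Simplified model: '*' matches any run of characters and a trailing
    '$' anchors the end of the URL. Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(url_path, disallow_patterns):
    """Return True if any Disallow pattern matches the path plus query."""
    return any(
        robots_pattern_to_regex(pattern).match(url_path)
        for pattern in disallow_patterns
    )


RULES = ["*facets.php*"]
QUERY = "CatalogId=110504+423949610&CatRefId=105024&pagSort=0"

print(is_disallowed("/bedding?facets.php&" + QUERY, RULES))  # controlled URL
print(is_disallowed("/bedding?" + QUERY, RULES))             # original URL
```

Only the URL that contains facets.php should match the rule. Always confirm the final behavior with the robots.txt testing tools offered by the search engines themselves.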

JavaScript and AJAX

Using JavaScript or AJAX is another method used to control access to internal links, to silo the website, and to avoid duplicate content issues, at the source. Search engines can already execute some JavaScript statements such as document.write(). They can also render AJAX to discover content and URLs but only to some extent,[2] and there are limitations to what they can understand. However, keep in mind that the major search engines evolve rapidly, and in a matter of months they might be able to execute complex JavaScript.

While most of the time SEOs want to make AJAX content more accessible to search engines, when you want to control internal links, you aim for the opposite. You will use JavaScript or AJAX to generate the links in the browser (client-side) rather than in the raw HTML. Depending on the implementation, those URLs may not be available to bots when they fetch the HTML and render the page.

One application of this method is to generate clean internal tracking URLs in the HTML, and add the user tracking parameters on demand, in the browser.

Let’s say you have three links on the homepage and all point to the same URL, but each link is in a different location on the page. The first link is in the primary navigation, the second one is on a product thumbnail image, and the last one is in the footer. Your Merchandising team wants to track where people clicked, and they ask the Analytics team to track the click locations. The Analytics and Dev teams will tag each of the three URLs with internal tracking parameters. The three tagged links may look like these:

mysite.com/watches/?trackingkey=hp-watches-primary_nav
mysite.com/watches/?trackingkey=hp-watches-body_image
mysite.com/watches/?trackingkey=hp-watches-footer

The trackingkey parameter in the first link communicates to the web analytics tool that the click came from the home page (which is indicated by the hp string), in the watches category located in the primary navigation (primary_nav). The other two URLs are similar except that the link location is different. When the Analytics team added these tracking parameters, they created three duplicate content pages, which is not desirable.

Of course, in this scenario, you can use rel="canonical" to point to a representative URL, or you can use the URL Parameter Tool in Google Search Console to consolidate to a canonical URL. However, for our purpose, we want a solution that avoids creating the duplicate URLs in the first place.

Here’s one way to use JavaScript to avoid generating duplicate content URLs, if you want to use internal tracking with parameters.

In the source code your anchor element will look similar to this:

<a href="http://www.mysite.com/watches/" param-string="trackingkey=hp-watches-primary_nav">Watches</a>

This URL is clean of parameters, which is great for SEO.

The page featuring this link includes JavaScript code that “listens” for clicks on the tracked link. When the left mouse button is pressed, the href is updated client-side by appending the content of the param-string attribute to the URL.

This is what the anchor will look like after the mousedown event:

<a href="http://www.mysite.com/watches/?trackingkey=hp-watches-primary_nav" param-string="trackingkey=hp-watches-primary_nav">Watches</a>

Now, the URL includes the internal tracking parameter trackingkey. However, the parameter was added in the browser; it was not present in the raw HTML code when the bot accessed the page (you can get the sample HTML and JavaScript code from here).

If you decide to create uncrawlable links with JavaScript, keep in mind that Google can identify links, and anything that looks like a link, even inside JavaScript. For instance, links generated by onclick events are among the most likely to be crawled. I have seen cases where Google requested and tried to crawl virtual page view URLs generated by Google Analytics.[3]

Also, note that using JavaScript to create undiscoverable links for bots can be a tricky web development task. Such links may also hinder the user experience for visitors who do not have JavaScript enabled. If your existing website does not work with JavaScript off anyway, you should be fine using AJAX links. However, if your website degrades gracefully for non-JS users, do not sacrifice user experience for SEO.

The user-agent delivery method

This approach is controversial because it delivers content based on the user-agent requesting the page. The principle behind it is simple: when there is a request for a URL, identify the user-agent making the request and check if it is a search engine bot or a real browser. If it is a browser, add internal tracking parameters to the URL; if it is a bot, deliver a clean URL.
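
To make the principle concrete, here is a minimal server-side sketch in Python. It is only an illustration under my own assumptions: the BOT_TOKENS list is a small, non-exhaustive sample, the function names are hypothetical, and the watches URL and trackingkey value are reused from the earlier example. It is not how any particular retailer implements the method.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: a short, illustrative list of substrings found in bot user-agents.
BOT_TOKENS = ("googlebot", "bingbot", "slurp")


def is_search_bot(user_agent):
    """Very rough bot detection based on the user-agent string alone."""
    lowered = user_agent.lower()
    return any(token in lowered for token in BOT_TOKENS)


def link_for(user_agent, clean_url, tracking):
    """Return the href to render: clean for bots, tagged for browsers."""
    if is_search_bot(user_agent):
        return clean_url
    parts = urlsplit(clean_url)
    # Append to an existing query string if there is one.
    query = f"{parts.query}&{tracking}" if parts.query else tracking
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )


BROWSER = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
BOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
URL = "http://www.mysite.com/watches/"
TRACKING = "trackingkey=hp-watches-primary_nav"

print(link_for(BROWSER, URL, TRACKING))
print(link_for(BOT, URL, TRACKING))
```

Note that the page content is identical in both cases; only the tracking parameter differs. User-agent strings can be spoofed, so a production setup would need a more robust way of identifying bots.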

Do you think this method is too close to cloaking? Let’s see how Amazon uses it to add internal tracking parameters to URLs on the fly, client-side. If you go to their Call of Duty: Ghosts – Xbox 360 page[4] while using your browser’s default user-agent and mouse over the Today’s Deals link, you will get a URL that contains the tracking parameter ref:

Figure 118 – The internal tracking parameter shows up in the URL.

Now, change the user-agent to Googlebot, reload the page, and mouse over the same link. This time the URL does not include the tracking parameter. To change the browser user-agent use one of the many browser extensions, available for free.

Figure 119 – When Googlebot is used as user-agent, the tracking parameter is not in the URL anymore.

Below is their HTML code. The top part of this screenshot shows the code when Googlebot requests the page. The bottom part depicts the code served to the default user-agent.

Figure 120 – This “white-hat cloaking” may be OK if you are only playing with URL parameters that do not change the content.

Assessing internal linking

The first step towards internal linking optimization is the diagnosis. Analyzing if pages are linked properly can reveal technical and website taxonomy issues.

One of the fastest and easiest ways to ascertain which pages are interlinked the most (and therefore deemed more important by search engines) is to use your Google Search Console account.

Figure 121 – The Internal Links report in Google Search Console.

Look at the Internal Links report, found under the Search Traffic section in Google Search Console. Are the most important pages for your business listed at the top? For ecommerce websites, those are usually the categories listed in the primary navigation.

In the image above, notice the /shop/checkout/cart/ directory. The URL is the second most linked page on the website. This makes sense from a user standpoint because this link must be present on most pages. However, the cart link is not important for search engines, so you can disallow the entire /checkout/ directory in robots.txt, to prevent everything under it from being crawled.

Figure 122 – The shopping cart link is the only link that is followed.

Next, let’s see how each page is linked in terms of anchor text. For this we will use one of the best, yet underestimated, on-page SEO crawlers and audit tools, the IIS SEO Toolkit.[5]

Figure 123 – The IIS SEO Toolkit, an indispensable on-demand desktop crawler.

You do not hear the SEO community talk much about this tool, maybe because it is Microsoft technology. However, its flexibility and extended functionality are better than Xenu (free) and at least on par with Screaming Frog (paid).

The IIS SEO Toolkit is free, which makes it a great tool to start with. Unfortunately, the development of the IIS SEO toolkit was stopped years ago, so it cannot really compete with the new crawlers.

Once you have identified and fixed all the problems reported by the IIS SEO Toolkit, you can consider upgrading to an enterprise tool such as:

  • Botify – undisclosed pricing
  • DeepCrawl – $89 USD/month (100k URLs per month)
  • OnCrawl – $69 USD/month (100k URLs per month)

These monthly costs are estimates based on the minimum number of URLs crawled per month as of Dec 2019 (prices might’ve changed meanwhile).

Figure 124 – There are virtually thousands of ways to analyze your website with IIS SEO toolkit, and you can slice and dice the SEO data in almost any way you can imagine.

Install the tool on your Windows machine and run your first crawl. It is simple to set up and, despite what the name implies, it does not require a dedicated IIS server. The toolkit uses the IIS component that ships with Windows, so you might need to activate that component first. Also, Windows 10 users will have to go through an additional fix to make it run.

Now, let’s see how you can use the toolkit to identify some major internal linking issues.

Finding broken links with this toolkit is a breeze, just like it should be with any decent crawler. You can find the broken links report under the Violations section or the Content section.

Figure 125 – You can find broken links using the Violations or Content reports.

You already know that broken links are an issue that needs attention because they hinder user experience as well. So, use the toolkit to identify and take care of them.

The general SEO wisdom is that any page should be accessible in as few clicks as possible. Four or five levels are acceptable for users and bots, but anything deeper becomes problematic.

Figure 126 – The Link Depth report can uncover issues such as circular referencing, malformed URLs or infinite spaces.

Having URLs buried 24 levels deep, as depicted in this screenshot, suggests that there is a problem with the internal linking. In this example, the issue stemmed from malformed URLs that were creating circular references.
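
If you export the crawl data, you can also compute click depth yourself. The following Python sketch runs a breadth-first search from the home page over a small, made-up link graph; the URLs, the click_depths and pages_deeper_than names, and the depth limit are all hypothetical. Because each page is visited only once, circular references do not send the traversal into an infinite loop.

```python
from collections import deque


def click_depths(links, start="/"):
    """Return the minimum number of clicks from start to every reachable page.

    links maps each page URL to the list of URLs it links to.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # skip pages already reached
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_deeper_than(depths, limit):
    """List the pages buried more than limit clicks from the start page."""
    return sorted(url for url, depth in depths.items() if depth > limit)


# Hypothetical crawl export: page -> outgoing internal links.
SITE = {
    "/": ["/beds/", "/sofas/"],
    "/beds/": ["/", "/beds/metal/"],
    "/beds/metal/": ["/beds/", "/beds/metal/?page=2"],
    "/beds/metal/?page=2": ["/beds/metal/?page=3"],
    "/beds/metal/?page=3": [],
}

depths = click_depths(SITE)
print(depths)
print(pages_deeper_than(depths, limit=3))
```

On a real website you would use a limit of four or five, in line with the guideline above, and then investigate why the reported pages are so deep.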

Use the Pages with Most Links report in the Links section of the tool to identify the number of outgoing links from each page on your website. Sort the data by Count to get a quick idea of where the problems are.

Figure 127 – Including 936 links on a page is a bit concerning.

In most cases, the number of links on the same page template should be similar. However, in this case, there is a quick jump from about 300 to around 900 links for the same template, the product detail page.

When you find such big differences, check the pages that seem off the charts. Investigate why there are so many links, compared to the other pages on the same template.

The Pages with Most Links report is also available in the Violations section:

Figure 128 – Check the Violations report to identify problematic pages.

Identify hubs

The Most Linked Pages report can help you identify internal hubs. In terms of website taxonomy and internal linking, a hub is a parent with lots of children linking back to it.

Usually, the largest link hubs on ecommerce websites are the home page and the category pages linked from the primary navigation. If you see other pages at the top, you might have internal linking issues. The number of products under a certain category also influences how many links a category gets.

Figure 129 – The Most Linked Page report is very similar to the Internal Links report in Google Search Console.

The numbers in the previous image highlight three issues:

  • The most linked page does not have a <title> tag.
  • The shopping cart URL seems to be getting too many internal links. Because the shopping cart URL is dynamic, bots will try to access it from multiple pages, which is not ideal.
  • A significant number of internal links point to 301 redirects. This suggests that somewhere in the primary navigation there is a link to a 301 redirect. Whenever possible, link directly to the final URL.

To dig more deeply and get additional details on how each URL is linked, right-click the URL you would like to analyze and then click on View Group Details in New Query.

Figure 130 – Finding out how each URL is linked.

Then click on Add/Remove Columns and add the Link Text column. Then click on Execute, at the top left, to update the report.

Figure 131 – You can remove/add columns to your reports.

Regarding section one in the screenshot above, if a page is linked using an image link, the IIS SEO toolkit does not report the alt text of the image. This is one of the few downsides of the toolkit.

In section two I highlighted a mismatch between the anchor text and the linked page. The highlighted page is linked using the “customer service” anchor text, which is wrong because the linked page is not the customer service page.

Look for these kinds of mismatches in your analysis.

Next, let’s aggregate anchor text:

  1. Click on Group by.
  2. Select Link Text in the Group by tab and hit Execute.
  3. You will get a count of each anchor text pointing to that particular URL.
  4. To analyze a different URL, simply change the value in the Linked-URL field.

Figure 132 – If you find that a page is linked with too many varying anchor texts, you need to evaluate how close the anchor texts are semantically and taxonomically.
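
The same grouping can be reproduced outside the toolkit once you have a list of internal links. Below is a small Python sketch that counts the anchor texts pointing to one URL. The sample rows and the (linking URL, linked URL, link text) layout are assumptions made for illustration; they are not the toolkit's actual export format.

```python
from collections import Counter

# Hypothetical link data: (linking URL, linked URL, link text).
LINKS = [
    ("/", "/office-furniture/", "Office Furniture"),
    ("/desks/", "/office-furniture/", "office furniture"),
    ("/blog/home-office/", "/office-furniture/", "furniture for the office"),
    ("/", "/help/", "customer service"),
]


def anchor_text_counts(rows, linked_url):
    """Count how often each anchor text is used to link to linked_url.

    Anchor texts are lower-cased and trimmed so that trivial variations
    are counted together.
    """
    return Counter(
        text.strip().lower()
        for _source, target, text in rows
        if target == linked_url
    )


counts = anchor_text_counts(LINKS, "/office-furniture/")
for text, count in counts.most_common():
    print(count, text)
```

To analyze a different page, pass another URL as the second argument, just as you would change the Linked-URL value in the toolkit.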

Ideally, you consistently link to category pages using the category name, but a few variations in the anchor text are acceptable. For example, you can link to the Office Furniture category with the anchor text “office furniture”, but you can also link using “furniture for the office”. When you link to product detail pages (PDPs) from product listing pages (PLPs), use the product name as the anchor text. If you link to PDPs from blogs, user guides, or other content-rich pages, you can vary the anchor text. The anchor text can include details such as product attributes, brands, and manufacturers.

To get an overall picture of the site-wide anchor text distribution, you need to create custom reports, which are called “queries” in the IIS SEO Toolkit. This is where the enormous flexibility of the tool comes in handy.

To create a custom report, go to the Dashboard, click on the Query drop-down, and select the New Link Query:

Figure 133 – Adding a New Link Query.

  1. In the new tab (Links) select the field name values as depicted in the image above.
  2. In the Group By tab select Link Text from the drop-down.
  3. Click Execute.

Figure 134 – There you have the internal anchor text distribution for the entire website.

In the example above, notice a couple of things that need to be investigated further:

  • First, why does the most linked page have no anchor text?
  • Second, how about blocking bots’ access to the shopping cart link?

If you want to look at a fancy visualization of your hub pages, use the Export function of the IIS SEO Toolkit to generate the list of all URLs. Then, import that file into a data visualization tool.

Figure 135 – Sample internal linking graph.

The image above is a visualization example generated with Gephi. You can find tutorials on how to generate link graphs using Google’s Fusion Tables,[6] NodeXL[7] and Gephi[8].

Problematic redirects

Using the Redirects report will help identify internal PageRank leaks, unnecessary 302 or 301 redirects, and undesirable header response codes. To make the analysis easier and see the issues grouped by pages, you can sort by Linking URL.

Figure 136 – Sort by Linking-StatusCode to identify issues.

Regarding the two notes in this screenshot:

  1. The currency selection is kept in the URL rather than in a cookie. For this website, each currency selection generated a unique URL on almost every single page, which is bad.
  2. Instead of linking to a URL that returns a 301 (Moved Permanently), link directly to the destination.
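
Once you know which URLs redirect, finding the links that should be updated is mechanical. Here is a minimal Python sketch that follows redirect chains in a lookup table and reports the final destination for each affected link. The redirect map, the link list, and the function names are hypothetical examples, not output from the toolkit.

```python
def final_destination(url, redirects, max_hops=10):
    """Follow redirects (a dict of source -> target) until a URL stops redirecting."""
    seen = {url}
    for _ in range(max_hops):
        if url not in redirects:
            return url
        url = redirects[url]
        if url in seen:
            raise ValueError(f"redirect loop at {url}")
        seen.add(url)
    raise ValueError("too many redirect hops")


def links_to_update(links, redirects):
    """Return (linking URL, current target, final target) for links hitting a redirect."""
    return [
        (source, target, final_destination(target, redirects))
        for source, target in links
        if target in redirects
    ]


# Hypothetical data: which URLs return a 301 and where they point.
REDIRECTS = {"/watches": "/watches/", "/sale/watches/": "/watches"}
LINKS = [("/", "/watches"), ("/", "/sale/watches/"), ("/", "/belts/")]

for source, old, new in links_to_update(LINKS, REDIRECTS):
    print(f"On {source}: replace {old} with {new}")
```

The second link in the sample goes through two hops, which is the kind of chain that is easy to miss when you review redirects one at a time.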

Figure 137 – The unnecessary redirects are also available in the Violations report.

Wrong URLs blocked by robots.txt

The toolkit can also help you identify URLs robotted by mistake. Use the Links Blocked by robots.txt report to find such URLs.

Figure 138 – The Help.aspx page is blocked with robots.txt

Do not block bot access to help pages (or similar pages, e.g., FAQs or Q&As). You want people who have questions about your products or services to be able to find such pages, straight from a search engine query. The content on these pages has the potential to reduce calls to customer service.

Because the help page is located under the /Common/ directory, which is blocked with robots.txt, search engines will not be able to access it, and the help page will not be indexed.

Figure 139 – All pages under the /Common directory will be blocked.

In the Links Blocked by robots.txt report look for pages and URLs that should be indexed but are blocked by mistake.

Protocols

The Protocols report displays the various protocols used to link internally to resources on the website:

Figure 140 – Do you interlink HTTPS with HTTP pages?

If your website uses both non-secure HTTP and secure HTTPS protocols, what happens when visitors go back and forth between HTTP and HTTPS pages? Do they get warning messages in the browser? Do you link to the same URL with secure and non-secure protocols?

We know that shopping carts, logins, and checkout pages should be secure and such pages do not need to be indexed by search engines. However, it is best to switch everything to HTTPS. Keep in mind that when you switch from non-secure HTTP to secure HTTPS, there might be a temporary drop in traffic.

Other issues

Here are some other common internal linking mistakes:

  • Inconsistent linking; this happens when you link to the same page with multiple URL variations, for example, linking to the home page with URLs such as mysite.com and www.mysite.com. When you link internally, be consistent—link to one consolidated URL only.
  • Default page dispersal; this is when you link to index files rather than to root directories. For example, many webmasters link to index.php when linking to home pages. Instead, you have to link to the root directory, which is just the slash sign, /.
  • Case sensitivity that leads to 404 Not Found errors. For instance, URL paths on Apache servers are typically case sensitive, so if you link to the URL Product-name.html with an upper-case “P” when the file name uses a lower-case “p”, the server may return an error.
  • Mixed URL paths; this happens when you link to the same file using both absolute and relative paths. This is not an SEO issue per se; however, adopting standardized URL referencing helps with troubleshooting web dev issues. Also, if you use absolute paths, when content scrapers steal content, they may still leave the absolute links to your URLs.
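
The first two mistakes in this list, inconsistent linking and default page dispersal, can be spotted by reducing every internal link to one preferred form and listing the targets that are linked in more than one way. The Python sketch below does that under several assumptions of mine: the preferred host is www.mysite.com over HTTPS, the default documents are index.php and index.html, and the helper names are hypothetical. It leaves letter case untouched, because case can matter on the server.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only.
INTERNAL_HOSTS = {"mysite.com", "www.mysite.com"}
PREFERRED_HOST = "www.mysite.com"
DEFAULT_DOCUMENTS = ("index.php", "index.html")


def preferred_url(url):
    """Map an internal URL to its single preferred form; leave external URLs as-is."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host and host not in INTERNAL_HOSTS:
        return url
    path = parts.path or "/"
    for document in DEFAULT_DOCUMENTS:
        if path.endswith("/" + document):
            path = path[: -len(document)]  # link to the directory instead
            break
    return urlunsplit(("https", PREFERRED_HOST, path, parts.query, ""))


def inconsistent_links(urls):
    """Return preferred URL -> set of variants, for targets linked in several ways."""
    variants = {}
    for url in urls:
        variants.setdefault(preferred_url(url), set()).add(url)
    return {target: found for target, found in variants.items() if len(found) > 1}


FOUND_LINKS = [
    "http://mysite.com/",
    "https://www.mysite.com/index.php",
    "https://www.mysite.com/",
    "https://www.mysite.com/watches/",
]

for target, found in inconsistent_links(FOUND_LINKS).items():
    print(target, "is linked as", sorted(found))
```

In this sample, the home page is linked in three different ways, while the watches category is linked consistently and is therefore not reported.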

When you assess your competitors’ internal linking from an SEO perspective, compare the source code generated with Googlebot used as user-agent and with JavaScript disabled, with the source code generated when you use the browser’s default user-agent. Are there any internal linking differences?

You should also analyze the internal linking differences between the cached version and the live page.
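
Comparing two versions of the same page by eye is tedious, so here is a small Python sketch that extracts the href values from two HTML sources and reports the differences. It assumes you have already saved both versions (for example, one fetched with the default user-agent and one fetched as Googlebot); the sample markup, the ref=example parameter, and the function names are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.add(href)


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.hrefs


def link_differences(browser_html, bot_html):
    """Return the links that appear in only one of the two versions."""
    browser_links = extract_links(browser_html)
    bot_links = extract_links(bot_html)
    return {
        "only_for_browsers": browser_links - bot_links,
        "only_for_bots": bot_links - browser_links,
    }


# Hypothetical page sources saved from two requests.
BROWSER_HTML = '<a href="/deals?ref=example">Deals</a> <a href="/cart">Cart</a>'
BOT_HTML = '<a href="/deals">Deals</a> <a href="/cart">Cart</a>'

print(link_differences(BROWSER_HTML, BOT_HTML))
```

This only inspects links present in the HTML that was served. Links injected later by JavaScript would require a rendered copy of the page instead.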

Nofollow on internal links

The nofollow microformat[9] is a Robot Exclusion Protocol that applies at the element level, and it prevents PageRank and anchor text signals from being passed on to the linked pages. The HTML element that nofollow applies to is the A element.

Some SEOs use the nofollow attribute believing it will prevent the indexation of the linked-to URL. Often, we find statements similar to “nofollow the admin, account, and checkout URLs to prevent these pages from being indexed”.

Such statements are not accurate because nofollow prevents neither crawling nor indexing.

Figure 141 – Interpretation of nofollow by the individual search engine, according to Wikipedia.

Matt Cutts, who worked as the head of the Webspam team at Google, said that Google does not crawl nofollow links. Here are his words:

“At least for Google, we have taken a very clear stance that those links are not even used for discovery”.[10]

However, Google’s Content Guidelines documentation states something different:

“How does Google handle nofollowed links? In general, we do not follow them”.[11]

Notice the “in general” mention, in the statement above.

A test I performed some time ago with internal nofollow site-wide footer links showed that although it took about a month, Googlebot, msnbot, and bingbot did crawl and index the nofollow links. Yahoo! Slurp was the only bot that didn’t request the resource.

My recommendation is to use nofollow not as a method to keep search engines away from content, but only as a hint that discourages crawling. Keep in mind that if you nofollow links that search engines previously discovered, those links may still be indexed. Also, if external links are pointing to nofollow URLs, those URLs will get indexed.

Figure 142 – You will often see the nofollow attribute applied to links such as shopping carts, checkout buttons, and account logins.

A few years ago, nofollow was used to funnel PageRank to important pages, a tactic named “PageRank sculpting”. However, nowadays, the vast majority of SEOs know that PageRank sculpting with nofollow no longer pays off[12], and many ecommerce websites stopped nofollow-ing internal links.

However, some continue doing it, as you can see in this screencap:

Figure 143 – Instead of nofollow links like the ones in the image above, a better approach is to consolidate links into a single page.

Consolidating links is a good approach because when you nofollow a site-wide URL like “Terms of Use”, you take that page out of the internal links graph completely.[13] This means that the page will not receive internal PageRank, but it also means it will not have internal PageRank to pass.

The previous example brings us to a more important issue: nofollow-ing links in primary or secondary navigation. Depending on what links you nofollow, you could be making a big mistake.

It is important to know that PageRank is a renewable resource, which means that it flows back and forth between pages that link to one another. According to the original formula, the PageRank metric uses a decay factor (AKA damping factor) between 10% and 15% at each iteration, to avoid infinite loops[14].

Let’s say that page A is the home page and it links to category pages B and C, from the primary navigation menu. To simplify, let’s assume that pages B and C do not have any external links pointing to them.

Figure 144 – An overly simplified PageRank flow.

The most important thing to understand from this diagram is that page B and page C each return PageRank to page A, which increases the PageRank for page A.

Let’s see what happens when you add rel="nofollow" to the link pointing to Page C in the primary navigation:

Figure 145 – The nofollow attribute stops sending PageRank to page C.

When the nofollow is applied, page C stops sending internal PageRank back to page A, because Page C does not receive any internal PageRank to pass.
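
To see the effect in numbers, here is a toy PageRank calculation in Python for the three pages A, B, and C. It is a sketch under explicit assumptions, not Google's actual algorithm: it uses a common normalized variant of the original formula with a damping factor of 0.85 (that is, a 15% decay), and it assumes that the share of PageRank assigned to a nofollow link is simply lost rather than redistributed to the other links on the page.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iteratively compute a simplified PageRank.

    links maps each page to a list of (target, followed) tuples.
    A nofollow link still counts toward the number of outgoing links,
    but its share is not passed on (assumption stated above).
    """
    pages = list(links)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, outlinks in links.items():
            for target, followed in outlinks:
                if followed:
                    new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank


# Home page A links to categories B and C; both link back to A.
ALL_FOLLOWED = {
    "A": [("B", True), ("C", True)],
    "B": [("A", True)],
    "C": [("A", True)],
}
# Same structure, but the A -> C link carries rel="nofollow".
NOFOLLOW_C = {
    "A": [("B", True), ("C", False)],
    "B": [("A", True)],
    "C": [("A", True)],
}

before = pagerank(ALL_FOLLOWED)
after = pagerank(NOFOLLOW_C)
for page in "ABC":
    print(page, round(before[page], 3), "->", round(after[page], 3))
```

In this toy model, page C drops to the baseline value, page A loses the PageRank that C used to send back, and page B does not gain anything from the change.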

When I researched examples for this topic, big names such as Toyota surprised me by using nofollow in the global navigation. You can see in the screenshot below how Toyota nofollow-ed all the links pointing to car models such as Yaris and Corolla.

Figure 146 – The links in the red dotted border are nofollow.

Note: at the time of the research PageRank was still publicly available. Back then, Toyota’s home page had a PageRank 7, and the Yaris page (which is a nofollow link in the primary navigation) had a PageRank 5. The PageRank 5 was mostly due to a large number of external links rather than to the internal linking flow.

Figure 147 – The Yaris page gets a lot of external backlinks from more than 2,500 domains.

However, the situation was different on jtv.com. This time, the categories linked from the primary navigation did not get many backlinks from external sources. While their home page had a PageRank 5, all Shop by Type pages had a “not ranked” PageRank.

Figure 148 – Because the Shop by Type pages were linked from the primary navigation, they should’ve got a decent amount of authority (e.g., at least PageRank 3).

The nofollow attribute on primary navigation links does not mean that those pages will not show in SERPs. As a matter of fact, they were all cached by search engines, at that time. Also, using nofollow on those links does not mean that more PageRank was passed to other follow links. What it really meant is that the authority of the nofollow-ed URLs in the primary navigation was significantly reduced.

If some links are not important for users, consider removing them from the navigation altogether. Not every category needs a link in the primary navigation menu.

If you want to send link juice only to specific links or pages, here are some alternatives to nofollow:

  • Have fewer links on the linking page.
  • Move important links to prominent places.
  • If you do not want to pass link juice to certain links, make them undiscoverable for bots.
  • Block search engine bots from discovering overhead links.

Keep in mind that nofollow is not a solution for duplicate content.

Because nofollow is incorrectly used to prevent indexing, it may also be incorrectly used to prevent duplicate content issues. However, adding nofollow to links is not the best approach for controlling duplicate content. Since nofollow is not 100% crawling and indexing fail-proof, how can it be used to prevent indexation or duplicate content?

Internal linking optimization

Users navigate from one page to another by clicking on links. That is one of the core principles of the Internet, and it hasn’t changed since the Web’s inception. However, while links are simply a way for people to navigate within a website or between websites, search engines will use links as authority and relevance signals.

For search engines, though, all links are not created equal. Some links are assigned more weight based on various criteria. For example, links surrounded by text are considered more important than links in footers, as Google states in the video[15].

Links surrounded by text are called contextual text links, while links used to structure a website (for example, links in the primary and secondary navigation or breadcrumbs) are called structural or hierarchical links.

One of the reasons contextual text links receive more search engine weight is related to the fact that users often ignore structural links to go straight to the content,[16] and because users rarely scroll to click on footer links. This is why search engines deem contextual text links more important than some structural links such as footer links.

Figure 149 – Structural links in several types of navigation such as primary, secondary, faceted navigation. The contextual text links are present in the main content area.

Large websites such as ecommerce ones have the advantage of generating an incredible number of internal links; however, most of those are structural links that do not carry the same power as contextual text links. Moreover, in some cases Google might even ignore boilerplate or structural links:

“We found that boilerplate links with duplicated anchor text are not as relevant, so we are putting less emphasis on these”.[17]

There are several ways to optimize internal linking, and there is no excuse for you not to capitalize on SEO opportunities that are under your direct control.

Theoretically, a large number of factors could influence the value of an internal link,[18] but we are going to limit ourselves to the following:

  • The position of the link in the page layout.
  • The type of link, e.g., contextual versus structural link.
  • The text used in the anchor.
  • The type of link, as in image link versus text link. An image’s alt text seems to pass less ranking value than text links, as it has been reported in this article[19].
  • The page authority and the number of outbound links on the page.

The position of the link in the page layout (e.g., in the primary navigation, in the footer, or the sidebar) influences how much PageRank flows out to the linked-to page.[20]

Microsoft has the VIPS patent (VIPS stands for A Vision-based Page Segmentation Algorithm[21]), which talks about breaking down page layouts into logical sections. Microsoft has another paper that talks about Block-Level PageRank, which suggests that PageRank passed out to other pages is dependent on the location of the link on the page.[22]

Google has a patent on “Document ranking based on semantic distance between terms in a document”[23] and another patent called “Reasonable Surfer”[24]. These two patents indicate that links placed in prominent places pass more PageRank than links in less important sections of the page.

Contextual text links are assigned more weight than primary and secondary navigation links, which in turn are deemed more important than footer links. However, the presence of the keyword rich anchor text in the primary navigation (which is present on almost every single page of the website) compensates for relevance. Therefore, primary navigation links are at least as powerful as contextual text links.

Unfortunately, you can have only a limited number of anchors in the primary or secondary navigation, which means you have to choose carefully. However, with contextual links, you can implement a large number and variety of anchors because you are not limited by design space or by strict anchor text labeling. For example, you may be restricted to using the anchor text “hotels” in your structural navigation, but on content-rich pages, you can use contextual text links such as “5-star hotels in San Francisco” or “San Francisco’s best 5-star hotels”.

Related to the link position, it is worth mentioning the concept of the First Link Rule. This rule says that when multiple links on the same page point to the same URL, only the first anchor text matters to search engines.[25]

Figure 150 – Each of these URL pairs points to the same URL twice, but with not so optimal anchor text

Regarding the first pair, linking back to the home page with the anchor text “home” may confuse search engines. This is because the anchor text “home” conflicts with the anchor text “home & garden products”.

Regarding the second pair, Children’s Bedroom Furniture should be a category page on its own, at a separate URL.

For the third pair, the “Decorating with Metal Beds” link points to a shopping guide, which is great. However, the link using the anchor text “modern metal beds” should point to a category page (if keyword research unveils that “modern metal beds” is an important category). For example, the link could point to the Metal Beds category page, filtered by the Style=modern.

If you want to make search engines count multiple anchor texts,[26] one of the best options is to add the hash sign (#) at the end of the URLs[27].

So, if your first link is mysite.com/metal-bed-guide.htm, then all subsequent links will read like mysite.com/metal-bed-guide.htm#value

However, if you link to the same URL with varied anchor text, you do not need to use the hash in the URL. For example, you can link to the same product page once with the product name and the second time using the product name plus the manufacturer name. Just make sure that the varied anchor text is related and relevant to the linked-to page.
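
To find out where the First Link Rule comes into play on a page, you can list every URL that is linked more than once, together with the anchor texts in the order they appear. Here is a minimal Python sketch that does this with the standard library's HTML parser. The sample markup and the names AnchorCollector and anchors_by_url are hypothetical, and treating an image's alt text as the anchor text of an image link is my simplification.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None  # href of the <a> element currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self._href = attributes["href"]
            self._text = []
        elif tag == "img" and self._href is not None:
            # For image links, use the alt text as the anchor text.
            self._text.append(attributes.get("alt") or "")

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.anchors.append((self._href, text))
            self._href = None


def anchors_by_url(html):
    """Return href -> list of anchor texts, in the order they appear on the page."""
    collector = AnchorCollector()
    collector.feed(html)
    grouped = {}
    for href, text in collector.anchors:
        grouped.setdefault(href, []).append(text)
    return grouped


PAGE = """
<a href="/"><img src="logo.png" alt="UGG Australia"></a>
<ul><li><a href="/">Home</a></li><li><a href="/boots/">Boots</a></li></ul>
"""

for url, texts in anchors_by_url(PAGE).items():
    if len(texts) > 1:
        print(url, "first anchor:", repr(texts[0]), "later anchors:", texts[1:])
```

For each URL reported, check whether the first anchor text is the one you would want search engines to count.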

Very often you will encounter multiple links pointing to the home page: once on the logo and once in the breadcrumbs.

Figure 151 – Both links (logo and breadcrumb) point to the homepage.

The alt text of the logo is “UGG Australia,” and the anchor text in the breadcrumb is “Home”. While having a “Home” link is good for usability, I am not a big fan of the “home” anchor text. I would either:

  • Use the brand name in the breadcrumb, because in this particular case the brand name (UGG) is very short. Instead of “Home”, I would use “UGG Australia” or just “UGG”.
  • Replace the anchor text “Home” in the breadcrumb with a small house icon and use the alt text “UGG Australia” for that icon.

Multiple same-page linking happens when a page contains multiple links to the same URL. On ecommerce websites, this frequently arises when links on product listing pages point to product detail page URLs. One link is on the clickable image thumbnail, and the other link is on the product name:

Figure 152 – Multiple links to the same product details page.

Figure 153 – The HTML code for the previous image.

If we look at the source code for the previous example we will find that the alt text of the thumbnail image is “black”, and the product anchor text is “Solid Ribbon Belt”. This sends confusing relevance signals and is not optimal.

Additionally,

  1. The <A> element has an alt attribute, but it is in the wrong place because the alt attribute is not allowed on the <A> element. This alt attribute was probably intended to be a title attribute.
  2. The alt texts #1 and #2 should be switched.
  3. The alt attribute on the A tag (#1) should be removed.

Figure 154 – This is the product name text link.

Let’s talk about several options for addressing multiple links generated by image thumbnails and product names:

  • Repeat the product name text in the image alt text. This is the easiest way to tackle this particular type of issue. In our example, the thumbnail’s alt text will become “Solid Ribbon Belt”.
  • Wrap the image and text under a single anchor or a single link. This is not always possible, and it is not good for accessibility.
  • Deploy the URL hash if you need to use unrelated anchor text to point to the same URL.
  • Place the product name above the image (this is against usability and design conventions).
  • Code the page so that the text link is above the image link in the HTML code, while in the browser you will use CSS to display the anchor text below the image. This is a bit complex to implement, and it is not a very good idea.
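Before picking one of these options, it helps to know which listing pages have the problem. The sketch below is one possible way to find it, using only Python's standard html.parser module: it collects the anchor text (or image alt text) of every link and reports URLs that are linked more than once with different text. The class and function names and the sample markup are hypothetical, and the code assumes simple, well-formed links.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect, per href, the visible text and image alt text used as anchors."""

    def __init__(self):
        super().__init__()
        self.anchors = defaultdict(list)  # href -> list of anchor strings
        self._href = None
        self._parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self._href = attrs["href"]
            self._parts = []
        elif tag == "img" and self._href is not None:
            # For an image link, the alt text acts as the anchor text.
            self._parts.append(attrs.get("alt") or "")

    def handle_data(self, data):
        if self._href is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join(" ".join(self._parts).split())
            self.anchors[self._href].append(text)
            self._href = None


def mismatched_anchors(html):
    """Return {href: anchors} for URLs linked more than once with differing text."""
    parser = LinkTextCollector()
    parser.feed(html)
    return {
        href: texts
        for href, texts in parser.anchors.items()
        if len(texts) > 1 and len({t.lower() for t in texts}) > 1
    }


# Hypothetical listing-page markup modeled on the belt example.
sample = """
<a href="/solid-ribbon-belt"><img src="belt.jpg" alt="black"></a>
<a href="/solid-ribbon-belt">Solid Ribbon Belt</a>
<a href="/leather-belt"><img src="l.jpg" alt="Leather Belt"></a>
<a href="/leather-belt">Leather Belt</a>
"""

print(mismatched_anchors(sample))
```

Only the first product is reported, because its thumbnail alt text (“black”) differs from the product name link. The second product already repeats the product name in the alt text, which is the first option in the list above.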

In the case of multiple same page links, if the anchor texts are unrelated, they will send confusing relevance signals. However, PageRank will pass through both links.[28]

Now that you know that contextual text links are important, let’s see how you can create more of them with the help of user-generated content, product descriptions, brand pages, and blog posts.

User-generated content
User-generated content (UGC) is one of the best ways to feed search engine bots, to send engagement signals, and to help users make purchasing decisions.

Figure 155 – The highlighted texts are potential internal links.

In this screenshot, you can see two typical reviews displayed on a product detail page. Reviews can add to the overall main content text and help with conversions. I highlighted in yellow a couple of words that could be potential internal links.

Product reviews
Product reviews are one type of content-heavy user-generated content and represent a huge opportunity for generating contextual links. However, not many ecommerce websites are taking full advantage of product reviews for internal linking purposes.

While researching this topic, I was surprised to find that only one of the top 50 online retailers was adding contextual links on user reviews content. For whatever reason (maybe poor SEO implementation, vendor restrictions, fear of linking out from product detail pages and losing conversions, and so on) the other retailers did not. In fact, very few of the top 50 online retailers deployed SEO-friendly reviews. We will discuss how to optimize reviews in detail, in the section dedicated to product detail pages.

Figure 156 – This is a very popular product with more than one thousand reviews.

In the example above, the product has 1,221 reviews. If you were to add just one contextual link on 10% of the reviews, you would create about 120 powerful contextual internal links.

Product descriptions
Many ecommerce pages contain text-rich sections. Take product detail pages for example; each product has or should have a description. These content-rich sections are great places to link up to parent categories and brand pages:

Figure 157 – The highlighted text could be a link to a brand page (Bobeau).

When you link from product descriptions, it is important to link to the parent category and, optionally, to other highly related categories.

Optimized brand pages

Figure 158 – Most of the time brand pages are nothing but a product listing page.

There is nothing wrong with listing products on a brand page, but you must make brand pages content-rich as well.

If you want to build relevant and valuable contextual text links that send more PageRank authority to product or category pages, then brand pages must include text, media, and social signals. Be creative with the content, and link smartly. Add a paragraph about the brand’s history, and link to the brand’s top sellers. Alternatively, you can add interesting facts about the brand and a couple of useful reviews. Get the brand owners interviewed and publish the interview on their brand page. You can then ask for a link or a mention from their Press or News section.

Look at how Zappos improved the internal linking on their brand pages, and how they carefully interlink thematically related pages:[29]

Zappos’ brand page does a good job at satisfying users and search engines:

  • Zappos uses section 1 as a sitemap, to guide bots to various other related pages on their website.
  • In section 2, it implements brand-specific RSS feeds. When there are new products published for that brand, search engines will be instantly notified.
  • In section 3, you can see how they use text-rich content for contextual linking.
  • In section 4, they link to the brand’s featured products.
  • In section 5, Zappos features contextual links within user reviews.

Blogging
As mentioned in the Information Architecture section, blogs can be used to support and increase authority for the category and product pages, but very few ecommerce websites take full advantage of blogging.

Figure 159 – Contextual text links from the main content area carry significant authority. Make sure your content-rich pages link internally to PDPs and PLPs.

When you write blogs, link internally from the main content areas to pages on your website, ideally to product and category pages.

Figure 160 – This is a good implementation of internal linking from blog posts. If you do not overdo it, internal exact anchor text match is still important.

At the risk of becoming annoying, I need to stress this: if you are not blogging, you are missing a huge amount of long-tail search queries used by possible customers in the early buying stages.

Remember, you write articles not to sell or promote something but to grab long-tail traffic for informational search queries, and to support pages higher in the hierarchy. The amount of content you need to create to support category, subcategory or products depends on how competitive each keyword is.

Other types of user-generated content that you can use to create contextual links are blog comments, user or customer support questions and answers, guest posts, product images with captions, user-submitted images, curated rich media, and even shop-able images.

Anchor text

The anchor text optimization principle is simple: the text used in the anchor sends relevance clues to search engines, and it must be relevant to the linked-to page. For example, if the anchor text is “suitcases” and the linked-to page includes the phrase “suitcases” along with other semantically related words, then the anchor text in the incoming link is given more weight.

However, if you were to use “click here” on internal anchor text pointing to, let’s say hotel description pages, then search engines will assign less relevance to those anchors, as they are too generic and don’t communicate anything about the linked-to page. In our previous example, when linking internally to hotel description pages, you should use the hotel names in the anchor text.

The following study has been conducted on more than 3,000 ecommerce and non-ecommerce websites, and it analyzed more than 280,000 internal links and their corresponding anchor text[30]. The study looked for the most common words used in the internal anchor text. This screenshot shows those terms ranked by frequency.

Figure 161 – The study looked at how 3,000 websites use anchor text in internal linking.

Seven out of 10 anchor text links could be logically consolidated into three groups, represented by the numbers in the image. This technique is called link consolidation, and it is a better alternative to link sculpting with nofollow. Keep in mind that if the links you consolidate are in the footer, then the value of doing this is minimal.

Let’s see what anchor texts you use to link pages internally.

First, let’s find out whether you use generic anchor texts such as “click here” or “here” on your website. After you run the crawl on your website, use the IIS SEO Toolkit to check whether the “The link text is not relevant” violation is reported under the Violations Summary section of the tool:

Figure 162 – If you double-click any of the violation titles in the Violations Summary section you will get more details about each error.

There are situations where it is OK to use “click here” as anchor text, for example when you link to a page that is not important for rankings, or when you use “click here” as a call to action. As a matter of fact, “click here” is one of the most powerful calls to action used in online marketing.

By default, the IIS SEO Toolkit searches for the words “here” and “click here” in text anchors. In practice, there are more generic anchors that you should pay attention to. A more comprehensive list of generic anchors is available here.

If you want to be exhaustive with this type of analysis, you need to export the list of anchors from the IIS SEO Toolkit and use Excel for a deeper analysis. Here’s how to do it.

Figure 163 – Create a new link query.

In the IIS SEO Toolkit go to Dashboard, then click on the Query drop-down, then click on New Link Query.

Figure 164 – Use the settings depicted in section (1) and (2).

Use the following settings in section (1):

  • Linked Is External Equals False.
  • Link Type Not Equal Style.
  • Link Type Not Equal Script.
  • Link Type Not Equal Image.

In the Group By section, select Link Text. Then hit Execute, sort by Count, and then click Export. This will generate the aggregated link text report. You can export the data to a .csv file.
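If you work from a crawl export rather than the toolkit UI, the same aggregation can be sketched in a few lines of Python. This is only an illustration of the query settings above. The dictionary keys ("text", "type", "is_external") and the "Hyperlink" type value are assumptions about how your crawl data is shaped, not the toolkit's actual export format.

```python
from collections import Counter


def link_text_report(links):
    """Count internal text-link anchors, most frequent first.

    Each link is a dict with the hypothetical keys
    "text", "type", and "is_external".
    """
    excluded_types = {"Style", "Script", "Image"}
    counts = Counter(
        link["text"].strip()
        for link in links
        if not link["is_external"] and link["type"] not in excluded_types
    )
    return counts.most_common()


# Made-up crawl rows, only to show the filtering and grouping.
crawl = [
    {"text": "Blog", "type": "Hyperlink", "is_external": False},
    {"text": "Blog", "type": "Hyperlink", "is_external": False},
    {"text": "click here", "type": "Hyperlink", "is_external": False},
    {"text": "logo", "type": "Image", "is_external": False},
    {"text": "Partner site", "type": "Hyperlink", "is_external": True},
]

for text, count in link_text_report(crawl):
    print(f"{count:>4}  {text}")
```

The image link and the external link are dropped, and the remaining anchors are grouped by link text and sorted by count, which mirrors the Group By and sort steps.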

Once the Links tab opens, right-click anywhere on the gray area, and select Query –> Open Query.

Figure 165 – Importing an XML query in the IIS SEO Toolkit.

Open the file generated by the IIS SEO Toolkit using Excel, and name one of the spreadsheets Anchors. Name the first column Anchor, and list all the anchor text in it. Name the second column Occurrences and list the occurrence count (the IIS SEO Toolkit generates this data).

Add a third column (name it Presence) and leave it empty for now because this column will be filled in later using a VLOOKUP function.

Figure 166 – The count of occurrences for each anchor text.

Create a new spreadsheet and name it Generic anchors. Then, create two columns: Generic Words and Presence. Then, list all generic keywords in the Generic Words column. Fill the Presence column with number “1”:

Figure 167 – Adding the number one in the Presence column will be used to match the anchors on your website with the generic anchor text list.

Now, go back to the Anchors spreadsheet, and add the following VLOOKUP formula in cell C2:
=VLOOKUP(A2,'Generic anchors'!A:B,2,FALSE)

Figure 168 – VLOOKUP is a built-in Excel function that is designed to work with data that is organized into columns.

Copy the VLOOKUP formula all the way down in column C. You can double click on the tiny dot in cell C2 (the dot at the bottom right of the cell) to automatically fill the column C with the VLOOKUP formula.

If there is an exact match between the anchors used on the website and the generic keywords list, the column C cells will be filled with value “1”. You will get “#N/A” when there is no match. Sort or filter by “1”, and you will get the list of generic anchors on your website:

Figure 169 – The anchor text “Blog” is one of the most used internal anchor texts. Additionally, there are some other generic anchors such as “click here”, “here”, “home”, or “website”.

So, we identified that the “blog” anchor text is heavily used on this website; this large number suggests that it is probably a site-wide link.
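If you prefer a script to a spreadsheet, the VLOOKUP step can be approximated in Python. The sketch below assumes you already have the anchor counts and a list of generic words in memory. The function name and the sample numbers are made up for illustration. Like VLOOKUP with FALSE, it only flags exact matches, and it ignores case.

```python
def flag_generic_anchors(anchor_counts, generic_words):
    """Return (anchor, occurrences) pairs whose text exactly matches a generic word.

    Matching ignores case and surrounding whitespace; it does not match
    generic words that appear inside longer anchors.
    """
    generic = {word.strip().lower() for word in generic_words}
    flagged = [
        (anchor, count)
        for anchor, count in anchor_counts.items()
        if anchor.strip().lower() in generic
    ]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical data standing in for the Anchors and Generic anchors sheets.
anchors = {"Blog": 412, "Solid Ribbon Belt": 3, "click here": 27, "Home": 118}
generic_words = ["click here", "here", "home", "website", "blog", "read more"]

print(flag_generic_anchors(anchors, generic_words))
```

The product name anchor is left out, and the generic anchors come back sorted by how often they occur, which is the list you would get after filtering the Presence column by “1”.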

Next, we will use the IIS SEO Toolkit to see which pages link to the Blog section.

You will need to open a new query by going to Dashboard –> Query –> New Link Query.

In the Field Name section use the following settings:

Figure 170 – You can group the data by Link Text.

  • Link Type Not Equal Style.
  • Link Type Not Equal Script.
  • Link Type Not Equal Image.
  • Link Text Equals “Blog”.

In the Group By section, select Link Text (if Group By does not show up by default, you will have to click on the Group By icon just below the Links tab). Next, hit Execute. This report will show you how many times the word “blog” was used as anchor text.

Double-clicking on “Blog” will open a detailed list of Linking URLs. Repeat the process for all generic anchor texts.

You have to be more creative and replace the anchor text “blog” with something more appealing to search engines and people. Even {CompanyName} Blog is a better choice, but you could theme this anchor text even more. For example, if you sell fishing or hunting equipment, you can use {CompanyName} Fish & Hunt Blog. If you sell running shoes, you could use Mad Runner’s Blog, and so on.

When you link to category or product detail pages, use the category or the product names as anchor text. For instance, if you sell books, you will link to the product detail page with the book’s name. You can also vary the anchor text by adding brands or product attributes to the product name.

Exact internal anchor text match still matters for ecommerce websites, if you do not go overboard, for example by spamming with site-wide footer links. Usually, it is a good idea to match the search queries with your internal anchor text, as closely as possible. However, how do you know which anchors to use to link to a page that lists, let’s say, ignition systems for a 2004 Audi A3? By doing keyword research.

For example, if you sell auto parts, you can break down the keywords by years, makes, models, product types, or categories. Collect keyword data from as many sources as you can: user testing, Google Analytics, Google Ads data, your webmaster accounts, competitor research, or data from your Amazon account. Put all the keywords in a master spreadsheet and remove duplicates using Excel.

Add the metrics that you want to take into consideration, and your table may look like this:

Figure 171 – I like to add a keyword ID in the first column, just to be able to revert to the original data at any time, by sorting by ID.

As metrics, I am going to consider the average monthly searches for each keyword and the number of conversions.

Now, you need to identify search patterns. You are going to do this by replacing each word with its corresponding product attribute or the category it belongs to. For example, you will replace “2007” or any other year with the placeholder {year}, “Chevy” or any other make with the placeholder {make}, and “grill” or any other category name with the placeholder {category}. Replace all the words until you end up with a significant number of placeholders.

Figure 172 – You can speed up this process if your programmers can write a script to replace keywords with attributes automatically.
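Such a script does not have to be complicated. Here is a minimal Python sketch of the idea, assuming you can export the attribute values (makes, models, categories) from your catalog. The ATTRIBUTES vocabulary and the to_pattern name are hypothetical, and years are detected with a simple four-digit pattern.

```python
import re

# Hypothetical attribute vocabularies; in practice these would come from
# the product catalog.
ATTRIBUTES = {
    "make": {"chevy", "audi", "ford"},
    "model": {"silverado", "a3", "f-150"},
    "category": {"grill", "ignition system", "headlights"},
}
YEAR = re.compile(r"\b(19|20)\d{2}\b")


def to_pattern(keyword):
    """Replace known attribute values in a keyword with {placeholder} tokens."""
    pattern = YEAR.sub("{year}", keyword.lower())
    # Replace longer values first so multi-word values win over shorter ones.
    values = sorted(
        ((value, name) for name, vals in ATTRIBUTES.items() for value in vals),
        key=lambda pair: len(pair[0]),
        reverse=True,
    )
    for value, name in values:
        # Match whole values only, not substrings of longer words.
        pattern = re.sub(
            rf"(?<!\w){re.escape(value)}(?!\w)", f"{{{name}}}", pattern
        )
    return pattern


for kw in ["2007 Chevy Silverado grill", "Audi A3 ignition system", "2004 Audi A3"]:
    print(kw, "->", to_pattern(kw))
```

Words that are not in any vocabulary stay as they are, so leftover terms are easy to spot and add to the lists on the next pass.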

Once you have replaced all the words with placeholders, identify the most used patterns by using pivot tables:

Figure 173 – You can identify the most used patterns using pivot tables.

For your pivot table settings use Keyword Pattern for your rows, and for Values use the following:

  • The sum of average monthly searches.
  • The sum of conversions.
  • The count of keyword pattern.

There you have it! The most common pattern in our example is {year}{make}{model}{category}. However, the pattern with the most searches is {make}{model}. The pattern with the most conversions is {make}{model}{category}.
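The pivot table itself is just a group-and-sum. If your keyword list is too large for Excel, a sketch like the following does the same aggregation in Python. The rows and their numbers are made up purely to show the shape of the data and to mirror the kind of result described above. They are not the figures from the screenshot.

```python
from collections import defaultdict


def pivot_by_pattern(rows):
    """Aggregate keyword rows by pattern.

    Each row is (pattern, avg_monthly_searches, conversions).
    Returns {pattern: {"searches": ..., "conversions": ..., "count": ...}}.
    """
    table = defaultdict(lambda: {"searches": 0, "conversions": 0, "count": 0})
    for pattern, searches, conversions in rows:
        entry = table[pattern]
        entry["searches"] += searches
        entry["conversions"] += conversions
        entry["count"] += 1
    return dict(table)


# Made-up numbers, only to show the shape of the data.
rows = [
    ("{year} {make} {model} {category}", 90, 2),
    ("{year} {make} {model} {category}", 70, 1),
    ("{year} {make} {model} {category}", 40, 1),
    ("{make} {model}", 900, 1),
    ("{make} {model} {category}", 300, 6),
]

table = pivot_by_pattern(rows)
for metric in ("count", "searches", "conversions"):
    top = max(table, key=lambda p: table[p][metric])
    print(f"top by {metric}: {top}")
```

Looking at all three metrics side by side matters, because the most common pattern is not necessarily the one with the most searches or the most conversions.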

By mimicking user search patterns in your internal linking, you will increase the relevance of the linked-to pages.

Anchor text variation
Despite RankBrain becoming better at understanding keyword variations, it is still a good idea to vary the internal anchor text pointing to the same URL. For ecommerce websites, the category and subcategory pages will allow only some room for keyword variations. For example, when you link to the Vancouver Hotels page, you can use “hotels in Vancouver” or “Vancouver hotels”.

When you link to a product listing page (for example a page that lists all Rebel XTi cameras), you can add the brand name (“Canon Rebel XTi”) or the product line the product belongs to (e.g., “Canon EOS Rebel XTi”).

Figure 174 – When you link from content-rich areas such as blog posts or user guides, you can vary the anchor text more.

Contextual text links allow more anchor text variation than structural links. Structural links are often based on rules, such as using only the product names or product names plus product attributes. Therefore, structural links are not very flexible, while contextual text links are.

For product variations (e.g., model numbers or different colors), the anchor text on the item name can contain differentiating product attributes:

Figure 175 – These three SKUs are variations of the product “Canon Digital Rebel XTi 10.1MP”.

In this screenshot, the three SKUs are variations of the same product, “Canon Digital Rebel XTi 10.1MP”. The first SKU is just the camera body. The second SKU includes a lens too, and the anchor text includes that detail. Similarly, the third SKU includes a lens too, but in a different color.

Remember to link using text that makes sense for users without forcing keywords. Also, just a reminder that when you use plurals in the anchor text (e.g., “digital cameras”), consider linking to a listing page because search queries that contain plurals usually denote that users want to see a list of items.

Merchandising and marketing teams needed to cross- and upsell; that is why ecommerce websites started featuring sections such as Related Items. Related linking can be found under various names and implemented in various ways such as people who purchased this also purchased…, you may also like…, people also viewed…, related products, or related searches. This concept was originally introduced to increase the number of items added to the cart and the average order value, and to help users navigate to related products.

Figure 176 – The You May Also Like section in this screenshot is commonly found on ecommerce websites and is a good example of related items sections.

SEOs realized that related items sections could also be used to:

  • Optimize internal linking by interconnecting deep pages that were otherwise impossible to connect with other types of navigation (e.g., breadcrumbs).
  • Flatten the website architecture.
  • Silo the website architecture by linking to siblings and parent categories. Keep in mind that siloing with related products requires very strict business rules.

Related links can be used to boost the authority of any page(s) whenever needed:

  • You can boost the crawling, indexing, and eventually the rankings of newly added products by linking directly from the category listing page, or even from the home page.
  • If there are products that have very high value for your business, linking from the home page will send more authority to those products.
  • On a page that lists all houses for sale in a particular district, you can also link to houses in nearby neighborhoods.
  • You can boost hotel description pages by linking to recently reviewed hotels from city listing pages.
  • And so on.

If you have a lot of data to rely on, you can implement related products with the help of recommendation engines. Such engines are used to optimize the shopping experience on-the-fly, but often they are implemented with uncrawlable JavaScript. One way of tackling related items implemented with JavaScript is to define and load a set of default related products that are accessible to search engines when they request a page. You will then replace or append more items with AJAX once the page loads in the browser, to improve the discoverability for users. The idea is that you do not want to leave the rendering of the content to Googlebot.

Figure 177 – The related items section on the left side of the screenshot is accessible to search engines, as you can see in the cached version of the page, on the right side of the screenshot.

On a side note, while the content of the recommendation engine is indexed, the alt text of the images could be improved.

On the other hand, on the website below, the AJAX implementation prevents search engines from finding the recommended products:

Figure 178 – The You May Also Like section should show up in the cached version, just after the last product in the list, but it does not.

If the website above wants to flatten the website architecture by internally linking from related products, they have to make sure search engines can access the links in the related products section. Use Fetch and Render in Google Search Console to check whether search engines can render the items in the You May Also Like section. If it works there, it will work for search too.

Googlebot is a headless browser. This means that it is a web browser without a graphical user interface, but one that can render and “see” pages.

Also, keep in mind that what you see when you use the “cache:” operator is not the same as what Google renders at their end. The source of truth for Google is very close to what “Fetch and Render” provides in Google Search Console, while the cached version is just the raw HTML. It is most likely that Google uses both the cached and rendered versions of a page, just to make sure people are not spamming.

Here are a few things to consider when implementing related or recommended items:

  • If you need to add tracking parameters to recommended item URLs, do so in the browser, at mouse down or click events. If you cannot use click events, canonicalize the tracking parameters using Google Search Console or using rel=”canonical”.
  • Keep the number of recommended items low and focus on quality (three to five products should be enough).
  • If you want to provide even more recommended items, use carousels.
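For the rel=”canonical” route mentioned in the first point, the canonical URL is simply the recommended item URL without the tracking parameters. The sketch below shows one way to compute it with Python's standard urllib.parse module. The parameter names in TRACKING_PARAMS are hypothetical, so replace them with whatever your recommendation engine appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of tracking parameters used by a recommendation widget.
TRACKING_PARAMS = {"rec_src", "rec_pos", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url):
    """Return the URL with tracking parameters and the fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_url("https://www.example.com/sandals?color=tan&rec_src=pdp&rec_pos=2"))
```

Parameters that change the content of the page (such as color in this example) are kept, and only the tracking parameters are stripped. The result is the value you would place in the href of the canonical link element.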

The website below links to a sweater and sandals PDP because those products are related to the product detail page they are featured on.

Figure 179 – You can interlink related items even if they are in different silos if it makes sense for users (e.g., link from a skirt PDP to the sandals PDP that completes the look).

Popular searches sections are another type of related links that can be implemented on ecommerce websites. Run a search on Google or Bing and see which keywords they suggest at the bottom of the search results and consider including some of them on your pages.

Internal linking over-optimization

While internal links with exact match anchor text typically do not hurt,[31] do not overdo it. Let’s look at a few scenarios that can raise over-optimization flags.

Unnatural links to the homepage
It does not help much if you replace the anchor text “home” with your primary keyword.[32]

Figure 180 – This looks spammy.

If your domain or your business name is “online pharmacy”, then it may be fine to use keyword-rich anchor text to point to the home page, but otherwise, do not do it.

Too many contextual text links
A high ratio of internal anchor text links to content is not advisable. For example, if a category description content has 100 words and you place 15 anchors in it, that is too much.

Figure 181 – Contextual links are great, but that does not mean you have to abuse them, as depicted in the image above.

Contextual text links can be created either programmatically or added manually by copywriters or SEOs. In both cases, you need to define rules to avoid over-optimization.

Let’s exemplify with a set of rules for category descriptions:

  • Add links to other products from the parent category. Maximum products linked per 100 words is two.
  • Add links to related categories. Maximum related categories linked per 100 words is two.
  • The maximum consecutive anchor text links is two.
  • The maximum number of links with the same anchor text is one.
  • The minimum number of links per 100 words is two.

Use these rules just as guidelines and customize based on your circumstances.
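If links are added programmatically, rules like these can be checked automatically before a description goes live. The following Python sketch covers the rules that do not require knowing whether a link points to a product or a category: link density per 100 words, repeated anchor text, and consecutive links. The function name and default thresholds are assumptions. In particular, the combined maximum of four links per 100 words is my reading of the two separate maximums above, and the parsing assumes descriptions that contain only plain text and simple links.

```python
import re
from collections import Counter

LINK = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def check_description(html, min_per_100=2, max_per_100=4,
                      max_same_anchor=1, max_consecutive=2):
    """Return a list of rule violations for one category description."""
    problems = []
    matches = list(LINK.finditer(html))
    anchors = [" ".join(m.group(1).split()).lower() for m in matches]
    words = len(TAG.sub(" ", html).split())

    # Link density, expressed as links per 100 words.
    density = len(matches) * 100 / words if words else 0.0
    if density > max_per_100:
        problems.append(f"too many links: {density:.1f} per 100 words")
    elif density < min_per_100:
        problems.append(f"too few links: {density:.1f} per 100 words")

    # The same anchor text should not be reused beyond the limit.
    for anchor, count in Counter(anchors).items():
        if count > max_same_anchor:
            problems.append(f'anchor "{anchor}" used {count} times')

    # Links count as consecutive when no word sits between them.
    run, longest, prev_end = 0, 0, None
    for m in matches:
        gap = html[prev_end:m.start()] if prev_end is not None else "x"
        run = run + 1 if not re.search(r"\w", gap) else 1
        longest = max(longest, run)
        prev_end = m.end()
    if longest > max_consecutive:
        problems.append(f"{longest} consecutive links")
    return problems


# Hypothetical descriptions: one within the limits, one over-optimized.
ok = "word " * 46 + '<a href="/a">metal beds</a> and <a href="/b">kids beds</a>'
bad = ('<a href="/a">beds</a>, <a href="/a">beds</a> <a href="/c">sofas</a> '
       + "word " * 10)

print(check_description(ok))
print(check_description(bad))
```

An empty list means the description passes. Treat the thresholds as parameters to tune, in the same spirit as the guidelines themselves.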

The following is an example of decently safe internal linking:

Figure 182 – The text in this paragraph flows naturally, and the anchors seem natural as well.

Keyword-stuffed navigation and filtering
Some ecommerce websites try to enhance rankings for head terms like category or subcategory names by stuffing keyword-rich anchor text links in the primary navigation, similarly to what you see in this screenshot:

Figure 183 – Did you notice how each subcategory link contains the upper category name?

It is not necessary to repeat keywords in the main navigation. If your website architecture is properly built, search engines will be able to understand that if the category name is Watches, all the links and products found under it belong to the Watches category.

The same applies to other forms of navigation, such as faceted navigation.

Figure 184 – These links look spammy too.

You can use properly nested list items to help search engines understand categorization so that you do not need to repeat the category name in every filter value in the left navigation.

Because PageRank is a renewable metric, having external links to category and subcategory pages not only provides ranking authority to the target pages but also increases the amount of PageRank that flows throughout the entire website. Moreover, because it is not economically feasible to build links to individual product pages for ecommerce websites with large inventories, the link-earning efforts should be focused on category and subcategory pages. Keep in mind that link development is complex and outside the scope of this course.

Focusing your link building efforts on just a few top-performing category pages is a good idea for new websites or websites with limited marketing budgets, but generally, you need to diversify your targets. Once you have built enough links to a category page, that page becomes a hub: it will pass link equity to pages downwards and upwards in the website hierarchy. The more hubs you build, the more natural your website will look, and the more PageRank will flow throughout it.

You can identify existing link hubs using Google Search Console and use them to your advantage. Anytime you want to boost a new page, you can tap the power of the hubs. For example, suppose you identified that the Women’s Apparel category is a hub. If you want to boost the Women’s Sleepwear category, link to it from the hub page contextually, from the main content.

  1. Browser-specific optimizations and cloaking, https://productforums.google.com/forum/#!topic/webmasters/4sVFlIdj7d8
  2. GET, POST, and safely surfacing more of the web, http://googlewebmastercentral.blogspot.ca/2011/11/get-post-and-safely-surfacing-more-of.html
  3. Google Analytics event tracking (pageTracker._trackEvent) causing 404 crawl errors, https://productforums.google.com/forum/#!topic/webmasters/4U6_JgeCIJU
  4. Call of Duty: Ghosts – Xbox 360, http://www.amazon.com/Call-Duty-Ghosts-Xbox-360/dp/B002I098JE
  5. Free SEO Toolkit, http://www.microsoft.com/web/seo
  6. One More Great Way to Use Fusion Tables for SEO, http://moz.com/ugc/one-more-great-way-to-use-fusion-tables-for-seo
  7. Visualize your Site’s Link Graph with NodeXL, http://www.stateofdigital.com/visualize-your-sites-internal-linking-structure-with-nodexl/
  8. How To Visualize Open Site Explorer Data In Gephi, http://justinbriggs.org/how-visualize-open-site-explorer-data-in-gephi
  9. rel=”nofollow” Microformats Wiki, http://microformats.org/wiki/rel-nofollow
  10. Interview with Google’s Matt Cutts at Pubcon, http://www.stephanspencer.com/matt-cutts-interview/
  11. Use rel=”nofollow” for specific links, https://support.google.com/webmasters/answer/96569?hl=en
  12. PageRank sculpting, http://www.mattcutts.com/blog/pagerank-sculpting/
  13. Should internal links use rel=”nofollow”?, https://www.youtube.com/watch?feature=player_embedded&v=bVOOB_Q0MZY
  14. Damping factor, http://en.wikipedia.org/wiki/PageRank#Damping_factor
  15. Are links in footers treated differently than paragraph links?, https://www.youtube.com/watch?v=D0fgh5RIHdE
  16. Is Navigation Useful?, http://www.nngroup.com/articles/is-navigation-useful/
  17. Ten recent algorithm changes, http://insidesearch.blogspot.ca/2011/11/ten-recent-algorithm-changes.html
  18. Link Value Factors, http://wiep.net/link-value-factors/
  19. Image Links Vs. Text Links, Questions About PR & Anchor Text Value, http://moz.com/community/q/image-links-vs-text-links-questions-about-pr-anchor-text-value
  20. Are links in footers treated differently than paragraph links?, https://www.youtube.com/watch?v=D0fgh5RIHdE&feature=youtu.be&t=40s
  21. VIPS: a Vision-based Page Segmentation Algorithm, http://research.microsoft.com/apps/pubs/default.aspx?id=70027
  22. Block-Level Link Analysis, http://research.microsoft.com/apps/pubs/default.aspx?id=69111
  23. Document ranking based on semantic distance between terms in a document, http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=1&p=1&f=G&l=50&d=PTXT&S1=7,716,216.PN.&OS=pn/7,716,216&RS=PN/7,716,216
  24. Google’s Reasonable Surfer: How The Value Of A Link May Differ Based Upon Link And Document Features And User Data, http://www.seobythesea.com/2010/05/googles-reasonable-surfer-how-the-value-of-a-link-may-differ-based-upon-link-and-document-features-and-user-data/
  25. Results of Google Experimentation – Only the First Anchor Text Counts, http://moz.com/blog/results-of-google-experimentation-only-the-first-anchor-text-counts
  26. 3 Ways to Avoid the First Link Counts Rule, http://moz.com/ugc/3-ways-to-avoid-the-first-link-counts-rule
  27. When Product Image Links Steal Thunder From Product Name Text Links, http://www.goinflow.com/when-product-image-links-steal-thunder-from-product-name-text-links/
  28. Do multiple links from one page to another page count?, https://www.youtube.com/watch?v=yYWlEItizjI
  29. Agave Denim, http://www.zappos.com/agave-denim
  30. [Study] How the Web Uses Anchor Text in Internal Linking, https://www.conductor.com/blog/2012/05/image-anchor-text/
  31. Will multiple internal links with the same anchor text hurt a site’s ranking?, https://www.youtube.com/watch?v=6ybpXU0ckKQ
  32. Testing the Value of Anchor Text Optimized Internal Links, http://moz.com/blog/testing-the-value-of-anchor-text-optimized-internal-links

CHAPTER 6

Home Pages

Length: 9,019 words

Estimated reading time: 1 hour

In this part of the guide, I will break down the most common sections found on home pages, and I will describe how to optimize each of them for better search engine visibility and user experience. You are going to learn how to improve the primary navigation, which has an impact on almost every single page on the website. I am going to show you how to make better use of the internal site search field, by helping users and search engines with discoverability and findability. You will also learn how to optimize marketing and merchandising areas, as well as text areas of the home page.

Of all the pages on a website, home pages usually have the highest authority, because of the way PageRank flows and because most of the backlinks point to them. I said usually because in some cases other pages can beat the home page in terms of authority. For example, if the internal linking architecture is broken, or if another page receives more external links, then the home page might not be deemed the most important page by search engines.

Every department, including marketing, merchandising, information architecture, SEO, UX, and even the executives, wants a piece of the home page. Hence, home pages are often cluttered with content and links to tens or even hundreds of categories, calls to action, marketing banners, and so on. This makes home pages unfriendly for users and search engines.

The biggest advantage of home pages is that they pass a lot of PageRank downwards on the website taxonomy. Pages linked directly from the home page get more search engine love.

Do you need a boost for a new product or a new category? Link to it from the homepage. It is a simple concept. For example, if you want to increase authority for the most profitable or the best-converting city or hotel pages, then link to them from the homepage.

If you want to push authority to categories or product pages but do not want to crowd the home page, add a Featured Categories section on the index page of the HTML sitemap.

Figure 185 – The links within the New Top Products and Top Ink Families sections will create shorter paths for crawlers and will send more authority to the pages linked from there.

Before getting into the details, remember that when optimizing home pages, it is highly important to balance SEO with user experience and business goals. As a matter of fact, the same applies to all other pages on your website.

You do not want too many links on the home page, and you want important links in prominent places. Also, the decision to add, remove or consolidate links on the home page needs to take users into account first, and only then accommodate search engines.

Let’s see which sections appear most frequently on ecommerce home pages, and then discuss optimization tactics specific to the most important of them:

  • Logo (this is the area where you display the logo and possibly, a tagline).
  • User account (this area displays links to register, sign in, my account, order tracking, and other pages that require a login).
  • Site personalization (these are links to the country or currency selectors, store locator, color theme, etc.).
  • Search field (this is the area surrounding the internal site search field).
  • Primary navigation (this is the global navigation area).
  • Cart area (this is where you list links to the shopping cart, checkout, or other similar pages).
  • Marketing and merchandising areas (i.e., carousels, internal banners, featured products, top categories, most popular deals/brand, and so on).
  • Promotional area (e.g., wish lists, gifts).
  • Help area (e.g., FAQ, live chat, contact us, help center).
  • Footers.

Primary navigation

Theoretically, any HTML link is a navigation element, but for our purposes, we will refer to navigation in the context of primary and secondary navigation menus used by visitors to browse items. Primary navigation is also known as global navigation, while secondary navigation is referred to as local navigation.

Primary navigation usually appears horizontally at the top of a web page or, sometimes, vertically as a sidebar on the left side of the page. Primary navigation is easy to identify, as it consistently appears in the same position across the entire website.

For e-commerce websites, the labels in primary navigation represent major groups of information. Labels can organize information by departments, topics, top-level categories, target market, alphabetical order or other ways, depending on how the information architecture, marketing, and usability teams structure the website.

Figure 186 – Walmart uses a vertical design for its primary navigation, and the labels in this screenshot are the departments.

Figure 187 – On Bed Bath & Beyond, the primary navigation is horizontal and lists top-level categories.

Figure 188 – On Crocs, the primary navigation is represented by the main target market segments.

When you have many items to list in the menu, display the global navigation vertically. Use a horizontal layout when you can fit all the important labels at the top of the design.

Number of links in the primary navigation

Displaying the primary navigation horizontally limits the number of links that can be placed at the top of the layout, to between five and twelve, depending on how long the labels are. Do not worry; these are reasonable numbers for both users and search engines. A vertical navigation bar is more versatile and allows for more categories to be displayed.

On ecommerce websites, however, the primary navigation is often supplemented by sub-navigation such as dropdowns, fly-outs, or mega menus. This navigation can substantially increase the number of links on any given page.

Figure 189 – This is a very common drop-down menu implementation. The Clothing section lists several topical links.

Usually triggered at mouse hover, the dropdowns and fly-out menus present some substantial usability issues.[1] However, mega menus seem to perform OK.[2] Refer to these two articles for more information on the usability of sub-navigation menus.

Figure 190 – In this screenshot, you can see an example of the so-called “mega menus”. Such menus can include more than just a list of links; they can include images as well.

Figure 191 – Staples handles drop-down menus in a more user-friendly manner. They gave up the standard mouse-over implementation; the sub-menu expands at click only. Additionally, notice the red down arrow icons; such icons suggest to users that there is more information displayed under the labels.

There isn’t a hard limit on how many links to list in sub-menus. I recommend using as few or as many as it makes sense for users; do not worry about search engines too much. Let me explain why.

You often hear suggestions to limit the number of links on a page, to send more authority to other, possibly more important, pages. While it may be the right SEO approach on certain types of pages such as product listing pages with faceted navigation, many times this recommendation disregards the user experience and usability. There is also a widely accepted SEO best practice which suggests that the entire menu must be SEO friendly, meaning that search engines should be able to crawl all the links in the menu. This is not accurate.

In practice, some ecommerce websites could benefit more from not allowing robots to crawl just about any link in sub-menus. No doubt about it, you should present to users as many links as it makes sense, but you can also make some of them undiscoverable for search engines.

This search engine “unfriendliness” of the links cannot be categorized as cloaking if it is not done for cloaking purposes. After all, even Amazon uses this technique to push authority to products and categories listed in the main content area of the page, as you can see in this screenshot:

Figure 192 – Take a look at the cached version of the page. It does not include the Shop by Department links.

Their global navigation (labeled Shop by Department) presents more than 100 links to users, which helps them navigate the website, but none of the mega menu links are hyperlinked for search engines, as you can see in the cached version of the page. A quick note here: if the links do not appear in the cached version, it does not mean that search engines are unaware of them, as bots can execute JavaScript to discover those links. It all depends on how Amazon made those links uncrawlable.

Figure 193 – Other retailers in the top 100 obfuscate links as well.

Moreover, Amazon is not the only top retailer to do this. Here’s another retailer in the top 100, doing it as well. Their approach is a bit different; they allowed access to department links, but all the top categories such as Appliances or Electronics are not plain links.

Figure 194 – In this example, you can see that the primary navigation links are cached, but the sub-menu navigation links are not.

Making the links uncrawlable may sound strange and against SEO common sense, but it is a great way to balance usability and SEO. Users will get the links they need, while search engines will have access to prioritized links, which will prevent wasting crawl budget.
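If you want a quick way to see which menu links exist as plain HTML links before any JavaScript runs, a small script can list them. The following is a minimal sketch that uses only the Python standard library; the sample markup and the data-url attribute are hypothetical and do not represent how any particular retailer obfuscates its links.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags found in raw (unrendered) HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def plain_links(raw_html):
    """Returns the URLs that are exposed as plain <a href> links."""
    parser = LinkCollector()
    parser.feed(raw_html)
    return parser.hrefs


# Hypothetical menu: two plain links and one item that only JavaScript turns into a link.
SAMPLE_MENU = """
<nav>
  <a href="/electronics">Electronics</a>
  <a href="/appliances">Appliances</a>
  <span class="menu-item" data-url="/toys">Toys</span>
</nav>
"""

print(plain_links(SAMPLE_MENU))
```

A check like this only looks at the raw source. It cannot tell you whether a bot that renders JavaScript will still discover the remaining links.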

The number of links in the primary navigation also depends on how many categories and subcategories your taxonomy has. If you have only five top category pages, each with two to five subcategories, then you can list all of them in the primary navigation. If you have twenty categories with ten subcategories each, you need to give the primary navigation more consideration.

A more radical technique for limiting the number of links in the navigation is to get rid of sub-menus completely. There will be no dropdowns and no mega menus, but just a dropline menu. A dropline menu is a sub-menu that has only one line of items in it. You can see it in use on Ann Taylor’s website. When you use a dropline menu, you need to choose the labels carefully.

Figure 195 – Ann Taylor uses a dropline menu with carefully chosen labels.

Figure 196 – Aeropostale also uses a minimalist approach with just four links in the primary navigation. This approach works well on apparel websites.

If you decide to keep a minimal number of links, make sure that the navigation helps users find content quickly and is not making their task more difficult.

Going back to PageRank basics, we know that the more links on a page, the less authority flows through each link. Regarding user experience, the more cluttered a page is, the more complex content findability[3] is, and the higher shopper anxiety[4] is. Therefore, it makes sense to reduce the number of options and improve user experience by minimizing decision paralysis.[5]
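To make the dilution concrete, here is a minimal sketch based on the simplified, original PageRank model, in which a page splits the authority it passes evenly among its outgoing links. The 0.85 damping factor and the function name are assumptions used for illustration only; modern search engines weight links in far more complex ways.

```python
def authority_per_link(page_authority, link_count, damping=0.85):
    """Share of authority passed through each link in the simplified PageRank model."""
    if link_count <= 0:
        raise ValueError("link_count must be positive")
    return page_authority * damping / link_count


# Compare how much each link receives as the number of links on the page grows.
for links in (50, 100, 200):
    print(links, authority_per_link(1.0, links))
```

Under this model, doubling the number of links on the home page halves what each linked page receives.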

This is where click tracking and analysis play an important role. You identify which links are helpful for users using metrics such as the most-clicked links and remove the ones that are not. If you have multiple links pointing to the same URL, you can decide either to leave only one link or to implement browser-side URL tracking parameters.

You have several tools at your disposal to track clicks and click paths. For example, Google Analytics allows some nice visual click analysis with reports like User Flow and Behavior Flow.

Figure 197 – The Visitor Flow report in Google Analytics.

In Google Analytics you can also use the Navigation Summary report, which can be found under Behavior –> Site Content –> All Pages to do some path analysis. You get up to 500 data points that you can export and analyze further with Excel:

Figure 198 – The Navigation Summary report can provide some interesting insight.

Other tools, such as CrazyEgg or ClickTale, can do click analysis as well. No matter which tool you use, the goal is to identify links that can be either:

  • Removed from the navigation, or
  • Consolidated into logical groups.
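Once you export click data from your tool of choice, the review itself can be sketched in a few lines. The example below assumes a hypothetical export of (label, URL, clicks) rows and an arbitrary 1% click-share threshold; pick a threshold that makes sense for your own traffic.

```python
from collections import defaultdict


def review_navigation_links(clicks, min_share=0.01):
    """clicks: list of (label, url, click_count) tuples.

    Returns (labels with a click share below min_share,
             URLs that are linked from more than one label).
    """
    total = sum(count for _, _, count in clicks)
    low_share = [label for label, _, count in clicks
                 if total and count / total < min_share]

    labels_by_url = defaultdict(list)
    for label, url, _ in clicks:
        labels_by_url[url].append(label)
    duplicates = {url: labels for url, labels in labels_by_url.items()
                  if len(labels) > 1}
    return low_share, duplicates


# Hypothetical click data for four navigation links.
SAMPLE_CLICKS = [
    ("Shoes", "/shoes", 540),
    ("Bags", "/bags", 300),
    ("Gift Cards", "/gift-cards", 4),
    ("Footwear", "/shoes", 156),
]

removal_candidates, duplicate_targets = review_navigation_links(SAMPLE_CLICKS)
print(removal_candidates)
print(duplicate_targets)
```

Treat the output as a list of candidates to discuss with the UX and merchandising teams, not as an automatic removal list.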

In many cases, home pages link to top-selling and high-margin products, categories or subcategories, and this is good for users and search engines. If you are concerned about the number of links on the page, you can easily consolidate the links to About Us, Contact Us, Terms of Use, and Privacy pages.

I advise testing to see whether the uncrawlable links approach or the reduced number of links approach works best for you, as each website is different in its own vertical.

Navigation labels

The categories and subcategories listed in the menus should consider business goals and user testing. For example, if 20% of your categories generate 80% of the revenue, then those categories should be linked from the primary navigation. You should also experiment with other decision metrics such as the most-searched terms in on-site search, the most-visited pages, and so on.
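As a rough illustration of the 80/20 idea, the sketch below picks the smallest set of top-earning categories that covers a target share of revenue. The category names and revenue figures are made up, and the 80% target is an assumption you can change.

```python
def top_revenue_categories(revenue_by_category, target_share=0.8):
    """Returns the highest-revenue categories that together reach target_share."""
    total = sum(revenue_by_category.values())
    selected, running = [], 0.0
    ranked = sorted(revenue_by_category.items(), key=lambda item: item[1], reverse=True)
    for name, revenue in ranked:
        # Stop as soon as the categories picked so far cover the target share.
        if total == 0 or running / total >= target_share:
            break
        selected.append(name)
        running += revenue
    return selected


# Hypothetical revenue per category.
SAMPLE_REVENUE = {
    "Guitars": 520,
    "Drums": 310,
    "Keyboards": 100,
    "Sheet Music": 40,
    "Accessories": 30,
}

print(top_revenue_categories(SAMPLE_REVENUE))
```

The same function works with other decision metrics, such as internal search volume or page views, if you pass those numbers instead of revenue.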

Regarding the labels present in the navigation text links, there are two schools of thought in terms of SEO:

  • The labels should not contain the target keywords.
  • The labels should contain the target keywords.

What I can recommend with confidence is to:

  • Avoid forcing keywords in the primary and secondary navigation labels.
  • Avoid clever labeling[6]. Navigation labels must be easy for users to understand; the labels should enable searchers to find the information they want quickly and easily. For example, what is the label “Inspired Living” supposed to mean on a home improvement website?
  • Design the navigation to pass the trunk test[7], which means you will make it very intuitive and use clear labels.

Search engines should not be the focus when labeling primary and secondary navigation. You have to label for users, not for bots. If that means using keywords in labels, that is fine. If your labels do not require keywords to be useful for users, then do not force keywords just for SEO reasons.

Sometimes, keywords can show up in the navigation naturally. For example, if an ecommerce website sells musical instruments and wants to rank for the keyword “guitars”, having a Guitars label in navigation is natural, and it will help.

There is an alternative for those who want to push longer keywords in the primary navigation menus. You can use images to design the menu in a way that includes short text labels for users, while adding longer labels as keywords in the alt text of the images, for search engines. However, do not spam the alt text. Additionally, you can implement a technique called image replacement[8].

As an example of short labels in an image plus longer alt text look at the screenshot below. The short image labels are Shoes, Handbags, Watches, Jewelry, and Dresses.

Figure 199 – Short labels allow users to scan them easily.

However, the alt text of each image label contains longer keywords, as you can see in the HTML source code:

Figure 200 – The alt texts are longer than the image labels.

Your end goal is to create navigation menus that meet user needs and reflect their behavior on the website. Because these image labels are short, people can scan them quickly, which is great.

Menu labels can represent a single category, but they can also represent multiple categories grouped into a single label. Whenever you group categories into a single label, you still need to create separate URLs for each category.

For example, if you group Pharmacy, Beauty, and Health into one label (as in the image below), you need to create a separate URL for the Pharmacy category, another one for Beauty and another one for the Health category. That is if you want to increase the chances of ranking separately for keywords like “pharmacy”, “health”, or “beauty”.

Figure 201 – If you want to rank for “pharmacy”, “health” or “beauty” separately, you need separate category pages for each term.

Grouping categories works best when you plan a new website or if the current website is new and you can make changes without affecting an established taxonomy.

The opposite of grouping is de-grouping, which is when you want to split categories grouped under a single label, to create multiple categories. It will be more complicated to split URLs if your website has already been online for a while, and you grouped categories under a single URL. In that case, one option is to leave the grouped category as is and create new pages for each subcategory in the group.

From the primary navigation, you should link to canonical categories. For example, if Video Games is categorized in both Games and Electronics, and Games is the default category for Video Games, then the path in primary navigation should be Games –> Video Games. This creates short, unique crawl paths for search engine robots.
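The Video Games example can be expressed as a small lookup. This is a minimal sketch that assumes your platform stores one default parent per category; the DEFAULT_PARENT mapping and the function name are hypothetical.

```python
def canonical_nav_path(category, default_parent):
    """Walks up the default parents and returns the path from the top category down."""
    path, seen = [category], {category}
    while path[-1] in default_parent:
        parent = default_parent[path[-1]]
        if parent in seen:  # guard against accidental cycles in the data
            break
        path.append(parent)
        seen.add(parent)
    return " > ".join(reversed(path))


# Video Games also lives under Electronics, but Games is its default parent.
DEFAULT_PARENT = {"Video Games": "Games"}

print(canonical_nav_path("Video Games", DEFAULT_PARENT))
```

The primary navigation would then link only along this path, even though the category can still be reached from Electronics through other links.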

Search field

The internal site search on ecommerce websites is often enhanced with an autosuggest or autocomplete functionality that displays items from a list of popular searches, or from products and category names. While autocomplete may be great for users, it is implemented with AJAX, and the suggested links are not accessible to search engines.

However, you can help improve findability for users and search engines by adding plain HTML links to popular searches, directly under the search field. To identify which links are best for users, you will need to track internal site searches with your web analytics tool.

Figure 202 – Google Analytics has a Site Search report that provides a list of internal site searches. To get this list, you can use Start Page as your primary dimension and the Search Term as a secondary dimension.
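Once you have the list of internal searches, turning the most frequent ones into plain HTML links is straightforward. The sketch below assumes a hypothetical /search?q= URL pattern and a raw list of search terms exported from your analytics tool; adapt both to your own platform.

```python
from collections import Counter
from html import escape
from urllib.parse import quote_plus


def popular_search_links(search_terms, limit=3, search_path="/search?q="):
    """Builds plain <a> links for the most frequent internal search terms."""
    counts = Counter(term.strip().lower() for term in search_terms if term.strip())
    return [
        f'<a href="{search_path}{quote_plus(term)}">{escape(term)}</a>'
        for term, _ in counts.most_common(limit)
    ]


# Hypothetical export of internal site searches.
SAMPLE_SEARCHES = ["running shoes", "Running Shoes ", "sandals", "boots", "sandals", "running shoes"]

for link in popular_search_links(SAMPLE_SEARCHES, limit=2):
    print(link)
```

If a popular search maps cleanly to an existing category or filtered page, it is usually better to point the link there instead of to a search results URL.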

You can also add product attributes or filters as links, for example linking to a page that sorts by size or type of shoes. Additionally, you can link to a popular searches page:

Figure 203 – The use of related terms near the search field is great because the Search By links serve as entry points for search engines and as a discovery tool for users.

If you want to improve the user experience, these links should vary from page to page, to match:

  • Top searches performed on that specific page.
  • Next product or category pages visited after viewing the current page. You can get this data using your web analytics tool.

On a product detail page, you can list the most used product filters for that product, or for the leaf category the product belongs to. A leaf category is a category at the bottom of the taxonomy, which means there are no other categories under it.

If you use the HTML <label> element for search fields, keep in mind that this content is indexable:

Figure 204 – Consider improving the text for <label> elements.

While the <label> element will not have an impact on rankings, if you use it along with the <input> element, consider improving the wording within the label tag.

In this example, you can see that the text “find something great” is being used in the label tag. However, this website should use something more relevant and enticing to users, specific to the page visitors are on. For example, on all Shoes category pages, you can use “search for shoes”; on all Bags category pages, you can use “search for bags”; on all Toys pages you can use “search for toys”, and so on.

Also, it is not a good idea to create HTML links for search buttons, since search engines will unnecessarily try to crawl anything that looks like a link.

Figure 205 – This implementation creates links that bots will try to access.

In the previous implementation, the href points to the hashmark that will become a link to the home page, with the anchor text “search”. This is not optimal, and you should avoid this kind of implementation.
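To find this kind of implementation across your templates, you can scan the raw HTML for anchors that only point to a fragment or to a javascript: URL. The following is a minimal sketch using the Python standard library; the sample markup is hypothetical.

```python
from html.parser import HTMLParser


class PseudoLinkFinder(HTMLParser):
    """Flags <a> tags whose href is only a fragment or a javascript: URL."""

    def __init__(self):
        super().__init__()
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").strip().lower()
        if href.startswith("#") or href.startswith("javascript:"):
            self.flagged.append(href)


def find_pseudo_links(raw_html):
    finder = PseudoLinkFinder()
    finder.feed(raw_html)
    return finder.flagged


# A search "button" built as a link, next to one built as a real button.
AS_LINK = '<form action="/search"><input name="q"><a href="#" class="btn">search</a></form>'
AS_BUTTON = '<form action="/search"><input name="q"><button type="submit">Search</button></form>'

print(find_pseudo_links(AS_LINK))
print(find_pseudo_links(AS_BUTTON))
```

The second form uses a <button> element, which does not create a link for bots to follow.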

Text Areas

Usually, home pages do not allow too much room for plain text content, so you find very few contextual text links going out from most home pages. Many ecommerce websites try to get around this challenge by adding text at the bottom of the page, close to the footer section, as in this example:

Figure 206 – This is a common approach on many websites.

Some websites even use CSS to position text sections such as this one at the top of the source code, while visually the text is at the bottom of the page. This might have worked in the past to overcome the 100Kb indexing limit, but nowadays it is useless. Remember that search engines can render pages, so they will know where that text is displayed on the page. Keep in mind that Google stops the rendering of a page after 10,000 pixels. If your so-called “SEO content” is below that threshold, it may not get picked up.

For those who still believe in the so-called “text to code ratio”, this ratio can be altered by using the tab navigation or SEO-friendly carousels like the following ones:

Figure 207 – Carousels allow you to add more text content to home pages.

In the example above, not only is there plenty of plain text for search engines to feed on, but there are good internal links too. This approach makes this type of design implementation useful for users and search engines.

As you can see in the cached version of the page, the text content of the carousel is available to search engines for analysis and rankings:

Figure 208 – The cached version of the page.

Another method for including more text in the main content areas, and for creating more contextual text links from a home page, is to use tabbed navigation. While the text displayed in tabs is not given the full importance in a desktop world, this will change when Google switches to mobile-first indexing.

Figure 209 – Using tabbed navigation you can add a lot of plain text content in a relatively limited design space.

If you want to add even more content, use expand and collapse features. However, remember not to fill that content with spam, or you may get into trouble.

If you decide to use tabbed navigation in the main content area, it is worth mentioning that users easily overlook tabs, so you need to provide strong design clues to help them understand that there is more content behind this type of navigation.

Marketing and merchandising area

Sliders, carousels, or static banners are some of the most used marketing sections on home pages. Merchandizing sections include featured products, top categories, most popular deals, top brands, and so on. SEO does not own these areas, but many times there is nonetheless room for organic improvement.

While carousels seem to have several usability and conversion issues,[9],[10] they are still present on many ecommerce websites, from Dell and Hewlett Packard to other small and medium businesses:

Figure 210 – Carousels can pose some usability and conversion issues.

From an SEO perspective, there are two usual issues with carousels:

  • The entire carousel is built with unfriendly JavaScript.
  • The text content and the links in carousels are embedded in images.

Let’s see how you can identify unfriendly JavaScript carousels.

Unfriendly JavaScript

Custom-made carousels may use AJAX or JavaScript to populate content in the carousel dynamically, and that is when you can get into trouble. Carousels can be tested from an SEO point of view by placing them on a public test domain, subdomain, or page. Once Google crawls and caches the URL, look at the cached version of the page. If you see something along the lines of “loading”, or “waiting for content”, or if the content of the carousel is missing, it means that the carousel’s implementation is not that SEO friendly.

Figure 211 – The content of this carousel is not indexed. Instead, the text “loading…loading”, is.
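You can run a similar check yourself on the raw HTML of the carousel container before it ever goes live. The sketch below is a rough heuristic, not a replacement for checking how Google actually renders the page: it assumes you have already extracted the carousel markup as a string, and the placeholder phrases are the ones mentioned above.

```python
import re

PLACEHOLDER_PATTERN = re.compile(r"loading|waiting for content", re.IGNORECASE)
LINK_PATTERN = re.compile(r"<a\s[^>]*href=", re.IGNORECASE)


def carousel_looks_crawlable(raw_carousel_html):
    """True if the raw markup contains at least one link and no loading placeholder."""
    # Strip tags first so that class names such as "loading-spinner" are ignored.
    visible_text = re.sub(r"<[^>]+>", " ", raw_carousel_html)
    has_placeholder = bool(PLACEHOLDER_PATTERN.search(visible_text))
    has_links = bool(LINK_PATTERN.search(raw_carousel_html))
    return has_links and not has_placeholder


# Hypothetical carousel markup, before and after fixing the implementation.
EMPTY_SHELL = '<div id="carousel">loading...</div>'
SERVER_RENDERED = '<div id="carousel"><a href="/deals/tv">55-inch TV deals</a></div>'

print(carousel_looks_crawlable(EMPTY_SHELL))
print(carousel_looks_crawlable(SERVER_RENDERED))
```

A legitimate item whose name contains the word “loading” would trip this check, so review any flagged carousel by hand.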

Additionally, you can test the implementation using the Fetch and Render in Google Search Console.

If you want to send authority to specific items and index links within carousels, you have to correct the coding so that their content can be crawled.

Figure 212 – A product carousel at target.com.

In the example above, neither section is cached by search engines. Additionally, the items linked from section 1 are nofollowed, which suggests that this online retailer does not want the links to be crawled and indexed.

However, the links in section 2 are followed, which may suggest that those items are supposed to be indexed. However, section 2 is implemented with JavaScript, which can create accessibility issues for those URLs.

Embedded text and links

To make the text and links accessible to search engines, use CSS positioning and image replacement to overlay text on background images. Another option is to create the carousels or banners in plain HTML and CSS:

Figure 213 – The carousel in the image above is built with CSS and HTML.

Figure 214 – The text on this banner is overlaid with CSS and HTML as well. You can check if the text is plain HTML by selecting the content of the page using CTRL+A. Plain text whitens out when selected in the browser.

When the imagery is more complex, and you cannot implement image replacement or plain CSS/HTML carousels, use image alt text along with <area> and <map> tags to create the links.

For example, the image below embeds three call-to-action links: shop now, sign up for emails, and like us on Facebook.

Figure 215 – The three calls to action in this example are implemented with area maps.

The code deploys the HTML <map> element with three areas to make the links clickable. Each area tag has its alt text:

Figure 216 – The above image depicts the HTML source code for the previous screenshot.

If you think about it, maps inside images make sense for ecommerce websites. For example, apparel websites that feature model looks could allow users to click on the hat, the pants, or any other piece of clothing depicted in the hero images, to send users directly to product pages. The area tag will effectively make such images shoppable.

You can improve image maps a bit with SEO-friendly tooltips and hot spots that expand at click. Notice the + and $ signs in these two carousels:

Figure 217 – The two hotspots can be used to add more text content.

When you hover the mouse over these two icons, the + and $ signs expand to provide more details. The text in the tooltips is indexable:

Figure 218 – The text in the + tooltip is indexable.

Since <map> and <area> tags are not commonly used by marketers, here are a couple of tips for you:

  • Every area element should have an alt attribute, even if it is empty (alt="").
  • The alt text should describe the image in 150 characters maximum.
  • According to Microsoft, the alt text should not start with the word “copyright” or with the copyright symbol [© or (c)][11] or any other character or symbol that has no search-engine relevance. Start the alt text with the most important words.
  • Avoid Flash area maps.
  • Use XML or text files for the tooltip content. This will allow copywriters to make changes easily.
  • If a particular Google ad you tested worked well, it might also help with the CTR on tooltip links, since ad titles are usually fewer than 25 characters.
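The first three tips are easy to check automatically. Here is a minimal sketch that audits the <area> elements in a piece of raw HTML using the Python standard library; the sample image map, its URLs, and its alt texts are hypothetical.

```python
from html.parser import HTMLParser


class AreaAltAuditor(HTMLParser):
    """Collects (href, problem) pairs for <area> elements that break the alt text tips."""

    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        if tag != "area":
            return
        attributes = dict(attrs)
        href = attributes.get("href", "(no href)")
        if "alt" not in attributes:
            self.issues.append((href, "missing alt attribute"))
            return
        alt = (attributes["alt"] or "").strip()
        if len(alt) > 150:
            self.issues.append((href, "alt text longer than 150 characters"))
        if alt.lower().startswith(("copyright", "©", "(c)")):
            self.issues.append((href, "alt text starts with a copyright notice"))


def audit_area_alts(raw_html):
    auditor = AreaAltAuditor()
    auditor.feed(raw_html)
    return auditor.issues


# Hypothetical image map with one good area and two problematic ones.
SAMPLE_MAP = """
<map name="promo">
  <area shape="rect" coords="0,0,200,80" href="/sale" alt="Shop the summer sale now">
  <area shape="rect" coords="0,90,200,170" href="/newsletter">
  <area shape="rect" coords="0,180,200,260" href="/about" alt="Copyright Example Store">
</map>
"""

for issue in audit_area_alts(SAMPLE_MAP):
    print(issue)
```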

Merchandising areas

Products and categories linked from the merchandising areas of the home page such as hot deals, best sellers, or top brands tend to receive more internal link authority than products linked using structural links. Therefore, you can use these areas to push more SEO love to the products and categories that are the most important for your business.

Figure 219 – Merchandizing is a very common practice among online consumer electronics retailers.

Sometimes, the items listed in these areas are implemented with carousels. Make sure that the design of the carousels allows people to identify that they are looking at a carousel, and make sure that they have full control of the play, pause, next, and previous functionalities.

Many users will probably miss the black arrows and the dots that control the carousel in the previous image.

If you want to send PageRank to the items in the carousel, implement it in an SEO-friendly way, as previously discussed. When you have to use AJAX to load the items, I recommend loading a default set of items in the first slide of the carousel. This first slide is loaded in the raw HTML and is accessible to search engines. Load the next slides with AJAX when the user clicks the controller buttons.
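The split between the first slide and the remaining slides can be sketched on the server side as follows. This is only an illustration under stated assumptions: the item tuples, the data-more attribute, and the /carousel/slides endpoint are hypothetical, and your front-end code would still need to request and display the deferred items.

```python
import json
from html import escape


def render_carousel(items, first_slide_size=4, slides_endpoint="/carousel/slides"):
    """items: list of (name, url) tuples.

    Returns (HTML for the first slide, JSON payload for the slides loaded later).
    """
    first, deferred = items[:first_slide_size], items[first_slide_size:]
    # Only the first slide is written as plain links in the raw HTML.
    links = "\n".join(
        f'  <li><a href="{escape(url)}">{escape(name)}</a></li>' for name, url in first
    )
    html = f'<ul class="carousel" data-more="{slides_endpoint}">\n{links}\n</ul>'
    # The remaining items are served separately when the user clicks next.
    deferred_json = json.dumps([{"name": name, "url": url} for name, url in deferred])
    return html, deferred_json


# Hypothetical carousel items.
SAMPLE_ITEMS = [("Product A", "/p/a"), ("Product B", "/p/b"), ("Product C", "/p/c")]

first_slide_html, later_slides = render_carousel(SAMPLE_ITEMS, first_slide_size=2)
print(first_slide_html)
print(later_slides)
```

Put the items you most want crawled in the first slide, since those are the only ones guaranteed to be in the raw HTML.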

Here are some SEO tips to optimize the items listed in the merchandizing areas:

  • It is not mandatory to wrap the merchandising section name in an HTML heading, but it can be done if you want. Since there will likely be more than one merchandising area, I recommend using level 2 headings (H2) or lower-level headings (H3, H4). H1s are usually used on more important labels, and for ecommerce templates, it is not a good idea to have multiple H1s on the same page.

Figure 220 – A sample headings outline.

  • On product listing pages, the product image thumbnails and the product names need to be optimized to send one consolidated signal. This means that the image alt text and anchor text of the product should be the same, or slightly different.

Figure 221 – The image’s alt text is the same as the anchor text.

  • Add one to three links to manufacturers, brands or relevant product attributes, whenever you have the space to do so:

Figure 222 – You can see how Amazon links to various category pages from the Top Holiday Deals section.

  • Avoid linking with generic anchor text:

Figure 223 – The “See Details” link in the image above is not optimal. Instead, the product name should be the link. If you want to keep the “see details” link, you can embed it in an image with the product name as alt text.

  • Do you need a “Buy Now” or “Add to Cart” link on the items listed in the merchandizing areas of the home page? Do visitors add to cart directly from the homepage? If not, consider either removing such strong CTAs or replacing them with softer CTAs such as “More details” or “Find more”.

If you are keen on using add to cart buttons in these sections, implement them with either the HTML <button> element or using JavaScript:

Figure 224 – The “More Details” link is implemented as an HTML <button> element, so it won’t be crawled.

Figure 225 – This “add to cart” link is implemented with JavaScript, which means that the chances of being followed and crawled are minimized.

  • If you want to add even more content inside merchandizing areas, use CSS expand and collapse features:

Figure 226 – When you click on the “more details” link in this example, it displays more information about the “MacBook Pro” sale.

  • Use alt text and image map areas for links whenever you use images to display promotional banners.
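Two of the tips above, avoiding generic anchor text and keeping the image alt text consistent with the anchor text, lend themselves to a simple automated check. The sketch below assumes you can already extract the anchor text and the thumbnail alt text for each item; the list of generic phrases and the product names are hypothetical.

```python
GENERIC_ANCHORS = {"see details", "click here", "more", "read more", "learn more"}


def audit_product_link(anchor_text, image_alt):
    """Returns a list of issues for one item in a merchandising area."""
    issues = []
    anchor = anchor_text.strip().lower()
    alt = image_alt.strip().lower()
    if anchor in GENERIC_ANCHORS:
        issues.append("generic anchor text")
    if not alt:
        issues.append("missing image alt text")
    elif anchor not in GENERIC_ANCHORS and not (anchor in alt or alt in anchor):
        # "Slightly different" is approximated here as one text containing the other.
        issues.append("alt text and anchor text differ substantially")
    return issues


print(audit_product_link("See Details", "Acme 15-inch laptop"))
print(audit_product_link("Acme 15-inch laptop", "Acme 15-inch laptop, silver"))
print(audit_product_link("Acme 15-inch laptop", "IMG_0042"))
```

The containment test is a deliberately crude stand-in for “the same, or slightly different”; a fuzzier comparison may suit catalogs with long product names.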

Logo

Most logos are implemented as images. Sometimes, logos embed the company’s tagline, slogan, or unique selling proposition.

Alt text or image replacement?

There is a lot of debate about how to implement logos properly, not only from an SEO point of view but also as HTML markup. Regarding SEO, some argue that using the alt text on the logo image is enough; others recommend using the image replacement technique. Regarding HTML markup, some believe logos should be wrapped in H1, while others say it should be an H2 or have no heading markup at all.

No matter how you choose to provide better context for a logo image, whether using alternative text or image replacement, that content will not provide a big lift in rankings. Just because you use the primary keyword in the alt text of the logo (for example SiteName – Digital Cameras) does not mean that search engines will consider your website the authority for that keyword. It may help a tiny bit in obscure niches.

If you can easily and safely implement image replacement for your logo, do it. Image replacement is not spamming if you do not abuse it with keyword stuffing or other crazy stuff. W3C uses image replacement in their logo; A List Apart and Smashing Magazine use it too. Moz used to do the same, before their site redesign.

However, keep in mind that using alt text on image logos will work just as well as image replacement.

Let’s look at a few options you can consider for the text describing the logo, implemented either with alt text or image replacement:

  1. You can use just the company name (e.g., “Staples logo”).
  2. Alternatively, use your company name plus two or three top categories (e.g., “Dell laptops, tablets, and workstations”).
  3. You can also use a dynamic text that includes the brand name plus a category name. In this case, the text changes from one page to another. For example, on the Home page you will use the alt text “Microsoft logo”, but on the Tablets page you will use the alt text “Microsoft logo – Tablets”.
  4. Another option is to use the company name plus a tagline, unique selling proposition or slogan (e.g., “KOHL’s – expect great things”). Ideally, the slogan is short, descriptive and contains some keywords.

Figure 227 – Kohl’s slogan.

  5. You can also use the company name, followed by the company slogan in plain text:

Figure 228 – HP’s slogan.

HP’s slogan states the focus of the website and cleverly includes the keyword pattern {Printing Solutions}. This improves relevance for keywords like “large format printing solutions”, “commercial printing solutions”, and “industrial printing solutions”.

I like implementing option #5 wherever possible, followed by #4, #3, #2 and #1. In each case, the text must be representative of the logo it describes and should not be spammy.

You can provide context for search engines using the logo’s alt text or image replacement. Both should be user-friendly and are OK for SEO as long as you do not spam. Note that the alt text on clickable images is the equivalent of the anchor text on text links, so it is worth spending time optimizing your alt texts.

However, some implementations do not allow alternative text on logos. Two such implementations are CSS backgrounds and CSS sprites.

Figure 229 – Walmart uses CSS sprites for their logo and other icons. In the cached version of the page, the alt text for the logo is missing since the logo was implemented as a CSS background.

The logo should link using the homepage canonical URL to consolidate relevance signals. This ensures that the internal reputation is not split between multiple URLs.

For example, do not link to the home page using both the default URL (i.e., index.php) and the root (/). Choose a canonical version, which is usually the root, and link consistently. This is not an issue for search engines nowadays. However, it is a web development best practice to be consistent with your internal links, just as using only lower case for your UTM tag values is a best practice for tracking and analytics.
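To make the consistency rule concrete, here is a minimal Python sketch of a helper that rewrites home page links to the root before templates output them. The function name and the list of default document names are my own assumptions for illustration; adjust them to whatever your platform actually uses.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: default-document names that duplicate the root URL on your platform.
DEFAULT_DOCUMENTS = ("index.php", "index.html", "index.htm", "default.aspx")


def canonical_home_href(href: str) -> str:
    """Rewrite a home page link such as /index.php to the root (/).

    Links that do not point at a root-level default document are returned unchanged.
    """
    parts = urlsplit(href)
    path = parts.path
    # Only touch root-level default documents, e.g. "/index.php" or "index.php".
    if path.lstrip("/").lower() in DEFAULT_DOCUMENTS:
        path = "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

Running every logo and "Home" link through a helper like this keeps the internal links pointing at a single URL, whichever template generated them.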

Wrap the logo using the Organization markup

Google supports[12] Schema markup for organization logos[13]. This means that you can mark up your HTML code to specify which image should show up as the logo in the Google Knowledge Graph when someone searches for your brand name.

Simply wrap the logo using the Organization markup as in this example:[14]

<div itemscope itemtype="http://schema.org/Organization">

<a itemprop="url" href="http://www.example.com/">Home</a>

<img itemprop="logo" src="http://www.example.com/logo.png" />

</div>

This will help get more SERP estate dedicated to your brand.
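The same Organization data can also be expressed as JSON-LD instead of microdata. The sketch below is only an illustration: the function name is hypothetical, and the URLs are the example.com placeholders from the markup above. It builds the script tag in Python using the schema.org `url` and `logo` properties.

```python
import json


def organization_logo_jsonld(site_url: str, logo_url: str) -> str:
    """Return a JSON-LD <script> tag describing the organization's URL and logo."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "url": site_url,    # same value as itemprop="url" in the microdata version
        "logo": logo_url,   # same value as itemprop="logo"
    }
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )


print(organization_logo_jsonld("http://www.example.com/", "http://www.example.com/logo.png"))
```

Whichever syntax you choose, use one of them consistently and validate the output with a structured data testing tool before deploying it.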

One frequent question is “Should logos be wrapped in H1?”

Wrapping the logo in an H1 heading is highly contentious.[15] Additionally, the SEO influence of H1s is too small to justify the effort.[16],[17]

My stance is that the logo should not be marked up with H1, or with any other heading, for that matter. A heading is a textual element that should be marked up as text, while a logo is a branding image and should be marked up as an image.

Utility links

Site personalization, user account login, help, and cart links can be labeled as what usability experts call utilities (Steve Krug[18]) or courtesy navigation (Jesse James Garrett[19]) links.

Site personalization

Some of the best-known site personalization links are shipping destination, language, and currency selectors:

Figure 230 – A “ship to” country selector.

These links personalize the shopping experience based on user geolocation, shopping currency, or site language. For example, someone vacationing in France might visit a Canadian website to send a gourmet gift basket to their loved ones in Canada. You identify the visitor’s IP address as French and change (automatically or through a pop-up) the ship-to country to France, the item currency to EUR, and the language to French. However, that guess can be wrong, as it is in this example, so you need to give users the ability to change these settings.

Figure 231 – In this example, the language is set to French, and the currency is set to Euro.

One of the common SEO mistakes with site personalization links such as ship-to, language, and currency selectors is that they create crawlable URLs, even if they are temporary 302 redirects. This happens because, in many cases, the user selection is kept in the URL, as in the image below:

Figure 232 – These crawlable URLs will create duplicate URLs for each page that has currency selectors, which is not desirable.

The solution is straightforward: do not create crawlable URLs for such selectors; keep the user choices in cookies rather than in URLs. Additionally, you can use AJAX to load the user choices and to set up cookies.
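As a rough sketch of the cookie approach, the Python function below takes a URL that carries selector choices, strips those parameters, and returns the values to send in `Set-Cookie` headers instead. The parameter names (`currency`, `lang`, `shipto`) and the function name are assumptions for illustration; your platform will have its own.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: the query parameters your selectors currently append to URLs.
PERSONALIZATION_PARAMS = {"currency", "lang", "shipto"}


def move_choices_to_cookies(url):
    """Return (clean_url, set_cookie_values) for a URL carrying selector choices."""
    parts = urlsplit(url)
    cookie = SimpleCookie()
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key in PERSONALIZATION_PARAMS:
            # Store the user's choice in a site-wide cookie instead of the URL.
            cookie[key] = value
            cookie[key]["path"] = "/"
        else:
            # Unrelated parameters (e.g. pagination) stay in the URL.
            kept.append((key, value))
    clean_url = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
    set_cookie_values = [morsel.OutputString() for morsel in cookie.values()]
    return clean_url, set_cookie_values
```

For example, `move_choices_to_cookies("https://www.example.com/tablets?currency=EUR&page=2")` keeps `page=2` in the URL and moves the currency choice into a cookie, so only one URL per page remains for crawlers to find.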

Figure 233 – When the ship to icon on the top right is clicked, a modal window opens for users to make their selections. When the CONTINUE button is clicked, the choice updates are made using AJAX, and the users’ choices are kept in cookies.

Store locator

These links are present in more or less prominent places in the page layout, depending on how important the web-to-store behavior is for each company. Sometimes, the link will be in the masthead, sometimes at the bottom of the pages, in the footer.

Figure 234 – Walmart puts a lot of emphasis on store locations because web-to-store visits are essential for them.

Figure 235 – The store locations don’t seem too important to Nordstrom.

The Store Locations link being positioned in the footer implies to a certain degree that the web-to-store traffic is not important for Nordstrom. However, the mobile version of the website displays the store location icon in the primary navigation:

Figure 236 – The mobile version of Nordstrom’s website.

This mismatch between the mobile and desktop designs makes me wonder whether the desktop design was well planned.

The store locator link should not be nofollow or otherwise blocked to search engines. The link should land users on a page that lists all the store locations or that allows easy and quick location searches. Additionally, you should have a dedicated landing page for each store location and create a separate XML Sitemap for store location pages.
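A separate XML Sitemap for store pages can be generated with a few lines of code. The sketch below uses Python’s standard library; the function name and the store URLs in the usage note are placeholders, not real pages.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_store_sitemap(store_urls):
    """Return an XML Sitemap (as a string) listing one <url> per store page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for store_url in store_urls:
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = store_url
    xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    return xml_bytes.decode("utf-8")
```

Calling `build_store_sitemap(["https://www.example.com/stores/toronto/", "https://www.example.com/stores/ottawa/"])` returns a document you could save as, say, a dedicated store sitemap file and submit separately, which makes it easier to see how many store pages get indexed.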

Cart links

These are the shopping cart and checkout links.

Figure 237 – Most of the time, the shopping cart icon is placed at the top right of the page. Amazon has trained shoppers to expect it in that area.

Perpetual mini-shopping carts[20] are often implemented with AJAX and are not crawlable. That is fine because you do not need the shopping cart or checkout pages to be crawled or indexed. On a side note, such carts are called perpetual because they display the number of items in the cart while users navigate other parts of the website. Persistent carts are the carts carried across multiple sessions, when users place something in the cart, leave the website and then come back at a later date, and the items are still in the cart.

Some prefer to nofollow these URLs as a way to preserve the crawl budget. This will not hurt, but it is not necessary for SEO. Checkout pages can either be blocked in robots.txt or noindexed with the meta robots tag at the page level.
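If you go the robots.txt route, it is worth checking that the rules block what you think they block. Here is a small sketch using Python’s built-in robots.txt parser; the `/cart` and `/checkout` paths are assumptions, so substitute your own.

```python
from urllib.robotparser import RobotFileParser

# Assumption: cart and checkout pages live under /cart and /checkout.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawlable(ROBOTS_TXT, "https://www.example.com/checkout/payment"))
print(is_crawlable(ROBOTS_TXT, "https://www.example.com/tablets"))
```

The first call should report the checkout URL as blocked and the second should report the category URL as crawlable. Keep in mind that this parser only approximates how each search engine interprets robots.txt, so treat it as a sanity check rather than proof.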

Help links

These are the links to Contact Us, FAQ, Live Chat, Help Center, and similar pages. If your current Live Chat link is not implemented with JavaScript, block it with robots.txt. Links such as Help, FAQ, and Contact Us should be accessible to robots as plain HTML links.

You can consolidate many help links under a single menu to list more links in a limited space:

Figure 238 – The dropdown list under the Need Help? label contains important help links for users.

On many websites, help links also appear in the footer, which means the links will be duplicated.

User account links

These are links that allow users to perform actions such as creating or logging into their accounts, registering, tracking their order status, etc.

Figure 239 – Best Buy disallows the entire secure www-SSL subdomain. This subdomain hosts the user accounts, and it has been blocked. That is a good approach.

Usually, account links lead to secure HTTPS sections, and there is no need for search engines to crawl or index account pages.

You will notice that many ecommerce websites nofollow the user account links:

Figure 240 – User account links are nofollow.

The red border in the image above (and below) means that the links are nofollow. Both Target and OfficeMax nofollow user account pages, but this is not necessary nowadays.

Figure 241 – The red dotted border means that those links are nofollow.

Google recommends leaving the nofollow off not only for account links but for all internal links.[21] This, they say, allows PageRank to flow freely throughout your website.

Keep in mind that controlling the crawl, no matter how it is done (e.g., with nofollow, robots.txt or uncrawlable JavaScript) is tricky.

Here’s why:

  • If you control crawling with robots.txt, PageRank flows into robotted URLs,[22],[23] but it does not flow out from those URLs.
  • If you control crawling with nofollow, PageRank does not flow into the nofollow URL,[24] but it flows out from the linked-to URL.
  • If you control crawling with uncrawlable links, no PageRank flows in, but it flows out (if search engines discover the page in a different way).

Ask yourself, will your account pages help searchers if search engines index them? Do searchers use terms like “Account Login at {your_company_name}”? If not, is there any other reason to have these pages indexed? If there is no other reason, consider blocking bot access to such pages.

Paradoxically, if you want pages completely out of search engines’ indices, you first have to allow bots to crawl them and add a noindex meta tag at the page level, because a bot can see the tag only on pages it is allowed to crawl. This means you will block those pages with robots.txt only after they have been crawled.
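The conflict between robots.txt and noindex is easy to miss, so here is a minimal Python sketch that flags it. Everything here is illustrative: the function and class names are mine, and the check only looks at the meta robots tag, not at HTTP headers.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Records whether the page has a meta robots tag containing noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_robots_meta = tag == "meta" and (attributes.get("name") or "").lower() == "robots"
        if is_robots_meta and "noindex" in (attributes.get("content") or "").lower():
            self.noindex = True


def noindex_status(robots_txt: str, url: str, html: str) -> str:
    """Say whether a noindex tag exists and whether crawlers are allowed to see it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    if not finder.noindex:
        return "no noindex tag"
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch("*", url):
        return "noindex present but hidden: robots.txt blocks crawling"
    return "noindex visible to crawlers"
```

If the function reports the tag as hidden, the page is blocked before bots can read the noindex, which is exactly the situation described above.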

If you are concerned about link juice flow, robots.txt might be a better crawl optimization alternative to nofollow. PageRank does not flow out of robots.txt blocked pages because Google is not able to crawl those pages to determine how PageRank should flow through each link. However, if pages were indexed before being blocked by robots.txt, Google knows where and how to pass PageRank. Theoretically, leaving those pages open for crawling once in a while should allow Google to crawl them temporarily, and flow PageRank.

However, if you are not worried about PageRank flow, there are a few alternatives to robotted and nofollow account links:

  1. Consolidate links into groups:

Figure 242 – The Your Account section is a drop-down menu that consolidates several links.

Figure 243 – These account pages are accessible with JavaScript off.

2. Deactivate some account links until the user signs in:

Figure 244 – When you click on sign in, the drop-down menu displays account URLs, but they are not active.

In the example above, the links to Your Profile, Check Order Status, Points Balance, Your Coupons and Your Lists are not active HTML links until you sign in.

Additionally, the sign in, join for free, and My Location links are implemented with JavaScript and are not accessible to search engines:

Figure 245 – Search engines can’t find the sign in, join for free, and My Location links.

3. Use a modal window to log in users:

Figure 246 – Clicking on the sign in link at the top right of the page opens a JavaScript modal.

This modal window is loaded on demand, and its content is not accessible to search engines at page load.

Footers

Footers remain one of the most abused site-wide sections of websites. This is probably because footer links still work, at least for some websites, despite Google saying that this strategy is not acceptable and that it does not work.

Figure 247 – Look no further than Amazon to see the risky use of keyword-rich footer links.

The footer is often the last section that site users will check if they are unable to find the information they need anywhere else on the page. Frequently, users will not check the footer at all. It is thus important not to bury links there.

Footers are usually implemented as boilerplate text and will send less link authority to the linked-to pages. Yahoo! even stated that they might devalue footer links:

“The irrelevant links at the bottom of a page, which will not be as valuable for a user, do not add to the quality of the user experience, so we do not account for those in our ranking”.[25]

Google sends contradictory signals to webmasters by stating that site-wide links are outside their content quality guidelines, but at the same time not acting against those who abuse such links.

Since footers are at the very bottom of web pages, the CTR on footer links is pretty low. However, footers can be great for user experience, especially fat footers,[26] and they may be a pretty useful internal linking tool as well.

Let’s discuss some ideas for improvement.

Do not repeat the primary navigation in the footer

If some links are listed in the primary navigation menu, you do not need to repeat them in the footer. The most important links for users should be somewhere in the masthead. Keep the footer for relevant but less important links.

Group links logically

If you want to reduce the number of links in the footer, consider consolidating multiple URLs in a single page. If space is a concern, you can implement JavaScript dropdowns like in the image below:

Figure 248 – Instead of creating unique URLs for each topic in the Your Orders section, create just one main URL for Your Orders and include the content of all sections on that page (e.g., Order Status, Shipping & Handling, etc.)

Take a look at how Staples and YouTube implemented this. Both are very good examples of useful footers:

Figure 249 – The Corporate Info “pop-up” menu links to relevant pages, but it takes up less visual space.

Figure 250 – Staples leaves the links open for crawling.

Also, the links in the pop-up menu are open for crawling as you can see in the cached version of the page.

Figure 251 – Country selector on YouTube.

On YouTube, the country listing is made available with AJAX only when the Country selector is clicked. Imagine if this list was available in plain HTML on every single page of the website.

Figure 252 – The country links are not available to crawlers.

Walmart uses [+]expand and [-]collapse links to give users more options, but search engines do not have access to subcategory links:

Figure 253 – A click on the small [+] signs opens up several subcategories.

While the top categories links such as Electronics, Bikes, and Toys, are crawlable, their corresponding subcategories like Laptops, Apple iPads, Tablets, and TVs, are not:

Figure 254 – This approach combines user experience and SEO.

Remove useless links

Track clicks on footer links either with URL parameters or using click tracking tools, such as CrazyEgg. Are you helping users with those links, or do you want to have an optimized footer just for search engines? If you find that people do not click on some of your “optimized” links, then do not link them from the footer.
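If you track footer clicks with a URL parameter, counting them can be as simple as the Python sketch below. The `ref=footer` parameter and the function name are assumptions for illustration, and the input is any list of requested URLs, for example pulled from your server logs or analytics export.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def footer_clicks(requested_urls, param="ref", value="footer"):
    """Count requests per path that carry the footer tracking parameter."""
    counts = Counter()
    for url in requested_urls:
        parts = urlsplit(url)
        # parse_qs returns a list of values for each parameter name.
        if value in parse_qs(parts.query).get(param, []):
            counts[parts.path] += 1
    return counts
```

Calling `footer_clicks(urls).most_common()` lists the footer links from most to least clicked, which gives you candidates both for removal and for promotion to a more prominent spot.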

On the other hand, if a link in the footer gets a high number of clicks, consider placing it in a more prominent location on the page.

Test having versus removing links in the footer and measure how each affects conversion rates, SEO, or usability.

Do not abuse footers

We know for a fact that search engines do not like external site-wide footer backlinks.[27] Also, we know that search engines treat links in boilerplate sections differently from contextual links, where the former do not pass as much link authority.

Footer links are not inherently bad; in fact, it may not be the links that cause problems, but the general abuse of footers.

Footers seem to be one of the preferred sections for over-optimization. Look at the following screenshot, for example:

Figure 255 – The extensive use of “women’s” and “men’s” in these anchors is unnecessary.

If you stuff exact anchor text keywords in the footer just because doing so still works, mind the Penguin penalty that may come with this tactic. That is right, Penguin is not only about external backlinks, but about internal links as well. For more information about this see pointer number one in this video,[28] and see this comment[29] as well as this video.[30]

Figure 256 – Site-wide links with exact anchor text like in this example will create problems. The larger the website, the higher the chances of being filtered by Panda or Penguin.

Another trick you must avoid is placing “SEO content” far below the footer, well below the “normal” view area. Even if Google is not able to penalize you algorithmically if you do this, this page certainly will not pass a manual review.

Figure 257 – The “SEO content” starts after the real footer ends, which is a foolish attempt to trick users and search engines. I wonder whether that is why this website had a gray toolbar PageRank in the past.

Tabbed navigation

If you need to provide users more links in the footer, tabbed navigation is another option as it will allow you to display more links. Take a look at this example:

Figure 258 – Tabbed navigation will allow you to display more links.

You can optimize tabbed navigation even further by adding even more text in the footer. Take a look at how 1-800 Contacts does this:

Figure 259 – A great usage of tabbed navigation in the footer.

Each link on the left side triggers a new tab. For example, when you click on the Our Commitment link a new tab is loaded:

Figure 260 – This is good content for a footer. Adding a few internal contextual text links would bring even more SEO value to this section.

Dynamic footers

Ecommerce websites can make footers more appealing to search engines by dynamically updating the content and the links in footers to be relevant to each section of the website and even to each page. This approach works best when implemented with a tabbed navigation that allows at least 150 words of content to be displayed in the footer.

Consider the same example from 1-800 Contacts. They could add a new tab to the page where the footer is displayed, featuring a relevant excerpt from a recent blog post. For example, on the Avaira brand page, the new tab name could be something like Avaira News, and the blog post excerpt would contain the brand name and, eventually, one or two links to Avaira products. The footer on the Biomedics brand page would be related to Biomedics products, and so on.

Just as properly related linking is helpful for users, adding page-specific links in the footer is also good for usability, as it customizes the shopping experience to meet users’ expectations.

You can even customize the links in the tabbed navigation. For example, on a product detail page for Avaira lenses you can dynamically change the link from “Contact Lens Brands” to “Avaira Contact Lenses”, and list only products manufactured by Avaira.
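Here is a minimal sketch of that idea in Python. The function name and the `/brands/` URL structure are hypothetical; the point is only that the footer template picks its link text and target from the brand of the page being rendered.

```python
def footer_brand_link(page_brand=None):
    """Return (anchor_text, href) for the footer's brand link.

    page_brand is the brand of the page being rendered, or None on pages
    that are not tied to a single brand.
    """
    if page_brand:
        # Assumption: brand listing pages live under /brands/<slug>/.
        slug = page_brand.lower().replace(" ", "-")
        return (f"{page_brand} Contact Lenses", f"/brands/{slug}/")
    # Generic pages keep the generic link.
    return ("Contact Lens Brands", "/brands/")


print(footer_brand_link("Avaira"))
print(footer_brand_link())
```

On an Avaira page the first call yields the “Avaira Contact Lenses” anchor, while pages without a brand fall back to “Contact Lens Brands”.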

Dynamic footers make sense if you keep user intent in mind. These footers will not only help users by presenting content relevant to the page they are on but will also help with varying the internal anchor text. Dynamic footers will reduce the occurrences of exact match anchor text and will prevent the footer from generating site-wide links. Currently, many ecommerce websites do not use this concept.

During the past 20+ years, footers –as compared to other areas of a page– have not evolved into something more helpful for users. Maybe you can start changing that and work it to your organic advantage, at the same time.

Debugging and support

Footers can also be used as a debugging, development or customer-support feature.

Tagging footers with unique text containing the year and month the page was last generated or updated (e.g., “page generated September 2017”) will allow some basic crawl debugging a month or two later with the site: operator, for example by using site:mysite “page generated September 2017”.[31]
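A small sketch of how the stamp and the matching search query could be produced in Python. The function names are my own, and the month names are spelled out explicitly so the output does not depend on the server’s locale.

```python
from datetime import date

MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def footer_stamp(generated_on: date) -> str:
    """Return the footer text, e.g. 'page generated September 2017'."""
    return f"page generated {MONTHS[generated_on.month - 1]} {generated_on.year}"


def crawl_debug_query(domain: str, generated_on: date) -> str:
    """Return the site: query that finds pages carrying that month's stamp."""
    return f'site:{domain} "{footer_stamp(generated_on)}"'


print(crawl_debug_query("mysite.com", date(2017, 9, 14)))
```

The template prints `footer_stamp()` in the footer, and a month or two later you paste the output of `crawl_debug_query()` into the search box to see which pages were crawled with that stamp.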

  1. Does User Annoyance Matter? http://www.nngroup.com/articles/does-user-annoyance-matter/
  2. Mega Menus Work Well for Site Navigation, http://www.nngroup.com/articles/mega-menus-work-well/
  3. Findability, http://en.wikipedia.org/wiki/Findability
  4. The Paradox of Choice, http://en.wikipedia.org/wiki/The_Paradox_of_Choice:_Why_More_Is_Less
  5. ‘How We Decide’ And The Paralysis Of Analysis, http://www.npr.org/templates/story/story.php?storyId=122854276
  6. Avoid Category Names That Suck, http://www.nngroup.com/articles/category-names-suck/
  7. Interface Design >Navigation > Trunk Testing, http://jrivoire.com/ED722/trunktest.html
  8. Nine Techniques for CSS Image Replacement, http://css-tricks.com/css-image-replacement/
  9. Auto-Forwarding Carousels and Accordions Annoy Users and Reduce Visibility, http://www.nngroup.com/articles/auto-forwarding/
  10. Are carousels effective? http://ux.stackexchange.com/questions/10312/are-carousels-effective/10314#10314
  11. WEB1009 – The <img> or <area> tag does not have an ‘alt’ attribute with text, http://msdn.microsoft.com/en-us/library/ff724032(Expression.40).aspx
  12. Using schema.org markup for organization logos, http://googlewebmastercentral.blogspot.fr/2013/05/using-schemaorg-markup-for-organization.html
  13. Thing > Property > logo, http://schema.org/logo
  14. Thing > Organization, http://schema.org/Organization
  15. The H1 debate, http://www.h1debate.com/
  16. Survey and Correlation Data, http://moz.com/search-ranking-factors
  17. Whiteboard Friday – The Biggest SEO Mistakes SEOmoz Has Ever Made, http://moz.com/blog/whiteboard-friday-the-biggest-seo-mistakes-seomoz-has-ever-made – Check #4 starting around 5:00
  18. Don’t Make Me Think: A Common Sense Approach to Web Usability, 2nd Edition – http://www.amazon.com/Dont-Make-Me-Think-Usability/dp/0321344758/ref=la_B001KHCFUU_1_1?s=books&ie=UTF8&qid=1387533022&sr=1-1
  19. The Elements of User Experience: User-Centered Design for the Web and Beyond (2nd Edition) (Voices That Matter), http://www.amazon.com/Elements-User-Experience-User-Centered-Design/dp/0321683684/ref=sr_1_1?s=books&ie=UTF8&qid=1387533087&sr=1-1&keywords=jesse+james+garrett
  20. Persistent Shopping Carts vs. Perpetual Shopping Carts, http://www.getelastic.com/persistent-shopping-carts-vs-perpetual-shopping-carts/
  21. Should I use rel=”nofollow” on internal links to a login page? https://www.youtube.com/watch?v=86GHCVRReJs
  22. PageRank: will links pointing to pages protected by robots.txt still count?, http://webmasters.stackexchange.com/questions/5534/pagerank-will-links-pointing-to-pages-protected-by-robots-txt-still-count/5548#5548
  23. Will a link to a page disallowed in robots txt transfer PageRank, https://www.youtube.com/watch?v=j6H3xBcvkZY
  24. PageRank sculpting, http://www.mattcutts.com/blog/pagerank-sculpting/
  25. Eric Enge Interviews Yahoo’s Priyank Garg, http://www.stonetemple.com/articles/interview-priyank-garg.shtml
  26. SEO and Usability, http://www.nngroup.com/articles/seo-and-usability/
  27. Link schemes, https://support.google.com/webmasters/answer/66356?hl=en
  28. Smarter Internal Linking – Whiteboard Friday, http://moz.com/blog/smarter-internal-linking-whiteboard-friday
  29. Smarter Internal Linking – Whiteboard Friday, http://moz.com/blog/smarter-internal-linking-whiteboard-friday#comment-189218
  30. Webmaster Central 2013-12-16 https://www.youtube.com/watch?v=Snarx2wBlWg @ 27:52
  31. How to Build an Effective Footer, http://graywolfseo.com/seo/build-effective-footer/

CHAPTER 7

Category & Product Listing Pages (PLPs)

Length: 21,316 words

Estimated reading time: 2 hours, 25 minutes

Category & Product Listing Pages

Those involved in ecommerce in one way or another refer to product detail pages (also known as PDPs) as the “money pages”. This seems to imply that many view PDPs as the most important pages for ecommerce. Because of this mindset, PDPs often get the most attention, at the expense of listing pages such as product or category listing pages.

However, listing pages are in fact the hubs for ecommerce websites, and can collect and pass the most equity to lower and upper levels in the website hierarchy. Also, link development for ecommerce usually focuses on category and subcategory pages, so listing pages deserve more attention.

Listing pages display content in a grid or list. When these pages list products, they are referred to as product listing pages or PLPs. When the pages list categories, subcategories, guides, cities, services, etc., they are referred to as category landing pages or simply, category pages; sometimes they are also called intermediary category pages.

Two types of listings

Listing pages usually display one of two types of items:

  • Products – this listing displays items belonging to the currently viewed category.
  • Subcategories – this listing displays subcategories under the currently viewed category or department.

Product listings

Product lists (or grids) display thumbnail images for all the items categorized in a certain category or subcategory. This means that all the items listed there share a common parent in the hierarchy.

Figure 261 – This screenshot shows a traditional product grid. All the items displayed in the main content area belong to the Guitars category.

The product list approach has the advantage of sending more authority directly to the products in the list, especially to those on the first page of the list. However, these listings can present too many options to users, who may have to sift through hundreds or thousands of products, as depicted in the image below:

Figure 262 – 2,839 clothing items in a single category will require pagination.

In many cases, showing the entire list of products belonging to a top-level category will not make sense to users. They need guidance in choosing a product, and listing thousands of items is too much and too generic.

Let’s talk about several recommendations for optimizing product listings.

Deploy an SEO-friendly Quick View functionality
Use this feature to provide more content and context for users and search engines. This functionality is usually implemented with modal windows to quickly provide product summary information without visiting the actual product detail page:

Figure 263 – A click on the QUICK LOOK button brings up the modal window to the right. This functionality can lead to a better shopping experience.

To make this functionality work to your advantage, implement it with SEO-friendly JavaScript, whenever possible. For example, you can deliver more crawlable content by loading the static product description in the source code but displaying it in the browser only when Quick Look is clicked. Dynamic information such as product availability, available colors, or pricing can be loaded on-demand with AJAX.

Just as with any other method that displays content to users only at certain browser events, it is wise not to abuse the Quick Look implementation. This means that the content should be super-relevant and brief. Fifty to 150 words for the product description is probably more than enough.

Also, the internal linking should not go overboard; two to five links in the short product description is enough.

Additionally, you may want to consider the number of items you load in the default view, which is the view that gets cached by search engines. If you load 20 products each with 100-word descriptions, that is 2,000 words of content on that page. If you load 50 products, that is 5,000 words, which may be too much.
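The arithmetic is trivial, but it helps to turn it around and ask how many products fit within a word budget. A tiny Python sketch, with function names and the budget idea being my own illustration:

```python
def listing_word_count(products_per_page: int, words_per_description: int) -> int:
    """Total Quick View description words loaded in the default view."""
    return products_per_page * words_per_description


def max_products(word_budget: int, words_per_description: int) -> int:
    """How many products fit in the default view for a given word budget."""
    return word_budget // words_per_description


print(listing_word_count(20, 100))   # 2000
print(listing_word_count(50, 100))   # 5000
print(max_products(2000, 100))       # 20
```

If you decide that roughly 2,000 words is your comfort zone, shorter descriptions let you load more products in the default view, and longer ones mean fewer.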

Create and improve internal algorithms to optimally display items in the list.
SEO is about increasing profits from organic traffic by optimizing for users coming from search engines. If a user lands on a category page and the first items in the list or in the grid do not generate profits for you, then you are missing opportunities.

You need to design and use an algorithm that assigns a product rank to every item, and you need to organize the products based on this metric. The algorithm does not have to be complex. It can consider a few different metrics, for example, percentage margin, sales statistics, stock availability, proximity to user location, and even hand-picked items.
The idea is to put the profit-maximizing items first on the list.
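As a rough illustration of such an algorithm, the Python sketch below scores each product from a few of the metrics mentioned above and sorts the listing by that score. The weights, field names, and the rule that out-of-stock items score zero are all assumptions you would tune or replace with your own business rules.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    margin_pct: float        # percentage margin, 0 to 100
    units_sold: int          # sales in the reporting period
    in_stock: bool
    hand_picked: bool = False


# Assumption: illustrative weights, not recommended values.
WEIGHTS = {"margin": 0.5, "sales": 0.4, "hand_picked": 0.1}


def product_rank(product: Product, max_units_sold: int) -> float:
    """Score a product between 0 and 1; out-of-stock items score 0."""
    if not product.in_stock:
        return 0.0
    sales = product.units_sold / max_units_sold if max_units_sold else 0.0
    return (
        WEIGHTS["margin"] * product.margin_pct / 100
        + WEIGHTS["sales"] * sales
        + WEIGHTS["hand_picked"] * (1.0 if product.hand_picked else 0.0)
    )


def sort_listing(products):
    """Return the products ordered from highest to lowest product rank."""
    max_units = max((p.units_sold for p in products), default=0)
    return sorted(products, key=lambda p: product_rank(p, max_units), reverse=True)
```

With these weights, an in-stock product with a 40% margin outranks an equally popular one with a 10% margin, and both outrank a bestseller that is out of stock. Proximity to the user’s location could be added as one more weighted term in the same way.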

Most sites have the best-selling or most popular items as the default view in the product list, which is good for usability because most customers will be looking for bestsellers.[1] However, that does not mean that you should not try to optimize profits by experimenting with a ranking algorithm that displays the products most important to you at the top.

Add category-specific content
Adding content to PLPs can increase the chances of showing up higher in search result pages. This applies to categories at all levels of the hierarchy.

You are probably familiar with the “SEO content” for category descriptions; many ecommerce websites have it nowadays, usually at the bottom of the page. Take a look at the screenshot on the next page:

Figure 264 – The “SEO content” is displayed after the product grid or list to allow items to be displayed above the fold. The SEO influence of this content can be improved by adding links to several internal pages.

Do you wonder if this tactic works for Newegg?

Figure 265 – They rank #2 for “LCD monitors”, above Best Buy.

Of course, other factors helped this page rank second, but that category description will have some influence as well. Remember, SEO is about making small and incremental changes.

Some websites prefer to place this type of content above the listing, but this approach does not allow room for much copy, and it will push the listing down the page, as you can see in this screenshot:

Figure 266 – The category “SEO text” at the top of the listing pushes the products down the page.

In the above example, the category description is not long at all, but the marketing banner pushes the product grid even further down.

There is no doubt that more text content can help with SEO. However, adding too much content above the product list can push the products below the fold, which can, in turn, confuse users, and negatively affect conversion rates. On the other hand, displaying the content below the product grid is not as helpful and effective as having the content at the top of the page.

There are a few techniques to address this issue, for example, collapse/expand more content at click, or using JavaScript carousels. I find SEO-friendly tabbed navigation to be one of the most elegant solutions to fit a lot of content at the top of the page. This approach is good for both users and bots, and it can be done within a limited amount of space without being spammy.

A quick note on content behind tabs and expandable clicks: before mobile-first indexing, such content was considered less important and given less authority. However, this changed when mobile-first indexing went live.

Let’s compare “before and after” tabbed navigation screenshots. The screenshot below shows what a category page used to look like on REI. It displayed some content at the top of the list, but it was not using tabbed navigation. Notice how the content at the top pushes the listing down the page.

Figure 267 – This implementation does not require tabs.

And this is how the tabbed navigation version looks:

Figure 268 – This new design uses tabs to display more above the fold.

The Shop by Category is the default tab, which is great for users because it lists subcategories. The last tab, Expert Advice & Activities, holds a whole lot of SEO value:

Figure 269 – The content above is great for users and search engines.

The content in the previous image is not only well-written content that focuses on users and conversions rather than SEO, but it is also great “food” for search engines. This type of content targets visitors at various buying stages and will move them further into the conversion funnel, which is awesome. It will also increase the category’s chances of ranking better in the SERPs.

A quick note here: REI could easily add one or two contextual text links to thematically related subcategories or products to push some SEO equity to them.

The lesson here is that whatever content is placed in the tabbed navigation, it should be useful for users and should not be just some boilerplate text.

I mentioned it before and will say it again: ecommerce websites have to become content publishers if they want to succeed in the long run. This is not an SEO strategy but rather a healthy marketing approach. The content you place on each page should match the user intent targeted with that page. If the query you target with a page is generic (e.g., a category name), try to satisfy multiple intents on that page. I call this multi-intent content.

In addition to the great content wrapped in this tab, REI added even more content at the bottom of the subcategory grid, outside the tabbed navigation:

Figure 270 – Adding more content at the bottom of the page is intended to increase the relevance of this subcategory page.

One great implementation of SEO content at the bottom of the listing grid is on The Home Depot’s website. They placed buying guides, project guides and category-related community content, which is great for users, and search engines will fall for it. Just a side note here: it would be interesting to test the effects on conversions if this type of content was moved up in the layout, to just above the product grid.

Creating the kind of content deployed by Home Depot is a win-win tactic because:

  • Users will get helpful content to assist with their needs and questions, which leads to better conversion rates.
  • Search engines will love such content, which leads to more organic traffic.

Figure 271 – A very useful section is listed at the end of the product listing.

Another option for adding more content to category pages is to present a link or a button to more content, just above the listing. You can see it exemplified in this screenshot:

Figure 272 – When users click on the View Guide button they are taken to a new page. The guide on this new page is long and good, but it does not add any value to the listing page itself.

Instead of opening the guide in a new page, a better SEO option is to open a modal window that contains an excerpt from the guide. Preload the text excerpt in the HTML code so that it is accessible to search engines. This modal window will contain a link to the HTML guide, so users can click on it if they need to read the entire guide.

Creating content is time and resource consuming, so you need to identify the top-performing or best-margin categories to start with, then gradually proceed to others.

Capitalize on user-generated content (UGC)
User-generated content is a highly valuable SEO asset, so let’s take a look at two types of UGC that you can implement on listing pages: product reviews and forum posts.

Product reviews
Adding relevant product reviews will influence conversion rates and search engine rankings:

Figure 273 – In this screenshot, you can see how the product reviews section is displayed at the bottom of the product listing page. The reviews in this section should, ideally, match some of the products in the listing.

If the listing is paginated, the reviews should be listed on the index page and should not be repeated on paginated pages. If you have enough reviews to populate pages 2-N of the series, you might be tempted to do so, but this is not a good idea.

In such cases, you may want to consider increasing the number of reviews you list on the index page. Instead of listing three reviews, increase to five or ten.
When you do so, you need to create rules to avoid duplicate content issues between listing and product detail pages. Such rules can be:

  • Do not display more than two reviews for the same product on the same page.
  • Display only five reviews on the same listing page.
  • On the listing page, do not display the same reviews you displayed on the product page.
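
The three rules above can be sketched as a small selection function. This is only an illustration under stated assumptions: the review records, their review_id and product_id fields, and the function name are hypothetical, and the limits are parameters so you can raise them to match your own rules.

```python
from collections import Counter


def pick_listing_reviews(candidates, shown_on_product_pages,
                         max_per_product=2, max_total=5):
    """Choose reviews for a listing page.

    candidates: list of dicts with 'review_id' and 'product_id',
        already ordered by preference (hypothetical structure).
    shown_on_product_pages: set of review ids already displayed
        on product detail pages.
    """
    picked = []
    per_product = Counter()
    for review in candidates:
        if len(picked) >= max_total:
            break  # rule: cap the number of reviews on the listing page
        if review["review_id"] in shown_on_product_pages:
            continue  # rule: do not repeat reviews shown on the product page
        if per_product[review["product_id"]] >= max_per_product:
            continue  # rule: at most two reviews per product
        picked.append(review)
        per_product[review["product_id"]] += 1
    return picked
```

Feeding the function a list that is already sorted (for example by helpfulness or recency) keeps the ordering decision separate from the duplicate-content rules.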

Forum posts
Community content such as forum posts can be handy not only in the forum section of the website (of course, if you have one) but on category pages as well:

Figure 274 – In addition to product reviews, relevant forum posts are listed below the category grid.

Optimize for better SERP snippets
Product listing pages can get rich snippets in Google search result pages:

Figure 275 – SERP snippet enriched with list item count. Sometimes, Google displays only the number of items in the listing; other times, it displays a few item names as well.

While many ecommerce websites are interested in knowing how to get these rich snippets, Google’s official recommendations do not go into much detail:[2]

“If a search result consists mostly of a structured list, like a table or series of bullets, we will show a list of three relevant rows or items underneath the result in a bulleted format. The snippet will also show an approximate count of the total number of rows or items on the page (for example, ‘40+ items’ as in the screenshot above)”.

Figure 276 – Clean HTML code can help with getting the item count in the rich snippet.

Google can use your HTML code to generate rich snippets, which means that it does not necessarily need semantic markup such as Schema.org. This is why it is important to keep your code clean and well structured.

Keep in mind that if your listing pages get rich snippets that include item names, then the description line in the SERP snippet will be shorter than the usual ones. Instead of two or three lines of text, the description snippet may be truncated to just one line of text. You may want to check the impact on SERP CTR in such cases.

Here are some tips on how to get rich snippets for category listings:

Validate the HTML code for your list
If you open a list item element but do not close it, or if you nest elements improperly, it will be more difficult for Google to understand the page structure.

Figure 277 – Each product in the grid is wrapped in a list item element that is properly closed. Also, notice the DIV and UL class names.
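
A quick way to catch unclosed or improperly nested list items is a tag-balance check over the listing markup. The sketch below uses Python's standard html.parser module; the class and function names are hypothetical, and the check is deliberately stricter than the HTML specification (which allows omitting some closing tags), because the goal here is explicit, unbroken list markup.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Collects tags that are left open or closed out of order."""

    def __init__(self):
        super().__init__()
        self.stack = []      # tags that are currently open
        self.problems = []   # human-readable findings

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        elif tag in self.stack:
            # Everything opened after `tag` was never closed.
            while self.stack[-1] != tag:
                self.problems.append(
                    f"<{self.stack.pop()}> not closed before </{tag}>")
            self.stack.pop()
        else:
            self.problems.append(f"</{tag}> has no matching opening tag")


def check_listing_markup(html):
    """Return a list of nesting problems found in the given HTML string."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    checker.problems.extend(
        f"<{tag}> never closed" for tag in reversed(checker.stack))
    return checker.problems
```

An empty result means every opened element in the fragment was closed in order; any message points at the element that breaks the list.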

Do not break the HTML tables
The rich snippet will display the number of items on the index page (e.g., “40+ items”) if the product grid contains 40+ items in a single table, but only if the table markup has no breaks. If something in between items 10 and 11 breaks the table, Google will instead display the message “10+ items”. If you list your products in multiple tables, Google will choose to display the count from only one of them.

Use suggestive HTML class names
It is reported[3] that using the class name “item” in the item’s DIV helps with getting rich snippets for category pages:

“Just to confirm, wrapped a few items in <div class=items> and the snippet has been updated. Took four days to appear in the SERPs”.

This advice seems to be working, at least to some extent, as you can see in the following screenshot:

Figure 278 – Notice the LI class name.

The DIV that wraps the product grid contains the word “products”, and this seems to be common among websites that get rich snippets without using semantic markup. Also, the list item class name contains the word “item”.

Figure 279 – The rich snippet for the Running Shoes category includes the number of items per page and the total number of items in this category.

A large total of items in the list may attract more clicks because a large selection is one of the things consumers look at when choosing where to click. This brings us to another optimization idea.

Reconsider the number of items in the listing
If the number of products within the currently viewed category is reasonably low and easily skimmable (e.g., 50 items in a grid of five rows by ten columns), then load them all on one page. Depending on how many other links you have on the page and your overall domain authority, you can sometimes pump up this number to 100 or even more.

If you think that it is necessary to display a low number of items from a user experience perspective, you can load 50, 100 or 150 items in the source code in an SEO-friendly way, and use AJAX to display only 10, 15 or 20 items in the browser, to avoid information overload. You can then use AJAX to update the page content based on users’ requests, such as scroll down, sort, display all, and so on.

If you have thousands of products under the same category, consider breaking them into more manageable subcategories. You can list the subcategories instead of products after segmenting into smaller chunks.

Tag product reviews with structured markup
This is a debatable tactic, so you want to pay attention to how you implement it. Search engines do not support product review markup on product listing pages and may consider such markup spam; be careful.

Figure 280 – The Review markup can only be safely used on PDPs. PLPs should not contain this markup.

Make sure you do not use the AggregateOffer entity in your markup because this will raise spam concerns. The safest entity to use in PLPs is the Offer.

To learn more about Schema.org product markup, read this article.[4]

Display category related searches below the search field
Related searches sections have traditionally been used to link internally to other pages and to flatten the website architecture. Here’s a classic example:

Figure 281 – The Related Searches section helps with linking internally to other pages.

Related searches are there to help users with discoverability and findability, by providing highly related links to other pages on a website. Given this, why not place them closer to where users will perform a search, such as a search field? You can see this in action on Zappos’s website:

Figure 282 – The Search by links are placed in a prominent place, to push authority to the linked pages and to help users.

However, on Zappos, the links are the same on every page, and they do not make sense on the Bags section of the website:

Figure 283 – Zappos displays search options as plain HTML links right below the search field.

Size, Narrow Shoes and Wide Shoes are not useful refinements for someone looking for bags, right? Instead, you can dynamically change these links to something related to bags, maybe by linking to a page that filters bags by Style or another attribute that suits the Bags category.

If you do not want to use too much space to list 10 or more related popular searches, you can implement a modal window that opens when users click on “Popular Searches”. Make sure its content is available to search engines on page load. You can list as many popular related keywords for each category as you like.

Figure 284 – The image above depicts a possible implementation of popular searches using a modal window.

As mentioned in the Home Pages section, you can use one of the following sources to identify searches helpful to users:

  • Find the top searches performed on each category page.
  • Identify the products or subcategories most visited after viewing the category page.
  • Get the top referring keywords. Remember, Google and other commercial search engines hide search queries behind the “not provided” label.

Defer loading the product thumbnail images
When you load tens of items in a listing, the chances are that many of them will be below the fold.[5] Loading all thumbnail images at once is neither necessary nor recommended. Lazy load only when the user scrolls down to view more products.

While image deferring does not have much to do with rankings, it will help improve user experience by decreasing the page load time.

Figure 285 – Notice how small the scroll slider is (highlighted in a red rectangle); this size conveys that the page is very long. The products in the screenshot are several thousand pixels “below the fold”.
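
As a minimal sketch of deferred thumbnails, the function below renders the first few images normally and marks the rest for lazy loading. It assumes the browser's native loading attribute on img elements rather than a JavaScript loader, which is a swapped-in technique and not necessarily what the sites in the screenshots use. The product fields, the eager_count cutoff, and the 200-pixel dimensions are hypothetical.

```python
from html import escape


def thumbnail_tags(products, eager_count=8):
    """Render <img> tags; only the first `eager_count` load immediately.

    products: list of dicts with 'image_url' and 'name' (hypothetical fields).
    """
    tags = []
    for index, product in enumerate(products):
        # Thumbnails likely to be above the fold load right away;
        # the rest are deferred until the user scrolls near them.
        loading = "eager" if index < eager_count else "lazy"
        tags.append(
            f'<img src="{escape(product["image_url"])}" '
            f'alt="{escape(product["name"])}" '
            f'width="200" height="200" loading="{loading}">'
        )
    return tags
```

Where the cutoff sits depends on how you define the fold for your own layout and devices, so treat eager_count as something to tune rather than a fixed value.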

A word of caution about the meaning of fold: the “fold” has a very clear meaning in print (i.e., the physical fold of the newspaper right in the middle), but with websites the meaning of fold is blurry. You will need to define and identify where the fold is for your website, considering the browser resolution and the devices used by most of your users.
Obviously, the fold will be different on mobile than desktop.

Remove or consolidate unnecessary links
Product lists often pose the issue of redundant links. For example, in this screenshot, there is an image link on the product thumbnail image and another link on the product name. Both are pointing to the same URL.

Figure 286 – The links on the product thumbnail image and the product name point to the same URL.

There are several ways to address this issue, and we have discussed this before. Please refer to the Link Position section in the Internal Linking section.

Another issue very similar to the thumbnail-product name redundancy occurs when you place a link on the review stars and the text link displaying the number of reviews for the same product:

Figure 287 – The image link on the stars and the text link on the number “6” point to the same URL.

In the example above, neither link can provide strong relevance clues to search engines due to the lack of anchor text, so you can keep only one of them. I would keep the link on the star images, because you can add more SEO context using the alt text, and because the link area on those star images is larger than the text number. The link on the number of reviews could eventually be JavaScript-ed.

Removing unnecessary links or other page elements can de-clutter the design, provide white spacing between products, and can reduce the number of links that leak authority to the wrong pages.

Figure 288 – It is unnecessary to display the Special Offers link for each product. Instead, use tooltips or display small icons or stickers to highlight such offers.

Another element frequently listed in product lists is the “add to cart” button. I am not saying that you should remove it without proper analysis, but you can always A/B test to see how it influences conversion rate.

I suggest tracking “add to cart” events and analyzing whether users add to cart directly from product listings. If they do, go a step further and identify what type of users do that (e.g., returning customers or first-timers). In many cases, those who add directly from product listings are return customers who are very familiar with your brand, your products, and your website; usually, they know exactly what they want from you. If you decide to remove “add to cart” buttons, these users will know that they can still add products to their cart from product detail pages.

The usefulness of the “add to cart” buttons on product listings must be tested — test by replacing them with other CTAs, adding more product detail, or removing the buttons altogether.

Make the listing view the default view for bots (and searchers, if it makes sense)
Usually, the list view allows more room for product-related content, which is useful for users and search engines.

Figure 289 – This is the grid view. There isn’t much room to feature product info in a grid (name and price only).

Figure 290 – In a list view, there is room for more product info to be displayed.

In the example above, the list view is the default view for users and search engines, but users have the option to switch to grid view in the interface.
At the beginning of this section, I mentioned that there are two types of listings. Until now, we discussed product listings. Now, let’s talk about the second type:

Category listings

Listing categories means that instead of displaying products, you show the available subcategories that a category contains, each displayed using a representative thumbnail image. Category listings are implemented at the first two or three levels of a site’s category hierarchy, depending on the size of the product catalog. Because the number of subcategories to display is low, most of the time category listing pages use a grid view, rather than a list view.

Let’s look at how The Home Depot implemented the subcategories grid in a user and search engine-friendly manner.

Figure 291 – This is the first level of the hierarchy.

The first level in the hierarchy (Appliances) lists several subcategory thumbnails (Refrigerators, Ranges, Washers, etc.), as well as sub-subcategory links (e.g., under Refrigerators they list links to French Door Refrigerators, Top Freezer Refrigerators, Side By Side Refrigerators, etc.).

When you click on Refrigerators, a category listing loads. This time the listing displays some of the most important sub-subcategories for the Refrigerators subcategory.

Figure 292 – This is the third level of the hierarchy.

The third level in the hierarchy (Appliances > Refrigeration > Refrigerators) still lists categories instead of products. This encourages users to take a more deliberate selection path before the page displays tens or hundreds of products.

Implementing subcategory listings in the first two levels of the ecommerce hierarchy has the advantage of sending more PageRank to subcategory pages. That is better than sending more PageRank to just a few products because your link development efforts should point to category and subcategory pages. It is not economically feasible to target product pages with link building unless you have either a large budget or only a few products in the catalog. Developing external backlinks builds equity for category and subcategory pages, which further flows to PDPs.

Implementing the first two levels of the ecommerce hierarchy as subcategory listings also makes for a better user experience. Usability tests have shown[6] that users can be encouraged to navigate deeper into the hierarchy and make better scope selections.

The choice between product and subcategory listing depends on the particularities of each website. Usually, subcategory listings are a better choice, especially for websites with large inventories. Deciding which subcategories to feature at which level of the hierarchy should be based on business rules (e.g., top five subcategories with highest margins, or top-five bestsellers).

Here are several recommendations on how to build better subcategory listing pages:

  • To send SEO authority directly to products, add a list of featured/top items at the bottom of the listing, as shown in this example:

Figure 293 – Keep in mind not to list too many products; five to 10 items should be enough.

  • Keep the left sidebar navigation available to users because that is the spot we have been trained to look to for secondary navigation; this navigation pattern influences conversions[7]. Also, it is easier to scan and choose from secondary navigation links.
  • The secondary navigation will not contain filters until a user reaches the point where you list products instead of subcategories.
  • Display professional subcategory thumbnails, as exemplified here.

Figure 294 – High-quality imagery reassures users that they are dealing with a serious business.

  • Add a brief description of the category whenever possible, and eventually link to buying guides or interactive product-finder tools that may help users decide which product is right for them. This is especially important if your target market is not familiar with the items you sell, or if you sell high-ticket items.

Figure 295 – A brief description of each category may help first-time buyers understand your terminology and can provide more context for search engines.

Figure 296 – Providing guides and educational content helps increase conversions.

In the example above, the original design did not include buttons like “Find the right fridge”, “Find the right washer” or “Find the right dryer”. However, those links can be of great value to users and might help with SEO as well. If searchers click such buttons after landing from a generic search query (e.g., “best appliances”), the clicks will help lower bounce rates and lead to longer dwell times.

  • Use an SEO-friendly Quick View functionality to add more details about each category.

Just as this functionality works on product listings, a similar approach can be implemented for category listings.

Figure 297 – In this screenshot, I added the More Info button to the original design for illustration purposes.

Clicking on More Info will open a modal window. In this window, you can include details such as a brief explanation of the category, what users can expect to find under this category, links to more subsequent categories in the hierarchy, and even FAQs.

Breadcrumbs

Breadcrumbs are a form of navigational element and are usually displayed between the header and the main content:

Figure 298 – Breadcrumbs provide a sense of “location” for users.

For example, a breadcrumb on a website selling home improvement products might read Home > Appliances > Cooking.

In a breadcrumb structure, Home, Appliances, and Cooking are called elements, and the > sign is called the separator.

Breadcrumbs are frequently neglected as an SEO factor, but here are a few good reasons for you to pay more attention to them:

  • Breadcrumb links are very important navigational elements that communicate the location of a page in the website hierarchy to users and help them easily navigate around the website.[8],[9]
  • Breadcrumbs are one of the best ways to create silos, by allowing search engine bots to crawl vertically upwards in the taxonomy.
  • Breadcrumb navigation makes it easier for search engines to analyze and understand your site architecture.
  • Breadcrumbs are one of the safest places to use exact anchor text keywords.

In spite of great usability and SEO benefits, many ecommerce websites fail to implement breadcrumbs correctly, for users and search engines.

Figure 299 – Can you guess what the above page is about?

Take a quick look at this screenshot. Which page do you think this is? Is it the Edition page? Maybe the Gifts page? Or, the Designer Sale? Or is it the Shop by Designer page? None of these. It is the Shoes & Handbags category listing page. Did you find the label, yet? It is the drop-down in the left navigation.

Using a breadcrumb on this website would make it easier for users to understand where they are in the hierarchy.

If usability alone has not yet convinced you to pay more attention to breadcrumbs, then maybe it is time to remind you that properly implemented breadcrumbs show directly in Bing[10] and Google search results:[11]

Figure 300 – Breadcrumb-rich SERP snippets.

In this screenshot, you can see how the BestBuy, NewEgg, and Dell websites have a breadcrumb structure in the results snippet, but Walmart does not have any. Perhaps their HTML code for breadcrumbs is not properly marked up.

On the subject of featuring breadcrumb-rich snippets in SERPs, a Google patent[12] discusses the taxonomy of the website, internal linking, primary and secondary navigation, and structured URLs among other things they consider when deciding to display breadcrumbs in SERPs. To increase the chances of breadcrumbs showing up in search engine result pages, implement them consistently across the website, and follow Google’s official guidelines by using the Breadcrumbs structured markup with microdata or RDFa.[13]
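
To make the structured markup concrete, here is a minimal sketch that renders a breadcrumb trail as Schema.org BreadcrumbList microdata. The function name and the example URLs are hypothetical; the itemscope, itemtype, and itemprop attributes follow the Schema.org BreadcrumbList and ListItem vocabulary. The last element is left unlinked because it represents the active page.

```python
from html import escape


def breadcrumb_microdata(trail):
    """Render a breadcrumb trail as Schema.org BreadcrumbList microdata.

    trail: list of (name, url) tuples, from the home page down to
    the current page. The current page is rendered without a link.
    """
    lines = ['<ol itemscope itemtype="https://schema.org/BreadcrumbList">']
    last = len(trail)
    for position, (name, url) in enumerate(trail, start=1):
        label = f'<span itemprop="name">{escape(name)}</span>'
        if position < last:
            # Ancestor pages are links; the active page stays plain text.
            label = f'<a itemprop="item" href="{escape(url)}">{label}</a>'
        lines.append(
            '  <li itemprop="itemListElement" itemscope '
            'itemtype="https://schema.org/ListItem">'
            f'{label}<meta itemprop="position" content="{position}"></li>'
        )
    lines.append("</ol>")
    return "\n".join(lines)


# Hypothetical trail for the Home > Appliances > Cooking example.
print(breadcrumb_microdata([
    ("Home", "https://www.example.com/"),
    ("Appliances", "https://www.example.com/appliances/"),
    ("Cooking", "https://www.example.com/appliances/cooking/"),
]))
```

Generating the markup from the same taxonomy data on every template is one way to keep breadcrumbs consistent across the website.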

In the past, breadcrumb-rich search result listings allowed users to click not only on the underlined blue SERP title but on the breadcrumbs in the listing as well. However, Google decided not to hyperlink these links in the breadcrumbs. I believe that people clicked on the intermediary category links and landed on pages that didn’t match their intent, so Google decided to retire it.

If a product belongs to multiple categories, it is OK to list multiple breadcrumbs on the same page,[14] as long as the product is not categorized in too many different categories. However, the first breadcrumb on the page has to be the canonical path to that product, because Google picks the first breadcrumb it finds on the page.

Depth-triggered breadcrumbs

Some websites implement breadcrumbs only at a certain depth in the website hierarchy, but that is not optimal for users and search engines.

Figure 301 – When users are on the top category page for Furniture, there are no breadcrumbs.

Figure 302 – When users navigate to a Furniture subcategory (i.e., Bedroom Furniture), the breadcrumbs start being displayed. All subcategories under Bedroom Furniture will have breadcrumbs.

Depth-triggered breadcrumbs may work fine for users who start navigating from the home page, but nowadays every page on your website could serve as an entry point for users and search engines. Therefore, it is important to feature breadcrumbs from the first level of the website taxonomy. Additionally, featuring breadcrumbs only on some pages and not on others may confuse users.

Figure 303 – Many times, the category name is displayed in the breadcrumbs as well.

It is OK to repeat the category name in the breadcrumb and the heading. However, the last element in the breadcrumb should not be linked. This is because that element will contain a self-referencing link to the active page, which is very confusing to users.

Depending on how they are implemented, there are three types of breadcrumbs:[15] path-based, location-based and attribute-based.

Path-based breadcrumbs

This type of breadcrumbs shows the path users have taken within the website to get to the current page. It acts as the “this is how you got here” clue for users. The breadcrumbs will dynamically update to reflect the user’s historical navigation path. Page view history is achieved with either URL tagging or session-based cookies.

It is not a good idea to implement this type of breadcrumb anywhere except internal site search result pages. Users landing from search engines can reach deep sections inside a website without ever needing to navigate through the website. In this case, a path-based breadcrumb becomes meaningless for users. The same applies to search engine bots, which can reach deep pages on your website from external referral sources.

Location-based breadcrumbs

This is the most popular type of breadcrumb, and it indicates the position of the current page within the website hierarchy. It is the “you are here” clue for users. This type of breadcrumb keeps users on a fixed navigation path based on the website’s taxonomy, no matter which previous pages they visited during navigation. This is the type of breadcrumb recommended by taxonomists[16] and usability experts.[17]

On top-level category pages, the breadcrumb will be just one link to the home page, while the category name will be plain text (not a hyperlink).

Figure 304 – The category name is not hyperlinked, because this is the active page. This is the correct behavior.

The first element in the breadcrumb should always be a link to your homepage, but it does not necessarily have to use the anchor text “home page” or “home”. You can use the company’s name instead or use a small house icon with your company name in the alt text.

The subsequent levels in the breadcrumbs are the category and subcategory names used in your taxonomy. Again, do not make the current page a link because it will confuse users.

Figure 305 – There are instances when using a keyword in the anchor text link pointing to the home page may make sense (for example when your business name is The Furniture Store). But even then, use it with caution.

Attribute-based breadcrumbs

Attribute-based breadcrumbs, as the name suggests, use product attributes or filter values (such as style, color, or brand) to create navigation that is presented in a breadcrumb-like fashion. This type of breadcrumb is the “this is how you filtered our items” clue for users:

Figure 306 – This page presents the breadcrumbs as filters.

If you were to click on Bed & Bath and then on Comforter Sets, you would see the breadcrumbs listed at the top. In the example above, the elements of the breadcrumb are not links (the “X” signs are links).

Technically speaking, these are not breadcrumbs, but rather filter values. However, this implementation mimics the traditional breadcrumbs usage, and users will expect the filters to be clickable, just as they expect the breadcrumbs to be displayed horizontally and not vertically.

I do not usually recommend replacing categories with filters for the top-level categories and the first subcategory levels. The choice between a category and a filter comes down to when it does not make sense to create separate categories for specific product attributes. For example, having separate categories for shoe sizes does not make sense. The size is rather a product attribute that will translate into a left navigation filter.

Separator

You need to clearly separate each element in the breadcrumb trail; you can divide the elements using separators. The most common separator between breadcrumb elements is the “greater than” sign (>). Other good options may include right-pointing double-angle quotation marks (»), slashes (/) or arrows (→). Remember to mark the separators with correct HTML entities.[18]
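
The sketch below shows one way to emit separators as HTML entities when rendering a plain breadcrumb trail. It relies on html.escape from the Python standard library, which turns the “greater than” sign into &gt; and leaves characters such as » untouched; the function name and the paths are hypothetical.

```python
from html import escape


def render_breadcrumb(trail, separator=">"):
    """Join breadcrumb elements with an HTML-escaped separator.

    trail: list of (name, url) tuples; the last one is the active
    page and is rendered as plain text instead of a link.
    """
    escaped_separator = f" {escape(separator)} "
    parts = []
    for index, (name, url) in enumerate(trail):
        if index == len(trail) - 1:
            parts.append(escape(name))
        else:
            parts.append(f'<a href="{escape(url)}">{escape(name)}</a>')
    return escaped_separator.join(parts)
```

For the Home > Appliances > Cooking example, the default separator is written to the page as &gt; between the elements.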

Pagination

SEO for category pages starts to get complicated when the listings need pagination. Pagination occurs on ecommerce websites because of the large number of items that have to be segmented across multiple paginated pages (also known as component pages). Usually, pagination occurs on product listing pages and internal site search result pages.

Figure 307 – The pagination in the example above spreads across 113 pages, which is way too much for users to handle, and it will be tricky to optimize for bots.

If pagination occurs on pages that list subcategories instead of products, it is time to revise the page so that all subcategories are available to users without pagination. You can achieve that by increasing the number of subcategories you list on a page, or by breaking the subcategories into sub-subcategories.

Pagination is one of the oldest issues found on websites with large sets of items, and addressing it has been like aiming at a moving target. Currently, the most recommended approach is to use rel=“prev” and rel=“next”.

However, there were a couple of tactics to address pagination even before Google introduced these relationships at the end of 2011. Such tactics included noindexing all pages except the first page or using a view-all page.

To make pagination even more intriguing, Google says that one of the options for handling pagination is to “leave as-is”,[19] suggesting that they can identify a canonical page and handle pagination well.

However, anything you can do to help search engines better understand your website and crawl it more efficiently is advantageous. The question is not whether you need to deal with pagination, but how to deal with it.

From an SEO perspective, a “simple” functionality such as pagination can cause serious issues with search engines’ ability to crawl and index your site content. Let’s take a look at several concerns regarding pagination.

Crawling issues
A listing with thousands of items will need pagination since a huge listing like that will not help either users or search engines. However, pagination can screw up a flat website architecture like nothing else.

For instance, in this example, it may take search engines about 15 hops to reach page 50 of the series. If the only way to reach the products listed on page 50 is by going through this pagination page by page, those products will have a very low chance of being discovered. Probably those pages will be crawled less frequently, which is not ideal.

Figure 308 – We are on page 7 in the series, and this page lists an additional three pagination URLs (8, 9 and 10) compared to page 1.

In our pagination example, the pagination links skip some component pages in the series (pages 2 and 3), which means that bots can jump from page 1 straight to page 4. Because of these gaps in component URLs, bots can reach page 50 in about 15 hops. Without such jumps, the trip would be much longer: even if bots go from page 1 straight to page 7 (page 7 is listed on the index page), it takes another 43 clicks on Next to get from page 7 to page 50.

The odds that Googlebot will “hop” through paginated content to crawl the final pages decrease with each page in the series, and more significantly at page 5.

The graph above is from an experiment on pagination. As you can see, Google crawls component pages 6-N far less frequently.[20]

The experiment concluded that:

“The higher the page number is, the less probability that the page will be indexed… On average, the chance that the robot will crawl to the next page of search results decreases by 1.2 to 1.3% per page”.

If you have a large number of component pages, find a way to add links to intermediary pages in the series. Instead of linking to pages 1, 2, 3 and 4 and then jumping to the last page, add links to multiple intermediary pages. In our previous example, we can break the series into four parts by linking to every 28th page in the series. Why did I choose to link to every 28th component page? Because I wanted to break the pagination into four parts (112 divided by four is 28).

The fewer component pages you have in the pagination, the fewer chunks you will use. For example, if you have 10 component pages, you will list links to all of them. If you have 50, you will divide by 2; 100 will be divided by 4, 200 will be divided by 8, and so on.

So now, the pagination may look like:

Figure 309 – Make sure that the navigation on each page in the series makes sense for users.
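The chunking arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name intermediate_pages and its parameters are my own, and how many chunks you pick is still a judgment call.

```python
def intermediate_pages(total_pages, chunks):
    """Return the page numbers that split a series into `chunks` equal parts."""
    if total_pages < 1 or chunks < 1:
        raise ValueError("total_pages and chunks must be positive")
    step = max(1, total_pages // chunks)
    # Link to every `step`-th component page (e.g., every 28th page).
    return list(range(step, total_pages + 1, step))


# The example from the text: 112 component pages broken into four parts.
print(intermediate_pages(112, 4))
```

You would add the returned page numbers to the pagination block, next to the usual first, previous, next, and last links.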

Once you have made changes to pagination, you can assess the impact after a week or two. Additionally, you can use your server logs to determine Googlebot’s behavior before and after you have made updates to pagination.
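As a rough sketch of the server log check, the Python snippet below counts Googlebot requests per component page. It assumes a combined-format access log, a ?page=N pagination parameter, and a hypothetical /duvet-covers listing; the sample log lines are made up for illustration only.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Matches the request part of a combined-format access log line.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<target>\S+) HTTP/[^"]*"')


def googlebot_hits_per_page(log_lines, listing_path="/duvet-covers"):
    """Count Googlebot requests for each page number of one listing."""
    hits = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue  # naive user-agent check; user agents can be spoofed
        match = REQUEST_RE.search(line)
        if match is None:
            continue
        target = urlsplit(match.group("target"))
        if target.path.rstrip("/") != listing_path:
            continue
        # A missing page parameter means the index page (page 1).
        page = parse_qs(target.query).get("page", ["1"])[0]
        if page.isdigit():
            hits[int(page)] += 1
    return hits


GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Made-up sample lines (192.0.2.x is a documentation-only IP range).
sample_log = [
    f'192.0.2.1 - - [01/Mar/2024:10:00:00 +0000] "GET /duvet-covers/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.1 - - [01/Mar/2024:10:00:05 +0000] "GET /duvet-covers?page=2 HTTP/1.1" 200 5090 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.1 - - [02/Mar/2024:09:12:44 +0000] "GET /duvet-covers?page=2 HTTP/1.1" 200 5090 "-" "{GOOGLEBOT_UA}"',
    '192.0.2.7 - - [02/Mar/2024:09:13:02 +0000] "GET /duvet-covers?page=3 HTTP/1.1" 200 5001 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/123.0"',
]

for page, count in sorted(googlebot_hits_per_page(sample_log).items()):
    print(f"page {page}: {count} Googlebot request(s)")
```

Run it on the logs from before and after the change and compare how deep into the series Googlebot goes. For a real audit you would also want to verify that the requests really come from Google, for example with a reverse DNS lookup.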

Duplicate metadata
While the products listed on pages 2 to N are different, very often each component page has the same page title and meta description across the entire series. Sometimes even the copy of the SEO content is duplicated across the pagination series, which means that the index page will compete with component pages. In many cases, this duplication is due to the default CMS configuration.

Consider the following to avoid or improve upon this:

  • Create custom, unique titles and descriptions for your index pages (page 1 in each series).
  • Write boilerplate titles and descriptions for pages 2-N. For instance, you can use the title of page 1 with some boilerplate appended at the end, “{Title Page One} – Page X/Y”.
  • Do not repeat the SEO copy (if you have any) from the index page on component pages.

Adding this uniqueness to your titles, descriptions, and copy for the entire series may not have a huge impact on rankings for pages 2 to N, but doing it helps Google consolidate relevance to the index page. Additionally, the component pages will send internal quality signals, and will not compete with the index page.
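The boilerplate title pattern can be generated with a small helper like the one below. It is a minimal sketch: the function name page_title and the “Duvet Covers” title are placeholders, and I used a plain hyphen as the separator.

```python
def page_title(index_title, page, total_pages):
    """Return the custom title for page 1 and a boilerplate title for pages 2-N."""
    if page <= 1:
        return index_title  # the index page keeps its unique, hand-written title
    return f"{index_title} - Page {page}/{total_pages}"


for page in range(1, 5):
    print(page_title("Duvet Covers", page, 4))
```

The same approach works for meta descriptions on pages 2-N.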

Another duplicate content issue particular to pagination can occur when you reference the first page in the series (AKA the index page) from component pages:

mysite.com/category/
mysite.com/category?page=1

Figure 310 – URLs pointing to the index page (page 1 in the series) should not include pagination parameters. Instead, these links should point to the category index URL, mysite.com/category/.
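One way to avoid this duplication is to build every pagination link through a single helper that never emits the pagination parameter for page 1. The sketch below assumes a ?page=N parameter and a category URL without other query parameters; the function name pagination_url is hypothetical.

```python
def pagination_url(category_url, page):
    """Build the URL for a component page; page 1 is the clean category URL."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    if page == 1:
        return category_url  # never link to ?page=1
    return f"{category_url.rstrip('/')}?page={page}"


print(pagination_url("https://mysite.com/category/", 1))
print(pagination_url("https://mysite.com/category/", 2))
```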

Ranking signals dispersion
Sometimes, because component pages in the pagination series get linked internally (or from external sites), they may end up in the SERPs. In such cases, ranking signals are dispersed to multiple destination URLs instead of to a single, consolidated page.

If we look at how PageRank flows according to the first paper on this subject (published in 1998,[21] which notes that PageRank flows equally through each link and has a decay factor of 10 to 15%), then component pages seem to be PageRank black holes, especially those not linked from the first page in the series.

Let’s see how PageRank flows on a view-all page and several paginated series. For our purposes, we will split the PageRank only between component pages. This assumes that all the other links are the same on all pages.

Figure 311 – In the pagination scenario above, the items listed on page 2 will receive about three times less PageRank than the items listed on a view-all page.

Our scenario is for a listing with 100 items on a PageRank 4 page. Due to the decaying factor, this listing page will send only 3.4 PageRank points to other pages, 4 x (1-0.15). Each of the 100 items listed on the view-all page will receive 0.034 PageRank points.

We will split the same listing into ten pages in a paginated series, listing ten items per page. We will have links to component pages 1, 2, 3, 4, 5…10.

For the first page in the series we will have the following metrics:

  • Its PageRank is 4, which is the same as the view-all page, and the amount that can flow to all other links is 3.4 PageRank points.
  • The total number of links is 16 (10 links for items plus six links for pagination URLs).
  • Each item and pagination URL receives 0.213 PageRank points.

The ten items on the first page of pagination receive about six times more PageRank than items on the view-all page.

The second page has these metrics:

  • Its PageRank is 0.213, and the amount that can flow further to all other links is 0.181.
  • The total number of links is still 16 (10 links for items plus six links for pagination URLs).
  • Each item and pagination URL receives 0.011 PageRank points.

The ten items on the second page receive three times less PageRank than the items listed on the view-all page.

Figure 312 – Page 6 in the series is not linked from the index page of the pagination exemplified above.

If a component URL is not present on the first page in the series (e.g., page 6 shows up only when users click on page 2 of the series, as in the screenshot above), then the amount of PageRank that flows to items linked from page 6 is incredibly low.

  • This page’s PageRank is 0.011 (from page 2), and the amount that can flow further is 0.0096.
  • The total number of links is 16 (10 links for items plus six links for pagination URLs).
  • Each item or pagination URL receives 0.0006 PageRank points.

This means that the items on such pages will receive about 56 times less PageRank than the items listed on the view-all page.
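If you prefer code to spreadsheets, the arithmetic above can be reproduced with the short Python sketch below. It uses the same simplifications as the text (a 15% decay, PageRank split evenly across 16 links, and no PageRank flowing back), and the variable names are my own.

```python
DAMPING = 0.85  # 1 minus the 15% decay factor used in the text


def rank_per_link(page_rank, links_on_page):
    """PageRank each link receives when a page splits its rank evenly."""
    return page_rank * DAMPING / links_on_page


view_all_item = rank_per_link(4, 100)       # view-all page: 100 item links
page1_link = rank_per_link(4, 16)           # page 1: 10 items + 6 pagination links
page2_link = rank_per_link(page1_link, 16)  # page 2 is linked from page 1
page6_link = rank_per_link(page2_link, 16)  # page 6 is linked only from page 2

print(f"view-all item:    {view_all_item:.4f}")
print(f"page 1 item:      {page1_link:.4f}")
print(f"page 2 item:      {page2_link:.4f}")
print(f"page 6 item:      {page6_link:.4f}")
print(f"page 1 vs. view-all item: {page1_link / view_all_item:.1f}x more")
print(f"view-all vs. page 2 item: {view_all_item / page2_link:.1f}x more")
print(f"view-all vs. page 6 item: {view_all_item / page6_link:.1f}x more")
```

The ratios come out at roughly 6x, 3x, and 56x, in line with the numbers discussed above. Change the number of links per page to see how the distribution shifts.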

Figure 313 – This screenshot depicts how PageRank changes when you change the number of items on each component page (i.e., listing 10, 20 or 25 items per page).

If you are interested in playing with this model, you can download the sample Excel file from here.
This PageRank flow modeling suggests that:

  • The items listed on the first page in a pagination series receive significantly more PageRank than those listed on component pages.
  • The fewer links you have on the first page, the more important they are and the more PageRank they receive—no surprise here.
  • If the link to a paginated page is not listed on the series’ index page, that page receives significantly less PageRank.
  • Items listed on pages 2 to N receive less PageRank than if they were listed on a view-all page. The exception is when they receive a lot of internal or external links.

However, in practice PageRank is a metric that flows in more complex ways. For example, PageRank flows back and forth between pages, and more PageRank is passed from contextual links than from pagination links. The amount of PageRank that reaches pagination pages is impossible for anyone except Google to compute.

However, this oversimplified model shows that you can either pass a lot of PageRank to a few items on the first page of pagination (and significantly less to items on component pages) or pass a medium amount of PageRank to all items via a view-all page.

In both cases, if you use pagination, it is essential to put your most important products at the top of the list on the first page.

Thin content
Listing pages usually have little to no content, or at least not enough for search engines to consider them worthy of indexing. Except for product names and some basic item information, there is not much text on them. Because of this, Panda filtered a lot of listing pages from Google’s index.

Questionable usefulness
Do your visitors make use of pagination? Look at your analytics data to find out. Do component pages serve as entry points, either from search engines or other referrals? If not, the SEO and user benefits of a view-all might be much greater than having pagination.

Pagination may still be a necessary evil if the site architecture has already been implemented and it is too difficult to make updates, or if a large number of items cannot be divided and grouped into multiple subcategories.

If you want to minimize pagination issues, it is probably best to start with the architecture of the website. You can avoid some challenging user-experience, IT and SEO issues by doing so. You should consider the following:

Replace product listings with subcategory listings
For example, on the Men’s Clothing category page in the next screenshot, instead of listing 2,037 products you can list subcategories such as Athletic Wear, Belts & Suspenders, Casual Shirts, and so on. You will only have product listings deeper in the website hierarchy.

Figure 314 – The Men’s Clothing category page lists products, but instead it should list subcategories, as in the next image.

Figure 315 – This is just a mock-up I came up with to demonstrate the replacement of a product listing with a subcategory listing.

Listing categories instead of listing products will also assist users in making better scope selections.[22]

Break into smaller subcategories
If you have a category with hundreds or thousands of items, maybe it is possible to break it down into smaller segments. That, in turn, will decrease the number of pages in the series, or even eliminate pagination altogether.

Figure 316 – The Jeans & Pants subcategory can be split into two separate subcategories.

Segmenting into multiple subcategories may completely remove the need for pagination if you list a reasonable number of items. However, do not become overly granular; you want to avoid ending up with too many subcategories.

Increase the number of items in the listing
The idea behind this approach is simple: the more products you display on a listing page, the fewer component pages you have in the series. For example, if you list 50 items using a 5×10 grid and have 295 items to list, you will need six pages in the pagination series. If you increase the number of items per page to 100, you will need only three pages to list them all.
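The page count is simply the number of items divided by the items per page, rounded up. Here is a tiny sketch of that calculation (the function name pages_needed is mine):

```python
import math


def pages_needed(total_items, items_per_page):
    """Number of component pages required to list all items."""
    return math.ceil(total_items / items_per_page)


print(pages_needed(295, 50))   # 50 items per page
print(pages_needed(295, 100))  # 100 items per page
```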

How many items you list on each page depends on how many other links are on the page, your web server’s ability to load pages quickly, and the type of items in the list. For example, greeting card listings may be scanned more slowly than light bulbs. Generally, 100 to 150 items is a good choice.

Link to more pagination URLs
Instead of skipping pages in the pagination series, link to as many pagination URLs as possible.

Figure 317 – This kind of pagination presents big usability issues.

The pagination in the previous screenshot requires search engines and people to click the right arrow seven times to reach the last page. That is bad for users and SEO.

Instead, you should link to all the pagination URLs, and your pagination will look like this:

Figure 318 – With this approach, it is way easier for users to go to any of the component pages.

Adding links to a manageable number of pagination URLs will ensure that crawlers will get to those pages in as few hops as possible.

If you can, interlink all the component pages. For example, if the listing results in fewer than 10 component links, you can list all the links instead of just 1, 2, 3…10. If the listing generates an unmanageable number of component URLs, list as many as possible without creating a bad user experience.

The aforementioned ideas will help you minimize the impact of pagination on SEO. However, in many cases pagination is still necessary—and you will have to handle it.

The way you approach pagination is situational, which means it depends on factors such as the current implementation, the index saturation (the number of your pages indexed by search engines), the average number of products in categories or subcategories, and other factors. There is no one-size-fits-all approach.

Apart from the “do nothing” approach, there are various SEO methods for addressing pagination:

  • The “noindex, follow” method.
  • The view-all method.
  • The pagination attributes, aka the rel=“prev”, rel=“next” method.
  • The AJAX method.

An incorrect approach to pagination is to use rel=“canonical” to point all component pages in a series to the first page. Google states that:

“Setting the canonical to the first page of a parameter-less sequence is considered improper usage”.[23]

The “noindex, follow” method

This method requires adding the meta robots “noindex, follow” in the <head> of pages 2 to N of the series, while the first page will be indexable. Additionally, pages 2 to N can contain a self-referencing rel=“canonical”.

Of these methods, this is the least complicated to implement, but it effectively removes pages 2 to N from search engines’ indices. Note that this method does not transfer any indexing signals from component pages to the primary, canonical page.

Figure 319 – Pages 2 to N are noindexed with a meta robots “noindex, follow” tag.
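In a template, the logic is a simple condition on the page number. The Python sketch below shows the idea; the function name robots_meta is hypothetical, and your CMS will have its own way of injecting tags into the <head>.

```python
def robots_meta(page):
    """Return the meta robots tag for a component page, or '' for page 1."""
    if page <= 1:
        return ""  # the first page stays indexable, so no tag is needed
    return '<meta name="robots" content="noindex, follow">'


for page in (1, 2, 3):
    print(page, robots_meta(page) or "(no meta robots tag)")
```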

If your goal is to keep pages out of the index – maybe because a thin content filter has hit you – then this is the best approach. Also, a good application of the noindex method is on internal site search results pages, since Google and all other top search engines do not like to return “search in search”.[24]

Blocking crawlers’ access to component pages can be done with robots.txt and within your webmaster accounts. These two options will not remove pages from indices; they will only prevent further crawling. Moreover, while you can use Google Search Console (GSC) to prevent component pages from being crawled, it is easier to manage pagination if you block them in one place only, either with robots.txt or with GSC. When you are auditing crawling and indexation issues, do not forget where you blocked content.

The “view-all” page

This method seems to be Google’s preferred choice for handling pagination, because

“users generally prefer the view-all option in search results”

and because

“[Google] will make more of an effort to properly detect and serve this version to searchers”.[25]

This approach seems to be backed up by testing performed by usability professionals such as Jakob Nielsen, who found that:

“the View-all option [was] helpful to some users. More important, the View-all option did not bother users who did not use it; when it wasn’t offered, however, some users complained”.[26]

The view-all method involves two steps:
1. Create a view-all page that lists all the items in a category, as in this screenshot:

Figure 320 – The view-all link lists all the items.

2. Make the view-all page the canonical URL of the paginated series by adding rel=“canonical” pointing to the view-all URL, on each component page:

Figure 321 – Every component URL points to the view-all page.
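As a sketch of step 2, every component page would output the same canonical tag. The view-all URL below (?view=all) is an assumption made for illustration; use whatever URL your platform generates for the view-all page.

```python
from html import escape

# Hypothetical view-all URL for the duvet covers listing.
VIEW_ALL_URL = "http://www.website.com/duvet-covers?view=all"


def canonical_tag(target_url):
    """Build a rel="canonical" link element pointing to target_url."""
    return f'<link rel="canonical" href="{escape(target_url, quote=True)}" />'


component_urls = [
    "http://www.website.com/duvet-covers/",
    "http://www.website.com/duvet-covers?page=2",
    "http://www.website.com/duvet-covers?page=3",
]
for url in component_urls:
    # Every component page gets the same canonical: the view-all page.
    print(url, "->", canonical_tag(VIEW_ALL_URL))
```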

The purpose of rel=“canonical” is to consolidate all link signals into the view-all page. With a view-all approach, all component pages lose their ability to rank in SERPs. However, while the view-all page can be different from the listing page index, making the view-all the index page is also possible.

The view-all method comes with advantages such as better usability, indexing signal consolidation, and relative ease of implementation.

However, there are several challenges to consider before creating a view-all page:

  • Consolidating hundreds or thousands of products on one page can dramatically increase page load times, especially on product listing pages. A load time under four seconds is considered fast, but you should aim for under two seconds. Use progressive loading to make this happen.
  • A view-all page means having hundreds or thousands of links on a single page because the view-all page must display all the items from the component pages. While the compensation may be the consolidation of indexing signals from component pages to the view-all page, we do not have any official word on how search engines will assess such a large number of links on the view-all page.
  • Sometimes you do not want to remove all other component pages and push the view-all to be listed in SERPs. If you would like to surface individual pages from the pagination series, you should use the rel=“next” and rel=“prev” method.
  • Implementation is a bit more complex than for the “noindex” method. However, it is not as complex as the pagination attributes.

There are situations when you want to implement the view-all page solely for user experience purposes, and you do not want search engines to list it in SERPs. In such situations make sure that the component pages in the series do not have a rel=“canonical” pointing to the view-all page, but rather to the first page of the pagination. Also, mark the view-all page with “noindex”. Additionally, you may want to make the view-all link available only to humans using AJAX or cookie-triggered content, or other methods.

If you are concerned with page load times, there are ways to deliver a bare-bones version of the view-all page to search engines, while presenting the fully rendered page to humans, on-demand and without increasing load times. These implementations must take into consideration progressive enhancement[27] and mobile user experience.

However, nowadays you should be looking into building progressive web apps (PWAs) and Accelerated Mobile Pages (AMP) that load super-fast, rather than complicating things by delivering different resources to bots versus humans.

While de-pagination can work for websites that have a reasonably low number of items in their listings, for websites with larger inventories it may be easier to stay with user-friendly pagination that limits the number of items. From a usability standpoint,

“typically, this upper limit should be around 100 items, though it can be more or less depending on how easy it is for users to scan items and how much a long page impacts the response time”.[28]

Pagination attributes (aka the rel=“prev” and rel=“next” method)

Another method for handling pagination is to use pagination attributes (also known as the rel=“prev” and rel=“next” method). Even if Google is not using this method as an indexing signal anymore, this is probably still the best approach for URL discoverability, as it seems to generate good results without completely removing the ability of component pages to rank in search results.

In the <head> section of each component page in the series, you use either rel=“prev” or rel=“next” attributes (or both), to define a “chain” of paginated components.

The prev and next relationship attributes have been HTML standards for a long time,[29] but they only got attention after Google pushed them. Rel=“prev” and rel=“next” are just hints to suggest pagination; they are not directives.

Let’s say you have product listings paginated into the following series:

http://www.website.com/duvet-covers/
http://www.website.com/duvet-covers?page=2
http://www.website.com/duvet-covers?page=3
http://www.website.com/duvet-covers?page=4

On the first page (the category index page), you would include this line in the <head> section:

<link rel="next" href="http://www.website.com/duvet-covers?page=2" />

The first page contains the rel=“next” markup, but no rel=“prev”. Typically, this is the page in the series that becomes the hub and gets listed in the SERPs.

On the second page, you will include these two lines in the <head> section:

<link rel="prev" href="http://www.website.com/duvet-covers/" />
<link rel="next" href="http://www.website.com/duvet-covers?page=3" />

Pages 2 to second-to-last should have both rel=“next” and rel=“prev” markup.

Note that page 2 points back to the first page in the pagination as /duvet-covers/ instead of /duvet-covers?page=1. This is the correct way to reference the first page in the series, and it does not break the chain because:

/duvet-covers/ will point to /duvet-covers?page=2 as the next page, and
/duvet-covers?page=2 will point to /duvet-covers/ as the previous page.

On the third page (http://www.website.com/duvet-covers?page=3), you will include the following markup in the <head> section:

<link rel="prev" href="http://www.website.com/duvet-covers?page=2" />
<link rel="next" href="http://www.website.com/duvet-covers?page=4" />

On the last page (http://www.website.com/duvet-covers?page=4), you will include the following link attribute:

<link rel="prev" href="http://www.website.com/duvet-covers?page=3" />

Notice that the last page in the series contains only the rel=“prev” markup.
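The rules walked through above (no “prev” on the first page, no “next” on the last page, and no ?page=1 for the first page) can be expressed as a short function. This is a minimal sketch that assumes a ?page=N parameter and a category URL without other parameters; the function name is my own.

```python
def pagination_link_tags(category_url, page, last_page):
    """Return the rel="prev"/rel="next" link elements for one component page."""

    def url_for(number):
        # The first page is referenced by the clean category URL, not ?page=1.
        if number == 1:
            return category_url
        return f"{category_url.rstrip('/')}?page={number}"

    tags = []
    if page > 1:
        tags.append(f'<link rel="prev" href="{url_for(page - 1)}" />')
    if page < last_page:
        tags.append(f'<link rel="next" href="{url_for(page + 1)}" />')
    return tags


for page in range(1, 5):
    print(f"Page {page}:")
    for tag in pagination_link_tags("http://www.website.com/duvet-covers/", page, 4):
        print("  " + tag)
```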

The rel=“prev” and rel=“next” method has a few advantages, such as:

  • Component pages retain and share equity with all other pages in the series.
  • It addresses pagination without the need to “noindex” component pages.
  • It consolidates indexing properties such as anchor text and PageRank just as with a view-all implementation. This means that in most cases the index page of the series will show up in Google’s SERPs.
  • On-page SEO factors such as page titles, meta descriptions, and URLs may be retained for individual component pages, as opposed to being consolidated into one view-all page.
  • If the listing can be sorted in multiple ways using URL parameters, then these multiple “ordered by” views are eligible to be listed in SERPs. This is not possible with a view-all approach.

It is not a good idea to mix pagination attributes with a view-all page. If you have a view-all page, point the rel=“canonical” on all component pages to the view-all page, and do not use pagination attributes. You may also self-reference component pages to avoid duplicate content due to session IDs and tracking parameters.

Using rel=“canonical” at the same time with rel=“prev” and rel=“next”
Pagination attributes and rel=“canonical” are independent concepts, and both can be used on the same page to prevent duplicate content issues.

For example, page 2 of a series could contain a rel=“canonical”, a rel=“prev” and a rel=“next”:

<link rel="canonical" href="http://www.website.com/duvet-covers?page=2" />
<link rel="prev" href="http://www.website.com/duvet-covers?sessionid=1235sfsd" />
<link rel="next" href="http://www.website.com/duvet-covers?page=3&sessionid=1235sfsd" />

This setup tells Google that page 2 is part of a pagination series and that the canonical version of page 2 is the URL without the sessionID parameter. The canonical URL should point to the current component page with no sorts, filters, views or other parameters, but rel=“prev” and rel=“next” should include the parameters.
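To compute the canonical URL for a component page, you can strip every parameter except the ones that change the content. The sketch below treats only the page parameter as content-changing, which is an assumption; adjust the CONTENT_PARAMS set (a name I made up) to match your own URLs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only the pagination parameter changes the page content.
CONTENT_PARAMS = {"page"}


def canonical_url(requested_url):
    """Drop session IDs, tracking, sort and view parameters from a URL."""
    parts = urlsplit(requested_url)
    kept = [(key, value) for key, value in parse_qsl(parts.query)
            if key in CONTENT_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_url("http://www.website.com/duvet-covers?page=2&sessionid=1235sfsd"))
print(canonical_url("http://www.website.com/duvet-covers?sessionid=1235sfsd"))
```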

Keep in mind that rel=“canonical” should be used to deal with duplicate or near-duplicate content only. Use it on:

  • URLs with session IDs.
  • URLs with internal or referral tracking parameters.
  • Sorting that changes the display but not the content (e.g., sorting that happens on a page-by-page basis).
  • Subsets of a canonical page (e.g., a view-all page).

You can also use rel=“canonical” on product variations, that is, on near-duplicate product detail pages (PDPs) that share the same product description and differ only in a product attribute (e.g., the same shoe in different sizes). You need to understand your target market before applying the canonical, and you also need to be able to select the canonical product from the collection of SKUs.

Rel=“prev”, rel=“next” and URL parameters
Although rel=“prev” and rel=“next” seem more advantageous than the view-all method from an SEO standpoint, they come with implementation challenges.

Regarding URL parameters, the rule on paginated pages is that pagination attributes can link together only URLs with matching parameters. The only exception is when you remove the pagination parameter for the first page in the series.

To make pagination attributes work properly, you have to ensure that all pages within a paginated rel=“prev” and rel=“next” sequence are using the same parameters.

Pagination and tracking parameters

The following URLs are not considered part of the same series, since the URL for page 3 has different parameters, and that would break the chain:

http://www.website.com/duvet-covers?page=2
http://www.website.com/duvet-covers?page=3&referrer=twitter
http://www.website.com/duvet-covers?page=4

In this case, you should dynamically insert the key-value pairs based on the fetched URL.

If the requested URL contains the parameter referrer=twitter (http://www.website.com/duvet-covers?page=3&referrer=twitter), then the pagination URLs should dynamically include the referrer parameter as well:

<link rel="prev" href="http://www.website.com/duvet-covers?page=2&referrer=twitter">
<link rel="next" href="http://www.website.com/duvet-covers?page=4&referrer=twitter">

Additionally, you can use Google Search Console to tell Google that this parameter does not change the page content, and to crawl the representative URLs only (URLs without the referrer parameter).
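One way to insert the key-value pairs dynamically is to derive the previous and next URLs from the requested URL itself, changing nothing but the page parameter. The Python sketch below does that; it assumes the pagination parameter is called page, and the function name neighbor_url is hypothetical. The same logic also applies to the sorting and viewing parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def neighbor_url(requested_url, target_page):
    """Copy the requested URL's parameters, changing only the page number."""
    parts = urlsplit(requested_url)
    params = []
    seen_page = False
    for key, value in parse_qsl(parts.query):
        if key == "page":
            seen_page = True
            if target_page > 1:  # page 1 never carries the page parameter
                params.append(("page", str(target_page)))
        else:
            params.append((key, value))  # keep referrer, sort, view, etc.
    if not seen_page and target_page > 1:
        params.append(("page", str(target_page)))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(params), ""))


requested = "http://www.website.com/duvet-covers?page=3&referrer=twitter"
print("prev:", neighbor_url(requested, 2))
print("next:", neighbor_url(requested, 4))
```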

Pagination and viewing or sorting parameters
Another frequent scenario with pagination is the sorting and viewing of listings that span across multiple pages. Because each view option generates unique URL parameters, you will have to create a pagination set for each view.

Let’s say that the following are the URLs for “sort by newest”, displaying 20 items per page:

http://www.website.com/duvet-covers?sort=newest&view=20
http://www.website.com/duvet-covers?sort=newest&view=20&page=2
http://www.website.com/duvet-covers?sort=newest&view=20&page=3

On page 1 you will have only the rel=“next” pagination attribute pointing to URLs with sort and view parameters:

<link rel="next" href="http://www.website.com/duvet-covers?sort=newest&view=20&page=2">

On page 2 you will have rel=“prev” and rel=“next” also pointing to URLs with sort and view parameters:

<link rel="prev" href="http://www.website.com/duvet-covers?sort=newest&view=20">
<link rel="next" href="http://www.website.com/duvet-covers?sort=newest&view=20&page=3">

On page 3 you will have only a rel=“prev” attribute also pointing to URLs with sort and view parameters:

<link rel="prev" href="http://www.website.com/duvet-covers?sort=newest&view=20&page=2">

The above markup defines one pagination series.

However, if users can also display 100 items per page, that is a new view option, and it will create a new pagination series. The new URLs will look like the ones below; the view parameter now equals 100.

http://www.website.com/duvet-covers?sort=newest&view=100
http://www.website.com/duvet-covers?sort=newest&view=100&page=2
http://www.website.com/duvet-covers?sort=newest&view=100&page=3

On page 1 you will have only the rel=“next” pagination attribute pointing to URLs with sort and view parameters:

<link rel="next" href="http://www.website.com/duvet-covers?sort=newest&view=100&page=2">

On page 2 you will have the rel=“prev” and rel=“next” also pointing to URLs with sort and view parameters:

<link rel="prev" href="http://www.website.com/duvet-covers?sort=newest&view=100">
<link rel="next" href="http://www.website.com/duvet-covers?sort=newest&view=100&page=3">

On page 3 you will have only a rel=“prev” attribute, also pointing to URLs with sort and view parameters:

<link rel="prev" href="http://www.website.com/duvet-covers?sort=newest&view=100&page=2">

When dealing with sorting URLs, you may want to prevent search engines from indexing bi-directional sorting options, because sorting by newest is the same as sorting by oldest, only in a different order. Keep one default way of sorting accessible (e.g., “newest”) and block the other (“oldest”).

Also, adding logic to the URL parameters not only can prevent duplicate content issues, but it also can:

“help the searcher experience by keeping a consistent parameter order based on searcher-valuable parameters listed first (as the URL may be visible in search results) and searcher-irrelevant parameters last (e.g., session ID). Avoid example.com/category.php?session-id=123&tracking-id=456&category=gummy-candies&taste=sour”[30]

Make sure that parameters that do not change page content (such as session IDs) are implemented as standard key-value pairs, not as directories. This is necessary for search engines to understand which parameters are useless, and which ones are useful.

Here are a couple of other best practices for pagination attributes:

  • While technically you could use relative URLs to reference pagination attributes, you should use absolute URLs to avoid cases where URLs are accidentally duplicated across directories or subdomains.
  • Do not break the chain. This means that page N should point to N-1 as the previous page and to N+1 as the next page (except for the first page, which will not have a “prev” attribute, and the last page, which will not have the “next” attribute).
  • A page cannot contain multiple rel=“next” or rel=“prev” attributes.[31]
  • Multiple pages cannot have the same rel=“next” or rel=“prev” attributes.
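The “do not break the chain” rule is easy to check automatically once you have crawled a series. The sketch below assumes you have already extracted the prev and next URLs from each page’s <head> into a list of dictionaries; the data structure and the function name chain_errors are my own.

```python
def chain_errors(series):
    """Return a list of problems found in an ordered pagination series.

    `series` is a list of dicts with the keys "url", "prev" and "next",
    where "prev"/"next" hold the href found on that page (or None).
    """
    errors = []
    last = len(series) - 1
    for i, page in enumerate(series):
        # Page N must point to N-1 and N+1; the ends of the chain have no link.
        expected_prev = series[i - 1]["url"] if i > 0 else None
        expected_next = series[i + 1]["url"] if i < last else None
        if page.get("prev") != expected_prev:
            errors.append(f"{page['url']}: prev should be {expected_prev!r}")
        if page.get("next") != expected_next:
            errors.append(f"{page['url']}: next should be {expected_next!r}")
    return errors


BASE = "http://www.website.com/duvet-covers"
series = [
    {"url": BASE + "/", "prev": None, "next": BASE + "?page=2"},
    # The next URL below carries an extra parameter, which breaks the chain.
    {"url": BASE + "?page=2", "prev": BASE + "/", "next": BASE + "?page=3&referrer=twitter"},
    {"url": BASE + "?page=3", "prev": BASE + "?page=2", "next": None},
]
for problem in chain_errors(series):
    print(problem)
```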

Probably the biggest downside of rel=“prev” and rel=“next” is that it gets tricky to implement, especially on URLs with multiple parameters. Also, keep in mind that Bing does not treat the previous and next link relationships the same way as Google. While Bing uses the markup to understand your website structure, it will not consolidate indexing signals to a single page. If pagination is a problem in Bing, consider blocking excessive pages with a Bingbot-specific robots.txt directive or noindex meta tag.

The AJAX or JavaScript links method

With this method, you create pagination links that are not accessible to search engines but are available to users, in the browser. The trade-off is that users without JavaScript will not have access to component pages. However, users can still access a view-all page.

Figure 322 – Pagination links are not plain HTML links.

In the screenshot above you can see that the interface (1) allows users to sort, as well as choose between list and grid views. It also allows access to pagination. However, the source code (2) reveals a JavaScript implementation for pagination. If the JavaScript resources needed to generate those links are blocked with robots.txt, Google will not have access to those pagination URLs (3).

This approach has the potential to avoid a lot of duplicate content complications associated with pagination, sorting, and view options. However, it can introduce URL discoverability problems.

If you prefer this approach, make sure that search engines have at least one other way to access each product in each listing— using, for example:

  • A more granular categorization that does not require more than 100 to 150 items in each list.
  • Controlled internal linking that links all products in an SEO-friendly way from other pages.
  • Well-structured HTML sitemaps, along with XML Sitemaps.
  • Other sorts of internal links.

Infinite scrolling

A frequent user interface design alternative for pagination is infinite scrolling.[32] Also known as continuous scrolling, it lets users view content as they scroll down towards the bottom of the page, without the need to click on pagination links. Visually, this alternative appears very similar to displaying all the items on the page. However, the difference between infinite scrolling and a view-all page is that with infinite scrolling the content is loaded on demand (e.g., by clicking on a “load more items” button, or when content becomes visible above the fold), while for a view-all page the content is loaded all at once.

Mobile websites use infinite scrolling more often because on small screens it is easier to swipe than to click. However, infinite scrolling relies on progressive loading with AJAX,[33] which means that you will still need to provide links to component URLs for search engines and for users without JavaScript active. You can achieve this with a progressive enhancement approach.

Regarding SEO, infinite scrolling does not solve pagination issues for large inventories, and this is one of the reasons Google suggests paginating infinite scrolls.[34]
Google’s advice is OK, but I do not believe that continuous scrolling needs pagination when there aren’t too many products in the listing; a view-all page with 200 items is preferable in many cases. However, pages that list more than 200 items and use infinite scrolling should degrade to plain HTML pagination links for non-JavaScript users, and this includes search engine bots.

Degrading to HTML links means that search engines may still get into pagination problems, so you will have to handle pagination with one of the methods described earlier.

Figure 323 – This screenshot depicts the cached version of a subcategory page that uses infinite scrolling and degrades to HTML links for pagination when users do not have JavaScript on.

On the page in the previous screenshot, there are Previous and Next links for users without JavaScript active (or for search engines).

Figure 324 – This is a screenshot of the same page, but this time JavaScript is active. The Previous and Next links do not show up anymore. This was achieved by hiding the pagination section with CSS and JavaScript. Users can continuously scroll to see all watches.[35]

Continuous scrolling has many advantages such as better user experience on touch devices, faster browsing due to the elimination of page reloads, increased product discoverability, and external links consolidation.

However, there are disadvantages too,[36] and infinite scrolling does not perform better on all websites.

For example, on Etsy – an ecommerce marketplace for handmade and vintage items – infinite scrolling did not have the desired business outcome, so they reverted to old-fashioned pagination.[37] Infinite scrolling led to fewer clicks from users, as they felt lost in a sea of items and had difficulty telling relevant items from irrelevant ones.

However, on other websites, infinite scrolling may work well, as reported in this study.[38] As with most ideas for your website, an A/B test will tell you whether removing pagination is helpful for users or not. If you plan to test infinite scrolling, here are a few ideas for you.

Display visual clues when more content is loading.
Not everyone’s connection is fast enough to load content in the blink of an eye. If your server cannot handle fast user scrolling, or if browsers are slow, let the user know that more content is on the way.

Figure 325 – Notice the Loading More Results message at the bottom of the list. It conveys to users that more content is loading.

Consider a hybrid solution
A hybrid approach combines infinite scrolling and pagination. With this approach, you will display a “show more results” button at the end of a preloaded list. On mobile, make this button big to combat the fat-finger syndrome[39]. When the button is clicked, it loads another batch of items:

Figure 326 – In this example, more shoes are loaded with AJAX only when users click on the “show more results” button.

Add landmarks when scrolling
Amazon uses horizontal pagination in this product widget, to give users a sense of how many pages are in the carousel:

Figure 327 – You can find the horizontal pagination for this carousel on the top right side of the screenshot.

For vertical scrolling, adding landmarks such as virtual page numbers can provide users a sense of how far they scrolled and may help to create mental references (e.g., “I saw a product I liked somewhere around page 6”).

Figure 328 – This screenshot was modified to exemplify a navigational landmark (the horizontal rule and the text “Page 2”).

Update the URL while users scroll down
This is an interesting concept worth investigating. You can automatically append a pagination parameter to the URL when users scroll down past a certain number of rows.

This concept is best explained with a video, so I made a brief screen capture to illustrate it in case the original page[40] becomes unavailable (you can download this file from here).

If you regularly have 200 or fewer items in your listings, it is better to load all the items at once. Doing so will feed everything to search engines as one big view-all page. This is Google’s preferred implementation to avoid pagination.

Of course, users will see 10 or 20 items at a time, and you will defer loading the rest of them in the interface. However, you will use data from the already loaded HTML code.

This has the potential to save many pagination headaches. Depending on your website authority, you could go with even more than 200 items per listing.

If the list is huge, you should probably paginate. But even then, you may want to consider a view-all page.

Complement with filtered navigation
Large sets of pagination should be complemented by filtered navigation to allow users to narrow the items in the listing based on product attributes. Also, use subcategory navigation to allow users to reach deeper into the website hierarchy.

Figure 329 – Filters can reduce items in a list from hundreds to a few tens or fewer.

The previous listing page has 116 items, but if you filter by Brand=Samsung, the list shortens to 52 items. If you filter by Color or Finish Family, the list shortens to 9 items.
If infinite scrolling produces better results for your users and revenue, it is probably a good idea to keep it in place. But make it work for users without JavaScript: serve plain HTML pagination by default, then remove the pagination client-side and enable infinite scrolling for users with JavaScript on.

Secondary navigation

On listing pages, primary navigation is always complemented by some ancillary navigation. We call that secondary navigation.

I will refer to secondary navigation as the navigation that provides access to categories, subcategories, and items located deeper in the taxonomy. On ecommerce websites, this type of navigation is usually displayed on the left sidebar.

Figure 330 – Secondary navigation can appear very close to the primary navigation (either at the top or on the left sidebar) and provides detailed information within a parent category.

In many cases, the secondary navigation lists subcategories, product attributes, and product filters.

The entire left section in the example above is considered the secondary navigation; it includes subcategories as filters, filter names and filter values.

Unlike primary navigation, the labels in secondary navigation can change from one page to another to help users navigate deeper into the website taxonomy. This change of links in the navigation menu is probably the most significant difference between primary and secondary navigation.

From an SEO point of view, it is important to create category-related navigation. By doing so you offer users more relevant information, provide siloed crawl paths, and give search engines better taxonomy clues.

Figure 331 – Take Amazon, for example. When you are in the Books department of the website, the entire navigation is only about books.

Faceted navigation (AKA filtered navigation)

Ecommerce sites are often cluttered, displaying too much information to process and too many items to choose from. This leads to information overload and induces choice paralysis.[41] It is therefore essential to offer users an easier way to navigate through large catalogs. This is where faceted navigation (or what Google calls additive filtering) comes into play.

Whether your visitors are looking for something very specific or just browsing the website, filters can be highly useful. They help users locate products without using the internal site search or the primary navigation, which in most cases shows a limited number of options.

Faceted navigation makes it easier for searchers to find what they are looking for by narrowing product listings based on predefined filters, in the form of clickable links.

Usability experts refer to faceted navigation as:

“arguably the most significant search innovation of the past decade”.[42]

Faceted navigation almost always has a positive impact on user experience and business metrics. One retailer saw a:

“76.1% increase in revenue, a 26% increase in conversions and 19.76% increase in shopping cart visits in an A/B test after implementing filtering on its listing pages”.[43]

This screenshot illustrates a usual design for faceted navigation:

Figure 332 – A sample faceted navigation interface.

It is common to present faceted navigation in the left sidebar, but it can also be displayed at the top of product listings, depending on how many filters each category has. In many instances, subcategories are also included in the faceted navigation.
Filters, filter values, and facets have different meanings.

  • Filters represent a group of product attributes. In this screenshot, Styles, Women’s Size, Women’s Width are the filters.
  • Filter values are the options under each filter. For the Styles filter, the filter values are Comfort, Pumps, Athletic, and so on.
  • Facets are views generated by selecting one or a combination of filter values. Selecting a filter value within the Women’s Size filter and a filter value within the Women’s Width filter creates a so-called “facet”.

Figure 333 – Selecting one or more filter values generates the so-called facet.

Faceted navigation is a boon for users and conversion rates, but it can generate a serious maze for crawlers. The major issues faceted navigation generates are:

  • duplicate or near-duplicate content.
  • crawling traps.
  • non-essential, thin content.

Figure 334 – If you received a Google Search Console message like the one in the screencap, faceted navigation is one of the possible causes.

There is no better example of how filtering can create problems than the one offered by Google itself. The faceted navigation on googlestore.com –alongside other navigation types such as sorting and viewing options– generated 380,000 URLs.[44] And keep in mind that this was a site that sold just 158 products.

If you are curious to find out how many URLs faceted navigation could generate for a product listing page, you can use the following formula for counting the possible combinations (selections in which the order does not matter and no filter value is repeated):

C = n! / (r! × (n − r)!)

In this formula, n is the total number of filter values that can be applied, and r is the total number of filters. For instance, let’s say you have two filters:

  • The Styles filter, with five filtering options
  • The Materials filter, with nine filtering options

In this case, n will be 14, which is the total number of filtering options and r will be 2 because we have two filters. This setup could theoretically generate 91 URLs.[45]
If you add another filter (e.g., Color), and this filter has 15 filtering options, n becomes 29 and r equals 3. This setup will generate 3,654 unique URLs.

As I mentioned, the formula above does not allow repetitive URLs. This means that if users select (style=comfort AND material=suede), they get the same results as for selecting (material=suede AND style=comfort), at the same URL. If you do not enforce an order for URL parameters, then the faceted navigation will generate 182 URLs for the example with two filters, and 21,924 URLs for the example where three filters have been applied.
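
The counts above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name facet_url_counts is made up for illustration, and it follows the same simplification as the text (any r of the n filter values can be picked).

```python
from math import comb, perm


def facet_url_counts(total_filter_values: int, filters_applied: int) -> tuple[int, int]:
    """Return (URLs with an enforced parameter order, URLs without one)."""
    # Order enforced: each selection maps to exactly one URL (combinations).
    enforced = comb(total_filter_values, filters_applied)
    # Order not enforced: every ordering of a selection gets its own URL (permutations).
    not_enforced = perm(total_filter_values, filters_applied)
    return enforced, not_enforced


print(facet_url_counts(14, 2))  # (91, 182)
print(facet_url_counts(29, 3))  # (3654, 21924)
```

The gap between the two numbers in each pair is the duplicate content you avoid by enforcing a parameter order.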

Figure 335 – The huge difference between the total number of pages indexed and the number of pages ever crawled hints at a possible crawling issue or some serious content quality issue.

Figure 336 – Notice how many URLs the price parameter generates?!

In the previous screenshot, the issue was identified and confirmed by checking the URL parameters report in Google Search Console. The large number of URLs was due to the Price filter, which generated 5.2 million URLs.

You can partially solve duplicate content issues generated by faceted navigation by forcing a strict order for URL parameters, regardless of the order in which filters have been selected. For example, Category could be the first selected filter and Price the second. If a visitor (or a crawler) chooses the Price filter first and then Category, you make it so that Category shows up first in the URL, followed by Price.

Figure 337 – In the URL above, although the cherry filter value was selected after double door, its position in the URL is based on a predefined order.

The same order is reflected in the breadcrumbs as well:

Figure 338 – If you need a breadcrumb that reflects the order of user selection, you can store the order in a session cookie rather than in a URL.
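
A strict parameter order can be enforced with a small normalization step before links are rendered or requests are answered. The sketch below assumes filters live in query-string parameters; the PARAM_ORDER list, the parameter names, and the example.com URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical predefined order; a real store would define its own list.
PARAM_ORDER = ["category", "style", "material", "color", "price"]


def canonical_facet_url(url: str, order: list[str] = PARAM_ORDER) -> str:
    """Rewrite the query string so parameters always appear in the same order."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    rank = {name: position for position, name in enumerate(order)}
    # Known parameters follow the predefined order; unknown ones go last, alphabetically.
    pairs.sort(key=lambda pair: (rank.get(pair[0], len(order)), pair[0], pair[1]))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


print(canonical_facet_url("https://example.com/shoes?material=suede&style=comfort"))
# https://example.com/shoes?style=comfort&material=suede
```

If a requested URL differs from its normalized form, one option is to redirect to the normalized URL so that both selection orders end up at a single address.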

Another near-duplicate content issue generated by facets arises when one of the filtering options presents almost the same items as the unfiltered view. For example, the unfiltered view for Ski & Snowboard Racks has a total of 15 products, and you can narrow the results using two subcategories: Hitch Mount and Rooftops.

Figure 339 – The above is the product listing page for Ski & Snowboard Racks.

However, the subcategory Rooftop Ski Racks & Snowboard Racks includes 13 results from the unfiltered page. This means that except for two products, the filtered and the unfiltered pages are near-duplicates.

Figure 340 – The Rooftop Ski Racks & Snowboard Racks.
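
One way to spot such near-duplicates at scale is to measure how much of the unfiltered listing a filtered view repeats. The sketch below uses made-up SKU identifiers that mirror the 13-out-of-15 example; the function name and any threshold you apply to the result are assumptions, not a rule from search engines.

```python
def overlap_ratio(filtered_ids: set[str], unfiltered_ids: set[str]) -> float:
    """Share of the unfiltered listing that also appears in the filtered view."""
    if not unfiltered_ids:
        return 0.0
    return len(filtered_ids & unfiltered_ids) / len(unfiltered_ids)


# Hypothetical SKUs: 15 products in the unfiltered view, 13 of them in the subcategory.
unfiltered = {f"sku-{i}" for i in range(15)}
rooftop = {f"sku-{i}" for i in range(13)}

ratio = overlap_ratio(rooftop, unfiltered)
print(f"{ratio:.0%}")  # 87%
```

Filtered views with a very high ratio are candidates for consolidation with the unfiltered page.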

Faceted navigation comes with a significant advantage over hierarchical navigation: the filter combinations will generate pages that could not exist in a tree-like hierarchy because tree-like hierarchies are rigid and cannot cover all possible combinations generated by faceted navigation. The hierarchy structure is still good for high-level decisions, however.

Let’s say that you sell jewelry and would like to rank for the query “square platinum pendants”. Your website hierarchy only segregates into jewelry-type categories such as pendants, bracelets, etc., and then it allows filtering based on a Material filter, with values such as platinum, gold, etc. If there is no Shape filter to list the square option, your website will have no faceted navigation page for “square platinum pendants”.

However, if you were to introduce the Shape filter on the Platinum Pendants listing page, it would allow you to generate the Square Platinum Pendants facet, which narrows down the inventory based on the square filter value. This page is relevant to users and to search engines. You can further optimize this page with custom content and visuals, to make it even more appealing to machines and humans.

Figure 341 – An additional filter – Shape – would allow the targeting of more torso and long-tail keywords.

If there is no Shape filter to generate the Square Platinum Pendants facet, and if there is no hierarchical navigation that could lead to such a page, you will have to manually create a page that targets the “square platinum pendants” query. Then you will have to link to it internally and externally, so search engines can discover it. Depending on the size of your product catalog, it can be practically impossible to create thousands or even millions of such pages manually.

Essential, important and overhead filters

Before discussing how to approach faceted navigation from an SEO perspective, it is important to break down the filters and facets into three types: essential, important and overhead.

Essential filters/facets
Essential filters will generate landing pages that target competitive keywords with high search volumes, which usually are “head” or “torso” terms. If your faceted navigation lists subcategories, those facets are essential, and they are called faceted subcategories.

Figure 342 – In this example, Bags, Wallets, and the remaining subcategories are essential filters. You should always allow search engines to crawl and index such filters.

Essential facets can also be generated by a combination of filter values under Brand + Category—for example, using the filter value “Nokia” for Brand and the filter value “Cameras” for Category.

Either Category or Brand can be considered a facet, as they function as filters for larger sets of data.

You can handpick the top combinations of essential filters that are valuable for your users and your business. Turn them into standalone landing pages by adding content, and by optimizing them as you would with a regular important page. This is mostly a manual process, as it requires content creation, so it is doable for only a limited number of pages at once.

However, if you do this regularly and you commit resources for content creation, it will give you an advantage over your competitors. Start with the most important 1% of facets and gradually move on. If you do a couple per day (you need only about 100 to 150 carefully crafted words), in a year you will have optimized hundreds of filtered pages.

All essential facet pages should have unique titles and descriptions. Ideally, the titles should be custom, while the descriptions can be boilerplate.

Make sure that search engines can find the links pointing to essential filters and facets. As a matter of fact, essential filters should be linked from your content-rich pages such as blog posts and user guides. For maximum link juice flow, link from the main content area of such pages.

The URL structure for essential facets should be clean. It should ideally reflect, either partially or exactly, the hierarchy of the website in a directory structure or a file-naming convention:

Figure 343 – The URL for the Bathroom Accessories subcategory facet is parameter-free and reflects the website’s hierarchy.

Important filters/facets
These refinements will lead users and search engines to landing pages that can drive traffic for “torso” and “long-tail” keywords.

For example, if your analytics data proves that your target market searches for “red comfort shoes”, this means that URLs generated by Color + Style selections are important facets for your business. Search engines should be able to access important facet URLs.

You will have to decide what is and what is not an important facet, preferably on a category or subcategory basis. For instance, the Color filter can be relevant and important for the Shoes subcategory, but it will be an overhead filter for the Fragrances subcategory.

A particular case you need to pay attention to is the Sales or Clearance filter. In the next example, the retailer lists all the facets for all the products on clearance.

Figure 344 – The left navigation filters in the image above do not help users much because Rugs do not have sleeves, Snowshoes do not have a shirt style, and Pullovers do not have a ski pole style.

Instead of listing products, this retailer should list only subcategories in the left navigation and the main content area. This will make it more likely that users will handle the ambiguous nature of the Clearance page by first choosing a category that interests them. Once the desired category has been selected, the retailer should display the filters that apply to that category.

Depending on how your target market searches online, it is advisable to prevent search engines from accessing URLs generated when more than two filter values have been applied. If one of the applied filters is an essential filter, you will block when three filters have been applied.

This works best with multiple selections on the same filter (e.g., Brand=Acorn AND Brand=Aerosoles) because users are less likely to search for patterns like “{brand1}{brand2}{category}” (e.g., “Acorn Aerosoles shoes”).

Being able to select multiple filter values is useful for users who might select Red AND Blue Shirts, but such selections are not so useful for search engines. Therefore, they can be blocked for bots.

Figure 345 – An example of multi-selections on the Brand filter.

Note that blocking access to faceted navigation URLs by default, whenever multiple filters are applied, will prevent bots from discovering pages created by single value selections on different filters (e.g., Color=red AND Style=comfort). You will miss traffic for a large number of filter combinations (unless you manually create and optimize landing pages for all the important filters and facets, and unless you allow the bots to crawl and index those pages).
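
The blocking rule discussed above can be expressed as a small decision function. This is a sketch under stated assumptions: the selection is modeled as a mapping from filter name to the values chosen, and max_values is a hypothetical setting (2 here, to match the "more than two filter values" guideline) that you could raise for categories where an essential filter is part of the selection.

```python
def should_block_for_bots(selection: dict[str, list[str]], max_values: int = 2) -> bool:
    """Decide whether a facet URL should be kept away from crawlers."""
    # Multiple values on the same filter (e.g., Brand=Acorn AND Brand=Aerosoles).
    if any(len(values) > 1 for values in selection.values()):
        return True
    # Too many filter values applied overall.
    total_values = sum(len(values) for values in selection.values())
    return total_values > max_values


print(should_block_for_bots({"Color": ["red"], "Style": ["comfort"]}))  # False
print(should_block_for_bots({"Brand": ["Acorn", "Aerosoles"]}))         # True
```

Single-value selections on different filters stay open, so pages such as Color=red AND Style=comfort remain discoverable.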

Let your data be the source of truth when deciding which facets you need to leave open for search engines. Gather data from various sources, then programmatically replace keywords with their filter values, when appropriate. This is very similar to the Labeling and Categorization technique described in the Website Architecture section. You need to identify patterns and see which facets or filters are used the most by your visitors. In your ecommerce platform, mark the important filters and let them be indexed.

The URL structure for important facets must be as clean as possible. It is OK to keep the important filter values in a directory or the file path structure. It is also OK to keep them in URL parameters, as long as you use no more than two or three parameters.

Figure 346 – When an important filter is applied, its value is appended to the URL, in the form of a directory. In this URL, the filter value is Kohler, under the Brand filter.

Whenever possible, avoid using non-standard URL encoding—like commas or brackets—for URL parameters.[46]

Often, search engines treat pages created by filters like subsets of the unfiltered page. To avoid being pushed into the supplemental index, you need to create unique titles, descriptions, breadcrumbs, headings, and custom content on these filtered pages. Boilerplate titles and descriptions may be fine, but do not just repeat the title of the unfiltered view on facet pages. The unfiltered view will be the view-all page or the index page, in case you implemented pagination with rel=“prev”/“next”.

Additionally, the breadcrumbs have to update to reflect the user selection, and so do the headings. This may sound obvious, but it is amazing how many ecommerce websites do not do it.
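
A boilerplate pattern is often enough to keep titles, headings, and breadcrumbs in sync with the selected filter values. The sketch below is one possible pattern, not a recommended template: the function name, the "Example Store" suffix, and the filter order are all hypothetical.

```python
def facet_page_elements(
    category: str,
    selected: dict[str, str],
    order: list[str],
    store: str = "Example Store",
) -> dict:
    """Build title, H1 and breadcrumb trail for a filtered listing page."""
    # Use a predefined filter order so the same selection always reads the same way.
    values = [selected[name] for name in order if name in selected]
    heading = " ".join(values + [category])
    return {
        "title": f"{heading} | {store}",
        "h1": heading,
        "breadcrumb": [category] + values,
    }


page = facet_page_elements("Digital Cameras", {"Brand": "Canon"}, ["Brand", "Color"])
print(page["title"])  # Canon Digital Cameras | Example Store
```

With no filter selected, the same function falls back to the plain category name, so the unfiltered and filtered views never share a title.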

One technique that can be useful to increase the relevance of each filtered page, and to decrease near-duplicate content problems, is to write product descriptions that include the filter values used to generate the page. For instance, let’s say you sell diamonds. When a user selects a value under the Material filter, the product description snippet would include the value of the filter.

Figure 347 – The PLP above filters the SKUs by Material=white gold. The quick view product description for the second item cleverly includes the words “white” and “gold”.

This quick view snippet is different from the product description on the product detail page:

Figure 348 – Section (1) is the quick view snippet, section (2) the full product description.

Section (1) in the previous screenshot shows the quick view product description snippet on the product listing page. As you can see, the snippet was carefully created to include all the important filter values. Section (2) depicts the product description as found on the product detail page. These two product descriptions are different.

Writing custom product snippets for listing pages is a very effective SEO tactic even when you feature only 20 to 25 words for each product. However, it is difficult to write such snippets when you have thousands of products. A workaround is to write the detailed product descriptions to include the most important product filter values either at the beginning or the end of the detailed product description and then automatically extract and display the first/last sentence on the product listing page.
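
The first/last sentence extraction can be automated with a simple splitter. This is a naive sketch: it splits on sentence-ending punctuation followed by whitespace, so abbreviations such as "e.g." would need extra handling, and the sample description is invented.

```python
import re


def listing_snippet(description: str, use_last: bool = False) -> str:
    """Return the first (or last) sentence of a detailed product description."""
    # Split after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", description.strip()) if s.strip()]
    if not sentences:
        return ""
    return sentences[-1] if use_last else sentences[0]


description = "White gold diamond ring with a round cut. It ships in a gift box."
print(listing_snippet(description))  # White gold diamond ring with a round cut.
```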

Another method used to increase the relevance of the listing pages generated by important facets is to add the selected filter values to the product listing, on the fly. However, this can transform into spam if you are not careful. If you go with this approach, make sure that you have rules in place to avoid keyword stuffing.

Overhead filters
These filters generate pages that have minimal or no search volume. All they do is waste crawl budget on irrelevant URLs. A classic example of an overhead filter is Price; in many instances, so is Size. However, keep in mind that a filter can be overhead for one business but important or even essential for another.

You should prevent search engines from crawling URLs generated based on overhead filters, and you should mark filters as overheads on a category basis. Whenever a combination of filters includes an overhead value, add the “noindex, follow” meta tag to the generated page, and append the crawler=no parameter to its URL. Then block the crawler parameter with robots.txt.

The directive in robots.txt will prevent wasting crawl budget, while the noindex meta tag will prevent empty snippets from showing up in the SERPs. If you have pages in the index that you need to remove, first implement the “noindex, follow” and wait for the pages to be taken out of the index. Once they are removed, append the crawl control parameter to the URLs.
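
The sequence described above (noindex first, crawl control parameter second) can be sketched as follows. The OVERHEAD_FILTERS mapping, the parameter names, and the still_indexed flag are hypothetical; only the crawler=no parameter and the “noindex, follow” value come from the text.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical per-category classification of overhead filters.
OVERHEAD_FILTERS = {"shoes": {"price"}, "fragrances": {"price", "color"}}


def crawl_directives(category: str, url: str, still_indexed: bool = False) -> tuple[str, str]:
    """Return (URL to link to, robots meta value) for a facet URL."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query)
    overhead = OVERHEAD_FILTERS.get(category, set())
    if not any(name in overhead for name, _ in pairs):
        return url, "index, follow"
    if still_indexed:
        # Keep the URL crawlable until the noindex tag has taken the page out of the index.
        return url, "noindex, follow"
    # Page is out of the index: append the crawl control parameter.
    # robots.txt would then need a rule that blocks URLs containing crawler=no.
    pairs.append(("crawler", "no"))
    return urlunsplit(parts._replace(query=urlencode(pairs))), "noindex, follow"


print(crawl_directives("shoes", "https://example.com/shoes?price=50-100"))
# ('https://example.com/shoes?price=50-100&crawler=no', 'noindex, follow')
```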

Be careful about the combination of robots.txt and the noindex meta tag, as robots.txt will not allow robots access to a page-level directive and noindex is a page-level directive. If your website does not have an index-bloat issue or crawling issues, you may consider implementing rel=“canonical” instead of robots.txt.

You can also use AJAX to generate the content for overhead facets in a way that the URLs will not change, so search engine crawlers will not request unnecessary content. In this case, you need to block the scripts (and all other resources needed for the AJAX calls), with robots.txt. This will prevent search engines from rendering the AJAX links.

If you want to degrade the code for users with JavaScript off, you can use URL parameters, which can be placed either after a hash mark (#) or in a URL string that is blocked with robots.txt. However, the most stringent crawling restrictions come from not making the overhead URLs available to bots.

Figure 349 – Notice how the URL above contains the NCNI-5 string at the end.

The NCNI-5 string is used to control crawlers because all URLs containing the NCNI-5 string are blocked with robots.txt:

Disallow: /*NCNI-5*
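
To check which of your own URLs a wildcard rule like this would catch, you can translate the pattern into a regular expression. The sketch below covers only the two special characters Google documents for robots.txt paths (* for any sequence of characters, $ for the end of the URL); the sample paths are hypothetical.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal chunks and let each * match any sequence of characters.
    body = ".*".join(re.escape(chunk) for chunk in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, pattern: str) -> bool:
    """True if the URL path matches the Disallow pattern."""
    return robots_pattern_to_regex(pattern).match(path) is not None


print(is_disallowed("/b/some-category/N-abc123NCNI-5", "/*NCNI-5*"))  # True
print(is_disallowed("/b/some-category/N-abc123", "/*NCNI-5*"))        # False
```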

To summarize, this is how Home Depot defines the filtered URLs for all three types of facets:

Figure 350 – Each filter/facet is treated differently, depending on how important each facet is.

The URL for the essential facet is made of a clean category name. The important facet URL includes the category name and the filter value, Kohler. The overhead URL includes the category name and the filter value too, but it also includes the crawl control string, NCNI-5.

It is a bad idea to rewrite URLs to make overhead filters look like static URLs. The following sample URL includes the overhead filter Price, with the values 50 to 100.
http://www.homedepot.com/b/Bath-Bathroom-Accessories-Hardware-Bathroom-Accessories/KOHLER/N-5yc1vZbz99Z1qh/Price/50-100

The URL above does not exist on Home Depot’s website; I added the /Price/50-100 part only to exemplify. Generating search engine-friendly URLs does not change the fact that there will be millions of irrelevant pages on your website.

Regarding URL discoverability, search engines do not need to find the links pointing to overhead filters or facets. In fact, you have to prevent search engines from discovering overhead facets.

If you have to allow search engines to crawl overhead facets, then keep the filters in parameters using standard URL encoding and key=value pairs instead of in directories. This helps search engines differentiate between useful and useless values.

A faceted navigation case study

A search on Google for “Canon digital cameras” lists Overstock on the first page, OfficeMax on the fifth and Target on the seventh.

Overstock’s approach to filtered navigation

Figure 351 – The above is the Digital Cameras sub-subcategory page filtered by Brand=Canon. It has a unique title, customized breadcrumbs, and a relevant H1 heading. Also, this page uses a good meta description. These elements send quality signals to search engines.

When users filter the SKUs by another brand (e.g., Sony), the page elements update. If they did not, the Canon Digital Cameras page would have the same H1, title, description, and breadcrumbs as the Digital Cameras page, which is not desirable.

Additionally, Overstock allows the crawl of essential and important filters, and it does not create links for gray-end filters (filters that generate zero results).

Figure 352 – The “10 Megapixels” filter value generates zero results. Therefore, it is not hyperlinked.

Overstock’s implementation of faceted navigation is SEO friendly, because it allows crawlers to access various filtered pages, and it updates page elements based on user or crawler selection.

A note on gray-end filter values: whatever you choose to do with these filter values in the interface (i.e., not showing them at all, showing them at the bottom of the filters list, or hiding them behind a “show more” link), gray-end filters should not be hyperlinked. If you have to hyperlink them for some reason, the header response code for zero results pages should be 404. If returning 404 is not possible, mark the pages with “noindex,follow”. Alternatively, you can use robots.txt to block URLs generating zero results.
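
The handling of gray-end filter values can be summarized in a small helper. This is a sketch only: the function name, the can_return_404 flag, and the shape of the returned dictionary are assumptions made for illustration.

```python
from typing import Optional


def zero_result_policy(result_count: int, can_return_404: bool = True) -> dict:
    """Decide how a filter value and its target page should be served."""
    if result_count > 0:
        # Regular filter value: link it; indexing is decided by the facet type elsewhere.
        return {"hyperlink": True, "status": 200, "robots_meta": None}
    robots_meta: Optional[str] = None if can_return_404 else "noindex,follow"
    return {
        "hyperlink": False,  # gray-end values are shown as plain text, not links
        "status": 404 if can_return_404 else 200,
        "robots_meta": robots_meta,
    }


print(zero_result_policy(0))
# {'hyperlink': False, 'status': 404, 'robots_meta': None}
```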

OfficeMax’s approach to filtered navigation

Figure 353 – The image above depicts the Digital Cameras PLP on OfficeMax.

The title, the description, and the breadcrumbs do not update when the user selects a filter value under the Brand filter. This means that thousands of filtered pages will have very similar on-page SEO elements to the unfiltered page. Although the products on each filtered page will change, search engines will get a lot of near-duplicate meta tag signals.

This “staleness” might be why Google does not index the faceted page that results from filtering by Canon. The page that ranks for “Canon digital cameras” on OfficeMax is the Digital Cameras category page. This is not the ideal page to rank with because it does not match the user intent behind the search query. Filtering by Brand=Canon means that searchers have to take an additional, unnecessary step.

On the cached version of the Digital Cameras page, we notice that the faceted navigation is nowhere to be found. That’s happening because the faceted navigation is not accessible to search engines.

Figure 354 – The faceted navigation is not accessible to search engines.

A quick reminder here: if the links are missing in the cached version, Google might still be able to find them when it renders the page like a browser would.

Maybe OfficeMax tried to fix some over-indexation issues or a possible Panda filter on thin content pages. However, this faceted navigation implementation is not optimal, as it completely blocks access to all filtered pages. Unless OfficeMax creates manual landing pages for all essential and important filtered pages, they have closed the doors to search engines and to the traffic those pages could bring in.

Target’s approach to faceted navigation

Figure 355 – Like OfficeMax, Target does not create relevance signals for filtered pages.

On Target’s website, the page title, breadcrumb, heading and description for their Canon Digital Cameras page are the same as on the unfiltered page, Digital Cameras. As a matter of fact, the aforementioned elements will be the same on hundreds or thousands of other possible filtered pages.

Moreover, since the page has a canonical pointing to the unfiltered page, its ability to rank is (theoretically) zero. Because of their approach, I thought they would have a Canon Digital Cameras page that can be reached from the navigation or other pages. If they had one, Google was not able to identify it.

Figure 356 – Google could not find a category page (or even a facet URL) relevant to Canon Digital Cameras. All the most relevant results were PDPs.

Google’s cached version of the page shows that faceted navigation does not create links:

Figure 357 – Because Brand is an important filter, and because in our example we used only one filter value, all the filter values under Brand should be plain HTML links.

Categories in faceted navigation

Hierarchical, category-based navigation is useful as long as it is easy for users to choose between categories. For instance, it could be more helpful for users if easy-to-decide-upon subcategories are listed in the main content area, as opposed to being displayed as facet subcategories in the sidebar. Subcategory listing pages should be used:

“whenever further navigation or scope definition is needed before it makes sense to display a list of products to users. Generally, sub-category pages make the most sense in the one or two top layers of the hierarchy where the scope is often too broad to produce a meaningful product list”.[47]

Figure 358 – This category displays the next level of categories of the hierarchy in the main content area.

In the previous screenshot, faceted navigation (usually present in the left navigation as filtering options) is not yet introduced at this level of the hierarchy (the category level), and not even on the sub-subcategory level.

In the next example, you can see how the category-based navigation ends at the third level of the hierarchy; the first level is the Décor category, the second level is Blinds & Window Treatments, and the third level is Blinds & Shades.

The faceted navigation is displayed only at the third level of the hierarchy.

Figure 359 – You can see the faceted navigation displayed in the left sidebar, to help with decision making. You can also notice that the subcategory listing has been replaced with the product listing.

It is important to keep hierarchies relatively shallow, so users do not have to click through more than four layers to get to the list of products. Search engines will have the same challenges and may deem products buried deep in the hierarchy as not important.

Because faceted navigation is a granular inventory segmentation feature, it generates excess content in most implementations. It will also generate duplicate content—for instance, if you do not enforce a strict order for parameter filters in URLs.
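To illustrate the parameter-order problem, here is a minimal Python sketch that rewrites a facet URL so its parameters always appear in the same order. The function name and the example URLs are my own assumptions for illustration, and alphabetical order is just one possible convention.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalize_facet_url(url: str) -> str:
    """Return the URL with its query parameters sorted by name, then by value."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    params.sort()  # alphabetical order is an assumed convention, any fixed order works
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )
```

With this in place, ?style=compact&brand=canon and ?brand=canon&style=compact both resolve to a single URL, so the same set of products is not exposed under two addresses.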

So, what options do we have for controlling faceted navigation?

Option rel=“canonical”

Although rel=“canonical” is supposed to be used for identical or near-identical content, it may be worth experimenting with canonicals to optimize content across faceted navigation URLs.

Vanessa Fox, who worked for Google Webmaster Central, has suggested the following approach for some cases:

“If the filtered view is a subset of a single non-filtered page (perhaps the view=100 option), you can use the canonical attribute to point the filtered page to the non-filtered one. However, if the filtered view results in a paginated content, this may not be viable (as each page may not be a subset of what you would like to point to as canonical)”.[48]

Rel=“canonical” will consolidate indexing signals to the canonical page and will address some of the duplicate content issues, but search engine crawlers may still get trapped into crawling irrelevant URLs.

Rel=“canonical” is a good option for new websites, or for adding new filtering options to existing websites. However, it is not helpful if you are trying to remove existing filtered URLs from search engine indices. If you do not have indexing and crawling issues, you can use rel=“canonical”, as Vanessa suggests.
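As a rough sketch of the approach Vanessa describes, the Python snippet below builds the canonical tag for a filtered page by pointing it at the unfiltered URL. It assumes that all filter values live in the query string; the function name and URLs are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_for_filtered(url: str) -> str:
    """Build a canonical tag that points a filtered URL at its unfiltered page.

    Assumption: every filter is a query-string parameter, so dropping the
    query string (and any fragment) yields the unfiltered listing.
    """
    parts = urlsplit(url)
    unfiltered = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(unfiltered, quote=True)}">'
```

The returned tag would go in the head of the filtered page. If some filters should stay indexable, they would need to be excluded from this rule.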

Option robots.txt

Robots.txt is the crawl control sledgehammer. Keep in mind that if you use robots.txt to block URLs, you will tamper with the flow of PageRank to and from thousands of pages. That is because while URLs listed in robots.txt can get PageRank, they do not pass PageRank.[49] Also, remember that robots.txt does not prevent pages from being indexed.

However, in some cases, this approach is necessary, e.g., when you have a new website with no authority and a very large number of items that need to be discovered, or when you have thin content or indexing issues.

If you use parameters in URLs and would like to prevent the crawling of all the URLs generated by selecting values under the Price filter, you would add something like this in your robots.txt file:

User-agent: *
Disallow: /*price=

This directive means that any URL containing the string price= will not be crawled.
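If you want to sanity-check which URLs a wildcard rule would catch before deploying it, a small matcher can help. The Python sketch below is my own simplification of wildcard matching (the asterisk matches any run of characters, a trailing dollar sign anchors the end). It is not a full robots.txt parser, and the rule and URLs are examples only.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Translate a Disallow path with '*' wildcards into a regular expression."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces and let '*' match any run of characters.
    body = ".*".join(re.escape(piece) for piece in rule_path.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(url: str, rule_path: str) -> bool:
    """Check whether the path plus query string of url matches the rule."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return rule_to_regex(rule_path).match(target) is not None
```

Note that a rule such as /*price= is a plain substring match, so it would also catch a parameter named saleprice=. Test your rules against real URLs from your logs before relying on them.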

Robots.txt blocked URL parameter/directory

This method requires you to selectively add a URL parameter to control which filtered pages are crawlable and which are not. I described this in the Crawl Optimization section, but I will repeat it here as well.

First, decide which URLs you want to block.

Let’s say that you want to control the crawling of the faceted navigation by not allowing search engines to crawl URLs generated when applying more than one filter value within the same filter (also known as multi-select). In this case, you will add the crawler=no parameter to all URLs generated when a second filter value is selected on the same filter.

If you want to block bots when they try to crawl a URL generated by applying more than two filter values on different filters, you will add the crawler=no parameter to all URLs generated when a third filter value is selected, no matter which options were chosen, nor the order they were chosen. Here’s a scenario for this example:

The crawler is on the Battery Chargers subcategory page.

The hierarchy is: Home > Accessories > Battery Chargers
The page URL is: mysite.com/accessories/motorcycle-battery-chargers/

Then, the crawler “checks” one of the Brands filter values, Noco. This is the first filter value, and therefore you will let the crawler fetch that page.

The URL for this selection does not contain the exclusion parameter:

mysite.com/accessories/motorcycle-battery-chargers?brand=noco

Then, the crawler checks one of the Style filter values, cables. Since this is the second filter value applied, you will still let the crawler access the URL.

The URL still does not contain the exclusion parameter. It contains just the brand and style parameters:

mysite.com/accessories/motorcycle-battery-chargers?brand=noco&style=cables

Then, the crawler “selects” one of the Pricing filter values, the number 1. Since this is the third filter value, you will append the crawler=no to the URL.

The URL becomes:

mysite.com/accessories/motorcycle-battery-chargers?brand=noco&style=cables&pricing=1&crawler=no

If you want to block the URL above, the robots.txt file will contain:

User-agent: *
Disallow: /*crawler=no

The method described above prevents the crawling of facet URLs when more than two filter values have been applied, but it does not allow specific control over which filters are going to be crawled and which ones are not. For example, if the crawler “checks” the Pricing options first, the URL containing the pricing parameter will be crawled.

Note that blocking filtered pages based solely on how many filter values have been applied poses some risks. For instance, if a Price filter value is applied first, the generated pages can still be crawled and indexed, since only one filter value has been selected. You should have more solid crawl control rules; for example, if an overhead filter value has been applied, always block the generated pages.
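The rules discussed in this scenario can be sketched in a few lines of Python. This is an illustration under my own assumptions: the function name, the OVERHEAD_FILTERS set, and the threshold of two crawlable values are hypothetical, and your platform will build URLs differently.

```python
from typing import List, Tuple
from urllib.parse import urlencode

OVERHEAD_FILTERS = {"pricing"}  # hypothetical: filters you never want crawled
MAX_CRAWLABLE_VALUES = 2        # the third filter value triggers the block


def facet_url(base_path: str, selections: List[Tuple[str, str]]) -> str:
    """Build a facet URL from ordered (filter, value) pairs.

    Appends crawler=no when a filter has more than one value (multi-select),
    when more than MAX_CRAWLABLE_VALUES values are applied, or when any
    overhead filter is used.
    """
    if not selections:
        return base_path
    names = [name for name, _ in selections]
    multi_select = len(names) != len(set(names))
    too_many = len(selections) > MAX_CRAWLABLE_VALUES
    overhead = any(name in OVERHEAD_FILTERS for name in names)
    params = list(selections)
    if multi_select or too_many or overhead:
        params.append(("crawler", "no"))
    return base_path + "?" + urlencode(params)
```

Because the overhead check does not depend on the order of selection, a URL with pricing applied first still receives crawler=no, which closes the gap described above.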

It is also a good idea to limit the number of selections a search engine robot can discover. We will discuss this a bit later in this section, as the JavaScript/AJAX crawl control option.

Important filters or facets must be plain HTML links. You can present overhead filters as plain text to search engines (no hyperlinks), but as functional HTML to users (hyperlinks).

The blocked directory approach requires putting the unwanted URLs under a directory, then blocking that directory in robots.txt.

In our previous example, when the crawler checks one of the Pricing options, place the filtering URL under the /filtered/ directory. If your regular URL looks like this:

mysite.com/accessories/motorcycle-battery-chargers?brand=noco&style=cables&pricing=1

when you control crawlers, the URL will include the /filtered/ directory:

mysite.com/filtered/accessories/motorcycle-battery-chargers?brand=noco&style=cables&pricing=1

If you want to block the URL, the robots.txt will contain:

User-agent: *
Disallow: /filtered/

Option nofollow

Some websites prefer to nofollow unnecessary filters or facets. Surprisingly, and in contradiction with the other official recommendation that tells us not to nofollow any internal links, nofollow is one of Google’s recommendations for handling faceted navigation[50]. However, nofollow does not guarantee that search engines will not crawl the unnecessary URLs or that those pages will not be indexed. Additionally, nofollow-ing internal links might send search engines the wrong signals, because nofollow translates into “do not trust these links”.

Hence, nofollow does not solve current indexing issues. This option works best with new websites.

It may be a good idea to either “back up” the nofollow option with another method that prevents URLs from being indexed (e.g., blocking URLs with robots.txt) or to canonicalize the link to a superset.
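For reference, here is a minimal Python sketch of how a template helper might add nofollow only to overhead filter links. The function name, labels, and URLs are hypothetical, and which filters count as overhead is a decision you make per site.

```python
from html import escape


def filter_link(label: str, href: str, overhead: bool) -> str:
    """Render a filter value as a link, adding rel="nofollow" for overhead filters."""
    rel = ' rel="nofollow"' if overhead else ""
    return f'<a href="{escape(href, quote=True)}"{rel}>{escape(label)}</a>'
```

Important filters keep a plain link, while overhead filters get the nofollow attribute. As noted above, this alone does not guarantee that the URLs stay out of the crawl or the index.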

Option JavaScript/AJAX

We established that essential and important facets/filters should always be accessible to search engines as links. Preferably those will be plain HTML links. URLs for overhead filters and facets, on the other hand, can safely be blocked for search engine bots.

Theoretically, you can obfuscate the entire faceted navigation from search engines by loading it with search engine “unfriendly” JavaScript or AJAX. We have seen this deployed at OfficeMax. However, excluding the entire faceted navigation is usually a bad idea and should only be done if there are alternative paths for search engines to reach pages created for all essential and important facets. In practice, this is neither feasible nor recommended.

One option is to allow search engines access only to essential and important facet links, while not generating overhead links. For example, you load only the important facets and filters as plain HTML, while the overhead filters or facets are loaded with JavaScript or AJAX. Users will be able to click on any of the links, as they will be generated in the browser (e.g., using “see more options” links).

Figure 360 – Some of the filter values in the faceted navigation are not hyperlinked.

In this example, users are shown just two filter values for Review Rating, with a link to Show All Review Rating (column 1). When they click on that link, they see all the filter values (column 2). However, the Show All Review Rating is not a link for search engines (column 3).

This will effectively limit the number of URLs search engines can discover, which may be good or bad depending on your situation. If your target market searches for “laminate flooring 3-star reviews” then you need to make the corresponding link available to bots.
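The pattern in this example can be sketched on the server side as follows. This Python snippet is only an illustration: the function name, the js-show-all class, and the URLs are hypothetical, and the client-side script that loads the remaining values is not shown.

```python
from html import escape
from typing import List, Tuple


def render_filter(name: str, values: List[Tuple[str, str]], crawlable_count: int) -> str:
    """Render the first crawlable_count values as plain links.

    The remaining values are left out of the HTML. A button with no href
    stands in for the "Show All" control that a script would wire up.
    """
    items = [
        f'<li><a href="{escape(href, quote=True)}">{escape(label)}</a></li>'
        for label, href in values[:crawlable_count]
    ]
    if len(values) > crawlable_count:
        items.append(
            f'<li><button type="button" class="js-show-all" '
            f'data-filter="{escape(name, quote=True)}">Show All {escape(name)}</button></li>'
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

If a value such as the 3-star rating matters to your target market, raise crawlable_count or reorder the values so that its link is part of the initial HTML.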

Similarly, you can obfuscate entire filters or just some filter values. For example, eBay initially presents users with only a limited number of filters and filter values, but then at a click on “see all” or “More refinements” it opens all the filters in a modal window:

Figure 361 – This modal window contains all the links required by users to refine the list of products.

However, the content of the modal window is not accessible to search engines, as you can see in this screenshot, where the “More refinements” is not hyperlinked:

Figure 362 – The “More refinements” element looks and acts like a link, but it is not a regular href.

One advantage of selectively loading filters and facets with robotted AJAX or JavaScript is that it may help pass more PageRank to other, more important pages. This is very similar to the old PageRank sculpting concept. However, remember that this “sculpting” happens only if search engines are not able to execute AJAX on such pages. And search engines are getting better by the day at executing JavaScript and AJAX. To make sure the links are not accessible to Googlebot, block the resources necessary for the JavaScript or AJAX calls with robots.txt, and then do a fetch and render in Google Search Console.

If you know that some pages are not valuable for search engines, and if you do not want those useless pages in the index, then why allow bots to access them in the first place?

Another advantage of selectively loading URLs is that it will prevent unnecessary links from being crawled.

The hash mark option

You can append parameters after a hash mark (#) to avoid the indexing of faceted URLs. This means that you can let faceted navigation create URLs for every possible combination of filters. As a note, remember that AJAX content is signaled with hashbang (#!). However, this scheme is no longer recommended by Google.

If you do an “info:” search for a page that includes the hash mark in the URL, you will see that Google defaults the page to the URL that excludes everything after the hash mark.

For search engines, this page:
http://www.modcloth.com/shop/books#?price=28,70&sort=newest&page=1
defaults to the content on this page:
http://www.modcloth.com/shop/books

Figure 363 – Google caches the content of the page generated before the usage of hash marks.

The hash mark can potentially consolidate linking signals to modcloth.com/shop/books, but all the pages generated using the hash mark will not be indexed; therefore, they cannot rank.

However, you can place just the overhead filters after the hash mark. Whenever an essential or important facet is selected, include it in a clean URL, before the hash mark. Multiple selection filters can also be added after the hash mark.
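Here is a minimal Python sketch of that split: essential and important facets go into the regular query string, and overhead filters go after the hash mark. The function name, parameter names, and URLs are assumptions for illustration. The standard library function urldefrag shows what is left once the fragment is dropped.

```python
from typing import Dict
from urllib.parse import urldefrag, urlencode


def hash_facet_url(base: str, crawlable: Dict[str, str], overhead: Dict[str, str]) -> str:
    """Put crawlable facets in the query string and overhead filters after '#'."""
    url = base
    if crawlable:
        url += "?" + urlencode(crawlable)
    if overhead:
        url += "#" + urlencode(overhead)
    return url


url = hash_facet_url(
    "https://mysite.com/cameras",
    {"brand": "canon"},
    {"price": "28-70", "sort": "newest"},
)
# Dropping the fragment leaves the URL that carries only the crawlable facet.
without_fragment = urldefrag(url).url
```

In this sketch the brand facet keeps its own indexable URL, while any price and sort combination collapses to that same address once the fragment is ignored.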

You can also control crawlers using the URL parameters handling tools offered by Bing and Google.

Figure 364 – This setup hints to Google that the mid parameter is used for narrowing the content. I prefer to tell Google about the effect that each parameter has on the page content, but in the end, I will let them decide which URLs to crawl.

This setup presents only a clue to Google, so you still need to address crawling and duplicate content using another method (e.g., blocking overhead facets with selective robots.txt), or with a combination of methods.

Option noindex, follow

Adding the “noindex, follow” meta tag to pages generated by overhead filters can help address “index bloat” issues, but it will not prevent spiders from getting caught in filtering traps.

A quick note about using the noindex directive at the same time as robots.txt: theoretically, “noindex, follow” can be used in conjunction with robots.txt to prevent the crawling and indexing of new websites. However, if unwanted URLs have already been crawled and indexed, you first have to add “noindex, follow” to those pages and let search engine robots crawl them. This means you will not block the URLs with robots.txt yet. Block the unwanted URLs with robots.txt only after the URLs have been removed from the index.
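The order of operations matters here, so it may help to see it as a tiny decision rule. The Python sketch below uses a function name and a return structure of my own. It only encodes the sequence described in this note and is not a tool that talks to any search engine.

```python
NOINDEX_META = '<meta name="robots" content="noindex, follow">'


def crawl_control_plan(already_indexed: bool) -> dict:
    """Decide whether an unwanted filtered URL can be blocked in robots.txt yet.

    While the URL is still in the index, it must remain crawlable so that
    robots can see the noindex tag. The robots.txt block comes afterwards.
    """
    return {
        "meta_tag": NOINDEX_META,
        "block_in_robots_txt": not already_indexed,
    }
```

You would re-run the check once the URLs have dropped out of the index and only then add the robots.txt rule.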

Sorting items

Users must be allowed to sort listings based on various options. Some popular sort options are bestsellers, new arrivals, most rated, price (high to low or low to high), product names, and even discount percentage.

Figure 365 – Some popular sorting options.

Sorting simply changes the order the content is presented in, not the content itself. This will create duplicate or near-duplicate content problems, especially when the sorting can be bidirectional (e.g., sort by price—high to low and low to high) or when the entire listing is on a single page (view-all).

Google tells us that if the sort parameters never exist in the URLs by default, they do not even want to crawl those URLs.

Figure 366 – This screencap is from Google’s official presentation, “URL Parameters in Webmaster Tools”.[51]

The best way to approach sorting is situational, and it depends on how your listings are set up.

Use rel=“canonical”

Many times, sort parameters are kept in the URL. When users change the sort order, the sort parameters are appended to the URL, and the page reloads. In this case, you can use rel=“canonical” on sorted pages to point to a default page (e.g., sorted by bestsellers).

Figure 367 – In this screenshot, you see that while sorting generates unique URLs for ascending and descending sort options, both URLs point to the same canonical URL.

The use of rel=“canonical” is strongly advised when the sorting happens on a single page, because sorting the content will change only how it is displayed, but not the content itself. This means that the content on each page, although sortable, will not be different and the generated page will be an exact duplicate. For instance, when sorting reorders the content on a view-all page, you generate exact duplicates (given that the view-all page lists all items in the inventory). However, even when the content is sorted on a page-by-page basis rather than using the entire paginated listing, you also create near or exact duplicate content.
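As a rough illustration, the Python snippet below derives the canonical URL for a sorted listing by removing only the sort-related parameters and keeping everything else. The parameter names sort and order are assumptions, since every platform names them differently.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

SORT_PARAMS = {"sort", "order"}  # hypothetical parameter names


def canonical_for_sorted(url: str) -> str:
    """Return the URL without its sort parameters, to be used as the canonical target."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SORT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Both the ascending and the descending sort URLs resolve to the same canonical target, while a facet such as brand stays in the URL.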

Removing or blocking sort order URLs

This requires either adding the “noindex, follow” meta tag to sorting URLs or blocking access to them altogether using robots.txt or within Google Search Console.

Figure 368 – In this example, the Ski Boots listing can be sorted in two directions (price “high to low” and price “low to high”).

When items can be sorted in two directions, the first product on the first page sorted “high to low” becomes the last product on the last page sorted “low to high”. The second product on the first page then becomes the second to last product on the last page, and so on. Depending on the number of items you list by default and on how many products are listed, you may end up with exact or near duplicates. For example, let’s say you list 12 items per page, and there are 48 items in total. This means that the last page in the pagination series will display exactly 12 items. When you list by price “high to low”, the products on the first page of the pagination will be the same as the products on the last page when sorting “low to high”.
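The arithmetic in this example is easy to verify with a short Python sketch. The products here are stand-ins (distinct prices from 1 to N), and the function names are my own.

```python
def paginate(items, per_page):
    """Split a list into consecutive pages of at most per_page items."""
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]


def first_vs_last_overlap(total_items, per_page):
    """Compare page 1 sorted high-to-low with the last page sorted low-to-high.

    Returns (items on the first page, items on the last page, items they share).
    Products are represented by hypothetical distinct prices 1..total_items.
    """
    prices = list(range(1, total_items + 1))
    first_desc = paginate(sorted(prices, reverse=True), per_page)[0]
    last_asc = paginate(sorted(prices), per_page)[-1]
    shared = set(first_desc) & set(last_asc)
    return len(first_desc), len(last_asc), len(shared)
```

With 48 items and 12 per page, the two pages share all 12 products, which makes them exact duplicates in reverse order. With 50 items, the last “low to high” page holds only 2 products, and both of them also appear on the first “high to low” page.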

One way to handle bidirectional sorting is to allow search engines to index only one sorting direction and remove or block access to the other. For example, you allow the crawling and indexing of “oldest” sort URLs and block the “newest”.

Figure 369 – Removing or blocking sort-order URLs is the easiest method to implement, and may help address pagination issues quickly until you are ready to move ahead with a more complex solution.

Use AJAX to sort

With this approach, you sort the content using AJAX, and URLs do not change when users choose a new sort option. All external links are naturally consolidated to a single URL, as there will be only one URL to link to.

Figure 370 – Sorting with AJAX does not usually change the URL.

Notice how the URL in the previous screenshot does not change when the list is sorted again by Bestsellers, in the next image below:

Figure 371 – While the content updates when users select various sort options, the URL remains the same.

Because the URL does not update when sorting, this method makes it impossible to link, share, or bookmark URLs for sorted listings. But, do people link or share sorted or paginated listings? Even if they do, how relevant will pagination or sorting be a week or a month from the moment it was linked or shared? Products are added to or removed from listings on a regular basis, frequently changing the order of products. The chances are that the products listed on any sorted page will be partially or totally different from the products listed on the same page, the next week or the next month.

So, shareability and linkability should not be concerns when you are deciding whether to implement AJAX for sorting, or not. If it is better for users, do it.

Use hash mark URLs

Using hash marks in the URL allows sharing, bookmarking, and linking to individual URLs. A rel=“canonical” pointing to the default URL (without the #) will consolidate any links to a single URL.

Figure 372 – In this screenshot, the default view lists items sorted by Most Relevant SKUs.

The URL above will be the canonical page. In the next screenshot, you will notice how the URL changes when the list is sorted by Price, low to high:

Figure 373 – The URL includes the hash mark and the sort value, ~priceLowToHigh.

Currently, search engines typically ignore everything after the hash mark unless you use a hashbang (#!) to signal AJAX content (which itself is deprecated). Search engines ignore everything after the hash mark because using it in URLs does not cause additional information to be pulled from the web server.

The hash mark implementation is an elegant solution that addresses user experience and possible duplicate content issues.

View options

Just as users prefer different sort options, some users want to change the default way of displaying listings. The most popular view options are view N results per page or view as list/grid. While good for users, view options can cause problems for search engines.

Figure 374 – In the example above, users can choose to view the listing page as a compact grid or as a detailed list; they can also choose the number of items per page.

Grid and list views

Figure 375 – The grid view (left) and the list view (right).

Usually, the grid and the list view present the same SKUs, but the list-view can use far more white space. This space can be filled with additional product information and represents a big SEO opportunity as it can be used not only to increase the amount of content on the listing page but also to create relevant contextual internal links to products or parent categories.

The optimal approach for viewing options is to load the list-view content in the source code in a way that is accessible to search engines, then use JavaScript to switch between views in the browser.

You do not need to generate separate URLs for each view. In case you do generate separate URLs, those pages will contain duplicate content, and the way to handle them is with rel=“canonical” to a default view. The default view has to be the page that loads the content for the list view.

For example, these two URLs point the rel=“canonical” to /French-Door-Refrigerators/products:

/French-Door-Refrigerators/products?style=List
/French-Door-Refrigerators/products?style=Grid

View-N-items

Many ecommerce websites have the View-N-items per page feature, allowing users to select the number of items in the listing:

Figure 376 – This is a typical drop-down for the view-N-items per page option.

If possible, your default product listing should be the view-all page. If view-all is not an option, then display a default number in the list (let’s say 20) and allow users to click on a view-all link.

Figure 377 – Nike’s view-all option is displayed right in the menu.

If view-all generates an unmanageable list with thousands of items, let users choose between two numbers, where the second number is substantially bigger than the default (e.g., 60 and 180). Remember to keep users’ preferences in a session or a persistent cookie[52], not in URL parameters.

Figure 378 – The second view option is substantially larger than the first one.

From an SEO perspective, view-N-items per page URLs are traditionally handled with rel=“canonical” pointing to default listing pages (which are usually the index pages for department, category or subcategory pages). For instance, on a listing page with 464 items, the view 180 items per page option can be kept in the key=value pair itemsPerPage=180, and the URL may look like this:

mywebsite.com/seat-covers/10A522.aspx?itemsPerPage=180

The URL above lists 180 items per page and will contain a rel=“canonical” in the <head> that points to the category default URL:

mywebsite.com/seat-covers/10A522.aspx

However, the canonical URL lists only 60 items by default, and that is what search engines will index. This means that a larger subset (the one that lists 180 SKUs) canonicalizes to a smaller subset (the one that lists 60 SKUs). This approach can create some issues because Google will index the content on the canonical page (60 items) while ignoring the content from the rest of the view-N-items pages. In this case, you need to make sure that search engines can somehow access each of the items in the entire set (464 items). For example, you can make this work with paginated content that is handled with rel=“prev” and rel=“next”, so that Google consolidates all component pages into the canonical URL.
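To make the numbers concrete, here is a small Python sketch that computes how many component pages the 464-item listing needs and builds the rel=“prev” and rel=“next” tags for a given page. The page parameter name and the function names are assumptions for illustration.

```python
import math
from typing import List


def page_count(total_items: int, per_page: int) -> int:
    """Number of pages needed to list every item."""
    return math.ceil(total_items / per_page)


def pagination_links(base_url: str, page: int, total_pages: int) -> List[str]:
    """Build rel="prev" and rel="next" link tags for one page in the series."""

    def page_url(n: int) -> str:
        # Assumption: page 1 lives on the clean URL, later pages use ?page=N.
        return base_url if n == 1 else f"{base_url}?page={n}"

    tags = []
    if page > 1:
        tags.append(f'<link rel="prev" href="{page_url(page - 1)}">')
    if page < total_pages:
        tags.append(f'<link rel="next" href="{page_url(page + 1)}">')
    return tags
```

At 60 items per page, the 464 items span 8 component pages. At 180 per page, they span 3. Either way, every page in the series has to be reachable for all 464 items to be discovered.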

The use of rel=“canonical” on a view-N-items page is appropriate if the canonical points either to a view-all page or the largest subset of items. The former option is not desirable if you want another page to surface in search results (e.g., the first page in a paginated series with 20 items listed by default).

The approaches for controlling view-N-items pages are similar to those for handling sorting: a view-all page combined with AJAX/JavaScript to change the display in the browser, uncrawlable AJAX/JavaScript links, hash-marked URLs, or using the “noindex” meta tag. I mentioned these approaches in my preferred order, but keep in mind that while one approach might suit the particular conditions of one website, it may not work for another.

  1. Prioritize: Good Content Bubbles to the Top, http://www.nngroup.com/articles/prioritize-good-content-bubbles-to-the-top/
  2. New snippets for list pages, http://insidesearch.blogspot.fr/2011/08/new-snippets-for-list-pages.html
  3. More rich snippets on their way: G Testing Real Estate Rich Snippets, https://plus.google.com/+MarkNunney/posts/RqzNcKE9NSc
  4. Product – schema.org, http://schema.org/Product
  5. Below the fold, http://en.wikipedia.org/wiki/Above_the_fold#Below_the_fold
  6. Implement the First 1-2 Levels of the E-Commerce Hierarchy as Custom Sub-Category Pages, http://baymard.com/blog/ecommerce-sub-category-pages
  7. Usability is not dead: how left navigation menu increased conversions by 34% for an eCommerce website, https://vwo.com/blog/usability-left-navigation-menu-bar-conversions-ecommerce-website/
  8. User Mental Models of Breadcrumbs, http://www.angelacolter.com/breadcrumbs/
  9. Breadcrumb Navigation Increasingly Useful, http://www.nngroup.com/articles/breadcrumb-navigation-useful/
  10. Breadcrumbs, http://www.bing.com/webmaster/help/markup-breadcrumbs-72419f3f
  11. New site hierarchies display in search results, http://googleblog.blogspot.fr/2009/11/new-site-hierarchies-display-in-search.html
  12. Visualizing Site Structure And Enabling Site Navigation For A Search Result Or Linked Page, http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220110276562%22.PGNR.&OS=DN/20110276562&RS=DN/20110276562
  13. Rich snippets – Breadcrumbs, https://support.google.com/webmasters/answer/185417?hl=en
  14. Can I place multiple breadcrumbs on a page? https://www.youtube.com/watch?v=HXEYryd3eAY
  15. Location, Path & Attribute Breadcrumbs, http://instone.org/files/KEI-Breadcrumbs-IAS.pdf
  16. Taxonomies for E-Commerce, Best practices and design challenges, http://www.hedden-information.com/Taxonomies_for_E-Commerce.pdf
  17. Breadcrumb Navigation Increasingly Useful, http://www.nngroup.com/articles/breadcrumb-navigation-useful/
  18. HTML Entity List, http://www.freeformatter.com/html-entities.html
  19. Pagination and SEO, https://www.youtube.com/watch?v=njn8uXTWiGg&feature=youtu.be&t=11m
  20. Pagination and Googlebot Visit Efficiency, http://moz.com/ugc/pagination-and-googlebot-visit-efficiency
  21. The Anatomy of a Large-Scale Hypertextual Web Search Engine, http://infolab.stanford.edu/pub/papers/google.pdf
  22. Implement the First 1-2 Levels of the E-Commerce Hierarchy as Custom Sub-Category Pages, http://baymard.com/blog/ecommerce-sub-category-pages
  23. Five common SEO mistakes (and six good ideas!), http://googlewebmastercentral.blogspot.ca/2012_03_01_archive.html
  24. Search results in search results, http://www.mattcutts.com/blog/search-results-in-search-results/
  25. View-all in search results, http://googlewebmastercentral.blogspot.ca/2011/09/view-all-in-search-results.html
  26. Users’ Pagination Preferences and ‘View-all’, http://www.nngroup.com/articles/item-list-view-all/
  27. Progressive enhancement, http://en.wikipedia.org/wiki/Progressive_enhancement
  28. Users’ Pagination Preferences and ‘View-all’, http://www.nngroup.com/articles/item-list-view-all/
  29. HTML <link> rel Attribute, http://www.w3schools.com/tags/att_link_rel.asp
  30. Faceted navigation best (and 5 of the worst) practices, http://googlewebmastercentral.blogspot.ca/2014/02/faceted-navigation-best-and-5-of-worst.html
  31. Implementing Markup For Paginated And Sequenced Content, https://web.archive.org/web/20140527145918/http://www.bing.com/blogs/site_blogs/b/webmaster/archive/2012/04/13/implementing-markup-for-paginated-and-sequenced-content.aspx
  32. Infinite Scrolling: Let’s Get To The Bottom Of This, http://www.smashingmagazine.com/2013/05/03/infinite-scrolling-lets-get-to-the-bottom-of-this/
  33. Web application/Progressive loading, http://docforge.com/wiki/Web_application/Progressive_loading
  34. Infinite scroll search-friendly recommendations, http://googlewebmastercentral.blogspot.ca/2014/02/infinite-scroll-search-friendly.html
  35. Infinite Scrolling: Let’s Get To The Bottom Of This, http://www.smashingmagazine.com/2013/05/03/infinite-scrolling-lets-get-to-the-bottom-of-this/
  36. Infinite Scroll On Ecommerce Websites: The Pros And Cons, http://www.lyonscg.com/insights/infinite-scroll-on-ecommerce-websites-the-pros-and-cons/
  37. Why did infinite scroll fail at Etsy?, http://danwin.com/2013/01/infinite-scroll-fail-etsy/
  38. Brazillian Virtual Mall MuccaShop Increases Revenue by 25% with Installment of Infinite Scroll Browsing Feature, https://web.archive.org/web/20131106172124/http://www.ereleases.com/pr/brazillian-virtual-mall-muccashop-increases-revenue-25-installment-infinite-scroll-browsing-feature-135237
  39. Typographical error, http://en.wikipedia.org/wiki/Typographical_error
  40. Better infinite scrolling, http://scrollsample.appspot.com/items
  41. The Paradox of Choice, http://en.wikipedia.org/wiki/The_Paradox_of_Choice:_Why_More_Is_Less
  42. Search Patterns: Design for Discovery, [page 95]
  43. Adding product filter on eCommerce website boosts revenues by 76%, https://vwo.com/blog/product-filter-ecommerce-ab-testing-revenue/
  44. Configuring URL Parameters in Webmaster Tools, https://www.youtube.com/watch?v=DiEYcBZ36po&feature=youtu.be&t=1m37s
  45. Permutation, Combination – Calculator, http://easycalculation.com/statistics/permutation-combination.php
  46. Faceted navigation best (and 5 of the worst) practices, http://googlewebmastercentral.blogspot.ca/2014/02/faceted-navigation-best-and-5-of-worst.html
  47. Implement the First 1-2 Levels of the E-Commerce Hierarchy as Custom Sub-Category Pages, http://baymard.com/blog/ecommerce-sub-category-pages
  48. Implementing Pagination Attributes Correctly For Google, http://searchengineland.com/implementing-pagination-attributes-correctly-for-google-114970
  49. Do URLs in robots.txt pass PageRank? https://productforums.google.com/forum/#!category-topic/webmasters/crawling-indexing–ranking/OTeGqIhJmjo
  50. Faceted navigation best (and 5 of the worst) practices, http://googlewebmastercentral.blogspot.fr/2014/02/faceted-navigation-best-and-5-of-worst.html
  51. URL Parameters in Webmaster Tools, https://docs.google.com/presentation/d/1xWy5TOkB4rwoUHXFPgwVMgl2Op9PayZOWa5wdW7ZB-o/present?pli=1&ueb=true#slide=id.g6205f11_0_28, page 18
  52. Persistent cookie, http://en.wikipedia.org/wiki/HTTP_cookie#Persistent_cookie

CHAPTER 8

Product Detail Pages (PDPs)

Length: 14,287 words

Estimated reading time: 1 hour, 40 minutes

Product Detail Pages

Many marketers consider product detail pages (PDPs) the “bread and butter” of ecommerce websites. Since that is where the add-to-cart micro-conversion happens, PDPs are considered the “money pages” and tend to get the most SEO attention. After all, if you do not rank when someone searches for your products, you will not have the chance to sell them. While the focus of the product detail pages is to convince and convert, conversion elements have to be balanced with SEO.

In this part of the course I will break down the most important sections found on product detail pages, and we will look at ways to optimize for a better search experience.

I will explain how to optimize URLs for product detail pages, and then we will go into details about optimizing images and videos. We will also dig into optimizing product descriptions, and how to handle product variations and thin content.

Then we will see how you can optimize product names and discuss why it is important for SEO to collect and properly optimize product reviews.

Since products go in and out of stock often, I will show you how to address this situation as well. And finally, we will learn how to optimize page titles for ecommerce.

URLs

It is a good idea to keep products on category-free URLs whenever possible, because products can be moved from one category to another and category names can change over time. Neither change is desirable, because re-categorizing or renaming means handling 301 redirects, possibly even 301 redirect chains, which can quickly become a headache.

While a product can be accessed through multiple paths due to multi-categorization, the final PDP URL should not contain categories or subcategories.

Use: mysite.com/product-name instead of mysite.com/category-1/product-name or mysite.com/category-2/product-name

If you need to feature categories in the URL, you need to decide on a canonical URL for each product, then point the rel="canonical" to the representative URL on all the possible URL paths that lead to that product. Also, try to link only to the canonical URL, especially in the global navigation and on internal links.
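
As a rough illustration of pointing every category path at one representative URL, here is a minimal Python sketch. The CANONICAL_CATEGORY lookup and the function name are hypothetical, and the sketch assumes the product slug is always the last URL path segment; your platform will likely store this mapping differently.

```python
from html import escape
from urllib.parse import urlsplit

# Hypothetical lookup: the one category chosen as canonical for each product slug.
CANONICAL_CATEGORY = {"product-name": "category-1"}


def canonical_link_tag(requested_url: str) -> str:
    """Build the <link rel="canonical"> element for any path that leads to a product.

    Assumes the last path segment of the requested URL is the product slug.
    """
    parts = urlsplit(requested_url)
    slug = parts.path.rstrip("/").rsplit("/", 1)[-1]
    category = CANONICAL_CATEGORY[slug]
    canonical = f"{parts.scheme}://{parts.netloc}/{category}/{slug}"
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}" />'


# Both category paths produce the same tag, pointing at the chosen URL.
print(canonical_link_tag("https://mysite.com/category-1/product-name"))
print(canonical_link_tag("https://mysite.com/category-2/product-name"))
```

The resulting tag would be placed in the head of every URL variant of the product page.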

If the product comes in multiple variations, then the URLs for those SKUs should contain some important SKU attributes (e.g., the manufacturer, the brand name, the color attribute).

The URL might look like this:
mywebsite.com/brand-SKU-name-important-attribute

Keep in mind that including the brand in the URL is OK, since an SKU belongs to one brand only. If you need to use categories or subcategories to generate PDP URLs:

  • Set the category or subcategory name in stone.
  • Use the product’s canonical category and keep the product under that category or subcategory.

Images

Users can get product info straight from images, including details that are not covered in product descriptions (which are mainly skimmed, not read in detail). So, it is no surprise that high-quality images, taken from multiple angles and showing the product in action, increase user satisfaction. However, images also need to be optimized for search engines.

When it comes to increasing conversion rates, savvy online retailers understand the importance of images, especially product images. A study[1] of online consumers found that:

  • 67% of consumers believe that an image is “very important” when selecting a product.
  • More than 50% of consumers value the quality of a product image more than product information, description, or ratings and reviews.

From an SEO point of view, product images can drive organic traffic through Google Image Search and universal results that include images. Images can also be used to improve the document relevance, and to optimize internal linking.

To understand images, search engines will first look for the alternative text of the HTML image element, img. Some search engines will be able to extract the text from images, using a technique called optical character recognition, or OCR.

Let’s optimize an image for SEO. We will start with a very basic implementation of the image element, ending up with a highly optimized, SEO-friendly image tag. (Note: I will use the terms tag and element interchangeably.)

This is the basic image element:
<img src="0012adsds.gif" />

Before we proceed, how do search engines analyze images? Here are some signals that search engines use to understand, categorize, and rank images:

  • They take into consideration colors, sizes and image resolution.
  • They look at the image type (e.g., is the image a photo, a drawing, or clip art?).
  • They also weight text by its distance from an image and extract context from the text around an image.
  • They look at the overall theme of the website. For instance, adult websites will have all images tagged as “adult” and will be filtered out when the safe search filter is on.
  • Search engines will use the alt attribute of the image tag. The content of the alt text is directly used in document relevance analysis. The title attribute of the image tag is not cached, but it can provide additional context.
  • They also use image file names.
  • They look at the total number of thumbnail images located on the same webpage as the ranked image.
  • They use OCR (optical character recognition) to extract text from images.
  • They use self-learning artificial intelligence trained by human input at large scale. ReCaptcha is one form of human input.

As you can see, search engines take into consideration plenty of clues when analyzing images. For those who want to know more about this subject, a Microsoft patent application from 2008 provides an interesting description of how images are ranked for image search.[2]

Did you know that when you solve an online captcha, Google (and possibly other search engines) uses that input to validate or refine artificial intelligence for image recognition? There are about 200 million captchas typed in every single day; that is a lot of human validation. If you are interested in this subject, watch the TED talk about ReCaptcha[3] and massive-scale online collaboration.

Here are some image optimization best practices:

Take your own product images
This is not an SEO factor per se, but it will help you differentiate from competitors, and can open doors for image licensing partnerships (which may come with some valuable backlinks). However, having familiar imagery is important when searchers look for a product they already know. If you do take your own product images, differentiate, but try not to alter the look of the products too much.

Add an alt attribute to every significant image
Adding alt text to images is the best way to give search engines more information about the image and the page content. Without the alt text, the chances of an image being indexed in Google Images are lowered.
<img src="0012adsds.gif" alt="yellow t-shirt" />

The only attribute of the img element that gets cached by search engines is the content of the alt attribute.

Here’s a typical product listing grid:

Figure 379 – Products displayed in a grid view on a category listing page.

Below is the content that will be cached by search engines, based on the alt attributes:

Figure 380 – The alt texts are highlighted with a red border.

The alt attributes should contain keywords, but those should not be simply a list of keywords. When writing the alternative text for your product images, think of how you would describe the product image to a blind person in a very succinct and relevant way, in fewer than 150 characters. That sentence will be your alt attribute.

Most of the time, the alt text of a product thumbnail image is the exact product name. In the case of thumbnails for a category listing page, the alt text is the category name.

However, you can add a few more details by including significant product attributes. Instead of the alt text alt=“DG2 Stretch Denim Long Skirt”, you could use alt=“DG2 Stretch Denim Long Skirt in brown”.
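
If your alt texts are generated from catalog data, the rule above can be sketched in a few lines of Python. The function name and the "in <attribute>" phrasing are my own assumptions for illustration; the 150-character limit comes from the guideline above.

```python
def build_alt_text(product_name: str, attribute: str = "", max_length: int = 150) -> str:
    """Return the product name, optionally extended with one significant attribute.

    The attribute is only appended when the result stays under max_length characters.
    """
    alt = product_name.strip()
    if attribute.strip():
        candidate = f"{alt} in {attribute.strip()}"
        if len(candidate) < max_length:
            alt = candidate
    return alt


print(build_alt_text("DG2 Stretch Denim Long Skirt", "brown"))
```

A template like this only covers the simple case; alt texts for hero images are usually still worth writing by hand.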

Spacer images, 1px gifs, or other images used just for design purposes should still have an alt attribute, but it should be empty: alt="". This is mostly for code validation and cross-browser compatibility. All other images that visually depict something important to visitors should have descriptive text.

Microsoft recommends the following:

“Place relevant text near the beginning of the alt attribute to enable search engines to better correlate the keywords with the image. A copyright symbol or other copyright notice at the beginning of the alt attribute will indicate to the search engines that the most search-relevant aspect of the image is the copyright, rather than what the image depicts. If you require a copyright notice, consider moving it to the end of the alt attribute text”.[4]

More of Microsoft’s recommendations for alt text can be found in their “Image Guidelines for SEO” documentation.[5]

More and more websites have started using CSS sprites, to reduce the number of HTTP requests made to the web server, thus improving page load times. While this is great, the implementation makes it impossible to add alt attributes, raising accessibility and SEO concerns. You can load icons, spacers, and other small images using CSS sprites, but product images should be loaded as single images with proper alt texts.

Use the title attribute
Search engines do not cache the content of the title attributes for images. However, this does not mean that search engines do not use the title attribute to extract relevance signals, or that you should not implement it. The title attribute displays as a tooltip on mouseover in many browsers and is used to give users additional information.

Figure 381 – Title attributes show up as tooltips when you hover the mouse over images. The Outdoor Storage thumbnail contains the title attribute “Outdoor Storage”.

If an image is representative (i.e., a product image), it requires an alternative text, and it can have a title attribute too. The content of the title attribute should not be an exact copy of the alt text, but rather should complement it. Keep the attribute short enough (i.e., under 255 characters), and do not just list keywords—create a meaningful sentence.

Our initial sample image tag can now be improved to read:
<img src="0012adsds.gif" alt="yellow t-shirt" title="athletic women wearing a yellow tee shirt" />

Do not underestimate title attributes just because search engines do not cache their content. They can play a big role in providing context to users, and we do not know how search engines use them to extract relevance.

Specify the width and height of the img tag
Let’s improve the img tag further, for faster browser rendering and better page load speed:
<img src="0012adsds.gif" alt="yellow t-shirt on a model" title="athletic women wearing a yellow tee shirt while running" height="250" width="100" />

Figure 382 – These image dimension attributes help with faster browser rendering.

Tip: for infinite scrolling or other image-heavy use, defer image loading until images are visible in the browser.

Use keyword-rich file names
You probably noticed the unfriendly file name used in the initial example: 0012adsds.gif. This filename does not help search engines understand what the image is about, and it should be avoided.

Figure 383 – This is an example of a good image file naming for a category thumbnail, as it includes the category name “brake discs”. The file names for product images should be even more specific.

Your file names should include the product name, the category name, or whatever is depicted in the image. Having keywords in file names has long been recognized as an SEO factor.[6]

A common challenge for large ecommerce websites that use hosted image solutions to manage, enhance, and publish media content, is that most of these solutions do not create SEO-friendly image names. For example, this URL is not SEO-friendly at all:
http://s7d5.samplemediahost.com/is/image/TA/116015_ZZZZ?$pdppreview_360$

Talk to your provider to find out whether there is a workaround to achieve a better file-naming convention.

If we further optimize our example to include a relevant file name, we now have the following image tag:
<img src="yellow-t-shirt.gif" alt="yellow t-shirt on a model" title="athletic women wearing a yellow tee shirt while running" height="250" width="100" />

You should set and enforce image-naming rules. Otherwise, things can quickly get messy. For example, you could have the rule to append the image ID at the end of the file name after two plus signs, as in this example:
yellow-t-shirt++0012.gif
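
A naming rule like this is easier to enforce when file names are generated rather than typed. Below is a minimal Python sketch of the "++ID" convention from the example above; the function name and the slug rules (lowercase, hyphens between words) are assumptions, not a standard.

```python
import re


def image_file_name(description: str, image_id: str, extension: str = "gif") -> str:
    """Turn a short description into a keyword-rich file name ending in ++<image ID>."""
    # Lowercase the text and collapse every run of non-alphanumeric characters into a hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{slug}++{image_id}.{extension}"


print(image_file_name("Yellow T-Shirt", "0012"))
```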

Provide context for your images by using captions and nearby text
Image captions or nearby text surrounding the image can provide context to search engines.

Figure 384 – The descriptions in this screenshot provide context to search engines.

Apart from adding relevant image captions, you can provide a better context for your images by placing plain text content nearby. Whenever possible, add a relevant sentence close to the image, both visually and in the HTML code.

Here’s how our img element can be improved even further by adding a caption to it:
<img src="yellow-t-shirt.gif" alt="yellow t-shirt on a model" title="athletic women wearing a yellow tee shirt while running" height="250" width="100" />Adidas 2011 Summer Collection <br /> Yellow T-Shirt

In many cases, the caption will be the product or the category name.

Create standalone landing pages for each image
If it makes sense (e.g., if you sell stock images), create dedicated landing pages for each image.

Figure 385 – This is an example of a clean and useful landing page dedicated to a single image. In our example, the image is the product that is sold online.

You can also encourage users to generate content on your website by allowing them to comment, share or rate images.

Make use of image XML Sitemaps
Create image XML Sitemaps and include information about your product and category images. Here are the official guidelines on how to accomplish this.[7]

Figure 386 – The basic information in the image Sitemap should include the path for your image files.

You will also be able to specify information such as image caption, geolocation, title attribute, and license. Once you have generated this file, submit it to Google using the Search Console.
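
To make the Sitemap structure more concrete, here is a minimal Python sketch that builds an image XML Sitemap with the standard library. It uses the sitemaps.org namespace and Google's image Sitemap extension namespace, and it emits only the image path (image:loc); the function name and the sample URLs are hypothetical, and you should check the official guidelines for the tags that are currently supported.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages: dict) -> str:
    """Build an image Sitemap from a {page URL: [image URLs]} mapping."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            # One <image:image> entry per image that appears on the page.
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


print(build_image_sitemap({
    "http://www.domain.com/yellow-t-shirt": [
        "http://www.domain.com/t-shirts/yellow-t-shirt.gif",
    ],
}))
```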

Add EXIF data to your images
At least one search engine (Google) has confirmed that it uses EXIF[8] data when analyzing images. More and more cameras and mobile devices automatically add EXIF information such as geolocation, the owner of the picture, or the camera orientation. If this data has the potential to provide search engines with additional info about images (and apparently it does), edit the EXIF data for your product and category images. Do not make this a top priority, however.

Adding image metadata such as User Comments can be a good way to reinforce your image’s title or alt text. Other metadata that may be useful are Artist, Copyright or Image Description.

Figure 387 – You can use EXIF editors to change images’ metadata.

It may be worth testing how adding EXIF metadata affects traffic from Image Search. Just keep in mind that Google re-crawls images at a much lower rate than the regular web. Also, it is worth mentioning that when you use image optimization tools to reduce the image size, you can accidentally remove existing EXIF data.

Group similar images into folders
If possible and appropriate, all images that can be logically grouped around a similar theme should be grouped into folders. You can replicate the directory taxonomy of your website to use it for images as well.

If the URL path for your T-shirts category is mysite.com/clothing/tshirts, then your images can be placed under mysite.com/clothing/tshirts/images/

Figure 388 – You can see how this t-shirt image is located under the /t-shirt/ directory. This directory contains only t-shirt images.

The advantages of grouping into folders are that you will be able to add keywords to the image URL and provide relevance clues to Google Image Search users. While grouping has limited influence on rankings, keywords in the directory structure are some of the signals search engines are looking for.

In our example, if we put the image under the /t-shirts/ directory, the img tag becomes:
<img src="/t-shirts/yellow-t-shirt.gif" alt="yellow t-shirt on a model" title="athletic women wearing a yellow tee shirt while running" height="250" width="100" />Adidas 2011 Summer Collection <br /> Yellow T-Shirt

You should place your adult (or other sensitive) images into separate directories.

Use absolute image source paths
The way you reference the image source (src) does not directly influence rankings but using absolute instead of relative paths can help to avoid problems with crawling, broken links, content scrapers, and 404 errors.

If we update the source to reference an absolute path, our example becomes this:
<img src="http://www.domain.com/t-shirts/yellow-t-shirt.gif" alt="yellow t-shirt on a model" title="athletic women wearing a yellow tee shirt while running" height="250" width="100" />Adidas 2011 Summer Collection <br /> Yellow T-Shirt

Make the images accessible through plain HTML links
Try not to use Flash or JavaScript to create slideshows, swatches, zooming, or other similar features; that can make it impossible for search engines to find the URLs of important images. If you have to use JavaScript, provide alternative image URLs. Otherwise, search engines may not be able to crawl image URLs easily.

Search engines know that users like high-quality images, so always keep the high-resolution product image URLs accessible when JavaScript is disabled. Search engines can execute JavaScript to some extent, but if the only way to reach a product image is with JavaScript enabled, crawlers may never discover that image.

CRO tip: Place your product images above the fold.

Implement plain text web buttons
One technique for improving page load speed, and in some cases even the internal link relevance, is to create web buttons with CSS and HTML. Instead of using the classic web button made of an image, you mimic the appearance of the button by overlaying plain text on a CSS-styled background.

Here’s what a sample implementation looks like:

Figure 389 – The text on the blue background buttons (e.g., “Buy Celebrex”) is plain HTML text that can be selected with the mouse.

Since search engines seem to assign a bit more weight to text links than to image alt text, this technique has the potential to increase the relevance of the linked-to page.

The opposite of this technique is to take “unwanted” text (e.g., site-wide boilerplate text) and embed it in images. For instance, if you have a global footer that includes a regulatory warning at the bottom of each page of the website, you could embed it into an image, so it is not diluting the page relevance. This is a bit gray hat. Keep in mind that this technique is used mostly by spammers to pass email filters, and it can be flagged as spam on web pages, too. Use it at your own risk.

Figure 390 – The text in the image above is not plain HTML text; it is embedded in an image. With such tactics, keep in mind that Google is fully capable of reading text from images.

Make it easy to share images
Whenever appropriate, make it easy for users to share and embed your images. This is great because with a mandatory image attribution requirement you can generate backlinks.

In this example take a look at how Flickr integrates social sharing and embed codes:

Figure 391 – You can encourage users to share images, especially product images. User-generated photos, such as a product in real life or inspirational pictures can be re-shared.

To recap, we started with a very basic image element:
<img src="0012adsds.gif" />

And we have ended up with this optimized version:
<img src="http://www.domain.com/t-shirts/yellow-t-shirt.gif" alt="yellow t-shirt on a model" title="athletic women wearing a yellow tee shirt while running" height="250" width="100" />Adidas 2011 Summer Collection <br /> Yellow T-Shirt
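
If your templates are rendered server-side, the optimized tag can be assembled by a small helper so that every product image gets the same attributes. This is only a sketch in Python: the function name and parameters are hypothetical, and the optional loading="lazy" attribute (a standard HTML attribute) is my assumption for how the earlier tip about deferring image loading might be applied.

```python
from html import escape


def build_img_tag(src, alt, title="", width=None, height=None, lazy=False):
    """Render an <img> element with escaped attribute values."""
    attributes = [("src", src), ("alt", alt)]
    if title:
        attributes.append(("title", title))
    if width is not None:
        attributes.append(("width", str(width)))
    if height is not None:
        attributes.append(("height", str(height)))
    if lazy:
        # Ask the browser to defer loading until the image is near the viewport.
        attributes.append(("loading", "lazy"))
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attributes
    )
    return f"<img {rendered} />"


print(build_img_tag(
    "http://www.domain.com/t-shirts/yellow-t-shirt.gif",
    "yellow t-shirt on a model",
    title="athletic women wearing a yellow tee shirt while running",
    width=100,
    height=250,
))
```

Escaping the values matters in practice, because product names often contain quotes or ampersands that would otherwise break the markup.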

Videos

A 2011 study[9] found that videos in search results have a 41% higher CTR than plain-text results. One online retailer found that visitors who view product videos are 85% more likely to buy than visitors who do not.[10] According to econsultancy, Zappos sales increased between 6% and 30% on products with product videos.[11] There are multiple benefits to having videos on PDPs, so there is no doubt that you should do it if the budget permits.

Videos can be either self-hosted on your servers or hosted by a third-party provider. Some providers, like YouTube, are free, while others, like Wistia, are paid. Keep in mind that if you host videos on third-party websites rather than on your own website, you might miss the opportunity to gather links to your videos. If the video goes viral, you will miss out on a lot of backlinks and social signals. For example, Dollar Shave Club launched its viral video on YouTube, and its website gathered almost 20,000 backlinks.

Many websites mentioned the brand name without linking to Dollar Shave Club; however, most of the websites linked to the YouTube video. Had the company self-hosted the video, people would have linked directly to the video URL on the Dollar Shave Club website, thus increasing its backlinks.

Figure 392 – The number of backlinks for dollarshaveclub.com spiked after the video went viral. This is a positive side effect of the video becoming viral.

However, take a look at the number of backlinks the YouTube URL gathered:

Figure 393 – If these links had pointed to the Dollar Shave Club website, they would have helped increase its search authority.

Here are several tactics to get the most out of your product videos:

  • Transcribe the video and make the text available to search engines, whenever doing so makes sense (e.g., when you have an expert video-reviewing a product, or when you have how-to videos).
  • Add social sharing buttons and easy-to-use embed codes.
  • Create video XML Sitemaps[12] and submit them to Google and Bing.
  • Repurpose your videos to produce related content—e.g., presentations, user manuals, instructographics,[13] podcasts, etc. You can go the other way too: use other existing media types to create the videos.
  • Mark up the product video with schema.org vocabulary.[14]
  • If possible, embed the video with HTML5 rather than iframes.
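
As an illustration of the schema.org markup mentioned in the list above, here is a minimal Python sketch that generates VideoObject markup. Note that it uses the JSON-LD format, which is my choice for the example rather than something this guide prescribes; the property names (name, description, thumbnailUrl, uploadDate, contentUrl) are schema.org VideoObject properties, while the function name and all sample values are hypothetical.

```python
import json


def video_json_ld(name, description, thumbnail_url, upload_date, content_url):
    """Return a <script> block with schema.org VideoObject markup in JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "thumbnailUrl": thumbnail_url,
        "uploadDate": upload_date,
        "contentUrl": content_url,
    }
    # Escape "</" so a value can never close the <script> element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"


print(video_json_ld(
    "Yellow T-Shirt product video",
    "A short look at the yellow t-shirt in action.",
    "http://www.domain.com/t-shirts/yellow-t-shirt.gif",
    "2011-06-01",
    "http://www.domain.com/videos/yellow-t-shirt.mp4",
))
```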

It is best to either self-host the videos or use a paid hosting solution that allows you to embed the videos on your own URLs. This will increase the chances of getting video-rich SERP snippets for your domain name. YouTube provides rich snippets for the domains the videos are embedded on, but only sporadically.

The next image is a representation of how Google ranks a YouTube video in the first spot, while Zappos does not get a video rich snippet although a video is included on their page.

Figure 394 – Why does Google rank its own property (YouTube) above the content creator (Zappos)?

Product descriptions

Product descriptions should be written to improve conversions by creating an emotional connection with users and enticing them to act. Evoking any emotion is generally better than evoking none at all. While most people only skim descriptions, if you carefully craft the first sentence to be engaging enough, you will increase the chances of making a sale.

The best product descriptions are written by copywriters who have familiarity with the product and some basic SEO training, and not by SEOs with some copywriting skills.

Figure 395 – Read this description. It does not read like the classic SEO style you are used to, does it?

It is a lot of work to write converting product descriptions, and therefore you may be tempted to skip it. However, the good news is that many of your competitors will not invest in great product descriptions, for the same reason. You can capitalize on their mindset and start differentiating your brand and gaining SEO advantage at the same time.

Prominently display a brief product copy crafted to sell the benefits of the product (also known as the “benefits copy”), in an easy-to-spot place on the product page. You can complement it with a more detailed product copy that describes the product features (also known as the “features copy”), on a less important area of the page.

Try to incorporate the following into the copy:

  • Product-related keywords (e.g., SKU numbers, UPC numbers, catalog numbers, ISBNs, part numbers, etc.).
  • The root form of the words used in the product name, as well as variations and synonyms (e.g., “seat”, “seating”, “chair”).
  • The product name (make sure you repeat it in the product description at least once).
  • Other names the product might be known by.

Doing this will be important not only for search engines but for your internal site search as well (given that the site search uses multiple data sources to score and rank items). So, review your analytics data, incorporate frequently searched queries into your product descriptions, and use the same queries to feed your internal site search database.
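
A quick way to audit existing copy against the checklist above is to test each description for the terms you expect to find in it. The following is a minimal Python sketch using a simple case-insensitive substring match; the function name, the product, and the part number are hypothetical.

```python
from typing import List


def missing_terms(description: str, terms: List[str]) -> List[str]:
    """Return the expected terms that do not appear in the product description."""
    text = description.lower()
    return [term for term in terms if term.lower() not in text]


description = (
    "The Ergo Task Chair offers adjustable seating for long workdays. "
    "Part number ETC-2040."
)
# The product name, a part number, a root word, and another name for the product.
expected = ["Ergo Task Chair", "ETC-2040", "seat", "office chair"]
print(missing_terms(description, expected))
```

Because the match is a plain substring check, a root form such as “seat” is also found inside “seating”. The same list of expected terms could come from your internal site search queries.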

Figure 396 – Section (1) of the copy focuses on benefits, while section (2) lists the features.

In the previous image, section 1 focuses on benefits while section 2 lists the product features. The layout of the PDP allows the features copy to follow immediately below the benefits copy, which is ideal but not always possible due to design constraints. In many cases, the page layout allows room for only one or two sentences and a hyperlink to a section further down the page, where you can list more detailed product info (e.g., details presented in tabbed navigation).

Figure 397 – Tabbed navigation allows space for longer product descriptions.

Tabbed navigation is a very common design element on PDPs, but there are some concerns about how search engines treat content that is not visible to users (e.g., in non-active tabs, in accordions, in “read more” dropdowns, or in collapse & expand sections).

Search engines will index this content, but they do not assign the same weight to content that is not visible to users. In the past, Google suggested that text hidden for design purposes is fine as long as you do not hide too much content with too many links.[15] However, that only means this technique is not considered spam. With the coming mobile-first indexing, the content behind tabs will be assigned the same weight as visible text.

You can create separate URLs for each tab, but that would decrease the overall content on the product detail page, which is not a good idea. Moreover, the user experience is better if the tabs can be switched quickly, without reloading the page.

An interim solution, until mobile-first indexing is rolled out for your website, is to display the entire product description without any expand, collapse, or tabs. Search engines seem to prefer this, but design limitations introduce constraints.

Figure 398 – The above is a well-written product description that is fully visible without any parts of the content behind tabs or read more. This is also a very good example of how a “boring” product can have a great description.

If you want to use tabs to make the user experience more pleasant, consider the following:

  • Display the product description (or other more important content) in the default active tab. If search engines can understand which content is hidden, and which one is not, then putting the most important content in the default tab increases the chance of getting more out of it.

Figure 399 – The product description tab is the default active tab and is accessible when bots request the page. Additionally, the implementation uses hash marks to switch between tabs, which means that the content of the Specs and Reviews tabs is already loaded in the HTML code.

  • Do not generate separate URLs for each tab, unless the information provided is substantial enough to justify creating a new page.
  • If you want the content inside the tabs to be indexed, make sure it is available with JavaScript disabled.
  • If the tabs contain the same boilerplate text on all product detail pages (e.g., shipping information, legal, etc.), you can put the repetitive text in an iframe to avoid duplicate content issues.
  • Consider placing user reviews outside the tabbed navigation.

Product descriptions are one of the best spots to feature internal contextual links. Ideally, you will link to parent categories in the same silo (maybe using the same URLs as in breadcrumbs), but you can also link to other related products that make sense for users. Because internal links may be taking users away from the product page, you should balance internal linking and conversion.

If you are not careful, product descriptions can generate duplicate content, either within your website (if the same product description is used across multiple product variations’ URLs) or on external websites (if you use generic manufacturer-supplied descriptions).

Manufacturer-supplied descriptions
The general SEO wisdom is that you should write your own unique product descriptions. That is indeed one of the best approaches to optimizing PDPs if you can put it in practice. However, keep in mind that this does not work with every product or within every industry. For example, it makes sense to write unique product descriptions for expensive wristwatches, but not for ordinary pencils. Also, this is often not economically feasible for websites with very large inventories.

Moreover, in some cases, Google ignores the product description and chooses to rank what’s best for users based on their intent and location.[16] So, while unique product descriptions may not always rank at the top, you still have to test the impact of writing 100- to 200-word product descriptions at scale, before deciding whether it will work to your advantage. Start with your top 10% most important items, write the descriptions for conversions and branding, then measure the impact on rankings and traffic. If 10% is too much based on your inventory size, then start with the top 100 to 150 products.
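
To pick the first batch of products for such a test, you could rank your inventory and cut the list as described above. Here is a minimal Python sketch; ranking by revenue is my assumption for what “most important” means, and the function name, the SKU names, and the 150-item cap used when 10% is too much are illustrative choices.

```python
import math


def products_to_rewrite(revenue_by_sku: dict, share_percent: int = 10, cap: int = 150) -> list:
    """Return the top share of SKUs by revenue, limited to at most `cap` items."""
    ranked = sorted(revenue_by_sku, key=revenue_by_sku.get, reverse=True)
    count = min(math.ceil(len(ranked) * share_percent / 100), cap)
    return ranked[:count]


# A small hypothetical inventory: 20 SKUs, so the top 10% is 2 products.
revenue = {f"sku-{i}": i * 100 for i in range(20)}
print(products_to_rewrite(revenue))
```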

I know at least two websites that were able to rank for very competitive keywords, by creating unique and very compelling product descriptions. Because 90%+ of the pages on those websites were PDPs, they drove up the relevance of the entire website. Those two websites rank now with almost no backlinks pointing to them.

If the generic description provided by the manufacturer/supplier is just a small part of the main content on a page, it should be fine, according to Google.[17] If the manufacturer requires you to keep their descriptions unaltered and that causes SEO problems, place each description in an iframe and apply a noindex to the iframe source. In this case, you will need to add unique content to differentiate your website from competitors.

Product variations

Even with unique product descriptions, you can run into crawling and duplicate content issues if products come in multiple variations. For example, the Nike Dual Fusion shoes come in red, green, and black; this can generate unique URLs for each possible product variation. Usually, product variations generate exact or near-duplicate content, and such cases are best handled with rel="canonical". However, rel="canonical" is not the only solution.

Decide how to handle product variations once you understand how your target market searches online; base the decision on your business goals.

For instance, if your target market uses search queries that include product variation keywords, you may want to have unique URLs for each variation. Make those pages available to users and search engines, and do not use rel="canonical" to point to a representative URL. The challenge is to make these pages compelling by adding unique product descriptions for each variation.

Let’s discuss a few approaches to handling product variations.

URL consolidation

With this approach, you handle all different product options in the interface, using a design that helps users make faster and better product selections. All product variations are listed on a single product detail URL that does not change when an option is selected in the interface.

For instance, you can provide product options with dropdowns or swatches, as depicted in the next screenshot.

Figure 400 – Changing the color and size options with a drop-down selector does not change the URL.

Figure 401 – The same with this example. These swatches change the product image and product description but at the same URL.

To increase the chances of the consolidated page surfacing in SERPs for queries that include variations such as colors (e.g., “Nike Dual Fusion 2 Run Gray”), include the variation as plain text copy in a search engine friendly way. For instance, the product description or the specs copy would include something like, “this item is also available in gray, red and blue”.
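
If your color options already live in the database, this sentence can be generated rather than typed by hand. The following is a small sketch; the function name and the exact wording of the sentence are illustrative assumptions.

```python
def availability_sentence(colors):
    """Build plain-text copy listing the colors a product comes in."""
    colors = [c.strip().lower() for c in colors if c.strip()]
    if not colors:
        return ""
    if len(colors) == 1:
        listed = colors[0]
    else:
        listed = ", ".join(colors[:-1]) + " and " + colors[-1]
    return f"This item is also available in {listed}."
```

The returned string should be rendered as regular text in the description or specs area, so it is present in the HTML source.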

If you already have unique URLs for each product variation, and you want to consolidate them into one representative URL, you can either 301 redirect or use rel=“canonical” to point to the authoritative URL.

Unique URLs for each product variation

Having unique URLs for each product variation allows product variation pages to show up in SERPs for queries that include product attributes. However, since the content on these pages is very similar, they might compete against each other or, worse, be filtered from the SERPs entirely. If you use unique URLs for each variation, each variation page should include a self-referencing rel=“canonical” tag.

Additionally, having individual URLs for each product variation may dilute indexing properties and link authority, because people may link to multiple SKU URLs instead of a single one.

This approach should be implemented if your data shows that your target market searches for various product attributes such as model numbers, colors, sizes, etc. (e.g., different tire sizes, like P195 / 65 R15 89H).

The challenge and key to success with this approach is to create unique enough content for each variation page so that it does not get filtered by Panda, and it does not create duplicate content pages.

Unique URLs for each product variation with a canonical URL

A hybrid approach is to use the interface to allow users to select product options without changing the URL, and at the same time have separate URLs for each product variation. Each product variation URL can point to the authoritative product using rel=“canonical”. This is how Zappos handles product variations.
The canonical product page is http://www.zappos.com/nike-dual-fusion-run-2~2.

The following product variation URLs (different color models) point to the canonical URL above:
http://www.zappos.com/nike-dual-fusion-run-2-pure-platinum-dark-gray-gym-red-black
http://www.zappos.com/nike-dual-fusion-run-2-black-pure-platinum-dark-gray-metallic-platinum

Because all color variation pages point to a canonical page, Zappos ranks in Google SERPs with the canonical URL, even when someone searches for a color-specific product.
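
To make the hybrid setup concrete, here is a minimal sketch of how a template could pick the canonical tag for any PDP URL. The mapping and the example.com URLs are hypothetical; this is not how Zappos implements it, just one way to express the same idea.

```python
from html import escape

def canonical_link(canonical_url):
    """Render the <link rel="canonical"> element for the page <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'

def canonical_for(url, variation_to_canonical):
    """Variation URLs point to their canonical page.

    URLs that are not in the mapping get a self-referencing canonical.
    """
    return canonical_link(variation_to_canonical.get(url, url))

# Hypothetical mapping of variation URLs to the representative URL.
mapping = {
    "https://www.example.com/shoe-red": "https://www.example.com/shoe",
    "https://www.example.com/shoe-black": "https://www.example.com/shoe",
}
```

Calling canonical_for("https://www.example.com/shoe-red", mapping) yields a tag pointing to https://www.example.com/shoe, while the canonical page itself references its own URL.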

Figure 402 – Zappos ranks with the canonical PDP, which is the page for the pink shoes. Users will have to take additional steps to find the red color option, which is probably not the best search experience.

Some of the advantages of having separate URLs for product variations are:

  • Users can share each variation URL.
  • You can link internally to specific variations.
  • People can backlink to any variation URL.
  • You can list product variations on internal site search results, or even on category pages.

For example, if an item is available in various colors, and someone visits the category page to select “red” from the faceted navigation filters, you will want to show only red items. If you do not have separate URLs for “red”, it is simply not possible for that user to share the “red shoes” page with someone else.

Additionally, if you run product listing ads for variation-specific keywords, it is better to send users to a product variation landing page. Pages targeting product attributes tend to convert better than just one-size-fits-all pages that require users to find the product filtering options.

I usually recommend creating separate URLs for the most important product variations, not just for SEO reasons, but also to provide better landing pages for PPC and PLA campaigns. If you get into SEO issues (crawling, duplicate content or ranking cannibalization), you can always noindex variation URLs, or you can implement rel=“canonical” to point to the representative URL.

The following is a quote from Google:

“Google allows rel=“canonical” from individual product variations to a general/default version (e.g., “Taccetti 53155 Pump in Beige” and “Taccetti 53166 Pump in Black” with rel=“canonical” to “Taccetti 53155 Pump”) as long as the general version mentions the product variations. By doing so, the general product page acts as a view-all page, and only the general version may surface in search results (suppressing the individual variation pages)”.[18]

Thin content

Even after creating unique descriptions for every single product in the database, you might find that pages have been filtered out of SERPs because they have been classified as “thin”. This means that the amount of content on the PDPs is not “relevant enough” for Google to include the URLs in its results.

Figure 403 – Highlighted in red is the entire description for this product, and pretty much the entire text content on this page. Unless this page has good authority, its chances of being included in the SERPs based solely on content are slim.

Ask your programmers to provide a .csv file that contains the word count for each product URL. If the website is relatively small, run a crawl with Screaming Frog and sort the URLs by WordCount. Your data needs to include only non-boilerplate text, such as the word count for product descriptions, user reviews or other forms of UGC. Get this list in Excel and sort by lowest count. Find pages with low content (e.g., under 50 words), and add them to the copywriting queue based on their importance. If there are too many pages with thin content (I usually set that threshold at around 50 words of content unique to the site), you may even consider noindexing them until you can add more content.
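
If you would rather script this than sort in Excel, the filtering step can be sketched in a few lines of Python. The column names "url" and "word_count" are assumptions about how the export is laid out, and the word counts are assumed to already exclude boilerplate text.

```python
import csv
import io

def thin_pages(csv_text, threshold=50):
    """Return (url, word_count) pairs below the threshold, lowest first.

    Assumes columns named "url" and "word_count" (hypothetical names).
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    thin = [(r["url"], int(r["word_count"])) for r in rows
            if int(r["word_count"]) < threshold]
    return sorted(thin, key=lambda row: row[1])

sample = "url,word_count\n/p/1,120\n/p/2,12\n/p/3,49\n/p/4,50\n"
queue = thin_pages(sample)  # [("/p/2", 12), ("/p/3", 49)]
```

The resulting list is the starting point for the copywriting queue; you would still reorder it by product importance.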

Product names

Product names are one of the elements that attract the user’s eye within moments of landing on a page. On most ecommerce websites, the design of the PDPs usually follows the same pattern: the product image is to the left, the product name is either above the image or to its right side. The add to cart button is to the right of the product image, and the product info is either on the right or below the product image.

Probably this is why users scan PDPs using the well-known “F” pattern.

Figure 404 – The “F” pattern applies to ecommerce websites, too. Image source: NNgroup

Although there seems to be little correlation between rankings and H1 headings,[19] Google suggests[20] that it assigns more weight to H1s. It is therefore still a good idea to wrap the product name in an HTML heading element, preferably the H1.

The following is an excerpt from Google’s SEO Report Card, which aimed to identify potential areas for improvement on Google’s product pages:

“Most product main pages have an opportunity to use one <h1> tag … but they’re currently only using other heading tags (<h3> in this case) or larger font styling. While styling your text so it appears larger might achieve the same visual presentation, it does not provide the same semantic meaning to the search engine that an <h1> tag does. The product’s name and/or a few words about its features are great to have in an <h1> tag for the product main page”.[21]

However, if the document structure requires it, an H2 for the product name will work too. Note that the heading hierarchy on PDP templates will be different from the heading hierarchy on category pages or other page templates. Keep in mind that visually, the product name should be the largest font size on the product page.

Do not be afraid to create long product names. Two-column PDP layouts can easily accommodate this. Include the brand or the manufacturer associated with the product, especially if you sell products from multiple brands. Also include model numbers, collection names, SKU numbers, or other important product attributes.

Figure 405 – On this dress PDP, the product name includes the brand, the fabric, and the color. This is great info for users and search engines.

Figure 406 – The product name in the example above does not even include the category the product belongs to, slippers. It may be obvious to users that they are looking at slippers, but not including “slippers” in the product name is not good for search engines.

The person or team that adds new products to the catalog should be trained to understand how your target market searches for those products, and should propose product name templates based on that data.

This is not a complex process, and if you want to make sure you do not mess up the product names, add just the shortest product name in the database and then programmatically add other relevant product attributes to it.
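
Here is one way the "short name plus attributes" idea could look in code. The separator, the order of the parts, and the function name are my assumptions; your product name template should follow how your own market searches.

```python
def build_product_name(base_name, brand="", attributes=()):
    """Compose a display name from the short name stored in the database."""
    name = base_name.strip()
    # Prepend the brand only if the short name does not already contain it.
    if brand and brand.lower() not in name.lower():
        name = f"{brand} {name}"
    extras = [a.strip() for a in attributes if a and a.strip()]
    return ", ".join([name] + extras)
```

For example, build_product_name("Dual Fusion Run 2", "Nike", ["Men's Running Shoes", "Gray"]) returns "Nike Dual Fusion Run 2, Men's Running Shoes, Gray", while the database keeps only the short name.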

Product naming gets complicated when you do not have control over product names. That can happen when you run a marketplace where suppliers upload product sheets. In such cases, naming conventions are hard to create and enforce, and it may be better to let suppliers use open text fields for product names. If sellers upload products, you should enforce a maximum number of characters to be used in the title. Amazon, for example, has a maximum limit of 250 characters. It is also a good idea to have a system that checks that titles are not cut off mid-word at the 250-character limit.
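
A check like that can be very small. The sketch below shortens a title to the limit without cutting a word in half; the function name is hypothetical, and 250 is used as the default only because it is the figure mentioned above.

```python
def fit_title(title, max_chars=250):
    """Shorten a title to max_chars without cutting a word in half."""
    title = " ".join(title.split())  # collapse runs of whitespace
    if len(title) <= max_chars:
        return title
    # Look one character past the limit: if that character is a space,
    # the first max_chars characters already end on a word boundary.
    cut = title[:max_chars + 1]
    if " " not in cut:
        return title[:max_chars]  # a single long token; a hard cut is unavoidable
    return cut[:cut.rfind(" ")].rstrip()
```

You could run this at upload time and either store the shortened title or reject the submission and ask the seller to fix it.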

If you allow product names to be changed, give the update rights to one person only. Optimally, this person should be aware of the impact of changing product names (e.g., URLs might change, potential backlinks loss, internal linking updates, 301 redirects from old to new URLs, etc.).

Also, in most cases, it is a good idea to set product names in stone or not to update the product name URL when the product name changes. However, the latter option may pose some issues with new URLs containing old product names. So, you need to balance updating versus not updating URLs when product names change. Although not preferred, a solution is to keep the PDP URLs free of product names and use only product IDs in the URL. Consider this approach only if you cannot easily implement 301 redirects.

Use Schema.org Product[22] type to mark up your code with product names, brands, manufacturers, images, and a lot of other product properties. Search engines do not yet use many Product properties, but as long as you already keep product attributes in your database, it will not be much of a hassle to mark up your HTML code at a later date. Google supports some of these properties[23] and will gradually support even more. The preferred way to mark up the content is JSON-LD.
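
To show what this looks like in practice, here is a minimal sketch that builds a JSON-LD block from values you would pull from the database. The property names (name, brand, sku, image, description) come from the Schema.org Product type; the function name and which properties to include are my assumptions, so check Google's current documentation for what it requires.

```python
import json

def product_json_ld(name, brand, sku, image_url, description):
    """Return a <script> element holding Schema.org Product JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "brand": {"@type": "Brand", "name": brand},
        "sku": sku,
        "image": image_url,
        "description": description,
    }
    # Escape "</" so a value can never close the script element early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'
```

The returned string goes into the PDP template as-is, and the same data dictionary can be extended later as more properties become useful.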

Reviews

There is no doubt that product reviews are good for users and conversions. According to one study,[24] adding just the first review can increase conversion by 20%. Reviews enriched with additional info about the reviewer, or reviews that offer the ability to rate a particular product criterion (e.g., quality versus price) are even more useful for users.[25]

You can generate reviews by collecting them from people who purchased on your website, or you can integrate them from vendors who sell reviews. Keep in mind that it can take many purchases to get a single review. Anecdotally, it took Amazon 1,300 book sales to generate the first review[26] for Harry Potter and the Deathly Hallows.

This is why it is a good idea to implement both in-house and third-party reviews, especially if you are just starting out.

The reviews can end up on multiple URLs, depending on the solution you chose and how you customized their out-of-the-box implementation. This means reviews can occur on URLs on your own website, on the vendor’s website, or other competing websites. This is likely to create duplicate content issues, so you will need to pay attention to it. Just keep in mind that there is no such thing as a “duplicate content penalty”. However, if your reviews are duplicated somewhere else, they may not be working as well as you would expect.

When implementing reviews, first you need to decide which type of content you want to surface in SERPs for “reviews” related queries: do you want to rank PDP URLs or product review URLs that are specially constructed to target queries like “review + product names”.

Let’s say you want PDPs to show up in SERPs for “reviews” related keywords.
In this scenario, the reviews should be placed on the PDPs and should be openly available to search engine robots. This means that the reviews will not be inserted in the code with JavaScript, AJAX or other technology that loads them client-side. The reviews should be available in the HTML source code when a bot fetches the PDPs. All other pages, sections, or subdomains that list the same reviews should be blocked with robots.txt.

For instance, Amazon allows the reviews for this bicycle SKU (Kent Super 20 Boys Bike (20-Inch Wheels), Red/Black/White) to be indexed.

Figure 407 – The customer reviews are included on the PDP itself, and Google caches the reviews.

To check whether your reviews implementation is SEO friendly, look for the content of the reviews in the text-only cached version of the PDP. Additionally, do a “Fetch & Render” using Google Search Console to see if the reviews are showing up on the rendered page. Make sure you are rendering the mobile pages as well.
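
You can also automate a rough version of the first check against the raw HTML your server returns. The sketch below is deliberately crude (regular expressions are not a real HTML parser) and the function name is hypothetical; it ignores anything inside script elements, so reviews that exist only in JavaScript do not count.

```python
import re
from html import unescape

def review_in_source(html, review_snippet):
    """True if the review text appears in the server-delivered HTML text."""
    # Drop script blocks so text embedded in JavaScript is not counted.
    text = re.sub(r"<script.*?</script>", " ", html, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", text)  # strip the remaining tags
    text = " ".join(unescape(text).split()).lower()
    return " ".join(review_snippet.split()).lower() in text
```

Feed it the HTML as fetched without JavaScript execution. It does not replace the rendered-page check in Google Search Console.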

What if you want a dedicated product reviews page to show up in SERPs, instead of the PDP?
Some vendors require a subdomain or a directory to publish their reviews, e.g., reviews.mysite.com or mysite.com/reviews/product. This is not necessarily a bad thing, and it is a valid approach for merchants who plan to attract searchers in the research stage of the buying cycle. The “reviews” keyword modifier (e.g., “Under Armour Stormfront Jacket reviews”) suggests that users are closer to a buying decision. To increase the chances of product review pages ranking for “review”-related keywords, consider linking internally and externally with “reviews” in the anchor text.

Figure 408 – The reviews subdomain shows up in SERPs. This was a deliberate decision on Clinique’s side.

If you want the dedicated product reviews page to show up in SERPs, be careful with duplicate content on your website. In many cases, you will list the same product reviews on both the PDP and the product reviews page. In these instances, you will have to prevent crawlers from finding the reviews on the PDP. When you have the same reviews on multiple URLs, search engines will have difficulty identifying the right page to surface in SERPs.

To check whether your reviews generate duplicate content, copy a few sentences one at a time from various reviews, and do a “site:” search on Google:

Figure 409 – In the example above, the same review shows on 15 URLs. This needs to be investigated.

However, if you list only a small fraction of the total number of reviews on the PDP (e.g., five out of 50), then you can let search engines access the reviews on the PDP, as well as on the product reviews page. In this case, do not block the reviews subdomains/directory with robots.txt.

You can see this implemented on Amazon:

Figure 410 – Amazon has a dedicated directory for product review pages.

The reviews for Kent Super 20 Boys Bike (20-Inch Wheels), Red/Black/White (Sports) are accessible to Googlebot and have been cached by Google. Amazon can afford this approach because the product reviews page lists three times more reviews than the PDP.

Figure 411 – Amazon could improve the product reviews page by consolidating the three paginated pages into one superset.

Amazon opens reviews pages for bots to rank multiple review pages related to this bike:

Figure 412 – The first position is taken by the PDP, while the second and third rankings are taken by the product reviews page.

Here are some SEO considerations you may want to keep in mind when implementing reviews.

Pay attention to duplicate reviews on other websites
If your provider syndicates reviews in an SEO-friendly manner to other websites (meaning they are accessible and available for indexing by search engines), that will cause duplicate content issues. Again, it is not like you are going to get penalized for doing this, but the SEO effectiveness of the reviews will be diminished.

If the provider syndicates the reviews, you should allow crawlers to access duplicate reviews only if you add substantial unique content to the pages the reviews are listed on, in addition to the reviews offered by your provider. For example, you can feature your Expert Reviews or reviews that you collected on your own.

If 90% of the reviews are syndicated somewhere else on the Internet, wrap them within an iframe, put them on a subdomain blocked with robots.txt, or load them with AJAX.

You also need to be careful if you syndicate your reviews on comparison-shopping engines (CSEs).

Figure 413 – ABT is the source of the review, but Bizrate and Shopzilla have the same content indexed and might rank above ABT.

If you plan to syndicate reviews on CSEs, then select which reviews to keep for your website, and which ones to syndicate.

Mark up reviews and ratings with structured data such as microdata, microformats, or RDFa.
Use Schema.org vocabulary to mark up the reviews and ratings to get SERP rich snippets (stars, ratings, videos, etc.). The reviews and ratings have to be displayed on the same page as the relevant product. Google explains its implementation in detail in this article.[27]

Also, when working with Schema markup, this tool might come in handy: http://schema-creator.org/product.php.
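
If you generate the markup yourself, the rating summary can be computed from the reviews shown on the page. This is a minimal sketch using the Schema.org AggregateRating property names; the function name, the one-decimal rounding, and the 5-star scale are assumptions.

```python
def aggregate_rating(ratings, best=5):
    """Build a Schema.org AggregateRating dict from on-page review scores."""
    if not ratings:
        return None  # no reviews on the page, so no rating markup
    return {
        "@type": "AggregateRating",
        "ratingValue": round(sum(ratings) / len(ratings), 1),
        "reviewCount": len(ratings),
        "bestRating": best,
    }
```

The result would be attached to the product's structured data under the aggregateRating property, and only for reviews that are actually displayed on that page.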

Separate URLs for each review
If your current implementation generates separate URLs for each review, then using rel=“canonical” to point to a view-all reviews URL is acceptable.[28]

Display reviews of related products
If a product does not have any reviews, but there are other closely related items with reviews (e.g., the same pair of Nike shoes, but in a different color), you can display reviews for the related item. However, you have to ensure that the reviews make sense to users. Do not add semantic markup to these reviews, and place them in an iframe that is loaded with JavaScript or blocked with robots.txt; the purpose is to offer something useful for users, not to spam search engines.

Tip: Reviews are one of the best ways to keep product detail pages “fresh”. If you keep adding reviews to a product page regularly, the page will be crawled more often, its authority will increase, and it will show up in SERPs for more queries.

Other ways to freshen up PDPs are to include excerpts from relevant blog posts or to add one or two sentences from research papers related to the product (if applicable).

Expired, out-of-stock and seasonal products

Product lifecycles and seasonality can rarely be avoided. Some products can expire for good, and others can go out of stock. Out of those that go out of stock, some may be re-stocked, others will not. Other products are available only during a certain season, while some products are evergreen and never change or run out of stock. The way you handle product lifecycles from an SEO perspective depends on future inventory availability.

There is no definitively correct way to handle product lifecycles, but generally, try to:

  • Avoid removing the URLs of OOS items until you know whether the product will come back in stock. If you remove URLs, they will return a 404 status code. 404 pages are taken out of the index after a while, and you might lose some backlinks. If you need to return a 404, then at least create a custom 404 page that will help reduce bounce rates.
  • Also avoid serving soft 404s, which are light-content pages that respond with a 200 OK status code but display just “Sorry, the item is no longer available” or something similar.[29]
  • Do not 301 redirect every out of stock PDP to the home page or their parent category page. Since the home page is unrelated to the out of stock product, 301 redirecting a PDP will be treated as a soft 404, and this will not preserve indexing signals.
  • Use meta “expiry” if your items are not available after a certain date. Classified ads can be marked up with this tag.

Figure 414 – The header response code for nonexistent URLs should not be 200 OK.

In the screenshot above, you can see how the server responds with 200 OK for a dummy URL request. There are instances when this kind of setup makes sense, for example, when you want a PDP to load properly with only a product ID present in the URL. In such cases, you need to include a rel=“canonical” to the representative URL.
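
When you audit a list of URLs, a simple classifier can flag the suspicious responses for manual review. The sketch below works on a status code and page text you have already fetched; the phrase list and the labels are assumptions, and a phrase match only suggests a soft 404.

```python
def classify_response(status_code, body_text,
                      phrases=("no longer available", "page not found")):
    """Label a fetched page as a hard 404, a possible soft 404, or other."""
    text = body_text.lower()
    if status_code in (404, 410):
        return "hard 404"
    if status_code == 200 and any(p in text for p in phrases):
        return "possible soft 404"
    return "other"
```

Requesting a deliberately made-up URL on your site and passing the result through this function is a quick way to reproduce the test shown in the screenshot.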

Discontinued products

These are products that have reached the end of their lifecycles.[30] For example, Canon stopped manufacturing the Canon EOS-1Ds Mark III model in 2012. Sometimes, end-of-lifecycle products are replaced with a newer model, but other times they are discontinued for good.
If a product is replaced with a newer version, you can 301 the URL for the old model to the latest product URL. If possible, alert users with a message that the product they are looking for has been discontinued, and it has been replaced with a newer one. The old product name should not be close to the text “not available” in the source code. Otherwise, the “not available” text may show up in the SERP snippet. You can even place the non-availability message in an iframe blocked with robots.txt, or load it with JavaScript, to avoid that. Do this only to improve the CTR on SERPs, and not to attempt to game search engines.

Because the target market does not stop searching for a product immediately after the manufacturer discontinued it, you should redirect searchers only after you notice a significant decline in the search demand for that product, or when all your stocked items for that SKU are sold. Until that time, display a notice on the old PDP announcing that the product has been discontinued, and link to the newer version.

Some prefer leaving both pages alive indefinitely, with or without a notification message, depending on stock availability.

Figure 415 – This PDP URL is still available and responds with a 200 OK code, although the product has been discontinued. This is an acceptable solution because if you still have the discontinued item in stock, you want to sell it.

Upcoming products

Create pages for high-demand products that have not yet been released, but will be on the market shortly (i.e., a few months down the road). Such pages must be content rich and helpful for users. The usefulness of this tactic is that these pages will be mostly non-commercial, and they will have the ability to gather links organically from trusted sources more easily, compared to commercial pages.

A month before the new product launch, increase the number of internal links to those pages, for example by linking from the home page. Take pre-orders or capture contact info before the launch date. The moment the product becomes available, allow users to add to cart.

If you plan well, you will be positioned ahead of your competitors at the product release date.

Out-of-stock (OOS) products

There are two main use cases for OOS products:

  • The product will never be restocked.
  • The product goes only temporarily out of stock.

If the product is never restocked, you have three options. The first is to 301 redirect to one of the following pages:

  • Another variation of the product (i.e., the same product but in a different color).
  • A replacement product (i.e., an updated version of the product).
  • A parent category or subcategory (this one is not advisable, so I recommend not doing it).

The second option is to let the PDP return a 404. Do this if you cannot implement the first option.

The third option is to leave the PDP alive and return a 200 OK response code. In this case, it is very important to display a clearly visible notice communicating the reason for unavailability. It is also important to guide users to a replacement, or a similar product. Optionally, the “add to cart” button can be changed to “out of stock” and deactivated, so users cannot add the item to cart. Another option is to allow users to “save for later” or “backorder”.

To minimize the effects on the conversion rate for permanent out of stock PDPs, offer related items in a very accessible spot on the page.

If the product goes temporarily out of stock, the page should return a 200 OK response, and let customers know that the product is currently out of stock. It should also provide users an estimated availability date if that is possible.
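
The options above can be condensed into a small decision function. This is only a sketch of the logic as described in this section; the function name, the parameters, and the order in which the options are checked are my assumptions.

```python
def oos_response(will_restock, replacement_url=None, keep_alive=False):
    """Return (status_code, redirect_target) for an out-of-stock PDP.

    will_restock:    the product is only temporarily out of stock.
    replacement_url: another variation or an updated version, if one exists.
    keep_alive:      keep a permanently OOS page live with a clear notice.
    """
    if will_restock or keep_alive:
        return 200, None  # serve the page with an out-of-stock notice
    if replacement_url:
        return 301, replacement_url
    return 404, None
```

For example, oos_response(False, "https://www.example.com/shoe-v2") returns (301, "https://www.example.com/shoe-v2"), where the URL is hypothetical.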

Optionally, offer an incentive (e.g., a 10% discount) to compensate for the inconvenience and to collect the user’s email address, so you can announce when the product is back in stock. Additionally, make sure users can back-order the product.

Figure 416 – The “Temporarily Out of Stock” messaging is easy to spot, and it is clear. However, it would be better to have it separated from the product name.

If all the products under a subcategory are out of stock and the PDPs received qualified traffic in the past, 301 redirect to the parent category. The subcategory page will redirect to the parent category since it does not have any stocked products. This may not be the best approach from a user experience perspective, but you may want to preserve any backlinks pointing to PDPs and subcategory pages. If the subcategory generates minimal revenue, traffic, and backlinks, let it return 404.

Keep in mind that shoppers may become frustrated if too much of your inventory is out of stock. In this case, add noindex to the affected pages, and remove them from navigation until your inventory improves. This helps to address content quality and Panda penalties.

Here are some additional recommendations for handling out of stock products:

  • Google treats expired products as “soft 404” errors. This means that OOS pages are considered low-quality content, and in many cases, such pages should be noindexed.
  • Google’s official recommendation is to remove OOS pages from its index by returning a hard 404 Page Not Found header response. However, this approach does not work well for UX and conversion.
  • Out of Stock SKUs should not be presented anywhere in the site navigation. However, they can appear on internal site search result pages when someone queries the SKU number or the exact SKU name.
  • Out of Stock URLs should be accessible for type-in traffic or email, to assist those who have questions about a product they purchased in the past that is now OOS.
  • OOS products should be accessible to the sales team, on your intranet.
  • Neither 404s on discontinued or long-term out of stock product URLs (which Google often recommends) nor 301 or 302 redirects provide the optimal user experience.
  • Google says do not 301 or 302 to a parent category or home page, and this is the correct approach.

Despite all the options above, I believe there is a better approach from a UX and SEO perspective: instead of a “useful” 404 page, show the OOS page only to users landing from outside your website (i.e., organic or referral).

On this page, display a modal window with a very short and clear message about the product status. Offer one link to the OOS product, and another link to the related or the replacement product. Be careful about the size of the modal window for mobile. It should cover a maximum of 20% of the screen size, and it should be placed at the bottom of the screen.

The messaging on the modal window can be similar to: “Sorry, this product is out of stock. Visit the out of stock product or navigate to the replacement product”. The modal window displays a 10-second countdown timer. Ten seconds should be enough for most people to read the message. At zero, the user is redirected to the most appropriate page.

The redirect is done with either JavaScript, which seems to be passing SEO signals along, or with the meta-refresh tag. If you want to pass authority from the old to the new page, the JavaScript timer and the meta refresh must be under 5 seconds.
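
For the meta-refresh variant, the tag can be generated from the replacement URL and the delay. This sketch uses a 4-second default only because it falls under the 5-second figure mentioned above; the function name and the example URL are hypothetical.

```python
from html import escape

def meta_refresh(target_url, delay_seconds=4):
    """Render a meta refresh tag that redirects after delay_seconds."""
    if delay_seconds < 0:
        raise ValueError("delay_seconds must not be negative")
    target = escape(target_url, quote=True)
    return f'<meta http-equiv="refresh" content="{delay_seconds};url={target}">'
```

Calling meta_refresh("https://www.example.com/new-product") returns a tag that redirects after four seconds. The tag belongs in the head of the OOS page.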

Seasonal products

If the product is seasonal, handle it in a similar way to out-of-stock items. If the product will return in stock the next season, then leave the page in place, notify the users, and remove the ability to place orders. If it does not return, then 301 redirect to another variation of the product (i.e., same product but in a different color), or a replacement product.

Seasonal products, just like event and holiday URLs, require some attention in regards to URL naming and maintenance. For example, if you use years in the URL, when you update the URL the next year, it is like starting over again. Of course, you could do a 301 redirect from the previous year’s URL to the current one, but it is better to avoid using URLs that designate years or other time indicators. Instead, use a generic URL that can accommodate new dates or models in existing URLs.

For example, Ford uses ford.com/cars/focus/ for their newest Focus model, 2018. Toyota uses toyota.com/corolla/ for all Corolla models, no matter the release year, as you can see in this screenshot:

Figure 417 – This URL naming convention consolidates links to a single page, year after year.

The same recommendation applies to special-event URLs that occur regularly. Instead of mysite.com/valentines-day-2018, use mysite.com/valentines-day/. This page can be promoted harder when the time comes, but it should not be allowed to return a 404 status code after the event has ended.
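
If your URLs are generated from page names, a small helper can keep year tokens out of the slug. The sketch below removes standalone four-digit years (19xx or 20xx) and nothing else; the function name is hypothetical.

```python
import re

def evergreen_slug(slug):
    """Remove standalone four-digit year tokens from a URL slug."""
    parts = [p for p in slug.strip("/").split("-")
             if not re.fullmatch(r"(19|20)\d{2}", p)]
    return "-".join(parts)
```

For example, evergreen_slug("valentines-day-2018") returns "valentines-day", while a slug such as "iphone-12-case" is left untouched because "12" is not a year token.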

Title tags

While <title> is technically not a tag but rather an HTML element, it is often referred to as a tag in SEO contexts.

An internal analysis[31] that Google performed on its own product pages found that over 90% of those pages could improve their SEO simply by optimizing the title tag.

Since Google emphasizes titles in blue text, they are the first element searchers scan on SERPs. Titles play a big role in determining whether searchers will click on a particular listing. They are also one of the most important on-page SEO factors, and when others link to your pages organically, they tend to use the page titles as anchor text.

It is important to mention that just as with Google Ads, the title of a SERP snippet has the biggest influence on CTR. Moreover, because SERP CTR and dwell time are now parts of RankBrain, it is very important to aim for better click-thru rates on your organic results. Higher than average CTRs and longer dwell times are quality signals used by search engines.

The SERP title myth
Many otherwise knowledgeable webmasters (and even a few SEOs) believe that the content of the title tag is the only source Google uses to generate and display the SERP titles.

Figure 418 – SERP titles are emphasized in blue.

Yes, most of the time, the content of the title tag is displayed in SERPs, as you can see in the screencap above. However, the SERP title is not based solely on what is wrapped within the HTML title tag. Google’s goal is to be relevant, so it is expected that they will not blindly use just the content of the title tag to generate the most relevant snippets for users.

For example, let’s say you forgot to add the product name in the title tag, and a user searches for that product. Google might classify the page as highly relevant for that search query, due to great content and backlinks. However, since the title lacks the product name, displaying it unchanged in Google SERPs would be a poor experience. In such cases, Google will use other sections on the page to extract and display a title that is more useful to the searcher.

A very common question is: “Why is Google changing/rewriting/not indexing my title tag properly?” As mentioned, Google’s goal is to provide the most relevant titles for searchers. Google will use various data sources and signals to accomplish this. They will also analyze the page content and look for external relevance signals from other sources (e.g., from the now-defunct DMOZ and Yahoo! directory, or from the anchor text in backlinks), to match a user query with relevant content extracted from a page.

Here are some scenarios that may trigger search engines to alter the SERP titles:

  • A malformed title tag.
  • Titles that are too short or too long.
  • A page blocked by robots.txt, but with many backlinks related to the search query.

Getting a different title in the SERPs than the one in the HTML code does not mean that Google indexed your pages or titles incorrectly; it just means that the search query determines whether your HTML title tag is displayed.

Since we are discussing titles and CTRs, I would like to touch on concepts such as SERP CTR, SERP bounce rate, dwell time and pogo-sticking.

SERP CTR is the click-through rate on organic search results.

SERP bounce rate[32] measures bounces that happen when searchers click on a SERP result and then go back to the initial SERP without interacting with the content on the page they clicked on. A bounce is not necessarily a bad thing, depending on how much time the searchers spent on the website.

Dwell time is the amount of time that a searcher spends on a page before returning to the SERPs.

Pogo-sticking is defined as going back and forth between a SERP and the web pages listed in the results.

All the above are “crowd-sourced” metrics used by search engines to self-evaluate the quality of their results. For example, if a spam page managed to rank first for a competitive keyword, but it does not get enough clicks because users easily identify its SERP snippet as spam as soon as they see it in the listing, that page may be deemed irrelevant for the keyword. Similarly, if a page ranked number one for a particular keyword gets many clicks, but almost everyone bounces back within a second, that signal gets picked up by search engines (most likely by RankBrain, in Google’s case). Search engines may reduce the rankings of that page because it is not useful for users and because of very low dwell time.

In a crowd-sourced test I ran years ago, the rankings of the target URL went up from #16 to #12 for a long-tail keyword, after test participants clicked the URL in SERPs, visited a couple of pages, and spent some time on the test website. However, since this was not a large-scale experiment, it is possible that the fluctuation was just due to personalization or natural SERP variations.

However, if you think about it, it makes sense for search engines to test and analyze how users react to different results, and to adjust results and algorithms based on SERP CTR and dwell time. Although it has not been officially confirmed, Matt Cutts suggests in a video that Google takes clicks into account when they test new algorithms on live results[33].

Remember, there is a metric that Google uses internally to measure the quality of their results: the long click.

“This occurred when someone went to a search result, ideally the top one, and did not return. This means Google has successfully fulfilled the query”.

The ideal scenario is to “finish the search” on your website. That is the ultimate quality signal you can send to search engines.

Consider the following suggestions for improving the effectiveness of your titles:

Title tag and H1s matching
One way to reinforce the relevance of a product page is to partially match the title tag with the H1. When doing this, both elements should contain the product name. This partial match is a good idea because H1 and <title> should be conceptually related, but not the same.

Optimize your <title> tags for better SERP CTR, and the H1 for conversion and reassurance.

On PDPs, the product name is usually wrapped in an H1, and it can be a pattern along the lines of the following examples:

  • {Product_Name}
  • {Brand} {Product_Name}
  • {Brand} {Product_Name} {Variation or Attribute}

You can use the H1 product-naming convention in the title tag as well, but you need to change it a bit, for instance by adding modifiers such as “Buy”, “Online”, “Free Shipping”, or {Business_Name}. Your title would look similar to:

  • {Product_Name} – {Business_Name}
  • {Brand} {Product_Name} – {Business_Name}
  • {Brand} {Product_Name} {Variation} – {Business_Name}
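
To make these patterns concrete, here is a minimal sketch of how a template could generate both the H1 and the title from the same product data. The function names and the sample values are hypothetical; the separator is the dash used in the patterns above.

```python
def build_h1(product_name: str, brand: str = "", variation: str = "") -> str:
    """Join the non-empty parts as {Brand} {Product_Name} {Variation}."""
    return " ".join(part for part in (brand, product_name, variation) if part)


def build_title(product_name: str, business_name: str,
                brand: str = "", variation: str = "") -> str:
    """Reuse the H1 pattern and append the business name after a dash."""
    return f"{build_h1(product_name, brand, variation)} – {business_name}"


# Hypothetical product and store names, for illustration only.
print(build_h1("EOS 60D", brand="Canon"))
print(build_title("EOS 60D", "Example Store", brand="Canon", variation="Body Only"))
```

Because the title reuses the H1 and only adds to it, the two elements stay related without being identical.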

Figure 419 – In this screenshot, you can see that the title tag is different from the H1. Since the product name in H1 is very short, the title tag can easily be complemented with other useful product attributes.

Keep in mind that the keywords in the title tag should accurately reflect the page content and should also be present in the main content area.

A side note about the title tag for category pages: when a category page lists subcategories, either in the faceted navigation or the main content area, the <title> tag can include some of the most important subcategories; this is especially important when category names are very short.

Figure 420 – This is the SERP for “women dresses”. The title tag for Nordstrom includes “Cocktail Dresses” and “Maxi Dresses”. This approach works best for top-level categories with short names.

Keyword significance consolidation
This tactic works only for ecommerce websites that focus on a single product line (e.g., bar stools) or a specific niche (e.g., you only sell furniture). The tactic will help increase the significance[34] of your main keyword, which in turn will create more relevance around your website for that product line or niche. Here are the steps you need to take:

  1. On the home page, place the main keyword at the very beginning of the title tag.
  2. Use the main keyword towards the end of the title tag, on every page of the website, even if the title will become longer than 65 characters or 500 pixels.
  3. Mention the keyword in the main content area on each page of the website. If your pages are content rich, repeat the keyword every 250 words.
  4. Consolidate the contextual anchor text from internal pages to point to the homepage. For example, if “speedboat parts” is your most important keyword, then each page on your website should contain the keyword “speedboat parts” in the main content area, and the first instance of “speedboat parts” should link to the homepage.

Figure 421 – This is a screenshot from Google Search Console, taken before Google removed the Keywords report.

The report above displays the keyword significance for a website that sells only Shoprider scooters. Notice how “shoprider” and “scooters” are the most significant keywords on this website. The website ranks in the top five for “Shoprider scooters” in Canada, close to Shoprider’s official website.

In the past, you could download this list to find keyword variations as well. That was a pretty useful report to understand how Google groups keywords, but unfortunately, it has been discontinued.

Figure 422 – The report showed the keyword variations as well.

Just a quick note here: keyword significance is not the same as keyword density. Keyword significance is measured at the domain level, while keyword density is measured at the document/page level.

Geo-targeting
Usually, ecommerce websites ship nationwide or even internationally. However, there are cases when you cannot ship outside a geographical region, due to regulatory restrictions. For example, if you sell wine in Canada, you are not allowed to ship inter-provincially.

If you sell only to a specific region, province, or city, you can mention that in the title tag to increase the chances of showing up for a geo-personalized search query.

Figure 423 – The URL ranked second contains the city name in the title tag, while the URL at the bottom of page one does not.

If you are a retailer with multiple locations, build separate landing pages for each store location. The store address should be placed in the title tag, and the landing pages should reinforce the store locations, with mention of surrounding landmarks or with geo-tagged images. At the minimum, the address in the title should include the city and state or province.

Holiday-specific titles
Searchers’ behavior and the queries they use change around major holidays and events such as Boxing Day, Mother’s Day or Halloween. It is useful for searchers and SEO to update the title tag to accommodate these changes. For instance, around Valentine’s Day, the title “Valentine Gifts for Her. All Items on Sale & FREE Shipping” is more enticing and relevant to users than “Gifts for Her. All Items on Sale & FREE Shipping”.
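
If your platform lets you generate titles programmatically, a seasonal switch can be as simple as the sketch below. The promotion window (February 1 to 14) and the function name are assumptions made for this example, not a recommended schedule.

```python
from datetime import date


def gifts_title(today: date) -> str:
    """Return the holiday title inside the promotion window, the regular one otherwise."""
    # Hypothetical window: the first two weeks of February.
    start = date(today.year, 2, 1)
    end = date(today.year, 2, 14)
    prefix = "Valentine " if start <= today <= end else ""
    return f"{prefix}Gifts for Her. All Items on Sale & FREE Shipping"


print(gifts_title(date(2024, 2, 10)))
print(gifts_title(date(2024, 6, 1)))
```

Outside the window the title falls back to the regular version, so the page is never left with a stale seasonal title.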

Figure 424 – Tiffany updated the page title to match user intent, before Valentine’s Day.

Character count and pixel length
Google does not index just 65 characters from title tags. It only displays about 65 characters in the SERP title (or the corresponding length in pixels). In fact, Google indexes as many as 1,000 characters.

Knowing this opens the door to experiments such as thinking of your titles in blocks rather than a single 65-character unit. For example, it may be worth testing titles made of two blocks:

  • The first block of about 65 characters is where you craft the perfect title. This block will include category and subcategory names, product names, branding, calls to action, etc. Think of this as the title you would write if you were to follow the 65-character limit. Ideally, this will be the title seen by searchers on SERPs.
  • The second block will contain second-tier keywords such as product attributes, model numbers, stock availability, plurals, synonyms, and so on. You could eventually repeat the most important keyword for your website at the very end of the title, on all pages except the homepage.
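
The two-block idea can be sketched as follows. This is only an illustration: the 65-character limit comes from the text above, while the function name, the pipe separator between blocks, and the sample keywords are my own assumptions.

```python
DISPLAY_LIMIT = 65  # approximate number of characters displayed in a SERP title


def two_block_title(primary: str, secondary_keywords: list[str]) -> str:
    """Return the primary block followed by the second-tier keywords.

    Raises ValueError if the primary block would not fit the display limit.
    """
    if len(primary) > DISPLAY_LIMIT:
        raise ValueError(
            f"primary block is {len(primary)} characters; limit is {DISPLAY_LIMIT}"
        )
    if not secondary_keywords:
        return primary
    return f"{primary} | {', '.join(secondary_keywords)}"


# Hypothetical second-tier keywords, for illustration only.
print(two_block_title("Canon Digital SLR Cameras", ["EOS 60D", "in stock"]))
```

Validating only the first block keeps the visible part of the title under control while leaving the second block free to grow.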

If the title is a complete sentence, and you want the entire sentence to show up in the SERP, it is best to keep that sentence under 65 characters.

Branded titles
When I refer to branded titles, I refer to using your brand name, not the other brands that you might be selling. The decision of whether to add your brand name to the title tag depends on various factors such as:

  • The goal of your organic search campaigns, for example branding versus rankings.
  • How strong your offline brand is.
  • The authority of your website (i.e., external links pointing to your site).

I usually recommend not placing the brand at the beginning of the title tag. However, the final placement should consider the following:

  • If you have a well-established brand and a more than decent website authority, you can place the brand at the beginning of the title tag.
  • If you are trying to build a brand, then place the brand at the beginning of the title.
  • If your brand has only some recognition and your goal is to drive unbranded traffic, you should add the brand name at the end of the title tag.
  • If your brand is not known, or if you do not care about branding, do not include your brand name in the title at all.

Figure 425 – Big names like Amazon include their brand name right at the beginning of the tag. This tactic works for recognizable brands as they rely less on page titles for SEO reasons.

Sometimes Google will change the title tag to append the brand name either at the end or beginning, whenever it believes it makes sense for users. In the example below, you can see how a search for “engagement rings” returns Costco’s website, with the brand name at the end of the title.

Figure 426 – The SERP title includes the brand name, Costco.

However, if you look at the screencap from their HTML code, you will notice that the title tag does not include the brand name.

Figure 427 – The HTML title of that same page does not include the brand name. Because the SERP title would be too short (just the category name), Google decided to add the brand name automatically, to make the result more appealing to searchers.

Keyword prominence
The term prominence refers to the closeness of the keywords to the beginning of the title tag. On category pages, start the title with the category name; on product detail pages, start with the product name.

But why is prominence important? Firstly, search engines assign more weight to words at the beginning of the title. Secondly, western readers skim text from left to right, and it is important to reassure them that the page is relevant by placing the category or product name at the beginning. An exception to this is if you have an established brand, or if you are trying to build one; in these cases, start the title with your brand.
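
If you want to audit prominence across many titles, a simple check is to find the word position at which the keyword phrase starts. The sketch below is a simplification with a hypothetical function name: it splits on whitespace only, so punctuation attached to a word (for example "Stools:") will prevent a match.

```python
def keyword_position(title: str, keyword: str) -> int:
    """Return the 0-based word index where the keyword phrase starts, or -1 if absent."""
    title_words = title.lower().split()
    keyword_words = keyword.lower().split()
    if not keyword_words:
        return -1
    last_start = len(title_words) - len(keyword_words)
    for start in range(last_start + 1):
        if title_words[start:start + len(keyword_words)] == keyword_words:
            return start
    return -1


# Hypothetical titles, for illustration only.
print(keyword_position("Bar Stools for Kitchens | Example Store", "bar stools"))  # 0
print(keyword_position("Example Store | Bar Stools", "bar stools"))  # 3
```

A result of 0 means the keyword opens the title; larger values mean lower prominence.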

Keyword proximity
Proximity refers to how close words are to each other. If your targeted keyword is “women’s dresses”, you should not place other words between “women’s” and “dresses”. For example, the title “Women’s Casual & Formal Dresses” is not ideal; instead, it should be “Women’s Dresses: Casual, Formal, Going Out and more dress styles at {BrandName}”.
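
One rough way to measure proximity is to count the words that sit between the two parts of the keyword. The function below is a sketch with a hypothetical name; it ignores punctuation and symbols such as “&”, and it only looks at the first occurrence of each word.

```python
import re


def words_between(title: str, first: str, second: str) -> int:
    """Count the words between `first` and the next `second`; -1 if not found in that order."""
    # Keep straight and curly apostrophes inside words so "women’s" stays one token.
    tokens = re.findall(r"[\w'’]+", title.lower())
    try:
        i = tokens.index(first.lower())
        j = tokens.index(second.lower(), i + 1)
    except ValueError:
        return -1
    return j - i - 1


print(words_between("Women’s Casual & Formal Dresses", "women’s", "dresses"))  # 2
print(words_between("Women’s Dresses: Casual, Formal, Going Out", "women’s", "dresses"))  # 0
```

A result of 0 means the keyword is intact; anything higher means other words split it apart.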

Just a quick note about category names: when deciding on category names, do some basic search volume research. The screenshot below is from Google Keyword Planner, and it shows that “women’s dresses” has significantly more search volume than “womens dresses”.

Figure 428 – The search volume for “women’s dresses” is almost double compared to “womens dresses”.

The importance of keyword proximity seems to have decreased after the Hummingbird update, as Google is not focusing on exact match keywords as much as it used to. However, it is still advisable not to break apart important words such as category or subcategory names.

User intent modifiers
User intent keyword modifiers are words that can be placed before or after the targeted keywords, to attract searchers at a specific buying stage. Based on user intent, search queries can be categorized into three main categories: informational, transactional and navigational.

We discussed user intent in detail in the Keyword Research section. There, we saw that while the vast majority of search queries are not transactional, informational and navigational queries are still valuable because they can assist conversions. Hence, ecommerce websites should make efforts to capture consumers with relevant content at each buying stage.

One way of clarifying the purpose of a page to users is to include user intent keyword modifiers in the title tag:

  • You can include transactional modifiers such as “buy”, “sale”, “discount”, “cheap”, on category and product detail pages.
  • Navigational modifiers (e.g., Sears Store Vancouver, BC) can be included on store location pages. You can add the brand name on “About Us” and “Contact Us” pages.
  • Educational modifiers such as “learn”, “discover”, “read”, “find”, or “guide” can be included on shopping guide pages.

Keywords order
In some cases, you will find that words have different search volumes, and even different meanings, if arranged in a different order. For example, “dog toys” has a different meaning than “toys dog”, and it also has a different search volume. When the order of words creates different meanings, you will have to create separate landing pages.

Singular versus plural
It is known that using the same keyword more than twice in the title tag may raise spam flags. However, is the plural form of the keyword considered repetition?

When search engines analyze the content of a document, they use a process called stemming[35]. That means that they strip words to their root form (e.g., “dresses”, “dressed”, “dressing” are all variations of the root word “dress”). If you view the matter from this angle, the plural variation can be considered a repetition. Although Google will treat singular and plural words as different keywords, I would not recommend using singulars and plurals in the same title.

That is because there is more than just stemming when it comes to plural or singular: there is user intent. Generally speaking, search queries containing plurals suggest that users are looking for a list of items rather than just one particular item. Moreover, in some instances, the same word can have different meanings in singular versus plural. For example, “car cover” may refer to insurance cover, whereas “car covers” refers to weather-proofing.

I recommend using the plural on listing pages or shopping guides, and the singular on PDPs. For example, the title on a category page can read “Canon Digital SLR Cameras”.

For a product detail page under this category, the title will read “Canon EOS 60D 18 MP CMOS Digital SLR Camera with 3.0-Inch LCD (Body Only)”.

Stop words
In computing, words such as “and”, “or”, “the”, “in”, are called stop words.[36] Since these were usually deemed non-essential for relevance scoring until the Hummingbird update, search engines used to filter them out when analyzing and classifying documents. Use natural language to create your titles, and if that means including stop words, do not sweat it; you are good.

You should pay attention if your CMS automatically removes stop words from titles and URLs because some stop words are important for users and can completely change meanings. For instance, if you sell music online, and the CMS automatically removes the word “that” from the band name “Take That”, you will end up with a very suboptimal page title, e.g., “Best Take Albums” instead of “Best Take That Albums”.
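
To illustrate the problem, here is a minimal sketch of a stop-word filter with an allow-list of protected phrases. The stop-word list, the allow-list, and the function name are all hypothetical; a real CMS will have its own settings. As a simplification, when a protected phrase appears in the title, its words are kept wherever they occur.

```python
STOP_WORDS = {"and", "or", "the", "in", "that"}  # illustrative list only
PROTECTED_PHRASES = ("Take That",)  # hypothetical allow-list of names to keep intact


def strip_stop_words(title: str, protected: tuple[str, ...] = PROTECTED_PHRASES) -> str:
    """Remove stop words from a title, except words that belong to a protected phrase."""
    lowered = title.lower()
    protected_words: set[str] = set()
    for phrase in protected:
        if phrase.lower() in lowered:
            protected_words.update(phrase.lower().split())
    kept = [
        word for word in title.split()
        if word.lower() not in STOP_WORDS or word.lower() in protected_words
    ]
    return " ".join(kept)


print(strip_stop_words("Best Take That Albums", protected=()))  # Best Take Albums
print(strip_stop_words("Best Take That Albums"))  # Best Take That Albums
```

The first call reproduces the suboptimal title from the example above; the second shows the allow-list preserving the band name.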

Word separators
The word separator most used by SEOs is the pipe sign “|”, but symbols such as hyphens and even commas are good choices too. Google suggests not to use underscores,[37] and I would also recommend staying away from the following special characters: ‘ “ < >{}[] ( ).

Some websites use catchy titles with a lot of non-alphabetic symbols to grab searchers’ attention (e.g., ~~~!FREE iPods!~~~), and possibly higher CTRs. Keep in mind that the use of special symbols may get you a better CTR but might also result in spam flags.

Character savers
If you need to squeeze in more text, you can replace certain words with their corresponding symbols. For example, replace the word “and” with the “&” symbol, the word “with” with the “/” symbol, or the word “copyright” with the “©” symbol. Remember to implement special characters using HTML entities (“&” as &amp; and “©” as &copy;).[38]
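
If titles are generated by code, the escaping can be automated. The sketch below uses Python’s standard html.escape for “&”, “<” and “>”, and then converts “©” to its named entity; the function name and sample text are hypothetical.

```python
from html import escape


def title_element(text: str) -> str:
    """Wrap text in a <title> element, escaping &, < and > and encoding the © symbol."""
    # Escape first, so the "&" in "&copy;" is not escaped a second time.
    safe = escape(text, quote=False).replace("©", "&copy;")
    return f"<title>{safe}</title>"


print(title_element("Cameras & Lenses © Example Store"))
# <title>Cameras &amp; Lenses &copy; Example Store</title>
```

The order of the two steps matters: replacing “©” before escaping would turn “&copy;” into “&amp;copy;”.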

Other great space-saving options are abbreviations (e.g., instead of “extra-large” you could use XL), and shorter synonyms (e.g., T-shirts instead of tee-shirts). The decision on which version of the keyword to use in the title has to be based on the search volume for those keywords, and also on the content targeted on the page.

Calls to action (CTAs)
A page that ranks second, but has a great compelling CTA in the title, could theoretically grab more clicks than the page ranked first if that first page has a poor title. Remember that one of the most important elements tested in advertising and conversion rate optimization is the headline. On SERPs, your headline is the title.

CTAs include action verbs, unique selling points or promotional words. Sometimes, promotions can also affect CTR. An example of a promotional title is: “All Digital Cameras 60% OFF”.

Competitive differentiators and free shipping
If you know that your target market is sensitive to a particular feature or benefit that is part of your unique selling proposition (e.g., you offer a “lowest price guarantee”), use that to attract more clicks on your listing, and to differentiate from competitors. You can do the same if you have a competitive edge (e.g., you are the exclusive retailer of a product/line of products).

Figure 429 – “Free shipping” is extremely exciting for shoppers, and Zappos features that prominently in their page titles. This tactic works for conversion and better SERP CTRs.

Test your titles
SEO testing is theoretically possible,[39] but it is very hard to reach statistically valid conclusions because search engine rankings depend on many uncontrolled variables. However, title tag variations are one of the easiest SEO tests you can run. Here are some ideas for your tests:

  • Place your brand at the beginning or end of the title.
  • Add one or more important product attributes to the product name.
  • Add the most important subcategory names before or after the parent category name.
  • Test various unique selling points at the beginning/end of the title.
  • Test various title patterns.
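
One practical detail when running such tests is deciding which pages get the new title pattern. A possible approach, sketched below under my own assumptions (the function name, group labels and hashing choice are not from any specific tool), is to assign each URL to a group deterministically, so that a page stays in the same group for the whole test.

```python
import hashlib


def assign_variant(url: str, variants: tuple[str, ...] = ("control", "variant")) -> str:
    """Deterministically assign a URL to a test group using a stable hash."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


# Hypothetical URL, for illustration only.
group = assign_variant("https://www.example.com/bar-stools/")
print(group)
```

Because the assignment depends only on the URL, re-running the script never shuffles pages between groups, which keeps the comparison between groups consistent over time.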

References:

  1. It is All About the Images [Infographic], http://www.mdgadvertising.com/blog/its-all-about-the-images-infographic/
  2. Ranking Images For Web Image Retrieval, http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=1&p=1&f=G&l=50&d=PG01&S1=20080097981.PGNR.&OS=dn/20080097981&RS=DN/20080097981
  3. Luis von Ahn: Massive-scale online collaboration, https://www.ted.com/talks/luis_von_ahn_massive_scale_online_collaboration
  4. WEB1000 – The ‘alt’ attribute of the <img> or <area> tag begins with words or characters that provide no SEO value, http://msdn.microsoft.com/en-us/library/ff723935(v=expression.40).aspx
  5. Image guidelines for SEO, http://msdn.microsoft.com/en-us/library/ff724026(v=expression.40).aspx
  6. Is it better to have keywords in the URL path or filename? https://www.youtube.com/watch?v=971qGsTPs8M
  7. Image Sitemaps, https://support.google.com/webmasters/answer/178636?hl=en
  8. Does Google use EXIF data from pictures as a ranking factor? https://www.youtube.com/watch?v=GMf6FmRus2M
  9. Video SEO White Paper, http://www.aimclearblog.com/2011/04/04/download-aimclear%C2%AE-video-seo-white-paper/
  10. Pop-ups, video buttons and color swatches can turn site search results into selling tools., http://www.internetretailer.com/2010/03/31/inside-search
  11. Six retailers that used product videos to improve conversion rates, https://econsultancy.com/blog/61817-six-retailers-that-used-product-videos-to-improve-conversion-rates#i.1k2dagwxune85p
  12. Creating a Video Sitemap, https://support.google.com/webmasters/answer/80472?hl=en
  13. Using Instructographics For Online Marketing, http://www.pitstopmedia.com/sem/using-instructographics-for-online-marketing
  14. schema.org markup for videos, https://support.google.com/webmasters/answer/2413309?hl=en&ref_topic=1088474
  15. Will I be penalized for hidden content if I have text in a “read more” dropdown? https://www.youtube.com/watch?v=UpK1VGJN4XY
  16. Webmaster Central 2013-09-27, https://www.youtube.com/watch?v=R5Jc2twXZlw&feature=share&t=20m49s [min 20:49]
  17. Will having the same ingredients list for a product as another site cause a duplicate content issue?, https://www.youtube.com/watch?v=LgbOibxkEQw
  18. SEO tips for e-commerce sites, http://maileohye.com/seo-tips-for-e-commerce-sites/
  19. Whiteboard Friday – The Biggest SEO Mistakes SEOmoz Has Ever Made, http://moz.com/blog/whiteboard-friday-the-biggest-seo-mistakes-seomoz-has-ever-made
  20. How many H1 tags should be on each HTML page? https://www.youtube.com/watch?v=Hgy3Oc9zfOw&feature=youtu.be&t=42s [min 00:42]
  21. Google’s SEO report card, http://static.googleusercontent.com/external_content/untrusted_dlcp/www.google.com/en/us/webmasters/docs/google-seo-report-card.pdf
  22. Thing > Product, http://schema.org/Product
  23. Non-visible text, https://support.google.com/webmasters/answer/146750?hl=en#product_page
  24. PowerReviews Spreads Consumer Reviews Between E-Commerce Sites, http://techcrunch.com/2011/07/26/powerreviews/
  25. Ecommerce UX: 3 Design Trends to Follow and 3 to Avoid, http://www.nngroup.com/articles/e-commerce-usability/
  26. The Magic Behind Amazon’s 2.7 Billion Dollar Question, http://www.uie.com/articles/magicbehindamazon/
  27. Rich snippets – Reviews, https://support.google.com/webmasters/answer/146645?hl=en
  28. Can I specify the canonical of all of a product’s review pages as a single URL?, https://www.youtube.com/watch?v=AXnbBsRbKDA
  29. Farewell to soft 404s, http://googlewebmastercentral.blogspot.ca/2008/08/farewell-to-soft-404s.html
  30. End-of-life (product), http://en.wikipedia.org/wiki/End-of-life_(product)
  31. Google’s SEO report card, http://static.googleusercontent.com/media/www.google.com/en//webmasters/docs/google-seo-report-card.pdf
  32. Opinion: Is SERP Bounce a Ranking Signal or a Quality Factor for SEO? http://www.searchenginejournal.com/opinion-is-serp-bounce-a-ranking-signal-or-a-quality-factor-for-seo/35464/
  33. What’s it like to fight webspam at Google? https://www.youtube.com/watch?v=rr-Cye_mFiQ&feature=youtu.be&t=2m50s&noredirect=1 [min 02:50]
  34. Content Keywords, https://support.google.com/webmasters/answer/35255?hl=en
  35. Stemming, http://en.wikipedia.org/wiki/Stemming
  36. Stop words, http://en.wikipedia.org/wiki/Stop_words
  37. Is comma a separator in a title tag?, https://www.youtube.com/watch?v=jHSqLYUPq8w
  38. HTML Character Sets, http://www.w3schools.com/charsets/default.asp
  39. SEO Tip: Titles matter, probably more than you think, http://www.thumbtack.com/engineering/seo-tip-titles-matter-probably-more-than-you-think/