Ecommerce SEO



Length: 1,847 words

Estimated reading time: 10 minutes

This e-commerce SEO guide contains almost 400 pages of advanced, actionable insights into on-page SEO for ecommerce. This is the first of its eight sections.

Written by an e-commerce SEO consultant with over 25 years of research and practical experience, this comprehensive SEO resource will teach you how to identify and address all SEO issues specific to e-commerce websites in one place.

The strategies and tactics described in this guide have been successfully implemented for top-10 online retailers, small and medium businesses, and mom-and-pop stores.

Please share and link to this guide if you like it.

About this ecommerce SEO resource

Welcome to the most comprehensive e-commerce SEO resource on the Internet. If you can find one that is more extensive than this one, I will remove this statement. :) My name is Traian, and I am the author of this e-commerce SEO compendium.

If you are involved with e-commerce in one way or another, or if your work touches even a bit on SEO, you are about to dive into a resource full of actionable SEO tactics that will help you optimize your website correctly.

This guide is a web version of my book, "Ecommerce SEO," now retired from the shelves. Those of you who have already read the book might remember I mentioned that it would evolve into an easier-to-update medium; this online guide is that medium. I would also like to thank everyone who purchased the book and left five-star reviews.

Many thanks to several industry leaders who supported me with this initiative. I am humbled to have received reviews from them:

I would also like to thank everyone else in the SEO industry who supported me but is not listed here. Thanks a lot to all of you who left raving reviews on social media. It is much appreciated.

Feel free to contact me on the website or via LinkedIn or Twitter.

This e-commerce SEO guide evolved from the desire to offer those involved in e-commerce access to SEO advice in a single place. The Internet contains much information on this subject, and the online SEO community is amazing. However, the SEO resources that ecommerce professionals need are widely scattered.

So, I put everything I researched, learned, and practiced about SEO into a single resource.

We will start with the foundation of an ecommerce site, which is the website’s architecture. Then, we will continue with keyword research to determine parts of the content strategy. Next, we will learn how to guide crawlers and avoid search engine bots’ traps, and then we will explore using internal linking to improve relevance and create strong topical themes.

We will continue by deconstructing the most important pages for ecommerce websites—the home page, listing pages, and product detail pages—each in separate sections.

To get the most out of this resource, it is better to go through the chapters in the order they are listed without skipping them. I will often reference concepts and tactics described in previous chapters.

Who should read this resource?

This is for you if you are a small or medium business owner who runs an ecommerce website. You have probably realized by now that running an ecommerce business requires many skills. Depending on your educational background, you are either putting much time and work into learning various disciplines such as programming, design, usability, and copywriting, or you are contracting qualified help.

This source of information will help you realize how complex SEO is and should help you set realistic expectations. More importantly, don’t expect organic traffic to be a silver bullet. Business-wise, it is a good idea to diversify your acquisition channels to email, social media, referrals, and more while working your way up in organic results.

If you are an ecommerce executive, read this guide to understand how almost any decision you make regarding the website will affect its search visibility. The information here will show you what needs to be done to have an SEO-friendly ecommerce website, but it is up to you to prioritize based on your current situation and objectives. It will also help you have more educated conversations with your search engine optimizer(s).

Even if you work in a medium-sized business, you may realize that you do not have all the expertise or resources in-house, so you will have to hire outside talent. This resource should help you understand what to look for when hiring that talent. As an executive, your time is probably at a premium, so if you do not feel like learning SEO stuff, at least let your web dev, marketing, or production department know about this information.

If you are a search engine optimizer, I hope you will find this content helpful not only because it presents most of the SEO issues encountered by e-commerce websites in a single resource but also because it provides advice and options for addressing those issues. Let your manager know about this asset. It will help them understand that e-commerce SEO cannot be addressed overnight and that e-commerce SEO does not have strict recipes for success because SEO is part tech, part marketing, and part art.

The information is also very valuable for web developers involved with e-commerce, since it discusses on-page SEO issues and proposes solutions. However, it does not detail how to write the code that addresses the problems. While working on an issue, the developer should decide which approach is best for the particular technical setup. For example, sometimes a 301 redirect is impossible while a rel="canonical" is possible. While I may recommend one approach over another, you will have to decide whether the recommended method can be implemented.

What types of websites is this information for?

This repository of information is for websites that face complex issues such as faceted navigation, sorting, pagination, or crawl traps, to name just a few. However, remember that a website's complexity is not directly tied to how big a business is in terms of revenue. Start-ups, SMBs, and enterprise websites can all be complex, regardless of revenue. This compendium is just as useful for large websites (e.g., sites with tens of thousands of items) as for small and medium websites (e.g., sites with tens to thousands of items).

Ecommerce extends across many segments, such as travel, where you can sell air tickets, railway tickets, hotel bookings, tour packages, etc. It also extends to retail, financial services, digital goods and services, consumer packaged goods (CPG), and many others. While most examples presented in the book and website are for retail and CPG, the SEO principles discussed here apply to all other ecommerce segments. These principles also apply to non-ecommerce websites with complex structures and navigation.

I will use the terms item and product interchangeably to refer to a physical good, though an item can also be a digital product, such as a game or a song. The term item will mean something different for each business: for an online hotel reservation website, the item is a hotel, presented on the hotel description page; for a paid content publisher, the item may be a journal; for a real estate listings website, the item is a real estate property; and so on.

On-page SEO issues only

We’re going to address on-page SEO issues only. Link development is a big part of the SEO equation and requires its own knowledge asset. However, while links have been the main target of SEOs for a long time, you should optimize your website by putting people and content first.

Level of Expertise

This resource contains intermediate to advanced SEO tactics, but it is also a good starting point for newbies. If you are one of them, the information shared here should set you in the right SEO mindset.

You may find that I give particular advice or opinions about a topic without getting into detail. That may be because that topic is discussed in detail in a referenced work. If you want to know more about those topics or if you are a total newbie to the SEO field, check out those resources.

I am often asked for the single best piece of advice I can give to those who do SEO for e-commerce. Here it is:

Optimize for users without chasing the algorithm. Your ultimate goal is the long click, a.k.a. fulfilling or terminating the query. We will discuss the "long click," an internal metric used by Google, later. In a nutshell, it means a searcher Googles something, finds your website at the top, and, when they land on it, finds whatever they are looking for without needing to go back to the search results.

Before ending this intro, I would like to tell you that you can make it onto the first page of Google, Bing, or any other big search engine. But you need knowledge from various tech and web development areas, and you must keep up with how algorithm changes affect e-commerce websites. Remember that you also need to work hard to achieve first-page results; this is not 2010 anymore, when a few tweaks would rank your pages at the top.

I would love to hear your success stories, so don’t hesitate to contact me.

See you at the top!


Website Architecture

Length: 10,291 words

Estimated reading time: 1 hour, 10 minutes


This chapter will explore the concepts behind building optimized ecommerce website architectures.

A great site architecture means making products and categories findable on your website so users and search engines can reach them as efficiently as possible.

There are two concepts you should be aware of regarding site architecture:

  • Efficient crawling and indexing. This refers to the technical architecture or TA.
  • Classifying, labeling, and organizing content. This refers to information architecture or IA.

Together, information and technical architecture form the site architecture (SA). Understanding these two concepts will help you build websites that are both search-engine- and user-friendly.

It is important to differentiate between information and technical architecture:

  • Information architecture is the process of classifying and organizing content on a website while providing user-friendly access to that content via navigation. This process is done (or should be done) by information architects.
  • Technical architecture is designing a site’s technical and functional aspects. Web developers mostly do this.

Keep in mind that SEO involves both information and technical architecture knowledge.

Information Architecture

The Information Architecture Institute’s definition of IA is:

  • The structural design of shared information environments.
  • The art and science of organizing and labeling websites, intranets, online communities, and software to support usability and findability.
  • An emerging community of practice focused on bringing principles of design and architecture to the digital landscape.

This definition shows that information architecture goes beyond websites and hints at its complexity. It also reveals how flexible and theoretical information architecture is.

From an ecommerce standpoint, let’s oversimplify the definition of information architecture to this single sentence:

The classification and organization of content and online inventory.

You should be familiar with two other important information architecture concepts: taxonomy and ontology. While these names might be intimidating, the concepts are easy to understand.

Taxonomy is the classification of topics into a hierarchical structure. For ecommerce, this translates into assigning items to one or more categories. Ecommerce taxonomies are usually vertical, “tree-like” structures. A website’s taxonomy is often referred to as its hierarchy. To visualize a taxonomy, think of breadcrumbs.

Notice how the breadcrumbs above mimic the website taxonomy. In our examples, one branch of the taxonomy “tree” leads to Duvet & Comforter Covers and the other to Aloe Vera Gels.

The structures depicted in these two screencaps are ordered using a parent-child relationship, from broader to narrower topics, and they are called taxonomies. One way to create ecommerce taxonomies is to use a controlled vocabulary, a restricted list of terms, names, labels, and categories. Usually, it is the information architects who develop these vocabularies.

In terms of SEO, you should use semantic markup to help search engines understand taxonomies. One such markup can be applied to your site breadcrumbs.

Search engines use structured data markup such as Microdata, RDFa, or JSON-LD to generate breadcrumb rich snippets similar to this one:

Search engines can sometimes display the website taxonomy directly in the search engine results pages (aka SERPs).

We will discuss breadcrumbs in detail later in this guide, but briefly, this is what the source code for the previous rich snippet example looks like:

Figure 3 – The highlighted text shows the Breadcrumb vocabulary markup.
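Breadcrumb markup can also be expressed in JSON-LD, which search engines accept alongside Microdata and RDFa. Below is a minimal sketch; the example.com domain and the bedding trail are hypothetical stand-ins for the taxonomy shown earlier:

```python
import json

def breadcrumb_jsonld(crumbs):
    """Build a schema.org BreadcrumbList from (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(crumbs, start=1)  # positions are 1-based
        ],
    }

# Hypothetical trail mirroring the taxonomy Home > Bedding > Duvet & Comforter Covers
markup = breadcrumb_jsonld([
    ("Home", "https://www.example.com/"),
    ("Bedding", "https://www.example.com/bedding/"),
    ("Duvet & Comforter Covers", "https://www.example.com/bedding/duvet-covers/"),
])
print(json.dumps(markup, indent=2))
```

The resulting JSON would be embedded in the page inside a `<script type="application/ld+json">` tag.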

The second information architecture concept you need to be aware of is ontology: the relationships between taxonomies.

If an ecommerce hierarchy can be visualized as an inverted tree, with the home page at the top, then an ontology is the forest showing relationships between trees. An ontology might encompass various taxonomies, with each taxonomy organizing topics into a particular hierarchy.

An ontology is a more complex taxonomy containing richer information about a website’s content and items. We are just beginning to build ontology-driven sites, and one standard ontology vocabulary for ecommerce is GoodRelations.

The Semantic Web aims to help artificial intelligence agents such as search engine bots crawl through and categorize information more efficiently. It is also designed to assist in identifying relationships between items and categories (e.g., relationships between manufacturers, dealers, and prices).

Figure 4 – Related Categories or Related Products can be considered a form of ontology.

If you are not an information architect or a business analyst, you probably will not be involved in identifying related categories and products, but it is important to know these terms for your discussions with information architects.

Sometimes, related categories and products are automatically identified by the ecommerce platform or by specialized software.

Why is information architecture important for search engines?

A correctly designed information architecture will result in a tiered website architecture. A good architecture has an internal linking structure that will allow child pages (pages that can link upwards in the hierarchy, such as product detail pages or blog posts) to support the more important parent pages (upper-level pages that link down in the vertical hierarchy, such as category and subcategory pages).

Figure 5 – Pages that link to each other at the same hierarchy level are called siblings. They share the same parent.

With correct internal linking, a blog article, for example, “Top 5 New Features of Canon Rebel T5i DSLR,” will support the product detail page Canon Rebel T5i DSLR. Canon Rebel T5i DSLR will support the Digital Cameras category, further supporting the top-level category: Electronics.

Figure 6 – This pyramid-like structure is a very common architecture for ecommerce.

One of the questions that often comes up when deciding on the hierarchy is, “What is the best number of levels to reach a product detail page?”

The famous three-click rule, which suggests that every page on a website should take no more than three clicks to reach, is fine to use as a guide, but do not get stuck on it; it is perfectly acceptable if you need a fourth level in the hierarchy.

Information architects, business analysts, or merchandising teams can help identify relationships between categories, subcategories, and products. Based on these findings, you will decide on rules for an internal linking strategy. Such rules can include:

  • Only highly related categories will interlink.
  • Categories will link only to their parents.
  • Subcategories will link to related subcategories or categories.
  • Product pages link only to related products in the same category and parent categories.
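Rules like these can be encoded as a simple policy check that runs wherever link blocks are rendered. The data model below (node dicts plus a set of "highly related" pairs) is a hypothetical sketch, not a prescription:

```python
def may_link(source, target, related):
    """Return True if a link from `source` to `target` fits the rules above.

    `source` and `target` are dicts with 'type', 'id', and 'parent';
    `related` is a set of frozenset id-pairs marked as highly related
    (a hypothetical data model supplied by the merchandising team).
    """
    pair = frozenset((source["id"], target["id"]))

    if source["type"] == "category":
        # Highly related categories interlink; categories also link to their parent.
        return pair in related or target["id"] == source["parent"]
    if source["type"] == "subcategory":
        # Subcategories link to related subcategories or to categories.
        return (target["type"] == "subcategory" and pair in related) or \
               target["type"] == "category"
    if source["type"] == "product":
        # Products link to related products in the same category, or up to the parent.
        same_branch = target.get("parent") == source["parent"]
        return (target["type"] == "product" and same_branch and pair in related) or \
               target["id"] == source["parent"]
    return False

related = {frozenset({"cameras", "camcorders"})}
cameras = {"type": "category", "id": "cameras", "parent": "electronics"}
camcorders = {"type": "category", "id": "camcorders", "parent": "electronics"}
tripods = {"type": "category", "id": "tripods", "parent": "electronics"}

print(may_link(cameras, camcorders, related))  # True: marked as highly related
print(may_link(cameras, tripods, related))     # False: not related, not its parent
```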

A proper website architecture will help your website rank for the so-called head terms. For e-commerce websites, these are usually the category pages at all hierarchy levels. However, internal linking alone is not enough for a subcategory page to reach the top of the search engine results pages for category-related search queries.

Because head terms are usually competitive, a page targeting such terms should also include the following:

  • Relevant and useful content. This means that your listing pages should display more than just a list of items. You must present more than just product pictures and pricing on product detail pages.
  • Backlinks from related trusted external websites.

Additionally, proper information architecture means good usability. Great usability and content create an excellent user experience, leading to an increased dwell time (which is good for SEO).

Dwell time is the time a searcher spends on a page before returning to the SERPs. The longer this time is, the better.

Pogo-sticking means going back and forth between a SERP and the web pages listed in the results. For example, you search for something, click on the first result, are unhappy, and return to the SERP. Then you click on the second result, you are still not happy, and you go back to the SERP again, and so forth, until you find what you are looking for or until you refine your search query.

A SERP bounce happens when a search engine user clicks on your page in the SERPs and then returns to the results without interacting with any page elements.

Note that a high SERP bounce is not inherently bad for SEO, but a low dwell time might be. An increased dwell time sends quality signals because it hints to search engines that your page is relevant for a search query.

Navigation, such as primary, secondary, breadcrumbs, or contextual, is also one of the critical components of website architecture. Navigation is jointly crafted by various business members, led by the information architect. Given that the primary navigation will be present on almost every page, it influences how authority and link signals (i.e., PageRank and anchor text) are passed to other pages.

Fortunately, there are ways to give users what they want (findability, discoverability, and usability) and simultaneously guide search engine bots toward what you want them to discover, crawl, and index.

How can SEO add value to IA?

Remember, information architecture is not about technical issues but about organizing digital inventory and content. So, while SEO has a key role in information architecture, it should not dictate how information is labeled and organized. Information architecture is about making content easy to find and helpful for users. However, because most SEOs are biased towards marketing and technology rather than user experience and usability, it is advisable to involve an information architect and an SEO consultant when working out the information architecture.

Try to involve the SEO person from the initial stages of the information architecture process, so they can provide suggestions and feedback from a search engine standpoint and contribute to the overall site architecture discussion. Once the information architect drafting the information architecture has listened to the SEO's input, they can brainstorm with the other teams about implementing the SEO recommendations with minimal changes to the initial format.

Technology and marketing teams often dismiss a certain information architecture because it does not have traffic potential. Do not make that mistake. When optimizing for search engines and their users, you should listen to what other teams in the business have to say and only then suggest solutions.

As mentioned, SEO’s role is to provide consultancy from the perspective of search engines. Let’s look at a few areas where SEO input is valuable.

The concept of flat architecture

In a flat architecture, deep pages – pages at the lower levels of the website hierarchy, usually the product detail pages – are accessible to users and search engine bots within a minimal number of clicks (or hops, for bots).

Figure 7 – This figure depicts what flat website architecture looks like.

The opposite of flat architecture is the so-called deep architecture, which may look like the following diagram:

Figure 8 – In a deep-architecture model, pages are mostly linked in a vertical structure.

We will use math to illustrate the concept of flat architecture:

  • At level 0 (the home page), you link to 100 category pages: 100^1 = 100 URLs reachable.
  • From each page at level 1 (the category pages), you link to 100 subcategory pages: 100^2 = 10,000 URLs reachable.
  • From each page at level 2 (the subcategory pages), you link to 100 product pages: 100^3 = 1,000,000 URLs reachable.

In three “clicks,” search engines can reach and crawl (and eventually index) one million pages.

Note: the 100 links-per-page example was used as a guide only. You can have more or fewer links, depending on your site authority.
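The arithmetic above generalizes to any branching factor: with b unique links per page, roughly b^d URLs become reachable at depth d. A quick sketch of that calculation:

```python
def reachable(branching: int, depth: int) -> int:
    """URLs reachable at exactly `depth` clicks, assuming every page
    links to `branching` unique pages one level down."""
    return branching ** depth

# The example from the text: 100 links per page, three levels deep.
assert reachable(100, 1) == 100          # category pages
assert reachable(100, 2) == 10_000       # subcategory pages
assert reachable(100, 3) == 1_000_000    # product pages

# A lower-authority site linking to 50 pages per level covers far fewer URLs.
print(reachable(50, 3))  # prints 125000
```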

Let’s look at the scenario of a direct visit to your homepage. To reach a product detail page from the home page, a user will have to perform the following actions:

  • First, click on the Cosmetics category page.
  • The second click is on the Eye subcategory page.
  • The third click on the product details page.

If no external links point directly to that product detail page (abbreviated as PDP), search engines will find the PDP URL the same way users do: the bot will crawl from an entry page and eventually reach the product detail page. Keep in mind that search engines will enter your website through a multitude of URLs, not only through the home page.

In our scenario, it took only three clicks to reach the PDP, but if the website is structured using deep information architecture, it might take users and search engines more clicks or hops.

But how and why did we adopt flat architecture?

The concept of flat website architecture seems to have its roots in web design, and it started with the three-click rule becoming a best practice around the year 2000.

However, when usability experts tested this rule, they found it did not work for users as expected. As a matter of fact:

“Users’ ability to find products on an ecommerce website increased by 600 percent after the design was changed so that products were four clicks from the homepage instead of three” (p. 322).

Then smart SEOs jumped in, thinking that if the rule was good for users, it should also be suitable for search engines. SEOs found a way to funnel more PageRank to deeper levels and optimize crawling by providing shorter paths for search engines. However, the initial goal was to avoid ending up with pages in the supplemental index because of their very low PageRank; it was not to flatten the site architecture.

Here are a few important pointers about flat architecture:

  • Unless you sell a limited number of products (e.g., just ten dietary supplement pills) or unless you have a very limited number of pages on the site, do not flatten to the extreme. That means not linking from the home page to hundreds of product detail pages to build a flat architecture.
  • Flat architecture is about the distance between pages in terms of clicks, not about the number of directories in the URL. For example, you can link from the home page directly to a subcategory URL at the fourth level of the hierarchy (e.g., to promote a subcategory that generates high profits). In this example, the Recliners page is only one click away from the home page (which fits the flat architecture concept), yet it is four levels down in the directory hierarchy (which matches the deep architecture concept).
  • If you have already organized your hierarchy using URL directories, do not remove them just for flattening.

As long as the directories do not generate super-long URLs, they have advantages such as:

  • Facilitating easier website “theming” (we will discuss this in the Siloing section).
  • Presenting users with a clear delineation of the categories on your website.
  • Allowing for easier SEO, information architecture, and web analysis (e.g., troubleshooting indexation problems at the directory level).
  • Google and other search engines may use your directory structure to create rich snippet breadcrumbs.

Figure 9 – SERP breadcrumbs will show up only if the directory hierarchy is clear to search engines.

In this screenshot, you can see how Google displays breadcrumbs directly in SERPs. However, such rich snippets will show up only if the directory hierarchy is clearly structured or if you mark up your breadcrumbs with Schema vocabulary.

URLs don’t need to replicate the exact website taxonomy. If you want, you can keep the URL structure under two directories deep. Here’s an example.

On hotel reservation websites, it is common to have a taxonomy based on hotel geo-locations:

Taxonomy: Home > Europe > France > Ile-de-France > Paris


Even though the URL reflects the hierarchical taxonomy, it is too long and difficult to type in or remember.

If the website sells only hotel rooms, the alternative URL might look like:

If the website offers other travel services, such as air tickets or car rentals, then the alternative URL will include the type of service, and it might look like this:
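The original URL examples are not reproduced here, so as a hedged illustration only, a flattened scheme might look like the sketch below (the example.com domain, the slugs, and the helper function are all hypothetical):

```python
def hotel_url(city: str, hotel_slug: str, service: str = "") -> str:
    """Build a flattened hotel URL, at most two directories deep,
    instead of mirroring the full Europe/France/Ile-de-France/Paris taxonomy."""
    if service:  # the site also sells flights, car rentals, etc.
        return f"https://www.example.com/{service}/{city}/{hotel_slug}.html"
    return f"https://www.example.com/{city}/{hotel_slug}.html"

print(hotel_url("paris", "grand-hotel"))
# https://www.example.com/paris/grand-hotel.html
print(hotel_url("paris", "grand-hotel", service="hotels"))
# https://www.example.com/hotels/paris/grand-hotel.html
```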

Regarding the directory structure for hotel booking websites, it is worth noting that hotels are a special ecommerce case because you cannot re-categorize hotels from one city to another. However, for online retailers, product re-categorization happens frequently.

Keep the PDP URLs free of categories whenever possible to avoid issues with moving products from one category to another or issues related to poly-hierarchies (items categorized in multiple categories).

For example, to reach the product page 3-Level Carousel Media Center, a user will navigate through:

  • homepage
  • category page
  • subcategory page
  • sub-subcategory page

However, once the searcher reaches the product detail page, the URL is free of categories and subcategories:

Tip: setting product names in stone is also a good idea.

Notice a couple of things about the previous URLs:

  • The product page URL is free of category, subcategory, or sub-subcategory names.
  • The category and subcategory URLs include a trailing slash (/). That hints to search engines that the URLs are directories and that more content can be found on those pages.

Figure 10 – This is how Google treats trailing slashes in URLs.

  • The product page has a .html file extension, which hints to search engines that the document is a page, not a directory. The specific extension does not matter to search engines; it could just as well be .php or .aspx.

Removing category names from URLs is a trade-off with your data analysis, as it makes web analysis a little more challenging. However, this difficulty is surmountable. For example, you can group pages in your analytics tool or mark up the HTML code with strings that group pages based on your rules.

At the same time, make sure your web analytics tool is set up to group pages for analysis easily. Without unique identifiers for URLs, it is more difficult to segment data. You can also use tag managers such as Google Tag Manager to create content groups using Data Layers.
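One way to implement such grouping outside the analytics UI is to classify page paths with pattern rules. The URL patterns below are hypothetical; real rules depend on your own URL scheme:

```python
import re

# Hypothetical grouping rules; the first match wins.
CONTENT_GROUPS = [
    (re.compile(r"^/$"), "home"),
    (re.compile(r"^/[a-z0-9-]+\.html$"), "product"),      # category-free PDPs
    (re.compile(r"^/[a-z0-9-]+/$"), "category"),
    (re.compile(r"^/[a-z0-9-]+/[a-z0-9-]+/$"), "subcategory"),
]

def content_group(path: str) -> str:
    """Assign a page path to a content group for reporting."""
    for pattern, group in CONTENT_GROUPS:
        if pattern.match(path):
            return group
    return "other"

print(content_group("/3-level-carousel-media-center.html"))  # product
print(content_group("/furniture/"))                          # category
print(content_group("/furniture/tv-stands/"))                # subcategory
```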

Figure 11 – The flat architecture concept on an ecommerce site.


Siloing

In the simplest terms, siloing means creating a site architecture that allows users to find information in a structured manner while linking pages using a controlled pattern to guide how search engine bots crawl the website. Usually, this structure is a vertical taxonomy.

Siloing sounds like a fancy term, but it is just good information architecture: one part website hierarchy and one part navigation (using internal linking).

Figure 12 – At a basic level, siloing means that pages in a taxonomy branch/category (i.e., PDPs) should not link to pages in a different branch/category.

In strict hierarchy patterns, child pages are linked only to and from their respective parent pages. This is not possible without strictly controlled internal linking, and creating such a strict internal linking pattern is challenging, mainly because:

  • Primary navigation is present on all pages, so cross-linking will happen naturally.
  • Poly-hierarchies mean multi-categorization for products or subcategories. For example, the Office Furniture category can be categorized under both Office Products and Furniture.
  • Subcategory cross-linking and crossover products exist. For instance, you may have to link from a product categorized under Home Theater to another product made by the same brand but categorized under Audio.

Because ecommerce websites are complex, they are most likely to have a hierarchy that frequently interlinks silos. In practice, it is complicated and, sometimes, not even advisable to prevent internal linking between silos.

Figure 13 – Cross-linking happens naturally

The internal linking architecture can be very cumbersome and difficult to control, even for ecommerce websites with just a few hundred products, as you can see in this graph:

Figure 14 – This node graph shows how complex internal linking can be.

In this example, a website with just a few thousand pages generated over 250,000 internal links.

The siloing method

Conceptually, siloing is done by identifying the main themes of the website (for ecommerce, those will be departments, categories, and subcategories) and then interlinking only pages belonging to the same silo (for example, linking only within the same category).

The good part is that ecommerce websites are usually developed using a similar architecture, with separate hierarchies (themes) for each department or category.

By siloing the website into themes, you can rank well for semantically unrelated keywords on the same site, even though the themes are entirely different (e.g., "hard drives" and "red wines").

You can achieve silos with directories or with internal linking.


Directories

Information architects create hierarchies using user research, user testing, keyword research, and web traffic analysis. The URL structure will present the labels used to describe these hierarchies. Your silos will be the directories in the URLs.

Whenever possible, use a hierarchy created with directories.

Internal links

With internal linking, you create virtual silos, as pages in the same silo do not need to be placed in the same directory. You achieve virtual silos by controlling internal links in such a way that search engine bots will only find links to pages in the same silo. This concept is similar to bot herding or PageRank sculpting, with subtle differences in meaning and application.

Siloing with directories

Siloing with directories is the easiest to implement on new websites during the information architecture process. From a user experience perspective, creating the website hierarchy with directories is the best way to go.

But in the end, siloing with directories is nothing more than creating good vertical hierarchies, which the URLs reflect. Many online retailers create them naturally by branching out all categories, without overthinking SEO and without obsessing over keywords in the anchor text or internal linking patterns.

A sample silo with directories would look like this:


Does this type of siloing look familiar to you? It should, if you use directories in your URLs. Moreover, this is nothing more than a proper hierarchy. So, if you design your website hierarchy correctly, you do not even need to worry about siloing with other methods.

Keeping the directory depth low is best practice, ideally fewer than four or five levels.

Siloing with internal linking

Siloing with directories is not always possible, for example, when you wish to change an existing hierarchy on an established website. In this case, you will create virtual silos using carefully controlled internal linking.

Usually, pages in a silo need to pass authority (PageRank) and relevance (anchor text) only to other pages within the same silo. This prevents the dilution of the silo’s theme and sends the maximum power to the main silo pages.

Here are some rules for linking within and between silos. A page in a silo:

  • Should link to its parents.
  • Can link to siblings if appropriate. Siblings are pages at the same level in the hierarchy.
  • Should not link to cousins. Cousins are pages at the same level but in separate silos.
  • May occasionally link to uncles or aunts. Uncles and aunts are siblings of the node’s parent.
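When silos are directory-based, these family-tree rules map directly onto URL paths. The following sketch (assuming the hierarchy is reflected in the URL directories, with hypothetical paths) derives the relationship between two pages from their parent directories:

```python
from pathlib import PurePosixPath

def relationship(page: str, target: str) -> str:
    """Classify `target` relative to `page` from their URL directory paths."""
    p, t = PurePosixPath(page), PurePosixPath(target)
    if t == p:
        return "self"
    if t == p.parent:
        return "parent"         # should link
    if t.parent == p.parent:
        return "sibling"        # can link if appropriate
    if t.parent == p.parent.parent:
        return "uncle/aunt"     # may occasionally link
    if t.parent.parent == p.parent.parent:
        return "cousin"         # should not link
    return "unrelated"

def link_ok(page: str, target: str) -> bool:
    """Apply the silo linking rules listed above."""
    return relationship(page, target) in {"parent", "sibling", "uncle/aunt"}
```

For example, from /wine/red/merlot, the page /wine/white is an uncle, while /wine/white/chardonnay is a cousin. This is a deliberate simplification; as discussed below, a genuinely relevant cross-silo link is still worth making for users.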

Figure 15 – An over-simplified siloing diagram.

In this simplified siloing example, sibling number one may occasionally link to uncles, the siblings of that node’s parent. That means that if you need to connect two related supporting pages found in separate silos (called cousins), you should link to the other silo’s uncle rather than directly to the cousin.

If you need to link to pages outside the silo, you can block those links from being accessible to search engines (e.g., using AJAX – Asynchronous JavaScript and XML – iframes, or JavaScript blocked with robots.txt). Note that there is a fine line between white hat and gray hat SEO, and such linking may cross that line. Google’s definition of manipulative techniques hinges on the question: “Would you do it if search engines did not exist?”

The goal is not to take siloing to the extreme. If a page is relevant and you want to link to it, then do so, even if it is in a different silo or theme.

Siloing with internal links is a powerful advanced SEO technique, especially for large websites with multiple departments, themes, or categories that are not semantically related, e.g., groceries and mobile phones. However, it is important to know that siloing is not easily achieved, and it pays to be aware of the risks involved.

If you want to silo with internal linking, know that:

  • PageRank sculpting with rel="nofollow" is not recommended.
  • Virtual siloing means you somehow have to “hide” internal links from search engine bots; doing so may fall outside search engine guidelines.
  • Hiding internal links from search engines using iframes, AJAX, JavaScript, or similar techniques can qualify as cloaking since you show users content different from search engines; this could result in penalties.
  • If you want to obfuscate links with AJAX or JavaScript for SEO reasons, identify the percentage of users with JavaScript turned off. If that is a significant segment of your total visitors, ensure your website works correctly without JavaScript. Non-JavaScript users should be able to complete all micro and macro conversions on your site. An example of a micro-conversion would be an “add to cart” event, while a macro-conversion is a completed order.
  • Trading away too much for SEO at the expense of usability and accessibility is not the right way.
  • Siloing may require hiding entire navigation elements, such as facets and filters, from search engines. There are risks associated with such bold tactics.

Figure 16 – The nofollow links are marked with a red dotted border.

The image above shows how only the top-level categories (Women, Men, Baby, etc.) and the immediate next hierarchy level (Clothing, Shoes, Accessories, etc.) pass authority through links. Category links are nofollow. This is a bold (likely bad) SEO approach to handling primary navigation menus.

Proper internal cross-linking is helpful and necessary for good rankings, and we will discuss this in detail in the Internal Linking section. However, remember that internal linking must be built for users first and only then for search engines. It would be best to link consistently, thematically, and wisely (using synonyms, stems, plurals, singulars, and so on) to support rankings for categories and subcategories.

You should not remove navigation elements just for SEO purposes. Keep the links that are useful for users in the interface, and if you want to remove links for SEO reasons, do it by blocking those links with AJAX or JavaScript.

Another theming method is to evolve taxonomies into ontologies: instead of linking based strictly on a vertical taxonomy, interlink conceptually related items. For example, you can interlink a particular fragrance with the sunglasses manufactured by the same brand. This type of interlinking requires defining semantic and conceptual relationships between categories and items and then deciding on internal linking based on predefined business rules.

One such business rule is crowdsourced recommendations (AKA Customers Who Bought This Item Also Bought…). Do users often buy certain products together? If yes, then cross-link those product detail pages, even if they are in different silos.

If this type of linkage generates too many internal links on some pages, you can always block the less important links (you must define how many links are too many for your particular situation). However, for users’ sake, interlink whenever necessary without concern about siloing.

If the business rules are based on data, you will not link adult toys to children’s books. Also, you will not link to hundreds of related products but just to a few highly related items.

Here’s what Google has to say about the subject of theming an internal architecture in a post on their official blog:

Q: Let’s say my website is about my favorite hobbies: biking and camping. Should I keep my internal linking architecture “themed” and not cross-link between the two?

A: We haven’t found a case where a webmaster would benefit by intentionally “theming” their link architecture for search engines. And, keep in mind, if a visitor to one part of your website cannot easily reach other parts of your site, that may be a problem for search engines as well.

This is a reminder not to take siloing to the extreme. However, siloing with directories is natural, and the resulting internal linking is also great for users and search engine bots.

I lean towards a hybrid siloing concept combining the following:

  • A good website hierarchy reinforced by the directory structure (a patented Google signal for classifying pages).
  • Rule-based internal linking.
  • Depending on the case, exposing fewer links to search engines, which can be done with or without AJAX/JavaScript. We will discuss this subject later in this guide.

Generate content ideas

It is widely known that keyword research can help with generating content ideas. Keyword research also enables you to expand from a relatively narrow set of head keywords (category and subcategory keywords) to a large number of torso and long tail keywords. These long-tail keywords can then be used to generate content ideas, identify product attributes, and improve product descriptions.

Based on the initial taxonomy created by the information architect, you can identify keyword patterns, tag user intent, group keywords according to buying stages, and find search volumes; I will cover these tactics in the Keyword Research section.

This type of research provides excellent insights usually overlooked by the other teams in an ecommerce business.

Suppose you want to consistently publish content that your target market will find relevant. In that case, you must know the queries searchers use and, more importantly, the type of content they seek. Are they looking for general information about your products? If so, you would do well to emphasize review-type content and how-to articles. Are they searching for products to buy? If so, you could improve the content on a product detail page.

Once you discover the user intent behind a search query, you will be better able to address your target market’s needs on your landing pages. When your landing pages address people’s needs, conversion rates will skyrocket, and rankings may improve (as an indirect quality signal).

Here are some interesting facts about search queries:

Figure 17 – The search demand curve, as explained by MOZ. Notice how the long tail of keywords and the chunky middle make up more than 80% of the keywords.

Why did I mention these search query facts?

It is because the correct way to start keyword research and build a great website architecture is by recognizing that only a small fraction of your target market is ready to buy at any given moment. Many e-commerce websites mistakenly focus on targeting keywords such as department, category, or subcategory names while completely ignoring a large number of informational search queries (and even navigational). I will detail a keyword research process in the Keyword Research section of the book.

Let’s look again at a typical ecommerce website architecture:

Figure 18 – Under this sample hierarchy, product detail pages are not supported by any other content-heavy level below the PDP level.

There are four levels in the example above: the first level is the home page, which is supported by categories (second level), subcategories (third level), and product detail pages (fourth level). The subcategory and product detail pages support the category level; product detail pages then support subcategory pages. However, the product pages are the “leaves” in this example – the last level of the e-commerce hierarchy.

When an ecommerce website does not support important pages (i.e., categories or PDPs) with an additional content-heavy level in the hierarchy, it can miss a considerable amount of organic traffic coming from informational search queries. It will also miss out on the ability to create useful contextual links to product, subcategory, and category pages.

In our example, you can overcome these challenges by creating a fifth level in the hierarchy. This level can be a blog, a learning center, or a projects section on the website, to name just a few ideas. This content-rich section can also live outside your existing hierarchy.

I could not find a single reference to their blog on the Victoria’s Secret website. This is bad for them but good news for the small guys competing in their niche.

Figure 19 – Only five pages on this website contained the word “blog”, and none were part of a real blog.

Here are two ideas for you:

  • Add a new layer of support for all pages on the website, especially for product and subcategory pages. As I mentioned, this layer can be a blog, a forum, expert Q&As, how-to guides, buying guides, white papers, workshops, etc. This layer will generate additional organic traffic and support contextual, internal linking. Additionally, it may help build a community around your brand, which is always great.
  • Conduct keyword research with this new level in mind so you will not dismiss informational keywords. Categorize such keywords into the Informational bucket in your spreadsheets and plan content based on them. There is more about this process in the Keyword Research section.

Let’s say you sell home improvement items and want more people to visit your website and buy them. However, many searchers in this niche are DIYers, using keywords specific to the awareness and research stages. Then why not create a series of DIY home improvement projects and publish them on a content-heavy website section?

Look at the following inspiring piece of content from Home Depot’s blog. Home Depot is not into selling instructional DIY DVDs, but they are attracting the target market with highly related content. Home Depot has an entire DIY section on its website.

Figure 20 – This page supports category and product pages by linking to them.

When you add a new content-rich layer in the hierarchy, you:

  • Expose your brand to your target market in the early stages of the buying funnel.
  • Add a new way to generate more traffic.
  • Give visitors more reasons to buy from you.
  • Reinforce product and category pages with better internal linking.

Let’s see how SEO could help regarding information architecture.

Evaluate the information architect’s input

Planning an e-commerce architecture starts with information architects identifying the navigation labels such as departments, categories, or subcategories.

In many cases, information architects do not associate this process with the keyword research process, which is good because navigation has to serve the users, not the bots. However, you should evaluate the architect’s input from a search engine perspective.

Here’s an example of how to do that using Google Trends. If the information architect wants to label one of the categories in the primary navigation as mp3 players, the following search trend comparison data might change their mind.

Figure 21 – The trend for “iPod” is downward, but it is still several times higher than the one for “mp3 player”.

Indeed, the iPod can be a child of the mp3 player parent. Still, it would be best to brainstorm with others in the team to decide whether making the iPod category easier to find would be more beneficial for users, which may mean displaying it directly in the primary navigation.

The search volume for a parent category is often higher than the search volume for a child category, but as you can see in this example, this rule is not definitive.

Also, note that Google Trends displays normalized data on a scale of 0 to 100, where 100 represents the peak search interest for the selected period. Google Trends does not present absolute search volumes.

All e-commerce websites will have primary navigation (aka global or main navigation), secondary navigation (aka local navigation), and some contextual navigation. Another form of navigation specific to ecommerce websites is faceted navigation.

Primary and secondary navigation

Primary navigation is for the content/links most users are interested in, but remember that importance is relative (something important for your business may not be as important for another business). Generally, on e-commerce websites, primary navigation displays departments, categories, or market segments (i.e., men, women, kids, etc.).

Primary navigation is the easiest type for most users to identify. It allows direct access to the website’s hierarchy and is displayed on almost every page.

Figure 22 – A sample primary navigation on Kohl’s website.

On a side note, it will be difficult for Kohl’s to rank for top-level category keywords (e.g., Home, Bed & Bath, Furniture, Outerwear, etc.) since they will have to compete with niche-specific websites that are laser-focused on a single segment—for example, a company that sells just furniture. Kohl’s can achieve good rankings but will require significant work, including onsite SEO and quality backlink development.

Regarding secondary navigation, even information architecture experts like Steve Krug, Jesse James Garrett, and Jakob Nielsen cannot agree on a definitive definition.

Secondary navigation stands for content of secondary interest and importance to users. Again, importance is relative to each business.

When it comes to navigational links, an old SEO best practice recommended keeping the number of links on a page under 100. However, this rule is obsolete; you can list more than 100 links on your pages, depending on your website’s authority.

You will see high authority websites like Walmart listing hundreds of internal and external links:

Figure 23 – There are 633 links on this page. This may be too many unless you have an excellent site authority.

Walmart’s many links result from its use of so-called fly-out mega menus in the primary navigation. For usability reasons, this type of menu makes deeper sections of the website easily accessible to users.

Mega menus allow direct linking to subcategories and products, but you must be careful to keep the number of links to a reasonable limit. Since the primary navigation is present on most pages, it significantly influences how authority moves back and forth between pages.

Consolidating a long list of departments into one place involves design considerations (limited screen real estate) and user experience considerations (too many options to skim at once). However, it also affects the PageRank passed to the other pages.

Figure 24 – Design limitations forced Walmart to reduce the number of links in the navigation. Notice the “See All Departments” link at the bottom of the primary navigation.

However, Walmart has a separate page for the complete list of their departments (e.g., health) and categories (e.g., vitamins):

Figure 25 – The “All Departments” consolidation is a clever idea because this page will act as a sitemap for people and search engine bots.

SEO can help information architects decide which categories are the most important for users and should be listed in the primary navigation. Use web analytics tools to identify metrics such as the most searched terms on the website, the most viewed pages, and the highest search volume from pay-per-click campaigns.

Figure 26 – The keyword with the highest number of internal site searches could eventually be placed in the navigation if it makes sense, or it can be placed near the search field.

Contextual navigation

Contextual navigation refers to the navigation present in the main section of web pages. It excludes boilerplate navigation items like those displayed in headers, sidebars, or footers.

Some examples of contextual navigation on ecommerce websites include sections such as:

Figure 27 – Customers who viewed this item also viewed

Figure 28 – Best Sellers

Figure 29 – Contextual text links in the main content (MC) areas.

Figure 30 – Links in Recommended Products carousels.

You must discuss contextual navigation with the information architect to identify relevant relationships between categories, subcategories, and products and plan the internal linking accordingly.


SEO can help with the prioritization of labels in the navigation.

It is helpful to know how many pages will be linked from structural sections of the website (primary, secondary, and footer links) on each page template. This is important to estimate because you must determine how many links you can display in the contextual navigation (only if you need to limit the number of links on pages).

This is not a definitive rule, but if you start a new website, keeping the number of links on each page to a maximum of 200 is a good idea. This is because you will initially have only a small amount of authority to pass to lower levels.

Here are some prioritization guidelines:

  • Keep the number of top-level categories or departments in the primary navigation low to avoid the paradox of choice. Research has established that having too many options is bad for decision-making.
  • The short-term memory “rule of seven items” does not apply to primary navigation, as users do not need to remember the labels.
  • You can list more categories on a “view-all departments” or “view-all categories” page.

Figure 31 – In a horizontal design, the primary navigation is constrained by design space.

As you can see in the examples above, the primary navigation is constrained by a horizontal design space. Notice how short the category names must be. Macy’s displays eleven labels in the primary navigation, the same as BackCountry, while Office Depot lists only nine.

  • Vertical primary navigation placement allows for more categories to be listed:

Figure 32 – Costco displays 18 categories in the menu (the same as Sears), while Walmart displays only 13.

Specialty retailers will probably have no more than two or three departments (sometimes none). In those cases, they may list categories rather than departments in the menu. General department stores can have up to 20 departments.

  • You can break each category level into 20 to 40 subcategories, depending on how extensive your inventory is.
  • If a parent category needs more than 40 subcategories, consider adding a new parent category or implementing faceted subcategories.
  • Ideally, the hierarchy depth to reach a product detail page should be under four levels:
    • Two levels deep: home, category, and product detail page (this is suitable for niche retailers).
    • Three levels: home, category, subcategory, and product detail page (this is the most common setup for medium-sized e-commerce websites).
    • Four levels deep: home, department, category, subcategory, and product detail page, OR home, category, subcategory, sub-subcategory, and product page. This setup is specific to marketplaces, large department stores, or websites with extensive inventories.
  • If the hierarchy has more than four or five levels, use faceted navigation to allow filtering by product attributes.
  • To improve the authority (PageRank) and the relevance (anchor text) of product detail pages, add a content layer (e.g., blog, community forums, user reviews, and so on) in the hierarchy just below the product detail page level and link to relevant items from there.
  • Ordering categories (or items) alphabetically is not always the best option. You should prioritize based on popularity and logic whenever possible, and complement that with alpha navigation only if user testing proves that such navigation is useful.

Figure 33 – An older version of primary navigation on OfficeMax, featuring alpha navigation.

Figure 34 – Newer screenshot after OfficeMax tested the alpha navigation and reverted to category name navigation.

  • If a category has too few items, consider moving them to an existing category with more items, but do this only if the new categorization makes sense for users.
  • A category with too many items (i.e., thousands) may generate information overload. In this case, you can break the category into smaller subcategories. Additionally, create a user experience that allows better scope selection before displaying a list of items.

Keyword variations

Planning a categorized product hierarchy is not easy. At the top category level, the labels in the primary navigation must be intuitive, have the appropriate search volumes, and be concise enough to support menu-based navigation. It is worth repeating that determining the hierarchy of an ecommerce website based solely on keyword research is neither ideal nor recommended. However, keyword research should complement and support information architecture.

One common question regarding keywords is handling misspellings, synonyms, stemming, or keyword variations for a category. Where do you place them in the website’s information architecture?

This should be easy for your internal site search: you must associate each keyword variation, misspelling, etc., to an existing product or category and redirect users to the respective canonical product or category page. If there is no exact match between the variation, misspelling, or synonym and a category on your site, send users to an internal search result page.

For example, when someone searches for “tees,” “tee shirts,” or “t-shirts,” you return results for “t-shirts.” You can redirect the searcher to the t-shirts category landing page or product listing page if there is an exact match between the search query and the category name.
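The routing logic above can be sketched in a few lines. This is a minimal sketch with a hypothetical variation map and category URLs; in practice, you would build these mappings from keyword research and your internal site search logs:

```python
from urllib.parse import quote_plus

# Hypothetical variation map: each misspelling or synonym points to a
# canonical category slug.
CANONICAL = {
    "tees": "t-shirts",
    "tee shirts": "t-shirts",
    "t shirts": "t-shirts",
    "t-shirts": "t-shirts",
}

# Hypothetical mapping of canonical category slugs to landing page URLs.
CATEGORY_URLS = {"t-shirts": "/clothing/t-shirts/"}

def search_destination(query: str) -> str:
    """Send exact-match queries to the category landing page;
    everything else goes to an internal search results page."""
    slug = CANONICAL.get(query.strip().lower())
    if slug in CATEGORY_URLS:
        return CATEGORY_URLS[slug]
    return "/search?q=" + quote_plus(query)
```

With this approach, “Tees”, “tee shirts”, and “t-shirts” all land on the same canonical category page, while unmatched queries fall back to search results.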

Figure 35 – Make sure your internal site search works appropriately and does not return wrong products, as in this example (a search for “t-shirt” returned bras).

In this screenshot, I wanted to highlight the improper handling of internal site search results, returning bras when someone searches for “t-shirts”.

Handling keyword variations for external search engines is a bit more complicated. Commercial search engines like Google and Yahoo must understand and connect keyword variations with the right content on your website.

Previously, you would have created individual pages to target keyword variations (or groups of keyword variations). However, Google shifted to ranking topics instead of individual keywords. Therefore, your pages must include the searched keyword along with semantically related words (e.g., synonyms, plurals).

Ensure you are not overdoing it; including all 20 possible keyword variations on a single page is spammy.

Here are some ideas for you to consider:

Target the most common variations in the title, the description, or both.

Figure 36 – Gap targets keyword variations in the description, while Sears uses the title tag.

Use product and category descriptions.

One option is to use category or product description sections to add keyword variations in the copy. The bottom of the image below highlights how this website uses two keyword variations for “t-shirts”.

Figure 37 – This retailer uses the words “tees” and “t shirts” in the category description copy to capture traffic for those keyword variations.

Take advantage of related searches.

This approach requires displaying a “related searches” section on your pages. This section may contain several of the most used keyword variations:

Figure 38 – Remember that Related Searches sections should be useful for users first and only then for search engine bots.

Identify possible information architecture problems

You can perform a “site:” query on Google, for example, site:yourwebsite.com “category_name”, to see whether search engines list the right page at the top. You can also use product and subcategory names in the site: query, for example, site:yourwebsite.com “gourmet products”.

If the page you optimized for on your website does not show up at the top of the results, various reasons are possible, such as:

  • Improper internal linking. This happens when the internal linking architecture does not support the correct page.
  • Thin content, no content, or inaccessible content (e.g., JavaScript reviews) on the right page.
  • External links point to the wrong page(s), diluting and reducing the relevance of the correct pages. If people link to the wrong pages, you must ask yourself why. Maybe those other pages are more relevant to them?
  • Page-specific penalties.

Of course, an in-depth analysis is required to identify the cause of these issues. When determining the cause of such problems, it is important to understand how the targeted page (the page you want to rank with at the top of the SERPs) is linked internally from other pages on your website and external sites.

One of the tools for analyzing this is Google Search Console:

Figure 39 – The Internal Links report will display the most important internal links, but only for the most important pages on the website.

This report is basic, but it can provide some immediate insights. Look for signals such as:

  • Are there more internal links to the wrong page(s) than to the desired page?
  • Is the targeted page linked from parent pages (pages higher in the hierarchy)?
  • Is the targeted page linked from pages with high authority?
  • Is the targeted page linked with the proper anchor text?

If there are issues like these, it is time to restructure your internal linking. Remember that Google will not let you download the complete list of links, only the top ones.

Another useful method to assess the internal linking is to run a crawl on your website using tools like Xenu Link Sleuth or Screaming Frog and export the results to Excel.
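Once you have the crawl export, a short script can summarize how each URL is linked internally. This is a sketch that assumes a CSV export with “Source”, “Destination”, and “Anchor” columns; real exports name their columns differently depending on the tool, so adjust accordingly:

```python
import csv
from collections import Counter, defaultdict

def inbound_link_report(csv_path: str):
    """Count inbound internal links and anchor texts per destination URL.

    Assumes a crawl export with 'Source', 'Destination', and 'Anchor'
    columns (hypothetical layout; column names vary by crawler).
    """
    link_counts = Counter()                 # destination -> inbound link count
    anchor_texts = defaultdict(Counter)     # destination -> anchor text counts
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            link_counts[row["Destination"]] += 1
            anchor_texts[row["Destination"]][row["Anchor"].strip().lower()] += 1
    return link_counts, anchor_texts
```

Comparing the inbound link counts and anchor texts of the targeted page against those of the page that actually ranks is a quick way to spot the internal linking signals discussed above.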

It is also a good idea to run the most important terms through your internal site search to check whether there is a match between the URL returned by your internal site search and the URL returned by search engines.

For instance, let’s say that Google returns the Gourmet Products category URL in the first position when you search for “gourmet products”. If you were to click on the result, the Gourmet Products page opens:

Figure 40 – Costco’s organic search landing page, pointing visitors to the right category page.

However, Costco’s internal site search returns a different page: a search results page. This is not the best approach from a usability point of view or for search engines, because Google does not want to list internal search results pages in its SERPs.

Figure 41 – In Costco’s case, this mismatch may happen because of the setup of the internal site search rules.

When there is an exact match between a user’s query and a category name, it is preferable to redirect the user to the listing page instead of to a search results page.


When it comes to choosing link names in the navigation, labeling is an area where information architecture and search engine optimization overlap. SEOs and information architects must understand the user’s mental model to label the navigation correctly. Labeling is difficult and presents a real challenge for large ecommerce websites. Research from eBay shows how complicated it can get.

While most ecommerce taxonomies can be architected based on a predefined vocabulary, SEO can assist in labeling.

Let’s say you sell toys. Start by searching for the category name (“toys”) using Google’s Keyword Planner:

Figure 42 – Do not forget to set up the targeting options based on your target market.

Download the list generated by Keyword Planner and open it with Excel. Then, categorize keywords into “buckets” by mapping each keyword to either its category, attribute, or filter name:

Figure 43 – Categorize keywords into “buckets”.

Insert a pivot table that counts the occurrences of the category:

Figure 44 – Sort by Count of Category.

If you sort by Count of Category, you can get an idea of what needs to be present in the navigation. You can also identify filter values that can be used in the faceted navigation.
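The pivot table’s count-by-bucket step can also be reproduced in a few lines of Python. The keyword tags below are hypothetical examples standing in for your own spreadsheet, not real research data:

```python
from collections import Counter

# Hypothetical (keyword, bucket) pairs produced by manual tagging.
tagged = [
    ("lego toys", "brand"),
    ("disney toys", "brand"),
    ("barbie toys", "brand"),
    ("toys for 2 year olds", "age"),
    ("outdoor toys", "theme"),
    ("toys for girls", "gender"),
]

# Count how many keywords fall into each bucket, most frequent first.
bucket_counts = Counter(bucket for _, bucket in tagged)
for bucket, count in bucket_counts.most_common():
    print(bucket, count)
```

In this toy sample, “brand” dominates, which mirrors the reasoning in the text: the buckets with the highest counts are the strongest candidates for navigation labels and facet names.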

Some navigation labels will be easily identified after tagging fewer than a hundred keywords. For instance, in our example, it seems clear that “brand” should be a primary or secondary navigation label, and users should be able to navigate and filter items by brands. Other possible candidates in this example are “age,” “theme,” and “character.”

Take the findings from this type of research and discuss them with the information architect.

Another thing you should do with the keyword list generated by Keyword Planner is to get the individual word frequency using tools such as

Figure 45 – Words sorted by frequency.

Visually, this is how the word frequency looks for our previous example:

Figure 46 – The “word cloud” for a list of keywords.

The image above is what we call the “word cloud,” and in our example, I excluded the words “toys” and “toy” to make the other words stand out.

The frequency of the word “kids” is particularly interesting. If you sell toys only for kids (no other target age, i.e., adults), you probably should exclude the word “kids” from your analysis.
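Word frequencies like these can be computed with a short script. A minimal sketch, using a hypothetical keyword list and excluding “toy”/“toys” as in the word cloud above:

```python
import re
from collections import Counter

# Hypothetical keyword list; in practice, use the Keyword Planner export.
keywords = [
    "toys for kids",
    "kids toys",
    "wooden toys for kids",
    "educational toys",
    "kids outdoor toys",
]

# Words excluded so the remaining terms stand out, as in the word cloud.
EXCLUDE = {"toys", "toy", "for"}

words = Counter(
    word
    for kw in keywords
    for word in re.findall(r"[a-z']+", kw.lower())
    if word not in EXCLUDE
)

print(words.most_common())
```

Sorting the resulting counts surfaces candidate labels (here, “kids”) the same way the word cloud does, just in a form that is easier to filter and re-run.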

If you are in this niche, you may notice that a few essential segments/labels are missing from this keyword list:

  • One is the gender label (girls and boys).

  • Is your target market price-sensitive? Then “pricing” might be another segmentation/label (shop by price).

Insights like these cannot be discovered using keyword tools alone. So, how do you identify these “hidden” labels? By conducting user research and user testing, and by creating consumer personas, scenarios, user flows, site maps, and wireframes.

Remember that from an information architecture perspective, labeling does not stop with the text used for links and navigation. There are different types of labels as well, such as:

Document labels

  • URLs (whenever possible, URLs should contain keywords that make sense to searchers and search engines).
  • File names (having relevant keywords in filenames is important for both SEO and users).

Content labels

  • Page titles should make sense to searchers and search engines. When there is a partial match between the keywords in the HTML title element and the search query, search engines will emphasize (bold) the matched keyword(s), which may help with SERP click-through rates (CTR).
  • Headings and sub-headings. Headings use large fonts and attract the eyes almost immediately. Putting keywords in headings assures users they are in the right place and helps with dwell time and bounce rates.

Other types of navigation labels

  • Breadcrumbs. Remember that since search engines became popular, home pages are no longer the only entry points to websites. Therefore, use breadcrumbs to communicate your site’s hierarchy to searchers easily and quickly.
  • Contextual text links. Using keyword-rich anchor text placed in a sentence or paragraph is one of the best ways to interlink pages vertically or horizontally.
  • Footers are also a type of navigational label.
    A quick note on this type of navigation: the footer is probably the place people spam the most, by creating dozens of keyword-rich internal links.

Figure 47 – The screenshot depicts a footer that makes this website a good candidate for an over-optimization filter.

This footer is mainly boilerplate text, meaning that search engines will most likely ignore it when assessing this page’s content and the anchor text’s relevance.

It does not help to repeat “men’s {category name}” across a million pages since search engines can exclude boilerplate text pretty well when computing relevance.

Figure 48 – An excerpt from Google’s webmaster guidelines regarding boilerplate repetition.

It is funny how SEOs refer to the concepts discussed in this section as on-page SEO factors, while information architects refer to the same concepts as labels. SEOs and information architects clearly work with similar and related concepts, yet the two groups still cannot easily agree on how to optimize websites for both searchers and search engines.


SEO can help information architects with canonicalizing poly-hierarchies.

Very often, multiple hierarchies could be appropriate for a given item. It is important to help the information architect choose the best fit as the canonical hierarchy and to stick to it. Link only to the canonical hierarchy from the primary or secondary navigation.

Ideally, all links on the website should point to only one canonical hierarchy.

You can keep as many logical hierarchies as are helpful to users, but to avoid confusing search engines, make sure internal links point to the canonical hierarchy.

For example, the Elmo category can be found under:

Toys > Stuffed Animals > Elmo (URL:

Gifts > Holidays > Christmas > Elmo (URL:

If you decide that the first hierarchy is the canonical one (usually canonical hierarchies are the shortest), then whenever you link internally to the Elmo category, use the URL
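To keep internal links consistent, you can centralize the canonical choice in code. The sketch below, with hypothetical category names and a hypothetical URL scheme, picks the shortest hierarchy as canonical and builds the one URL every internal link should use:

```python
def pick_canonical(hierarchies):
    """Pick the canonical hierarchy; the shortest usually wins (ties: first listed)."""
    return min(hierarchies, key=len)

def canonical_url(hierarchy, base="https://example.com"):
    """Build the single URL every internal link should use (hypothetical URL scheme)."""
    slug = "/".join(part.lower().replace(" ", "-") for part in hierarchy)
    return f"{base}/{slug}"

# The two hierarchies from the Elmo example
elmo = [
    ["Toys", "Stuffed Animals", "Elmo"],
    ["Gifts", "Holidays", "Christmas", "Elmo"],
]
print(canonical_url(pick_canonical(elmo)))
# https://example.com/toys/stuffed-animals/elmo
```

Routing every internal link through a helper like this means the canonical choice is made once, not page by page.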

You can use your web analytics tool to see how most users reached a page. For example, look at the Navigation Summary report in Google Analytics (under Behavior > All Pages) and see how most people reached the Elmo page:

Figure 49 – To get this report, follow the steps illustrated in this screenshot. Use the Visitors Flow report under the Audience tab for a more detailed analysis.

Additionally, you can look at the Refined Keywords dimension in the Behavior > Search Terms section to understand what keyword refinements were made after a search for “Elmo”. The Refined Keyword report can also be a source of keyword variations, as you can see in the following screenshot:

Figure 50 – The Refined Keyword report can be a source for keyword variations.

Remember that there is no right or wrong way to classify a product into a given taxonomy, as long as you refine it over time. However, once you decide on a canonical hierarchy, it is a good idea to set it in stone.

Here are some other SEO tips for ecommerce information architecture:

  • If you use Google Analytics (or any other web analysis tool), activate the Site Search Tracking option. Analyze what users search for and use that information to decide on the website’s hierarchy. However, do not rely solely on your web analytics data because you will miss a lot of data sourced outside your site.
  • Use keyword research tools to identify keyword variations and suggestions for the terms you have in mind or those generated with user research and card sorting.
    • Google Keyword Tool
    • Search Term/Query Reports
    • Wordstream
    • Ubersuggest
    • Google Suggest
    • Google Correlate
    • SEMRush
    • SpyFu
  • Analyze your competitors’ website architecture and navigation, but do not copy mindlessly. Use their information for inspiration, but ultimately create your site architecture.
  • Use a crawler on your competitors’ websites and sort their URLs alphabetically. For this to work, you may need to crawl many URLs (e.g., 250k+).
  • Find your competitors’ sitemaps (the HTML and the XML Sitemaps) and analyze them in Excel.

Figure 51 – Sorting URLs alphabetically can reveal the website structure.
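If you prefer to script this analysis, a minimal Python sketch can sort the crawled URLs and count first-level directories; the crawl export below is hypothetical:

```python
from collections import Counter
from urllib.parse import urlparse

def top_level_sections(urls):
    """Sort crawled URLs and count first-level directories to expose the site structure."""
    sections = Counter()
    for url in sorted(urls):
        path = urlparse(url).path.strip("/")
        if path:
            sections[path.split("/")[0]] += 1
    return sections.most_common()

# Hypothetical crawl export from a competitor's site
crawl = [
    "https://competitor.example/toys/lego",
    "https://competitor.example/toys/stuffed-animals",
    "https://competitor.example/gifts/christmas",
]
print(top_level_sections(crawl))
# [('toys', 2), ('gifts', 1)]
```

The counts tell you which sections carry the most pages, which is a fair proxy for how the competitor weights its hierarchy.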

  • Download the DMOZ taxonomy and look at the shopping categorization.
  • When choosing category names, use Google Trends to check whether there is a steep drop in what people search online over time.

Figure 52 – Notice how the interest in “digital cameras” trends downwards. This probably has to do with mobile phones taking increasingly better pictures.

  • Do not create the website hierarchy based solely on keyword research data; validate it with card sorting and user interviews. Nowadays, you can do that quickly online.
  • Perform simple navigational queries, like “contact {your_brand}”, and make sure the contact URLs, and all other important URLs, are user-friendly.

Figure 53 – This is a not-so-friendly “contact us” URL.

Remember, labeling applies to URLs, too, not only to links. In this example, the URL is not optimized for users (nor for search engines). If the CMS is what limits the URL format, it may be time to ditch the old CMS for a new one.

A friendly URL will read OR

  • If you need to categorize large volumes of items, you can use the power of folksonomy, an academic term for crowdsourced tagging and classification. Services such as Amazon’s Mechanical Turk will allow you to categorize products quickly, and even create relationships between products, using real people. However, be careful about how you select participants and what instructions you give them.
  • When card sorting tests are in progress, listening and observing are more important than putting words in your users’ mouths.
  • When you remove/update categories from your website (at all levels), ensure that the URLs belonging to the updated categories redirect to the most appropriate working page.
  • When you develop or update the website, create a checklist of SEO requirements for the information architect (e.g., directory and file name conventions, canonicalization rules, lowercasing all URLs, data quality rules for data input teams, seasonality and expired-content handling, parameter handling, and so on). I will not provide an extensive checklist here because people tend to limit themselves to the pointers in the list while missing others. After reading this guide, you should be able to come up with your own list.
  • Send email alerts to the search engine optimizer when someone removes or updates categories, subcategories, or products so that they can check the header responses for the new and old URLs. This task can be easily automated.
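As a sketch of that automation, the following Python function flags suspicious header responses from a URL-to-status map; collect the status codes with whatever crawler or HTTP client you already use. The URLs and advice strings are illustrative only:

```python
def audit_responses(statuses):
    """Flag URLs whose header responses need attention after a category change.

    `statuses` maps each URL to the HTTP status code it returned; collect the
    codes with whatever crawler or HTTP client you already use.
    """
    issues = []
    for url, code in statuses.items():
        if code == 404:
            issues.append((url, "dead page: 301-redirect it to the closest working category"))
        elif code == 302:
            issues.append((url, "temporary redirect: use a permanent 301 instead"))
    return issues

# Hypothetical header responses after a category was renamed
statuses = {
    "https://example.com/old-category": 404,
    "https://example.com/new-category": 200,
    "https://example.com/sale": 302,
}
for url, advice in audit_responses(statuses):
    print(url, "->", advice)
```

Wire the output into the email alert mentioned above, and the SEO gets a ready-made to-do list whenever categories change.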

Technical architecture

At the beginning of this section, I mentioned that site architecture (SA) is made of information architecture (IA) and technical architecture (TA). We then looked at several information architecture topics. Now, it is time to discuss technical architecture.

While duplicate content and crawlability issues are well-known SEO headaches, many search engine optimizers categorize them under the information architecture umbrella. However, they are, in fact, technical issues. Most SEO tips you will learn in the next chapters address technical architecture issues.


Keyword Research

Length: 10,599 words

Estimated reading time: 1 hour, 10 minutes



While SEO is the abbreviation for “search engine optimization”, SEO experts do not improve how search engines work; they optimize for search engines. And because the primary purpose of search engines is to be helpful to the people who use them, SEO would be better thought of as optimizing your website for search engine users. Search engine users are also referred to as searchers.

The search trifecta includes three entities:

  • The user
  • The search engine
  • The website

A common search experience looks like this: the user enters a search query on the search engine, which leads them to the website, which should (in an ideal state) fulfill the user’s query.

Figure 54 – The search trifecta.

We often skip the user and jump straight to the search engine when performing keyword research. This section describes what I believe is a better approach to keyword research: start with the user, continue with the website, and finally, consider the search engine.

I will refer to keywords and queries interchangeably, but there is a subtle difference between them.[1] A search query is a series of words users type into a search engine. A keyword is an abstract concept within a search query.

Figure 55 – Keywords are abstract concepts.

For example, on e-commerce websites, keywords are represented by department, category, or subcategory names. A search term that contains several words, including the keyword, is a search query.

Good information architecture and keyword research are at the foundation of great ecommerce websites that perform the best in search engines and convert at high rates. In the Information Architecture section, we found that deciding on primary and secondary navigation labels (or category and subcategory labels) based solely on keyword research is not optimal—it should be complemented by user testing and research or by using controlled and custom vocabularies. That is because the user intent is not always reflected in what they type in a search engine. That is also why estimating user intent by analyzing keywords or search queries isn’t easy.

Keyword or query research is a core concept for e-commerce websites because mapping keywords to the right type of content matters for both users and search engines. Discussing search engines and keywords outside the context of users is not the correct SEO approach.

In marketing, research means collecting all the raw data you will later use to perform an analysis. For keywords, research means collecting keyword data from different sources. Here are some data and metrics that you might collect:

  • The keywords or the search queries used by searchers on search engines.
  • Their associated search volumes.
  • The current rankings (keep in mind that rankings are difficult to measure accurately due to personalization and geo-location).
  • Competitiveness data, such as the average Domain Authority of the top 10 ranking domains.

You will collect the above keyword data directly from search engines or by using third parties.

Gathering keywords

Collecting the initial set of keywords is straightforward, but the number of potential sources is overwhelming. You can use the following:

  • Google’s Keyword Planner.
  • Google’s Display Planner.
  • Google’s autosuggest feature (crank it up with Ubersuggest or Keyword Snatcher).
  • Google and Bing’s related searches.
  • Bing’s Keyword Research feature within Bing Webmaster.
  • You can collect keywords by brainstorming with various internal departments or using your existing Google Ads campaigns.
  • You can also use Google Search Console data.
  • Social media sources (Twitter, Facebook, LinkedIn, etc.).

Other less-known sources for collecting keywords are:

  • Internal site search data using Google Analytics or other web analysis tools.
  • Voice-of-the-customer surveys and research.
  • User testing.
  • The anchor text of the natural links to your pages.
  • Competitor analysis.

Even though there are many keyword tools, the most extensive set of keywords and search queries and the most accurate search volumes can be extracted from pay-per-click (PPC) advertising platforms such as Google Ads.

I recommend collecting keyword data using an active Google Ads campaign rather than the data Google Ads provides without a live campaign. This is because when you run a live campaign, the Google Ads data goes beyond the keyword suggestions within the Keyword Planner. A live campaign can generate a handy list of long-tail keywords (use the Search Terms Report in Google Ads), and in my experience, that list is impossible to capture with any other tool.

Besides Text Ads, you should run Product Listing Ads in Google Ads (via the Merchant Center) and Dynamic Search Ads, and then use the Search Terms Report to get a fantastic number of relevant keywords. Many of those keywords will be long-tail keywords.

Figure 56 – It is easier to rank for a search query with more words (long-tail keyword) because the search query is usually less competitive.

Unfortunately, many marketers stop keyword research after collecting only quantitative data, such as search volumes. I call this the traditional keyword research approach.

In the digital marketing world, the following is a typical scenario:

“We identified that these keywords have the highest search volumes, so we should target them. We will change page titles, we will go with a 3% keyword density, and we will build a bunch of backlinks to the pages targeting those keywords”.

Alternatively, if the marketing person or agency is more knowledgeable, the scenario may sound like this:

“These keywords have a decent amount of traffic and have good conversion rates, as per your analytics data. They are competitive, and that is why we should optimize the internal linking and build backlinks to the most appropriate SEO’d pages”.

Yes, search volume research is necessary, but you need to go much deeper than this if you want to increase organic traffic. Search volumes are just the starting point. You also need to consider your users’ concerns, questions, and FUDs (fears, uncertainties, and doubts); all of these affect their purchasing decisions. Once you have identified those pain points, create content that attracts qualified organic traffic and generates sales.

Seasoned marketers call this concept Intent to Content.

Creating Personas

One of the best ways to map “intent-to-content” is by creating personas. A definition of persona is “a quasi-imaginary representation of your customer based on market research and real data about your existing customers. A persona includes demographics, behaviors, motivations, and goals”.

Ecommerce businesses, especially in the B2B space, need to go above and beyond and develop well-researched buyer personas to attract people in the early stages of the buying funnel. You must create and market content for every stage of the buying funnel.

Let’s say that you sell promotional products to businesses. Here’s what an oversimplified persona-creation process could look like:

Start by identifying the segments you need to market to and by giving them names, for example:

  • Vera, the Marketer.
  • Chris, the IT Geek.
  • Brad, the Economic Buyer.

You will focus on Vera, the Marketer, if you sell promotional products.

Creating Vera’s buyer profile should be comprehensive. A joint marketing and sales team should develop a Persona Questionnaire, with everyone involved in marketing and sales contributing based on their experience, knowledge, and online research of Vera. You can also interview existing customers who share Vera’s profile to find commonalities. The questionnaire should include questions such as:

  • Where does Vera go to read online?
  • Where does she go to ask for help?
  • What kind of wording does she use online?
  • What challenges does Vera face right now?
  • What are her goals?
  • What does her career path look like?
  • What motivates her to select a competitor?
  • How does she make decisions?

Additionally, you can collect and analyze public résumés to identify career paths for people involved in marketing decisions, like Vera. Here’s what the word cloud for marketing managers’ accountabilities may look like:

Figure 57 – Responsibilities for marketing managers.

Some of the most important facts you need to uncover about Vera are her pain points and how she makes decisions. You will reveal such data by engaging on websites where she goes to read, educate herself, or ask for help.

Once you have identified Vera’s pain points and top challenges related to your vertical (in our example, promotional products), bucket them into different content types and rank them based on the most severe problems. Then, prepare content to address each pain point or challenge (e.g., case studies, how-to’s, extensive guides, etc.).

Upper-funnel content can be targeted to raise awareness about a given challenge. For the mid-funnel, the content might be a guide on addressing the same challenge, with examples of how you helped other businesses deal with the problem. For those ready to make a vendor selection, the content can be a case study. However, none of these should be salesy; just excellent and useful information.

You will identify Vera’s most important problems, educate her, and prove you have the products she needs.

Note: usually, it is a bad idea to “gate” upper and mid-funnel content, for example, by asking for email and contact details to access the content.

The Intent-to-Content concept became more relevant and prominent after the Hummingbird algorithm update. This update focused on processing conversational queries, which are longer, question-like queries including modifiers such as how to, where is, or where can. The focus shifted away from the traditional word-parsing approach.

Another objective of Hummingbird was to match the user intent so that Google could provide answers rather than just search results. The intent-to-content concept became even more important after the introduction of the so-called “position zero”, also known as the quick answer box.

We need to discuss the buying funnel and the user intent to map keywords-to-intent and intent-to-content.

The buying funnel

In the US, e-commerce conversion rates are about 3%[2] because online retailers focus mostly on converting branded traffic and marketing to consumers in the late stages of the buying funnel. Also, ecommerce websites usually concentrate their link-building campaigns on category and subcategory anchor text.

Keyword research and web analytics tools, PPC data, and other similar sources provide insights on which keywords the target audience searches for, when, and where they search from. However, savvy online retailers must understand the searcher context and create a content plan accordingly. They try to answer the why question. User testing is one great way to gauge the why, but it has limits.

Users do not just turn their computers on, type in your website name, and buy products from you. First, they realize they have a need; next, they research online,[3] decide what’s right for them, and—only then—purchase.

This journey is called the buying funnel.

Figure 58 – The stages of the buying funnel.

In the image above, you can notice how keywords in the awareness stage are generic and broad. They gradually become more specific until they finally become the product the searcher wants to purchase.

Although the keyword categorization in this example may seem straightforward and logical, in practice, the keywords used by consumers will belong to multiple categories and be found in various buying stages. That is why you need not be too particular about where to bucket a keyword.
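A rough bucketing is still useful as a starting point. The Python sketch below sorts queries into approximate funnel stages using simple word rules; the modifier and brand lists are invented and should be rebuilt from your own keyword data:

```python
# Invented modifier and brand lists; build the real ones from your own keyword data
COMMERCIAL = {"buy", "coupon", "coupons", "discount", "sale", "deals"}
BRANDS = {"ridgeback"}

def funnel_stage(query):
    """Bucket a query into a rough buying-funnel stage using simple word rules."""
    words = query.lower().split()
    if set(words) & COMMERCIAL:
        return "decision"
    if set(words) & BRANDS or len(words) >= 3:
        return "research"
    return "awareness"

print(funnel_stage("commuter bikes"))              # awareness
print(funnel_stage("lightweight commuter bikes"))  # research
print(funnel_stage("ridgeback coupons"))           # decision
```

The rules will misclassify plenty of edge cases, which is fine: the goal is a workable first pass, not a perfect taxonomy.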

Here are some great insights from one study that mapped 40,000 PPC keywords to the buying funnel:[4]

  • Targeting only keywords in the Purchase and Decision stages of the buying funnel for ecommerce websites can, theoretically, lead to 79% less organic traffic.
  • The buying funnel is representative of actual online consumer behavior, at least at the individual query level (6. Discussion and Implications, p. 11).
  • Advertisers can use the model to organize separate campaigns targeting various consumer touch points (6.3. Practical Implications, p. 14).
  • The implication is clear: do not ignore Awareness key phrases (6.3. Practical Implications, p. 14).

The researchers analyzed data from about seven million keywords from a large retail chain having a brick-and-mortar and online presence:

Figure 59 – The stages of the buying funnel[5].

As you can see in this screenshot from the study, awareness and research keywords comprise almost 80% of the total keywords. The same research indicates high PPC advertising costs for the Awareness (25%) and Research (57%) stages. A staggering $4.6 million (57% of $8 million) could have been saved with a proper keyword-to-content strategy.

When you map keywords to buying stages and to user intent, and when you develop content accordingly, you will:

  • Generate content that attracts organic traffic.
  • Create content that can be linked to more easily.
  • Support pages in the vertical silos up to top-level categories.
  • Reduce advertising costs.

The awareness stage

This is the first stage in the buying funnel. Your customers realize that they have a need or a problem, and they start researching general information about what would help them fill that need or fix it. They want to know what products or services are available on the market.

For an ecommerce website, the queries that can be associated with the awareness stage are the broadest, most generic terms, such as department, category, or subcategory names (e.g., “commuter bikes”, “winter jackets”, “car racks”, “running shoes”, “cruise deals”, “diamonds”, and so on).

However, longer and natural language queries are also found at the awareness stage. For example, a searcher wants to know how to save on his daily commute. He starts by typing “best ways to save on commuting”, then reads an article about commuter bikes.

At this stage, consumers do not know yet what will address their needs and are still seeking information, so awareness queries usually contain neither brand nor specific full product names. They can include an action or a problem that needs to be solved—e.g., “removing wine stains”.

According to the same study cited earlier,[6] an awareness search query:

  • Does not contain a brand name.
  • Could contain the partial product name/type.
  • Could contain the problem to be solved.

At this stage, the user intent is mostly informational.

Tactics for this stage

The search queries associated with the awareness stage are what most e-commerce websites target, for example, head terms such as department, category, subcategory, or sub-subcategory names. These search terms are usually super-competitive, and realistically, ranking for such keywords will not happen unless your website has a significant amount of authority and reputation in the industry (including backlinks). You will also need a lot of great content that establishes your website theme and makes you the go-to resource for a subject matter or theme (e.g., wines).

Some tactics associated with attracting traffic for these keywords are:

  • Creating content such as community pages, how-to pieces, blog articles, educational content, and linking vertically from such content. You will need significant content and consistent linking to support category pages.
  • Siloing the website with directories and internal linking.
  • Building themed backlinks to the content-rich pages and articles.
  • Showcasing instructional videos by featuring them on your website and through social media.

The research stage

At this point, the consumer has identified the type of product or service that could help. The prospective customer can recognize brands in your industry or niche but has not yet decided on a definitive brand. The customer still needs to refine their knowledge before making a purchasing decision.

While the search queries are still broad, the consumer uses more specific terms, including keyword modifiers such as brand names or geo-locations, instead of generic searches.

The queries may look like “lightweight commuter bikes”, “insulated winter jackets”, “rear mounted car racks”, “cross training running shoes”, “European cruise deals”, or “4-carat wedding ring”. The queries can be subcategories, sub-subcategories, or product attributes.

Long-tail queries can be found at this stage as well. In our biking example, the consumer may type: “What are the best brands for commuter bikes”, “Which brand is more reliable”, “Compare {brand1} with {brand2}”, “what bike size do I need”, “what is the cost of an electric bike”, “how much will it cost to maintain a bike”, and so on.

At this stage, the user intent is still mostly informational, but transactional intent may be there, too.

Tactics for this stage

  • Write product reviews, product comparisons, and many articles to answer your target market’s questions.
  • Write extensive user guides (e.g., “How to Select an Electric Bike” or “How to Choose a Commuter Bike in 10 Easy Steps”). This type of content is an organic traffic driver and can potentially become a real link magnet. Then, promote this content socially and with influencers in your industry to generate buzz and, hopefully, backlinks. If you are a small or medium business focusing on a line of products or niche, there is some encouraging news for you. Because the authoritative websites you compete with on Google have a lot of inventory and themes to create content for, you may have an advantage if you are focused.
  • Create buyer personas to identify a) where the target market goes to read information and b) what questions they have. Once identified, group them into topics and write articles to answer the questions. To learn more about personas, read this leaked document about BestBuy’s buyer personas.[7]
  • Keyword-rich internal linking is also crucial at this stage. Internal linking is essential at all buying funnel stages, so make sure you cross-link from informational pages to category and product details pages.

The decision stage

Now, your prospective customer knows what solution is good for them. They will research the best store to buy from and will try to get the most value. Their logic and emotion will favor a particular brand. The prospective customer is much closer to making a purchase decision.

At this stage, the consumer has chosen a product and a brand but not the exact model number or version of the product. In our bicycle example, the searcher wants the Ridgeback brand and needs a commuter bike.

The Decision stage is where comparison shopping occurs, so search queries often include brand names and technical specifications. At this point, the queries will be more focused than in the previous two stages and can include keyword modifiers with very strong commercial intent, like “sale”, “discount”, “coupon”, “buy”, or “buy online”.

Going back to our example, the searchers’ keywords can be “ridgeback coupons”, “ridgeback commuter bikes deals”, “ridgeback free shipping”, “ridgeback bike size guide”, “ridgeback commuter bikes comparison”, and so on.

The user intent is mostly transactional, with some commercial intent. Some navigational queries may occur when consumers check the manufacturers’ websites directly.

Tactics for this stage

  • Ensure your website ranks for branded search queries such as “{brandname} reviews”.
  • Your website will claim the first positions if you have pages that target reviews-related keywords and if you build just a couple of good backlinks from external sites to those pages.
  • A dedicated template page for “{brandname} reviews” will allow you to publish all the reviews for any product.

Figure 60 – This online retailer has a Reviews and News template for each brand.

Here’s how the website above ranks for “ridgeback reviews”. They got the #1 and #2 positions:

Figure 61 – If you own the brand, your site should easily rank at the top, even without many backlinks.

Other tactics that you may consider are:

  • Distribute coupons to build links and brand awareness.
  • Write how-to content, user guides, and product comparison pages.
  • Optimize your brand pages and product descriptions to include reassurances, shipping estimates, refund policies, etc. Think in terms of optimizing your content for conversions rather than SEO.
  • Have a Promotions/Coupons/Reviews page targeting your brand terms.

Figure 62 – SERP results for “Macy’s coupons”.

In the image above, you can see the results for “Macy’s coupons”. This is a great keyword to rank for, and Macy’s is ranked #2 for its brand name plus “coupons”. By creating this Coupon page on their website, they are taking away traffic from coupon websites.

  • If you accept coupon codes at checkout, make no mistake: consumers will leave the process to find your coupons. Instead of allowing users to leave the checkout to find current promotions outside your website, use a pop-up window or open a page in a new tab to list your current promotions and coupon codes.
  • Create interactive tools for finding, comparing, or visualizing products (e.g., virtual eyewear, try before you buy tools, see the painting in your room, etc.).

The purchase stage

This is the stage at which consumers know exactly what they want to buy or at least the brand they want to buy from. The queries contain specific product names and the exact model number or version of the product (e.g., Ridgeback Meteor 14). The keywords are the most focused at this stage. These are probably easier keywords to classify because they often contain the product name or the brand name. For ecommerce websites, the landing pages most associated with these queries are the product detail pages.

At this stage, the user intent is mostly transactional, with some navigational intent (for example, typing “Amazon” in a search engine to buy a book or purchasing directly from the manufacturer’s website).

Tactics for this stage

  • Engage appropriate influencers for product reviews and send qualified traffic to your website.
  • Develop backlinks to product detail pages.
  • Optimize product detail pages to include detailed product specs, persuasive descriptions, great images, questions and answers, etc.
  • Offer coupons.

The purchasing stage is the last in our buying funnel model. Some marketers and sales professionals have gone into greater depth and broken it down into even more detailed steps. However, if you start breaking down the funnel into four stages and begin developing content based on these stages, you will see traffic and sales increase nicely over time.

Keep in mind that a purchasing decision is never linear. Prospective customers may start their journey in the middle or at the end of the funnel. Regardless of where the journey begins, you can capture consumers at any stage if your content is well-planned.

Knowing about the buying funnel stages is important for understanding another keyword research concept, the user intent.

The relationship between these two concepts is pretty tight. Usually, a consumer in the Awareness stage will use informational search queries, while a searcher in the Purchase stage will mainly use transactional keywords with strong commercial intent.

The user intent

Users try to accomplish something when they go to search engines and type queries. That something can be:

  • Finding a business that can be located either online or offline.
  • Getting more information about a product or a service.
  • Purchasing an item.

Searchers have a goal in mind, and that goal is called the user intent. Search engine users type in phrases representing their intents, and Google tries to match those intents with the most relevant results. If you understand this concept, you understand the importance of mapping keywords to intents and developing content accordingly.

Figure 63 – Three types of user intent keywords.

The specialty literature[8] breaks down the user intent into three categories:

  • Navigational – when searchers use a search engine to navigate to a specific website.
  • Informational – when searchers want to find content and info about a specific topic.
  • Transactional – when searchers want to engage in an activity, such as buying a product online, downloading or playing a game, seeing pictures, viewing a video, etc. Transactional intent does not necessarily involve a purchase.

Google’s guidelines for quality raters (the human evaluators[9] who assist with quality control of the SERPs) refer to the same categories as Navigation, Information, and Action.

When discussing user intent types, it is worth mentioning commercial intent. Commercial intent is an independent dimension that can apply to all three types of user intent, with transactional queries probably carrying a higher commercial intent than the other two. A Microsoft Research study found that 38% of the queries have commercial intent, and the rest are non-commercial.[10]

Figure 64 – Navigational and informational keywords can have commercial intent, too.

For example, when consumers want to buy a car, they will perform the research online, but they will seal the deal at a dealership. Their queries, whether informational, transactional, or navigational, will all carry some commercial intent because the final goal is to purchase a car.

Mapping keywords to intent is not an easy task. Even search engines cannot accurately classify user intent in general, let alone commercial intent. So, map keywords to user intent as best you can. As long as you start categorizing based on intent, you will begin generating ideas for content that matches the intent and is relevant to users. This is the best SEO approach to stand the test of continuous algorithm updates.

Below are some guidelines for classifying user intent, but remember that many keywords can be placed into multiple intent buckets.

Navigational intent queries, or the “go” queries

These are queries containing:

  • Companies, brands, organizations, or people’s names.
  • Parts or full domain names.
  • The words “website” or “web site”.

Navigational queries are the easiest to spot during keyword research. For this query type, ensure you appear at the top of the SERPs for your brand and domain name search queries. If you are not showing up at the top for such queries, you might have a more significant problem than mapping user intent. You might have a site-wide penalty.

Figure 65 – Best Buy pays for its brand name to appear on Google Ads. That is because they deemed the branded keywords very valuable.

If you sell someone else’s brands, it is not a good idea to put effort into ranking for keywords made of brand names only, because this means competing directly with the brand owners and their social media profiles. Overtaking them in rankings is not possible—unless the brand sucks in terms of SEO—and even then, it is going to require significant effort.

If you own the brand or are a manufacturer that sells your products, ensure that your website shows up for possible keywords containing your brand name and product names. For example, if you manufacture and sell computer RAM, your site should rank at the top for brand queries, including the products you sell (e.g., “Kingston 1Gb RAM” and “Kingston 1 Gb RAM”).

Informational intent queries, or the “know” queries

These are queries containing:

  • Question words (e.g., ways to, how to, what is, etc.).
  • Informational terms (e.g., list, top, playlist, etc.).
  • Anti-commercial queries (e.g., DIY, do-it-yourself, plans, tutorial, guide, etc.).
  • Words like instructions, information, specs.
  • Words like help, resources, FAQ.
  • A category or subcategories (e.g., digital cameras, raincoats, etc.).

If you have difficulty classifying keywords based on intent, one trick is to find the navigational and transactional intent queries first, then assume that the rest are informational.
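The elimination trick above can be sketched as a simple rule-based classifier. The hint lists below are abbreviated, illustrative examples drawn from this section, not exhaustive rules, and real keywords often fit multiple buckets:

```python
# Rule-of-thumb intent classifier: tag navigational and transactional
# queries first, then default the remainder to informational.
# The hint sets are illustrative samples only, not complete lists.
NAVIGATIONAL_HINTS = {"amazon", "best buy", ".com", "website"}   # brands, domains
TRANSACTIONAL_HINTS = {"buy", "purchase", "coupon", "deals", "discount",
                       "compare", "shop", "free shipping", "for sale"}

def classify_intent(query: str) -> str:
    q = query.lower()
    if any(hint in q for hint in NAVIGATIONAL_HINTS):
        return "navigational"
    if any(hint in q for hint in TRANSACTIONAL_HINTS):
        return "transactional"
    # Everything that is neither navigational nor transactional
    # is assumed informational, per the trick above.
    return "informational"

print(classify_intent("buy dell vostro 1700"))      # transactional
print(classify_intent("how to choose a mattress"))  # informational
```

A real classifier would need much larger dictionaries and a way to assign a single query to multiple buckets, but even this crude first pass is enough to start grouping a keyword list by intent.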

If you want to learn how Google teaches its search quality raters to classify the search queries, I recommend reading Google’s “Search Quality Rating Guidelines”, especially Section Two.

Informational intent is the type of intent ecommerce websites should start shifting their attention to because informational keywords provide the chance to get in front of the target market in the early stages of the buying funnel. The earlier your audience is exposed to your brand, the higher the chances of closing a sale.

The content that addresses informational queries encompasses all media types, such as text, video, audio, etc. It includes product descriptions, technical specs, expert reviews, infographics, instruct-o-graphics, blog posts, how-to guides, etc.

When creating content for this type of intent, your goal is not to sell your products but to position yourself as the authoritative source in your space. You need to become a publisher of reliable and useful content. Informational queries are perfect because they represent an excellent opportunity to increase brand awareness and show expertise.

The fact that 80% of search queries are informational[11] represents a massive opportunity for those who plan for long-term gains. These queries can be very generic, for example, head terms such as category and subcategory names (e.g., “cars” or “insurance brokers”), but long-tail keywords as well. For example, search queries like “What is the most fuel-efficient car on the market?” or “Life insurance brokers in New Westminster, BC” are informational. Note that either of these two example queries could also have a transactional intent.

To cover as many informational queries as possible, you must create educational content for consumers who are not yet ready to buy or do not even know what they need. Your goal is to provide searchers with content that answers their questions and fills their need for information. Also, your content must assist in nudging searchers further down the buying funnel.

Informational intent queries appear at the Awareness, Research, and Decision stages. You must gradually guide consumers towards more transactional content, eventually leading to conversions. After all, a macro-conversion (e.g., a web purchase) happens only at the end of several micro-conversions, such as reading an article about a problem, finding the right product, adding to the shopping cart, clicking on proceed to checkout, etc.

One way to check whether there is a disconnect between user intent and the content on your website is by looking at the ecommerce transactions (and conversion rates) for your keywords (if the keywords data is available):

Figure 66 – This Google Analytics report became almost useless after Google stripped the search query data from the referring URLs (now showing as “not provided”).

You can look at which pages or keywords perform poorly using Google Analytics. In this example, getting 3.3k organic visits from a single keyword and ending up with just one conversion indicates something is wrong. It may be that searchers land on an improper landing page, or that the landing page attracts the wrong keywords. It may also be related to conversion friction, such as your pricing being higher than your competitors’.

Another way of finding this disconnect is by analyzing the keywords’ bounce rate (whenever you can get this kind of information):

Figure 67 – Whenever you can identify a keyword that drove traffic to the website, look at its bounce rate.

A high SERP bounce rate is usually bad because it shows that searchers landed on your website and didn’t find what they expected. However, blog pages typically have a high bounce rate since visitors might find the answer they want in the article and then leave.

As you start looking at keywords through the user intent prism instead of just words and numbers, and try to solve high bounce rates, low conversion rates, and low transaction numbers, you will learn more about your visitors. This will help not only with organic traffic but with everything related to marketing and sales.

When analyzing the performance of the informational queries, remember that such keywords will most likely not convert on the first visit.

The transactional intent or the “do” queries

These are queries containing:

  • Calls to action (subscribe, purchase, pay, play, send, download, buy, listen, view, watch, find, get, compare, shop, search, sell, etc.).
  • Entertainment terms (pictures, movies, games, and so on).
  • Promotional terms (coupons, deals, discounts, for sale, quotes).
  • Complete product names.
  • Comparison terms (where to buy, prices, pricing, compare prices).
  • Terms related to shipping (next day shipping, same day shipping, and free shipping).

However, not all transactional queries contain verbs. For example, the “Dell Vostro 1700” search query can be transactional and informational because the user wants to read more or buy it. Also, transactional queries do not necessarily have to involve money or purchases. They reflect only the desire to perform some action on the Internet.

Transactional queries with commercial intent occur more frequently in the decision and purchasing stages. Such keywords should land visitors on category and product detail pages or landing pages built to funnel visitors to a page where a commercial transaction occurs (e.g., a product comparison tool or a finder tool).

Transactional queries are most likely to generate the highest return on investment (ROI) for pay-per-click campaigns, which is why their cost-per-click can be high. However, the ROI would be even better if you had previously “touched” the pay-per-click searcher with an organic result. Searchers landing on your website from the PPC ad might then recognize your brand, which can positively affect conversions.

A possible way to connect user intent with search queries is using surveys sourced from your organic traffic. You can implement a modal window or a pop-up that tracks the search query used by visitors. The downside is that search engines no longer pass search query data in the URL string, so you will get only a fraction of the queries.

However, when you can identify the keyword, trigger the modal window and ask a simple question, such as “What is the goal of your visit to our website today?”. Provide two possible options:

  1. I am shopping for something to buy now or soon.
  2. I am looking for more information about some products/services.

Mapping intent to content

Your content strategy should be built by understanding where in the buying funnel users are when they type a search query, mapping their intent, and bucketing each searcher into the right persona.

But why is user intent so important? Because an algorithm matches user intent with search queries: Hummingbird.

One of the metrics search engines use to measure the match between the user intent, the search query, and the perfect result is SERP user engagement.

However, there is an ultimate metric Google uses to quantify whether the content on a page matches the user intent behind a search query. This metric is called the Long Click.

The following is a quote from the book “In the Plex: How Google Thinks, Works, and Shapes Our Lives” and describes the long click:

“On the most basic level, Google could see how satisfied users were. To paraphrase Tolstoy, happy users were all the same. The best sign of their happiness was the “Long Click” — This occurred when someone went to a search result, ideally the top one, and did not return. That meant Google has successfully fulfilled the query”.

Let’s start the keyword mapping process.

“Tired Jamie” is a persona developed for the scenario in which a buyer wants to purchase a mattress.

Scenario: Jamie cannot sleep at night and wants to improve her sleep. She learns that old mattresses can cause poor sleep and decides it is time to buy a new one. She starts looking for information on choosing a mattress that can provide the best night’s sleep. She discovers a useful mattress finder tool that recommends foam mattresses based on her input. Next, she researches which brands sell foam mattresses and which have the best reviews. She finds Tempur-Pedic®, which seems to be a brand trusted by many people, so she investigates their various types of mattresses. Finally, she knows what she wants and is actively looking for that product.

First, do your best to map her keyword journey by sorting keywords from top to bottom based on the buying funnel.

Figure 68 – This is the beginning of the keyword mapping process.

In this example, Jamie starts with a broad search, “tossing all night,” then refines it to “how to improve my sleep.” After discovering that mattresses can cause poor sleep, she refines her search to “how to choose a mattress”. Once she finds the type of mattress that seems to solve her problem, she starts investigating “foam mattress brands”.

Once she finds a brand she trusts, she will look for their products by searching for “tempur pedic mattresses”; this search query contains the brand and the category of products. Finally, she looks for the specific product she intends to buy: “tempur pedic cloud luxe breeze”.

In the second column, tag each keyword with the most appropriate theme or silo it belongs to. For example, all keywords containing the word “mattress” will be bucketed in the Mattresses silo. “Tossing all night” and “How to improve your sleep” do not belong to a specific category of products, so you can assign them to a generic “resources” silo.

Next, map the keywords to the user intent while remembering that a query can have multiple intents. In our example, the first four keywords are informational, and the last two are transactional.

Then, add the type of content that fits the intent and the search query (e.g., for the search query “how to choose a mattress,” you can build a mattress finder tool).

Now, you need to add more details.

Figure 69 – We will keep adding more data to this table.

The URL represents the page you want to rank in the SERPs. This is the page you deem the most appropriate to rank with. In our example, for the keyword “tempur pedic cloud luxe breeze,” you will want to rank with this product page:


In the Anchor Text(s) column, you will list the internal anchor text used to link to the targeted URLs (this can also be used as anchor text for your backlinks).

Note: failing to establish contact and presence with consumers who perform informational queries is one cause of single-digit conversion rates. Too often, ecommerce websites try to sell too early.

If you develop content for all the buying stages and intents, you can land prospective customers on your website at the research stage. Then, gradually nudge them to the purchasing stage without having them exit your website to find answers from competitors. If one of your competitors becomes the trusted source of advice for that potential customer, you have lost the sale.

So, ensure you optimize the right pages for the right queries. You want to rank a page with informational or educational content if the query is informational. Likewise, if the query is transactional, you need to optimize and rank with pages that have transactional intent.


Keyword prioritization

Keyword prioritization is difficult because:

  • You need to consider many different metrics.
  • The SERP ranking factors are not publicly available.
  • Several metrics, such as competitiveness, come from third-party sources (not directly from the search engines).

Therefore, any keyword evaluation model based on ranking factors and competitiveness metrics is subjective. Prioritization methodologies are usually based on keyword difficulty, search volumes, business goals, profits, margins, conversion rates, or a combination of such metrics.

One lesser-known keyword prioritization method is based on the revenue opportunity of forecasted rankings. This evaluation model assigns a monetary value to each of the top 10 ranking positions using average SERP CTRs.

Note that this model is meant only as a tool to help you identify the lowest-hanging opportunities.

Figure 70 – We’re adding search and business metrics to the process.

In this table, you can get the Search Volume data with Keyword Planner, the Current Ranking data with the ranking tracking tool of your choice, and the Organic Visits data using your web analytics tool. The Revenue data is also collected from your web analytics tool. You will generate the Per Visit Value by dividing Revenue by Organic Visits.

I excluded metrics such as the conversion rate or the number of conversions on purpose. That is because ecommerce websites have multiple micro-conversions and macro-conversions (e.g., a web sale, a newsletter subscription, reaching a critical page, submitting a form, etc.), and this evaluation method is based solely on revenue, not conversion rates.

If you want to dig into details and evaluate based on each type of conversion (for example, prioritize keywords that generate more email subscriptions), then use only the newsletter revenue data for each keyword and prioritize accordingly.

The columns “Rev. if ranked 1…10” represent the revenue opportunity for various positions if you rank organically at those positions. Looking at the “Rev. if ranked #1” column, you can see that although the keyword “tempur pedic cloud luxe breeze” is a transactional keyword and has the highest Per Visit Value ($25), it is not the keyword with the highest potential to increase revenue. That keyword would be “how to choose a mattress”.

Figure 71 – SERP CTRs from Optify.

For this forecasting method, I used the organic SERP CTRs based on research done by Optify.[12]
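The forecast behind the “Rev. if ranked 1…10” columns can be sketched in a few lines of code. The CTR values below are rough placeholders for illustration, not the exact Optify figures, and the keyword metrics are hypothetical; substitute the CTR study and analytics data you trust:

```python
# Revenue opportunity per SERP position:
#   forecast = search volume x CTR(position) x per-visit value
# CTR values are assumed placeholders, NOT the actual Optify numbers.
SERP_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05,
            6: 0.041, 7: 0.035, 8: 0.030, 9: 0.028, 10: 0.026}

def per_visit_value(revenue: float, organic_visits: int) -> float:
    """Revenue divided by organic visits, as described in the table setup."""
    return revenue / organic_visits

def revenue_if_ranked(search_volume: int, pvv: float, position: int) -> float:
    """Forecasted monthly revenue if the keyword ranked at `position`."""
    return search_volume * SERP_CTR[position] * pvv

# Hypothetical keyword: 8,100 monthly searches, $500 revenue from 200 visits
pvv = per_visit_value(500, 200)                  # $2.50 per visit
print(round(revenue_if_ranked(8100, pvv, 1)))    # 6075
```

Running this per keyword and per position reproduces the spreadsheet columns and makes it easy to re-rank the list whenever the analytics data changes.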

The next step is adding keyword competitiveness data. There are a few different methods for assessing the competitiveness of a keyword. The easier ones are:

  • The average Domain Authority (DA) or Page Authority (PA) of the top 10 ranking pages and root domains. Note that the table below also includes the average PageRank, but this data is no longer publicly available.
  • The keyword difficulty score generated by Moz. (The CI index comes from serpIQ, now defunct.)

Figure 72 – We’re adding competitiveness data.

Now that you have some quantitative information, you can slice and dice the keyword data any way you like. I suggest analyzing data in sets or themes only. If you mix keywords related to mattresses with dresser keywords, your data will be skewed.

Also, it is important to balance the forecasted revenue against the costs of obtaining the rankings needed to achieve that revenue. Remember that you will need to produce content, promote it through various marketing channels, and build backlinks. All these actions have a cost.

Figure 73 – We’re adding costs related to producing and promoting content.

Content Creation Cost estimates how much it will cost to create the content necessary to promote the keyword. Each keyword might have a different cost depending on the type of content you need to create; creating an article is less expensive than creating a video, which is less costly than creating an interactive tool or a mobile app. The Cost per Link estimates how much it will cost to build one link to that content.

This is the Costs formula:

Costs = content creation cost + (cost per link * (average DA / 10) * 2)

The number 2 in this formula is a cost coefficient tied to your domain authority. The lower the domain authority, the higher the coefficient. You can use the following brackets as guidelines for adjusting the coefficient based on your DA:

  • DA 0–20, coefficient=5
  • DA 21–40, coefficient=4
  • DA 41–60, coefficient=3
  • DA 61–80, coefficient=2
  • DA 81–100, coefficient=1

The Costs formula indicates that the lower your DA, the more links you need to build to achieve first-page rankings. For example, for the keyword “tempur pedic mattresses”, if your website DA is 65, the coefficient is 2, and the Costs formula is:

Costs = $250 + ($200 * (45/10) * 2) = $250 + ($200 * 9) = $2,050, where nine is the number of good-quality links you will need to build.
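As a minimal sketch, the Costs formula and the DA brackets above translate to code like this (all dollar amounts are the estimates from the example):

```python
def da_coefficient(own_da: int) -> int:
    """Cost coefficient from the DA brackets above: the lower your DA,
    the higher the coefficient (and the more links you need)."""
    if own_da <= 20:
        return 5
    if own_da <= 40:
        return 4
    if own_da <= 60:
        return 3
    if own_da <= 80:
        return 2
    return 1

def keyword_costs(content_cost: float, cost_per_link: float,
                  avg_serp_da: float, own_da: int) -> float:
    """Costs = content creation cost
             + cost per link * (average SERP DA / 10) * coefficient."""
    links_needed = (avg_serp_da / 10) * da_coefficient(own_da)
    return content_cost + cost_per_link * links_needed

# "tempur pedic mattresses": $250 content, $200 per link,
# average SERP DA of 45, own DA of 65 (coefficient 2, so 9 links)
print(keyword_costs(250, 200, 45, 65))  # 2050.0
```

Wrapping the formula in a function makes it trivial to recompute the whole keyword table whenever your DA or link costs change.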

You can download the Excel file containing the example formulas here.

Returning to user intent, remember that search engines aim to provide straight answers for search queries and the best possible results for keyword searches. If search engines fail at this, they will lose users, market share, and advertising revenue. It is, therefore, crucial for search engines to identify user intent as best as they can. Remember, Google changed its algorithm to focus on this with the release of Hummingbird. Microsoft used to have a publicly available commercial intent detector tool, but unfortunately, it has been discontinued. So, you will have to classify the results manually.

Whenever you doubt how search engines map keywords to intent, use Google’s help to assess what type of pages it returns for a specific keyword. First, log out of all your Google accounts. Then, clean out the browser cookies, open an incognito session, and type in the keyword you want to research.

For example, look at the SERP for the “digital camera” keyword. (see screenshot below). For this keyword, seven listings are informational or educational resources (non-commercial intent such as reviews, news, images, tips, wiki, etc.), and four are online retailers (commercial intent). Keep in mind that I counted image results as just one single result. If you sell digital cameras and want to rank for this keyword, you must create great educational resources on your website and promote them heavily on both your and external websites.

Given the number of informational results for “digital camera”, it seems Google does not assign a strong commercial intent to this keyword. Then why do ecommerce websites try to rank their Digital Cameras category URL rather than a Digital Camera page dedicated to educational content and tools? Wouldn’t a category page create a disconnect between the user intent and the content on that page?

Figure 74 – SERP for “digital camera”.

Imagine you walk into a store to get information about which digital camera best suits your needs, only to encounter a pushy salesperson who tries to sell you the items they want to sell rather than what you think you need. You will probably thank them nicely and leave without buying. The same applies to online experiences; if searchers land on a page that does not fit their intent, they will bounce.

Creating content based on keyword research must address the possible buyers of your products and those who will link to your content. That is because most people who buy from you will not link to a product or category page. Customers may share the purchase socially, but back-linking will happen only from people who believe the content they link to is valuable. Buyers think about the value they get by buying the product from you, but those who link to you think about the value they offer to their audience.

Product attributes and keyword variations

Online retailers often sell products with similar attributes and would like to rank for many keywords and product variants. For example, you sell a red, blue, and green sweater in three different sizes (extra-large, large, and small). This matrix will generate nine product variants: small red sweater, large red sweater, extra-large red sweater, small blue sweater, large blue sweater, extra-large blue sweater, small green sweater, large green sweater, and extra-large green sweater.
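The variant matrix is simply the Cartesian product of the attribute values, which can be enumerated like this:

```python
from itertools import product

# Attribute values from the sweater example above
colors = ["red", "blue", "green"]
sizes = ["small", "large", "extra-large"]

# Every (color, size) combination yields one product variant
variants = [f"{size} {color} sweater" for color, size in product(colors, sizes)]

print(len(variants))  # 9 variants from the 3 x 3 attribute matrix
print(variants[0])    # small red sweater
```

Add a third attribute (say, three materials) and the count multiplies to 27, which is why generating unique content per variant quickly becomes impractical.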

In the section on product detail pages, we will discuss how you should approach product variants. For now, let’s say that creating unique product descriptions for each variant may not be the best idea unless you have a large budget for content creation. Instead, handle product variants in the interface without reloading the entire page or changing the URL. To achieve this, you can use dropdowns to allow users to pick a color and AJAX to load the content specific to that color variant.

If you already have unique URLs for product variants, choose a canonical URL and point all product variant URLs to it (be careful what you choose as the canonical version). However, if the URLs are clean (i.e., they do not have too many URL parameters), do not change the URL structure without consulting an SEO expert.

Figure 75 – A simple decision chart inspired by MOZ.[13]

Keyword strategies

This section will discuss a few less-talked-about keyword strategies for e-commerce websites.

Target low-hanging fruit

When you run an ecommerce website, the number of keywords you want to rank for is enormous, so it is not economically feasible to target all of them with link-building campaigns. You can rank organically for many long-tail keywords by supporting them with content. Other keywords (usually the more competitive terms such as category and subcategory names) will only rank if you support them with content-rich website sections and links from external websites.

An often-overlooked keyword strategy is to focus on keywords ranking on page 2, especially those between positions 11 and 15. Moving a keyword from the second page to the first is usually easier than moving the same keyword four spots up, from 5 to 1. Similarly, moving from position 21 to 17 (also four positions up) will not generate a substantial increase in visits.

Let’s illustrate this concept with a keyword with 1.2 million monthly searches, “wedding dresses”. Moving this keyword from position 11, where it gets less than 2.6% of the clicks (about 31,000 visits), to position 6, where it gets 4.1% of the clicks (about 49,000 visits), represents a traffic increase of roughly 58%. Moving the same keyword from position 21 to 16 would generate only a minimal rise in visits.
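Using the click-through rates quoted in this example (2.6% of clicks at position 11 and 4.1% at position 6), the traffic math can be checked in a couple of lines:

```python
def visits(monthly_searches: int, ctr: float) -> float:
    """Estimated monthly organic visits at a given SERP CTR."""
    return monthly_searches * ctr

# Search volume and CTRs from the "wedding dresses" example above
searches = 1_200_000
before = visits(searches, 0.026)  # position 11
after = visits(searches, 0.041)   # position 6

# Relative traffic gain from the page-2 -> page-1 move, in percent
print(round((after / before - 1) * 100))
```

The same two-liner applied across your rank-tracking export quickly surfaces which page-2 keywords offer the biggest payoff per position gained.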

Figure 76 – Keywords ranking at the top of page 2 can be a good target for link development

The idea is that by building links to keywords ranking on the second page, you gradually increase your website’s authority and, at the same time, generate more organic traffic. These links will eventually support the link-building campaign for more competitive terms.

Of course, you should not focus solely on keywords ranking on the second page. A thorough analysis will identify keywords that rank on the first page and do not have much competition—it also makes sense to target those.

Target holidays and retail days search queries
Holidays such as Christmas, Hanukkah, Thanksgiving, Easter, and major retail events such as Back to School, Halloween, Cyber Monday, Black Friday, and Boxing Day represent significant traffic and revenue opportunities for all e-commerce websites and online retailers. Shoppers are more open to spending during holidays. However, their search patterns change around these special shopping days.

The ecommerce shopping days calendar created by Shopify[14] shows no month without a major shopping event. Promotions change rapidly, and shoppers shift their search queries very quickly. Smart ecommerce websites have to adapt to and capitalize on such shifts. However, many websites do not have the agile SEO capabilities to take advantage of shopping day opportunities.

Here are some common mistakes made by online retailers regarding targeting shopping events:

  • Not updating page titles, descriptions, and headings to include event-related modifiers at all.
  • Updating page titles, descriptions, and headings only a few days before the event. This is too late, both from a business point of view and for SEO. Google Insights research[15] suggests that Black Friday searches can start as early as July.

Figure 77 – 30% of shoppers plan their Christmas shopping list before Halloween.

  • Creating year-specific pages (e.g., Christmas 2018) and removing them without proper redirects once the holiday or event ends.
  • Not planning a “flash” link building campaign to target event-specific modifiers. Regarding link building, a flash campaign means two to three months in advance.
  • Not targeting last-minute buyers by adding “free” or “next-day shipping” in the page titles.

Figure 78 – 55% of consumers expect free shipping.[16]

Additionally, very few ecommerce websites create content (e.g., guides, ideas, how-tos) specifically targeting such retail dates. That is a shame, since this type of content can capture potential customers during their research stage, when they use informational search queries like “Halloween costume ideas”, “Christmas gift guides” or “Easter egg decorating pictures”.

Use keyword modifiers to update titles, descriptions, and headings
The way consumers search online before and during shopping events differs from how they do so the rest of the year. They add keyword modifiers to their usual search queries to better define their intent. Event modifiers are words like “Christmas”, “Cyber Monday” or “Boxing Day”, but also “same day shipping”, “next day shipping” and even “gifts”.

Look at the spikes in search volume associated with the “next day shipping” keyword modifier. The peaks reach the maximum a few days before Christmas. You should be fast enough to capitalize on this search pattern change.

Figure 79 – Shipping-related queries increase significantly around Christmas.

Figure 80 – Adding “same-day shipping” or “next-day shipping” to your titles by December 15 may prove wise.

Let’s say you want to capitalize on searches that contain the keyword “Christmas”. Add the word “Christmas” to the titles of category or product detail pages immediately after Cyber Monday is over. You can also consider altering meta descriptions and page content. Be sure to check the rankings associated with these pages a couple of days after making the updates (and regularly after that) to see whether there is a traffic drop. You can expect some fluctuations, but as you get closer to Christmas, you should see an increase in rankings and traffic.

If there is a drop, revert to the usual titles. If there is an increase, change all the titles for the category, subcategory, and product detail pages. As you get closer to Christmas (e.g., December 15), change the title to “Free same-day Christmas shipping” since “free shipping” tops the list of the strongest incentives for visitors to buy goods online.

Get any page crawled and indexed by Google in less than one minute
Use the Fetch as Google feature in the old Google Search Console to achieve this. In the new Search Console, use the URL Inspection tool: test the live URL, then request indexing.

Figure 81 – Once you hit the Fetch button, Googlebot will crawl the submitted URL. If the page passes Google’s filters, it will be indexed in minutes.

Figure 82 – The number of fetch requests in Google Search Console is limited, so use your quota wisely.

Once Christmas ends, change the titles to target the next Holiday, e.g., Boxing Day. If there is a gap of more than three to four weeks between shopping events, you can default to the usual titles.

You may want to use an automated system that allows event-specific titles, descriptions, and headings to be updated on specific dates. If that is not possible, then at least set up calendar reminders a month before the less important shopping events and two months before the most important ones. You can refer to this article[17] for consumer trend data and the importance of each consumer holiday.
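One possible shape for such an automated system is a simple date-driven title lookup. The dates, copy, and store name below are hypothetical examples, not recommendations:

```python
from datetime import date

# (start, end, title template) -- dates and copy are made-up examples
SEASONAL_TITLES = [
    (date(2018, 11, 23), date(2018, 12, 14), "Christmas Deals on {category}"),
    (date(2018, 12, 15), date(2018, 12, 24),
     "Free Same-Day Christmas Shipping on {category}"),
]
DEFAULT_TITLE = "{category} | Example Store"

def page_title(category: str, today: date) -> str:
    """Return the event-specific title if today falls inside an event
    window; otherwise fall back to the default title."""
    for start, end, template in SEASONAL_TITLES:
        if start <= today <= end:
            return template.format(category=category)
    return DEFAULT_TITLE.format(category=category)

print(page_title("Mattresses", date(2018, 12, 20)))
# Free Same-Day Christmas Shipping on Mattresses
```

Wired into the CMS template layer, a lookup like this swaps titles (and, by extension, descriptions and headings) on schedule, with no last-minute manual edits.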

Create holiday-specific landing pages
To drive targeted traffic, marketers create holiday- or promotion-specific landing pages for PPC, email, and catalog campaigns. During the year, they will create pages for “Christmas 2018 Promotion”, “Father’s Day Specials” or “Valentine’s Day Two-for-One Deals”. You have probably noticed this implemented by big brands or by small but smart competitors.

By creating these landing pages, which are visually themed per the event they target, marketers make ecommerce websites more attractive to visitors. These pages can get natural links from deals or coupon websites if your brand is recognizable or you push the pages with an outreach campaign.

Usually, when e-commerce websites use specific event or holiday landing pages, they publish them on dedicated URLs. However, improper redirect handling (no 301 redirects, or 301 redirects to the wrong pages) may lead to PageRank loss once the event ends and the pages are removed from the website.

Here are a couple of tips if you use separate URLs for holiday or shopping events:

  • Do not include years or any other time or date indicators in URLs. Including time indicators in page titles, descriptions, headings, and main content is okay.
  • When the event is over, redirect the event pages to the most appropriate website sections or keep the URLs alive (but with changed content).
  • You can “revive” the promotion-specific URLs a few weeks before each event in the following years.
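The redirect rule from the second tip can be expressed as a simple lookup; the paths below are made-up examples:

```python
# Hypothetical mapping from retired event pages to their closest
# permanent category pages.
EVENT_REDIRECTS = {
    "/christmas-promotion/": "/skin-care/",
    "/fathers-day-specials/": "/gifts-for-him/",
}

def redirect_for(path: str):
    """Return (status, location) for a retired event URL, or None if the
    page should stay live with changed content."""
    target = EVENT_REDIRECTS.get(path)
    if target is None:
        return None
    return (301, target)  # a permanent 301 preserves most PageRank
```

When you “revive” a URL the following year, simply remove its entry from the map so the page responds with 200 again.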

A mixed approach
Whenever possible, I like to implement another tactic: updating titles, descriptions, and headings while customizing the look and feel of the existing landing pages. Instead of having separate URLs for each event, your current landing pages (for example, your category pages) will become the event landing page.

Choose the most important categories on your website or those promoted during a specific consumer holiday and customize their look and feel to match the event. Customization can be as simple as displaying a banner at the top of the main content area or adding a background image for the entire page, or it can be as complex as creating an entirely new event-themed layout.

You will not publish this new layout under a different URL. Instead, this themed look, feel, and messaging will be released on the regular category page URLs. For example, your Christmas 2018 promotion includes a 25% discount on all Cleansing products. Rather than creating a dedicated holiday URL, you will use the category's usual URL; the page will simply be themed with a Christmas look and feel.

The main benefit of customizing the existing category pages for shopping events is that you can build backlinks to category pages more easily. Another benefit is that there will be no future redirect headaches. Also, if other websites want to link to your promotions, they will link to your category pages.

Once the holiday ends, return to the usual, non-themed layout. You will also have to update titles, descriptions, and page copy (a bit).

Tip: If your website gets image-rich snippets, you can “theme” the image thumbnails with an event or holiday-specific icon. Once the holiday/event is over, revert to the usual images. For instance, if you sell cameras, instead of this video thumbnail:

Figure 83 – Video listing in search results.

Use a themed image:

Figure 84 – Personalizing the video thumbnails can lead to a better CTR.

Optimize for “gift card” related keywords

Last-minute shoppers often buy e-gift cards instead of real products to avoid shipping delays. If you are still debating using gift cards, consider the following:

  • According to Giftango Corp, 26.7% of the gift cards sold during December 2011 were sold between December 21 and 24.[18]
  • 57.3% of shoppers planned to buy a gift card in 2011.[19]
  • Gift cards were the most requested gift in 2012, with 59.8% of US shoppers wanting one.[20]
  • E-gift cards reach their recipients instantly (no delays, no shipping, and no hassles).

It makes sense to offer both e-gift cards (perfect for last-minute shoppers) and physical gift cards (great for those who do not know what to buy as gifts).

Target long-tail keywords
For ecommerce websites (especially those new on the market), it is more viable to start by targeting long-tail search queries and gradually progress towards more competitive head terms. Usually, carefully chosen torso and long-tail search queries tend to generate more qualified traffic and have less competition. However, keywords containing brand names may prove as competitive as head terms.

Targeting search queries that assist with conversion is a good tactic. Often, such search queries require content (interactive tools, comprehensive guides, etc.), but think of this content as a long-term investment. For example, targeting the search query “how to choose a digital camera” may require creating a camera finder tool. If you target “how to choose shaving cream,” you must create an extensive (possibly interactive) and visually appealing resource specifically for that.

Here are just a few benefits of targeting long-tail keywords:

  • You will see organic search results faster.
  • It helps gather insights about your customers.
  • It assists with improved paid search results (through better Quality Scores).

Figure 85 – This is the SERP for the query “how to choose shaving cream”. None of the top 10 results has a product finder or a product wizard. If you are in this niche, that is your opportunity.

In addition to focusing on long-tail search queries, you may want to avoid targeting head terms with very vague user intent. For example, let’s say you sell greeting cards. Would ranking for a keyword like “greeting” or “cards” be useful? No, because you cannot identify the user intent behind these keywords. You will invest a lot to brand your business for those terms, and you will get a ton of traffic if you rank at the top, but generic terms generate very few conversions at a very high cost per conversion. Instead, you can start targeting keywords like “40th birthday greeting cards for dads”, perhaps on a blog post or, if it is a popular search query, with a content-rich subcategory page.

As you can see, keyword research is far from simple or fast. It is a process that cannot be fully automated, and human review is irreplaceable, especially when bucketing keywords for relevance to your business. After going through this section, you hopefully understand that performing keyword research without considering user intent is a mistake.

During the next sections, we will find that keywords are part of almost every on-page SEO factor, from page titles to URLs, internal anchor text, and product copy. However, for search engines to find and analyze keywords, they must first find and reach the pages where those keywords are featured. Since ecommerce websites are a challenging crawling task for search engines, you must optimize how search bots discover relevant URLs. This process is called crawl optimization, and it is the subject of the next section of the guide.



Crawl Optimization

Length: 6,918 words

Estimated reading time: 50 minutes



Crawl optimization aims to help search engines discover URLs efficiently. Relevant pages should be easy to reach, while less important pages should not waste the so-called “crawl budget” and should not create crawl traps. The crawl budget is defined as the number of URLs search engines can and want to crawl.

Search engines assign a crawl budget to each website, depending on the authority of the website. Generally, a site's authority is roughly proportional to its PageRank.

The crawl budget concept is essential for e-commerce websites because they usually comprise a vast number of URLs—from tens of thousands to millions.

Suppose the technical architecture puts the search engine crawlers (robots, bots, or spiders) in infinite loops or traps. In that case, the crawl budget will be wasted on unimportant pages for users or search engines, which may leave important pages out of search engines’ indices.

Additionally, crawl optimization is where very large websites can take advantage of the opportunity to have more critical pages indexed and low PageRank pages crawled more frequently.[1]

The number of URLs Google can index increased dramatically after introducing their Percolator[2] architecture (with the “Caffeine” update[3] ). However, it is still important to check what resources search engine bots request on your website and to prioritize crawling accordingly.

Before we begin, it is important to understand that crawling and indexing are different processes. Crawling means just fetching files from websites. Indexing means analyzing the files and deciding whether they are worthy of inclusion. So, even if search engines crawl a page, they will not necessarily index it.

Crawling is influenced by several factors, such as the website’s structure, internal linking, domain authority, URL accessibility, content freshness, update frequency, and the crawl rate settings in webmaster tools accounts.

Before detailing these factors, let’s discuss tracking and monitoring search engine bots.

Tracking and monitoring bots

Googlebot, Yahoo! Slurp, and Bingbot are polite bots,[4] which means that they obey the crawling directives found in robots.txt files before requesting resources from your website. Polite bots identify themselves to the web server, so you can control them. The requests made by bots are stored in your log files and are available for analysis.

Webmaster tools, such as the ones provided by Google and Bing, only uncover a small part of what bots do on your website—e.g., how many pages they crawl or bandwidth usage data. That is useful in some ways but is not enough.

For really useful insights, you have to analyze the traffic log files. From there, you can extract information that can help identify large-scale issues.

Log file analysis was traditionally performed using the grep command line with regular expressions. But lately, there have also been desktop and web-based solutions that will make this geek analysis easier and more accessible to marketers.

On ecommerce websites, monthly log files are usually huge—gigabytes or even terabytes of data. However, you do not need all the data inside the log files to be able to track and monitor search engine bots. You only need the lines generated by bot requests. This way, you can significantly reduce the size of the log files from gigabytes to megabytes.

Using the following Linux command line (case sensitive) will extract just the lines containing “Googlebot” from one log file (access_log.processed) to another (googlebot.log):
grep "Googlebot" access_log.processed > googlebot.log

To extract similar data for Bing and other search engines, replace “Googlebot” with other bot names.

Figure 86 – The log file was reduced from 162.5Mb to 1.4Mb.

Open the bot-specific log file with Excel, go to Data –> Text to Columns, and use Delimited with Space to enter the log file data into a table format like this one:

Figure 87 – Filtering by the status column gives a list of all 404 Not Found errors encountered by Googlebot.

Note: you can import only up to one million rows in Excel; if you need to import more, use MS Access or Notepad++.
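If your logs are too large even for MS Access, a short script produces the same breakdowns. A minimal sketch, assuming the common combined log format (field positions may differ on your server; the sample lines are made up):

```python
from collections import Counter

# Two sample lines in combined log format (shortened for illustration).
sample_log = [
    '66.249.66.1 - - [10/Dec/2018:03:15:02 +0000] "GET /bracelets/gold-bangle HTTP/1.1" 404 512 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Dec/2018:03:16:10 +0000] "GET /rings/silver-band HTTP/1.1" 200 8192 "-" "Googlebot/2.1"',
]

status_counts = Counter()   # crawl hits per response code (Figure 87)
dir_counts = Counter()      # crawl hits per top-level directory (Figure 88)
not_found = []              # URLs Googlebot hit that returned 404

for line in sample_log:
    parts = line.split()
    path, status = parts[6], parts[8]  # request path and response code
    status_counts[status] += 1
    top_dir = "/" + path.strip("/").split("/")[0]  # e.g. /bracelets
    dir_counts[top_dir] += 1
    if status == "404":
        not_found.append(path)
```

In practice you would iterate over the googlebot.log file produced by the grep command above instead of the inline sample, and chart dir_counts per category.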

To quickly identify crawling issues at category page levels, chart the Googlebot hits for each category. This is where the advantage of category-based navigation and URL structure comes in handy.

Figure 88 – The /bracelets/ directory needs some investigation because there are too few bot requests compared to the other directories.

By pivoting the log file data by URLs and crawl date, you can identify content that gets crawled less often:

Figure 89 – The dates the URLs have been fetched.

This pivot table shows that although the three URLs are positioned at the same level in the hierarchy, URL number three gets crawled much more often than the other two. This is a sign that URL #3 is deemed more important.

Figure 90 – More external backlinks and social media mentions may increase crawl frequency.

Here are some issues and ideas you should consider when analyzing bot behavior using log files:

  • Analyze server response errors and identify what generates those errors.
  • Discover unnecessarily crawled pages and crawling traps.
  • Correlate days since the last crawl with rankings; when you make changes on a page, make sure it gets re-crawled, otherwise the updates will not be considered for rankings.
  • Discover whether products listed at the top of listings are crawled more often than products listed on component pages (paginated listings). Consider moving the most important products onto the first page rather than leaving them on component pages.
  • Check the frequency and depth of the crawl.

The goal of tracking bots is to:

  • Establish where the crawl budget is used.
  • Identify unnecessary requests (e.g., “Write a Review” links that open pages with identical content except for the product name).
  • Fix the leaks.

Instead of wasting budget on unwanted URLs (e.g., duplicate content URLs), focus on sending crawlers to pages that matter to you and your users.

Another useful application of log files is to evaluate the quality of backlinks. Rent links from various external websites and point them at pages with no other backlinks (product detail pages or pages that support product detail pages). Then, analyze the spider activity on those pages. If the crawl frequency increases, that link is more valuable than a link that does not increase spider activity. An increase in crawling frequency on your pages suggests that the linking page also gets crawled often, which means it has good authority. Once you have identified good opportunities, work to get natural links from those websites.

Flat website structure

Suppose there are no other technical impediments to crawling large websites (e.g., crawlable facets or infinite spaces[5]). In that case, a flat website architecture can help crawling by allowing search engines to reach deep pages in very few hops, therefore using the crawl budget very efficiently.

Pagination—specifically, de-pagination—is one way to flatten your website architecture. We will discuss pagination later in the Listing Pages section.

For more information on flat website architecture, please refer to the section titled The Concept of Flat Architecture in the Site Architecture section.


URL accessibility

I will refer to accessibility in terms of optimization for search engines rather than optimization for users.

Accessibility is probably a critical factor for crawling. Your crawl budget is dictated by how the server responds to bot traffic. If your website’s technical architecture makes it impossible for search engine bots to access URLs, then those URLs will not be indexed. URLs already indexed but not accessible after a few unsuccessful attempts may be removed from search engine indices. Google crawls new websites at a low rate, then gradually increases to a level that does not create accessibility issues for your users or your server.

So, what prevents URLs and content from being accessible?

DNS and connectivity issues
Use a DNS checking tool to look for DNS issues. Everything flagged in red and yellow needs your attention (even if it is an MX record).

Figure 91 – A sample report from a DNS checking tool.

Using Google and Bing webmaster accounts, fix all the issues related to DNS and connectivity:

Figure 92 – Bing’s Crawl Information report.

Figure 93 – Google’s Site Errors report in the old GSC.[6]

One DNS issue you may want to pay attention to is related to wildcard DNS records, which means the web server responds with a 200 OK code for any subdomain request, even for ones that do not exist. Unrecognizable hostnames are an even more severe DNS-related problem (the DNS lookup fails when trying to resolve the domain name.)

One large retailer had another misconfiguration: its US (.com) and UK (.co.uk) country code top-level domains (ccTLDs) resolved to the same IP. If you have multiple ccTLDs, host them on different IPs (ideally from within the country you target with each ccTLD), and check how the domain names resolve.

If your web servers are down, no one can access the website (including search engine bots). Server tools like Monitor.Us, Scoutt, or Site24x7 can help you monitor your site’s availability.

Host load
Host load represents the maximum number of simultaneous connections a web server can handle. Every page load request from Googlebot, Yahoo! Slurp, or Bingbot generates a connection with your web server. Since search engines use distributed crawling from multiple machines simultaneously, you can theoretically reach the limits of the connections, and your website will crash (especially if you are on a shared hosting plan).

Use load-testing tools to check how many connections your website can handle. But be careful: your site can become unavailable or even crash during such tests.

Figure 94 – If your website loads in under two seconds under heavy visitor load, you should be fine. The graph was generated by a load-testing tool.

Page load time
Page load time is not only a crawling factor but also a ranking and usability factor. Amazon reportedly increased its revenue by 1% for every 100ms of load time improvement,[7] and Shopzilla increased revenue by seven to 12% by decreasing the page load time by five seconds.[8]

There are plenty of articles about page load speed optimization, and they can get pretty technical. Here are a few pointers to summarize how you can optimize load times:

  • Defer loading of images until needed for display in the browser.
  • Use CSS sprites.
  • Use http2 protocols.

Figure 95 – Amazon uses CSS sprites to minimize the number of requests to their server.

Figure 96 – Apple used sprites for their main navigation.

  • Use content delivery networks for media (and other files that do not update often).
  • Implement database and cache (server-side caching) optimization.
  • Enable HTTP compression and implement conditional GET.
  • Optimize images.
  • Use expires headers.[9]
  • Ensure fast and responsive design to decrease the time to first byte (TTFB). Measure TTFB with a page speed testing tool. There seems to be a clear correlation between increased TTFB and lower rankings.[10]
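As an illustration of the compression and expires-header points above, an nginx configuration (inside the relevant server block) might look like the sketch below; the file types and cache lifetimes are assumptions to adapt to your own setup:

```nginx
# Compress text-based responses before sending them to clients.
gzip on;
gzip_types text/css application/javascript application/json image/svg+xml;

# Cache static assets aggressively; HTML is left uncached so content
# updates reach visitors and crawlers immediately.
location ~* \.(jpg|jpeg|png|gif|css|js|woff2)$ {
    expires 30d;
    add_header Cache-Control "public";
}
```

Equivalent settings exist for Apache (mod_deflate and mod_expires) and most CDNs.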

If your URLs load slowly, search engines may interpret this as a connectivity issue, meaning they will give up crawling the troubled URLs.

The time Google spends on a page seems to influence the number of pages it crawls. The less time to download a page, the more pages are crawled.

Figure 97 – The correlation between the time spent downloading a page and the pages crawled per day seems apparent in this graph.

Broken links
This is a no-brainer. When your internal links are broken, crawlers cannot find the correct pages. Run a full crawl on the entire website with the crawling tool of your choice and fix all broken URLs. Also, use the webmaster tools provided by search engines to find broken URLs.

HTTP caching with Last-Modified/If-Modified-Since and E-Tag headers
Regarding crawling optimization, “cache” refers to a page stored in a search engine index. Note that caching is a highly technical issue, and improper caching settings may make search engines crawl and index a website chaotically.

When a search engine requests a resource on your website, it first requests your web server to check its status. The server replies with a header response. Based on the header response, search engines download or skip the resource.

Many search engines check whether the resource they request has changed since they last crawled it. If it has, they will fetch it again—if not, they will skip it. This mechanism is referred to as conditional GET. Bing confirmed it uses the If-Modified-Since header[11]. Google too.[12]

Below is the header response for a newly discovered page that supports the If-Modified-Since header when a request is made to access it.

Figure 98 – Use the curl command to get the last modified date.

When the bot requests the same URL the next time, it adds an If-Modified-Since request header. If the document has not been modified, the server responds with a 304 status code (Not Modified):

Figure 99 – A 304 response header

If the page has not changed, the If-Modified-Since request returns 304 Not Modified. If it has been modified, the response is 200 OK, and the search engine fetches the page again.

The E-Tag header works similarly but is more complicated to handle.

If your ecommerce platform uses personalization or the content on each page changes frequently, implementing HTTP caching may be more challenging, but even dynamic pages can support If-Modified-Since.[13]
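The server-side logic behind conditional GET can be modeled as below. This is a simplified sketch (real servers also validate malformed dates and compare E-Tags):

```python
from typing import Optional
from email.utils import parsedate_to_datetime

def respond(if_modified_since: Optional[str], last_modified: str) -> int:
    """Return 304 if the resource is unchanged since the client's copy,
    otherwise 200 so the client re-fetches it."""
    if if_modified_since is None:
        return 200  # first request: the client has no cached copy yet
    client_time = parsedate_to_datetime(if_modified_since)
    server_time = parsedate_to_datetime(last_modified)
    return 304 if server_time <= client_time else 200
```

A 304 response carries no body, which is exactly the crawl-budget saving the quote from Bing above refers to.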


Sitemaps

There are two major types of sitemaps: HTML sitemaps and XML Sitemaps.

You can also submit Sitemaps in other formats: plain text files, RSS, or mRSS.
If you experience crawling and indexing issues, remember that sitemaps are just a patch for more severe problems such as duplicate content, thin content, or improper internal linking. Creating sitemaps is a good idea, but it will not fix those issues.

HTML sitemaps

HTML sitemaps are a form of secondary navigation. They are usually accessible to humans and bots through a link in the footer at the bottom of the website.

A usability study on various websites, including ecommerce websites, found that people rarely use HTML sitemaps. In 2008, only 7% of the users turned to the sitemap when asked to learn about a site’s structure,[14] down from 27% in 2002. Nowadays, the percentage is probably even lower.

Still, HTML sitemaps are handy for sending crawlers to pages at the lower levels of the website taxonomy and for creating flat internal linking.

Figure 100 – Sample flat architecture.

Here are some optimization tips for HTML sitemaps:

Use segmented sitemaps
When optimizing HTML sitemaps for crawling, it is important to remember that PageRank is divided between all the links on a page. Splitting the HTML sitemap into multiple smaller parts is a good way to create more user- and search-engine-friendly pages on large websites such as e-commerce stores.

Instead of a huge sitemap page that links to almost every page on your website, create a main sitemap index page (e.g., sitemap.html) and link from it to smaller sitemap component pages (sitemap-1.html, sitemap-2.html, etc.).

You can split the HTML sitemaps based on topics, categories, departments, or brands. Start by listing your top categories on the index page. How you split the pages depends on your catalog’s number of categories, subcategories, and products. You can use the “100 links per page” rule below as a guideline but do not get stuck on this number, especially if your website has good authority.

If you have over 100 top-level categories, you should display the first 100 on the site map index page and the rest on additional sitemap pages. You can allow users and search engines to navigate the sitemap using previous and next links (e.g., “see more categories”).

If you have fewer than 100 top-level categories in the catalog, you will have room to list several important subcategories as well, as depicted below:

Figure 101 – A clean HTML sitemap example.

The top-level categories in this site map are Photography, Computers & Solutions, and Pro Audio. Since this business has a limited number of top-level categories, there is room for several subcategories (Digital Cameras, Laptops, Recording).
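The segmentation logic above can be sketched in a few lines of Python (the 100-link cap is the guideline discussed earlier, not a hard limit):

```python
def sitemap_pages(links, per_page=100):
    """Split a flat list of category links into sitemap component pages
    of at most `per_page` links each."""
    return [links[i:i + per_page] for i in range(0, len(links), per_page)]

# 250 hypothetical category URLs -> 3 component pages (100 + 100 + 50),
# navigable via previous/next ("see more categories") links.
pages = sitemap_pages([f"/category-{n}/" for n in range(250)])
```

The first component page becomes the sitemap index (sitemap.html), and the remaining chunks become sitemap-1.html, sitemap-2.html, and so on.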

Do not link to redirects
The URLs linked from sitemap pages should land crawlers on the final URLs rather than go through URL redirects.

Enrich the sitemaps
Adding extra data by annotating links with info is good for users and can provide some context for search engines. You can add data such as product thumbnails, customer ratings, manufacturer names, etc.

These are just some suggestions for making HTML sitemap pages easier for people to read and lighter for crawlers to process. However, the best way to help search engines discover content on your website is to feed them a list of URLs in different file formats, one of which is XML.

XML Sitemaps

Modern e-commerce platforms should auto-generate XML Sitemaps, but often, the default output file is not optimized for crawling and analysis. Therefore, it is important to manually review and optimize the automated output or generate the Sitemaps using your own rules.

Unless you have concerns about competitors spying on your URL structure, it is preferable to include the path of the XML Sitemap file within the robots.txt file.
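The reference is a single line in robots.txt, using an absolute URL; multiple Sitemap lines are allowed (the domain below is a placeholder, and the file names follow the segmentation examples later in this section):

```
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap_camping.xml
Sitemap: https://www.example.com/sitemap_cycle.xml
```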

Search engines request robots.txt every time they start a new crawling session on your website and check whether it has been modified since the last crawl. If it has not, they use the cached robots.txt file to determine which URLs can be crawled.

If you do not specify the location of your XML Sitemap inside robots.txt, search engines will not know where to find it (except if you submitted it within the webmaster accounts). Submitting to Google Search Console or Bing Webmaster allows access to more insights, such as how many URLs have been submitted, how many are indexed, and what errors may be present in the XML file.

Figure 102 – If you have an almost 100% indexation rate, you probably do not need to worry about crawl optimization.

Using XML Sitemaps seems to have an accelerating effect on the crawl rate:

“At first, the number of visits was stabilized at a rate of 20 to 30 pages per hour. As soon as the sitemap was uploaded through Webmaster Central, the crawler accelerated to approximately 500 pages per hour. In just a few days it reached a peak of 2,224 pages per hour. Where at first the crawler visited 26.59 pages per hour on average, it grew to an average of 1,257.78 pages per hour which is an increase of no less than 4,630.27%”.[15]

Here are some tips for optimizing XML Sitemaps for large websites:

  • Add only URLs that respond with 200 OK; too many errors and search engines will stop trusting your Sitemaps. Bing has,

“a 1% allowance for dirt in a Sitemap. Examples of dirt are if we click on a URL and we see a redirect, a 404 or a 500 code. If we see more than a 1% level of dirt, we begin losing trust in the Sitemap”.[16]

Google is less stringent than Bing; they do not care about the errors in the XML Sitemap.

  • Have no links to duplicate content and no URLs that canonicalize to different URLs—only to “end state” URLs.
  • Place videos, images, news, and mobile URLs in separate Sitemaps. You can use video sitemaps for videos, but mRSS formatting is also supported.
  • Segment the Sitemaps by topic or category and by subtopic or subcategory. For example, you can have a sitemap for your camping category – sitemap_camping.xml, another one for your Bicycles category – sitemap_cycle.xml, and another one for the Running Shoes category – sitemap_run.xml. This segmentation does not directly improve organic rankings, but it will help identify indexation issues at granular levels.
  • Create separate Sitemap files for product pages — segment by the lowest level of categorization.
  • Fix Sitemap errors before submitting your files to search engines. You can do this within your Google Search Console account using the Test Sitemap feature:

Figure 103 – The Test Sitemap feature in Google Search Console.

  • Keep language-specific URLs in separate Sitemaps.
  • Do not assign the same weight to all pages (your scoring can be based on update frequency or other business rules).
  • Auto-update the Sitemaps whenever important URLs are created.
  • Include only URLs that contain essential and important filters (see section Product Detail Pages).
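As a sketch, a segmented Sitemap file can be generated with Python's standard library; the URLs and priority values below are illustrative:

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Build one sitemap segment; `urls` is a list of (loc, priority) tuples.
    Priorities can follow your own business rules (update frequency, margin, etc.)."""
    urlset = ET.Element("urlset", xmlns=NS)
    for loc, priority in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "priority").text = str(priority)
    return ET.tostring(urlset, encoding="unicode")

# One segment per category, e.g. sitemap_camping.xml:
xml_out = build_sitemap([
    ("https://www.example.com/camping/tents/", 0.8),
    ("https://www.example.com/camping/stoves/", 0.6),
])
```

Running this per category (camping, cycling, running shoes, etc.) produces the segmented files described above, ready for validation with the Test Sitemap feature.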

You probably noticed a commonality within these tips: segmentation. It is a good idea to split your XML files as much as you can without overdoing it (e.g., going down to just 10 URLs per file), so you can identify and fix indexation issues more easily.[17]

Remember that sitemaps, whether XML or HTML, should not be used as a substitute for fixing poor website architecture or other crawling issues; they are only a backup. Ensure there are other paths for crawlers to reach all important pages on your website (e.g., internal contextual links).

Here are some factors that can influence the crawl budget:

Internal and external links

Crawlers will request pages more frequently if they find more external and internal links pointing to them. Most ecommerce websites experience challenges building links to category and product detail pages, but this has to be done. Guest posting, giveaways, link bait, evergreen content, outright link requests within confirmation emails, ambassador programs, and perpetual holiday category pages are just some of the tactics that can help with link development.

Crawl rate settings
You can alter (usually decrease) the crawl rate of Googlebot using your Google Search Console account. However, changing the rate is not advisable unless the crawler slows down your web server.

With Bing’s Crawl Control feature, you can set up dayparting:

Figure 104 – Bing’s Crawl Control Interface.

Fresh content
Updating content on pages and then pinging search engines (i.e., creating feeds for product and category pages) should quickly get the crawlers to the updated content.

If you update fewer than 300 URLs per month, you can use the Fetch as Google feature inside your Google Search Console account to get the updated URLs re-crawled quickly. You can also create and submit a new XML sitemap for the updated or new pages regularly (e.g., weekly).

There are several ways to keep your content fresh. For example, you can include an excerpt of about 100 words from related blog posts on product detail pages. Ideally, the excerpt should include the product name and links to parent category pages. Every time you mention a product in a new blog post, update the excerpt of the product detail page, as well.

You can even include excerpts from articles that do not directly mention the product name if the article is related to the category in which the product can be classified.

Figure 105 – The “From Our Blog” section keeps this page updated and fresh.

Another great tactic for keeping the content fresh is continuously generating user reviews, product questions and answers, or other user-generated content.

Figure 106 – Ratings and reviews are a smart way to update pages, especially for high-demand products.

Domain authority
The higher your website’s domain authority, the more often search engine crawlers will visit. Domain authority grows as more external websites link to yours, which is much easier said than done.

RSS feeds
RSS feeds are one of the fastest ways to notify search engines of new products, categories, or fresh content on your website. Here’s what Duane Forrester (former Bing’s Webmaster senior product manager) said in the past about RSS feeds:

“Things like RSS are going to become a desired way for us to find content … It is a dramatic cost savings for us”.[18]

With the help of RSS, you can get search engines to crawl the new content within minutes of publication. For example, if you write SEO content to support category and product detail pages and link smartly from such supporting pages, search engines will also request and crawl the linked-to product and category URLs.
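A brand or category feed is plain RSS 2.0; the structure below is a minimal sketch with placeholder values:

```xml
<rss version="2.0">
  <channel>
    <title>Example Store – New in Cameras</title>
    <link>https://www.example.com/cameras/</link>
    <description>New camera arrivals</description>
    <item>
      <title>Example DSLR X100</title>
      <link>https://www.example.com/cameras/example-dslr-x100/</link>
      <pubDate>Fri, 14 Dec 2018 10:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
```

Appending a new item element whenever a product is published is what notifies subscribed users and search engines of the new URL.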

Figure 107 – Zappos has an RSS feed for brand pages. Users (and search engines) are instantly notified every time Zappos adds a new product from a brand.

Guiding crawlers

The best way to avoid wasting the crawl budget on low-value-added URLs is to avoid creating links to those URLs in the first place. However, that is not always an option. For example, you must allow people to filter products based on three or more product attributes. Alternatively, you may want to allow users to email a friend from product detail pages. Or, you have to give users the option to write product reviews.

If you create unique URLs for “Email to a Friend” links, you may create duplicate content.

Figure 108 – The URLs in the image above are near-duplicates. However, these URLs do not have to be accessible to search engines. Block the email-friend.php file in robots.txt.

These “Email to a Friend” URLs will most likely lead to the same web form, and search engines will unnecessarily request and crawl hundreds or thousands of such links, depending on the size of your catalog. You will waste the crawl budget by allowing search engines to discover and crawl these URLs.
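A minimal robots.txt rule for this case (assuming, as in the figure above, that a single email-friend.php file handles all such requests):

```
User-agent: *
Disallow: /email-friend.php
```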

It would be best to control which links are discoverable by search engine crawlers and which are not. The more unnecessary requests for junk pages a crawler makes, the fewer chances it has to reach more important URLs.

Crawler directives can be defined at various levels, in this priority:

  • Site-level, using robots.txt.
  • Page-level, with the noindex meta tag and with HTTP headers.
  • Element-level, using the nofollow microformat.

Site-level directives overrule page-level directives, and page-level directives overrule element-level directives. It is important to understand this priority because, for a page-level directive to be discovered and followed, the site-level directives should allow access to that page. The same applies to element-level and page-level directives.

On a side note, if you want to keep content as private as possible, one of the best ways is to use server-side authentication to protect areas.


Although robots.txt files can assist in controlling crawler access, the URLs disallowed with robots.txt may still end up in search engine indices because of external backlinks pointing to the “robot-ed” URLs. This suggests that URLs blocked with robots.txt can accumulate PageRank. However, URLs blocked with robots.txt will not pass PageRank since search engines cannot crawl and index the content and the links on such pages. The exception is if the URLs were previously indexed, in which case they will pass PageRank.

It is interesting to note that pages with Google+ buttons may be visited by Google when someone clicks the plus button, ignoring the robots.txt directives.[19]

One of the biggest misconceptions about robots.txt is that it can be used to control duplicate content. There are better methods for controlling duplicate content, and robots.txt should only be used to control crawler access. That being said, there may be cases where one does not have control over how the content management system generates the content or cases when one cannot make changes to pages generated on the fly. In such situations, one can try to control duplicate content with robots.txt as a last resort.

Every ecommerce website is unique, with its own specific business needs and requirements, so there is no general rule for what should be crawled and what should not. Regardless of your website's particularities, you must manage duplicate content using rel="canonical" or HTTP headers.

While tier-one search engines will not attempt to “add to cart” and will not start a checkout process or a newsletter sign-up on purpose, coding glitches may trigger them to attempt to access unwanted URLs. Considering this, here are some common types of URLs you can block access to:

Shopping cart and checkout pages
Add to Cart, View Cart, and other checkout URLs can safely be added to robots.txt.

If your cart and checkout URLs follow patterns like the ones below, you can use the following directives to disallow crawling:

User-agent: *
# Do not crawl view cart URLs
Disallow: *viewcart.aspx
# Do not crawl add to cart URLs
Disallow: *addtocart.aspx
# Do not crawl checkout URLs
Disallow: /checkout/

The above directives mean that all bots are forbidden to crawl any URL that contains viewcart.aspx or addtocart.aspx. Also, all the URLs under the /checkout/ directory are off-limits.

Robots.txt allows limited use of wildcards to match URL patterns, so your programmers should be able to cover many URLs. The star symbol (*) matches any sequence of characters, and the dollar sign ($) matches the end of the URL. Patterns are implicitly anchored at the beginning of the URL path, so no "starts with" symbol is needed.
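As a rough sketch of how these wildcards behave, the matching rules can be approximated in JavaScript (the helper name is my own; this mirrors the documented semantics of * and $, not any official parser):

```javascript
// Convert a robots.txt Disallow pattern into an equivalent RegExp.
// "*" matches any sequence of characters; "$" anchors the end of the URL.
// Patterns are implicitly anchored at the start of the URL path.
function robotsPatternToRegExp(pattern) {
  const escaped = pattern
    .split('')
    .map((ch) => {
      if (ch === '*') return '.*'; // wildcard
      if (ch === '$') return '$';  // end-of-URL anchor
      return ch.replace(/[.+?^${}()|[\]\\]/g, '\\$&'); // escape regex chars
    })
    .join('');
  return new RegExp('^' + escaped);
}

// Example: the "Disallow: *viewcart.aspx" directive
const rule = robotsPatternToRegExp('*viewcart.aspx');
console.log(rule.test('/store/viewcart.aspx?id=7')); // true: contains viewcart.aspx
console.log(rule.test('/store/category/watches'));   // false
```

Running sample URLs through a helper like this is a quick way to double-check a pattern before deploying it.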

User account pages
Account URLs such as Account Login can be blocked as well:
User-agent: *
# Do not crawl login URLs
Disallow: /store/account/*.aspx$

The above directive means no .aspx pages under the /store/account/ directory will be crawled.

Below are some other types of URLs that you can consider blocking.

Figure 109 – These are other types of pages you can consider blocking.

A couple of notes about the resources highlighted in yellow:

  • If you run e-commerce on WordPress, you may want to let search engine bots crawl the URLs under the tag directory. In the past, the recommendation was to block tag pages, but that is no longer the case.
  • The /includes/ directory should not contain scripts required to render page content. Block it only if you host the scripts necessary to create the undiscoverable links inside /includes/.
  • The same goes for the /scripts/ and /libs/ directories – do not block them if they contain resources necessary for rendering content.

Duplicate or near-duplicate content issues such as pagination and sorting are not optimally addressed with robots.txt.

Before you upload the robots.txt file, I recommend testing it against your existing URLs. First, generate the list of URLs on your website using one of the following methods:

  • Ask for help from your programmers.
  • Crawl the entire website with your favorite crawler.
  • Use weblog files.

Then, open this list in a text editor that allows searching by regular expressions. Software like RegexBuddy, RegexPal, or Notepad++ are good choices. You can test the patterns you used in the robots.txt file using these tools, but remember that you might need to slightly rewrite the regex pattern you used in the robots.txt, depending on the software you use.

If you want to block crawlers’ access to email landing pages under the /ads/ directory, your robots.txt will include these lines:

User-agent: *
# Do not crawl email landing pages
Disallow: /ads/

Using RegexPal, you can test the URL list using this simple regex: /ads/

Figure 110 – RegexPal automatically highlights the matched pattern.
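The same check can be scripted; here is a minimal sketch filtering a URL list for the /ads/ pattern (the URLs are made up for illustration):

```javascript
// Filter a crawled URL list for entries a "Disallow: /ads/" rule would block.
const urls = [
  '/ads/spring-sale',
  '/ads/newsletter-march',
  '/product/watch-123',
  '/category/watches?page=2',
];

// The pattern is anchored at the start of the path, like the robots.txt rule.
const blocked = urls.filter((u) => /^\/ads\//.test(u));
console.log(blocked); // ['/ads/spring-sale', '/ads/newsletter-march']
```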

If you work with large files that contain hundreds of thousands of URLs, use Notepad++ to match URLs with regular expressions because Notepad++ can easily handle large files.

For example, let’s say that you want to block all URLs that end with .js. The robots.txt will include this line:

Disallow: /*.js$

To find which URLs in your list match the robots.txt directives using Notepad++, input "\.js$" in the "Find what" field and then use the Regular expression Search Mode:

Figure 111 – Regular expression search mode in Notepad++.

Skimming through the highlighted matching URLs can clear any doubts about which URLs robots.txt will exclude.

When blocking crawlers from accessing media such as videos, images, or .pdf files, use the X-Robots-Tag HTTP header[20] instead of the robots.txt file.

However, remember, if you want to address duplicate content issues for non-HTML documents, use rel="canonical" HTTP headers.[21]

The exclusion parameter

With this technique, you selectively add a parameter (e.g., crawler=no) or a string (e.g., ABCD-9) to the URLs you want to be inaccessible, and then you block that parameter or string with robots.txt.

First, decide which URLs you want to block.

Let’s say that you want to control the crawling of the faceted navigation by not allowing search engines to crawl URLs generated when applying more than one filter value within the same filter (also known as multi-select). In this case, you will add the crawler=no parameter to all URLs generated when a second filter value is selected on the same filter.

Suppose you want to block bots when they try to crawl a URL generated by applying more than two filter values on different filters. In that case, you will add the crawler=no parameter to all URLs generated when a third filter value is selected, no matter which options were chosen or the order they were chosen. Here’s a scenario for this example:

The crawler is on the Battery Chargers subcategory page.
The hierarchy is: Home > Accessories > Battery Chargers
The page URL is:

Then, the crawler “checks” one of the Brands filter values, Noco. This is the first filter value; therefore, you will let the crawler fetch that page.
The URL for this selection does not contain the exclusion parameter:

The crawler now checks one of the Style filter values, cables. Since this is the second filter value applied, you will still let the crawler access the URL.
The URL still does not contain the exclusion parameter. It contains just the brand and style parameters:

Now, the crawler “selects” one of the Pricing filter values, the number 1. Since this is the third filter value, you will append the crawler=no to the URL.
The URL becomes:

If you want to block the URL above, the robots.txt file will contain:

User-agent: *
Disallow: /*crawler=no

The method described above prevents the crawling of facet URLs when more than two filter values have been applied, but it does not allow specific control over which filters will be crawled and which ones will not. For example, if the crawler “checks” the Pricing options first, the URL containing the pricing parameter will be crawled. We will discuss faceted navigation in detail later on.
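Server-side, the rule above boils down to counting applied filter values before building each facet link. Here is a minimal sketch (the function and parameter names are my own, not from any particular platform):

```javascript
// Build a facet URL and append the crawler=no exclusion parameter once more
// than two filter values are applied, so a "Disallow: /*crawler=no" rule in
// robots.txt keeps bots out of deep facet combinations.
function buildFacetUrl(basePath, filters) {
  const params = [];
  let valueCount = 0;
  for (const [name, values] of Object.entries(filters)) {
    for (const value of values) {
      params.push(`${name}=${encodeURIComponent(value)}`);
      valueCount += 1;
    }
  }
  if (valueCount > 2) params.push('crawler=no'); // third filter value onward
  return params.length ? `${basePath}?${params.join('&')}` : basePath;
}

console.log(buildFacetUrl('/battery-chargers', { brand: ['noco'] }));
// → /battery-chargers?brand=noco
console.log(buildFacetUrl('/battery-chargers',
  { brand: ['noco'], style: ['cables'], pricing: ['1'] }));
// → /battery-chargers?brand=noco&style=cables&pricing=1&crawler=no
```

Because the exclusion string is appended regardless of which filters were chosen or in what order, the blocking rule stays simple.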

URL parameters handling

URL parameters can cause crawl efficiency problems and duplicate content issues. For example, if you implement sorting, filtering, and pagination with parameters, you will likely end up with many URLs, wasting the crawl budget. In a video about parameter handling, Google shows[22] how one site's 158 products generated an astonishing 380,000 URLs for crawlers.

Controlling URL parameters within Google Search Console and Bing Webmaster Tools can improve crawl efficiency, but it will not address the causes of duplicate content. You will still need to fix canonicalization issues at the source. However, since ecommerce websites use multiple URL parameters, controlling them correctly with webmaster tools may prove tricky and risky. Unless you know what you are doing, you are better off using either a conservative setup or the default settings.

URL parameters handling is mostly used to decide which pages to index and which to canonicalize.

One advantage of handling URL parameters within webmaster accounts is that page-level directives (i.e., rel="canonical" or meta noindex) will still apply as long as the pages containing such directives are not blocked with robots.txt or other methods. However, while it is possible to use limited regular expressions within robots.txt to prevent the crawling of URLs with parameters, robots.txt will overrule page-level and element-level directives.

Figure 112 – A Google Search Console notification regarding URL parameters.

Sometimes, you do not have to play with the URL parameters settings. This screenshot shows a message saying that Google has no issues categorizing your URL parameters. You can leave the default settings if Google can easily crawl the entire website. To set up the parameters, click the Configure URL parameters link.

Figure 113 – This screenshot is for an ecommerce website with fewer than 1,000 SKUs. You can see how the left navigation generated millions of URLs.

In the previous screenshot, the limit key (used for changing the number of items listed on the category listing page) generated 6.6 million URLs when combined with other possible parameters. However, because this website has strong authority, it gets a lot of attention and love from Googlebot and does not have crawling or indexing issues.

When handling parameters, you first want to decide which ones change the content (active parameters) and which do not (passive parameters). It is best to do this with your programmers because they will know the best usage of parameters. Parameters that do not affect how content is displayed on a page (e.g., user tracking parameters) are a safe target for exclusion.

Although Google does a good job of identifying parameters that do not change content, it is still worthwhile to set them manually.

To change the settings for such parameters, click Edit:

Figure 114 – Controlling URL parameters within Google Search Console.

In our example, the parameter utm_campaign was used to track the performance of internal promotions, and it does not change the page’s content. In this scenario, choose “No: Does not affect page content (ex: track usage).”

Figure 115 – Urchin Tracking Module parameters (UTMs) can safely be consolidated to the representative URLs.

To ensure you are not excluding the wrong parameters, test sample URLs by loading them in the browser: load the URL, remove the tracking parameters, and see whether the content changes. If it does not, the parameter can safely be excluded.

On a side note, tracking internal promotions with UTM parameters is not ideal. UTM parameters are designed to track campaigns outside your website. If you want to track the performance of your internal marketing banners, then use other parameter names or event tracking.

Some other common exclusion parameters you may consider are session IDs, UTM tracking parameters (utm_source, utm_medium, utm_term, utm_content, and utm_campaign), and affiliate IDs.
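As a quick way to sanity-check that a parameter is passive, you can strip it and compare the resulting URLs. A sketch using the standard URL API follows; the parameter list mirrors the ones above, and the sessionid/affid names are illustrative since they vary per platform:

```javascript
// Remove known passive (tracking) parameters from a URL so the remaining
// query string reflects only content-changing parameters.
const PASSIVE_PARAMS = new Set([
  'utm_source', 'utm_medium', 'utm_term', 'utm_content', 'utm_campaign',
  'sessionid', 'affid', // session and affiliate IDs: names vary per platform
]);

function stripPassiveParams(url) {
  const u = new URL(url);
  // Copy the keys first: deleting while iterating mutates searchParams.
  for (const name of [...u.searchParams.keys()]) {
    if (PASSIVE_PARAMS.has(name.toLowerCase())) u.searchParams.delete(name);
  }
  return u.toString();
}

console.log(stripPassiveParams('https://example.com/watches?brand=noco&utm_campaign=spring'));
// → https://example.com/watches?brand=noco
```

If the stripped URL renders the same page as the original, the removed parameters are safe to exclude.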
A word of caution is necessary here, and this recommendation comes straight from Google.[23]

“Configuring site-wide parameters may have severe, unintended effects on how Google crawls and indexes your pages. For example, imagine an ecommerce website that uses storeID in both the store locator and to look up a product’s availability in a store:
If you configure storeID to not be crawled, both the /store-locator and /foo-widget paths will be affected. As a result, Google may not be able to index both kind of URLs, nor show them in our search results. If these parameters are used for different purposes, we recommend using different parameter names”.

In the scenario above, you can keep the store location in a cookie instead.

Things get more complicated when parameters change how the content is displayed on a page.

One safe setup for content-changing parameters is to suggest to Google how the parameter affects the page (e.g., sorts, narrows/filters, specifies, translates, paginates, others) and use the default option Let Google decide. This approach will allow Google to crawl all the URLs that include the targeted parameter.

Figure 116 – A safe setup is to let Google know that a parameter changes the content and let Google decide what to do with the parameter.

In the previous example, I knew that the mid parameter changes the content on the page, so I pointed out to Google that the parameter sorts items. However, I let Google decide which URLs to crawl.

I recommend letting Google decide because of how Google chooses canonical URLs: it groups duplicate content URLs into clusters based on internal linking (PageRank), external link popularity, and content. Then, Google finds the best URL to display in search results for each cluster of duplicate content. Since Google does not share the complete link graph of your website, you will not know which URLs are linked the most, so you may not always be able to choose the right URL to canonicalize to.

  1. Google Patent On Anchor Text And Different Crawling Rates
  2. Large-scale Incremental Processing Using Distributed Transactions and Notifications
  3. Our new search index: Caffeine
  4. Web crawler
  5. To infinity and beyond? No!
  6. Crawl Errors: The Next Generation
  7. Make Data Useful
  8. Shopzilla’s Site Redo – You Get What You Measure
  9. Expires Headers for SEO: Why You Should Think Twice Before Using Them
  10. How Website Speed Actually Impacts Search Ranking
  11. Optimizing your very large site for search — Part 2
  12. Matt Cutts Interviewed by Eric Enge
  13. Save bandwidth costs: Dynamic pages can support If-Modified-Since too
  14. Site Map Usability
  15. New Insights into Googlebot
  16. How Bing Uses CTR in Ranking, and more with Duane Forrester
  17. Multiple XML Sitemaps: Increased Indexation and Traffic
  18. How Bing Uses CTR in Ranking, and more with Duane Forrester
  19. How does Google treat +1 against robots.txt, meta noindex, or redirected URL
  20. Robots meta tag and X-Robots-Tag HTTP header specifications
  21. Supporting rel="canonical" HTTP Headers
  22. Configuring URL Parameters in Webmaster Tools
  23. URL parameters


Internal Linking Optimization

Length: 11,404 words

Estimated reading time: 1 hour, 20 minutes


The importance of external links for rankings is a well-documented SEO fact and part of conventional SEO wisdom. However, internal links can also impact rankings.

Links, either internal or from external websites, are the primary way for site visitors and search engines to discover content.

If a page does not have incoming internal links, not only may that page not be accessible to search engines for crawling and indexing, but even if it gets indexed, the page will be deemed less valuable (unless a lot of external links point to it). Examples of pages without internal links are product detail pages accessible only after an internal site search or entire catalogs available only to logged-in members.

On the other hand, if a page does not link to other internal pages, search engine robots will be at a dead end.

Internal links can lead crawlers into traps or unwanted URLs containing “thin,” duplicate, or near-duplicate content. Internal links can also send crawlers into circular references.

When optimizing internal linking, remember that websites are for users, not search engines. Links help users navigate and find what they want quickly and easily. Therefore, consider an approach that balances links available to users and bots. Build the internal linking for users, and then accommodate the search engines.

E-commerce websites have an interesting advantage: a large number of pages results in a large number of internal links. The larger the website and the more links pointing to a page, the more influential that page is. Strangely enough, although SEOs typically know that the more links you point to a page, the more authority the page receives, many SEOs still focus on getting links from external websites first.

However, why not optimize the lowest-hanging fruit first, the internal links? When you optimize your internal linking architecture, you do not need to hunt for external backlinks. You need to increase the relevance and authority of key pages on your website by creating quality content that attracts organic traffic and links and interlinking pages thematically.

Let’s see how ecommerce websites can use internal linking to boost relevance, avoid or mitigate duplicate content issues, and build long-tail anchor text to rank for natural language search queries.

Crawlable and uncrawlable links

Before we move forward, a quick and important note: do not mindlessly implement any of the techniques discussed in this section. Decide which solution best suits your website based on your business needs and specific situation. If you are in doubt, get help from an experienced consultant before making changes.

A crawlable link is a link that is accessible to search engine crawlers when they request a web resource from your web server.

An uncrawlable/undiscoverable link is one that search engines cannot discover after parsing the HTML code and rendering the page. However, that uncrawlable link is still accessible to users in the browser.

Uncrawlable links can be created on the client side (in the browser) using JavaScript, AJAX, or by blocking access to the resources required to generate the URLs using robots.txt. Uncrawlable links are created on purpose and are not the same as broken links, which occur accidentally. Also, uncrawlable links are not the same as hidden links (e.g., off-screen text positioned with CSS or white text on a white background).

Because the main goal of e-commerce websites is to sell online, they must be useful and present information in an easy-to-find manner. Imagine an ecommerce website that does not allow users to sort or filter 3,000 items in a single category. However, this sorting and filtering generates URLs with no value for search engines and, in some cases, limited value for users. Since the current crawling and browsing technologies depend on clicks and links, these issues are here to stay for a while.

However, why do ecommerce websites generate overhead URLs, and why can search engines access such URLs? There are plenty of reasons:

  • URLs with tracking parameters are needed for personalization or web analysis.
  • Faceted navigation can generate many overhead URLs if it is not properly controlled.
  • A/B testing can also create overhead URLs.
  • If the order of URL parameters is not enforced, you will generate overhead URLs.

So, how do you approach overhead URLs?

A compromise for offering a great user experience while helping search engines crawl complex websites is to make the overhead links undiscoverable to robots while keeping them available to users in the browser. For example, a link that is important for users but not for search engines can be created with uncrawlable JavaScript.

Before we look at some examples, keep the following in mind:

  • Decide whether there is an indexing or crawling issue to be addressed in the first place. Are 90%+ of your URLs indexed? If yes, you may need to build links to the other 10% of pages. Or maybe you can add more content to get those 10% pages indexed.
  • Would you hinder user experience by blocking access to content with JavaScript?
  • Hiding links from robots may qualify as cloaking, depending on the reason for the implementation. Here’s a quote from Google:

“If the reason is for spamming, malicious, or deceptive behavior—or even showing different content to users than to Googlebot—then this is high-risk”.[1]

Please note that from an SEO perspective, I advocate using uncrawlable links only for the following reasons:

  • Create better crawl paths to help search engines reach important pages on your website.
  • Preserve the crawl budget and other resources (e.g., bandwidth).
  • Avoid internal linking traps (i.e., infinite loops).

I do not endorse this tactic if you want to spam or mislead unsuspecting visitors.

There are a couple of methods for keeping crawlers away from overhead URLs.


Iframes

Let's say you do not want any links generated by the faceted navigation to be visible to search engines. In this scenario, embed the faceted navigation in an <iframe> and block the bots' access to the iframe's source using robots.txt.

The advantage of using iframes is that they are fast to implement and remove if the results are unsatisfactory. One disadvantage is that you cannot granularly control which facets can be indexed; once the iframe source is blocked with robots.txt, no facet will be crawled.

Figure 117 – This screenshot highlights a classic faceted (left-hand navigation) implementation. This type of navigation often creates bot traps.

Intermediary directory/file

The directory implementation requires including a directory in the URL structure and then blocking that directory in robots.txt.

Let’s say that the original facet URL is:


Instead of linking to the URL above, you will link through an intermediary directory, which is then disallowed by robots.txt. The URL contains the /facets/ directory:


Your robots.txt will disallow everything placed under this directory:

User-agent: *
# Do not crawl facet URLs
Disallow: /facets/

Instead of a directory, you can also use a file in the URL. The controlled URL will include the facets.php file, which will be blocked in robots.txt.

If this was the original facet URL:


With the robotted file in place, this is how the new URL will look:


User-agent: *
# Do not crawl faceted URLs
Disallow: *facets.php*

JavaScript and AJAX

Using JavaScript or AJAX is another method used to control access to internal links, to silo the website, and to avoid duplicate content issues at the source. Search engines can execute JavaScript statements, such as document.write(). They can also render AJAX to discover content and URLs but only to some extent,[2] and there are limitations to what they can understand. However, remember that the major search engines evolve rapidly, and in a matter of months, they might be able to execute complex JavaScript.

While SEOs usually want to make AJAX content more accessible to search engines, you aim for the opposite when you want to control internal links. You will use JavaScript or AJAX to generate the links in the browser (client-side) rather than in the raw HTML. Depending on the implementation, those URLs may not be available to bots when they fetch the HTML and render the page.

One application of this method is to generate clean internal tracking URLs in HTML and add the user tracking parameters on demand in the browser.

Let’s say you have three links on the homepage, and all point to the same URL, but each link is in a different location on the page. The first link is in the primary navigation, the second is on a product thumbnail image, and the last is in the footer. Your Merchandising team wants to track where people clicked, and they ask the Analytics team to track the click locations. The Analytics and Dev teams will tag each URL with internal tracking parameters. The three tagged links may look like these:

The trackingkey parameter in the first link communicates to the web analytics tool that the click came from the home page (indicated by the hp string) in the watches category located in the primary navigation (primary_nav). The other two URLs are similar, except the link location in the user interface is different. When the Analytics team added these tracking parameters, they created three duplicate content pages, which is undesirable.

Of course, in this scenario, you can use rel="canonical" pointing to a representative URL or use the URL Parameters tool in Google Search Console to consolidate to a canonical URL. However, for our purpose, we want a solution that avoids creating duplicate URLs in the first place.

Here’s one way to use JavaScript to avoid generating duplicate content URLs when using internal tracking with parameters.

In the source code, your anchor element will look similar to this:

<a href="" param-string="trackingkey=hp-watches-primary_nav">Watches</a>

This URL is clean of parameters, which is great for SEO.

The page featuring this link includes a JavaScript code that “listens” when users click the tracked link. When the left mouse button is pressed, the href is updated client-side, by appending the content of the param-string attribute to the URL.

This is what the anchor will look like at the mousedown event:

<a href="?trackingkey=hp-watches-primary_nav" param-string="trackingkey=hp-watches-primary_nav">Watches</a>

Now, the URL includes the internal tracking parameter trackingkey. However, the browser added the parameter; it was not present in the raw HTML code when the bot accessed the page (you can get the sample HTML and JavaScript code from here).
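A minimal sketch of the listener described above follows; the attribute and parameter names come from the example, the function name is my own, and the event wiring (commented out) assumes a browser environment:

```javascript
// Append the content of the param-string attribute to the link's href at
// mousedown, so the tracking parameter never appears in the raw HTML.
function buildTrackedUrl(href, paramString) {
  if (!paramString) return href;
  const separator = href.includes('?') ? '&' : '?';
  return href + separator + paramString;
}

// Browser-only wiring (assumed markup: <a href="..." param-string="...">):
// document.querySelectorAll('a[param-string]').forEach((link) => {
//   link.addEventListener('mousedown', () => {
//     link.href = buildTrackedUrl(link.getAttribute('href'),
//                                 link.getAttribute('param-string'));
//   });
// });

console.log(buildTrackedUrl('https://example.com/watches',
                            'trackingkey=hp-watches-primary_nav'));
// → https://example.com/watches?trackingkey=hp-watches-primary_nav
```

Because the parameter is added only on mousedown, crawlers fetching the raw HTML see the clean URL.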

If you create uncrawlable links with JavaScript, remember that Google can identify links and anything that looks like a link to them, even if it is a JavaScript link. For instance, OnClick events that generate links are the most likely to be crawled. I have seen cases where Google requested and tried to crawl virtual page view URLs generated by Google Analytics.[3]

Also, note that using JavaScript to create undiscoverable links for bots can be tricky from an engineering perspective. Such links may also hinder the user experience for visitors who do not have JavaScript enabled. If your existing website already does not work with JavaScript turned off, you should be fine using AJAX links. However, if your website currently degrades gracefully for non-JS users, do not sacrifice that user experience for SEO.

The user-agent delivery method

This approach is controversial because it delivers content based on the user agent requesting the page. The principle is simple: When a URL request is made, identify the user agent making the request and check if it’s a search engine bot or a browser. If it is a browser, add internal tracking parameters to the URL; if it is a bot, deliver a clean URL.

Do you think this method is too close to cloaking? Let’s see how Amazon uses it to add internal tracking parameters to URLs on the fly, client-side. If you go to their Call of Duty: Ghosts – Xbox 360 page[4] while using your browser’s default user agent and mouse over the Today’s Deals link, you will get a URL that contains the tracking parameter ref:

Figure 118 – The internal tracking parameter shows up in the URL.

Now, change the user agent to Googlebot, reload the page, and mouse over the same link. This time, the URL does not include the tracking parameter. To change the browser user agent, use one of the many free browser extensions or the web dev tools in the browser.

Figure 119 – When Googlebot is used as a user agent, the tracking parameter is no longer in the URL.

Below is their HTML code. The top part of this screenshot shows the code when Googlebot requests the page, and the bottom part depicts the code served to the default user agent.

Figure 120 – This “white-hat cloaking” may be OK if you only play with URL parameters that do not change the content.

Assessing internal linking

The first step towards internal linking optimization is the diagnosis. Analyzing if pages are linked properly can reveal technical and website taxonomy issues.

Using Google Search Console is one of the fastest and easiest ways to ascertain which pages are most interlinked (and therefore deemed more important by search engines).

Figure 121 – The Internal Links report in Google Search Console.

Look at the Internal Links report under the Search Traffic section in Google Search Console. Are your website’s most important pages listed at the top? For e-commerce websites, those are usually the categories listed in the primary navigation.

Notice the /shop/checkout/cart/ directory in the image above. The URL is the second most linked page on the website. This makes sense from a user standpoint because this link must be on most pages. However, the cart link is not important for search engines, so you can disallow the entire /checkout/ directory in robots.txt to prevent everything under it from being crawled.

Figure 122 – The shopping cart link is the only one followed.

Next, let’s see how each page is linked anchor text-wise. We will use the IIS SEO Toolkit, which used to be one of the best yet underestimated on-page SEO crawlers and audit tools until other desktop crawlers emerged.[5]

Figure 123 – The IIS SEO Toolkit, an indispensable on-demand desktop crawler.

You do not hear the SEO community talk much about this tool, maybe because it is Microsoft technology. However, its flexibility and extended functionality are better than Xenu (free) and at least on par with Screaming Frog (paid).

The IIS SEO Toolkit is free, which makes it a great tool to start with. Unfortunately, its development stopped years ago, so it cannot compete with the newer crawlers.

Once you have identified and fixed all the problems reported by the IIS SEO toolkit, you can consider upgrading to an enterprise tool such as:

  • Botify – undisclosed pricing
  • DeepCrawl – $89/month (100k URLs per month)
  • OnCrawl – $69/month (100k URLs per month)
  • Screaming Frog (my personal preference)

These monthly costs are estimates based on the minimum number of URLs crawled per month as of December 2019 (prices might have changed since then).

Figure 124 – The IIS SEO Toolkit offers virtually thousands of ways to analyze your website, and you can slice and dice the SEO data in almost any way you can imagine.

Install the tool on your Windows machine and run your first crawl. It is simple to set up and does not require a standalone IIS server; however, the toolkit uses the IIS component built into Windows, so you might need to activate that component. Also, Windows 10 users must apply an additional fix to make it run.

Let’s see how to use the toolkit to identify major internal linking issues.

Finding broken links with this toolkit is a breeze, just like it should be with any decent crawler. The broken links report is under the Violations or Content sections.

Figure 125 – You can find broken links using the Violations or Content reports.

You already know that broken links are an issue that needs attention because they also hinder user experience. So, use the toolkit to identify and take care of them.

The general SEO wisdom is that any page should be accessible in as few clicks as possible. Four or five levels deep is acceptable for users and bots, but anything deeper becomes problematic.

Figure 126 – The Link Depth report can uncover issues such as circular referencing, malformed URLs, and infinite spaces.

As depicted in this screenshot, URLs buried 24 levels deep suggest a problem with internal linking. In this example, the issue stemmed from malformed URLs creating circular references.
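Click depth is straightforward to compute yourself from any crawler’s link export: run a breadth-first search from the home page and record the shortest path to every URL. A minimal sketch with an invented six-page graph:

```python
# Sketch: compute click depth (levels from the home page) with a
# breadth-first search over a crawled link graph. The hard-coded graph
# is illustrative; build it from your crawler's export in practice.
from collections import deque

links = {
    "/": ["/furniture/", "/lighting/"],
    "/furniture/": ["/furniture/beds/", "/"],
    "/lighting/": ["/lighting/lamps/"],
    "/furniture/beds/": ["/furniture/beds/metal-bed.html"],
    "/lighting/lamps/": [],
    "/furniture/beds/metal-bed.html": [],
}

def click_depth(graph, start="/"):
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in depth:          # first (shortest) path wins
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

for url, d in sorted(click_depth(links).items(), key=lambda x: x[1]):
    print(d, url)
```

Any URL that comes back at depth 6, 10, or 24 is a candidate for the kind of circular-reference or malformed-URL investigation described above.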

Use the Pages with Most Links report in the Links section of the tool to identify the number of outgoing links from each page on your website. Sort the data by Count to get a quick idea of where the problems are.

Figure 127 – Including 936 links on a page is a bit concerning.

The number of links on pages built from the same template should usually be similar. In this case, however, the number of links on some product detail pages jumped from about 300 to around 900.

When you find such big differences, check the pages that seem off the charts. Investigate why there are so many links compared to the other pages on the same template.
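This check is easy to script: compare each page’s outgoing-link count against the median for its template and flag the outliers. A sketch with made-up crawl data; the 2x-median threshold is an assumption to tune for your own site:

```python
# Sketch: flag pages whose outgoing-link count is far above the median
# for their page template.
from statistics import median

# (template, URL, outgoing link count) -- illustrative crawl data
pages = [
    ("pdp", "/p/solid-ribbon-belt.html", 310),
    ("pdp", "/p/metal-bed.html", 295),
    ("pdp", "/p/office-chair.html", 936),   # suspicious outlier
    ("plp", "/furniture/", 420),
    ("plp", "/lighting/", 405),
]

# Group pages by template, then flag anything above twice the template median.
by_template = {}
for template, url, count in pages:
    by_template.setdefault(template, []).append((url, count))

flagged = []
for template, rows in by_template.items():
    med = median(count for _, count in rows)
    for url, count in rows:
        if count > 2 * med:
            flagged.append((template, url, count))
            print(f"{template}: {url} has {count} links (template median {med:.0f})")
```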

The Pages with Most Links report is also available in the Violations section:

Figure 128 – Check the Violations report to identify problematic pages.

Identify hubs

The Most Linked Pages report can help you identify internal hubs. Regarding website taxonomy and internal linking, a hub is a parent with lots of children linking to it.

Usually, the largest link hubs on ecommerce websites are the home page and the category pages linked from the primary navigation. You might have internal linking issues if you see other pages at the top. The number of products under a certain category also influences how many links a category gets.

Figure 129 – The Most Linked Page report is similar to the Internal Links report in Google Search Console.

The numbers in the previous image highlight three issues:

  • The most linked page does not have a <title> tag.
  • The shopping cart URL seems to be getting too many internal links. Because the shopping cart URL is dynamic, bots will try to access it from multiple pages, which is not ideal.
  • A significant number of internal links point to 301 redirects, suggesting a link to a 301 redirect somewhere in the primary navigation. Whenever possible, link directly to the final URL.

To examine each URL more deeply and obtain additional details on how it is linked, right-click the URL you want to analyze and click View Group Details in New Query.

Figure 130 – Finding out how each URL is linked.

Then click on Add/Remove Columns and add the Link Text column. Click Execute at the top left to update the report.

Figure 131 – You can remove/add columns from/to your reports.

Regarding section one in the screenshot above, if a page is linked using an image link, the IIS SEO toolkit does not report the image’s alt text. This is one of the toolkit’s downsides.

I highlighted a mismatch between the anchor text and the linked page in section two. The highlighted page is linked using the “customer service” anchor text, which is wrong because the linked page is not the customer service page.

Look for this kind of mismatch in your analysis.

Next, let’s aggregate anchor text:

  1. Click on Group by.
  2. Select Link Text in the Group by tab and hit Execute.
  3. You will get a count of each anchor text pointing to that URL.
  4. To analyze a different URL, simply change the value in the Linked-URL field.

Figure 132 – If a page is linked with too many varying anchor texts, you must evaluate how close the anchor texts are semantically and taxonomically.
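The same Group by Link Text aggregation can be reproduced in a few lines of Python from any crawler’s (anchor text, linked URL) export, which is handy when you want to script the analysis. The link pairs below are illustrative:

```python
# Sketch: count how often each anchor text points to a given URL,
# mirroring the toolkit's Group by Link Text query.
from collections import Counter

links = [
    ("office furniture", "/furniture/office/"),
    ("office furniture", "/furniture/office/"),
    ("furniture for the office", "/furniture/office/"),
    ("home", "/"),
]

target = "/furniture/office/"
anchors = Counter(text for text, url in links if url == target)
for text, count in anchors.most_common():
    print(f"{count:>3}  {text}")
```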

Ideally, you consistently link to category pages using the category name, but a few variations in the anchor text are acceptable. For example, you can link to the Office Furniture category with the anchor text “office furniture” and use “furniture for the office.” When you link to product detail pages (PDPs) from product listing pages (PLPs), use the product name as the anchor text. You can vary the anchor text if you link to PDPs from blogs, user guides, or other content-rich pages. The anchor text can include details such as product attributes, brands, and manufacturers.

You must create custom reports (named “queries” in the IIS SEO Toolkit) to get an overall picture of the site-wide anchor text distribution. This is where the tool’s enormous flexibility comes in handy.

To create a custom report, go to the Dashboard, click on the Query drop-down, and select the New Link Query:

Figure 133 – Adding a New Link Query.

  1. Select the field name values in the new tab (Links) as depicted in the image above.
  2. In the Group By tab, select Link Text from the drop-down.
  3. Click Execute.

Figure 134 – You have the internal anchor text distribution for the entire website.

In the example above, notice a couple of things that need to be investigated further:

  • First, why does the most linked page have no anchor text?
  • Second, how about blocking bots’ access to the shopping cart link?

If you want to look at a fancy visualization of your hub pages, use the Export function of the IIS SEO Toolkit to generate the list of all URLs. Then, import that file into a data visualization tool.

Figure 135 – Sample internal linking graph.

The image above is a visualization example generated with Gephi. Here are some tutorials on how to generate link graphs using Google’s Fusion Tables,[6] NodeXL[7] and Gephi[8].

Problematic redirects

Using the Redirects report will help identify internal PageRank leaks, unnecessary 301 or 302 redirects, and undesirable header response codes. You can sort the issues by the Linking-URL column to make the analysis easier and see them grouped by page.

Figure 136 – Sort by Linking-StatusCode to identify issues.

Regarding the two notes in this screenshot:

  1. The currency selection is kept in the URL rather than in a cookie. For this website, each currency selection generated a unique URL on almost every page, which is bad.
  2. Instead of linking to a URL that returns a 301 (Moved Permanently), link directly to the destination.
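If your crawler can export (linking page, linked URL, status code) rows, you can script this audit and see which redirecting URLs attract the most internal links. A sketch over hypothetical data:

```python
# Sketch: group internal links that point at redirects by their target,
# so a redirect linked from many pages (likely a navigation link) stands out.
from collections import defaultdict

# (linking page, linked URL, linked URL's status code) -- illustrative data
rows = [
    ("/furniture/", "/old-lighting/", 301),
    ("/lighting/", "/old-lighting/", 301),
    ("/furniture/", "/furniture/beds/", 200),
    ("/", "/old-lighting/", 301),
]

redirected = defaultdict(list)
for source, target, status in rows:
    if status in (301, 302):
        redirected[target].append(source)

for target, sources in redirected.items():
    print(f"{target} is linked from {len(sources)} pages: {sources}")
```

A redirecting URL that appears hundreds of times usually means a single template (header, footer, or navigation) needs fixing, not hundreds of individual links.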

Figure 137 – The unnecessary redirects are also available in the Violations report.

Wrong URLs blocked by robots.txt

The toolkit can also help you identify URLs that were blocked by mistake. Use the Links Blocked by robots.txt report to find them.

Figure 138 – The Help.aspx page is blocked with robots.txt

Do not block bot access to help pages (or similar pages, e.g., FAQs or Q&As). You want people who have questions about your products or services to be able to find such pages straight from a search engine query. The content on these pages can potentially reduce calls to customer service.

Because the help page is located under the /Common/ directory, which is blocked with robots.txt, search engines cannot access it and will not index it.

Figure 139 – All pages under the /Common directory will be blocked.

In the Links Blocked by robots.txt report, look for pages and URLs that should be indexed but are blocked by mistake.


Mixed linking protocols

The Protocols report displays the various protocols used to link internally to resources on the website:

Figure 140 – Do you interlink HTTPS with HTTP pages?

If your website uses non-secure HTTP and secure HTTPS protocols, what happens when visitors switch between HTTP and HTTPS pages? Do they get warning messages in the browser? Do you link to the same URL with secure and non-secure protocols?

We know that shopping carts, logins, and checkout pages should be secure, and such pages do not need to be indexed by search engines. However, it is best to switch everything to HTTPS. Remember that when you switch from non-secure HTTP to secure HTTPS, there might be a temporary drop in traffic.

Other issues

Here are some other common internal linking mistakes:

  • Inconsistent linking happens when you link to the same page using multiple URLs (for example, linking to the homepage with several URL variations). When you link to an internal page, be consistent and use a single, consolidated URL only.
  • Default page dispersal occurs when you link to index files rather than root directories. For example, many web admins link to index.php when linking to home pages. Instead, you have to link to the root directory, which is just the slash sign, /.
  • Case sensitivity that leads to 404 Not Found errors. For instance, Apache servers are case-sensitive, so if you link to the URL Product-name.html using an upper-case “P” instead of lower-case, the server may return an error.
  • Mixed URL paths happen when you link to the same file using absolute and relative paths. This is not an SEO issue per se; however, adopting standardized URL referencing helps with troubleshooting web development issues. Also, if you use absolute paths and content scrapers steal your content, they may leave the absolute links to your URLs in place.
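Three of these checks (index-file links, upper-case paths, plus the mixed HTTP/HTTPS linking discussed earlier) can be automated by scanning your internal-link URLs. A sketch using invented example.com URLs; the list of index-file names is an assumption to extend for your platform:

```python
# Sketch: scan internal link URLs for index-file links, upper-case path
# characters, and paths linked with both http and https.
from urllib.parse import urlparse

urls = [
    "https://www.example.com/",
    "https://www.example.com/index.php",          # default page dispersal
    "https://www.example.com/Product-Name.html",  # case-sensitivity risk
    "http://www.example.com/cart/",               # mixed protocol
    "https://www.example.com/cart/",
]

issues = []
schemes_by_path = {}
for url in urls:
    parsed = urlparse(url)
    if parsed.path.lower().endswith(("/index.php", "/index.html", "/default.aspx")):
        issues.append((url, "links to an index file instead of the directory"))
    if parsed.path != parsed.path.lower():
        issues.append((url, "upper-case characters in the path"))
    schemes_by_path.setdefault(parsed.path, set()).add(parsed.scheme)

for path, schemes in schemes_by_path.items():
    if len(schemes) > 1:
        issues.append((path, "linked with both http and https"))

for subject, problem in issues:
    print(f"{subject}: {problem}")
```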

When you assess your competitors’ internal linking from an SEO perspective, compare the source code generated with Googlebot used as a user agent and with JavaScript disabled with the source code generated when you use the browser’s default user agent. Are there any internal linking differences?

You should also analyze the internal linking differences between the cached version and the live page.

Nofollow on internal links

The nofollow microformat[9] is a Robots Exclusion Protocol mechanism that applies at the element level. It prevents PageRank and anchor text signals from being passed to linked pages, and it applies to the HTML <a> element.

Some SEOs use the nofollow attribute, believing it will prevent the indexation of the linked-to URL. Often, we find statements similar to “nofollow the admin, account, and checkout URLs to prevent these pages from being indexed.”

Such statements are not accurate because nofollow does not prevent crawling or indexing.

Figure 141 – Interpretation of nofollow by the individual search engine, according to Wikipedia.

Matt Cutts, the former head of Google’s Webspam team, stated that Google does not even crawl nofollow links:

“At least for Google, we have taken a very clear stance that those links are not even used for discovery”.[10]

However, Google’s Content Guidelines documentation states something different:

“How does Google handle nofollowed links? In general, we do not follow them”.[11]

Notice the “in general” mention in the statement above.

A test I performed some time ago with internal nofollow site-wide footer links showed that although it took about a month, Googlebot, MSNbot, and Bingbot crawled and indexed the nofollow links. Yahoo! Slurp was the only bot that didn’t request the resource.

Use nofollow to discourage crawling, not as a way to keep content out of the index. Remember that if you nofollow links that search engines previously discovered, those links may still be indexed. Also, if external links point to nofollow URLs, those URLs will get indexed.

Figure 142 – The nofollow attribute is often applied to links such as shopping carts, checkout buttons, and account logins.

A few years ago, nofollow was used to funnel PageRank to important pages, a tactic called “PageRank sculpting.” However, nowadays, most SEOs know that PageRank sculpting with nofollow no longer pays off[12], and many e-commerce websites have stopped nofollowing internal links.

However, some continue doing it, as you can see in this screencap:

Figure 143 – Instead of nofollow-ing links like the ones in the image above, a better approach is to consolidate them into a single page.

Consolidating links is a good approach because when you nofollow a site-wide URL like “Terms of Use,” you completely remove that page from the internal links graph.[13] This means the page will not receive internal PageRank but will also not have internal PageRank to pass.

The previous example introduces a more important issue: nofollow-ing links in primary or secondary navigation. Depending on what links you nofollow, you could be making a big mistake.

It is important to know that PageRank is a renewable resource: it flows back and forth between pages that link to one another. According to the original formula, PageRank uses a decay factor (also known as the damping factor) of 10% to 15% at each iteration to avoid infinite loops[14].

Page A is the home page, and it links to category pages B and C from the primary navigation menu. To simplify, let’s assume that pages B and C do not have any external links pointing to them.

Figure 144 – An overly simplified PageRank flow.

The most important thing to understand from this diagram is that pages B and C each return PageRank to page A, which increases the PageRank for page A.

Let’s see what happens when you add rel="nofollow" to the link pointing to page C in the primary navigation:

Figure 145 – The nofollow attribute stops sending PageRank to page C.

When the nofollow is applied, page C stops sending internal PageRank back to page A because Page C does not receive any internal PageRank to pass.
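You can reproduce this effect with a toy power-iteration PageRank. The sketch below models nofollow the way Google described it after 2009: a nofollow link still counts toward its source’s outlink total, but the PageRank routed through it simply evaporates. This is an illustration, not Google’s actual algorithm:

```python
# Toy PageRank with power iteration. A nofollow edge still counts in the
# source's outlink total, but its PageRank share is lost (post-2009 model).

def pagerank(pages, links, nofollow=frozenset(), d=0.85, iterations=50):
    pr = {p: 1 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1 - d) / len(pages) for p in pages}
        for source, targets in links.items():
            share = pr[source] / len(targets)   # split across ALL outlinks
            for target in targets:
                if (source, target) not in nofollow:
                    new[target] += d * share    # nofollow shares are lost
        pr = new
    return pr

# Page A (home) links to categories B and C; B and C link back to A.
links = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}

followed = pagerank("ABC", links)
sculpted = pagerank("ABC", links, nofollow={("A", "C")})

print("all followed: ", {p: round(v, 3) for p, v in followed.items()})
print("A->C nofollow:", {p: round(v, 3) for p, v in sculpted.items()})
```

In the first run, B and C both return PageRank to A. Once A’s link to C is nofollow-ed, C drops to the bare minimum, and A’s own score falls too, because half of its outgoing PageRank now evaporates instead of flowing back.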

When I researched examples for this topic, big names such as Toyota surprised me by using nofollow in the global navigation. You can see in the screenshot below how Toyota nofollow-ed all the links pointing to car models such as Yaris and Corolla.

Figure 146 – The links in the red dotted border are nofollow.

Note: PageRank was still publicly available at the time of the research. Back then, Toyota’s home page had a PageRank 7, and the Yaris page (a nofollow link in the primary navigation) had a PageRank 5. The PageRank 5 was mostly due to many external links rather than the internal linking flow.

Figure 147 – The Yaris page gets a lot of external backlinks from more than 2,500 domains.

However, the situation was different on another website I analyzed. There, the categories linked from the primary navigation did not get many backlinks from external sources. While the home page had a PageRank 5, all Shop by Type pages had a “not ranked” PageRank.

Figure 148 – Because the Shop by Type pages were linked from the primary navigation, they should have had a decent amount of authority (e.g., at least PageRank 3).

The nofollow attribute on primary navigation links does not mean those pages will not appear in SERPs; search engines had cached them all at the time. Nor does using nofollow on those links mean more PageRank was passed to the followed links. It simply means the authority of the nofollow-ed URLs in the primary navigation was significantly reduced.

If some links are unimportant for users, consider removing them from the navigation altogether. Not every category needs a link in the primary navigation menu.

If you want to send link juice only to specific links or pages, here are some alternatives to nofollow:

  • Have fewer links on the linking page.
  • Move important links to prominent places.
  • If you do not want to pass link juice to certain links, make them undiscoverable for bots.
  • Block search engine bots from discovering overhead links.

Keep in mind that nofollow is not a solution for duplicate content.

Because nofollow is incorrectly used to prevent indexing, it is also incorrectly used to prevent duplicate content issues. However, adding nofollow to links is not the right approach for controlling duplicate content: since nofollow is not a fail-proof way to control crawling and indexing, how could it prevent indexation or duplicate content?

Internal linking optimization

Users navigate from one page to another by clicking on links. That is one of the core principles of the Internet, and it hasn’t changed since the Web’s inception. However, while links are simply a way for people to navigate within or between websites, search engines will use links as authority and relevance signals.

For search engines, though, not all links are created equal. Some links are assigned more weight based on various criteria. For example, links surrounded by text are considered more important than links in footers, as Google states in this video[15].

Links surrounded by text are called contextual text links. In contrast, links used to structure a website (for example, links in the primary and secondary navigation or breadcrumbs) are called structural or hierarchical links.

One reason contextual text links receive more search engine weight is that users often ignore structural links and go straight to the content[16], and they rarely scroll down to click on footer links.

Figure 149 – Structural links appear in several types of navigation, such as primary, secondary, and faceted navigation. Contextual text links are present in the main content area.

Large websites such as ecommerce stores have the advantage of generating an incredible number of internal links; however, most are structural links that do not carry the same weight as contextual text links. Moreover, in some cases, Google might even ignore boilerplate or structural links:

“We found that boilerplate links with duplicated anchor text are not as relevant, so we are putting less emphasis on these”.[17]

There are several ways to optimize internal linking, and there is no excuse for not maximizing SEO opportunities within your direct control.

Theoretically, a large number of factors could influence the value of an internal link [18], but we are going to limit it to the following:

  • The position of the link in the page layout.
  • The type of link, e.g., contextual versus structural.
  • The text used in the anchor.
  • The format of the link, as in an image link versus a text link. As reported in this article, an image’s alt text seems to carry less ranking value than a text link’s anchor[19].
  • The authority of the linking page and the number of outbound links on it.

The link’s position in the page layout (e.g., in the primary navigation, footer, or sidebar) influences how much PageRank flows to the linked-to page.[20]
Microsoft holds the VIPS patent (VIPS stands for Vision-based Page Segmentation[21]), which describes breaking down page layouts into logical sections. Microsoft has another paper on Block-Level PageRank, which suggests that the PageRank passed to other pages depends on the link’s location on the page.[22]

Google has a patent on “Document ranking based on semantic distance between terms in a document”[23] and another patent called “Reasonable Surfer”[24]. These two patents indicate that links placed in prominent positions pass more PageRank than links in less important page sections.

Contextual text links are assigned more weight than primary and secondary navigation links, which in turn are deemed more important than footer links. However, the keyword-rich anchor text in the primary navigation (present on almost every page of the website) compensates with relevance. Therefore, primary navigation links can be at least as powerful as contextual text links.

Unfortunately, you can have only a limited number of anchors in the primary or secondary navigation, so you must choose carefully. However, with contextual links, you can implement many anchors because you are not limited by design space or strict anchor text labeling. For example, you may be restricted to using the anchor text “hotels” in your structural navigation. Still, on content-rich pages, you can use contextual text links such as “5-star hotels in San Francisco” or “San Francisco’s best 5-star hotels”.

Related to the link position, the concept of the First Link Rule is worth mentioning. This rule says that only the first anchor text matters to search engines when multiple links on the same page point to the same URL.[25]

Figure 150 – Each of these URL pairs points to the same URL twice but with not-so-optimal anchor text

Regarding the first pair, linking to the home page with the anchor text “home” may confuse search engines. This is because the anchor text “home” conflicts with the anchor text “home & garden products.”

Regarding the second pair, Children’s Bedroom Furniture should be a category page at a separate URL.

For the third pair, the “Decorating with Metal Beds” link points to a shopping guide, which is great. However, the link using the anchor text “modern metal beds” should point to a category page (if keyword research unveils that “modern metal beds” is an important category). For example, the link could point to the Metal Beds category page, filtered by the Style=modern.

If you want to make search engines count multiple anchor texts,[26] one of the best options is to add the hash sign (#) at the end of the URLs[27].

So, if your first link points to the plain URL, each subsequent link to the same page can append a hash fragment to the URL.

However, if you link to the same URL with varied anchor text, you do not need to use the hash in the URL. For example, you can link to the same product page once with the product name and the second time using the product name plus the manufacturer name. Ensure the varied anchor text is related and relevant to the linked-to page.

You will often encounter multiple URLs pointing to the home page—once on the logo and once in the breadcrumbs.

Figure 151 – Both links (logo and breadcrumb) point to the homepage.

The logo’s alt text is “UGG Australia,” and the anchor text in the breadcrumb is “Home.” While having a “Home” link is good for usability, I am not a big fan of the “home” anchor text. I would either:

  • Use the brand name in the breadcrumb because, in this particular case, the brand name (UGG) is very short. Instead of “Home”, I would use “UGG Australia” or just “UGG”.
  • Replace the anchor text “Home” in the breadcrumb with a small house icon and use the alt text “UGG Australia” for that icon.

Multiple same-page linking happens when a page contains multiple links to the same URL. On ecommerce websites, this frequently arises when links on product listing pages point to product detail page URLs. One link is on the clickable image thumbnail, and the other link is on the product name:

Figure 152 – Multiple links to the same product details page.

Figure 153 – The HTML code for the previous image.

If we look at the source code for the previous example, we will find that the alt text of the thumbnail image is “black”, and the product anchor text is “Solid Ribbon Belt”. This sends confusing relevance signals and is not optimal.


  1. The <A> element has an alt attribute, which is invalid because alt is not allowed on <A> elements. This attribute was probably intended to be a title attribute.
  2. The alt texts #1 and #2 should be switched.
  3. The alt attribute on the A tag (#1) should be removed.

Figure 154 – This is the product name text link.

Let’s talk about several options for addressing multiple links generated by image thumbnails and product names:

  • Repeat the product name text in the image alt text. This is the easiest way to tackle this particular type of issue. The thumbnail’s alt text will become “Solid Ribbon Belt” in our example.
  • Wrap the image and text under a single anchor or a single link. This is not always possible, and it is not good for accessibility.
  • Deploy the URL hash if you need to use unrelated anchor text to point to the same URL.
  • Place the product name above the image (this is against usability and design conventions).
  • Code the page so that the text link comes before the image link in the HTML, then use CSS to display the anchor text below the image in the browser. This is complex to implement and not a very good idea.

In the case of multiple same-page links, if the anchor texts are unrelated, they will send confusing relevance signals. However, PageRank will pass through both links.[28]
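Conflicting signals like the “black” / “Solid Ribbon Belt” pair are easy to surface with the standard-library HTML parser. The snippet below is a simplified sketch that treats image alt text and anchor text as the link’s “text” and flags URLs linked with differing texts:

```python
# Sketch: find URLs that are linked more than once on a page with
# conflicting link texts (image alt text vs. anchor text).
from html.parser import HTMLParser

class LinkAuditor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.current_href = None
        self.links = {}   # href -> list of link "texts" (alt or anchor text)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.current_href = attrs["href"]
            self.links.setdefault(self.current_href, [])
        elif tag == "img" and self.current_href and attrs.get("alt"):
            self.links[self.current_href].append(attrs["alt"])

    def handle_data(self, data):
        if self.current_href and data.strip():
            self.links[self.current_href].append(data.strip())

    def handle_endtag(self, tag):
        if tag == "a":
            self.current_href = None

# Markup mirroring the thumbnail + product-name pattern discussed above.
html = """
<a href="/p/solid-ribbon-belt.html"><img src="belt.jpg" alt="black"></a>
<a href="/p/solid-ribbon-belt.html">Solid Ribbon Belt</a>
"""

auditor = LinkAuditor()
auditor.feed(html)
for href, texts in auditor.links.items():
    if len(set(texts)) > 1:
        print(f"{href} is linked with conflicting texts: {texts}")
```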

Now that you know contextual text links are important, let’s see how user-generated content, product descriptions, brand pages, and blog posts can help you create more of them.

User-generated content
User-generated content (UGC) is one of the best ways to feed search engine bots, send engagement signals, and help users make purchasing decisions.

Figure 155 – The highlighted texts are potential internal links.

In this screenshot, you can see two typical reviews displayed on a product detail page. Reviews can add to the overall main content text and help with conversions. I highlighted in yellow some words that could be potential internal links.

Product reviews
Product reviews are one type of content-heavy user-generated content and represent a huge opportunity for generating contextual links. However, not many e-commerce websites fully take advantage of product reviews for internal linking purposes.

While researching this topic, I was surprised that only one of the top 50 online retailers was adding contextual links within user review content. For whatever reason (maybe poor SEO implementation, vendor restrictions, fear of linking out from product detail pages and losing conversions, and so on), the other retailers did not. Very few of the top 50 online retailers deployed SEO-friendly reviews. We will discuss optimizing reviews in detail in the section dedicated to product detail pages.

Figure 156 – This product is very popular, with over a thousand reviews.

In the example above, the product has 1,221 reviews. If you were to add just one contextual link within 10% of the reviews, you would create about 120 powerful contextual internal links.

Product descriptions
Many ecommerce pages contain text-rich sections. Take product detail pages, for example; each product has or should have a description. These content-rich sections are great places to link up to parent categories and brand pages:

Figure 157 – The highlighted text could link to a brand page (i.e., Bobeau).

When you link from product descriptions, it is important to link to the parent category and, optionally, to other highly related categories.

Optimized brand pages

Figure 158 – Most of the time, brand pages are nothing more than product listing pages.

Nothing is wrong with listing products on a brand page, but you must make brand pages content-rich.

Suppose you want to build relevant and valuable contextual text links that send more PageRank authority to product or category pages. In that case, brand pages must include text, media, and social signals. Be creative with the content, and link smartly. Add a paragraph about the brand’s history and link to the brand’s top sellers. Alternatively, you can add interesting facts about the brand and useful reviews. Get the brand owners interviewed and publish the interview on their brand page. You can then ask for a link or a mention from their Press or News section.

Look at how Zappos improved the internal linking on their brand pages and how they carefully interlink thematically related pages:[29]

Zappos’ brand page does a good job of satisfying users and search engines:

  • Zappos uses section 1 as a sitemap to guide bots to other related pages on their website.
  • Section 2 implements brand-specific RSS feeds. Search engines are instantly notified when new products are published for a brand.
  • In section 3, you can see how they use text-rich content for contextual linking.
  • In section 4, they link to the brand’s featured products.
  • In section 5, Zappos features contextual links within user reviews.

Blog posts

As mentioned in the Information Architecture section, blogs can support and increase authority for category and product pages, but very few e-commerce websites fully utilize blogging.

Figure 159 – Contextual text links from the main content area carry significant authority. Make sure your content-rich pages link internally to PDPs and PLPs.

When you write blogs, link internally from the main content areas to pages on your website, ideally to product and category pages.

Figure 160 – This is a good implementation of internal linking from blog posts. An exact internal anchor text match is still valuable, although it is important not to overdo it.

At the risk of becoming annoying, I need to stress this: if you are not blogging, you are missing many long-tail search queries used by possible customers in the early buying stages.

Remember, you write articles not to sell or promote something but to grab long-tail traffic for informational search queries and to support pages higher in the hierarchy. The amount of content you need to create to support a category, subcategory, or product depends on how competitive each keyword is.

To create contextual links, use blog comments, user-generated content, user or customer support questions and answers, guest posts, product images with captions, user-submitted images, curated rich media, and even shop-able images.

Anchor text

The anchor text optimization principle is simple: the text used in the anchor sends relevance clues to search engines, so it must be relevant to the page it links to. For example, if the anchor text is “suitcases” and the linked-to page includes the phrase “suitcases” and other semantically related words, the anchor text in the incoming link is given more weight.

However, if you use “click here” as internal anchor text pointing to, let’s say, hotel description pages, search engines will assign less relevance to those anchors because they are too generic and communicate nothing about the linked-to page. In this example, use the hotel names in the anchor text when linking internally to hotel description pages.

The following study analyzed more than 280,000 internal links and their corresponding anchor text across over 3,000 e-commerce and non-e-commerce websites[30]. It examined the most common words used in internal anchor text; the screenshot below shows those terms ranked by frequency.

Figure 161 – The study examined how 3,000 websites use anchor text in internal linking.

Seven out of the top 10 anchor texts could be logically consolidated into three groups, represented by the numbers in the image. This technique is called link consolidation, and it is a better alternative to link sculpting with nofollow. Remember that if the links you consolidate are in the footer, the value of doing this is minimal.

Let’s see what anchor texts you use to link pages internally.

First, determine whether you use generic anchor texts such as “click here” or “here” on your website. After you crawl your website, use the IIS SEO Toolkit to check for the “The link text is irrelevant” violation, reported under the Violations Summary section of the tool:

Figure 162 – Double-click any violation title in the Violations Summary section for more details about each error.

There are situations where it is OK to use “click here” as anchor text, for example, when you link to a page unimportant for rankings or use “click here” as a call to action. In fact, “click here” is one of the most powerful calls to action used in online marketing.

By default, the IIS SEO Toolkit searches for the words “here” and “click here” in anchors. In practice, there are more generic anchors that you should pay attention to. A more comprehensive list of generic anchors is available here.

If you want to be exhaustive with this analysis, export the anchors list from the IIS SEO Toolkit and use Excel for a deeper analysis. Here’s how to do it.

Figure 163 – Create a new link query.

In the IIS SEO Toolkit, go to the Dashboard, click the Query drop-down, and click on New Link Query.

Figure 164 – Use the settings depicted in sections (1) and (2).

Use the following settings in section (1):

  • Linked Is External Equals False.
  • Link Type Not Equal Style.
  • Link Type Not Equal Script.
  • Link Type Not Equal Image.

In the Group By section, select Link Text. Then hit Execute, sort by Count, and then click Export. This will generate the aggregated link text report. You can export the data to a .csv file.

If you saved a query as an XML file, you can import it instead of rebuilding it: once the Links tab opens, right-click anywhere in the gray area and select Query –> Open Query.

Figure 165 – Importing an XML query in the IIS SEO Toolkit.

Open the file generated by the IIS SEO Toolkit in Excel and name one of the worksheets Anchors. Name the first column Anchor and list all the anchor texts. Name the second column Occurrences and list the occurrence counts (the SEO Toolkit generates this data).

Add a third column (name it Presence) and leave it empty for now because this column will be filled in later using a VLOOKUP function.

Figure 166 – The count of occurrences for each anchor text.

Create a new worksheet and name it Generic anchors. Add two columns, Generic Words and Presence. List all generic keywords in the Generic Words column and fill the Presence column with the number “1”:

Figure 167 – Adding the number one in the Presence column will be used to match the anchors on your website with the generic anchor text list.

Now, go back to the Anchors spreadsheet and add the following VLOOKUP formula in cell C2:
=VLOOKUP(A2,'Generic anchors'!A:B,2,FALSE)

Figure 168 – VLOOKUP is a built-in Excel function that works with data organized into columns.

Copy the VLOOKUP formula down column C. You can double-click the fill handle (the small square at the bottom right of cell C2) to fill column C with the formula automatically.

If there is an exact match between the anchors used on the website and the generic keywords list, the column C cells will be filled with the value “1”. You will get “#N/A” when there is no match. Sort or filter by “1”, and you will get the list of generic anchors on your website:

Figure 169 – The anchor text “Blog” is one of the most used internal anchor texts. Additionally, there are some other generic anchors such as “click here”, “here”, “home”, or “website”.
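If you prefer scripting over VLOOKUP, the same matching step can be sketched in a few lines of Python. The anchor counts and generic list below are hypothetical sample data, and the function name is mine, not part of any tool.

```python
# Flag which exported anchor texts appear in a list of generic anchors.
# This mirrors the VLOOKUP match: normalize both columns, then look up
# each anchor in the generic set.

def flag_generic_anchors(anchor_counts, generic_anchors):
    """Return {anchor: count} for anchors that match the generic list."""
    generic = {g.strip().lower() for g in generic_anchors}
    return {
        anchor: count
        for anchor, count in anchor_counts.items()
        if anchor.strip().lower() in generic
    }

# Hypothetical data exported from the IIS SEO Toolkit:
anchors = {"Blog": 1240, "click here": 57, "Canon EOS Rebel XTi": 12, "here": 31}
generic_list = ["click here", "here", "home", "website", "blog"]

print(flag_generic_anchors(anchors, generic_list))
# {'Blog': 1240, 'click here': 57, 'here': 31}
```

Unlike the exact-match VLOOKUP, this version is case-insensitive, so it catches “Blog” vs. “blog” without extra normalization in Excel.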

So, we identified that the “blog” anchor text is heavily used on this website; this large number suggests that it is probably a site-wide link.

Next, we will use the IIS SEO Toolkit to see which pages link to the Blog section.

You will need to open a new query by going to Dashboard –> Query –> New Link Query;

In the Field Name section, use the following settings:

Figure 170 – You can group the data by Link Text.

  • Link Type Not Equal Style.
  • Link Type Not Equal Script.
  • Link Type Not Equal Image.
  • Link Text Equals “Blog”.

In the Group By section, select Link Text (if Group By does not show up by default, click the Group By icon just below the Links tab). Next, hit Execute. This report shows how often “Blog” was used as anchor text.

Double-clicking on “Blog” will open a detailed list of Linking URLs. Repeat the process for all generic anchor texts.

Be more creative and replace the anchor text “blog” with something more appealing to search engines and people. Even {CompanyName} Blog is a better choice, but you could theme this anchor text even more. For example, you could use {CompanyName} Fish & Hunt Blog if you sell fishing or hunting equipment. If you sell running shoes, you could use Mad Runner’s Blog, and so on.

When you link to category or product detail pages, use the category or product names as anchor text. For instance, if you sell books, link to each product detail page using the book’s name. You can also vary the anchor text by adding brands or product attributes to the product name.

Exact internal anchor text match still matters for e-commerce websites if you do not go overboard, for example, by spamming with site-wide footer links. Usually, it is a good idea to match the search queries with your internal anchor text as closely as possible. However, how do you know which anchors to use to link to a page that lists, let’s say, ignition systems for a 2004 Audi A3? By doing keyword research.

For example, you can break down the keywords by years, makes, models, product types, or categories if you sell auto parts. Collect keyword data from as many sources as you can: user testing, Google Analytics, Google Ads data, your webmaster accounts, competitor research, or data from your Amazon account. Put all the keywords in a master spreadsheet and remove duplicates using Excel.

Add the metrics that you want to take into consideration, and your table may look like this:

Figure 171 – I like adding a keyword ID in the first column to revert to the original data at any time by sorting by ID.

As metrics, I will consider the average monthly searches for each keyword and the number of conversions.

Now, you need to identify search patterns. You will do this by replacing each word with the product attribute or category it belongs to. For example, you will replace “2007” or any other year with the placeholder {year}, “Chevy” or any other make with the placeholder {make}, and “grill” or any other category name with the placeholder {category}. Replace all the words until you end up with a list of placeholder patterns.

Figure 172 – You can speed up this process if your programmers can write a script to replace keywords with attributes automatically.

Once you have replaced all the words with placeholders, identify the most used patterns by using pivot tables:

Figure 173 – You can identify the most used patterns using pivot tables.

For your pivot table settings, use Keyword Pattern for your rows, and for Values, use the following:

  • The sum of average monthly searches.
  • The sum of conversions.
  • The count of keyword patterns.

There you have it! The most common pattern in our example is {year}{make}{model}{category}. However, the pattern with the most searches is {make}{model}. The pattern with the most conversions is {make}{model}{category}.
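The replacement-and-pivot steps above can be sketched in Python. The attribute sets, keywords, and metrics below are hypothetical sample data; a production script would load them from your taxonomy and keyword spreadsheet.

```python
from collections import Counter

# Replace known attribute values with placeholders, then aggregate the
# resulting patterns the way the pivot table does.

ATTRIBUTES = {
    "{year}": {"2004", "2007"},
    "{make}": {"chevy", "audi"},
    "{model}": {"a3", "silverado"},
    "{category}": {"grill", "ignition"},
}

def to_pattern(keyword):
    """Turn a keyword into its placeholder pattern, word by word."""
    tokens = []
    for word in keyword.lower().split():
        for placeholder, values in ATTRIBUTES.items():
            if word in values:
                tokens.append(placeholder)
                break
        else:
            tokens.append(word)  # keep unmapped words as-is
    return " ".join(tokens)

# (keyword, average monthly searches, conversions)
keywords = [
    ("2007 chevy silverado grill", 880, 12),
    ("2004 audi a3 ignition", 320, 9),
    ("audi a3", 5400, 3),
]

searches, conversions, counts = Counter(), Counter(), Counter()
for kw, monthly, conv in keywords:
    pattern = to_pattern(kw)
    searches[pattern] += monthly
    conversions[pattern] += conv
    counts[pattern] += 1

print(searches.most_common(1))
# [('{make} {model}', 5400)]
```

The three Counters correspond to the three pivot-table values from the list above: sum of searches, sum of conversions, and count of keyword patterns.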

By mimicking user search patterns in your internal linking, you will increase the relevance of the linked-to pages.

Anchor text variation
Despite RankBrain becoming better at understanding keyword variations, it is still a good idea to vary the internal anchor text pointing to the same URL. For ecommerce websites, category and subcategory pages allow only limited room for keyword variation. For example, when you link to the Vancouver Hotels page, you can use “hotels in Vancouver” or “Vancouver hotels”.

When you link to a product listing page (for example, a page listing all Rebel XTi cameras), you can add the brand name (“Canon Rebel XTi”) or the product line the product belongs to (e.g., “Canon EOS Rebel XTi”).

Figure 174—The anchor text can be more varied when you link from content-rich areas such as blog posts or user guides.

Contextual text links allow more anchor text variation than structural links. Structural links are often based on rules, such as using only the product names or product names plus product attributes. Therefore, structural links are not very flexible, while contextual text links are.

For product variants (e.g., model numbers or different colors), the anchor text on the item name can contain differentiating product attributes:

Figure 175 – These three SKUs are variants of the “Canon Digital Rebel XTi 10.1MP” product.

In this screenshot, the three SKUs are variants of the same product, “Canon Digital Rebel XTi 10.1MP”. The first SKU is just the camera body. The second SKU includes a lens, and the anchor text includes that detail. Similarly, the third SKU includes a lens but in a different color.

Remember to link using text that makes sense for users without forcing keywords. Also, just a reminder that when you use plurals in the anchor text (e.g., “digital cameras”), consider linking to a listing page because search queries that contain plurals usually denote that users want to see a list of items.

Merchandising and marketing teams needed ways to cross-sell and upsell, so ecommerce websites started featuring sections such as Related Items. This section goes by various names: “people who purchased this also purchased…”, “you may also like…”, “people also viewed…”, “related products”, or “related searches”. The concept was originally introduced to increase the average order value by increasing the number of items users add to the cart. This tactic also helps users navigate to related products or categories.

Figure 176—The You May Also Like section in this screenshot is commonly found on e-commerce websites and is a good example of Related Items components.

SEOs realized that related items sections could also be used to:

  • Optimize internal linking by interconnecting deep pages (i.e., facets) that were otherwise impossible or difficult to connect with other navigation URLs, such as breadcrumbs.
  • Flatten the website architecture.
  • Silo the website architecture by linking to siblings and parent categories. Keep in mind that siloing with related products requires very strict business rules.

Links from “Related links” sections can be used to boost the authority of any page(s) whenever needed:

  • By linking directly from the category listing or home page, you can boost the crawling, indexing, and, eventually, the rankings of newly added products.
  • If there are products with very high value for your business, linking from the home page will send more authority to those products.
  • You can also link to houses in nearby neighborhoods on a page that lists all houses for sale in a particular district.
  • You can boost hotel description pages by linking to recently reviewed hotels from city listing pages.

If you have a lot of data to rely on, you can implement related products, categories, or searches with the help of recommendation engines. Such engines optimize the shopping experience on the fly, but often, they are implemented with uncrawlable JavaScript. One way of tackling related items implemented with JavaScript is to define and load a set of default products accessible to search engines when they request a page. You will then replace or append more items with AJAX once the page loads in the browser to improve the discoverability for users and bots. The idea is that you do not want to leave the rendering of the content to Googlebot.
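As a rough sketch of the server-side half of this approach, the initial HTML response could include a small, crawlable default list, with AJAX free to append more items after load. This is an assumption-laden illustration; the function name, markup, and product data are invented for the example.

```python
# Render a crawlable default list of related products into the initial
# HTML. The browser can later append more recommendations via AJAX
# without search engines depending on that client-side step.

def render_default_related(products, limit=4):
    """Return an HTML fragment search engines see on the first request."""
    items = [
        f'<li><a href="{p["url"]}">{p["name"]}</a></li>'
        for p in products[:limit]
    ]
    return '<ul id="related-items">\n' + "\n".join(items) + "\n</ul>"

# Hypothetical default recommendations for a camera product page:
defaults = [
    {"name": "Canon EF 50mm Lens", "url": "/lenses/canon-ef-50mm"},
    {"name": "Camera Bag", "url": "/accessories/camera-bag"},
]
print(render_default_related(defaults))
```

The point of the sketch is that the default items are plain links in the server response, so discovery does not depend on Googlebot rendering JavaScript.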

Figure 177 – The related items section on the left side of the screenshot is accessible to search engines, as you can see in the cached version of the page on the right side.

On a side note, while the content of the recommendation engine is indexed, the images’ alt text could be improved. On the other hand, on the website below, the AJAX implementation prevents search engines from finding the recommended products:

Figure 178—The You May Also Like section should appear in the cached version just after the last product in the list, but it does not.

If the website above wants to flatten its architecture by linking internally from related products, it must ensure search engines can access the links in the related products section. Use the Fetch and Render tool in Google Search Console to check whether search engines can render the items in the You May Also Like section. If it works there, it will work in search, too.

Googlebot renders pages much like a headless browser, that is, a web browser without a graphical user interface that can still render and “see” the content on JS-powered pages.

Also, remember that what you see when you use the “cache:” operator is not the same as what Google renders at its end. Google’s source of truth is very close to what “Fetch and Render” provides in Google Search Console, while the cached version is just the raw HTML. It is most likely that Google uses both the cached and rendered versions of a page to ensure people are not spamming.

Here are a few things to consider when implementing related or recommended items:

  • If you need to add tracking parameters to recommended item URLs, do so in the browser, on mousedown or onclick events. If you cannot use click events, canonicalize the tracking parameters using Google Search Console or rel="canonical" relationships.
  • Keep the number of recommended items low and focus on quality (three to five products should be enough).
  • If you want to provide even more recommended items, use carousels.

The website below links to a sweater and sandals PDP because those products are related to the product detail page they are featured on.

Figure 179 – You can interlink related items even if they are in different silos if it makes sense for users (e.g., link from a skirt PDP to the sandals PDP that completes the look).

Popular Searches sections are another internal linking tactic you can implement on ecommerce websites. These popular searches (which do not necessarily come from your internal site search) can be an SEO powerhouse, especially for large ecommerce sites. You can automate the creation of internal links to categories, products, product listing pages, and facets at scale, which will increase the number of internal links to those pages. You can also use metrics such as conversion rates, search volumes, or rankings to distribute the internal links more efficiently; the more popular a query is, the more internal links it will need to reach page one rankings.
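One way to operationalize that idea is to allocate internal links in proportion to search volume, with a floor and a cap. This is a rough sketch; the queries, volumes, and thresholds are hypothetical.

```python
# Distribute a budget of internal links across popular searches in
# proportion to their search volume, bounded by a floor and a cap.

def links_per_query(queries, total_links, floor=3, cap=50):
    total_volume = sum(volume for _, volume in queries) or 1
    plan = {}
    for query, volume in queries:
        share = round(total_links * volume / total_volume)
        plan[query] = max(floor, min(cap, share))
    return plan

# Hypothetical monthly search volumes:
popular = [
    ("vancouver hotels", 9000),
    ("hotels with pool vancouver", 900),
    ("pet friendly hotels vancouver", 100),
]
print(links_per_query(popular, total_links=100))
```

The cap keeps one dominant query from absorbing the whole budget, and the floor guarantees every targeted page gets at least a few internal links.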

Internal linking over-optimization

While internal links with exact match anchor text typically do not hurt,[31] do not overdo it. Let’s look at a few scenarios that can raise over-optimization flags.

Unnatural links to the homepage
It does not help much to replace the anchor text “home” with your primary keyword.[32]

Figure 180 – This looks spammy.

If your domain or business name is “online pharmacy”, it may be fine to use keyword-rich anchor text to point to the home page; otherwise, do not do it.

Too many contextual text links
A high ratio of internal anchor text links to content is not advisable. For example, if a category description content has 100 words and you place 15 anchors in it, that is too much.

Figure 181 – Contextual links are great, but that does not mean you must abuse them, as depicted in the image above.

Contextual text links can be created either programmatically or added manually by copywriters or SEOs. In both cases, you need to define rules to avoid over-optimization.

Let’s exemplify with a set of rules for category descriptions:

  • Add links to other products from the parent category. The maximum number of products linked per 100 words is two.
  • Add links to related categories. The maximum number of related categories linked per 100 words is two.
  • The maximum number of consecutive anchor text links is two.
  • The maximum number of links with the same anchor text is one.
  • The minimum number of links per 100 words is two.

Use these rules just as guidelines and customize them based on your circumstances.
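Two of the rules above (overall link density and the repeated-anchor rule) can be automated with a simple checker. The thresholds below come from the guideline list; the function and input format are hypothetical.

```python
# Check a category description against two of the guidelines: at most
# four links per 100 words (two product + two category links) and no
# repeated anchor text.

def check_description(word_count, anchors):
    issues = []
    links_per_100 = 100 * len(anchors) / max(word_count, 1)
    if links_per_100 > 4:
        issues.append("too many links per 100 words")
    seen = {}
    for anchor in anchors:
        seen[anchor] = seen.get(anchor, 0) + 1
    if any(count > 1 for count in seen.values()):
        issues.append("repeated anchor text")
    return issues

print(check_description(50, ["trail shoes", "running socks", "trail shoes"]))
# ['too many links per 100 words', 'repeated anchor text']
```

The consecutive-link rule needs anchor positions within the text, so it is left out of this sketch.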

The following is an example of decently safe internal linking:

Figure 182 – The text in this paragraph flows naturally, and the anchors also seem natural.

Keyword-stuffed navigation and filtering
Some ecommerce websites try to enhance rankings for head terms like category or subcategory names by stuffing keyword-rich anchor text links in the primary navigation, similar to what you see in this screenshot:

Figure 183 – Did you notice how each subcategory link contains the upper category name?

It is not necessary to use keywords repeatedly in the main navigation. If your website architecture is properly built, search engines will understand that if the category name is Watches, all the links and products within it belong to the Watches category.

The same applies to other forms of navigation, such as faceted navigation.

Figure 184 – These links look spammy, too.

You can use properly nested list items to help search engines understand categorization so that you do not need to repeat the category name in every filter value in the left navigation.

Because PageRank is a renewable metric, external links to category and subcategory pages provide ranking authority to those pages and increase the amount of PageRank flowing through the entire website. Moreover, because building links to individual product pages is not economically feasible for ecommerce websites with large inventories, link-earning efforts should focus on category and subcategory pages. Remember that backlink building is complex and outside the scope of this guide.

Focusing your link-building efforts on just a few top-performing category pages is a good idea for new websites or websites with limited marketing budgets. Still, generally, you need to diversify your targets. Once you build enough links to a category page, that page becomes a hub: it will pass link equity to pages downwards and upwards in the website hierarchy. The more hubs you build, the more natural your website will look, and the more PageRank will flow throughout it.

You can identify existing link hubs using Google Search Console and use them to your advantage. Anytime you want to boost a new page, you can tap the power of the hubs. For example, you identified that the Women’s Apparel subcategory is a hub. If you want to boost the Women’s Sleepwear category, link to it contextually from the main content on the hub page.
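A simple heuristic can approximate hub identification from backlink counts exported from Google Search Console. The threshold factor and page data below are hypothetical.

```python
from statistics import median

# Treat pages whose external backlink counts stand well above the site
# median as hubs worth linking from.

def find_hubs(backlinks, factor=3):
    """Return pages with at least `factor` times the median backlink count."""
    threshold = factor * median(backlinks.values())
    return sorted(page for page, count in backlinks.items() if count >= threshold)

# Hypothetical backlink counts per page:
pages = {
    "/womens-apparel": 120,
    "/womens-sleepwear": 4,
    "/mens-shoes": 9,
    "/blog/fit-guide": 15,
}
print(find_hubs(pages))
# ['/womens-apparel']
```

Once identified, a hub like /womens-apparel becomes a candidate source for contextual links to pages you want to boost, such as /womens-sleepwear.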

  1. Browser-specific optimizations and cloaking
  2. GET, POST, and safely surfacing more of the web
  3. Google Analytics event tracking (pageTracker._trackEvent) causing 404 crawl errors
  4. Call of Duty: Ghosts – Xbox 360
  5. Free SEO Toolkit
  6. One More Great Way to Use Fusion Tables for SEO
  7. Visualize your Site’s Link Graph with NodeXL
  8. How To Visualize Open Site Explorer Data In Gephi
  9. rel="nofollow", Microformats Wiki
  10. Interview with Google’s Matt Cutts at Pubcon
  11. Use rel="nofollow" for specific links
  12. PageRank sculpting
  13. Should internal links use rel="nofollow"?
  14. Damping factor
  15. Are links in footers treated differently than paragraph links?
  16. Is Navigation Useful?
  17. Ten recent algorithm changes
  18. Link Value Factors
  19. Image Links Vs. Text Links, Questions About PR & Anchor Text Value
  20. Are links in footers treated differently than paragraph links?
  21. VIPS: a Vision-based Page Segmentation Algorithm
  22. Block-Level Link Analysis
  23. Document ranking based on semantic distance between terms in a document (US Patent 7,716,216)
  24. Google’s Reasonable Surfer: How The Value Of A Link May Differ Based Upon Link And Document Features And User Data
  25. Results of Google Experimentation – Only the First Anchor Text Counts
  26. 3 Ways to Avoid the First Link Counts Rule
  27. When Product Image Links Steal Thunder From Product Name Text Links
  28. Do multiple links from one page to another page count?
  29. Agave Denim
  30. [Study] How the Web Uses Anchor Text in Internal Linking
  31. Will multiple internal links with the same anchor text hurt a site’s ranking?
  32. Testing the Value of Anchor Text Optimized Internal Links



Length: 9,019 words

Estimated reading time: 1 hour



In this part of the guide, I will break down the most common sections found on home pages and describe how to optimize each for better search engine visibility and user experience. You will learn how to improve the primary navigation, which impacts almost every page on the website. I will show you how to better use the internal site search field by helping users and search engines with discoverability and findability. You will also learn how to optimize marketing, merchandising, and plain text areas.

Of all the pages on a website, home pages usually have the highest authority because of the way PageRank flows and because most of the backlinks point to it. I said usually because, in some cases, other pages can beat the home page in terms of authority. For example, if the internal linking architecture is broken or another page receives more external links, search engines might not deem the home page the most important.

Every department, including marketing, merchandising, information architecture, SEO, UX, and even the executives, wants a piece of the homepage. Hence, home pages are often cluttered with content and links to tens or even hundreds of categories, calls to action, marketing banners, etc. This makes home pages unfriendly to users and search engines.

The biggest advantage of home pages is that they pass a lot of PageRank downwards on the website taxonomy. Pages linked directly from the home page get more search engine love.

Do you need a boost for a new product or a new category? Link to it from the homepage. It is a simple concept. For example, if you want to increase authority for the most profitable or best-converting city or hotel pages, link them from the homepage.

If you want to push authority to categories or product pages but do not want to crowd the home page, add a Featured Categories section on the index page of the HTML sitemap.

Figure 185—The links within the New Top Products and Top Ink Families sections will create shorter paths for crawlers and send the linked pages more authority.

Before getting into the details, remember that when optimizing home pages, it is important to balance SEO with user experience and business goals. The same applies to all other pages on your website.

You do not want too many links on the home page; you want important links in prominent places. Also, the decision to add, remove, or consolidate links on the home page needs to consider users first and only then accommodate search engines.

Let’s see which sections appear most frequently on ecommerce home pages and then discuss optimization tactics specific to the most important of them:

  • Logo (this is the area where you display the logo and, possibly, a tagline).
  • User account (this area displays links to register and sign in, my account, order tracking, and other pages that require a login).
  • Site personalization (these are links to the country or currency selectors, store locator, color theme, etc.).
  • Search field (this is the area surrounding the internal site search field).
  • Primary navigation (this is the global navigation area).
  • Cart area (this is where you list links to the shopping cart or checkout).
  • Marketing and merchandising areas (i.e., carousels, internal banners, featured products, top categories, most popular deals/brands, and so on).
  • Promotional area (e.g., wish lists, gifts).
  • Help area (e.g., FAQ, live chat, contact us, help center).
  • Footers.

Primary navigation

Theoretically, any HTML link is a navigation element. Still, for our purposes, we will refer to navigation in the context of primary and secondary navigation menus used by visitors to browse items. Primary navigation is also known as global navigation, while secondary navigation is called local navigation.

Primary navigation usually appears horizontally at the top of a web page or, sometimes, vertically as a sidebar on the left side of the page. Primary navigation is easy to identify, as it consistently appears in the same position across the entire website.

For e-commerce websites, the labels in primary navigation represent major information groups. Depending on how the information architects, marketing, and usability team structure the website, labels can organize information by departments, topics, top-level categories, target market, alphabetical order, or other ways.

Figure 186—Walmart’s primary navigation is vertical, and the labels in this screenshot are the departments.

Figure 187 – The primary navigation on Bed Bath & Beyond is horizontal and lists top-level categories.

Figure 188 – Crocs’ website’s main target market segments represent the primary navigation.

When you have many items to list in the menu, display the global navigation vertically. Use a horizontal layout when you can fit all the important labels at the top of the design.

Number of links in the primary navigation

Displaying the primary navigation horizontally limits the number of links placed at the top of the layout to between five and twelve, depending on how long the labels are. Do not worry; these are reasonable numbers for users and search engines. A vertical navigation bar is more versatile and displays more categories.

On ecommerce websites, however, the primary navigation is often supplemented by sub-navigation such as drop-downs, fly-outs, or mega menus. This navigation can substantially increase the number of links on any given page.

Figure 189 – This is a very common drop-down menu implementation. The Clothing section lists several topical links.

Usually triggered on mouse hover, drop-down and fly-out menus pose usability issues.[1] However, mega menus seem to perform well.[2] Refer to these two articles for more information on the usability of sub-navigation menus.

Figure 190 – This screenshot shows an example of the so-called “mega menus.” Such menus can include more than just a list of links; they can also include images.

Figure 191 – Staples handles drop-down menus in a more user-friendly manner. They gave up the standard mouse-over implementation; the sub-menu expands at a click only. Additionally, notice the red down arrow icons; such icons suggest to users that more information is displayed under the labels.

There is no hard limit on how many links to list in sub-menus. I recommend using as few or as many as make sense for users; do not worry about search engines too much. Let me explain why.

You often hear suggestions to limit the number of links on a page to send more authority to other, possibly more important, pages. While it may be the right SEO approach on certain page types, such as product listing pages with faceted navigation, this recommendation often disregards the user experience and usability. A widely accepted SEO best practice suggests that the entire menu must be SEO-friendly, meaning that search engines should be able to crawl all the links in the menu. This is not accurate.

In practice, some ecommerce websites could benefit more from not allowing robots to crawl just about any link in sub-menus. There is no doubt about it: You should present to users as many links as make sense, but you can also make some of them undiscoverable for search engines.

This “unfriendliness” of the links does not count as cloaking as long as it is not done for deceptive purposes. After all, even Amazon uses this technique to push authority to products and categories listed in the main content area of the page, as you can see in this screenshot:

Figure 192 – Take a look at the cached version of the page. It does not include the Shop by Department links.

Their global navigation (labeled Shop by Department) presents more than 100 links to users, which helps them navigate the website. Still, those mega menu links are not plain hyperlinks for search engines, as you can see in the cached version of the page. A quick note: if the links do not appear in the cached version, it does not mean that search engines are unaware of them, as bots can execute JavaScript to discover those links. It all depends on how Amazon made those links uncrawlable.

Figure 193 – Other retailers in the top 100 obfuscate links as well.

Moreover, Amazon is not the only top retailer to do this. Here’s another retailer in the top 100 doing it as well. Their approach is slightly different; they allow access to department links, but all the top categories, such as Appliances or Electronics, are not plain links.

Figure 194—The primary navigation links are cached in this example, but the sub-menu navigation links are not.

Making the links uncrawlable may sound strange and against SEO common sense, but it is a great way to balance usability and SEO. Users will get the links they need, while search engines will have access to prioritized links, which will prevent wasting the crawl budget.

The number of links in the primary navigation also depends on the number of categories and subcategories in your taxonomy. If you have only five top category pages, each with two to five subcategories, you can list them all in the primary navigation. If you have twenty categories with ten subcategories each, you need to consider the primary navigation more.

A more radical technique for limiting the number of links in the navigation is to eliminate sub-menus. There will be no drop-downs or mega menus, just a dropline menu. A dropline menu is a sub-menu with only one line of items. You can see it in use on Ann Taylor’s website. When using a dropline menu, you must choose the labels carefully.

Figure 195 – Ann Taylor uses a dropline menu with carefully chosen labels.

Figure 196 – Aeropostale uses a minimalist approach with only four links in the primary navigation. This approach works well on apparel websites.

If you keep a few links, ensure the navigation helps users find content quickly and does not make their task more difficult.

Going back to PageRank basics, we know that the more links on a page, the less authority flows through each link. Regarding user experience, the more cluttered a page is, the harder content findability[3] becomes, and the higher shopper anxiety[4] rises. Therefore, it makes sense to reduce the number of options and improve the user experience by minimizing decision paralysis.[5]
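A back-of-the-envelope calculation shows the PageRank side of this: in the classic simplified model, each outgoing link passes roughly PR times the damping factor divided by the number of outlinks, so halving the link count doubles what each remaining link passes. A minimal sketch:

```python
# In the simplified PageRank model, each outgoing link receives an
# equal share of the page's rank after applying the damping factor.

def authority_per_link(page_rank, outlinks, damping=0.85):
    return page_rank * damping / outlinks

# Halving the number of links doubles the share each link passes:
print(authority_per_link(1.0, 100))  # share with 100 links
print(authority_per_link(1.0, 50))   # share with 50 links (twice as much)
```

Real ranking systems are far more nuanced (see the reasonable surfer and block-level link analysis references), but the direction of the effect is the same.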

This is where click tracking and analysis play an important role. You identify which links are helpful for users using metrics such as the most-clicked links and remove the ones that are not. If multiple links point to the same URL, you can leave only one link or implement browser-side URL tracking parameters.

You have several tools at your disposal to track clicks and click paths. For example, Google Analytics allows for a nice visual click analysis with reports like User Flow and Behavior Flow.

Figure 197 – The Visitor Flow Report in Google Analytics.

You can also use the Navigation Summary report (found under Behavior –> Site Content –> All Pages) for path analysis in Google Analytics. You get up to 500 data points that you can export and analyze further with Excel:

Figure 198 – The Navigation Summary report can provide some interesting insight.

Other tools, like CrazyEgg or ClickTale, can also perform click analysis. Regardless of your tool, the goal is to identify links that can be removed from the navigation or consolidated into logical groups.

Homepages often link to top-selling and high-margin products, categories, or subcategories, which is good for users and search engines. If you are concerned about the number of links on the page, you can easily consolidate the links to the About Us, Contact Us, Terms of Use, and Privacy pages.

I advise testing to see whether the uncrawlable links approach or the reduced number of links approach works best for you, as each website is different in its vertical.

Navigation labels

The categories and subcategories listed in the menus should consider business goals and user testing. For example, if 20% of your categories generate 80% of the revenue, then those categories should be linked from the primary navigation. You should also experiment with other decision metrics, such as the most-searched terms on the website, the most-visited pages, etc.

Regarding the labels present in the navigation text links, there are two schools of thought in terms of SEO:

  • The labels should not contain the target keywords.
  • The labels should contain the target keywords.

What I can recommend with confidence is to:

  • Avoid forcing keywords in the primary and secondary navigation labels.
  • Avoid clever labeling[6]. Navigation labels must be easy for users to understand, enabling searchers to find the information they want quickly and easily. For example, what should the label “Inspired Living” mean on a home improvement website?
  • Design the navigation to pass the trunk test[7], making it intuitive and using clear labels.

Search engines should not be the focus when labeling primary and secondary navigation. You have to label for users, not for bots. If that means using keywords in labels, that is fine. If your labels do not require keywords to be useful for users, do not force keywords just for SEO reasons.

Sometimes, keywords can show up in the navigation naturally. For example, if an ecommerce website sells musical instruments and wants to rank for the keyword “guitars”, having a Guitars label in navigation is natural and will help.

There is an alternative for those who want to promote longer keywords in the primary navigation menus. You can use images to design the menu that includes short text labels for users while adding longer labels as keywords in the alt text of the images for search engines. However, do not spam the alt text. Additionally, you can implement a technique called image replacement[8].
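A common image-replacement pattern looks roughly like this (the class name and image file are hypothetical): the longer label stays in the HTML as text, while CSS shows the short image label to users. Keep the hidden text honest; this is the non-spammy use the technique is meant for.

```html
<style>
  /* The text is pushed out of view; the image label is shown instead */
  .nav-guitars {
    display: inline-block;
    width: 120px;
    height: 40px;
    background: url(guitars-label.png) no-repeat; /* hypothetical image */
    text-indent: -9999px;
    overflow: hidden;
  }
</style>
<a class="nav-guitars" href="/guitars/">Electric and acoustic guitars</a>
```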

See the screenshot below for an example of short labels in an image plus longer alt text. The short image labels are Shoes, Handbags, Watches, Jewelry, and Dresses.

Figure 199 – Short labels allow users to scan them easily.

However, the alt text of each image label contains longer keywords, as you can see in the HTML source code:

Figure 200 – The alt texts are longer than the image labels.

Your end goal is to create navigation menus that meet user needs and reflect their behavior on the website. Because these image labels are short, people can scan them quickly, which is great for UX.

Menu labels can represent a single category or multiple categories grouped together. Whenever you group categories into a single label, you must create separate URLs for each category.

For example, suppose you group Pharmacy, Beauty, and Health into one label (as in the image below). In that case, you need to create a separate URL for the Pharmacy category, another for Beauty, and another for the Health category. That is if you want to increase the chances of ranking separately for keywords like “pharmacy”, “health”, or “beauty”.

Figure 201 – If you want to rank for the keywords “pharmacy”, “health”, or “beauty” separately, you need separate category pages for each term.

Grouping categories works best when planning a new website or if the current website is new and you can make changes without affecting an established taxonomy.

The opposite of grouping is de-grouping, when you want to split categories under a single label to create multiple categories. Splitting URLs will be more complicated if your website has been online for a while and you grouped categories under a single URL. In the latter case, one option is to leave the grouped category and create new pages for each subcategory.

Link to canonical categories from the primary navigation. This creates short, unique crawl paths for search engine robots.

Search field

The internal site search on e-commerce websites is often enhanced with autosuggest or autocomplete functionality that displays items from popular searches or products and category names. While autocomplete may be great for users, it is usually implemented with AJAX, so the suggested links are inaccessible to search engines.

However, you can help improve findability for users and search engines by adding plain HTML links to popular searches directly under the search field. Track internal site searches with a web analytics tool to identify which links are best for users.
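For example, crawlable plain HTML links placed directly under the search field might look like this (the URLs are hypothetical); note that they point to regular category or landing pages, not to internal search-result URLs:

```html
<!-- Crawlable, plain HTML links to the most popular searches.
     URLs are hypothetical and point to category/landing pages,
     not to internal search-result URLs. -->
<p class="popular-searches">Popular searches:
  <a href="/guitars/electric/">Electric guitars</a>
  <a href="/guitars/acoustic/">Acoustic guitars</a>
  <a href="/guitar-amps/">Guitar amps</a>
</p>
```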

Figure 202 – Google Analytics has a Site Search report that lists internal site searches. To get this list, you can use the Start Page as your primary dimension and the Search Term as a secondary dimension.

In general, the SEO advice regarding URLs generated by internal site searches is that those URLs should be disallowed with robots.txt. However, that is not always the best choice; if you can generate high-quality internal site search results that are helpful to users and bots and meet several criteria (i.e., high CTR), then you can let those URLs be indexed. Automating the process with safeguards can lead to thousands (or more) of helpful pages. Introducing an element of human rating for such pages can also provide quality signals for Google.

You can also add product attributes or filters as links, for example, by linking to a page that sorts by size or type of shoes. Additionally, you can link to popular searches:

Figure 203 – Using related terms near the search field is great because the Search By links serve as entry points for search engines and as a discovery tool for users.

If you want to improve the user experience, these links should vary from page to page to match:

  • Top searches performed on that specific page.
  • The next PDP or CLP visited after viewing the current page. You can get this data using your web analytics tool.

On a product detail page, you can list the most used product filters for that product or the leaf category it belongs to. A leaf category is at the bottom of the e-commerce taxonomy, meaning no other categories are under it (no children).

If you use the HTML <label> element for search fields, keep in mind that this content is indexable:

Figure 204 – Consider improving the text for <label> elements.

While the <label> element will not impact rankings, if you use it along with the <input> element, consider improving the wording within the label tag.

In this example, you can see that the text “find something great” is being used in the label tag. However, this website should use something more relevant and enticing to users, specific to the page visitors are on. For example, on all Shoes category pages, you can use “search for shoes”; on all Bags category pages, you can use “search for bags”; on all Toys pages you can use “search for toys”, and so on.

Also, creating HTML links for search buttons is not a good idea since search engines will unnecessarily try to crawl anything that looks like a link.

Figure 205 – This implementation creates links that bots will try to access.

In the previous implementation, the href points to a hash mark, which ends up being treated as a link to the home page with the anchor text “search”. This is not optimal, and you should avoid this kind of implementation.
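Instead of an anchor styled as a button, a genuine submit button gives bots nothing to crawl. A minimal sketch, with a hypothetical form action:

```html
<!-- Bad: looks like a button, but bots see a crawlable link -->
<!-- <a href="#" class="search-btn">Search</a> -->

<!-- Better: a real submit button has no href, so there is nothing to crawl -->
<form action="/search" method="get">
  <input type="text" name="q">
  <button type="submit">Search</button>
</form>
```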

Text Areas

Usually, home pages do not leave much room for plain text content, so you will find very few contextual text links going out from most home pages. Many ecommerce websites try to get around this challenge by adding text at the bottom of the page, close to the footer section, as in this example:

Figure 206 – This is a common approach on many websites.

Some websites even use CSS to position text sections at the top of the source code while, visually, the text sits at the bottom of the page. This might have worked in the past to get around the 100 KB indexing limit, but nowadays it is useless. Remember that search engines can render pages, so they will know where that text is displayed in the UI. Keep in mind that Google stops rendering a page after 10,000 pixels; if your so-called “SEO content” sits further down the page than that, it may not get picked up.

For those who still believe in the so-called “text-to-code ratio”, this ratio can be improved by using tabbed navigation or SEO-friendly carousels like the following ones:

Figure 207 – Carousels allow you to add more text content to home pages.

In the example above, there is plenty of plain text for search engines to feed on. There are also great internal links.

This approach makes this design implementation useful for users and search engines.

As you can see in the cached version of the page, the text content of the carousel is available to search engines for analysis and rankings:

Figure 208 – The cached version of the page.

Another method for including more text in the main content areas and for creating more contextual text links from a home page is to use tabbed navigation. While the text displayed in tabs was not given full weight in a desktop-first world, this changed when Google switched to mobile-first indexing.

Figure 209 – You can add plenty of plain text content in a relatively limited design space using tabbed navigation.

If you want to add more content, use expand and collapse features. However, remember not to fill that content with spam, or you may get into trouble.

If you decide to use tabbed navigation in the main content area, it is worth mentioning that users easily overlook tabs. Hence, you need to provide strong design clues to help them understand that there is more content behind this type of navigation.

Marketing and merchandising area

Sliders, carousels, or static banners are some of the most used marketing sections on home pages. Merchandising sections include featured products, top categories, most popular deals, top brands, and so on. SEO does not own these areas, but there is often room for organic improvement.

While carousels seem to have several usability and conversion issues,[9],[10] they are still present on many ecommerce websites, from Dell and Hewlett Packard to other small and medium businesses:

Figure 210 – Carousels can pose some usability and conversion issues.

From an SEO perspective, there are two usual issues with carousels:

  • The entire carousel is built with unfriendly JavaScript.
  • The text content and the links in carousels are embedded in images.

Let’s see how you can identify unfriendly JavaScript carousels.

Unfriendly JavaScript

Custom-made carousels may use AJAX or JavaScript to populate the carousel content dynamically, which is when you can get into SEO trouble. Carousels can be tested from an SEO point of view by placing them on a public test domain, subdomain, or page. Once Google crawls and caches the URL, look at the cached version of the page. If you see something along the lines of “loading” or “waiting for content”, or if the content of the carousel is missing, the carousel’s implementation is not SEO-friendly.

Figure 211 – The content of this carousel is not indexed; instead, the text “loading…loading” is.

Additionally, you can test the implementation using the URL Inspection tool (formerly Fetch and Render) in Google Search Console.

If you want to send authority to specific items and index links within carousels, correct the rendered HTML code so the content of the carousel can be found and crawled by search engine bots.

Figure 212 – A product carousel.

In the example above, neither section is cached by search engines. Additionally, the items linked from Section 1 are nofollowed, which suggests that this online retailer does not want the links to be crawled and indexed.

However, the links in Section 2 are followed, which may suggest that those items are supposed to be indexed. Yet Section 2 is implemented with JavaScript, which can create accessibility issues for those URLs.

Embedded text and links

To make the text and links accessible to search engines, use CSS positioning and image replacement to overlay text on background images. Another option is to create the carousels or banners in plain HTML and CSS:

Figure 213 – The carousel in the image above is built with CSS and HTML.

Figure 214 – The text on this banner is also overlaid with CSS and HTML. You can check whether text is plain HTML by selecting the page content with CTRL+A: plain text becomes highlighted when selected in the browser.
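Overlaying real HTML text on a background image can be sketched like this (the class-free inline styles, image file, and URL are hypothetical); the headline and link stay readable for users and crawlable for bots:

```html
<!-- Sketch: the banner artwork stays a CSS background, while the
     headline and link are real HTML text positioned on top of it.
     Image file and URL are hypothetical. -->
<div style="position: relative; background: url(sale-banner.jpg); height: 300px;">
  <p style="position: absolute; top: 40px; left: 40px; color: #fff;">
    Up to 50% off <a href="/running-shoes/">running shoes</a>
  </p>
</div>
```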

When the imagery is more complex, and you cannot implement image replacement or plain CSS/HTML carousels, use image alt text and <area> and <map> tags to create the links.

For example, the image below embeds three call-to-action links: shop now, sign up for emails, and like us on Facebook.

Figure 215 – The three calls to action in this example are implemented with area maps.

The code deploys the HTML <map> element with three areas to make the links clickable. Each area tag has its alt text:

Figure 216 – The above image depicts the HTML source code for the previous screenshot.
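A sketch of such an implementation, with hypothetical coordinates and URLs: one image, one <map>, and one <area> per call to action, each with its own alt text.

```html
<!-- One clickable region per call to action; coords and hrefs are
     hypothetical and depend on where each button sits in the artwork. -->
<img src="promo-banner.jpg" alt="Holiday promotion" usemap="#promo-map">
<map name="promo-map">
  <area shape="rect" coords="20,300,180,340" href="/sale/" alt="Shop now – holiday sale">
  <area shape="rect" coords="200,300,360,340" href="/newsletter/" alt="Sign up for emails">
  <area shape="rect" coords="380,300,540,340" href="https://www.facebook.com/yourbrand" alt="Like us on Facebook">
</map>
```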

If you think about it, maps inside images make sense for ecommerce websites. For example, apparel websites featuring model looks could allow users to click on the hat, the pants, or any other clothing depicted in the hero images to send users directly to product pages. The area tag will effectively make such images shoppable.

You can improve image maps by implementing SEO-friendly tooltips and hot spots that expand at a click. Notice the + and $ signs in these two carousels:

Figure 217 – The two hotspots can be used to add more text content.

When you hover the mouse over these two icons, the + and $ signs expand to provide more details. The text in the tooltips is indexable:

Figure 218 – The text in the + tooltip is indexable.

Since <map> and <area> tags are not commonly used by marketers, here are a couple of tips for you:

  • Every area element should have an alt attribute, even if it is empty (alt="").
  • The alt text should describe the image in 150 characters maximum.
  • According to Microsoft, the alt text should not start with the word “copyright”, the copyright symbol (© or (c)),[11] or any other character or symbol that has no search-engine relevance. Start the alt text with the most important words.
  • Avoid Flash area maps.
  • Use XML or text files for the tooltip content. This will allow copywriters to make changes easily.
  • If a particular Google ad headline tested well, consider reusing its wording for tooltip links; it might also help with CTR, and since ad titles are usually fewer than 25 characters, they fit nicely in tooltips.

Merchandising areas

Products and categories linked from the merchandising areas of the home page, such as hot deals, best sellers, or top brands, tend to receive more internal link authority than products linked using structural links. Therefore, you can use these areas to push more SEO love to the products and categories that are the most important for your business.

Figure 219 – Merchandising is common among online consumer electronics retailers.

Sometimes, the items listed in these areas are implemented with carousels. The design of the carousels should allow people to identify that they are looking at a carousel and allow them to have full control of the play, pause, next, and previous functionalities.

Many users will probably miss the black arrows and the dots that control the carousel in the previous image.

If you want to send PageRank to the items in the carousel, implement it in an SEO-friendly way, as previously discussed. When you have to use AJAX to load the items, I recommend loading a default set of items in the first slide of the carousel. This first slide is loaded in the raw HTML and is accessible to search engines. Load the next slides with AJAX when the user clicks the controller buttons.
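A minimal sketch of this hybrid approach, with hypothetical markup and endpoint: the first slide ships in the raw HTML so crawlers can see its links, and later slides load only when the user asks for them.

```html
<!-- First slide is in the raw HTML, so crawlers can see its links -->
<div id="carousel">
  <div class="slide">
    <a href="/guitars/fender-stratocaster/">Fender Stratocaster</a>
    <a href="/guitars/gibson-les-paul/">Gibson Les Paul</a>
  </div>
</div>
<button id="next">Next</button>
<script>
  // Later slides are fetched with AJAX only on user interaction
  // (the /carousel-slides endpoint is hypothetical).
  document.getElementById('next').addEventListener('click', async () => {
    const html = await (await fetch('/carousel-slides?page=2')).text();
    document.getElementById('carousel').insertAdjacentHTML('beforeend', html);
  });
</script>
```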

Here are some SEO tips to optimize the items listed in the merchandising areas:

  • Wrapping the merchandising section name in an HTML heading is not mandatory, but it can be done if you want. Since there will likely be more than one merchandising area, I recommend using level 2 headings (H2) or below. H1 is usually reserved for the page’s most important label, and for e-commerce templates, it is not a good idea to have multiple H1s on the same page.

Figure 220 – A sample headings outline.

On product listing pages, the product image thumbnails and names need to be optimized to send one consolidated signal. This means that the product’s image alt text and anchor text should be the same or slightly different.

Figure 221 – The image’s alt text is the same as the anchor text.

  • Add one to three links to manufacturers, brands, or relevant product attributes whenever you have the space to do so:

Figure 222 – You can see how Amazon links to various category pages from the Top Holiday Deals section.

  • Avoid linking with generic anchor text:

Figure 223 – The “See Details” link in the image above is not optimal. Instead, the product name should be the link. If you want to keep the “see details” link, embed it in an image with the product name as alt text.

  • Do you need a “Buy Now” or “Add to Cart” link on the items listed in the merchandising areas of the home page? Do visitors add to the cart directly from the homepage? If not, consider removing such strong CTAs or replacing them with softer CTAs such as “More details” or “Find more”.

If you are keen on using add-to-cart buttons in these sections, implement them with either the HTML <button> element or JavaScript:

Figure 224 – The “More Details” link is implemented as an HTML <button> element, so it won’t be crawled.

Figure 225 – This “add to cart” link is implemented with JavaScript, which minimizes the chances of being followed and crawled.

  • If you want to add even more content inside merchandising areas, use CSS expand and collapse features:

Figure 226 – When you click on the “more details” link in this example, more information about the “MacBook Pro” sale appears.

  • Use alt text and image map areas for links whenever you use images to display promotional banners.
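The consolidated alt-text/anchor-text idea from Figure 221 can be sketched like this, with a hypothetical product and URL; the thumbnail’s alt text and the text link’s anchor carry the same signal:

```html
<!-- The image alt text matches the text link's anchor text,
     sending one consolidated relevance signal. URLs hypothetical. -->
<a href="/guitars/fender-stratocaster/">
  <img src="stratocaster-thumb.jpg" alt="Fender Stratocaster electric guitar">
</a>
<a href="/guitars/fender-stratocaster/">Fender Stratocaster electric guitar</a>
```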


Logo

Most logos are implemented as images. Sometimes, logos embed the company’s tagline, slogan, or unique selling proposition.

Alt text or image replacement?

There is much debate about how to implement logos properly, not only from an SEO point of view but also as HTML markup. Regarding SEO, some argue that using the alt text on the logo image is enough; others recommend using the image replacement technique. Regarding HTML markup, some believe logos should be wrapped in H1, while others say they should be H2 or have no heading markup.

No matter how you provide better context for a logo image, whether using alternative text or image replacement, that content will not lift rankings. Just because you use the primary keyword in the alt text of the logo (for example, SiteName – Digital Cameras) does not mean that search engines will consider your website the authority for that keyword. It may help a tiny bit in obscure niches.

If you can easily and safely implement image replacement for your logo, do it. Image replacement is not spamming if you do not abuse it with keyword stuffing or other crazy stuff. W3C uses image replacement in their logo; A List Apart and Smashing Magazine use it too. MOZ used to do the same before their site redesign.

However, remember that using alt text on image logos will work just as well as image replacement.

Let’s look at a few options you can consider for the text describing the logo, implemented either with alt text or image replacement:

  1. You can use only the company name (e.g., “Staples logo”).
  2. Alternatively, use your company name plus two or three top categories (e.g., “Dell laptops, tablets, and workstations”).
  3. You can also use dynamic text that includes the brand and category names. In this case, the text changes from one page to another. For example, on the Home page you will use the alt text “Microsoft logo”, but on the Tablets page you will use the alt text “Microsoft logo – Tablets”.
  4. Another option is to use the company name plus a tagline, unique selling proposition, or slogan (e.g., “KOHL’s – expect great things”). Ideally, the slogan should be short, descriptive, and contain some keywords.

Figure 227 – Kohl’s slogan.

  5. You can also use the company name, followed by the company slogan in plain text:

Figure 228 – HP’s slogan.

HP’s slogan states the website’s focus and cleverly includes the keyword pattern {Printing Solutions}. This improves relevance for keywords like “large format printing solutions”, “commercial printing solutions”, and “industrial printing solutions”.

I like implementing option #5 wherever possible, followed by #4, #3, #2 and #1. In each case, the text must represent the logo it describes and should not be spammy.
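Option #3 (dynamic alt text that changes per page) can be sketched as a tiny helper; the brand and category names are hypothetical, mirroring the Microsoft example above:

```javascript
// Sketch of option #3: append the current category to the brand name,
// as in "Microsoft logo – Tablets". Names are hypothetical.
function buildLogoAlt(brand, category) {
  return category ? `${brand} logo – ${category}` : `${brand} logo`;
}
```

Your template engine would call this with the current page’s category so the logo alt text stays descriptive without manual upkeep.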

You can provide context for search engines using the logo’s alt text or image replacement. Both should be user-friendly and are OK for SEO if you do not spam. Note that the alt text on clickable images is equivalent to the anchor text on text links, so it is worth optimizing them.

However, some implementations do not allow alternative text on logos. One such implementation is the use of CSS backgrounds or CSS sprites.

Figure 229 – Walmart uses CSS sprites for its logo and other icons. In the cached version of the page, the alt text for the logo is missing since the logo was implemented as a CSS background.

The logo should link using the homepage canonical URL to consolidate relevance signals. This ensures that the internal reputation is not split between multiple URLs.

For example, do not link to the home page using both the default URL (i.e., index.php) and the root (/). Choose a canonical version, usually the root, and link to it consistently. This is not a big issue for search engines nowadays, but it is a web development best practice to be consistent with your internal links, just as using lowercase-only values for your UTM tags is best practice for tracking and analytics.

Wrap the logo using the Organization markup.

Google supports[12] Schema markup for organization logos[13]. This means you can markup the HTML code to specify which image should appear as the logo in the Google Knowledge Graph when someone searches for your brand name.

Simply wrap the logo using the Organization markup as in this example:[14]

<div itemscope itemtype="https://schema.org/Organization">

<a itemprop="url" href="">Home</a>

<img itemprop="logo" src="" />

</div>


This will help get more SERP estate dedicated to your brand.
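Google also accepts the same information as JSON-LD; a minimal sketch with hypothetical URLs:

```html
<!-- JSON-LD alternative to the microdata above; URLs are hypothetical -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "url": "https://www.example.com/",
  "logo": "https://www.example.com/logo.png"
}
</script>
```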

One frequent question is, “Should logos be wrapped in H1”?

Wrapping the logo in an H1 heading is highly contentious.[15] Additionally, the SEO influence of H1s is not significant.[16],[17]

I think the logo should not be marked up with H1 or any other heading, for that matter. A heading is a textual element that should be marked up as text, while a logo is a branding image and should be marked up as an image.

Utility links

Site personalization, user account login, help, and cart links can be labeled as what usability experts call utilities (Steve Krug[18]) or courtesy navigation (Jesse James Garret[19]) links.

Site personalization

Some of the best-known site personalization links are shipping destination, language, and currency selectors:

Figure 230 – A “ship to” country selector.

These are links used to personalize the shopping experience based on user geolocation, shopping currency, or site language. For example, someone vacationing in France might visit a Canadian website to send a gourmet gift basket to their loved ones in Canada. You identify the visitor’s IP as French and decide to change (automatically or via a pop-up) the ship-to country to France, the item currency to EUR, and the language to French. However, you need to give users the ability to change these settings.

Figure 231 – In this example, the language is set to French, and the currency is set to Euro.

One of the common SEO mistakes with site personalization links, such as ship-to, language, and currency selectors, is that they create crawlable URLs, even if they are temporary 302 redirects. This happens because, in many cases, the user selection is kept in the URL, as in the image below:

Figure 232 – These crawlable URLs will create duplicate URLs for each page with currency selectors, which is undesirable.

The solution is straightforward: do not create crawlable URLs for such selectors; keep the user choices in cookies rather than URLs. You can also use AJAX to load the user choices and set up cookies.
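A sketch of the cookie approach, with a hypothetical cookie name: the user’s choices are serialized into a cookie rather than into the URL, so no new crawlable URL is created.

```javascript
// Build a cookie string that stores the shopper's preferences.
// The cookie name and attributes are hypothetical assumptions.
function buildPrefsCookie(prefs, maxAgeSeconds) {
  const value = encodeURIComponent(JSON.stringify(prefs));
  return `shop_prefs=${value}; Max-Age=${maxAgeSeconds}; Path=/; SameSite=Lax`;
}

// In the browser, assign it when the user confirms their selection,
// then update prices with AJAX instead of navigating to a new URL:
// document.cookie = buildPrefsCookie({ ship: "FR", currency: "EUR" }, 2592000);
```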

Figure 233 – When the “ship to” icon at the top right is clicked, a modal window opens for users to make their selection. When the CONTINUE button is clicked, the choices are updated with AJAX and kept in cookies.

Store Locator

Depending on how important web-to-store behavior is for each company, these links are present in more or less prominent places in the page layout. The link is sometimes in the masthead, at the bottom of the pages, or in the footer.

Figure 234 – Walmart greatly emphasizes store locations because web-to-store visits are essential to its business.

Figure 235 – The store locations don’t seem too important to Nordstrom.

The Store Locations link being positioned in the footer implies, to a certain degree, that web-to-store traffic is not a priority for Nordstrom. However, the mobile version of the website displays the store location icon in the primary navigation:

Figure 236 – The mobile version of Nordstrom’s website.

This mismatch between mobile and desktop designs makes me wonder if the desktop design was well planned.

The store locator link should not be nofollowed or otherwise blocked. The link should land users on a page that lists all the store locations or allows easy and quick location searches. Additionally, you should have a dedicated landing page for each store location and a separate XML Sitemap for store location pages.

Cart links

These are the shopping cart and checkout links.

Figure 237 – The shopping cart icon is usually located at the top right of the page. Amazon has trained shoppers to expect it there.

Perpetual mini-shopping carts[20] are often implemented with AJAX and are not crawlable. That is fine because you do not need the shopping cart or checkout pages to be crawled or indexed. On a side note, such carts are called perpetual because they display the number of items in the cart while users navigate other parts of the website. Persistent carts are the carts carried across multiple sessions when users place something in the cart, leave the website, and then return later, and the items are still in the cart.

Some prefer to nofollow these URLs to preserve the crawl budget. This will not hurt, but it is not necessary for SEO. Checkout pages can be blocked in robots.txt or noindexed with the meta robots tag at the page level.
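Either mechanism is a short snippet. For example, a robots.txt rule with hypothetical paths:

```
# robots.txt – keep bots out of cart and checkout (paths are hypothetical)
User-agent: *
Disallow: /cart
Disallow: /checkout
```

Alternatively, leave the pages crawlable and add <meta name="robots" content="noindex"> to the <head> of each checkout template.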

Help links

These are the links to Contact Us, FAQ, Live Chat, Help Center, and similar pages. If your Live Chat link is not implemented with JavaScript, block it with robots.txt. Links such as Help, FAQ, and Contact Us should be accessible to robots as plain HTML links.

You can consolidate many help links under a single menu to list more links in a limited space:

Figure 238 – The drop-down list under the Need Help? label contains important help links for users.

On many websites, help links also appear in the footer, which means the links will be duplicated.

User account links

These links allow users to create or log into their accounts, register, track their order status, etc.

Figure 239 – Best Buy disallows the entire secure www-SSL subdomain. This subdomain hosts the user accounts, and it has been blocked. That is a good approach.

Usually, account links lead to secure HTTPS sections, and there is no need for search engines to crawl or index account pages.

You will notice that many ecommerce websites nofollow the user account links:

Figure 240 – User account links are nofollow.

The links highlighted with a red border in the image above (and below) mean they are nofollow. Both Target and OfficeMax nofollow user account pages, but this is not necessary nowadays.

Figure 241 – The red dotted border means that those links are nofollow.

Google recommends leaving the nofollow off not only for account links but also for all internal links.[21] This, they say, allows PageRank to flow freely throughout your website.

Remember that controlling the crawling of a website, no matter how it is done (e.g., with nofollow, robots.txt, or uncrawlable JavaScript), is tricky.

Here’s why:

  • If you control crawling with robots.txt, PageRank flows into robotted URLs[22],[23] but does not flow out of those URLs.
  • If you control crawling with nofollow, PageRank does not flow into the nofollowed URL,[24] but it flows out of the linked-to URL.
  • If you control crawling with uncrawlable links, no PageRank flows in, but it flows out (if the page was discovered by search engines through other means).

Will your account pages help searchers if search engines index them? Do searchers use terms like “Account Login at {your_company_name}”? If not, is there any other reason to have these pages indexed? If there is no other reason, consider blocking bot access to such pages.

If you want pages completely out of search engines’ indices, add a noindex meta tag at the page level and let bots crawl those pages so they can see the tag. Only after the pages have dropped out of the index should you block them with robots.txt; blocking first prevents bots from ever seeing the noindex tag.

If you are concerned about link “juice” flow, robots.txt might be a better crawl optimization alternative to nofollow. PageRank does not flow out of robots.txt blocked pages because Google cannot crawl those pages to determine how PageRank should flow through each link. However, if pages were indexed before being blocked by robots.txt, Google knows where and how to pass PageRank. Theoretically, leaving those pages open for crawling occasionally should allow Google to crawl them temporarily and flow PageRank.

However, if you are not worried about PageRank flow, there are a few alternatives to robotted and nofollow account links:

  1. Consolidate links into groups:

Figure 242 – The Your Account section is a drop-down menu consolidating several links.

Figure 243 – These account pages are accessible with JavaScript turned off.

  2. Deactivate some account links until the user signs in:

Figure 244 – When clicking sign in, the drop-down menu displays inactive account links.

In the example above, the links to Your Profile, Check Order Status, Points Balance, Your Coupons and Your Lists are not active HTML links until you sign in.

Additionally, the sign in, join for free, and My Location links are implemented with JavaScript and are not accessible to search engines:

Figure 245 – Search engines can’t find the sign in, join for free, and My Location links.

  3. Use a modal window to log in users:

Figure 246 – Clicking on the sign in link at the top right of the page opens a JavaScript modal.

This modal window is loaded on demand, and its content is inaccessible to search engines at page load.


Footer

Footers remain one of the most abused site-wide sections of websites. This is probably because footer links still work, at least for some websites, despite Google saying that this strategy is unacceptable and does not work.

Figure 247 – Look no further than Amazon to see the risky use of keyword-rich footer links.

The footer is often the last section that site users check if they cannot find the information they need anywhere else on the page. Frequently, users will not check the footer at all, so it is important not to bury links there.

Footers are usually implemented as boilerplate text and send less link authority to the linked-to pages. Yahoo! even stated that they might devalue footer links:

“The irrelevant links at the bottom of a page, which will not be as valuable for a user, do not add to the quality of the user experience, so we do not account for those in our ranking.”[25]

Google sends contradictory signals to webmasters by stating that site-wide links are outside its content quality guidelines, but it does not take action against those who abuse such links.

Since footers are at the bottom of web pages, the CTR on footer links is pretty low. However, footers can be great for user experience, especially fat footers,[26] and may also be a useful internal linking tactic.

Let’s discuss some optimization ideas:

Do not repeat the primary navigation in the footer

If some links are listed in the primary navigation menu, you do not need to repeat them in the footer. The most important links for users should be somewhere in the masthead. Keep the footer for relevant but less important links.

Group links logically

If you want to reduce the number of links in the footer, consider pointing multiple URLs to a single page. If space is a concern, you can implement JavaScript drop-downs like in the image below:

Figure 248 – Instead of creating unique URLs for each topic in the Your Orders section, create just one main URL for Your Orders and include the content of all sections on that page (e.g., Order Status, Shipping & Handling, etc.)

Take a look at how Staples and YouTube implemented this. Both are very good examples of useful footers:

Figure 249 – The Corporate Info “pop-up” menu links to relevant pages but takes up less visual space.

Figure 250 – Staples leaves the links open for crawling.

Also, the links in the pop-up menu are open for crawling, as you can see in the cached version of the page.
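Following the Staples pattern, here is a minimal sketch of a crawlable pop-up menu: the links are rendered into the initial HTML so crawlers can follow them, and JavaScript only toggles their visibility. All class names and the wiring shown in the comment are hypothetical.

```javascript
// Sketch: the pop-up menu's links ship in the initial HTML (crawlable);
// clicking the trigger only toggles CSS visibility, it does not inject links.
// All names are hypothetical.
function footerMenuHtml(title, links) {
  const items = links
    .map(l => `<li><a href="${l.href}">${l.label}</a></li>`)
    .join("");
  return `<nav class="footer-menu"><span class="trigger">${title}</span><ul hidden>${items}</ul></nav>`;
}

// Browser wiring (sketch): reveal/hide without changing the DOM structure.
// trigger.addEventListener("click", () => { list.hidden = !list.hidden; });
```

Because the anchor elements exist at page load, the cached source contains them regardless of whether the menu was ever opened.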

Figure 251 – Country selector on YouTube.

On YouTube, the country listing is available when the Country selector is clicked, but it’s implemented client-side. Imagine if this list was available in plain HTML on every single page of the website.

Figure 252 – The country links are not available to crawlers.

Walmart uses [+]expand and [-]collapse links to give users more options, but search engines do not have access to subcategory links:

Figure 253 – A click on the small [+] signs reveals links to several subcategories.

While the top category links, such as Electronics, Bikes, and Toys, are crawlable, their corresponding subcategories, like Laptops, Apple iPads, Tablets, and TVs, are not:

Figure 254 – This approach combines user experience and SEO.

Remove useless links

Track clicks on footer links with URL parameters or click tracking tools, such as CrazyEgg. Are you helping users with those links, or do you want an optimized footer for search engines? If people do not click on some of your “optimized” links, do not link them from the footer.

On the other hand, if a link in the footer gets a high number of clicks, consider placing it in a more prominent location on the page.

Test having versus removing links in the footer and measure how each affects conversion rates, SEO, or usability.

Do not abuse footers

We know that search engines do not like external site-wide footer backlinks.[27] We also know that search engines treat links in boilerplate sections differently from contextual links; boilerplate links pass less link authority.

Footer links are not inherently bad; they do not cause SEO problems unless the footers display spammy internal links.

Footers seem to be one of the preferred sections for over-optimization. Look at the following screenshot, for example:

Figure 255 – The extensive use of “women’s” and “men’s” in these anchors is unnecessary.

If you stuff exact anchor text keywords in the footer just because doing so still works, remember the Penguin penalty that may come with this tactic. That is right; Penguin is not only about external backlinks but also about internal links. For more information, see pointer number one in this video,[28] this comment[29], and this video.[30]

Figure 256 – Site-wide links with exact anchor text, like in this example, will create problems. The larger the website, the higher the chances of being filtered by Panda or Penguin.

Another trick you must avoid is placing “SEO content” far below the footer, well below the “normal” viewing area. Even if Google does not penalize this algorithmically, the page certainly will not pass a manual review.

Figure 257 – The “SEO content” starts after the real footer ends, a foolish attempt to trick users and search engines. I wonder whether that is why this website had a gray toolbar PageRank in the past.

Tabbed navigation

If you need to provide users with more links in the footer, tabbed navigation is another option, as it will allow you to display more links. Take a look at this example:

Figure 258 – Tabbed navigation will allow you to display more links.

You can optimize tabbed navigation further by adding more text in the footer. Take a look at how 1-800 Contacts does this:

Figure 259 – A great usage of tabbed navigation in the footer.

Each link on the left side triggers a new tab. For example, when you click on Our Commitment, a new tab becomes active, with different content in it:

Figure 260 – This is good content for a footer. Adding a few internal contextual text links would add more SEO value to this section.

Dynamic footers

Ecommerce websites can make footers more appealing to search engines by dynamically updating the content and the links in footers to be relevant to each section of the website and even to each page. This approach works best when implemented with a tabbed navigation that allows at least 150 words of content to be displayed in the footer.

Consider the same example from 1-800 Contacts. They could add a new tab featuring a relevant excerpt from a recent blog post in the footer. For example, on the Avaira brand page, the new tab name could be something like Avaira News, and the blog post excerpt would contain the brand name and possibly one or two links to Avaira products. The footer on the Biomedics brand page would be related to Biomedics products, and so on.

Just as related-product linking is helpful for users, adding page-specific links in the footer is also good for usability, as it customizes the shopping experience to meet users’ expectations.

You can even customize the links in the tabbed navigation. For example, on a product detail page for Avaira lenses, you can dynamically change the link from “Contact Lens Brands” to “Avaira Contact Lenses”, listing only products that Avaira manufactured.

Dynamic footers make sense if you consider user intent. These footers will help users by presenting content relevant to the page they are on and helping vary the internal anchor text. They will also reduce the occurrence of exact-match anchor text and prevent the footer from generating site-wide links. Currently, many e-commerce websites do not use this concept.

Compared to other areas of a page, footers have not evolved into something more helpful for users over the past 20+ years. Maybe you can start changing that and use it to your organic advantage simultaneously.

Debugging and support

Footers can also be used as a debugging, development, or customer-support feature.

Tagging footers with unique text containing the year and month the page was last generated or updated (e.g., “page generated September 2017”) will allow some basic crawl debugging a month or two later with the site: operator, for example, by using site:mysite “page generated September 2017”.[31]
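A minimal sketch of such a stamp, assuming a server- or build-time rendering step; the wording mirrors the example above:

```javascript
// Sketch: stamp each generated page with the month and year it was built,
// so crawl freshness can later be checked with the site: operator, e.g.,
//   site:mysite "page generated September 2017"
function generationStamp(date) {
  const months = ["January", "February", "March", "April", "May", "June",
                  "July", "August", "September", "October", "November", "December"];
  return `page generated ${months[date.getMonth()]} ${date.getFullYear()}`;
}
```

Render the resulting string into the footer template at page-generation time so it appears in the crawled HTML.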

  1. Does User Annoyance Matter?
  2. Mega Menus Work Well for Site Navigation
  3. Findability
  4. The Paradox of Choice
  5. ‘How We Decide’ and the Paralysis of Analysis
  6. Avoid Category Names That Suck
  7. Interface Design > Navigation > Trunk Testing
  8. Nine Techniques for CSS Image Replacement
  9. Auto-Forwarding Carousels and Accordions Annoy Users and Reduce Visibility
  10. Are carousels effective?
  11. WEB1009 – The <img> or <area> tag does not have an ‘alt’ attribute with text
  12. Using markup for organization logos
  13. Thing > Property > logo
  14. Thing > Organization
  15. The H1 debate
  16. Survey and Correlation Data
  17. Whiteboard Friday – The Biggest SEO Mistakes SEOmoz Has Ever Made (check #4, starting around 5:00)
  18. Don’t Make Me Think: A Common Sense Approach to Web Usability, 2nd Edition
  19. The Elements of User Experience: User-Centered Design for the Web and Beyond, 2nd Edition (Voices That Matter)
  20. Persistent Shopping Carts vs. Perpetual Shopping Carts
  21. Should I use rel=”nofollow” on internal links to a login page?
  22. PageRank: will links pointing to pages protected by robots.txt still count?
  23. Will a link to a page disallowed in robots.txt transfer PageRank?
  24. PageRank sculpting
  25. Eric Enge Interviews Yahoo’s Priyank Garg
  26. SEO and Usability
  27. Link schemes
  28. Smarter Internal Linking – Whiteboard Friday
  29. Smarter Internal Linking – Whiteboard Friday
  30. Webmaster Central 2013-12-16 @ 27:52
  31. How to Build an Effective Footer


Category & Product Listing Pages (PLPs)

Length: 21,316 words

Estimated reading time: 2 hours, 25 minutes


Category & Product Listing Pages

Those involved in e-commerce, in one way or another, refer to product detail pages (also known as PDPs) as the “money pages.” This seems to imply that many view PDPs as the most important pages for e-commerce. Because of this mindset, PDPs often get the most attention at the expense of listing pages such as product or category listing pages (CLPs).

However, listing pages are, in fact, the hubs for e-commerce websites and can collect and pass the most equity to lower and upper levels in the website hierarchy. Also, link development for e-commerce usually focuses on category and subcategory pages, so listing pages deserve more attention.

Listing pages display content in a grid or list. When these pages list products, they are referred to as product listing pages or PLPs. When the pages list categories, subcategories, guides, cities, services, etc., they are called category landing pages or category pages; sometimes, they are called intermediary category pages.

Two types of listings

Listing pages usually display one of two types of items:

  • Products – this listing displays items belonging to the currently viewed category.
  • Subcategories – this listing displays subcategories under the currently viewed category or department.

Product listings

Product lists (or grids) display thumbnail images for all the items assigned to a given category or subcategory. This means all the items listed there share a common parent in the hierarchy.

Figure 261 – This screenshot shows a traditional product grid. All the items displayed in the main content area belong to the Guitars category.

The product list approach has the advantage of sending more authority directly to the products in the list, especially to those on the first page of the list. However, these listings can present too many options to users, who may have to sift through hundreds or thousands of products, as depicted in the image below:

Figure 262 – 2,839 clothing items in a single category will require pagination.

In many cases, showing the entire list of products belonging to a top-level category will not make sense to users. They need guidance in choosing a product, and listing thousands of items is too much and too generic.

Let’s talk about several recommendations for optimizing product listings.

Deploy an SEO-friendly Quick View functionality
Use this feature to provide more content and context for users and search engines. This functionality is usually implemented with modal windows to quickly provide product summary information without visiting the actual product detail page:

Figure 263 – A click on the QUICK LOOK button opens the modal window to the right. This approach can improve the shopping experience.

Implement this functionality with SEO-friendly JavaScript whenever possible to make it work to your advantage. For example, you can deliver more crawlable content by loading the static product description in the source code but displaying it in the browser only when Quick Look is clicked. Dynamic information such as product availability, available colors, or pricing can be loaded on-demand with AJAX.
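A sketch of this split follows: the static description ships in the initial HTML (crawlable), while volatile data is fetched only when the user opens Quick Look. The element IDs and the `/api/...` endpoint are hypothetical.

```javascript
// Sketch: crawlable shell rendered into the page source at load time.
// Element IDs and field names are hypothetical.
function renderQuickViewShell(product) {
  return `<div class="quick-view" id="qv-${product.id}" hidden>
    <h3>${product.name}</h3>
    <p>${product.description}</p>
  </div>`;
}

// On-demand part (browser only): price and availability are loaded via
// AJAX when Quick Look is clicked, so they never bloat the crawled HTML.
async function openQuickView(product) {
  const el = document.getElementById(`qv-${product.id}`);
  const live = await fetch(`/api/products/${product.id}/live`).then(r => r.json());
  el.querySelector("h3").insertAdjacentHTML("afterend",
    `<p>${live.price} (${live.inStock ? "In stock" : "Out of stock"})</p>`);
  el.hidden = false;
}
```

The crawlable part stays short, in line with the 50-to-150-word guideline discussed below.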

Like any other method that displays content to users only at certain browser events, it is wise not to abuse the Quick Look implementation. This means that the content should be super-relevant and brief. Fifty to 150 words for the product description is probably more than enough.

Also, the internal linking should not be excessive; two to five links in the short product description are enough.

Additionally, you may want to consider the number of items you load in the default view, which is the view that search engines cache. If you load 20 products, each with 100-word descriptions, that is 2,000 words of content on that page. If you load 50 products, that is 5,000 words, which may be too much.

Create and improve internal algorithms to optimally display items in the list
SEO is about increasing profits from organic traffic by optimizing for users coming from search engines. If a user lands on a category page and the first items in the list/grid do not generate profits, you are missing opportunities.

You need to design and use an algorithm that assigns a product rank to every item and organize the products based on this metric. The algorithm does not have to be complex. It can consider a few metrics, such as percentage margin, sales statistics, stock availability, proximity to user location, and even hand-picked items.
The idea is to put the profit-maximizing items first on the list.
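As a sketch of such an algorithm (the weights and field names are hypothetical, not taken from any particular retailer):

```javascript
// Sketch of a simple product-rank score built from the metrics mentioned
// above. Weights and field names are hypothetical; tune them on real data.
function productRank(p) {
  let score = 0.5 * p.marginPercent         // profitability
            + 0.3 * p.salesLast30Days       // demand
            + 0.2 * (p.inStock ? 100 : 0);  // do not promote out-of-stock items
  if (p.handPicked) score += 50;            // merchandiser override
  return score;
}

// Order the listing so profit-maximizing items come first.
function sortListing(products) {
  return [...products].sort((a, b) => productRank(b) - productRank(a));
}
```

The same score can drive the default sort server-side so that search engines and first-time visitors see the profit-maximizing order.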

Most sites use best-selling or most popular items as the default view in the product list, which is good for usability because most customers will be looking for bestsellers.[1] However, that does not mean you should not try to optimize profits by experimenting with a ranking algorithm that places the products most important to you at the top.

Add category-specific content
Adding content to PLPs can increase their chances of ranking higher in search result pages. This applies to categories at all levels of the hierarchy.

You are probably familiar with the “SEO content” for category descriptions; many ecommerce websites have it nowadays, usually at the bottom of the page. Take a look at the screenshot below:

Figure 264 – The “SEO content” is displayed after the product grid or list to allow items to be displayed above the fold. The SEO influence of this content can be improved by adding links to several internal pages.

Do you wonder if this tactic works for Newegg?

Figure 265 – They rank #2 for “LCD monitors”, above Best Buy.

Of course, other factors helped this page rank second, but that category description will also have some influence. Remember, SEO is about making small, incremental changes.

Some websites prefer to place this type of content above the listing, but this approach does not allow room for much copy, and it will push the listing down the page, as you can see in this screenshot:

Figure 266 – The category “SEO text” at the top of the listing pushes the products down the page.

In the above example, the category description is not long, but the marketing banner pushes the product grid further down.

There is no doubt that more text content can help with SEO. However, adding too much content above the product list can push the products below the fold, confusing users and negatively affecting conversion rates. On the other hand, displaying the content below the product grid is not as helpful and effective as having the content at the top of the page.

There are a few techniques to address this issue, such as collapsing/expanding more content at a click, or using JavaScript carousels. I find SEO-friendly tabbed navigation to be one of the most elegant solutions for fitting a lot of content at the top of the page. This approach is good for users and bots and can be done within a limited space without spamming.

A quick note on content behind tabs and expandable sections: before mobile-first indexing, such content was considered less important and given less authority. However, this changed when mobile-first indexing went live.

Let’s compare “before and after” screenshots of the tabbed navigation. The screenshot below shows how a category page looked on REI. It displayed some content at the top of the list but did not use tabbed navigation. Notice how the content at the top pushes the listing down the page.

Figure 267 – This implementation does not require tabs.

And this is how the tabbed navigation version looks:

Figure 268 – This new design uses tabs to display more above the fold.

The Shop by Category is the default tab, which is great for users because it lists subcategories. The last tab, Expert Advice & Activities, holds a whole lot of SEO value:

Figure 269 – The content above is great for users and search engines.

The content in the previous image is not only well-written copy that focuses on users and conversions rather than SEO, but also great “food” for search engines. This type of content targets visitors at various buying stages and will move them further into the conversion funnel, which is awesome. It will also increase the category’s chances of ranking better in the SERPs.

A quick note: REI could easily add one or two contextual text links to thematically related subcategories or products to give them some SEO equity.

The lesson is that whatever content is placed in the tabbed navigation should be useful for users, not just boilerplate text.

I mentioned it before and will say it again: ecommerce websites must become content publishers to succeed in the long run. This is not an SEO strategy but rather a healthy marketing approach. Your content on each page should match the user intent targeted with that page. If the query you target with a page is generic, i.e., targeting category names, try to satisfy multiple intents on that page. I call this multi-intent content.

In addition to the great content wrapped in this tab, REI added even more content at the bottom of the subcategory grid, outside the tabbed navigation:

Figure 270 – Adding more content at the bottom of the page is intended to increase the relevance of this subcategory page.

One great implementation of SEO content at the bottom of the listing grid is on The Home Depot’s website. They placed buying guides, project guides, and category-related community content there, which is great for users, and search engines will value it. It would be interesting to test the effects on conversions if this type of content were moved up in the layout, just above the product grid.

Creating the kind of content deployed by Home Depot is a win-win tactic because:

  • Users will get helpful content to assist with their needs and questions, which leads to better conversion rates.
  • Search engines will love such content, which leads to more organic traffic.

Figure 271 – A very useful section is listed at the end of the product listing.

Another option for adding more content to category pages is to present a link or a button to more content just above the listing. You can see it exemplified in this screenshot:

Figure 272 – When users click the View Guide button, they are taken to a new page. The guide on this new page is long and good, but it adds no value to the listing page itself.

Instead of opening the guide on a new page, a better SEO option is to open a modal window that contains an excerpt from the guide. Preload the text excerpt in the HTML code to be accessible to search engines. This modal window will contain a link to the HTML guide, so users can click on it if they need to read the entire guide.

Creating content is time- and resource-consuming, so identify the top-performing or best-margin categories to start with, then gradually proceed to others.

Capitalize on user-generated content (UGC)
User-generated content (UGC) is a highly valuable SEO asset. Let’s examine two types of UGC that you can implement on listing pages: product reviews and forum posts.

Product reviews
Adding relevant product reviews will influence conversion rates and search engine rankings:

Figure 273 – This screenshot shows how the product reviews section is displayed at the bottom of the product listing page. Ideally, the reviews in this section match some of the products in the listing.

If the listing is paginated, the reviews should be listed on the index page and should not be repeated on paginated pages. If you have enough reviews to populate pages 2-N of the series, you might be tempted to do so, but this is not a good idea.

In such cases, you may consider increasing the number of reviews you list on the index page. Instead of listing three reviews, increase to five or ten.
When doing so, you must create rules to avoid duplicate content issues between listing and product detail pages. Such rules can be:

  • Do not display more than two reviews for the same product on the same page.
  • Display only five reviews on the same listing page.
  • Do not display on the listing page the same reviews already displayed on the product detail page.
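The rules above can be sketched as a small selection function (the field names are hypothetical):

```javascript
// Sketch enforcing the three deduplication rules above: at most two reviews
// per product, at most five reviews per listing page, and none already
// shown on the product detail page. Field names are hypothetical.
function selectReviewsForListing(reviews, shownOnPdp = new Set()) {
  const perProduct = new Map();
  const selected = [];
  for (const review of reviews) {
    if (selected.length >= 5) break;           // page-level cap
    if (shownOnPdp.has(review.id)) continue;   // avoid PDP duplicates
    const count = perProduct.get(review.productId) || 0;
    if (count >= 2) continue;                  // per-product cap
    perProduct.set(review.productId, count + 1);
    selected.push(review);
  }
  return selected;
}
```

Run this server-side when the listing page is generated so the deduplicated set is what search engines crawl.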

Forum posts
Community content such as forum posts can be handy not only in the forum section of the website (of course, if you have one) but on category pages as well:

Figure 274 – Besides product reviews, relevant forum posts are listed below the category grid.

Optimize for better SERP snippets
Product listing pages can get rich snippets in Google search result pages:

Figure 275 – SERP snippet enriched with list item count. Sometimes, Google displays only the number of items in the listing; other times, it also displays a few item names.

While many ecommerce websites are interested in knowing how to get these rich snippets, Google’s official recommendations do not go into much detail:[2]

“If a search result consists mostly of a structured list, like a table or series of bullets, we will show a list of three relevant rows or items underneath the result in a bulleted format. The snippet will also show an approximate count of the total number of rows or items on the page” (for example, “40+ items”, as in the screenshot above).

Figure 276 – Clean HTML code can help get the item count in the rich snippet.

Google can use your HTML code to generate rich snippets, meaning it does not necessarily need semantic markup. This is why it is important to keep your code clean and well-structured.

Remember that if your listing pages get rich snippets that include item names, then the description line in the SERP snippet will be shorter than the usual ones. Instead of two or three lines of text, the description snippet may be truncated to just one line of text. You may want to check the impact on SERP CTR in such cases.

Here are some tips on how to get rich snippets for category listings:

Validate the HTML code for your list
If you open a list item element but do not close it or nest elements improperly, Google will have more difficulty understanding the page structure.

Figure 277 – Each product in the grid is wrapped in a properly closed list item element. Notice the DIV and UL class names.

Do not break the HTML tables
The rich snippet will display the number of items on the index page (e.g., “40+ items”) if the product grid contains 40+ items in a single table, but only if the table markup has no breaks. If something between items 10 and 11 breaks the table, Google will display “10+ items” instead. If you list your products in multiple tables, Google will display the count from only one.

Use suggestive HTML class names
It is reported[3] that using the class name “item” in the item’s DIV helps with getting rich snippets for category pages:

“Just to confirm, wrapped a few items in <div class=items> and the snippet has been updated. Took four days to appear in the SERPs”.

This advice seems to be working, at least to some extent, as you can see in the following screenshot:

Figure 278 – Notice the LI class name.

The DIV that wraps the product grid contains the word “products”, and this seems to be common among websites that get rich snippets without using semantic markup. Also, the list item class name contains the word “item”.

Figure 279 – The rich snippet for the Running Shoes category includes the number of items per page and the total number of items in this category.

A large total of items in the list may attract more clicks because a large selection is one of the things consumers look at when choosing where to click. This brings us to another optimization idea.

Reconsider the number of items in the listing
If the number of products within the currently viewed category is reasonably low and easily skimmable (e.g., 50 items in a grid of five rows by ten columns), then load them all on one page. Depending on how many other links you have on the page and your overall domain authority, you can sometimes pump up this number to 100 or even more.

If you think it is necessary to display a low number of items from a user experience perspective, you can load 50, 100, or 150 items in the source code in an SEO-friendly way and use AJAX to display only 10, 15, or 20 items in the browser to avoid information overload. You can then use AJAX to update the page content based on user requests, such as scroll down, sort, display all, and so on.
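A minimal sketch of this load-all, show-some idea: the full item set ships in the crawlable HTML, and only a slice is rendered at load (the function name is hypothetical):

```javascript
// Sketch: partition the items that ship in the crawlable HTML into the
// slice shown at load and the slice revealed later (on scroll, "show all",
// etc.). The browser reveals `hidden` items without a new page load.
function visibleSlice(items, visible) {
  return {
    shown: items.slice(0, visible),   // rendered immediately
    hidden: items.slice(visible),     // present in the HTML, revealed on demand
  };
}
```

Since all items are in the source at load time, search engines can crawl the full list even though users initially see only a manageable subset.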

If you have thousands of products in the same category, consider breaking them into more manageable subcategories. After segmenting into smaller chunks, you can list the subcategories instead of products.

Tag product reviews with structured markup
This is a debatable tactic, so you must consider how you implement it. Search engines do not support product review markup on product listing pages and may consider such markup spam; be careful.

Figure 280 – The Review markup can only be safely used on PDPs. PLPs should not contain this markup.

I recommend not using the AggregateOffer entity in your markup on PLPs; doing so will raise spam concerns. If you want to provide some structured data, the safest entity to use on PLPs is the Offer.

To learn more about product markup, read this article.[4]

Display category-related searches below the search field
“Related searches” sections have traditionally been used to link internally to other pages and to flatten the website architecture. Here’s a classic example:

Figure 281 – The “Related Searches” section helps link internally to other pages.

Related searches are displayed to help users with discoverability and findability by providing highly related links to other pages on a website. So, why not place them closer to where users will search, such as the search field? You can see this in action on Zappos’s website:

Figure 282 – The Search by links are prominently placed to push authority to the linked pages and help users.

However, on Zappos, the links are the same on every page, and they do not make sense in the Bags section of the website:

Figure 283 – Zappos displays search options as plain HTML links below the search field.

Size, Narrow Shoes, and Wide Shoes are not useful refinements for someone looking for bags. Instead, you can dynamically change these links to something related to bags, maybe by linking to a page that filters bags by Style or another attribute that suits the Bags category.

If you do not want to use too much space listing ten or more related popular searches, you can implement a modal window that opens when the user clicks “Popular Searches”. Make sure its content is available to search engines at page load. You can then list as many popular related searches for each category as you like.

Figure 284 – The image above depicts a possible implementation of popular searches using a modal window.

As mentioned in the Home Pages section, you can use one of the following sources to identify searches helpful to users:

  • Find the top searches performed on each category page.
  • Identify the products or subcategories most visited after viewing the category page.
  • Get the top referring keywords. Remember, Google and other commercial search engines hide search queries behind the “not provided” label.

Defer loading the product thumbnail images
When you load tens of items in a listing, many of them will likely be below the fold.[5] Loading all thumbnail images at once is neither necessary nor recommended—lazy load only when the user scrolls down to view more products.
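A sketch using the browser’s native `loading` attribute (the eager cutoff of 8 is an assumption; set it to roughly one viewport’s worth of items, and fall back to an IntersectionObserver for older browsers):

```javascript
// Sketch: native lazy loading for listing thumbnails. The first row or two
// load eagerly; the rest wait until the user scrolls near them. The cutoff
// of 8 is hypothetical.
function thumbnailHtml(products, eagerCount = 8) {
  return products.map((p, i) =>
    `<img src="${p.thumb}" alt="${p.name}" loading="${i < eagerCount ? "eager" : "lazy"}">`
  ).join("\n");
}
```

Because the `src` and `alt` attributes are in the initial HTML, the images remain discoverable even though they are deferred visually.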

While image deferring has little to do with rankings, it can help improve user experience by decreasing page load time.

Figure 285 – Notice how small the scroll slider is (highlighted in a red rectangle); this size conveys that the page is long. The products in the screenshot are several thousand pixels “below the fold”.

A word of caution about the meaning of fold: the “fold” has a very clear meaning in print (i.e., the physical fold of the newspaper right in the middle), but with websites, the meaning of fold is blurry. You will need to define and identify where the fold is for your website, considering the browser resolution and the devices used by most of your users.
The fold will be different on mobile than on desktop.

Remove or consolidate unnecessary links
Product lists often have redundant links. For example, this screenshot has an image link on the product thumbnail image and another link on the product name. Both links point to the same URL.

Figure 286 – The links on the product thumbnail image and the product name point to the same URL.

There are several ways to address this issue, and we have discussed this before. Please refer to the Link Position section in the Internal Linking section.

Another issue, very similar to the thumbnail/product-name redundancy, occurs when you place one link on the review stars and another on the text displaying the number of reviews for the same product:

Figure 287 – The image link on the stars and the text link on the number “6” point to the same URL.

In the example above, neither link provides strong relevance clues to search engines due to the lack of anchor text, so you can keep just one. I would keep the link on the star images because you can add more SEO context using the alt text, and the clickable area of the star images is larger than that of the text numbers. The link on the number of reviews could then be JavaScript-ed.

Removing unnecessary links or other page elements can de-clutter the design, introduce white space between products, and reduce the number of links that leak authority to the wrong pages.

Figure 288 – It is unnecessary to display the Special Offers link for each product. Instead, use tooltips or display small icons or stickers to highlight such offers.

Another element frequently listed in product lists is the “add to cart” button. I am not saying you should remove it without proper analysis, but you can always A/B test to see how it influences the conversion rate.

I suggest tracking “add to cart” events and analyzing whether users add to the cart directly from product listings. If they do, go a step further and identify what type of users do that (e.g., returning customers, first-timers, etc.) In many cases, those who add directly from product listings are return customers who are very familiar with your brand, your products, and your website; usually, they know exactly what they want from you. If you remove the “add to cart” buttons, these users will know they can add products to their cart from product detail pages.

The usefulness of the “add to cart” buttons on product listings must be tested, e.g., by replacing them with other CTAs, adding more product details, or removing the buttons altogether.

Make the list view the default view for bots (and searchers, if it makes sense)
Usually, the list view allows more room for product-related content, which is useful for users and search engines.

Figure 289 – This is the grid view. There isn’t much room to feature product info in a grid (name and price only).

Figure 290 – There is room for more product info to be displayed in a list view.

In the example above, the list view is the default view for users and search engines, but users can switch to the grid view in the interface.
At the beginning of this section, I mentioned two types of listings.

We have discussed product listings. Now, let’s talk about the second type:

Category listings

To list categories means that instead of displaying products, you show the available subcategories for a category, e.g., using thumbnail images. Category listings are implemented at the first two or three levels of a site’s category hierarchy, depending on the product catalog size. Because the number of subcategories to display is low, category listing pages usually use a grid view rather than a list view.

Let’s look at how The Home Depot implemented the subcategories grid in a user- and search-engine-friendly manner:

Figure 291 – This is the first level of the hierarchy.

The first level in the hierarchy (Appliances) lists several subcategory thumbnails (Refrigerators, Ranges, Washers, etc.), as well as sub-subcategory links (e.g., under Refrigerators, they display links to French Door Refrigerators, Top Freezer Refrigerators, Side By Side Refrigerators, etc.)

When you click on Refrigerators, a category listing loads. This time, the listing displays some of the most important sub-subcategories for the Refrigerators subcategory.

Figure 292 – This is the third level of the hierarchy.

The third level in the hierarchy (Appliances > Refrigeration > Refrigerators) still lists categories instead of products. This encourages users to take a more deliberate selection path before the page displays tens or hundreds of products.

Implementing subcategory listings in the first two levels of the ecommerce hierarchy has the advantage of sending more PageRank to subcategory pages. That is better than sending more PageRank to just a few products because your link development efforts should point to category and subcategory pages. Targeting product pages with link building is not economically feasible unless you have a large budget or only a few products in the catalog. Developing external backlinks builds equity for category and subcategory pages, which further flows to PDPs.

Implementing the first two levels of the e-commerce hierarchy as subcategory listings also improves the user experience. Usability tests have shown[6] that users can be encouraged to navigate deeper into the hierarchy and make better scope selections.

The choice between product and subcategory listing depends on each website’s particularities. Subcategory listings are usually a better choice, especially for websites with large inventories. Deciding which subcategories to feature at which level of the hierarchy should be based on business rules (e.g., the top five subcategories with the highest margins or the top five bestsellers).

Here are several of my recommendations for building better subcategory listing pages:

  • To send SEO authority directly to products, add a list of featured/top items at the bottom of the listing, as shown in this example:

Figure 293 – Remember not to list too many products; five to 10 items should be enough.

  • Keep the left sidebar navigation available to users because that is the spot we have been trained to look to for secondary navigation; this navigation pattern influences conversions[7]. Also, it is easier to scan and choose from secondary navigation links.
  • The secondary navigation will not contain filters until a user reaches the point where you list products instead of subcategories.
  • Display professional subcategory thumbnails, as exemplified here.

Figure 294 – High-quality imagery reassures users that they are dealing with a serious business.

  • Add a brief description of the category whenever possible, and, where appropriate, link to buying guides or interactive product-finder tools that may help users decide which product is right for them. This is especially important if your target market is unfamiliar with the items you sell or if you sell high-ticket items.

Figure 295 – A brief description of each category may help first-time buyers understand your terminology and can provide more context for search engines.

Figure 296 – Providing guides and educational content helps increase conversions.

In the example above, the original design did not include buttons like “Find the right fridge”, “Find the right washer”, or “Find the right dryer”. However, those links can be of great value to users and help with SEO. If searchers click such buttons after landing from a generic search query (e.g., “best appliances”), the clicks will help lower bounce rates and lead to longer dwell times.

  • Use an SEO-friendly Quick View functionality to add more details about each category.

Just as this functionality works on product listings, a similar approach can be implemented for category listings.

Figure 297 – In this screenshot, I added the More Info button to the original design for illustration purposes.

Clicking on More Info will open a modal window. In this window, you can include details such as a brief explanation of the category, what users can expect to find under this category, links to more subsequent categories in the hierarchy, and even FAQs.


Breadcrumbs

Breadcrumbs are navigational elements, usually displayed between the header and the main content:

Figure 298 – Breadcrumbs provide a sense of “location” for users.

For example, a breadcrumb on a website selling home improvement products might read Home > Appliances > Cooking.

In a breadcrumb structure, Home, Appliances, and Cooking are called elements, and the > sign is called a separator.

Breadcrumbs are frequently neglected as an SEO factor, but here are a few good reasons for you to pay more attention to them:

  • Breadcrumb links are very important navigational elements that communicate the location of a page in the website hierarchy to users and help them easily navigate the website.[8],[9]
  • Breadcrumbs are one of the best ways to create silos by allowing search engine bots to crawl vertically upwards in the taxonomy.
  • Breadcrumb navigation makes it easier for search engines to analyze and understand your site architecture.
  • Breadcrumbs are one of the safest places to use exact anchor text keywords.

Despite their great usability and SEO benefits, many e-commerce websites fail to implement breadcrumbs correctly for users and search engines.

Figure 299 – Can you guess what the above page is about?

Take a quick look at this screenshot. Which page do you think this is? Is it the Edition page? Maybe the Gifts page? Or the Designer Sale? Or is it the Shop by Designer page? None of these. It is the Shoes & Handbags category listing page. Did you find the label yet? It is the drop-down in the left navigation.

Using a breadcrumb on this website would help users understand their position in the hierarchy.

If usability alone has not yet convinced you to pay more attention to breadcrumbs, then maybe it is time to remind you that properly implemented breadcrumbs show directly in Bing[10] and Google search results.[11]

Figure 300 – Breadcrumb-rich SERP snippets.

This screenshot shows that the BestBuy, NewEgg, and Dell websites have a breadcrumb structure in the results snippet, but Walmart does not. Perhaps Walmart’s breadcrumb HTML is not properly marked up.

On the subject of featuring breadcrumb-rich snippets in SERPs, a Google patent[12] discusses the taxonomy of the website, internal linking, primary and secondary navigation, and structured URLs, among other things they consider when deciding to display breadcrumbs in SERPs. To increase the chances of breadcrumbs showing up in search engine result pages, implement them consistently across the website and follow Google’s official guidelines by using the Breadcrumbs structured markup with microdata or RDFa.[13]
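In addition to microdata and RDFa, Google’s structured-data documentation also accepts JSON-LD for the BreadcrumbList type. Below is a minimal sketch that generates such markup; the trail names and example.com URLs are hypothetical, chosen to match the Home > Appliances > Cooking example used earlier:

```python
import json

def breadcrumb_jsonld(trail):
    """Build schema.org BreadcrumbList JSON-LD from (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }

# Hypothetical trail matching the Home > Appliances > Cooking example.
trail = [
    ("Home", "https://www.example.com/"),
    ("Appliances", "https://www.example.com/appliances/"),
    ("Cooking", "https://www.example.com/appliances/cooking/"),
]
markup = breadcrumb_jsonld(trail)
print(json.dumps(markup, indent=2))
```

The resulting object is embedded in the page inside a `<script type="application/ld+json">` tag; the canonical URL of each trail element should match the URLs used in the visible breadcrumb.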

In the past, breadcrumb-rich search result listings allowed users to click both the blue SERP title and the individual breadcrumb elements in the listing. However, Google later stopped hyperlinking the breadcrumb elements. I believe people clicked the intermediary category links and landed on pages that did not match their intent, so Google retired the feature.

If a product belongs to multiple categories, it is OK to list multiple breadcrumbs on the same page[14] as long as the product is not categorized in too many categories. However, the first breadcrumb on the page has to be the canonical path to that product because Google picks the first breadcrumb it finds on the page.

Depth-triggered breadcrumbs

Some websites implement breadcrumbs only at a certain depth in the hierarchy, but that is not optimal for users and search engines.

Figure 301 – There are no breadcrumbs when users are on the top of the Furniture category page.

Figure 302 – Breadcrumbs are displayed when users navigate to a Furniture subcategory (i.e., Bedroom Furniture). All subcategories under Bedroom Furniture will have breadcrumbs.

Depth-triggered breadcrumbs may work fine for users who start navigating from the home page, but nowadays, every page on your website could serve as an entry point for users and search engines. Therefore, it is important to feature breadcrumbs from the first level of the website taxonomy. Additionally, featuring breadcrumbs only on some pages and not on others may confuse users.

Figure 303 – The category name is often displayed in the breadcrumbs.

Repeating the category name in the breadcrumb and the heading is OK. However, the last element in the breadcrumb should not be linked; linking it creates a self-referencing link to the active page, which confuses users.

There are three types of breadcrumbs, depending on how they are implemented:[15] path-based, location-based, and attribute-based.

Path-based breadcrumbs

This type of breadcrumb shows the path users have taken within the website to get to the current page. It is the “this is how you got here” clue for users. The breadcrumbs will dynamically update to reflect the user’s historical navigation path. Page view history is achieved with either URL tagging or session-based cookies.

Implementing this type of breadcrumb anywhere except internal site search result pages is not a good idea. Users landing from search engines can reach deep sections inside a website without ever needing to navigate through it, making a path-based breadcrumb meaningless for users. The same applies to search engine bots, which can reach deep pages on your website from external referral sources.

Location-based breadcrumbs

This is the most popular type of breadcrumb, and it indicates the position of the current page within the website hierarchy. It is the “you are here” clue for users. This type of breadcrumb keeps users on a fixed navigation path based on the website’s taxonomy, no matter which previous pages they visited during navigation. This is the type of breadcrumb recommended by taxonomists[16] and usability experts.[17]

On top-level category pages, the breadcrumb will be just one link to the home page, while the category name will be plain text (not a hyperlink).

Figure 304 – The category name is not hyperlinked because this is the active page. This is the correct behavior.

The first element in the breadcrumb should always be a link to your homepage, but it does not necessarily have to use the anchor text “home page” or “home”. You can use the company name instead or a house icon with the company name in the alt text.

The subsequent breadcrumb levels are the category and subcategory names used in your taxonomy. Also, please don’t link a page to itself, as it will confuse users.

Figure 305 – There are instances when a keyword in the anchor text link pointing to the home page may make sense (for example, when your business name is The Furniture Store). But even then, use it with caution.

Attribute-based breadcrumbs

As the name suggests, attribute-based breadcrumbs use product attributes or filter values (such as style, color, or brand) to create navigation that is presented in a breadcrumb-like fashion. This type of breadcrumb is the “this is how you filtered our items” clue for users:

Figure 306 – This page presents the breadcrumbs as filters.

If you click on Bed & Bath and then on Comforter Sets, you will see the breadcrumbs listed at the top. In the example above, the breadcrumb elements are not links (the “X” signs are links).

Technically speaking, these are not breadcrumbs but rather filter values. However, this implementation mimics the traditional breadcrumbs usage, and users will expect the filters to be clickable, just as they expect the breadcrumbs to be displayed horizontally and not vertically.

I usually do not recommend replacing categories with filters. Choosing between a category and a filter can be based on search volumes. For example, a separate category for red shoes in size 11 does not make sense; size and color selections are product attributes that translate into left-navigation filters.


Breadcrumb separators

Each element in the breadcrumb trail must be clearly separated. The most common separator between breadcrumb elements is the “greater than” sign (>). Other good options include right-pointing double-angle quotation marks (»), slashes (/), or arrows (→). Remember to mark up the separators with the correct HTML entities.[18]
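For reference, the separator characters mentioned above map to HTML entities as follows; the mapping can be verified with Python’s standard library:

```python
import html

# Common breadcrumb separators and their HTML entities.
separators = {
    ">": "&gt;",     # greater-than sign
    "»": "&raquo;",  # right-pointing double-angle quotation mark
    "→": "&rarr;",   # rightwards arrow
}

# html.unescape turns each entity back into its character,
# confirming the mapping is correct.
for char, entity in separators.items():
    assert html.unescape(entity) == char
```

Using the entity (e.g., &gt;) instead of the raw character keeps the HTML valid and unambiguous, since a bare > can be mistaken for a closing tag delimiter.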


Pagination

SEO for category pages starts to get complicated when the listings need pagination. Pagination occurs on e-commerce websites because of the large number of items that must be segmented across multiple paginated pages (also known as component pages). It usually occurs on product listing pages and internal site search result pages.

Figure 307 – The pagination in the example above spans 113 pages, which is way too much for users to handle, and it will be tricky to optimize for bots.

If pagination occurs on pages that list other subcategories instead of products, consider restructuring so that subcategories are available to users without pagination. You can achieve that by increasing the number of subcategories you list on a page or by breaking the subcategories into sub-subcategories.

Pagination is one of the oldest issues found on websites with large sets of items, and addressing it has been like aiming at a moving target. For years, the most recommended approach was rel=“prev” and rel=“next”, although Google announced in 2019 that it no longer uses these attributes as an indexing signal.

However, a couple of tactics addressed pagination even before Google introduced these relationships at the end of 2011. Such tactics included noindexing all pages except the first page or using a view-all page.

To make pagination even more intriguing, Google says one option for handling pagination is to “leave as-is” [19], suggesting that they can identify a canonical page and handle pagination well.

However, anything you can do to help search engines better understand your website and crawl it more efficiently is advantageous. The question is not whether you need to deal with pagination but how to do so.

From an SEO perspective, a “simple” functionality such as pagination can cause serious issues with search engines’ ability to crawl and index your site content. Let’s examine several concerns regarding pagination.

Crawling issues
A listing with thousands of items will need pagination since a huge listing like that will not help either users or search engines. However, pagination can screw up a flat website architecture like nothing else.

For instance, in this example, it may take search engines about 15 clicks to reach page 50 of the series. If the only way to reach the products listed on page 50 is by going through this pagination page by page, those products will have a very low chance of being discovered. Probably, those pages will be crawled less frequently, which is not ideal.

Figure 308 – We are on page 7 in the series, and this page lists an additional three pagination URLs (8, 9, and 10) compared to page 1.

In our pagination example, there are missing component pages in the series (pages 2 and 3), which means that bots can jump from page 1 straight to page 4. Because of these gaps in component URLs, bots can reach page 50 in about 15 hops instead of 43 (without the gaps, bots would go from page 1 straight to page 7, since page 7 is linked on the index page, and from there they would need another 43 clicks on Next links to reach page 50).

The odds that Googlebot will “hop” through paginated content to crawl the final pages decrease with each page in the series, and more significantly after page 5.

These findings come from an experiment on pagination, which showed that Google crawls component pages 6 to N far less frequently.[20]

The experiment concluded that:

“The higher the page number is, the less probability that the page will be indexed… On average, the chance that the robot will crawl to the next page of search results decreases by 1.2 to 1.3% per page”.

If you have many component pages, find a way to add links to intermediary pages in the series instead of linking to pages 1, 2, 3, and 4 and then jumping to the last page. In our previous example, I can break the 112-page series into four parts by linking to every 28th page (112 divided by four is 28).

The fewer component pages you have in the pagination, the fewer chunks you will use. For example, if you have 10 component pages, you will list links to all of them. If you have 50, you will divide by 2; 100 will be divided by 4, and 200 will be divided by 8. So now, the pagination may look like:
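One way to implement the divide-by rule above (this is my interpretation; the exact step size is a judgment call) is to pick the interval between linked component pages so that roughly 25 page links stay visible, matching the 50→2, 100→4, 200→8 progression:

```python
import math

def pagination_step(total_pages, target_links=25):
    """Interval between linked component pages, per the divide-by rule:
    50 pages -> step 2, 100 -> step 4, 200 -> step 8."""
    return max(1, math.ceil(total_pages / target_links))

def linked_pages(total_pages, target_links=25):
    """Component-page numbers to link from each page in the series."""
    step = pagination_step(total_pages, target_links)
    pages = list(range(1, total_pages + 1, step))
    if pages[-1] != total_pages:
        pages.append(total_pages)  # always link the last page
    return pages
```

With 10 component pages the step is 1, so every page gets linked; with 112 pages the function links page 1, every 5th page after that, and the last page, keeping every component page within a couple of hops.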

Figure 309 – Make sure the navigation on each page in the series makes sense for users.

Once you change pagination, you can assess the impact after a week or two. Additionally, you can use your server logs to determine Googlebot’s behavior before and after you update pagination.

Duplicate metadata
While the products listed on pages 2 to N are different, very often, each component page has the same page title and meta description across the entire series. Sometimes, even the copy of the SEO content is duplicated across the pagination series, which means the index page will compete with component pages. In many cases, this duplication is due to the default CMS configuration.

Consider the following to avoid or improve upon this:

  • Create custom, unique titles and descriptions for your index pages (page 1 in each series).
  • Write boilerplate titles and descriptions for pages 2-N. For instance, you can use the title of page 1 with some boilerplate appended at the end, “{Title Page One} – Page X/Y”.
  • Do not repeat the SEO copy (if you have any) from the index page on component pages.
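The boilerplate title pattern for pages 2-N can be generated with a one-liner; the “– Page X/Y” suffix format follows the example above:

```python
def component_page_title(index_title, page, total_pages):
    """Boilerplate title for component pages, per the pattern above:
    page 1 keeps its custom title; pages 2-N get a page suffix."""
    if page == 1:
        return index_title
    return f"{index_title} – Page {page}/{total_pages}"

# e.g., component_page_title("Refrigerators", 3, 12)
# yields "Refrigerators – Page 3/12"
```

The same template function can drive meta descriptions, which keeps the boilerplate consistent across the series.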

Adding this uniqueness to your titles, descriptions, and copy for the entire series may not greatly impact the rankings of pages 2 to N. However, doing so helps Google consolidate relevance to the index page. Additionally, the component pages will send internal quality signals and will not compete with the index page.

Another duplicate content issue particular to pagination can occur when you reference the first page in the series (AKA the index page) from component pages:

Figure 310 – URLs pointing to the index page (page 1 in the series) should not include pagination parameters. Instead, these links should point to the category index URL.

Ranking signals dispersion
Sometimes, component pages in the pagination series get linked internally (or from external sites) and may end up in the SERPs. In such cases, ranking signals are dispersed to multiple destination URLs instead of to a single, consolidated page.

Let’s look at how PageRank flows according to the first paper on this subject (published in 1998[21]), which notes that PageRank flows equally through each link and has a decay factor of 10 to 15%. Considering this, component pages are PageRank black holes—especially those not linked from the first page in the series.

Let me explain how PageRank flows on a view-all page and several paginated series. I will distribute PageRank between component pages only and assume that all the pages in the PageRank flow model contain the same internal links.

Figure 311 – In the pagination scenario above, the items listed on page 2 will receive about three times less PageRank than those listed on a view-all page.

Our scenario is for a listing with 100 items on a PageRank 4 page. Due to the decay factor, this listing page will send only 3.4 PageRank points to other pages: 4 x (1-0.15). Each of the 100 items listed on the view-all page will receive 0.034 PageRank points.

We will split the same listing into ten pages in a paginated series, listing ten items per page. We will have links to component pages 1, 2, 3, 4, 5…, and 10.

For the first page in the series, we will have the following metrics:

  • Its PageRank is 4, the same as the view-all page, and the amount that can flow to all other links is 3.4 PageRank points.
  • The total number of links is 16 (10 links for items plus six links for pagination URLs).
  • Each item and pagination URL receives 0.213 PageRank points.

The ten items on the first page of pagination receive about six times more PageRank than items on the view-all page.

The second page has these metrics:

  • Its PageRank is 0.213, and the amount that can flow further to all other links is 0.181.
  • The total number of links is still 16 (10 links for items plus six links for pagination URLs).
  • Each item and pagination URL receives 0.011 PageRank points.

The ten items on the second page receive three times less PageRank than those listed on the view-all page.

Figure 312 – Page 6 in the series is not linked from the index page of the pagination exemplified above.

If a component URL is not present on the first page in the series (e.g., page 6 shows up only when users click on page 2 of the series, as in the screenshot above), then the amount of PageRank that flows to items linked from page 6 is incredibly low.

  • This page’s PageRank is 0.011 (received from page 2), and the amount that can flow further is 0.0096.
  • The total number of links is 16 (10 links for items plus six links for pagination URLs).
  • Each item or pagination URL receives 0.0006 PageRank points.

This means the items on such pages will receive about 56 times less PageRank than those listed on the view-all page.
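The simplified model above can be reproduced in a few lines of code. The 15% decay factor and the 16-links-per-page assumption come directly from the scenario described; the unrounded arithmetic gives a ratio of roughly 57, which the rounded figures in the text express as “about 56 times”:

```python
DECAY = 0.15  # the model above assumes a 15% PageRank decay per hop

def flow_per_link(page_rank, num_links):
    """PageRank each outgoing link receives under the simplified model."""
    return page_rank * (1 - DECAY) / num_links

# View-all page: PR 4 spread over 100 item links.
view_all_item = flow_per_link(4, 100)        # 0.034 per item

# Paginated series: 10 item links + 6 pagination links = 16 links per page.
page1_link = flow_per_link(4, 16)            # ~0.213 per item/pagination URL
page2_link = flow_per_link(page1_link, 16)   # ~0.011
page6_link = flow_per_link(page2_link, 16)   # ~0.0006 (page 6 is linked only from page 2)
```

Changing the number of item links per page (10, 20, 25, and so on) only changes the `num_links` argument, which is how the model behind Figure 313 was built.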

Figure 313 – This screenshot depicts how PageRank changes when you change the number of items on each component page (i.e., listing 10, 20, or 25 items per page). If you want to play with this model, download the sample Excel file from here.

This PageRank flow modeling suggests that:

  • The items listed on the first page in a pagination series receive significantly more PageRank than those listed on component pages.
  • The fewer links you have on the first page, the more important they are and the more PageRank they receive, which is no surprise.
  • If the link to a paginated page is not listed on the series’ index page, that page receives significantly less PageRank.
  • Items listed on pages 2 to N receive less PageRank than those listed on a view-all page. The exception is when they receive a lot of internal or external links.

However, in practice, PageRank flows in more complex ways. For example, PageRank flows back and forth between pages, and more PageRank is passed from contextual links than from pagination links. The amount of PageRank that reaches pagination pages is impossible to compute for anyone except Google.

However, this oversimplified model shows that you can either pass a lot of PageRank to a few items on the first page of pagination (and significantly less to items on component pages) or pass a medium amount of PageRank to all items via a view-all page.

In both cases, if you use pagination, putting your most important products at the top of the list on the first page is essential.

Thin content
Listing pages usually have little to no content, or at least not enough for search engines to consider them worthy of indexing. There is little text except for product names and basic item information. Because of this, Google’s Panda algorithm filtered many listing pages out of the index.

Questionable usefulness
Do your visitors make use of pagination? Look at your analytics data to find out. Do component pages serve as entry points from search engines or other referrals? If not, a view-all’s SEO and user benefits might be much greater than having pagination.

Pagination may still be necessary if the site architecture has already been implemented and it is too difficult to update or if many items cannot be divided and grouped into multiple subcategories.

To minimize pagination issues, I start with the website’s architecture. This allows me to avoid challenging user experience, IT, and SEO issues. I consider the following approaches:

Replace product listings with subcategory listings
For example, on the Men’s Clothing category page in the next screenshot, instead of listing 2,037 products, you can list subcategories such as Athletic Wear, Belts & Suspenders, Casual Shirts, and so on. You will only have product listings deeper in the website hierarchy.

Figure 314 – The Men’s Clothing category page lists products, but it should instead list subcategories, as in the next image:

Figure 315 – A mock-up I created to illustrate replacing a product listing with a subcategory listing.

Listing categories instead of products will also assist users in making better scope selections.[22]

Break into smaller subcategories
If I have a category with hundreds or thousands of items, I look for ways to break it down into smaller categories. Doing so decreases the number of pages in the series or even eliminates pagination entirely.

Figure 316 – The Jeans & Pants subcategory can be split into two subcategories.

Segmenting into multiple subcategories may remove the need for pagination if you list a reasonable number of items. However, do not become overly granular; you want to avoid ending up with too many subcategories.

Increase the number of items in the listing
The idea behind this approach is simple: the more products you display on a listing page, the fewer component pages you have in the series. For example, if you list 50 items using a 5×10 grid and have 295 items to list, you will need six pages in the pagination series. If you increase the number of items per page to 100, you will need only three pages to list them all.
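The arithmetic behind this is a simple ceiling division:

```python
import math

def pages_needed(total_items, items_per_page):
    """Number of component pages required for a listing."""
    return math.ceil(total_items / items_per_page)

# 295 items at 50 per page need 6 component pages; at 100 per page, only 3.
```

Doubling the items per page roughly halves the depth of the pagination series, which is the whole point of this tactic.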

How many items you list on each page depends on the number of other links on the page and your servers’ ability to load content quickly. Also consider the types of items on the list. Generally, I recommend 100 to 150 items as a good start.

Link to more pagination URLs
Instead of skipping pages in the pagination series, link to as many pages in the series as possible.

Figure 317 – This kind of pagination presents big usability issues.

The pagination in the previous screenshot requires search engines and people to click the right arrow seven times to reach the last page. That isn’t good for users and SEO.
Instead, you should link all the pagination URLs, and your pagination will look like this:

Figure 318 – This approach makes it easier for users to access any of the component pages.

Adding links to a manageable number of pagination URLs will ensure that crawlers reach those pages in as few hops as possible.

If you can, interlink all the component pages. For example, if the listing results in fewer than 10 component links, you can list all the links instead of just 1, 2, 3…, or 10. If the listing generates an unmanageable number of component URLs, list as many as possible without creating a bad user experience.

The ideas above will help you minimize pagination’s impact on SEO. However, pagination is still necessary in many cases, and you must handle it.

My approach to pagination is situational, which means it depends on factors such as the current implementation, index saturation (the number of pages indexed by search engines), the average number of products in categories or subcategories, and other factors. There is no one-size-fits-all approach.

Apart from the “do nothing” approach, there are various SEO methods for addressing pagination:

  • The “noindex, follow” method.
  • The view-all method.
  • The pagination attributes, AKA the rel=“prev”, rel=“next” method.
  • The AJAX method.

An incorrect approach to pagination is to use rel=“canonical” to point all component pages in a series to the first page. Google states that:

“Setting the canonical to the first page of a parameter-less sequence is considered improper usage”.[23]

The “noindex, follow” method

This method requires adding the meta robots “noindex, follow” in the <head> of pages 2 to N of the series, while the first page will be indexable. Additionally, pages 2 to N can contain a self-referencing rel=“canonical”.
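As a sketch, the head tags for this method might be generated along these lines; the ?page=N URL pattern is a hypothetical example:

```python
def component_head_tags(url, page):
    """<head> tags for a component page under the "noindex, follow" method:
    pages 2-N get meta robots noindex,follow plus a self-referencing
    canonical; page 1 stays indexable."""
    tags = []
    if page > 1:
        tags.append('<meta name="robots" content="noindex, follow">')
    tags.append(f'<link rel="canonical" href="{url}">')  # self-referencing
    return tags

# Hypothetical URL pattern for illustration:
# component_head_tags("https://www.example.com/shoes?page=3", 3)
```

Note that the canonical on pages 2-N references the component page itself, not the first page of the series.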

Of these methods, I found this to be the least complicated to implement. However, it effectively removes pages 2 to N from search engines’ indices, and it does not transfer indexing signals from component pages to the primary, canonical page.

Figure 319 – Pages 2 to N are noindexed with a meta robots “noindex, follow” tag.

This is the best approach when you need to keep pages out of the index – for example, because a thin-content filter has hit the website. The noindex method is also a good fit for internal site search results pages, since Google and all other top search engines do not like to return “search in search”.[24]

Blocking crawlers’ access to component pages can be done with robots.txt or within your webmaster tools accounts. These two options will not remove pages from indices; they will only prevent further crawling. Moreover, while you can use Google Search Console to prevent component pages from being crawled, it is easier to manage pagination if you block them in just one place, either with robots.txt or with GSC. When auditing crawling and indexation issues, do not forget where you blocked content. I recommend keeping a log file of all the systems that can block URL access.
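If you do block component pages with robots.txt, you can sanity-check the rule with Python’s standard-library parser. The rule below is a hypothetical prefix match against a ?page= parameter (the stdlib parser does not understand the wildcard patterns Google supports, so a plain prefix is used), and, as noted above, it only prevents crawling, not removal from the index:

```python
import urllib.robotparser

# Hypothetical rule blocking /shoes component pages that carry a ?page=
# parameter. Prefix matching only; Google-style wildcards are not supported
# by the stdlib parser.
rules = """
User-agent: *
Disallow: /shoes?page=
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("*", "https://www.example.com/shoes?page=2"))  # False
print(parser.can_fetch("*", "https://www.example.com/shoes"))         # True
```

A quick check like this before deploying a robots.txt change helps avoid accidentally blocking the category index page along with the component pages.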

The “view-all” page

This method seems to be Google’s preferred choice for handling pagination because

“users generally prefer the view-all option in search results”

and because

“[Google] will make more of an effort to properly detect and serve this version to searchers”.[25]

This approach seems to be backed up by testing performed by usability professionals such as Jakob Nielsen, who found that:

“the View-all option [was] helpful to some users. More important, the View-all option did not bother users who did not use it; when it wasn’t offered, however, some users complained”.[26]

The view-all method involves two steps:
1. Create a view-all page that lists all the items in a category, as in this screencap:

Figure 320 – The view-all link lists all the items.

2. Make the view-all page the canonical URL of the paginated series by adding rel=“canonical” pointing to the view-all URL on each component page:

Figure 321 – Every component URL points to the view-all page.

The purpose of rel=“canonical” is to consolidate all link signals into the view-all page. With a view-all approach, all component pages lose their ability to rank in SERPs. However, while the view-all page can be different from the listing page index, making the view-all the index page is also possible.
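As a minimal sketch, every component page in the series carries the same canonical tag pointing at the view-all URL; the URLs below are hypothetical:

```python
def view_all_canonicals(component_urls, view_all_url):
    """Map every component URL in a paginated series to the rel="canonical"
    tag it should carry, pointing at the view-all page."""
    tag = f'<link rel="canonical" href="{view_all_url}">'
    return {url: tag for url in component_urls}

# Hypothetical URLs for illustration.
canonicals = view_all_canonicals(
    ["https://www.example.com/shoes?page=2", "https://www.example.com/shoes?page=3"],
    "https://www.example.com/shoes?view=all",
)
```

Every page in the series gets an identical tag, which is what signals search engines to consolidate the component pages into the view-all URL.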

The view-all method has advantages such as better usability, indexing signal consolidation, and relative ease of implementation.

However, there are several challenges to consider before creating a view-all page:

  • Consolidating hundreds or thousands of products on one page can dramatically increase page load times, especially on product listing pages. A loading time under four seconds is considered fast, but you should aim for under two seconds. Use progressive loading to make this happen.
  • A view-all page means having hundreds or thousands of links on a single page because the view-all page must display all the items from the component pages. While the compensation may be the consolidation of indexing signals from component pages to the view-all page, we do not have any official word on how search engines assess that many links on a view-all page.
  • Sometimes, you do not want to remove all other component pages and push the view-all to be listed in SERPs. If you want to surface individual pages from the pagination series, use the rel=“next” and rel=“prev” method.
  • Implementation is a bit more complex than for the “noindex” method. However, it is not as complex as the pagination attributes.

There are situations when you want to implement the view-all page solely for user experience purposes and do not want search engines to list it in SERPs. In such situations, make sure that the component pages in the series do not have a rel=“canonical” pointing to the view-all page but rather to the first page of the pagination. Also, mark the view-all page with “noindex”. Additionally, you may want to make the view-all link available only to humans. We can accomplish this using AJAX, cookie-triggered content, or other methods.

If you are concerned with page load times, there are ways to deliver a barebone version of the view-all page to search engines while presenting the fully rendered page to humans, on-demand and without increasing load times. These implementations must consider progressive enhancement[27] and mobile user experience.

However, nowadays, you should consider building progressive web apps that load super-fast rather than complicating things by delivering different resources to bots versus humans.

While de-pagination can work for websites with a reasonably low number of items in their listings, for websites with larger inventories, it may be easier to stay with user-friendly pagination that limits the number of items. From a usability standpoint,

“typically, this upper limit should be around 100 items, though it can be more or less depending on how easy it is for users to scan items and how much a long page impacts the response time”.[28]

Pagination attributes (aka rel=”prev” and rel=”next” method)

Another method for handling pagination is to use pagination attributes (also known as the rel=“prev” and rel=“next” method). Even though Google no longer uses this markup as an indexing signal, it is probably still the best approach for URL discoverability, as it seems to generate good results without completely removing the ability of component pages to rank in search results.

In the <head> section of each component page in the series, you use either rel=“prev” or rel=“next” attributes (or both) to define a “chain” of paginated components.

The previous and next relationship attributes have been HTML standards for a long time[29], but they only got attention after Google pushed them. Rel=“prev” and rel=”next” are just hints to suggest pagination; they are not directives.

Let’s say you have product listings for duvet covers paginated into the following series: /duvet-covers/ (the first page), /duvet-covers?page=2, /duvet-covers?page=3, and so on.

On the first page (the category index page), you would include this line in the <head> section:
<link rel="next" href="" />

The first page contains the rel=“next” markup but no rel=“prev”. Typically, this page in the series becomes the hub and gets listed in the SERPs.

On the second page, you will include these two lines in the <head> section:

<link rel="prev" href="" />
<link rel="next" href="" />

Pages 2 to second-to-last should have rel=“next” and rel=“prev” markup.

Page 2 points back to the first page in the pagination as /duvet-covers/ instead of /duvet-covers?page=1. This is the correct way to reference the first page in the series, and it does not break the chain because:

  • /duvet-covers/ will point to /duvet-covers?page=2 as the next page, and
  • /duvet-covers?page=2 will point to /duvet-covers/ as the previous page.

On the third page (/duvet-covers?page=3), you will include the following markup in the <head> section:

<link rel="prev" href="" />
<link rel="next" href="" />

On the last page of the series, you will include the following link attribute:

<link rel="prev" href="" />

Notice that the last page in the series contains only the rel=“prev” markup.
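Generating this markup server-side is straightforward; here is a minimal Python sketch, using example.com as a placeholder domain and page as the pagination parameter name:

```python
# Build the rel="prev"/rel="next" tags for one component page in a series.
# Page 1 uses the clean category URL (no ?page=1), so the chain stays intact.

def pagination_tags(page: int, last_page: int,
                    base: str = "https://example.com/duvet-covers") -> list[str]:
    def url(n: int) -> str:
        # The first page is referenced as /duvet-covers/, not ?page=1.
        return f"{base}/" if n == 1 else f"{base}?page={n}"

    tags = []
    if page > 1:          # every page except the first gets rel="prev"
        tags.append(f'<link rel="prev" href="{url(page - 1)}" />')
    if page < last_page:  # every page except the last gets rel="next"
        tags.append(f'<link rel="next" href="{url(page + 1)}" />')
    return tags

for p in (1, 2, 4):
    print(p, pagination_tags(p, 4))
```

For a four-page series, page 1 emits only rel="next", pages 2 and 3 emit both, and page 4 emits only rel="prev", exactly as described above.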

The rel=“prev” and rel=“next” method has a few advantages, such as:

  • Component pages retain and share equity with all other pages in the series.
  • It addresses pagination without the need to “noindex” component pages.
  • It consolidates indexing properties such as anchor text and PageRank, just as with a view-all implementation. This means that, in most cases, the series’ index page will show up in Google’s SERPs.
  • On-page SEO factors such as page titles, meta descriptions, and URLs may be retained for individual component pages rather than consolidated into one view-all page.
  • If the listing can be sorted in multiple ways using URL parameters, then these multiple “ordered by” views are eligible to be listed in SERPs. This is not possible with a view-all approach.

Mixing pagination attributes with a view-all page is not a good idea. If you have a view-all page, point the rel=“canonical” on all component pages to the view-all page, and do not use pagination attributes. If you use pagination attributes instead, add self-referencing rel=“canonical” tags to the component pages to avoid duplicate content caused by session IDs and tracking parameters.

Using rel=“canonical” at the same time with rel=“prev” and rel=“next”
Pagination attributes and rel=“canonical” are independent concepts, and both can be used on the same page to prevent duplicate content issues.

For example, page 2 of a series could contain a rel=”canonical”, a rel=“prev”, and a rel=“next”:

<link rel="canonical" href="" />
<link rel="prev" href="" />
<link rel="next" href="" />

This setup tells Google that page 2 is part of a pagination series and that the canonical version of page 2 is the URL without the sessionID parameter. The canonical URL should point to the current component page with no sorts, filters, views, or other parameters, but rel=“prev” and rel=“next” should include the parameters.
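The canonical-building half of this setup can be sketched in Python; the tracking parameter names and the example.com domain below are placeholder assumptions:

```python
# Strip tracking parameters from the canonical URL while keeping
# content-relevant ones (such as the page number). rel="prev"/rel="next"
# would keep the full parameter set, per the rule above.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

TRACKING = {"sessionid", "referrer", "utm_source"}  # hypothetical list

def canonical_url(url: str) -> str:
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k.lower() not in TRACKING]
    return urlunsplit(parts._replace(query=urlencode(kept)))

url = "https://example.com/duvet-covers?page=2&sessionid=abc123"
print(f'<link rel="canonical" href="{canonical_url(url)}" />')
```

The session ID disappears from the canonical, while the page parameter survives, so page 2 canonicalizes to its own clean URL rather than to page 1 or to a view-all page.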

Remember that rel=“canonical” should only deal with duplicate or near-duplicate content. Use it on:

  • URLs with session IDs.
  • URLs with internal or referral tracking parameters.
  • Sorting that changes the display but not the content (e.g., sorting that happens on a page-by-page basis).
  • Component pages that are subsets of a canonical page (e.g., paginated pages whose canonical is the view-all page).

You can also use rel=“canonical” on product variants, i.e., near-duplicate PDPs that share the same product description and differ only in an overhead product attribute (e.g., the same shoe in different sizes). You need to understand your target market before applying the canonical, and you also need to be able to select the canonical product from the collection of SKUs.

Rel=“prev”, rel=“next” and URL parameters
Although rel=“prev” and rel=“next” seem more advantageous than the view-all method from an SEO standpoint, they have implementation challenges.

Regarding URL parameters, the rule on paginated pages is that pagination attributes can interlink only URLs with matching parameters. The only exception is when you remove the pagination parameter for the first page in the series.

To ensure that pagination attributes work properly, ensure that all pages within a paginated rel=“prev” and rel=“next” sequence use the same parameters.

Pagination and tracking parameters

The following URLs are not considered part of the same series since the URL for page 3 has different parameters, and that would break the chain:

In this case, you should dynamically insert the key-value pairs based on the fetched URL.

If the requested URL contains the parameter referrer=twitter, then the pagination URLs should dynamically include the referrer parameter as well:

<link rel="prev" href="">
<link rel="next" href="">

Until March 2022, we could use Google Search Console to tell Google that such a parameter does not change the page content and to crawl only the representative URLs (URLs without the referrer parameter). However, the URL Parameters tool has since been discontinued.

Pagination and viewing or sorting parameters
Another frequent scenario with pagination is sorting and viewing listings that span multiple pages. Because each view option generates unique URL parameters, you must create a pagination set for each view.

Let’s say that the following are the URLs for “sort by newest”, displaying 20 items per page:

On page 1, you will have only the rel=“next” pagination attribute pointing to URLs with sort and view parameters:
<link rel="next" href="">

On page 2, you will have rel=“prev” and rel=”next” also pointing to URLs with sort and view parameters:

<link rel="prev" href="">
<link rel="next" href="">

On page 3, you will have only a rel=”prev” attribute also pointing to URLs with sort and view parameters:

<link rel="prev" href="">

The above markup defines one pagination series.

However, if users can also display 100 items per page, that is a new view option, and it will create a new pagination series. The new URLs will look like the ones below; the view parameter now equals 100.

On page 1, you will have only the rel=“next” pagination attribute pointing to URLs with sort and view parameters:
<link rel="next" href="">

On page 2, you will have the rel=“prev” and rel=”next” also pointing to URLs with sort and view parameters:

<link rel="prev" href="">
<link rel="next" href="">

On page 3, you will have only a rel=”prev” attribute, also pointing to URLs with sort and view parameters:

<link rel="prev" href="">

When dealing with sorting URLs, you may want to prevent search engines from indexing bi-directional sorting options because sorting by newest is the same as sorting by oldest, only in a different order. Keep one default way of sorting accessible—e.g., “newest”—and block the other, “oldest”.

Also, adding logic to the URL parameters not only prevents duplicate content issues but can also:

“help the searcher experience by keeping a consistent parameter order based on searcher-valuable parameters listed first (as the URL may be visible in search results) and searcher-irrelevant parameters last (e.g., session ID)”.[30]

Ensure that parameters that do not change page content (such as internal click IDs) are implemented as standard key-value pairs, not as directories. This is necessary for search engines to understand which parameters are useless and which are useful.

Here are a couple of other best practices for pagination attributes:

  • While technically, you could use relative URLs to reference pagination attributes, you should use absolute URLs to avoid cases where URLs are accidentally duplicated across directories or subdomains.
  • Do not break the chain. This means that page N should point to N-1 as the previous page and to N+1 as the next page (except for the first page, which will not have a “prev” attribute, and the last page, which will not have the “next” attribute).
  • A page cannot contain multiple rel=“next” or rel=“prev” attributes.[31]
  • Multiple pages cannot have the same rel=“next” or rel=“prev” attributes.

The biggest downside of rel=“prev” and rel=“next” is that they can be tricky to implement, especially on URLs with multiple parameters. Also, remember that Bing does not treat the previous and next link relationships like Google does. While Bing uses the markup to understand your website structure, it will not consolidate indexing signals to a single page. If pagination is a problem in Bing, consider blocking excessive pages with a Bingbot-specific robots.txt directive or noindex meta tag.
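A Bingbot-specific robots.txt rule could look like the sketch below. This assumes your pagination parameter is literally named page and that no wanted URLs match the pattern; Bing supports the * wildcard in Disallow rules, but verify the pattern in Bing Webmaster Tools before deploying it.

```
User-agent: bingbot
Disallow: /*?*page=
```

Other crawlers are unaffected because the rule is scoped to the bingbot user agent.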

The AJAX or JavaScript links method

With this method, you create pagination links that are not accessible to search engines but are available to users in the browser. The trade-off is that users without JavaScript cannot access component pages. However, users can still access a view-all page.

Figure 322 – Pagination links are not plain HTML links.

The screenshot above shows that the interface allows users to sort and choose between list and grid views (1). It also allows access to pagination. However, the source code (2) reveals a JavaScript implementation for pagination. If the JavaScript resources needed to generate those links are blocked with robots.txt, Google will not have access to those pagination URLs (3).

This approach can potentially avoid many duplicate content complications associated with pagination, sorting, and view options. However, it can introduce URL discoverability problems.

If you prefer this approach, make sure that search engines have at least one other way to access each product in each listing— using, for example:

  • A more granular categorization that does not require more than 100 to 150 items in each list.
  • A controlled internal linking that links all products in an SEO-friendly way from other pages.
  • Well-structured HTML sitemaps, along with XML Sitemaps.
  • Other sorts of internal links.

Infinite scrolling

A frequent user interface design alternative for pagination is infinite scrolling.[32] Also known as continuous scrolling, it lets users view content as they scroll down toward the bottom of the page without clicking pagination links. Visually, this alternative appears very similar to displaying all the items on the page. However, the difference between infinite scrolling and a view-all page is that with infinite scrolling, the content is loaded on demand (e.g., by clicking a “load more items” button or automatically as more content scrolls into view), while on a view-all page, the content is loaded all at once.

Mobile websites use infinite scrolling more since, on small screens, it is easier to swipe than to click. However, infinite scrolling relies on progressive loading with AJAX[33], which means you will still need to provide links to component URLs to search engines or users without JavaScript active. You will achieve this using a progressive enhancement approach.

Regarding SEO, infinite scrolling does not solve pagination issues for large inventories, which is why Google suggests paginating infinite scrolls.[34]

Google’s advice is OK, but I do not believe that continuous scrolling needs pagination when there aren’t too many products in the listing; a view-all page with 200 items is preferable in many cases. However, pages that list more than 200 items and use infinite scrolling should degrade to plain HTML pagination links for non-JavaScript users, and this includes search engine bots.

Degrading to HTML links means that search engines may still get into pagination problems, so you will have to handle pagination with one of the methods described earlier.

Figure 323—This screenshot depicts the cached version of a subcategory page that uses infinite scrolling and degrades to HTML links for pagination when users do not have JavaScript. For users without JavaScript active (and for search engines), there are plain <a href> links to the Previous and Next pages.

Figure 324 – This screenshot is on the same page, but JavaScript is now turned on. The Previous and Next links do not show up anymore. This was achieved by hiding the pagination section with CSS and JavaScript. Users can continuously scroll to see all watches.[35]

Continuous scrolling has many advantages, such as a better user experience on touch devices, faster browsing due to eliminating page reloads, increased product discoverability, and consolidation of external links.

However, there are disadvantages too,[36] and infinite scrolling does not perform better on all websites.

For example, on Etsy – an ecommerce marketplace for handmade and vintage items – infinite scrolling did not have the desired business outcome, so they reverted to old-fashioned pagination.[37] Infinite scrolling led to fewer clicks from users, as they felt lost in a sea of items and had difficulty sorting between relevant and irrelevant.

However, infinite scrolling may work well on other websites, as reported in this study.[38] As with most ideas for your website, an A/B test will tell you whether removing pagination is helpful for users or not. If you plan to test infinite scrolling, here are a few ideas.

Display visual clues when more content is loading.
Not everyone’s connection is fast enough to load content in the blink of an eye. If your server cannot handle fast user scrolling, let the user know that more content is loading.

Figure 325 – Notice the Loading More Results message at the bottom of the list. It conveys to users that more content is loading.

Consider a hybrid solution
A hybrid approach combines infinite scrolling and pagination. With this approach, you will display a “show more results” button at the end of a preloaded list. Make this button big on the mobile version to combat the fat-finger syndrome[39]. When the button is clicked, it loads another batch of items:

Figure 326 – In this example, more shoes are loaded with AJAX only when users click the “show more results” button.

Add landmarks when scrolling
Amazon uses horizontal pagination in this product widget to give users a sense of how many pages are in the carousel:

Figure 327—The horizontal pagination for this carousel is located at the top right side of the screenshot.

For vertical scrolling, adding landmarks such as virtual page numbers can give users a sense of how far they have scrolled and may help them create mental references (e.g., “I saw a product I liked somewhere around page 6”).

Figure 328 – This screenshot was modified to exemplify a navigational landmark (the horizontal rule and the text “Page 2”).

Update the URL while users scroll down
This is an interesting concept worth investigating. You can automatically append a pagination parameter to the URL when users scroll past a certain number of rows.

This concept is best explained with a video, so I made a brief screen capture to illustrate it in case the original page[40] becomes unavailable (you can download this file from here).

If you regularly have 200 or fewer items in your listings, it is better to load them all at once. Doing so feeds everything to search engines as one big view-all page, which is Google’s preferred implementation for avoiding pagination.

Of course, users will see 10 or 20 items at a time, and you will defer loading the rest of them in the interface. However, the data is in the raw HTML code, and you only use the client (browser) to generate the UI.

This has the potential to save many pagination headaches. Depending on the website’s authority, I sometimes list more than 200 items on category listing pages (CLPs).

If the list is huge, you should probably paginate. But even then, you may want to consider a view-all page.

Complement with filtered navigation
Large sets of pagination should be complemented by filtered navigation, which allows users to narrow the items in the listing based on product attributes. Subcategory navigation also allows users to reach deeper into the website hierarchy.

Figure 329 – Filters can reduce items in a list from hundreds to a few tens or fewer.

The previous listing page has 116 items, but if you filter by Brand=Samsung, the list shortens to 52. If you filter by Color or Finish Family, the list shortens to 9 items.

If infinite scrolling produces better results for your users and revenue, it is probably a good idea to keep it in place. But make it work for users without JavaScript: serve plain pagination by default, remove it client-side, and enable infinite scrolling for users with JavaScript on.

Secondary navigation

On listing pages, primary navigation is always complemented by some ancillary navigation. We call that secondary navigation.

I will refer to secondary navigation as the navigation that provides access to categories, subcategories, and items located deeper in the taxonomy. On ecommerce websites, this type of navigation is usually displayed on the left sidebar.

Figure 330 – Secondary navigation can appear very close to the primary navigation (either at the top or on the left sidebar) and provides detailed information within a parent category.

The secondary navigation often lists subcategories, product attributes, and product filters.

The entire left section in the example above is considered the secondary navigation; it includes subcategories implemented as filters, filter names, and filter values.

Unlike primary navigation, the labels in secondary navigation can change from one page to another to help users navigate deeper into the website taxonomy. This change of links in the navigation menu is probably the most significant difference between primary and secondary navigation.

From an SEO point of view, it is important to create category-related navigation. Doing so offers users more relevant information, provides siloed crawl paths, and gives search engines better taxonomy clues.

Figure 331—Take Amazon, for example. When you are in the Books department of the website, the entire navigation is about books.

Faceted navigation (AKA filtered navigation)

E-commerce sites are often cluttered, displaying too much information to process and too many items to choose from. This leads to information overload and induces choice paralysis.[41] Therefore, it is essential to offer users an easier way to navigate through large catalogs; this is where faceted navigation (or what Google calls additive filtering) comes into play.

Filters can be highly useful whether your visitors are looking for something specific or just browsing the website. It will help users locate products without using the internal site search or the primary navigation, which often shows a limited number of options for users.

Faceted navigation makes it easier for searchers to find what they want by narrowing product listings based on predefined filters in clickable links.

Usability experts refer to faceted navigation as:

“arguably the most significant search innovation of the past decade”.[42]

Faceted navigation almost always positively impacts user experience and business metrics. One retailer saw:

“a 76.1% increase in revenue, a 26% increase in conversions and 19.76% increase in shopping cart visits in an A/B test after implementing filtering on its listing pages”.[43]

This screenshot illustrates a usual design for faceted navigation:

Figure 332 – A sample faceted navigation interface.

It is common to present faceted navigation in the left sidebar. Still, it can also be displayed at the top of product listings, depending on how many filters each category has. In many instances, subcategories are also included in the faceted navigation.

Filters, filter values, and facets have different meanings.

  • Filters represent a group of product attributes. In this screenshot, the filters are Styles, Women’s Size, and Women’s Width.
  • Filter values are the options under each filter. For the Styles filter, the filter values are Comfort, Pumps, Athletic, and so on.
  • Facets are views generated by selecting one or a combination of filter values. Selecting a filter value within the Women’s size filter and a filter value for the Women’s width filter creates the so-called “facet”.

Figure 333 – Selecting one or more filter values generates the so-called facet.

Faceted navigation is a boon for users and conversion rates but can generate a serious maze for crawlers. The major issues faceted navigation generates are:

  • Duplicate or near-duplicate content
  • Crawling traps
  • Non-essential, thin content

Figure 334 – If you received a Google Search Console message like the one in the screencap, faceted navigation is one of the possible causes.

There is no better example of how filtering can create problems than the one offered by Google. On one website Google analyzed, the faceted navigation –alongside other navigation types such as sorting and viewing options– generated 380,000 URLs.[44] And keep in mind that this site sold just 158 products.

If you are curious to find out how many URLs faceted navigation could generate for a product listing page, use the following formula, which counts the possible selections of filter values (combinations, without repetition):

C(n, r) = n! / (r! × (n − r)!)
In this formula, n is the total number of filter values that can be applied, and r is the total number of filters. For instance, let’s say you have two filters:

  • The Styles filter, with five filtering options
  • The Materials filter, with nine filtering options

In this case, n will be 14, the total number of filtering options, and r will be 2 because we have two filters. This setup could theoretically generate 91 URLs.[45]

If you add another filter (e.g., Color), and this filter has 15 filtering options, n becomes 29, and r equals 3. This setup will generate 3,654 unique URLs.

As I mentioned, the formula above counts each combination of filter values only once, regardless of the order of selection. If users select (style=comfort AND material=suede), they get the same results as selecting (material=suede AND style=comfort), ideally at the same URL. If you do not enforce an order for URL parameters, the faceted navigation will instead generate ordered permutations: 182 URLs for the example with two filters and 21,924 URLs for the example with three filters.
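These counts can be reproduced with Python’s math module; comb counts unordered selections (enforced parameter order) and perm counts ordered ones (no enforced order):

```python
# URL counts for the faceted navigation examples above.
from math import comb, perm

print(comb(14, 2))  # 2 filters, 14 values total  -> 91 URLs
print(comb(29, 3))  # 3 filters, 29 values total  -> 3,654 URLs
print(perm(14, 2))  # no enforced parameter order -> 182 URLs
print(perm(29, 3))  # no enforced parameter order -> 21,924 URLs
```

The gap between comb and perm is exactly the duplication you pay for when parameter order is not enforced.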

Figure 335 – The huge difference between the number of pages indexed and the number of pages ever crawled hints at a possible crawling issue or serious content quality issue.

Figure 336 – Notice how many URLs the price parameter generates?!

In the previous screenshot, the issue was identified and confirmed by checking the URL parameters report in the Google Search Console. The large number of URLs was due to the Price filter, which generated 5.2 million URLs.

You can partially solve duplicate content issues generated by faceted navigation by forcing a strict order for URL parameters, regardless of the order in which filters have been selected. For example, Category could be the first selected filter, and Price could be the second. If a visitor (or a crawler) chooses the Price filter first and then Category, you make it so that the Category shows up first in the URL, followed by Price.
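One way to enforce such a strict order is to normalize the query string before generating any internal link. Here is a Python sketch; the filter names and their priority are hypothetical:

```python
# Force a predefined parameter order, regardless of the order in which
# the visitor (or crawler) actually clicked the filters.
from urllib.parse import parse_qsl, urlencode

FILTER_ORDER = ["category", "brand", "color", "price"]  # hypothetical priority

def normalize_query(query: str) -> str:
    params = parse_qsl(query)
    rank = {name: i for i, name in enumerate(FILTER_ORDER)}
    # Known filters sort by priority; unknown ones keep their relative
    # position at the end (sorted() is stable).
    return urlencode(sorted(params, key=lambda kv: rank.get(kv[0], len(rank))))

# Price was clicked before category, but category still comes first:
print(normalize_query("price=100-200&category=refrigerators"))
```

Because every selection path now collapses to one canonical query string, (price, category) and (category, price) produce the same URL.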

Figure 337 – In the URL above, although the cherry filter value was selected after “double door”, its position in the URL is based on a predefined order.

The same order is reflected in the breadcrumbs as well:

Figure 338 – If you need a breadcrumb that reflects the order of user selection, you can store the order in a session cookie rather than in a URL.

Another near-duplicate content issue generated by facets arises when one of the filtering options presents almost the same items as the unfiltered view. For example, the unfiltered view for Ski & Snowboard Racks has 15 products, and you can narrow the results using two subcategories: Hitch Mount and Rooftops.

Figure 339 – The above is the product listing page for Ski & Snowboard Racks.

However, the subcategory Rooftop Ski Racks & Snowboard Racks includes 13 results from the unfiltered page. This means that except for two products, the filtered and unfiltered pages are near duplicates.

Figure 340 – The Rooftop Ski Racks & Snowboard Racks.

Faceted navigation has a significant advantage over hierarchical navigation: The filter combinations will generate pages that could not exist in a tree-like hierarchy. Tree-like hierarchies are rigid and cannot cover all possible combinations generated by faceted navigation. However, the hierarchy structure is still good for high-level decisions.

Let’s say that you sell jewelry and would like to rank for the query “square platinum pendants”. Your website hierarchy only segregates into jewelry-type categories such as pendants, bracelets, etc. Then, the site allows filtering based on a Material filter, with values such as platinum, gold, etc. If there is no Shape filter to list the square option, your website will have no faceted navigation page for “square platinum pendants”.

However, introducing the Shape filter on the Platinum Pendants listing page would allow you to generate the Square Platinum Pendants facet, which narrows down the inventory based on the square filter value. This page is relevant to users and search engines. You can optimize this page with custom content and visuals to make it more appealing to machines and humans.

Figure 341 – An additional filter – Shape – would allow targeting more torso and long-tail keywords.

Suppose there is no Shape filter to generate the Square Platinum Pendants facet and no hierarchical navigation that could lead to such a page. In that case, you must manually create a page targeting the “square platinum pendants” query. Then, you will have to link to it internally and externally so search engines can discover it. Depending on the size of your product catalog, it will be practically impossible to create thousands or even millions of such pages manually.

Essential, important, and overhead filters

Before discussing how to approach faceted navigation from an SEO perspective, it is important to break down the filters and facets into three types: essential, important, and overhead.

Essential filters/facets
Essential filters will generate landing pages that target competitive keywords with high search volumes, usually “head” or “torso terms”. If your faceted navigation lists subcategories, those facets are essential and are called faceted subcategories.

Figure 342—In this example, Bags, Wallets, and the remaining subcategories are essential filters. Search engines should always be allowed to crawl and index such filters.

Essential facets can also be generated by combining filter values under Brand + Category—for example, using the filter value “Nokia” for Brand and the filter value “Cameras” for Category.

Either Category or Brand can be considered a facet, as they function as filters for larger data sets.

You can handpick the top combinations of essential filters that are valuable for your users and your business. Turn them into standalone landing pages by adding content and optimizing them as you would with a regular important page. This is mostly a manual process that requires content creation, so it is doable for only a limited number of pages at once.

However, if you do this regularly and commit resources to content creation, you will have an advantage over your competitors. Start with the most important 1% of facets and gradually move on. If you do a couple per day (you need only about 100 to 150 carefully crafted words), you will have optimized hundreds of filtered pages in a year.

All essential facet pages should have unique titles and descriptions. Ideally, the titles should be custom, while the descriptions can be boilerplate.

Make sure that search engines can find the links pointing to essential filters and facets. Link to essential filter URLs from content-rich pages, such as blog posts and user guides. If possible, interlink from the main content area of such pages.

The URL structure for essential facets should be clean. It should ideally reflect, either partially or exactly, the hierarchy of the website in a directory structure or a file-naming convention:

Figure 343 – The URL for the Bathroom Accessories subcategory facet is parameter-free and reflects the website’s hierarchy.

Important filters/facets
These refinements will lead users and search engines to landing pages that can drive traffic for “torso” and “long-tail keywords”.

For example, if your analytics data proves that your target market searches for “red comfort shoes”, URLs generated by Color + Style selections are important facets. Search engines should be able to access important facet URLs.

You must decide what is and is not an important facet, preferably on a category or subcategory basis. For instance, the Color filter can be relevant and important for the Shoes subcategory, but it will be an overhead filter for the Fragrances subcategory.

The Sales or Clearance filter is a particular case you need to pay attention to. In the next example, the retailer lists all the facets for all the products on clearance.

Figure 344 – The left navigation filters in the image above do not help users much because Rugs do not have sleeves, Snowshoes do not have a shirt style, and Pullovers do not have a ski pole style.

Instead of listing products, this retailer should list only subcategories in the left navigation and the main content area. This makes it more likely that users will resolve the ambiguous nature of the Clearance page by first choosing a category that interests them. Once the desired category has been selected, the retailer should display the filters that apply to that category.

Depending on how your target market searches online, it is advisable to prevent search engines from accessing URLs generated when more than two filter values have been applied. If one of the applied filters is essential, block crawling only when three or more filter values have been applied.

This works best with multiple selections on the same filter (e.g., Brand=Acorn AND Brand=Aerosoles) because users are less likely to search for patterns like “{brand1}{brand2}{category}” (e.g., “Acorn Aerosoles shoes”).

Selecting multiple filter values is useful for users who might select Red AND Blue Shirts, but they are not so useful for search engines. Therefore, such selections can be blocked for bots.
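The blocking rules above can be sketched as a helper that decides the robots meta tag for a facet page. The thresholds and the essential-filter list below are assumptions you would adapt to your own data:

```python
# Decide whether a facet page should be indexable, based on how many
# filter values are applied and whether any filter is essential.

ESSENTIAL = {"category", "brand"}  # hypothetical essential filters

def robots_meta(applied: dict[str, list[str]]) -> str:
    """applied maps a filter name to the selected values, e.g.
    {"brand": ["acorn", "aerosoles"], "color": ["red"]}."""
    value_count = sum(len(v) for v in applied.values())
    limit = 3 if any(f in ESSENTIAL for f in applied) else 2
    # Multi-selections on the same filter rarely match search patterns.
    multi_select = any(len(v) > 1 for v in applied.values())
    if value_count > limit or multi_select:
        return '<meta name="robots" content="noindex, follow" />'
    return '<meta name="robots" content="index, follow" />'

print(robots_meta({"color": ["red"], "style": ["comfort"]}))
print(robots_meta({"brand": ["acorn", "aerosoles"]}))
```

A single-value Color + Style facet stays indexable, while a multi-brand selection is kept out of the index, matching the reasoning above.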

Figure 345 – An example of multi-selections on the Brand filter.

Blocking access to faceted navigation URLs by default whenever multiple filters are applied will prevent bots from discovering pages created by single-value selections on different filters (e.g., Color=red AND Style=comfort). You will miss traffic for many filter combinations (unless you manually create and optimize landing pages for all the important filters and facets and allow bots to crawl and index those pages).

Let your data be the source of truth when deciding which facets to leave open for search engines. Gather data from various sources, then programmatically replace keywords with their filter values when appropriate. This is similar to the Labeling and Categorization technique described in the Website Architecture section. You must identify patterns and see which facets or filters your visitors use the most. In your ecommerce platform, mark the important filters and let them be indexed.
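One way to let the data decide is to count how often each filter's values show up in real search queries. The sketch below is a minimal, hypothetical illustration of that idea: the query list, filter values, and the 30% threshold are all assumptions you would replace with your own analytics data and rules.

```python
from collections import Counter

# Hypothetical sample data: real input would come from your search
# analytics (site search logs, Search Console queries, PPC terms, etc.).
QUERIES = [
    "red running shoes",
    "acorn shoes",
    "blue running shoes",
    "running shoes size 11",
    "red dress shoes",
]

# Filter values as configured in your ecommerce platform, keyed by filter.
FILTERS = {
    "color": {"red", "blue", "green"},
    "brand": {"acorn", "aerosoles"},
    "size": {"size 11", "size 12"},
}

def facet_demand(queries, filters):
    """Count how many queries mention at least one value of each filter."""
    counts = Counter()
    for q in queries:
        q = q.lower()
        for facet, values in filters.items():
            if any(v in q for v in values):
                counts[facet] += 1
    return counts

demand = facet_demand(QUERIES, FILTERS)
# Facets mentioned in a meaningful share of queries are candidates for
# crawlable, indexable facet URLs; the rest are likely overhead.
important = {f for f, n in demand.items() if n / len(QUERIES) >= 0.3}
```

In this toy data set, only the color facet clears the threshold; in practice, you would run this over thousands of queries and review the results by category before marking anything as indexable.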

The URL structure for important facets must be as clean as possible. It is OK to keep the important filter values in a directory or file path structure. Keeping them in URL parameters is also OK as long as you use no more than two or three parameters.

Figure 346 – When an important filter is applied, its value is appended to the URL as a directory. In this URL, the filter value is Kohler under the Brand filter. Avoid using non-standard URL encoding—like commas or brackets—for URL parameters whenever possible.[46]

Often, search engines treat pages created by filters as subsets of the unfiltered page. To avoid being pushed into the supplemental index, you must create unique titles, descriptions, breadcrumbs, headings, and custom content on these filtered pages. Boilerplate titles and descriptions may be fine, but do not just repeat the title of the unfiltered view on facet pages. If you implemented pagination with rel=“prev”/“next”, the unfiltered view would be the view-all or index page.

The breadcrumbs and headings must also be updated to reflect the user selection. This may sound obvious, but it is amazing how many e-commerce websites do not do it.

One technique that can increase the relevance of each filtered page and decrease near-duplicate content problems is to write product descriptions that include the filter values used to generate the page. For instance, let’s say you sell diamonds. When a user selects a value under the Material filter, the product description snippet will include the value of the filter.

Figure 347 – The PLP above filters the SKUs by Material=white gold. The quick-view product description for the second item cleverly includes the words “white” and “gold”.

This quick view snippet is different from the product description on the product detail page:

Figure 348 – Section (1) is the quick view snippet, and section (2) is the full product description.

Section (1) in the previous screenshot shows the quick view product description snippet on the product listing page. As you can see, the snippet was carefully created to include all the important filter values. Section (2) depicts the product description on the product detail page. These two product descriptions are different.

Writing custom product snippets for listing pages is an effective SEO tactic, even when you feature only 20 to 25 words for each product. However, it is difficult to write such snippets when you have thousands of products. A workaround is to write detailed product descriptions that include the most important product filter values either at the beginning or the end of the description and then automatically extract and display the first/last sentence on the product listing page.
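The extract-the-first-sentence workaround can be automated with a few lines. This is a sketch under the stated assumption that your writers front-load the important filter values into the opening sentence of each product description; the sample description and the 25-word cap are illustrative, not prescriptive.

```python
import re

def listing_snippet(description, max_words=25):
    """Take the first sentence of a product description for the PLP.
    Assumes descriptions were written so the opening sentence carries
    the important filter values (material, color, brand, etc.)."""
    first = re.split(r"(?<=[.!?])\s+", description.strip())[0]
    words = first.split()
    if len(words) > max_words:
        first = " ".join(words[:max_words]) + "…"
    return first

desc = ("This classic solitaire ring is crafted in 14k white gold "
        "with a brilliant round-cut diamond. Ships in a gift box.")
snippet = listing_snippet(desc)
# The PLP snippet now contains "white gold", matching the Material
# filter value, while the shipping sentence stays on the PDP only.
```

The same function can pull the last sentence instead (`[-1]`) if your descriptions end, rather than begin, with the filter values.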

Another method to increase the relevance of the listing pages generated by important facets is adding the selected filter values to the product listing on the fly. However, this can transform into spam if you are not careful. If you use this approach, make sure you have rules to avoid keyword stuffing.

Overhead filters
These filters generate pages with minimal or no search volume. All they do is waste the crawl budget on irrelevant URLs. A classic example of an overhead filter is Price; in many instances, so is Size. However, remember that a filter can be an overhead for one business but important or essential for another.

You should prevent search engines from crawling URLs generated based on overhead filters and mark filters as overheads on a category basis. Whenever a combination of filters includes an overhead value, add the “noindex, follow” meta tag to the generated page and append the crawler=no parameter to its URL. Then, block the crawler parameter with robots.txt.

The directive in robots.txt will prevent wasting the crawl budget, while the noindex meta tag will prevent empty snippets from showing up in the SERPs. If you have pages in the index that you need to remove, first implement the “noindex, follow” and wait for them to be removed. Once they are removed, append the crawl control parameter to the URLs.

Be careful about combining robots.txt with the noindex meta tag: noindex is a page-level directive, and blocking a URL with robots.txt prevents robots from fetching the page and ever seeing that directive. If your website has no index bloat or crawling issues, you may consider implementing rel=“canonical” instead of robots.txt.

You can also use AJAX to generate the content for overhead facets so that the URLs will not change and search engine crawlers will not request unnecessary content. In this case, you must block the scripts (and all other resources needed for the AJAX calls) with robots.txt. This will prevent search engines from rendering the AJAX links.

If you want the implementation to degrade gracefully for users with JavaScript off, you can use URL parameters placed either after a hash mark (#) or in a URL string blocked with robots.txt. However, the most stringent crawling restriction is simply not making the overhead URLs available to bots at all.

Figure 349 – Notice how the URL above contains the NCNI-5 string at the end.

The NCNI-5 string is used to control crawlers because all URLs containing the NCNI-5 string are blocked with robots.txt:

Disallow: /*NCNI-5*

To summarize, this is how Home Depot defines the filtered URLs for all three types of facets:

Figure 350 – Each filter/facet is treated differently, depending on how important each facet is.

The URL for the essential facet is made of a clean category name. The important-facet URL includes the category name and the filter value, Kohler. The overhead URL—while it includes the category name and the filter value—also includes the crawl control string, NCNI-5.

It is a bad idea to rewrite URLs to make overhead filters look like static URLs. The following sample URL includes the overhead filter Price, with values of 50 to 100.

The URL above does not exist on Home Depot’s website; I only added the /Price/50-100 part to illustrate. Generating search engine-friendly URLs does not change the fact that your website will have millions of irrelevant pages.

Regarding URL discoverability, search engines do not need to find links pointing to overhead filters or facets; in fact, you should actively prevent search engines from discovering them.

If you have to allow search engines to crawl overhead facets, keep the filters in parameters using standard HTML encoding and key=value pairs instead of in directories. This helps search engines differentiate between useful and useless values.

A faceted navigation case study

A Google search for “Canon digital cameras” lists Overstock on the first page, OfficeMax on the fifth, and Target on the seventh.

Overstock’s approach to filtered navigation

Figure 351 – The above is the Digital Cameras sub-subcategory page filtered by Brand=Canon. It has a unique title, customized breadcrumbs, and a relevant H1 heading. Also, this page uses a good meta description. These elements send quality signals to search engines.

When users filter the SKUs by another brand (e.g., Sony), the page elements update. If they did not, the Canon Digital Cameras page would have the same H1, title, description, and breadcrumbs as the Digital Cameras page, which is undesirable.

Additionally, Overstock allows the crawling of essential and important filters and does not create links for gray-end filters (filters that generate zero results).

Figure 352 – The “10 Megapixels” filter value generates zero results. Therefore, it is not hyperlinked.

Overstock’s implementation of faceted navigation is SEO-friendly because it allows crawlers to access various filtered pages and updates page elements based on user or crawler selection.

A note on gray-end filter values: whatever you choose to do with these filter values in the interface (i.e., not showing them at all, showing them at the bottom of the filters list, or hiding them behind a “show more” link), gray-end filters should not be hyperlinked. If you must hyperlink them, the header response code for zero-results pages should be 404. If returning 404 is impossible, mark the pages with “noindex, follow”. Alternatively, you can use robots.txt to block URLs generating zero results.

OfficeMax’s approach to filtered navigation

Figure 353 – The image above depicts the Digital Cameras PLP on OfficeMax.

The title, the description, and the breadcrumbs do not update when the user selects a filter value under the Brand filter. This means thousands of filtered pages will have similar on-page SEO elements to the unfiltered page. Although the products on each filtered page will change, search engines will get a lot of near-duplicate meta-tag signals.

This staleness might cause Google not to index the faceted page resulting from filtering by Canon. The page that ranks for “Canon digital cameras” on OfficeMax is the Digital Cameras category page. This is not the ideal page to rank with because it does not match the user intent behind the search query: filtering by Brand=Canon means that searchers must take an additional, unnecessary step.

On the cached version of the Digital Cameras page, we notice that the faceted navigation is nowhere to be found. That’s happening because the faceted navigation is not accessible to search engines.

Figure 354 – The faceted navigation is not accessible to search engines.

This is a quick reminder that if the links are missing in the cached version, Google might still be able to find them when it renders the page like a browser would.

Maybe OfficeMax tried to fix some over-indexation issues or a possible Panda filter on thin content pages. However, this faceted navigation implementation is not optimal, as it completely blocks access to all filtered pages. Unless OfficeMax creates manual landing pages for all essential and important filtered pages, they have closed the doors to search engines and the traffic those pages could bring in.

Target’s approach to faceted navigation

Figure 355 – Like OfficeMax, Target does not create relevant signals for filtered pages.

On Target’s website, the page title, breadcrumb, heading, and description for their Canon Digital Cameras page are the same as on the unfiltered page, Digital Cameras. The elements above will be the same on hundreds or thousands of other possible filtered pages.

Moreover, since the page has a canonical pointing to the unfiltered page, its ability to rank is (theoretically) zero. Given this approach, I expected Target to have a dedicated Canon Digital Cameras page reachable from the navigation or other pages. If they had one, Google was not able to identify it.

Figure 356 – Google could not find a category page (or even a facet URL) relevant to Canon Digital Cameras. All the most relevant results were PDPs.

Google’s cached version of the page shows that faceted navigation does not create links:

Figure 357 – Because Brand is an important filter, and we used only one filter value in our example, all the filter values under Brand should be plain HTML links.

Categories in faceted navigation

Hierarchical, category-based navigation is useful as long as it is easy for users to choose between categories. For instance, it could be more helpful for users if easy-to-decide-upon subcategories were listed in the main content area instead of being displayed as facet subcategories in the sidebar. Subcategory listing pages should be used:

“whenever further navigation or scope definition is needed before it makes sense to display a list of products to users. Generally, sub-category pages make the most sense in the one or two top layers of the hierarchy where the scope is often too broad to produce a meaningful product list”.[47]

Figure 358 – This category displays the next level of hierarchy categories in the main content area.

In the previous screenshot, faceted navigation (usually present in the left navigation as filtering options) is not yet introduced at this hierarchy level (the category level) or even at the sub-subcategory level.

In the next example, you can see how category-based navigation ends at the third level of the hierarchy; the first level is the Décor category, the second is Blinds & Window Treatments, and the third is Blinds & Shades.

The faceted navigation is displayed only at the third level of the hierarchy.

Figure 359 – The faceted navigation displayed in the left sidebar to help make decisions. You can also notice that the subcategory listing has been replaced with the product listing.

It is important to keep hierarchies relatively shallow so users do not have to click through more than four layers to get to the list of products. Search engines will have the same challenges and may deem products buried deep in the hierarchy unimportant.

Because faceted navigation is a granular inventory segmentation feature, it generates excess content in most implementations. It will also generate duplicate content—for instance, if you do not enforce a strict order for URL parameters.
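One concrete way to enforce a strict parameter order is to normalize every faceted URL before it is rendered or linked. The sketch below sorts query parameters alphabetically; the example URLs are hypothetical, and your platform may already offer an equivalent setting.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def normalize_filter_url(url):
    """Rewrite a faceted URL so its query parameters always appear in
    the same (alphabetical) order. Applying Color then Style, or Style
    then Color, resolves to one URL instead of two duplicates."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query))
    return urlunsplit(parts._replace(query=urlencode(params)))

a = normalize_filter_url("https://example.com/shoes?style=comfort&color=red")
b = normalize_filter_url("https://example.com/shoes?color=red&style=comfort")
assert a == b  # one canonical form regardless of selection order
```

Whichever ordering rule you pick matters less than applying it consistently everywhere URLs are generated, so internal links, canonicals, and sitemaps all agree.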

So, what options do we have for controlling faceted navigation?

Option rel=“canonical”

Although rel=“canonical” is supposed to be used for identical or near-identical content, it may be worth experimenting with canonicals to optimize content across faceted navigation URLs.

Vanessa Fox, who worked for Google Webmaster Central, has suggested the following approach for some cases:

“If the filtered view is a subset of a single non-filtered page (perhaps the view=100 option), you can use the canonical attribute to point the filtered page to the non-filtered one. However, if the filtered view results in a paginated content, this may not be viable (as each page may not be a subset of what you would like to point to as canonical)”.[48]

Rel=“canonical” will consolidate indexing signals to the canonical page and address some duplicate content issues, but search engine crawlers may still get trapped into crawling irrelevant URLs.

Rel=“canonical” is a good option for new websites or when adding new filtering options to existing ones. However, it is not helpful if you are trying to remove existing filtered URLs from search engine indices. If you do not have indexing and crawling issues, you can use rel=“canonical”, as Vanessa suggests.

Option robots.txt

Robots.txt is the crawl control sledgehammer. Remember that if you use robots.txt to block URLs, you will tamper with the flow of PageRank to and from thousands of pages. That is because while URLs listed in robots.txt can get PageRank, they do not pass PageRank.[49] Also, remember that robots.txt does not prevent pages from being indexed.

However, this approach is necessary in some cases—e.g., when you have a new website with no authority and many items that need to be discovered or when you have thin content or indexing issues.
If you use parameters in URLs and would like to prevent the crawling of all the URLs generated by selecting values under the Price filter, you would add something like this in your robots.txt file:

User-agent: *
Disallow: *price=

This directive means that any URL containing the string price= will not be crawled.
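If you want to sanity-check a rule like this before deploying it, you can simulate the documented wildcard matching. The function below is a simplified sketch of robots.txt path matching (“*” matches any sequence of characters, “$” anchors the end, rules are otherwise prefix matches), not a full parser.

```python
import re

def rule_matches(pattern, url_path):
    """Simplified robots.txt path matching: '*' matches any run of
    characters, '$' anchors the end, and rules are otherwise prefix
    matches against the path (plus query string)."""
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"
    return re.match(regex, url_path) is not None

# Disallow: *price=  — blocks any URL whose path/query contains "price="
assert rule_matches("*price=", "/shoes?price=50-100")
assert not rule_matches("*price=", "/shoes?color=red")
```

Testing the pattern against a sample of real URLs from your logs is cheap insurance against accidentally blocking important facet pages.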

Robots.txt blocked URL parameter/directory

This method requires you to selectively add a URL parameter to control which filtered pages are crawlable and which are not. I described this in the Crawl Optimization section but will repeat it here.

First, decide which URLs you want to block.

Let’s say that you want to control the crawling of the faceted navigation by not allowing search engines to crawl URLs generated when applying more than one filter value within the same filter (also known as multi-select). In this case, you will add the crawler=no parameter to all URLs generated when a second filter value is selected on the same filter.

Suppose you want to block bots when they try to crawl a URL generated by applying more than two filter values on different filters. In that case, you will add the crawler=no parameter to all URLs generated when a third filter value is selected, no matter which options were chosen or the order they were chosen. Here’s a scenario for this example:

The crawler is on the Battery Chargers subcategory page.

The hierarchy is: Home > Accessories > Battery Chargers
The page URL is:

Then, the crawler “checks” one of the Brands filter values, Noco. This is the first filter value; therefore, you will let the crawler fetch that page.

The URL for this selection does not contain the exclusion parameter:

Then, the crawler checks one of the Style filter values, cables. Since this is the second filter value applied, you will still let the crawler access the URL.

The URL still does not contain the exclusion parameter. It contains just the brand and style parameters:

Then, the crawler “selects” one of the Pricing filter values, the number 1. Since this is the third filter value, you will append the crawler=no to the URL.

The URL becomes:

If you want to block the URL above, the robots.txt file will contain:

User-agent: *
Disallow: /*crawler=no

The method described above prevents the crawling of facet URLs when more than two filter values have been applied, but it does not allow specific control over which filters will be crawled and which ones will not. For example, if the crawler “checks” the Pricing options first, the URL containing the pricing parameter will be crawled.

Blocking filtered pages based solely on how many filter values have been applied poses some risks. For instance, if a Price filter value is applied first, the generated pages will still be indexed since only one filter value has been selected. You should have more solid crawl control rules—e.g., if an overhead filter value has been applied, always block the generated pages.
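Those two rules combined (block overhead filters always, and block any combination deeper than two filter values) can be sketched as follows. The overhead-filter set, the crawler=no parameter name, and the URLs are from the examples in this section; in a real setup, the overhead list would be maintained per category.

```python
OVERHEAD_FILTERS = {"price", "size"}  # per-category in a real setup

def crawl_control(url, applied_filters):
    """Append the robots.txt-blocked crawler=no parameter when the
    combination of applied (filter, value) pairs should not be crawled:
    any overhead filter, or more than two filter values in total."""
    overhead = any(f in OVERHEAD_FILTERS for f, _ in applied_filters)
    too_deep = len(applied_filters) > 2
    if overhead or too_deep:
        sep = "&" if "?" in url else "?"
        return url + sep + "crawler=no"
    return url

url = "/battery-chargers?brand=noco&style=cables"
# Two non-overhead values: crawlable, URL unchanged.
assert crawl_control(url, [("brand", "noco"), ("style", "cables")]) == url
# A price value is applied: blocked regardless of how many filters.
blocked = crawl_control("/battery-chargers?price=1", [("price", "1")])
assert blocked.endswith("crawler=no")
```

Note how the overhead rule fires even when only one filter value has been applied, which closes the loophole described above.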

Limiting the number of selections a search engine robot can discover is also a good idea. We will discuss this later in this section as the JavaScript/AJAX crawl control option.

Important filters or facets must be plain HTML links. You can present overhead filters as plain text to search engines (no hyperlinks) but as functional HTML to users (hyperlinks).
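Server-side, that split can look like the sketch below: important facets are rendered as plain <a href> links, while overhead facets are emitted as text that client-side JavaScript later makes clickable for users. The facet names, the data-href attribute, and the js-filter class are hypothetical conventions, not a standard.

```python
IMPORTANT = {"brand", "color"}  # decided per category from your data

def render_filter_value(facet, value, href):
    """Render one filter value: a real link for important facets, a
    plain <span> (upgraded to a link by client-side JS) for overhead
    facets, so bots see no crawlable URL for overhead selections."""
    label = value.title()
    if facet in IMPORTANT:
        return f'<a href="{href}">{label}</a>'
    return f'<span data-href="{href}" class="js-filter">{label}</span>'

print(render_filter_value("brand", "acorn", "/shoes?brand=acorn"))
print(render_filter_value("price", "50-100", "/shoes?price=50-100"))
```

Remember that this only works as crawl control if the script that upgrades the spans is blocked with robots.txt; otherwise, rendering may expose the URLs anyway.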

The blocked directory approach requires putting the unwanted URLs under a directory and then blocking that directory in robots.txt.

In our previous example, when the crawler checks one of the Pricing options, place the filtering URL under the /filtered/ directory. If your regular URL looks like this:

When you control crawlers, the URL will include the /filtered/ directory:

If you want to block the URL, the robots.txt will contain:

User-agent: *
Disallow: /filtered/

Option nofollow

Some websites prefer to add nofollow to unnecessary facet URLs (I refer to such URLs as refinements). Surprisingly, and contradicting the other official recommendation that tells us not to nofollow any internal links, nofollow is one of Google’s recommendations for handling faceted navigation[50]. However, nofollow does not guarantee that search engines will not crawl unnecessary URLs or that those pages will not be indexed. Additionally, nofollow-ing internal links might send search engines the wrong signals because nofollow translates into “do not trust these links”.

Hence, nofollow does not solve current indexing issues. This option works best with new websites.

It may be a good idea to either “back up” the nofollow option with another method that prevents URLs from being indexed (e.g., blocking URLs with robots.txt) or to canonicalize the link to a superset.

Option JavaScript/AJAX

We established that essential and important facets/filters should always be accessible to search engines as links. Preferably, those will be plain HTML links. On the other hand, URLs for overhead filters and facets can safely be blocked for search engine bots.

Theoretically, you can obfuscate the entire faceted navigation from search engines by loading it with search engine “unfriendly” JavaScript or AJAX. We have seen this deployed at OfficeMax. However, excluding the entire faceted navigation is usually a bad idea. It should only be done if there are alternative paths for search engines to reach pages created for all essential and important facets. In practice, this is neither feasible nor recommended.

One option is to allow search engines access only to essential and important facets links while not generating overhead links. For example, you load only the important facets and filters as plain HTML, while the overhead filters or facets are loaded with JavaScript or AJAX. Users can click on any links, as they will be generated in the browser (e.g., using “see more options” links).

Figure 360 – Some filter values in the faceted navigation are not hyperlinked.

In this example, users are shown just two filter values for Review Rating, with a link to Show All Review Rating (column 1). When they click on that link, they see all the filter values (column 2). However, the Show All Review Rating is not a link for search engines (column 3).

This will effectively limit the number of URLs search engines can discover, which may be good or bad, depending on your situation. If your target market searches for “laminate flooring 3-star reviews”, then you need to make the corresponding link available to bots.

Similarly, you can obfuscate entire filters or just some filter values. For example, eBay initially presents users with only a limited number of filters and filter values; a click on “see all” or “More refinements” then opens all the filters in a modal window:

Figure 361 – This modal window contains all the links needed for users to refine the list of products.

However, the content of the modal window is not accessible to search engines, as you can see in this screenshot, where the “More refinements” is not hyperlinked:

Figure 362 – The “More refinements” element looks and acts like a link but is not a plain <a href> link.

One advantage of selectively loading filters and facets with robotted AJAX or JavaScript is that it may help pass more PageRank to other, more important pages. This is very similar to the old PageRank sculpting concept. However, remember that this “sculpting” happens only if search engines cannot execute AJAX on such pages. And search engines are getting better by the day at executing JavaScript and AJAX. To ensure the links are not accessible to Googlebot, block the resources necessary for the JavaScript or AJAX calls with robots.txt, and then do a fetch and render in Google Search Console.

If you know that some pages are not valuable for search engines and do not want those useless pages in the index, then why allow bot access to them in the first place?

Another advantage of selectively loading URLs is that it will prevent unnecessary links from being crawled.

The hash mark option

You can append parameters after a hash mark (#) to avoid the indexing of faceted URLs. This means you can let faceted navigation create URLs for every possible combination of filters. As a note, remember that AJAX content is signaled with hashbang (#!). However, Google no longer recommends this scheme.

If you do an “info:” search for a page with a hash mark in the URL, you will see that Google defaults to the URL that excludes everything after the hash mark.

For search engines, this page:,70&sort=newest&page=1
defaults to the content on this page:

Figure 363 – Google caches the content of the page generated before the usage of hash marks.

The hash mark can potentially consolidate linking signals to the base URL (the part before the #), but all the pages generated using the hash mark will not be indexed; therefore, they cannot rank.

However, you can place only the overhead filters after the hash mark. Whenever an essential or important facet is selected, include it in a clean URL before the hash mark. Multiple selection filters can also be added after the hash mark.

You can also control crawlers using Bing and Google’s URL parameters handling tools (discontinued as of March 2022).

Figure 364 – This setup hints to Google that the mid parameter narrows the content. I prefer to tell Google about the effect that each parameter has on the page content, but in the end, I will let them decide which URLs to crawl.

This setup presents only a clue to Google, so you must still address crawling and duplicate content using another method (e.g., blocking overhead facets with selective robots.txt) or combining methods.

Option noindex, follow

Adding the “noindex, follow” meta tag to pages generated by overhead filters can help address “index bloat” issues, but it will not prevent spiders from getting caught in filtering traps.

A quick note about using the noindex directive in conjunction with robots.txt: theoretically, “noindex, follow” can be used with robots.txt to prevent the crawling and indexing of new websites. However, if unwanted URLs have already been crawled and indexed, you first have to add “noindex, follow” to those pages and let search engine robots crawl them. This means you will not block the URLs with robots.txt yet. Block the unwanted URLs with robots.txt only after they have been removed from the index.

Sorting items

Users must be allowed to sort listings based on various options. Some popular sort options are bestsellers, new arrivals, most rated, price (high to low or low to high), product names, and discount percentage.

Figure 365 – Some popular sorting options.

Sorting simply changes the order in which the content is presented, not the content itself. This will create duplicate or near-duplicate content problems, especially when the sorting can be bidirectional (e.g., sort by price—high to low and low to high) or when the entire listing is on a single page (view-all).

Google tells us that if the sort parameters never exist in the URLs by default, they do not even want to crawl those URLs.

Figure 366 – This screencap is from Google’s official presentation, “URL Parameters in Webmaster Tools”.[51]

The best way to approach sorting is situational, depending on how your listings are set up.

Use rel=“canonical”

Many times, sort parameters are kept in the URL. When users change the sort order, the sort parameters are appended to the URL, and the page reloads. In this case, you can use rel=“canonical” on sorted pages to point to a default page (e.g., sorted by bestsellers).

Figure 367 – In this screenshot, you see that while sorting generates unique URLs for ascending and descending sort options, both URLs point to the same canonical URL.

The use of rel=“canonical” is strongly advised when the sorting happens on a single page because sorting the content will change only how it is displayed but not the content itself. This means that the content on each page, although sortable, will not be different, and the generated page will be an exact duplicate. For instance, when sorting reorders the content on a view-all page, you generate exact duplicates (given that the view-all page lists all items in the inventory). However, even when the content is sorted on a page-by-page basis rather than using the entire paginated listing, you create near or exact duplicate content.
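Generating the canonical target for a sorted URL usually means stripping the sort parameters and pointing to the default view. The sketch below assumes hypothetical parameter names (sort, order, dir); substitute whatever your platform appends to sorted URLs.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

SORT_PARAMS = {"sort", "order", "dir"}  # hypothetical parameter names

def canonical_for_sorted(url):
    """Build the rel=canonical target for a sorted listing URL by
    dropping sort parameters, so every sort direction points to the
    default view of the same listing."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in SORT_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

url = "https://example.com/ski-boots?page=2&sort=price&dir=desc"
canonical = canonical_for_sorted(url)
# The sorted page then emits:
#   <link rel="canonical" href="https://example.com/ski-boots?page=2">
```

Both the “high to low” and “low to high” URLs resolve to the same canonical, which is exactly the consolidation described above.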

Removing or blocking sort-order URLs

This requires adding the “noindex, follow” meta tag to sort URLs or blocking access altogether using robots.txt or within Google Search Console.


Figure 368 – In this example, the Ski Boots listing can be sorted in two directions (price “high to low” and price “low to high”).

When items can be sorted in two directions, the first product on the first page sorted “high to low” becomes the last product on the last page sorted “low to high”. The second product on the first page becomes the second to last product on the last page, and so on. Depending on the number of items you list by default and how many products are listed, you may end up with exact or near duplicates. For example, let’s say you list 12 items per page, and there are 48 items in total. This means that the last page in the pagination series will display exactly 12 items. When you list by price “high to low”, the products on the first page of the pagination will be the same as those on the last page when sorting “low to high”.
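The 48-item, 12-per-page example above can be verified in a few lines. This is a minimal sketch where each SKU is identified by its price:

```python
prices = list(range(48, 0, -1))  # 48 SKUs, identified here by price
PER_PAGE = 12

high_to_low = sorted(prices, reverse=True)
low_to_high = sorted(prices)

first_page_desc = high_to_low[:PER_PAGE]   # page 1, price high to low
last_page_asc = low_to_high[-PER_PAGE:]    # last page, price low to high

# Same 12 products, just in reverse order: two URLs, one set of content.
assert set(first_page_desc) == set(last_page_asc)
assert first_page_desc == list(reversed(last_page_asc))
```

The same overlap occurs, page for page, across the whole pagination series, which is why only one sort direction should be crawlable.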

One way to handle bidirectional sorting is to allow search engines to index only one sorting direction and remove or block access to the other. For example, you allow the crawling and indexing of “oldest” sort URLs and block the “newest”.

Figure 369 – Removing or blocking sort-order URLs is the easiest method to implement and may help address pagination issues quickly until you are ready to move ahead with a more complex solution.

Use AJAX to sort

With this approach, you sort the content using AJAX, and URLs do not change when users choose a new sort option. All external links are naturally consolidated to a single URL, as there will be only one URL to link to.

Figure 370 – Sorting with AJAX does not usually change the URL.

Notice how the URL in the previous screenshot does not change when the list is sorted again by Bestsellers in the image below:

Figure 371 – While the content updates when users select various sort options, the URL remains the same.

Because the URL does not update when sorting, this method makes it impossible to link, share, or bookmark URLs for sorted listings. But do people link or share sorted or paginated listings? Even if they do, how relevant will pagination or sorting be a week or a month from when it was linked or shared? Products are added to or removed from listings regularly, frequently changing the order of products. The chances are that the products listed on any sorted page will be partially or totally different from those listed on the same page in a week or month.

So, shareability and linkability should not be concerns when deciding whether to implement AJAX for sorting. If it is better for users, do it.

Use hash mark URLs

Using hash marks in the URL allows sharing, bookmarking, and linking to individual URLs. A rel=“canonical” pointing to the default URL (without the #) will consolidate eventual links to a single URL.

Figure 372 – In this screenshot, the default view lists items sorted by Most Relevant SKUs.

The URL above will be the canonical page. In the next screenshot, you will notice how the URL changes when the list is sorted by Price, low to high:

Figure 373 – The URL includes the hash mark and the filter value, ~priceLowToHigh.

Currently, search engines typically ignore everything after the hash mark unless you use a hashbang (#!) to signal AJAX content (which is deprecated). Search engines ignore everything after the hash mark because using it in URLs does not cause additional information to be pulled from the web server.

The hash mark implementation is an elegant solution that addresses user experience and possible duplicate content issues.

View options

Just as users prefer different sort options, some want to change the default way of displaying listings. The most popular view options are view N results per page or view as list/grid. While good for users, view options can cause problems for search engines.

Figure 374 – In the example above, users can choose between a compact grid or a detailed list, and they can also choose the number of items per page.

Grid and list views

Figure 375 – The grid view (left) and the list view (right).

Usually, the grid and the list view present the same SKUs, but the list view can use far more white space. This space can be filled with additional product information and represents a big SEO opportunity, as it can be used to increase the amount of content on the listing page and create relevant contextual internal links to products or parent categories.

The optimal approach for viewing options is to load the list-view content in the source code in a way that is accessible to search engines and then use JavaScript to switch between views in the browser.

You do not need to generate separate URLs for each view. If you generate separate URLs, those pages will contain duplicate content, and the way to handle them is with rel=“canonical” to a default view. The default view has to be the page that loads the content for the list view.

For example, these two URLs point the rel=“canonical” to /French-Door-Refrigerators/products:
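For illustration, both view URLs would then carry the same canonical tag in their <head> (the domain is a placeholder):

```html
<!-- Served on both the grid-view and the list-view URL; the domain is a placeholder -->
<link rel="canonical" href="https://www.example.com/French-Door-Refrigerators/products" />
```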



Many ecommerce websites have the View-N-items per page feature, allowing users to select the number of items in the listing:

Figure 376 – A typical drop-down for the view-N-items per page option.

If possible, make the view-all page your default product listing. If view-all is not an option, display a default number of items (say 20) and allow users to click a view-all link.

Figure 377 – Nike’s view-all option is displayed right in the menu.

If view-all generates an unmanageable list with thousands of items, let users choose between two numbers where the second number is substantially bigger than the default (e.g., 60 and 180). Remember to keep user preferences in a session or a persistent cookie[52], not in URL parameters.
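A minimal sketch of storing that preference in a persistent cookie rather than a URL parameter (the cookie name and lifetime are hypothetical):

```javascript
// Hypothetical helper: persist the user's items-per-page choice in a
// persistent cookie (Max-Age makes it survive the browser session).
function itemsPerPageCookie(count, days = 365) {
  const maxAge = days * 24 * 60 * 60; // Max-Age is expressed in seconds
  return `itemsPerPage=${count}; Max-Age=${maxAge}; Path=/`;
}
```

The server would send this value in a Set-Cookie header, so the listing URL itself stays parameter-free.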

Figure 378 – The second view option is substantially larger than the first one.

From an SEO perspective, view-N-items per page URLs are traditionally handled with rel=“canonical” pointing to default listing pages (usually index pages for department, category, or subcategory pages). For instance, on a listing page with 464 items, the view 180 items per page option can be kept in the key=value pair itemsPerPage=180, and the URL may look like this:

The URL above lists 180 items per page and will contain a rel=“canonical” in the <head> that points to the category default URL:

However, the canonical URL lists only 60 items by default, and that is what search engines will index. This means that a larger subset (which lists 180 SKUs) canonicalizes to a smaller subset (which lists 60 SKUs). This approach can create issues because Google will index the content on the canonical page (60 items) while ignoring the content from the rest of the view-N-items pages. In this case, you must ensure that search engines can somehow access each item in the entire set (464 items). For example, you can make this work with paginated content handled with rel=“prev” and rel=“next” so that Google consolidates all component pages into the canonical URL.

The use of rel=“canonical” on a view-N-items page is appropriate if the canonical points either to a view-all page or the largest subset of items. The former option is undesirable if you want another page to surface in search results (e.g., the first page in a paginated series with 20 items listed by default).

The approaches for controlling view-N-items pages are similar to those for handling sorting: a view-all page combined with AJAX/JavaScript to change the UI client-side, uncrawlable AJAX/JavaScript links, hash-marked URLs, or using the “noindex” meta tag. I mentioned these approaches in my preferred order, but remember that while one approach might suit the particular conditions of one website, it may not work for another.

  1. Prioritize: Good Content Bubbles to the Top,
  2. New snippets for list pages,
  3. More rich snippets on their way: G Testing Real Estate Rich Snippets,
  4. Product –,
  5. Below the fold,
  6. Implement the First 1-2 Levels of the E-Commerce Hierarchy as Custom Sub-Category Pages,
  7. Usability is not dead: how left navigation menu increased conversions by 34% for an eCommerce website,
  8. User Mental Models of Breadcrumbs,
  9. Breadcrumb Navigation Increasingly Useful,
  10. Breadcrumbs,
  11. New site hierarchies display in search results,
  12. Visualizing Site Structure And Enabling Site Navigation For A Search Result Or Linked Page,
  13. Rich snippets – Breadcrumbs,
  14. Can I place multiple breadcrumbs on a page?
  15. Location, Path & Attribute Breadcrumbs,
  16. Taxonomies for E-Commerce, Best practices and design challenges,
  17. Breadcrumb Navigation Increasingly Useful,
  18. HTML Entity List,
  19. Pagination and SEO,
  20. Pagination and Googlebot Visit Efficiency,
  21. The Anatomy of a Large-Scale Hypertextual, Web Search Engine,
  22. Implement the First 1-2 Levels of the E-Commerce Hierarchy as Custom Sub-Category Pages,
  23. Five common SEO mistakes (and six good ideas!),
  24. Search results in search results,
  25. View-all in search results,
  26. Users’ Pagination Preferences and ‘View-all’,
  27. Progressive enhancement,
  28. Users’ Pagination Preferences and ‘View-all’,
  29. HTML <link> rel Attribute,
  30. Faceted navigation best (and 5 of the worst) practices,
  31. Implementing Markup For Paginated And Sequenced Content,
  32. Infinite Scrolling: Let’s Get To The Bottom Of This,
  33. Web application/Progressive loading,
  34. Infinite scroll search-friendly recommendations,
  35. Infinite Scrolling: Let’s Get To The Bottom Of This,
  36. Infinite Scroll On Ecommerce Websites: The Pros And Cons,
  37. Why did infinite scroll fail at Etsy?
  38. Brazilian Virtual Mall MuccaShop Increases Revenue by 25% with Installment of Infinite Scroll Browsing Feature,
  39. Typographical error,
  40. Better infinite scrolling,
  41. The Paradox of Choice,
  42. Search Patterns: Design for Discovery, [page 95]
  43. Adding product filter on eCommerce website boosts revenues by 76%,
  44. Configuring URL Parameters in Webmaster Tools,
  45. Permutation, Combination – Calculator,
  46. Faceted navigation best (and 5 of the worst) practices,
  47. Implement the First 1-2 Levels of the E-Commerce Hierarchy as Custom Sub-Category Pages,
  48. Implementing Pagination Attributes Correctly For Google,
  49. Do URLs in robots.txt pass PageRank?
  50. Faceted navigation best (and 5 of the worst) practices,
  51. URL Parameters in Webmaster Tools, page 18
  52. Persistent cookie,


Product Detail Pages (PDPs)

Length: 14,287 words

Estimated reading time: 1 hour, 40 minutes


Product Detail Pages

Many marketers consider product detail pages, aka PDPs, the “bread and butter” of ecommerce websites. Since PDPs are where the add-to-cart micro-conversion happens, they are considered “money pages” and tend to get the most SEO attention. After all, if you do not rank when someone searches for your products, you will not have the chance to sell to them. While product detail pages focus on convincing and converting, conversion elements must be balanced with SEO.

In this part of the guide, I will break down the most important sections of product detail pages and discuss ways to optimize them for a better search experience.

I will explain how to optimize the URLs for product detail pages, and then we will discuss optimizing images and videos in detail. We will also discuss optimizing product descriptions and how to handle product variants (AKA product variations) and thin content.

Then, we will see how you can optimize product names and discuss why it is important for SEO to collect and properly optimize product reviews.

Since products go in and out of stock often, I will also show you how to address this situation. And finally, we will learn how to optimize page titles for ecommerce.


Keeping products on category-free URLs whenever possible is a good idea because products can be re-categorized from one category to another and because category names can change over time. Either alteration means you must handle 301 redirects, possibly even 301 redirect chains, which can quickly become a headache.

While a product can be accessed through multiple paths due to multi-categorization, the final PDP URL should not contain categories or subcategories.

Use: instead of or

If you need to feature categories in the URL, decide on a canonical URL for each product, then point the rel=“canonical” of all the possible taxonomy-based URLs to that representative URL. Also, link internally and externally only to the canonical URL, especially from the global navigation.

If the product comes in multiple variants, then the URLs for those SKUs should contain some important SKU attributes (e.g., the manufacturer, the brand name, and the color attribute).

The URL might look like this:

Remember that including the brand in the URL is OK since an SKU belongs to only one brand. If you need to use categories or subcategories to generate PDP URLs:

  • Set the category or subcategory name in stone.
  • Use the product’s canonical category and keep the product under that category or subcategory.


Users can get product info straight from images, including details not covered in product descriptions (which are mainly skimmed, not read in detail). So, it is no surprise that high-quality images, taken from multiple angles and showing the product in action, increase user satisfaction. However, images also need to be optimized for search engines.

Regarding increasing conversion rates, savvy online retailers understand the importance of images, especially product images. A study[1] of online consumers found that:

  • 67% of consumers believe an image is “very important” when selecting a product.
  • More than 50% of consumers value the quality of a product image more than product information, description, or ratings and reviews.

From an SEO point of view, product images can drive organic traffic through Google Image Search and universal results that include images. Images can also improve the document’s relevance and optimize internal linking.

To understand images, search engines will first look for the alternative text of the HTML image element, img. Some search engines can extract text from images using optical character recognition (OCR).

Let’s optimize an image for SEO. We will start with a basic implementation of the image element, ending with a highly optimized, SEO-friendly image tag. (Note: I will use the terms tag and element interchangeably.)

This is the basic image element:
<img src="0012adsds.gif" />

Before we proceed, how do search engines analyze images? Here are some signals that search engines use to understand, categorize, and rank images:

  • They take into consideration colors, sizes, and image resolution.
  • They look at the image type (e.g., is the image a photo, a drawing, or clip art?).
  • They also weigh text by distance from an image and extract context from the text around it.
  • They look at the overall theme of the website. For instance, all images on adult websites will be tagged as “adult” and filtered out when the safe-search filter is on.
  • Search engines will use the alt attribute of the image tag. The content of the alt text is directly used in document relevance analysis. The title attribute of the image tag is not cached but can provide additional context.
  • They also use image file names.
  • They look at the total number of thumbnail images on the same webpage as the ranked image.
  • OCR (optical character recognition).
  • Self-learning artificial intelligence trained by human input at a large scale. ReCaptcha is one form of human input.

As you can see, search engines consider plenty of clues when analyzing images. For those who want to know more about this subject, a Microsoft patent application from 2008 provides an interesting description of how images are ranked for image search.[2]

Did you know that when you solve an online captcha, Google (and possibly other search engines) uses that input to validate or refine artificial intelligence for image recognition? About 200 million captchas are typed in daily; that is a lot of human validation. If interested in this subject, watch the TED talk about ReCaptcha[3] and massive-scale online collaboration.

Here are some image optimization best practices:

Take your own product images
This is not an SEO factor per se, but it will help you differentiate from competitors and open doors for image licensing partnerships (which may come with some valuable backlinks). However, familiar imagery is important when searchers look for a product they already know, so if you take your own product images, differentiate, but try not to alter the look of the products too much.

Add an alt attribute to every significant image.
Adding alt text to images is the best way to give search engines more information about the image and the page content. Without the alt text, the chances of an image being indexed in Google Images are lowered.
<img src="0012adsds.gif" alt="yellow t-shirt" />

The only attribute of the img element that gets cached by search engines is the content of the alt attribute.

Here’s a typical product listing grid:

Figure 379 – Products displayed in a grid view on a category listing page.

Below is the content that will be cached by search engines based on the alt attributes:

Figure 380 – The alt texts are highlighted with a red border.

The alt attributes should contain keywords, but they should not be simply a list of keywords. When writing the alternative text for your product images, think of how you would describe the image to a blind person, succinctly and relevantly, in fewer than 150 characters. That sentence will be your alt attribute.

Most of the time, the alt text of a product thumbnail image is the exact product name. In the case of thumbnails for a category listing page, the alt text is the category name.

However, you can add more details by including significant product attributes. For example, instead of the alt text alt="DG2 Stretch Denim Long Skirt", you could use alt="DG2 Stretch Denim Long Skirt in brown".
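As a sketch, the alt text could be assembled from the product name plus one significant attribute, staying within the 150-character guideline (the helper and its names are hypothetical):

```javascript
// Hypothetical helper: product name plus one significant attribute,
// kept within the ~150-character guideline described above.
function altText(productName, attribute) {
  const text = attribute ? `${productName} in ${attribute}` : productName;
  return text.length <= 150 ? text : productName; // fall back to the plain name
}
```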

Spacer images, 1px gifs, or other images used just for design purposes should still have an alt attribute, but it should be empty: alt="". This is mostly for code validation and cross-browser compatibility. All other images visually depicting something important to visitors should have descriptive text.
Microsoft recommends:

“Place relevant text near the beginning of the alt attribute to enable search engines to better correlate the keywords with the image. A copyright symbol or other copyright notice at the beginning of the alt attribute will indicate to the search engines that the most search-relevant aspect of the image is the copyright, rather than what the image depicts. If you require a copyright notice, consider moving it to the end of the alt attribute text”.[4]

More of Microsoft’s recommendations for alt text can be found in their “Image Guidelines for SEO” documentation.[5]

More and more websites have started using CSS sprites to reduce the number of HTTP requests made to the web server, thus improving page load times. While this is great, the implementation makes it impossible to add alt attributes, raising accessibility and SEO concerns. You can load icons, spacers, and other small images using CSS sprites, but product images should be loaded as single images with proper alt texts.

Use the title attribute.
Search engines do not cache/index the content of image title attributes. However, this does not mean that search engines do not use the title attribute to extract relevancy signals or that you should not implement it. In many browsers, the title attribute displays as a tooltip on mouseover and is used to give users additional information.

Figure 381—Title attributes appear as tooltips when the mouse is moved over images. The Outdoor Storage thumbnail contains the title attribute “Outdoor Storage.”

If an image is representative (i.e., a product image), it requires an alternative text and can have a title attribute too. The title attribute’s content should not be an exact copy of the alt text but rather should complement it. Keep the attribute short enough (i.e., under 255 characters), and do not just list keywords—create a meaningful sentence.

Our initial sample image tag can now be improved to read:
<img src="0012adsds.gif" alt="yellow t-shirt" title="athletic woman wearing a yellow tee shirt" />

Do not underestimate title attributes just because search engines do not cache their content. They can play a big role in providing context to users, and we do not know exactly how search engines use them to extract relevance.

Specify the width and height of the img tag.
Let’s improve the img tag further for faster browser rendering and better page load speed:
<img src="0012adsds.gif" alt="yellow t-shirt on a model" title="athletic woman wearing a yellow tee shirt while running" height="250" width="100" />

Figure 382 – These image dimension attributes help with faster browser rendering.

Tip: for infinite scrolling or other image-heavy pages, defer image loading until images become visible in the browser.
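One way to defer loading today is the browser-native loading attribute (it postdates much of this guide, so treat it as one option alongside JavaScript-based deferral):

```html
<!-- loading="lazy" asks the browser to fetch the image only when it nears the viewport -->
<img src="yellow-t-shirt.gif" alt="yellow t-shirt on a model" height="250" width="100" loading="lazy" />
```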

Use keyword-rich file names.
You probably noticed the unfriendly file name used in the initial example: 0012adsds.gif. This filename does not help search engines understand what the image is about, and it should be avoided.

Figure 383 – This is an example of good image file naming for a category thumbnail, including the category name “brake discs”. The file names for product images should be even more specific.

Your file names should include the product name, the category name, or whatever is depicted in the image. Having keywords in file names has long been recognized as an SEO factor.[6]

A common challenge for large e-commerce websites that use hosted image solutions to manage, enhance, and publish media content is that most of these solutions do not create SEO-friendly image names. For example, this URL is not SEO-friendly at all:$pdppreview_360$

Talk to your provider to find out whether there is a workaround to achieve a better file naming convention.

If we further optimize our example to include a relevant file name, we now have the following image tag:
<img src="yellow-t-shirt.gif" alt="yellow t-shirt on a model" title="athletic woman wearing a yellow tee shirt while running" height="250" width="100" />

You should set and enforce image-naming rules. Otherwise, things can quickly get messy. For example, you could have the rule to append the image ID at the end of the file name after two plus signs, as in this example:
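A hypothetical helper enforcing such a rule might look like this (the slug, ID, and extension are placeholders; only the two-plus-signs convention comes from the rule above):

```javascript
// Hypothetical helper: keyword slug, then the image ID appended
// after two plus signs, as in the naming rule described above.
function imageFileName(slug, imageId, ext = "gif") {
  return `${slug}++${imageId}.${ext}`;
}
```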

Provide context for your images by using captions and nearby text
Image captions or nearby text surrounding the image can provide context to search engines.


Figure 384 – The descriptions in this screenshot provide context to search engines.

Apart from adding relevant image captions, you can provide a better context for your images by placing plain text content nearby. Add a relevant sentence visually close to the image and in the HTML code whenever possible.

Here’s how our img element can be improved even further by adding a caption to it:
<img src="yellow-t-shirt.gif" alt="yellow t-shirt on a model" title="athletic woman wearing a yellow tee shirt while running" height="250" width="100" />Adidas 2011 Summer Collection <br /> Yellow T-Shirt

The caption will often be the product or the category name.

Create standalone landing pages for each image.
If it makes sense (e.g., if you sell stock images), create dedicated landing pages for each image.


Figure 385 – An example of a clean and useful landing page dedicated to a single image. In this example, the image is the product sold online.

You can also encourage users to generate content on your website by allowing them to comment, share, or rate images.

Make use of image XML Sitemaps.
Create image XML Sitemaps and include information about your product and category images. Here are the official guidelines on how to accomplish this.[7]


Figure 386 – The basic information in the image Sitemap should include the path for your image files.

You can also specify information such as image caption, geolocation, title attribute, and license. Once you have generated this file, submit it to Google using the Search Console.
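A minimal image Sitemap entry might look like the fragment below (the domain and paths are placeholders; consult Google's documentation for the full tag list):

```xml
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://www.example.com/t-shirts/yellow-t-shirt</loc>
    <image:image>
      <image:loc>https://www.example.com/t-shirts/yellow-t-shirt.gif</image:loc>
      <image:caption>Yellow t-shirt on a model</image:caption>
    </image:image>
  </url>
</urlset>
```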

Add EXIF data to your images.
At least one search engine (Google) has confirmed using EXIF[8] data when analyzing images. More and more cameras and mobile devices automatically add EXIF information such as geolocation, the picture owner, or the camera orientation. If this data has the potential to provide search engines with additional info about images (and it does), edit the EXIF data for your product and category images. However, do not make this a top priority.

Adding image metadata, such as User Comments, can reinforce your image’s title or alt text. Other metadata that may be useful are Artist, Copyright, or Image Description.


Figure 387 – You can use EXIF editors to change images’ metadata.

It may be worth testing how adding EXIF metadata affects traffic from Image Search. Just remember that Google re-crawls images at a much lower rate than the regular web. Also, it pays to mention that when you use image optimization tools to reduce the image size, you can accidentally remove existing EXIF data.

Group similar images into folders
Images logically grouped around a similar theme should be placed into folders, if possible and appropriate. You can replicate the directory taxonomy of your website for images as well.

If the URL path for your T-shirts category is, then your images can be placed under


Figure 388 – You can see how this t-shirt image is located under the /t-shirt/ directory. This directory contains only t-shirt images.

The advantages of grouping into folders include adding keywords to the image URL and providing relevance clues to Google Image Search users. While grouping has limited influence on rankings, keywords in the directory structure are some of the signals search engines seek.

In our example, if we put the image under the /t-shirts/ directory, the img tag becomes:
<img src="/t-shirts/yellow-t-shirt.gif" alt="yellow t-shirt on a model" title="athletic woman wearing a yellow tee shirt while running" height="250" width="100" />Adidas 2011 Summer Collection <br /> Yellow T-Shirt

You should place your adult (or other sensitive) images into separate directories.

Use absolute image source paths.
The way you reference the image source (src) does not directly influence rankings, but using absolute instead of relative paths can help avoid problems with crawling, broken links, content scrapers, and 404 errors.

If we update the source to reference an absolute path, our example becomes this:
<img src="" alt="yellow t-shirt on a model" title="athletic woman wearing a yellow tee shirt while running" height="250" width="100" />Adidas 2011 Summer Collection <br /> Yellow T-Shirt

Make the images accessible through plain HTML links
Try not to use Flash or JavaScript to create slideshows, swatches, zooming, or similar features; doing so can make it impossible for search engines to find the URLs of important images. If you have to use JavaScript, provide alternative image URLs. Otherwise, search engines may not be able to crawl image URLs easily.

Search engines know that users like high-quality images, so always keep the high-resolution product image URLs accessible when JavaScript is disabled. Search engines can execute JavaScript to some extent, but if the only way to reach a product image is with JavaScript enabled, crawlers may never discover that image.
CRO tip: Place your product images above the fold.

Implement plain-text web buttons.
One technique for improving page load speed, and sometimes even the internal link relevance, is creating web buttons with CSS and HTML. Instead of using the classic web button made of an image, you mimic the button’s appearance by overlaying plain text on a CSS-styled background.

Here’s what a sample implementation looks like:

Figure 389 – The text on the blue background buttons (e.g., “Buy Celebrex”) is plain HTML text that can be selected with the mouse.
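A sketch of such a button: the label is real, selectable HTML text over a CSS-styled background (the class name and URL are hypothetical):

```html
<style>
  /* The button look comes from CSS, not from an image file */
  .buy-button {
    display: inline-block;
    padding: 8px 16px;
    background: #1a6fd4;
    color: #fff;
    border-radius: 4px;
    text-decoration: none;
  }
</style>
<a class="buy-button" href="/cart/add?sku=12345">Add to cart</a>
```

Because the anchor text is plain HTML, it can carry relevance to the linked-to page in a way an image button cannot.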

Since search engines seem to assign a bit more weight to text links than to image alt text, this technique can potentially increase the relevance of the linked-to page.
The opposite of this technique is to take “unwanted” text (e.g., site-wide boilerplate text) and embed it in images. For instance, if you have a global footer with a regulatory warning at the bottom of each page, you could embed it into an image so it does not dilute the page’s relevance. This is a gray-hat tactic: it is mostly used by spammers to get past email filters, and it can be flagged as spam on web pages, too. Use it at your own risk.

Figure 390—The text in the image above is not plain HTML text but embedded in an image. When using such tactics, remember that Google is fully capable of reading text from images.

Make it easy to share images.
Make it easy for users to share and embed your images whenever appropriate. If you require image attribution, sharing can also generate backlinks.

In this example, take a look at how Flickr integrates social sharing and embed codes:

Figure 391—You can encourage users to share images, especially product images. User-generated photos, such as products in real life or inspirational pictures, can be re-shared.

To recap, we started with a very basic image element:
<img src=“0012adsds.gif” />

And we have ended up with this optimized version:
<img src="" alt="yellow t-shirt on a model" title="athletic woman wearing a yellow tee shirt while running" height="250" width="100" />Adidas 2011 Summer Collection <br /> Yellow T-Shirt


A 2011 study[9] found that videos in search results have a 41% higher CTR than plain-text results. One online retailer found that visitors who view product videos are 85% more likely to buy than visitors who do not.[10] According to econsultancy, Zappos sales increased between 6% and 30% on products with product videos.[11] There are multiple benefits to having videos on PDPs, so there is no doubt that you should do it if the budget permits.

Videos can be self-hosted on your servers or hosted by a third-party provider. Some providers, like YouTube, are free, while others, like Wistia, are paid. Remember that if you host videos on third-party websites rather than on your own website, you might miss the opportunity to gather links to your videos. If a video goes viral, you will miss a lot of backlinks and social signals. For example, Dollar Shave Club launched its viral video on YouTube, and its website gathered almost 20,000 backlinks.

Many websites mentioned the brand name without linking to Dollar Shave Club; however, most of the websites that did link pointed to the YouTube video. Had they self-hosted the video, people would have linked directly to the video URL on Dollar Shave Club’s website, increasing its backlinks.

Figure 392 – The number of backlinks spiked after the video went viral, a positive side effect of the video’s success.

However, take a look at the number of backlinks the YouTube URL gathered:

Figure 393 – If these links pointed to Dollar Shave’s website, it would’ve helped increase their search authority.

Here are several tactics to get the most out of your product videos:

  • Transcribe the video and make the text available to search engines whenever doing so makes sense (e.g., when you have an expert video reviewing products).
  • Add social sharing buttons and easy-to-use embed codes.
  • Create video XML Sitemaps[12] and submit them to Google and Bing.
  • Repurpose your videos to produce related content—e.g., presentations, user manuals, instructographics,[13] podcasts, etc. You can go the other way, too: use other existing media types to create the videos.
  • Mark up the product video with vocabulary.[14]
  • If possible, embed the video with HTML5 rather than iframes.
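As an illustration, a schema.org VideoObject snippet in JSON-LD might look like this (all values are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "VideoObject",
  "name": "Yellow t-shirt product video",
  "description": "A short demo of the yellow t-shirt worn by a model.",
  "thumbnailUrl": "https://www.example.com/videos/yellow-t-shirt-thumb.jpg",
  "uploadDate": "2015-06-01",
  "contentUrl": "https://www.example.com/videos/yellow-t-shirt.mp4"
}
</script>
```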

It is best to self-host the videos or use a paid hosting solution to embed them on your URLs. This will increase the chances of getting video-rich SERP snippets for your domain name. YouTube provides rich snippets for the domains the videos are embedded on, but only sporadically.

The next image depicts how Google ranks a YouTube video in the first spot, while Zappos does not get a video-rich snippet, although a video is included on their page.

Figure 394—Why does Google rank its property (YouTube) while the content creator (Zappos) ranks below YouTube?

Product descriptions

Product descriptions should be written to improve conversions by creating an emotional connection with users and enticing them to act. It is known that evoking any emotion is better than not evoking emotions at all. While most people only skim descriptions, carefully crafting the first sentence to be engaging enough will increase the chances of making a sale.

The best product descriptions are written by copywriters familiar with the product who have received some basic SEO training, not by SEOs with copywriting skills.

Figure 395—Read this description. It does not read like the classic SEO style you are used to.

Writing product descriptions that convert requires a lot of work, so you may be tempted to skip it. The good news is that many competitors will not invest in great product descriptions for the same reason. You can capitalize on their mindset and differentiate your brand while gaining an SEO advantage.

Prominently display a brief product copy crafted to sell the product’s benefits (also known as the “benefits copy”) in an easy-to-spot place on the product page. You can complement it with a more detailed product copy that describes the product features (also known as the “features copy”) in a less prominent page area.

Try to incorporate the following into the copy:

  • Product-related keywords (e.g., SKU numbers, UPC numbers, catalog numbers, EANs, part numbers, etc.).
  • The root form of the words used in the product name, as well as variations and synonyms (e.g., “seat”, “seating”, “chair”).
  • The product name (make sure you repeat it in the product description at least once).
  • Other names the product might be known by.

Doing this will be important for search engines and your internal site search (given that the site search uses multiple data sources to score and rank items). So, review your analytics data, incorporate frequently searched queries into your product descriptions, and use the same queries to feed your internal site search database.

Figure 396 – Section (1) of the copy focuses on benefits, while section (2) lists the features.

In the previous image, section 1 focuses on benefits, while section 2 lists the product features. The layout of the PDP allows the features copy to follow immediately below the benefits copy, which is ideal but not always possible due to design constraints. In many cases, the page layout allows room for only one or two sentences and a hyperlink to a section down the page where you can list more detailed product info (e.g., details presented in tabbed navigation).

Figure 397 – Tabbed navigation allows space for longer product descriptions.

Tabbed navigation is a very common design element on PDPs, but there are some concerns about how search engines treat content that is not visible to users (e.g., in non-active tabs, accordions, “read more” drop-downs, or collapsed and expanded sections).

Search engines will index this content but may not assign it the same weight as content visible by default. In the past, Google suggested that text hidden for design purposes is fine as long as you do not hide too much content with too many links;[15] the technique is not considered spam. After the mobile-first indexing update, however, content behind tabs is assigned the same weight as visible text.

You can create separate URLs for each tab, but that would thin out the content on the product detail page, which is not a good idea. Moreover, the user experience is better if the tabs can be switched quickly without reloading the page.

Until mobile-first indexing is rolled out for your website, an interim solution is to display the entire product description without any text expand/collapse functionalities or tabs. Search engines seem to prefer this, but design limitations introduce constraints.

Figure 398 – A well-written product description that is fully visible, with no content hidden behind tabs or “read more” links. It is also a good example of how a “boring” product can have a great description.

If you want to use tabs to make the user experience more pleasant, consider the following:

  • Display the product description (or other important content) in the default active tab. Since search engines can detect which content is hidden and which is not, putting the most important content in the default active tab increases the chance that it receives full weight.

Figure 399 – The product description tab is the default active tab and is accessible when bots request the raw HTML. Additionally, the implementation uses hash fragments to switch between tabs, which means that the content of the Specs and Reviews tabs is already loaded in the HTML code.

  • Do not generate separate URLs for each tab unless the information provided is substantial enough to justify creating a new page.
  • If you want the content inside the tabs to be indexed, make sure it is available with JavaScript disabled.
  • If the tabs contain the same boilerplate text on all product detail pages (e.g., shipping information, legal notices), you can put the repetitive text in an iframe to avoid duplicate content issues.
  • Consider placing user reviews outside the tabbed navigation.
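As a minimal sketch of a crawler-friendly tab implementation (the class names and IDs are illustrative, not from any particular platform), all tab content ships in the initial HTML and the links only switch hash fragments:

```html
<!-- Tab switcher: plain anchor links with hash fragments, so no page reload
     and no separate URL per tab -->
<ul class="tabs">
  <li><a href="#description">Description</a></li>
  <li><a href="#specs">Specs</a></li>
  <li><a href="#reviews">Reviews</a></li>
</ul>

<!-- All tab panels ship in the raw HTML; CSS/JS only toggles visibility.
     The most important content sits in the default active panel. -->
<section id="description" class="tab-panel active">
  <p>Full product description…</p>
</section>
<section id="specs" class="tab-panel">
  <p>Specifications…</p>
</section>
<section id="reviews" class="tab-panel">
  <p>Customer reviews…</p>
</section>
```

Because every panel is present in the source, bots that fetch the raw HTML see the complete copy even if they never execute the tab-switching script.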

Product descriptions are one of the best spots to feature internal contextual links. Ideally, you will link to parent categories in the same silo (maybe using the same URLs as in breadcrumbs), but you can also link to other related products that make sense for users. You should balance internal linking and conversion because internal links may take users away from the product page.

If you are not careful, product descriptions can generate duplicate content within your website (if the same product description is used across multiple product variants) or on external websites (if you use generic manufacturer-supplied descriptions).

Manufacturer-supplied descriptions
The general SEO wisdom is that you should write unique product descriptions. It is one of the best approaches to optimizing PDPs if you can put that into practice. However, keep in mind that this does not work with every product or within every industry. For example, it makes sense to write unique product descriptions for expensive wristwatches but not for ordinary pencils. Also, this is often not economically feasible for websites with large inventories.

Moreover, Google sometimes ignores the product description and ranks whatever is best for users based on their intent and location.[16] So, while unique product descriptions may not always rank at the top, you should still test the impact of writing 100- to 200-word product descriptions at scale before deciding whether it will work to your advantage. Start with your top 10% most important items, write the descriptions for conversions and branding, and then measure the impact on rankings and traffic. If 10% is too much given your inventory size, start with the top 100 to 150 products.

I know at least two websites that were able to rank for very competitive keywords by creating unique and compelling product descriptions. Because 90%+ of the pages on those websites were PDPs, they drove up the relevance of the entire website. Those two websites now rank with almost no backlinks pointing to them.

According to Google, if the generic description provided by the manufacturer/supplier is just a small part of the main content on a page, you are fine.[17] If the manufacturer requires you to keep their descriptions unaltered (which causes SEO problems), place each description in an iframe and noindex the frame source. In that case, add unique content elsewhere on the page to differentiate your website from competitors.
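One way to sketch this iframe approach (the file path below is a hypothetical example):

```html
<!-- On the PDP: pull in the unaltered manufacturer copy via an iframe -->
<iframe src="/manufacturer-descriptions/sku-12345.html"
        title="Manufacturer description"></iframe>

<!-- In the <head> of /manufacturer-descriptions/sku-12345.html:
     keep the framed document itself out of the index -->
<meta name="robots" content="noindex">
```

The framed document carries the duplicate copy, while the PDP itself only exposes your unique content to search engines.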

Product variants

Even with unique product descriptions, you will encounter crawling and duplicate content issues if products come in multiple variants. For example, the Nike Dual Fusion shoes come in red, green, and black, which generates a unique URL for each possible product variant. Usually, product variants generate exact or near-duplicate content, and such cases are best handled with rel=“canonical”. However, rel=“canonical” is not the only solution.

Decide how to handle variants once you understand how your target market searches online, and base the decision on your business goals. For instance, if your target market uses search queries that include variant keywords, you need unique URLs for each product variant. Make those URLs available to users and search engines, and do not use rel=“canonical” to point to a representative URL (i.e., the master SKU). The challenge is to make these pages compelling by adding unique product descriptions for each variant.

Let’s discuss a few approaches to handling product variants.

URL consolidation

With this approach, you handle all different product options in the interface, using a design that helps users make faster and better product selections. All product variants are displayed on a single product detail URL, which does not change when a variant is selected in the user interface.

For instance, you can provide product options with drop-downs or swatches, as depicted in the next screenshot:

Figure 400 – Changing the color and size options with a drop-down selector does not change the URL.

Figure 401 – These swatches change the product image and description, but the URL remains unchanged.

To increase the chances of the canonical page surfacing in SERPs for queries that include attributes such as colors – e.g., “Nike Dual Fusion 2 Run Gray” – include the product attribute(s) as plain text copy in a search engine-friendly way (accessible to bots, either server-side or in the rendered HTML). For instance, the product description, the specs copy, or other parts of the PDP copy will include something like, “This item is also available in gray, red, and blue”.

If you already have unique URLs for each variant and want to consolidate them into one representative URL, use 301 redirects or rel=“canonical” to point to the master SKU/PDP.
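In markup terms, the consolidation amounts to one line in the head of every variant URL (the URLs below are hypothetical):

```html
<!-- In the <head> of variant URLs such as /nike-dual-fusion-2-gray,
     point to the master PDP -->
<link rel="canonical" href="https://www.example.com/nike-dual-fusion-2">
```

A 301 redirect achieves a stronger consolidation at the server level; the canonical tag is the gentler option when the variant URLs must keep working for users.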

Unique URLs for each product variant

Having unique URLs for each product variant allows those URLs to rank in SERPs for queries containing product attributes. However, since the content on these pages is similar, the URLs might compete against each other—or worse, they might be completely filtered out of the SERPs. If you use unique URLs for each variant, each URL should include a self-referencing canonical tag.

Additionally, having individual URLs for each variant will dilute indexing properties and backlinking power if other sites link to variant PDP URLs instead of a master PDP.

This approach should be implemented if your data shows that your target market searches for various product attributes such as model numbers, colors, sizes, etc. (e.g., different tire sizes, like P195 / 65 R15 89H).

The challenge and key to success with this approach is to create content that is different enough for each variant page so that it does not get filtered by Panda and does not create duplicate content pages.

Unique URLs for each product variant, along with canonical URLs

A hybrid approach uses the interface to allow users to select product options without changing the URL while still generating unique URLs for each product variant. Each variant URL points to the authoritative product using rel=“canonical”. This is how Zappos handles product variants.

The canonical product page is

The following product variant URLs (different color models) point to the canonical URL above:

Because all color variant PDPs point to a canonical product, Zappos ranks in Google SERPs with the canonical URL, even when someone searches for a color-specific product.

Figure 402 – Zappos ranks with the canonical PDP, the page for the pink shoes. However, the users landing on the canonical PDP must take additional steps to find the red color option, which is probably not the best UX (as few clicks as possible to complete a task).

Some of the advantages of having separate URLs for product variants are:

  • Users can share each variant URL.
  • You can link internally to variant PDPs.
  • People can backlink to any variant URL.
  • You can list product variants on internal site search results or category listing pages.

For example, suppose an item is available in various colors, and someone visits the category page to select “red” from the faceted navigation filters. In that case, you will want to show only red items. If you do not have separate URLs for “red”, it is impossible for that user to share the “red shoes” page with someone else.

Additionally, if you run product listing ads for variant-specific keywords, it is better to land users on product variant URLs. Pages targeting product attributes tend to convert better than just one-size-fits-all pages that require users to find the product filtering options.

I usually recommend creating separate URLs for the most important product variants, not just for SEO reasons but also to provide better landing pages for PPC and PLA campaigns. If you encounter SEO issues (e.g., crawling, duplicate content, or ranking cannibalization), you can always noindex variant URLs or implement rel=“canonical” to point to the representative URL.

The following is a quote from Google:

“Google allows rel=“canonical” from individual product variants to a general/default version (e.g., “Taccetti 53155 Pump in Beige” and “Taccetti 53166 Pump in Black” with rel=“canonical” to “Taccetti 53155 Pump”) as long as the general version mentions the product variants. By doing so, the general product page acts as a view-all page, and only the general version may surface in search results (suppressing the individual variant pages)”.[18]
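Based on the Google quote above, a sketch of this view-all setup might look like the following (URLs and copy are placeholders):

```html
<!-- In the <head> of a variant page such as /taccetti-53155-pump-beige -->
<link rel="canonical" href="https://www.example.com/taccetti-53155-pump">

<!-- On the general (canonical) page, mention the variants in indexable
     plain text so it can act as a view-all page -->
<p>This pump is also available in beige and black.</p>
```

The plain-text mention of the variants is what lets the general page legitimately represent the canonicalized variant pages.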

Thin content

Even after creating unique descriptions for every product in the database, you might find that pages have been filtered out of SERPs because they have been classified as “thin”. This means that the content on the PDPs is not “relevant enough” for Google to include the URLs in its results.

Figure 403 – Highlighted in red is the entire description of this product and pretty much all the text content on this page. Unless this page has good authority, its chances of being included in the SERPs based solely on content are slim.

Ask your programmers to provide a .csv file containing the word count of each product URL; if the website is small, run a crawl with Screaming Frog and sort the URLs by word count. Your data must include only non-boilerplate text, such as the word count for product descriptions, user reviews, or other forms of UGC. Open the list in Excel, sort by lowest count, find pages with low content (e.g., under 50 words), and add them to the copywriting queue based on their importance. If too many pages have thin content (I usually set that threshold at around 50 words of content unique to the site), you may even consider noindexing them until you can add more content.

Product names

Product names are one of the elements that attract the user’s eye within moments of landing on a page. On most e-commerce websites, the design of the PDPs usually follows the same pattern: the product image is to the left, and the product name is either above the image or to its right side. The add to cart button is to the right of the product image, and the product info is either on the right or below the product image.

This is probably why users scan PDPs using the well-known “F” pattern.

Figure 404 – The “F” pattern applies to ecommerce websites, too. Image source: NNgroup

Although there seems to be little correlation between rankings and H1 headings,[19] Google suggests[20] that it assigns more weight to H1s. Therefore, wrapping the product name in an HTML heading element, preferably the H1, is still a good idea.

The following is an excerpt from Google’s SEO Report Card, which aimed to identify potential areas for improvement on Google’s product pages:

“Most product main pages have an opportunity to use one <h1> tag … but they’re currently only using other heading tags (<h3> in this case) or larger font styling. While styling your text so it appears larger might achieve the same visual presentation, it does not provide the same semantic meaning to the search engine that an <h1> tag does. The product’s name and/or a few words about its features are great to have in an <h1> tag for the product main page”.[21]

However, if the document structure requires it, an H2 for the product name will work too. Note that the heading hierarchy on PDP templates will differ from that on category pages or other templates. Visually, the product name should use the largest font size on the product page.

Do not be afraid to create long product names. Two-column PDP layouts can easily accommodate this. Include the brand or the manufacturer associated with the product, especially if you sell products from multiple brands. Also include model numbers, collection names, SKU numbers, or other important product attributes.

Figure 405 – On this dress PDP, the product name includes the brand, the fabric, and the color. This is great information for users and search engines.

Figure 406 – The product name in the example above does not even include the category the product belongs to: slippers. It may be obvious to users that they are looking at slippers, but not including “slippers” in the product name is not good for search engines.

The person (or the team that adds new products to the catalog) should be trained to understand how your target market searches for those products and should propose product name templates based on that data.

This is not a complex process, and if you want to ensure you do not mess up the product names, store just the shortest product name in the database and then programmatically append other relevant product attributes.

Product naming gets complicated when you do not have control over product names, which can happen when you run a marketplace where suppliers upload product sheets. In such cases, naming conventions are hard to create and enforce, and it may be better to let suppliers use open-text fields for product names. If sellers upload products, you should enforce a maximum number of characters for the title; Amazon, for example, has a limit of 250 characters. I also recommend checking that titles are not truncated mid-word at the 250-character limit.

If you allow product name changes, only give the update rights to one person. Optimally, this person should be aware of the impact of changing product names (e.g., URLs might change, potential backlinks loss, internal linking updates, 301 redirects from old to new URLs, etc.).

Also, in most cases, it is a good idea either to set product names in stone or to keep the URL unchanged when the product name changes. The latter option, however, means a URL may end up containing an outdated product name, so you have to balance updating versus not updating URLs when product names change. Although not preferred, one solution is to keep the PDP URLs free of product names and use only product IDs in the URL. Consider this approach only if you cannot easily implement 301 redirects.

Use the schema.org Product[22] type to mark up your code with product names, brands, manufacturers, images, and other product properties. Search engines do not yet use many Product properties, but as long as you already keep product attributes in your database, it will not be much of a hassle to mark up your HTML at a later date. Google supports some of these properties[23] and will gradually support more. The preferred markup format is JSON-LD.
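A minimal JSON-LD sketch of the schema.org Product type (all values are placeholders, and only a subset of the available properties is shown):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Nike Dual Fusion 2 Run",
  "brand": { "@type": "Brand", "name": "Nike" },
  "sku": "DF2-RUN-GRY",
  "color": "Gray",
  "image": "https://www.example.com/images/df2-gray.jpg",
  "description": "Lightweight everyday running shoe."
}
</script>
```

Since the values come straight from your product database, generating this block in the PDP template is usually a small engineering task.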


Product reviews
There is no doubt that product reviews are good for users and conversions. According to one study,[24] adding just the first review can increase conversion by 20%. Reviews enriched with additional info about the reviewer or reviews that can rate a particular product criterion (e.g., quality versus price) are even more useful for users.[25]

You can generate reviews by collecting them from people who purchased on your website or integrate them from vendors who sell reviews. Keep in mind that it can take many purchases to get a single review. Anecdotally, it took Amazon 1,300 book sales to generate the first review[26] for Harry Potter and the Deathly Hallows.

Implementing both in-house and third-party reviews is a good idea, especially if you are just starting out.

The reviews can end up on multiple URLs, depending on your chosen solution and how you customize its out-of-the-box implementation. This means reviews can appear on URLs on your website, on the vendor’s website, or on other competing websites, which will likely create duplicate content issues, so you must pay attention. Keep in mind that there is no such thing as a “duplicate content penalty”; however, if your reviews are duplicated elsewhere, they may not work as well as expected.

When implementing reviews, first decide which type of content you want to surface in SERPs for “reviews”-related queries: do you want to rank PDP URLs or dedicated product review URLs constructed to target queries like “product name + review”?

You want PDPs to appear in SERPs for “reviews” related keywords.
In this scenario, the reviews should be placed on the PDPs and openly available to search engine robots. This means the reviews should not be inserted into the code with JavaScript, AJAX, or other technology that loads them client-side; they should be available in the HTML source code when a bot fetches the PDPs. All other pages, sections, or subdomains that list the same reviews should be blocked with robots.txt.

For instance, Amazon allows the reviews for this bicycle SKU (Kent Super 20 Boys Bike (20-Inch Wheels), Red/Black/White) to be indexed.

Figure 407 – The customer reviews are included on the PDP, and Google caches them.

To check whether the implementation of the reviews is SEO-friendly, look for the content of the reviews in the text-only cached version of the PDP. Additionally, do a “Fetch & Render” using Google Search Console to see if the reviews appear on the rendered page. Make sure you are rendering the mobile pages as well.

What if you want a dedicated product reviews page in SERPs instead of a PDP?
Some vendors require a subdomain or a directory to publish their reviews. This is not necessarily bad, and it is a valid approach for merchants who plan to attract searchers in the research stage of the buying cycle. The “reviews” keyword modifier (e.g., “Under Armour Stormfront Jacket reviews”) suggests that users are closer to a buying decision. To increase the chances of product review pages ranking for “review”-related keywords, consider linking internally and externally with “reviews” in the anchor text.

Figure 408 – The reviews subdomain shows up in SERPs. This was a deliberate decision on Clinique’s side.

If you want the dedicated product reviews page to appear in SERPs, be careful with duplicate content on your website. You will often list the same product reviews on the PDP and the product reviews page. In these instances, you will have to prevent crawlers from finding the reviews on the PDP. When you have the same reviews on multiple URLs, search engines will have difficulty identifying the right page to surface in SERPs.

To check whether your reviews generate duplicate content, copy a few sentences one at a time from various reviews and do a “site:” search on Google:

Figure 409 – In the example above, the same review is shown on 15 URLs. This needs to be investigated.

However, if you list only a small fraction of the total number of reviews on the PDP (e.g., five out of 50), then you can let search engines access the reviews on the PDP and the product reviews page. In this case, do not block the reviews subdomains/directory with robots.txt.

You can see this implemented on Amazon:

Figure 410 – Amazon has a dedicated directory for product review pages.

The reviews for Kent Super 20 Boys Bike (20-Inch Wheels), Red/Black/White (Sports) are accessible to Googlebot and have been cached by Google. Amazon can afford this approach because the product reviews page lists three times more reviews than the PDP.

Figure 411 – Amazon could improve the product reviews page by consolidating the three paginated pages into one superset.

Amazon opens its review URLs for bots to rank multiple review pages related to this bike:

Figure 412 – The PDP takes the first position, while the product reviews page takes the second and third positions.

Here are some SEO considerations to keep in mind when implementing reviews.

Pay attention to duplicate reviews on other websites
If your provider syndicates reviews in an SEO-friendly manner to other websites (meaning they are accessible and available for indexing by search engines), that will cause duplicate content issues. Again, you will not be penalized for this, but the SEO effectiveness of the reviews will be diminished.

If the provider syndicates the reviews, you should allow crawlers to access duplicate reviews only if you add substantial unique content to the pages the reviews are listed on, in addition to the reviews offered by your provider. For example, you can feature your Expert Reviews or reviews you collected independently.

If 90% of the reviews are syndicated elsewhere on the Internet, wrap them within an iframe, put them inside a subdomain blocked by robots.txt, or load them with AJAX.

You must also be careful if you syndicate your reviews on comparison-shopping engines (CSEs).

Figure 413 – ABT is the source of the review, but Bizrate and Shopzilla have the same content indexed and might rank above ABT.

If you plan to syndicate reviews on CSEs, select which reviews to keep for your website and which ones to syndicate.

Mark up reviews and ratings with structured data such as microdata, microformats, or RDFa
Use schema.org vocabulary to mark up the reviews and ratings to get rich snippets in SERPs (stars, ratings, videos, etc.). The reviews and ratings must be displayed on the same page as the relevant product. Google explains its implementation in detail in this article.[27]
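As a sketch, reviews and ratings can be nested inside the Product markup like this (the names and figures are made up for illustration):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Kent Super 20 Boys Bike",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.4",
    "reviewCount": "213"
  },
  "review": {
    "@type": "Review",
    "author": { "@type": "Person", "name": "Jane D." },
    "reviewRating": { "@type": "Rating", "ratingValue": "5" },
    "reviewBody": "Great bike for the price."
  }
}
</script>
```

Nesting the rating inside the Product keeps the markup on the same page as the product it describes, which is what the rich-snippet guidelines require.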

A structured data testing tool might also be useful when working with Schema markup.

Separate URLs for each review
If your current implementation generates separate URLs for each review, then using rel=“canonical” to point to a view-all reviews URL is acceptable.[28]

Display reviews of related products
If a product has no reviews, but there are closely related items with reviews (e.g., the same pair of Nike shoes in a different color), you can display the reviews for the related item. However, you have to ensure that the reviews make sense to users. Do not mark up these borrowed reviews with semantic markup; place them in an iframe blocked by robots.txt or insert them with JavaScript. The purpose is to offer something useful to users, not to spam search engines.

Tip: Reviews are one of the best ways to keep product detail pages “fresh”. If you keep regularly adding reviews to a product page, the page will be crawled more often, its authority will increase, and it will show up in SERPs for more queries.

Other ways to freshen up PDPs include adding excerpts from relevant blog posts and adding one or two sentences from research papers related to the product (if applicable).

Expired, out-of-stock, and seasonal products

Product lifecycles and seasonality can rarely be avoided. Some products can expire for good, and others can go out of stock. Some of those that go out of stock may be restocked, others will not. Other products are available only during a certain season, while some are evergreen and never change or run out of stock. How you handle product lifecycles from an SEO perspective depends on future inventory availability.

There is no definitively correct way to handle product lifecycles, but generally, try to:

  • Avoid removing OOS item URLs until you know whether the product will be restocked. If you remove the URLs, they will return a 404 header response; 404 pages are taken out of the index after a while, and you may lose any backlinks pointing to them. If you need to return a 404 page, at least create a custom page that helps reduce bounce rates.
  • Also, avoid serving soft 404s: light-content pages that respond with a 200 OK code but whose content displays “Sorry, the item is no longer available” or something similar.[29]
  • Do not 301 redirect every out-of-stock PDP to the home page or its parent category page. Since the home page is unrelated to the out-of-stock product, such a 301 will be treated as a soft 404, which will not preserve indexing signals.
  • Use the “unavailable_after” robots meta directive if your items become unavailable after a certain date. Classified ads can be marked up with this tag.
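One way to implement such an expiry is Google’s “unavailable_after” robots meta directive, a single tag in the page head (the date below is an example; check Google’s documentation for the accepted date formats):

```html
<!-- Ask Google to stop showing this page in search results
     after the given date -->
<meta name="robots" content="unavailable_after: 2025-12-31">
```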

Figure 414 – The header response code for nonexistent URLs should not be 200 OK.

The screenshot above shows the server responding with 200 OK to a dummy URL request. There are instances when this kind of setup makes sense (for example, when you want a PDP to load properly with only a product ID in the URL). In such cases, you must include a rel=“canonical” pointing to the representative URL.

Discontinued products

These are products that have reached the end of their lifecycles.[30] For example, Canon stopped manufacturing the Canon EOS-1Ds Mark III model in 2012. Sometimes, end-of-lifecycle products are replaced with a newer model, but other times, they are discontinued for good.
If a product is replaced with a newer version, you can 301 the URL for the old model to the newer product URL. If possible, alert users with a message that the product they are looking for has been discontinued and replaced with a newer one. In the source code, keep the old product name away from the “not available” text; otherwise, the “not available” text may appear in the SERP snippet. You can even place the non-availability message in a robots.txt-blocked iframe or insert it with JavaScript to avoid that. Do this only to improve the CTR in SERPs, not to game search engines.

Because the target market does not stop searching for a product immediately after the manufacturer discontinues it, you should redirect searchers only after you notice a significant decline in the search demand for that product or when all your stocked items for that SKU are sold. Until then, display a notice on the old PDP announcing that the product has been discontinued and link to the newer version.

Depending on stock availability, some prefer leaving both pages alive indefinitely, with or without a notification message.

Figure 415 – This PDP URL is still available and responds with a 200 OK code, although the product has been discontinued. This is an acceptable solution because if you still have the discontinued item in stock, you want to sell it.

Upcoming products

Create pages for high-demand products that have not yet been released but will be on the market shortly (i.e., a few months later). Such pages must be content-rich and helpful for users. This tactic is useful because these pages will be mostly non-commercial, and they can gather links organically from trusted sources more easily than commercial pages.

A month before the new product launch, increase the number of internal links to those pages, for example, by linking from the home page. Take pre-orders or capture contact info before the launch date. When the product becomes available, users can add it to the cart.

If you plan well, you will be positioned ahead of your competitors at the product release date.

Out-of-stock (OOS) products

There are two main use cases for OOS products:

  • The product will never be restocked.
  • The product goes only temporarily out of stock.

If the product is never restocked, you have a couple of options. The first is 301 redirects to one of the following pages:

  • Another product variant (i.e., the same product but in a different color).
  • A replacement product (i.e., an updated version of the product).
  • A parent category or subcategory (this option is not advisable; I recommend avoiding it).

The second option is to return a 404 status code for the PDP; use this if you cannot implement the first option.

The third option is to leave the PDP alive and return a 200 OK response code. In this case, displaying a visible notice communicating the reason for unavailability is very important. It is also important to guide users to a replacement or a similar product. Optionally, the “add to cart” button can be changed to “out of stock” and deactivated so users cannot add the item to the cart. Another option is to allow users to “save for later” or “backorder”.

To minimize the effects on the conversion rate for permanent out-of-stock PDPs, offer related items in a very accessible spot on the page.

If the product temporarily goes out of stock, the page should return a 200 OK response and let customers know that the product is currently out of stock. It should also provide users with an estimated availability date if that is possible.

Ideally, you would also offer an incentive (e.g., a 10% discount) to compensate for the inconvenience and collect email addresses so you can announce the product’s relaunch. Additionally, make sure users can backorder the product.

Figure 416 – The “Temporarily Out of Stock” messaging is easy to spot, and it is clear. However, separating it from the product name would be better.

If all the products under a subcategory are out of stock and the PDPs received qualified traffic in the past, 301-redirect them to the parent category, and redirect the subcategory page as well, since it has no stocked products. This may not be the best approach from a user experience perspective, but you may want to preserve any backlinks pointing to the PDPs and subcategory pages. If the subcategory generates minimal revenue, traffic, and backlinks, let it return 404.

Remember that shoppers may become frustrated if too much of your inventory is out of stock. In this case, mark the affected pages with noindex and remove them from navigation until your inventory improves. This helps address content quality issues and Panda penalties.

Here are some additional recommendations for handling out-of-stock products:

  • Google treats expired products as “soft 404” errors. This means that OOS pages are considered low-quality content; in many cases, such pages should be noindexed.
  • Google recommends removing OOS pages from its index by returning a hard 404 Page Not Found header response. However, this approach does not work well for UX and conversion.
  • Out-of-stock SKUs should not be presented anywhere in the site navigation. However, they can appear on internal site search result pages when someone queries the SKU number or the exact SKU name.
  • Out-of-stock URLs should be accessible for type-in traffic or email to assist those who have questions about a product they purchased in the past and is now OOS.
  • OOS products should be accessible to the sales team on your intranet.
  • Neither 404s on discontinued or long-term out-of-stock product URLs (which is what Google often recommends) nor 301 or 302 redirects provide an optimal user experience.
  • Google says not to use 301 or 302 to a parent category or home page; this is the correct approach.

Despite all the options above, I believe there is a better approach from a UX and SEO perspective: instead of a “useful” 404 page, show the OOS page only to users landing from outside your website (i.e., organic or referral).

This page displays a modal window with a very short, clear message about the product status. Offer one link to the OOS product and another to a related or replacement product. Be careful about the size of the modal window on mobile: it should cover at most 20% of the screen and be placed at the bottom.

The modal window’s messaging can be similar: “Sorry, this product is out of stock. Visit the out-of-stock product or navigate to the replacement product”. The modal window displays a 10-second countdown timer. Ten seconds should be enough for most people to read the message. At zero, the user is redirected to the most appropriate page.

The redirect is done either with JavaScript (which passes SEO signals) or with a meta-refresh tag. If you want to pass authority from the old page to the new one, the JavaScript timer or the meta-refresh delay must be under five seconds.
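A minimal server-side sketch of such an interstitial might look like the following. The four-second delay is chosen to stay under the five-second threshold mentioned above; the URLs and copy are placeholders.

```python
# Illustrative sketch of the out-of-stock interstitial page described above.
# The 4-second default keeps the meta refresh under the 5-second threshold.

def oos_interstitial(replacement_url: str, delay_seconds: int = 4) -> str:
    """Render an OOS page that meta-refreshes to the replacement product."""
    return f"""<!DOCTYPE html>
<html>
<head>
  <meta http-equiv="refresh" content="{delay_seconds};url={replacement_url}">
  <title>Product temporarily out of stock</title>
</head>
<body>
  <p>Sorry, this product is out of stock.
     You will be taken to a replacement in {delay_seconds} seconds.</p>
  <a href="{replacement_url}">Go to the replacement product now</a>
</body>
</html>"""

page = oos_interstitial("/products/replacement-widget")
print('content="4;url=/products/replacement-widget"' in page)  # True
```

The same page would also carry the link back to the original OOS product, as described above; it is omitted here for brevity.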

Seasonal products

If the product is seasonal, handle it similarly to out-of-stock items. If the product will be back in stock next season, leave the page in place, notify users, and remove the ability to place orders. If it will not return, 301-redirect to another product variant (i.e., the same product in a different color) or a replacement product.

Seasonal products, just like event and holiday URLs, require some attention regarding URL naming and maintenance. For example, if you use years in the URL, updating the URL the next year is like starting over again. Of course, you could do a 301 redirect from the previous year’s URL to the current one, but it is better to avoid using URLs that designate years or other time indicators. Instead, use a generic URL that can accommodate new dates or models in existing URLs.

For example, Ford uses a year-specific URL for its newest Focus model (2018), while Toyota uses the same URL for all Corolla models, no matter the release year, as you can see in this screenshot:

Figure 417 – This URL naming convention consolidates links to a single page, year after year.

The same recommendation applies to URLs for recurring special events: use a generic, year-free URL instead of one tied to a specific year. That page can be promoted harder when the time comes, but it should not be allowed to return a 404 status code after the event ends.

Title tags

While <title> is technically not a tag but an HTML element, it is often referred to as a tag in SEO contexts.

An internal analysis[31] that Google performed on its own product pages found that over 90% of those pages could improve their SEO simply by optimizing the title tag.

Since Google emphasizes titles in blue text, they are the first element searchers scan on SERPs. Titles are important in determining whether searchers will click on a particular listing. They are also one of the most important on-page SEO factors, and when others link to your pages organically, they tend to use the page titles as anchor text.

It is important to mention that, just as with Google Ads, the title of a SERP snippet has the biggest influence on CTR. Moreover, because SERP CTR and dwell time are now part of RankBrain, aiming for better click-through rates on your organic results is important. Higher-than-average CTRs and longer dwell times are quality signals search engines use.

The SERP title myth
Many otherwise knowledgeable web admins (and even a few SEOs) believe that the content of the title tag is the only source Google uses to generate and display the SERP titles.

Figure 418 – SERP titles are emphasized in blue.

Yes, most of the time, the content of the title tag is displayed in SERPs, as you can see in the screencap above. However, the SERP title is not based solely on what is wrapped within the HTML title tag. Google’s goal is to be relevant, so it is expected that they will not unthinkingly use just the content of the title tag to generate the most relevant snippets for users.

For example, suppose you forgot to add the product name to the title tag, and a user searches for that product. Thanks to great content and backlinks, Google might still classify the page as highly relevant to that search query. However, displaying an empty or unhelpful title in the SERPs would be a poor experience. In such cases, Google will extract a more useful title from other sections of the page and display that to the searcher.

A common question is, “Why is Google changing/rewriting/not indexing my title tag properly?” As mentioned, Google aims to provide the most relevant titles for searchers. To accomplish this, Google will use various data sources and signals. They will also analyze the page content and look for external relevance signals from other sources (e.g., from the now-extinct DMOZ, Yahoo! directory, or the anchor text in backlinks) to match a user query with relevant content extracted from a page.

Here are some scenarios that may trigger search engines to alter the SERP titles:

  • A malformed title tag.
  • A title that is too short or too long.
  • A page blocked by robots.txt but with many backlinks related to the search query.

Getting a different title in the SERPs than the one in the HTML code does not mean that Google indexed your pages or titles incorrectly; it just means that the search query determines whether your HTML title tag is displayed.

Since we are discussing titles and CTRs, I want to touch on SERP CTR, SERP bounce rate, dwell time, and pogo-sticking concepts.
SERP CTR is the click-through rate on organic search results.

SERP bounce rate[32] is a bounce that happens when searchers click on a SERP result and then go back to the initial SERP without interacting with the content on the page they clicked on. That is not necessarily bad, depending on how much time the searchers spend on the website.

Dwell time is the time a searcher spends on a page before returning to the SERPs.

Pogo-sticking is defined as going back and forth between a SERP and the web pages listed in the results.

All the above are “crowd-sourced” metrics search engines use to self-evaluate the quality of their results. For example, if a spam page ranks first for a competitive keyword but does not get many clicks because users identify its SERP snippet as spam as soon as they see it in the listing, that page may be deemed irrelevant for the keyword. Similarly, if a page ranked number one for a particular keyword gets many clicks but almost everyone bounces back within a second, that signal is picked up by search engines (most likely by RankBrain, in Google’s case), which may reduce the page’s rankings because the very low dwell time shows it is not useful to users.

In an older crowd-sourced test I ran in 2013, the rankings of the targeted URL went up from #16 to #12 for a long-tail keyword after test participants clicked the targeted URL in the SERPs, visited a couple of pages, and spent some time on the test website. However, since this was not a large-scale experiment, it is possible that the fluctuation was just due to personalization or natural SERP variations.

However, if you think about it, it makes sense for search engines to test and analyze how users react to different results and to adjust results and algorithms based on SERP CTR and dwell time. Although it has not been officially confirmed, Matt Cutts suggests in a video that Google takes clicks into account when they test new algorithms on live results[33].

Remember, there is a metric that Google uses internally to measure the quality of their results: the long click.

“This occurred when someone went to a search result, ideally the top one, and did not return. This means Google has successfully fulfilled the query”.

The ideal scenario is to “finish the search” on your website. That is the ultimate quality signal you can send to search engines.

Consider the following suggestions for improving the effectiveness of your titles:

Title tag and H1s matching
One way to reinforce a product page’s relevance is to partially match the title tag with the H1. When doing this, both elements should contain the product name. This partial match is a good idea because H1 and <title> should be conceptually related but not the same.

Optimize your <title> tags for better SERP CTR and the H1 for conversion and reassurance.

On PDPs, the product name is usually wrapped in an H1, and it can be a pattern along the lines of the following examples:

  • {Product_Name}
  • {Brand} {Product_Name}
  • {Brand} {Product_Name} {Variant or Attribute}

You can use the H1 product-naming convention in the title tag as well, but you need to change it a bit, for instance by adding modifiers such as “Buy”, “Online”, “Free Shipping”, or “{Business_Name}”. Your title would look similar to:

  • {Product_Name} – {Business_Name}
  • {Brand} {Product_Name} – {Business_Name}
  • {Brand} {Product_Name} {Variant} – {Business_Name}
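The patterns above can be captured in a small helper. This is only an illustrative sketch; the parameter names mirror the placeholders in the text, and the separator choice is an assumption.

```python
# Sketch of a title-tag builder following the patterns above.
# Parameter names mirror the {Brand}/{Product_Name}/{Variant} placeholders.

def build_title(product_name, brand=None, variant=None, business_name=None):
    """Assemble a title tag from the optional pattern components."""
    parts = [p for p in (brand, product_name, variant) if p]
    title = " ".join(parts)
    if business_name:
        title += f" – {business_name}"
    return title

print(build_title("EOS 60D", brand="Canon", variant="Body Only",
                  business_name="ExampleStore"))
# Canon EOS 60D Body Only – ExampleStore
```

A real implementation would live in your CMS templates and pull these fields from the product catalog.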

Figure 419 – This screenshot shows that the title tag differs from the H1. Since the product name in H1 is very short, the title tag can easily be complemented with other useful product attributes.

Remember that the keywords in the title tag should accurately reflect the page content and be present in the main content area.

A side note about the title tag for category pages: when a category page lists subcategories in the faceted navigation or the main content area, the <title> tag can include some of the most important subcategories, especially when category names are very short.

Figure 420 – This is the SERP for “women’s dresses.” Nordstrom’s title tag includes “Cocktail Dresses” and “Maxi Dresses”. This approach works best for top-level categories with short names.

Keyword significance consolidation
This tactic works only for ecommerce websites that focus on a particular product line (e.g., bar stools) or in a specific niche (e.g., you only sell furniture). The tactic will help increase the significance[34] of your main keyword, creating more relevance around your website for that product line or niche. Here are the steps you need to take:

  1. Place the main keyword on the home page at the beginning of the title tag.
  2. Use the main keyword towards the end of the title tag on every website page, even if the title becomes longer than 65 characters or 500 pixels.
  3. Mention the keyword in the main content area on each website page. If your pages are content-rich, repeat the keyword every 250 words.
  4. Consolidate the contextual anchor text from internal pages to point to the homepage. For example, if “speedboat parts” is your most important keyword, then each page on your website should contain the keyword “speedboat parts” in the main content area, and the first instance of “speedboat parts” should link to the homepage.

Figure 421 – The above is a screenshot from Google Search Console before Google removed the Keywords report.

The report above displays the keywords and their significance for a website that sells only Shoprider scooters. Notice how “shoprider” and “scooters” are the most significant keywords on this website. The website ranks in the top five for “Shoprider scooters” in Canada, close to Shoprider’s official website.

Previously, you could download this list to find keyword variations. It was a useful report for understanding how Google groups keywords, but unfortunately, it has been discontinued.

Figure 422 – The report showed the keyword variations as well.

Just a quick note: keyword significance is different from keyword density. Keyword significance is measured at the domain level, while keyword density is measured at the document/page level.
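The distinction can be illustrated with a few lines of code. This is a deliberately naive sketch (whitespace tokenization, no stemming), using made-up page snippets: density is computed per page, significance across all pages of the domain.

```python
# Naive sketch contrasting keyword density (page level) with keyword
# significance (domain level). Tokenization is deliberately simplistic.

from collections import Counter

def density(page_text: str, keyword: str) -> float:
    """Keyword occurrences as a share of all words on one page."""
    words = page_text.lower().split()
    return words.count(keyword) / len(words)

def significance(pages: list, keyword: str) -> float:
    """Keyword occurrences as a share of all words across the domain."""
    counts = Counter(w for page in pages for w in page.lower().split())
    return counts[keyword] / sum(counts.values())

pages = [
    "shoprider scooters and shoprider parts",
    "mobility scooters by shoprider",
]
print(round(density(pages[0], "shoprider"), 2))   # 0.4  (2 of 5 words on the page)
print(round(significance(pages, "shoprider"), 2)) # 0.33 (3 of 9 words site-wide)
```

A real measurement would, of course, work on rendered page content rather than toy strings.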

Usually, ecommerce websites ship nationwide or even internationally. However, there are cases when you cannot ship outside a geographical region due to regulatory restrictions. For example, you cannot ship inter-provincially if you sell wine in Canada.

If you sell only to a specific region, province, or city, you can mention that in the title tag to increase the chances of showing up for a geo-personalized search query.

Figure 423 – The URL ranked second contains the city name in the title tag, while the URL at the bottom of page one does not.

If you are a retailer with multiple locations, build separate landing pages for each store location. The store address should be placed in the title tag, and the landing pages should reinforce the store locations, mentioning surrounding landmarks or geo-tagged images. At a minimum, the address in the title should include the city, state, or province.

Holiday-specific titles
Searchers’ behavior and queries change around major holidays and events such as Boxing Day, Mother’s Day, or Halloween. It is useful for searchers and SEO to update the title tag to accommodate these changes. For instance, around Valentine’s Day, the title “Valentine Gifts for Her. All Items on Sale & FREE Shipping” is more enticing and relevant to users than “Gifts for Her. All Items on Sale & FREE Shipping”.

Figure 424 – Tiffany updated the page title to match user intent before Valentine’s Day.

Character count and pixel length
Google does not index just 65 characters from title tags. It only displays about 65 characters in the SERP title (or the corresponding length in pixels). Google indexes as many as 1,000 characters.

Knowing this opens the door to experiments like thinking of your titles in blocks rather than a single 65-character unit. For example, it may be worth testing titles made of two blocks:

  • The first block of about 65 characters is where you craft the perfect title. This block will include category and subcategory names, product names, branding, calls to action, etc. Think of this as the title you would write if you were to follow the 65-character limit. Ideally, this will be the title seen by searchers on SERPs.
  • The second block will contain second-tier keywords such as product attributes, model numbers, stock availability, plurals, synonyms, etc. You could eventually repeat the most important keyword for your website at the end of the title on all pages except the homepage.

If the title is a full sentence and you want the entire sentence to appear in the SERP, it is best to keep it under 65 characters.
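The two-block idea can be sketched as follows. The 65-character limit is the approximation used in the text (Google actually truncates by pixel width), and the sample keywords are placeholders.

```python
# Sketch of the two-block title approach described above.

DISPLAY_LIMIT = 65  # approximate; Google truncates by pixel width, not characters

def two_block_title(display_block: str, secondary_keywords: list) -> str:
    """Join a SERP-sized display block with a second-tier keyword block."""
    assert len(display_block) <= DISPLAY_LIMIT, "first block should fit the SERP"
    return f"{display_block} | {' '.join(secondary_keywords)}"

title = two_block_title(
    "Canon EOS 60D DSLR Camera – Free Shipping",
    ["18 MP", "CMOS", "body only", "digital SLR cameras"],
)
print(title)
```

Only the first block is likely to be shown on the SERP; the second block exists for indexing, not display.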

Branded titles
When referring to branded titles, I mean using your brand name, not the other brands you might be selling. The decision of whether to add your brand name to the title tag depends on various factors, such as:

  • Whether the goal of your organic search campaigns is branding or rankings.
  • How strong your offline brand is.
  • The authority of your website (i.e., external links pointing to your site).

I usually recommend not placing the brand at the beginning of the title tag. However, the final placement should consider the following:

  • If you have a well-established brand and a more than decent website authority, you can place the brand at the beginning of the title tag.
  • If you are trying to build a brand, place the brand at the beginning of the title.
  • If your brand has only some recognition and your goal is to drive unbranded traffic, you should add the name at the end of the title tag.
  • If your brand is unknown or you don’t care about branding, do not include your name in the title.

Figure 425 – Big names like Amazon include their brand name right at the beginning of the tag. This tactic works for recognizable brands as they rely less on page titles for SEO reasons.

Sometimes, Google will change the SERP title to append the brand name at the end or the beginning whenever it makes sense for users. The example below shows how a search for “engagement rings” returns Costco’s website, with the brand name at the end of the title.

Figure 426 – The SERP title includes the brand name Costco.

However, if you check their HTML code, you will notice that the title tag does not include the brand name:

Figure 427 – However, the HTML page title of that same page does not include the brand name. Because the SERP title would be too short (just the category name), Google added the brand name automatically to make the results more appealing to searchers.

Keyword prominence
The term prominence refers to the closeness of the keywords to the beginning of the title tag. On category pages, start the title with the category name; on product detail pages, start with the product name.

But why is prominence important? First, search engines assign more weight to words at the beginning of the title. Second, Western readers skim text from left to right, and it is important to reassure them that the page is relevant by placing the category or product name at the beginning. An exception to this is if you have an established brand or if you are trying to build one; in these cases, start the title with your brand.

Keyword proximity
Proximity refers to how close words are to each other. If your targeted keyword is “women’s dresses”, you should not place other words between “women’s” and “dresses”. For example, the title “Women’s Casual & Formal Dresses” is not ideal; instead, it should be “Women’s Dresses: Casual, Formal, Going Out and more dress styles at {BrandName}”.

A quick note about category names: Do basic search volume research when deciding on category names. The screenshot below is from Google Keyword Planner, and it shows that “women’s dresses” has significantly more search volume than “womens dresses”.

Figure 428 – The search volume for “women’s dresses” is almost double that for “womens dresses”.

The importance of keyword prominence seems to have decreased after the Hummingbird update, as Google is not focusing on exact match keywords as much as it used to. However, it is still advisable not to break apart important words such as category or subcategory names.

User intent modifiers
User intent keyword modifiers can be placed before or after the targeted keywords to attract searchers at a specific buying stage. Based on user intent, search queries can be categorized into three main categories: informational, transactional, and navigational.

In the Keyword Research section, we discussed user intent in detail. While most search queries are not transactional, informational and navigational queries are still valuable because they can assist conversions. Hence, e-commerce websites should try to capture consumers with relevant content at each buying stage.

One way of clarifying the purpose of a page to users is to include user intent keyword modifiers in the title tag:

  • You can include transactional modifiers such as “buy”, “sale”, “discount”, and “cheap” on category and product detail pages.
  • Navigational modifiers (e.g., Sears Store Vancouver, BC) can be included on store location pages. The brand name can be added to the “About Us” and “Contact Us” pages.
  • Educational modifiers such as “learn”, “discover”, “read”, “find”, or “guide” can be included on shopping guide pages.

Keywords order
In some cases, you will find that words have different search volumes and even different meanings if arranged in a different order. For example, “dog toys” has a different meaning than “toys dog” and a different search volume. When the order of words creates different meanings, you will have to create separate landing pages.

Singular versus plural
It is known that using the same keyword more than twice in the title tag may raise spam flags. However, is the plural form of the keyword considered repetition?

When search engines analyze the content of a document, they use a process called stemming[35]. That means that they strip words to their root form (e.g., “dresses”, “dressed”, and “dressing” are all variations of the root word “dress”). If you view the matter from this angle, the plural variation can be considered a repetition. Although Google will treat singular and plural words as different keywords, I would not recommend using singulars and plurals in the same title.

That is because there is more than just stemming when it comes to plural or singular—there is user intent. Generally speaking, search queries containing plurals suggest that users seek a list of items rather than just one item. Moreover, in some instances, the same word can have different meanings in singular versus plural—e.g., “car cover”, which may refer to insurance cover versus “car covers” as in weather-proofing.
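Stemming can be illustrated with a toy suffix-stripper. Real engines use far more sophisticated algorithms (e.g., the Porter stemmer); this sketch only handles the handful of suffixes needed for the “dress” example.

```python
# Toy suffix-stripper illustrating stemming. Not a real stemming algorithm:
# it only covers the suffixes needed for the "dress" family of words.

def naive_stem(word: str) -> str:
    rules = [
        ("sses", "ss"),  # dresses -> dress
        ("ies", "i"),
        ("ing", ""),     # dressing -> dress
        ("ed", ""),      # dressed -> dress
        ("ss", "ss"),    # dress -> dress (keep double-s intact)
        ("s", ""),
    ]
    for suffix, replacement in rules:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word

for w in ("dresses", "dressed", "dressing", "dress"):
    print(w, "->", naive_stem(w))
# all four map to "dress"
```

The point is that a search engine sees all four surface forms as the same root, which is why repeating singular and plural in one title adds little.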

I recommend using the plural on listing pages or shopping guides and the singular on PDPs. For example, the title on a category page can read “Canon Digital SLR Cameras”.

The title of a product detail page under this category will read “Canon EOS 60D 18 MP CMOS Digital SLR Camera with 3.0-Inch LCD (Body Only)”.

Stop words
In computing, words such as “and”, “or”, “the”, and “in” are called stop words.[36] Because these were usually deemed non-essential for relevance scoring until the Hummingbird update, search engines used to filter them out when analyzing and classifying documents. Use natural language to create your titles, and if that means including stop words, do not sweat it; you are good.

You should consider whether your CMS automatically removes stop words from titles and URLs because some are important for users and can completely change meanings. For instance, if you sell music online and the CMS automatically removes the word “that” from the band name “Take That”, you will end up with a very suboptimal page title, e.g., “Best Take Albums” instead of “Best Take That Albums.”
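The pitfall is easy to reproduce. This sketch assumes a hypothetical CMS that strips a small stop-word list from titles; both the list and the function are illustrative.

```python
# Sketch of the CMS pitfall described above: naive stop-word removal
# mangles titles in which a stop word carries meaning. The word list
# is an illustrative assumption, not any real CMS's behavior.

STOP_WORDS = {"and", "or", "the", "in", "that", "a", "of"}

def strip_stop_words(title: str) -> str:
    return " ".join(w for w in title.split() if w.lower() not in STOP_WORDS)

print(strip_stop_words("Best Take That Albums"))  # Best Take Albums  (meaning lost)
```

This is why the removal should be reviewed, or disabled, for titles containing proper nouns.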

Word separators
The word separator most used by SEOs is the pipe sign “|”, but symbols such as hyphens and even commas are good choices too. Google suggests not to use underscores,[37] and I would also recommend staying away from the following special characters: ‘”< >{}[] ( ).

Some websites use catchy titles with a lot of non-alphabetic symbols to grab searchers’ attention (e.g., ~~~!FREE iPods!~~~) and possibly higher CTRs. Remember that using special symbols may get you a better CTR but also result in spam flags.

Character savers
If you need to squeeze in more text, you can replace certain words with their corresponding symbols: for example, the word “and” with the “&” symbol, the word “with” with the “/” symbol, or the word “copyright” with the “©” symbol. Remember to implement special characters using HTML entities (“&” as &amp;, “©” as &copy;).[38]
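In Python, for instance, the encoding could be sketched like this. The ampersand is handled by the standard library’s html.escape; the copyright sign is mapped explicitly, since escape only covers &, <, and >.

```python
# Encode the space-saving symbols as HTML entities, per the text.

import html

def encode_title(title: str) -> str:
    # Escape &, <, > first, then map the copyright sign to its named entity.
    return html.escape(title, quote=False).replace("©", "&copy;")

print(encode_title("Cameras & Lenses © ExampleStore"))
# Cameras &amp; Lenses &copy; ExampleStore
```

Most template engines do this escaping automatically; the sketch just makes the entity mapping explicit.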

Other great space-saving options are abbreviations (e.g., instead of “extra-large” you could use XL), and shorter synonyms (e.g., T-shirts instead of tee-shirts). The decision on which keyword version to use in the title has to be based on the search volume for those keywords and the content targeted on the page.

Calls to action (CTAs)
A page that ranks second but has a great, compelling CTA in the title could theoretically grab more clicks than the page ranked first if that first page has a poor title. Remember that the headline is one of the most important elements tested in advertising and conversion rate optimization. On SERPs, your headline is the title.

CTAs include action verbs, unique selling points, or promotional words. Sometimes, promotions can also affect CTR. An example of a promotional title is: “All Digital Cameras 60% OFF”.

Competitive differentiators and free shipping
If you know that your target market is sensitive to a particular feature or benefit that is part of your unique selling proposition (e.g., you offer a “lowest price guarantee”), use that to attract more clicks on your listing and to differentiate from competitors. You can do the same if you have a competitive edge (e.g., you are the exclusive retailer of a product/line of products).

Figure 429 – “Free shipping” is extremely exciting for shoppers, and Zappos features that prominently in their page titles. This tactic works for conversion and better SERP CTRs.

Test your titles
SEO testing is theoretically possible,[39] but it is hard to draw statistically sound conclusions because search engines involve many uncontrolled variables. However, title tag variations are among the easiest SEO tests you can run. Here are some ideas for your tests:

  • Place your brand at the beginning or end of the title.
  • Add one or more important product attributes to the product name.
  • Add the most important subcategory names before or after the parent category name.
  • Test various unique selling points at the beginning/end of the title.
  • Test various title patterns.


  1. It is All About the Images [Infographic]
  2. Ranking Images For Web Image Retrieval
  3. Luis von Ahn: Massive-scale online collaboration
  4. WEB1000 – The ‘alt’ attribute of the <img> or <area> tag begins with words or characters that provide no SEO value
  5. Image guidelines for SEO
  6. Is it better to have keywords in the URL path or filename?
  7. Image Sitemaps
  8. Does Google use EXIF data from pictures as a ranking factor?
  9. Video SEO White Paper
  10. Pop-ups, video buttons, and color swatches can turn site search results into selling tools
  11. Six retailers that used product videos to improve conversion rates
  12. Creating a Video Sitemap
  13. Using Instructographics For Online Marketing
  14. markup for videos
  15. Will I be penalized for hidden content if I have text in a “read more” drop-down?
  16. Webmaster Central 2013-09-27 [min 20:49]
  17. Will having the same ingredients list for a product as another site cause a duplicate content issue?
  18. SEO tips for e-commerce sites
  19. Whiteboard Friday – The Biggest SEO Mistakes SEOmoz Has Ever Made
  20. How many H1 tags should be on each HTML page? [min 00:42]
  21. Google’s SEO report card
  22. Thing > Product
  23. Non-visible text
  24. PowerReviews Spreads Consumer Reviews Between E-Commerce Sites
  25. Ecommerce UX: 3 Design Trends to Follow and 3 to Avoid
  26. The Magic Behind Amazon’s 2.7 Billion Dollar Question
  27. Rich snippets – Reviews
  28. Can I specify the canonical of all of a product’s review pages as a single URL?
  29. Farewell to soft 404s
  30. End-of-life (product)
  31. Google’s SEO report card
  32. Opinion: Is SERP Bounce a Ranking Signal or a Quality Factor for SEO?
  33. What’s it like to fight webspam at Google? [min 02:50]
  34. Content Keywords
  35. Stemming
  36. Stop words
  37. Is a comma a separator in a title tag?
  38. HTML Character Sets
  39. SEO Tip: Titles matter, probably more than you think