Programmatic SEO Pt 6: How to Run SEO Tests

We thought we'd done a good job covering all the various facets of large-scale programmatic SEO - keyword research, competitive analysis, creating landing pages, building links, and technical issues - but one topic kept coming up in comments and feedback that isn't discussed very often: testing.

Most marketers are familiar with testing as a concept, but usually in the context of landing page and feature tests.  While a few folks have broached the topic of SEO testing, it remains a not-so-talked-about subject.  So, we're going to break it down.

Two types of testing in SEO

Testing things in SEO is far more difficult than with paid acquisition, landing page optimization, or product testing.  That's simply because Google really doesn't want anybody to run tests - they'd rather we all live in a fairyland where we 'just focus on the user'.

There are really two types of testing in SEO, which we'll explore in this piece.  The first is the scientific approach: a true A/B test where you split templated pages and measure a statistical difference.  The second is a more artful approach, where you measure before and after.  And finally, we wrap up with a section explaining why testing should often be the last thing on your mind, and why some things aren't worth testing.  Here we go.

The Scientific Approach: A/B Testing Across URLs

Sites that already generate a lot of traffic and have templated page types can actually run fairly pure A/B tests on things like title tags and landing page layouts.  Engineers from Pinterest and Airbnb have laid out how they have done this.

Here's how it works.  First, you decide what you want to test across pages - say you want to test putting 'free estimates' in your title tags to see if that will entice more users to click.  You split your pages into two groups in a manner that divides the traffic approximately in half.  Then, you make the title tag change to one group - the test group.  Over the following weeks, you watch to see whether traffic increases in the test group relative to the control group.
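Assuming a templated URL structure, the split itself can be as simple as hashing each URL into a bucket.  Here's a minimal sketch (the paths are made up); hashing, rather than random assignment, keeps each page in the same group for the whole test:

```python
import hashlib

def assign_group(url):
    """Deterministically assign a page URL to 'control' or 'test'.

    Hashing the URL (rather than picking randomly on each run) keeps the
    assignment stable, so a page never flips groups mid-test.
    """
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return "test" if int(digest, 16) % 2 == 0 else "control"

# Hypothetical templated pages.
pages = [f"/pa/philadelphia/service-{i}" for i in range(1000)]
groups = {url: assign_group(url) for url in pages}
test_share = sum(1 for g in groups.values() if g == "test") / len(groups)
# With ~1,000 templated pages the split lands close to 50/50.
```

Note this only splits the page *count* in half - you still need to check that the *traffic* is split evenly, which is the sample-size problem discussed below.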

This type of testing is typically done on title tags and meta descriptions, as these affect user-driven metrics and you will start to see a difference right away.  Additionally, Google seems to adjust the search results for user-related metrics much more quickly than it does for links or changes in content.

One key challenge to this is ensuring that you have chosen your sample properly.  You'll want to make sure one group doesn't have, say, a single page that accounts for a disproportionate amount of traffic, because that could skew the results.  The folks at Airbnb detailed the statistical approach to account for this if you wanna get nerdy, but we'll spare you the math talk.
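A rough pre-flight check for that kind of skew might look like the following.  The 5% and 10% thresholds here are our own rough rules of thumb, not the statistical treatment Airbnb describes:

```python
def traffic_balance(traffic_by_page, groups):
    """Pre-flight skew check before starting a test.

    Compares total traffic per group and the share of the single biggest
    page. The 5% and 10% thresholds are rough rules of thumb.
    """
    totals = {"test": 0, "control": 0}
    for url, visits in traffic_by_page.items():
        totals[groups[url]] += visits
    total = sum(totals.values())
    biggest_page_share = max(traffic_by_page.values()) / total
    test_share = totals["test"] / total
    return {
        "test_share": test_share,
        "biggest_page_share": biggest_page_share,
        "balanced": abs(test_share - 0.5) < 0.05 and biggest_page_share < 0.10,
    }

# One page (/d) dominates traffic, so this split should be rejected.
traffic = {"/a": 500, "/b": 480, "/c": 510, "/d": 9000}
groups = {"/a": "test", "/b": "control", "/c": "test", "/d": "control"}
report = traffic_balance(traffic, groups)
```

If `balanced` comes back false, re-draw the groups (or exclude the outlier pages) before you start changing title tags.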

Metrics to Track

Why measure traffic?  Technically, in this sort of test you should be measuring conversions, as more traffic often means less intent - or at the very least keep an eye on conversion rate to ensure it doesn't drop.  You can also export Google Search Console data and measure differences in CTR.  However, we've seen time and time again that Google Search Console data isn't always accurate.  Finally, you could track rank.  However, sometimes you won't actually see a difference in rank, especially if you're already in the top few results - click-through rate is driving the win.  Additionally, you may not be measuring rank for all the possible keywords you could be ranking for, so rank tracking can only get you so far (and remember, we're a rank tracking company saying this).

Our recommendation?  Use traffic or conversions as the primary metric, but also pay attention to the others, as you can certainly learn from them.  We also recommend tracking bounce rate and/or dwell time to learn which factor in the test may be causing the result.  For example, we once ran a test which replaced a form on a page with a large button.  Traffic went up, but conversions went down.  The bounce rate and time on site for the button version were far better, leading us to believe that the improved engagement was driving slightly higher rankings.  However, the drop in conversion rate wasn't enough to offset the increase in traffic.
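Once the test has run for a few weeks, you need some way to decide whether the lift in the test group is real or just noise.  Here's a back-of-envelope significance check on daily sessions using Welch's t statistic (the session numbers are hypothetical, and this is not a substitute for a proper power analysis):

```python
import math
import statistics

def welch_t(control, test):
    """Welch's t statistic for two samples of daily sessions.

    A |t| comfortably above ~2 suggests the difference is probably real.
    This is a back-of-envelope check, not a full power analysis.
    """
    mc, mt = statistics.mean(control), statistics.mean(test)
    vc, vt = statistics.variance(control), statistics.variance(test)
    return (mt - mc) / math.sqrt(vc / len(control) + vt / len(test))

# Hypothetical daily sessions for each group over one week.
control = [1000, 1030, 980, 1010, 995, 1022, 1001]
test = [1100, 1135, 1080, 1110, 1090, 1125, 1105]

t = welch_t(control, test)
lift = statistics.mean(test) / statistics.mean(control) - 1  # ~10% lift
```

Run the same check on conversions before declaring a winner - a traffic win with a conversion loss is not a win.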

When we set up a testing system for a client, we pull in raw traffic and bounce rate data from Google Analytics and rank data from SerpDB, and put it all into a monitoring dashboard.
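The join behind a dashboard like that is straightforward.  Here's a stdlib-only sketch, with made-up rows standing in for the Google Analytics and rank-tracker exports (the actual API calls are out of scope):

```python
# Hypothetical rows standing in for a Google Analytics export (sessions,
# bounce rate by landing page) and a rank-tracker export (best rank by page).
ga_rows = [
    {"url": "/pa/philadelphia/home-cleaners", "sessions": 1200, "bounce_rate": 0.42},
    {"url": "/pa/pittsburgh/home-cleaners", "sessions": 300, "bounce_rate": 0.55},
]
rank_rows = [
    {"url": "/pa/philadelphia/home-cleaners", "rank": 3},
    {"url": "/pa/pittsburgh/home-cleaners", "rank": 9},
]

# Join the two datasets by URL so each dashboard row carries traffic,
# bounce rate, and rank together.
ranks = {row["url"]: row["rank"] for row in rank_rows}
dashboard = [dict(row, rank=ranks.get(row["url"])) for row in ga_rows]
```

From there, any BI tool can plot sessions, bounce rate, and rank per page side by side for the test and control groups.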

Limitations of Pure A/B Testing

For starters, you need a site that has enough traffic to run a test.  Additionally, you need a site with templated pages (as is usually the case in programmatic SEO).

Additionally, this type of testing should generally only be employed on sites that already have page 1 rankings across the board.  For one, you probably have better things to do if you aren't there yet.  But more importantly, the things best suited to this sort of testing are typically user-behavior related - links and other on-page factors can take weeks or months to bake in.

Additionally, this type of testing alone can't replace best practices. In the Pinterest article, the author writes the following:

"we once noticed that Google Webmaster Tool detected too many duplicate title tags on our board pages. The title tags on board pages were set to be “{board_name} on Pinterest,” and there are many boards created by different users with the same names. The page title tag is known to be an important factor for SEO, so we wondered if keeping the title tags unique would increase traffic. We ran an experiment to reduce duplicate title tags by including Pin counts in the title tag, for instance “{board_name} on Pinterest | ({number} Pins).” But we found that there was no statistically significant change in traffic between the groups."

However, this explanation reflects a basic misunderstanding of title tags.  Users may not have a preference between the two title tags, hence the test results.  However, when Google sees duplicate content across your site, the algorithm is more likely to ding your site as a whole - not just a subsegment of pages.  That's why their test showed no delta.  I'm not saying this is definitively the case for Pinterest, but I'd keep the title tag with the number in it because it's a best practice.

Just because a test doesn't show a result doesn't mean you should throw best practices out the window.  More on this in the sections to come.

The Artful Approach: Before and After Testing

Not all SEO testing needs to be or should be purely scientific.  Consider the following questions you might want to ask:

  • Does investing in page speed improvements across my whole site make a difference?
  • Should I prune thin content to keep link equity from spreading too thin?
  • Will several links from a PBN to my money page increase the rankings?
  • Will internal linking between city pages help my rankings overall?
  • How many links do I need to build to get my 2,500 word blog post to rank?

For all of the above hypotheses, running a simple split test won't help you get to an answer.  That doesn't mean we can't test the hypotheses - we just need to be a bit more artful about it.  In many instances, the best you can do is make a change and observe the differences.

Let's take an example.  In a Moz post, the folks at Pipedrive detailed how they spent a whole lot of time getting to #1 for the high-volume term 'sales management'.  Here is an excerpt:

Hypothesis: We hypothesized that dropping the number of "sales management" occurrences from 48 to 20 and replacing it with terms that have high lexical relevance would improve rankings.  Were we right?  Our organic pageviews increased from nearly 0 to over 5,000 in just over 8 months.

Of course, many growth snobs who tout nothing but testing might scoff at this as N of 1, saying that this could just as easily be coincidence.  Especially when the post described how they spent months building guest post links to their guide.

However, the result followed the cause and it's what intuition would suggest might happen.  Sometimes in SEO, that's the best we're going to get.

One variant of the artful approach is what we call the kitchen sink approach.  This is useful when you care more about getting results within a timeframe than about isolating the exact factors behind them.  In the kitchen sink approach, you look at your goal and lay out all of the activities you could see doing to get a page to rank.  These might include:

  • Building a ton of links
  • Internally linking from the homepage
  • Putting structured data on the page
  • Putting jumplinks on the page
  • Optimizing the keyword density with Clearscope
  • Adding unique images
  • Improving the UX

Once you've identified everything, you do it all.  Then, a few months later, measure the results.  You may not know whether each individual tactic had an effect, or whether some were pointless, but who cares?  Now you have a playbook for getting your stuff to rank.

Do you even need to be running tests?

Running tests is probably one of the last things you should be doing in SEO.  Testing, by nature, only delivers incremental gains and will only bring you to local maxima.  There is often more low-hanging fruit in SEO than a 5% lift from changing title tags.

This is especially true for sites that are newer, where link building and content should be the priority.  Or on extremely old authoritative sites that are jacked up from a technical perspective.

The other thing to consider is that many of the impactful things in SEO are not testable, yet they are considered best practices for good reason.  Sometimes you should do things because Google itself has said to, because studies across numerous websites have demonstrated the effect, or just because a healthy dose of common sense and intuition says so.

These include:

  • Beefing up landing pages with UGC
  • Adding all proper schema markup
  • Producing quality content relevant to your niche consistently
  • Ensuring all pages are crawlable
  • Ensuring your highest intent pages are heavily internally linked
  • Making sure keywords and synonyms are used in the right places
  • Ensuring your website loads quickly
  • Generating pages to rank for all possible search intent
  • Building links

All of these are things we as a community know to be impactful.  On many of them we can't run a true test, but I'd argue you shouldn't be running tests on any of these anyway.  We know they work, with little to no downside risk.

I spoke to an SEO manager at an up-and-coming aggregator-style marketplace.  He told me, "I can't really justify a budget for link building because I can't measure the effectiveness like I can with testing."  That mentality, my friends, is why that website is in the shitter and going to lose to its competitors.  Google intentionally makes it hard to test many factors; that doesn't mean those factors are not impactful.  Links are the prime example: they take a long time to work, and you never really know if any one link worked or not.  But if you build hundreds of links to a site over the years, you'll know it worked.

This isn't to say don't run tests.  It's just to remind you that there are often things that aren't testable, yet still impactful and worth considering.

Want more posts like this?  Join our mailing list below.


Don't worry, we won't flood your inbox with fluff marketing crap or anything self-promotional.

Programmatic SEO (Pt 5): Common Technical Challenges and How to Overcome Them

This is the fifth post in a multipart series on programmatic SEO.  Thus far we've covered keyword research, competitive analysis, creating landing pages and building links - all at scale.

Companies doing programmatic SEO typically generate a TON of pages on their website.  With all these pages comes a number of technical issues, which we'll go over here.  As with most posts on our blog, we try and avoid rehashing what's obvious and already covered elsewhere.  So we'll focus exclusively on the challenges unique to large scale, programmatic SEO.

Getting Pages Crawled

If Google can't find a page, it can't rank it.  And even if you submit an orphaned page manually or via a sitemap, you're losing out on internal link juice you could be sending to that page.

For starters, make sure that every page on your site is categorized.  For a travel site, you probably have Countries, States, Cities and Destination Types.  Be sure to create some sort of category in your backend for one-off pages like About Us, Media Relations, etc. as well.  Then take the broadest category type and use it to create an HTML sitemap that's linked in the footer.

Retailmenot does this by breaking stores up alphabetically.
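That alphabetical breakdown is easy to generate programmatically.  A toy sketch (store names are illustrative, not Retailmenot's actual data):

```python
from collections import defaultdict

def alphabetical_sitemap(store_names):
    """Group store pages by first letter, so every store page is reachable
    from a small, fixed set of letter pages linked in the footer."""
    buckets = defaultdict(list)
    for name in sorted(store_names, key=str.lower):
        first = name[0].upper()
        # Names starting with digits or symbols go in a catch-all bucket.
        buckets[first if first.isalpha() else "#"].append(name)
    return dict(buckets)

pages = alphabetical_sitemap(["Target", "Amazon", "adidas", "1-800-Flowers"])
# pages maps each letter (or '#') to the stores that letter page should list.
```

Each bucket becomes one HTML sitemap page, and the footer links to the letter index.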

This ensures that no matter what, Google can find all of your pages.  Categorization also makes it easy to establish breadcrumbs.

Of course, you don't just want Google to find your pages - you want to rank those pages.  The sitemap is not designed to replace internal linking - just to make sure no pages get orphaned.  Internal linking makes sure that you are sending your hard-earned link juice to the right places.

Internal Linking

Though Google has come a long way, PageRank is still at the core of the algorithm, and websites pass PageRank via internal links.

This is one of those areas where what's best for the user and what's best for SEO aren't quite the same.  Here are some best practices, at least SEO-wise.

Use Breadcrumbs

Using breadcrumbs with your clearly defined hierarchy ensures that every subcategory is linked to its parent categories.  Also, using schema markup allows you to tell Google verbatim what the hierarchy of your site is, and makes your result a bit prettier in the SERPs.
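The schema markup in question is schema.org's BreadcrumbList type.  Here's a sketch that builds the JSON-LD from a category trail (the URLs are illustrative):

```python
import json

def breadcrumb_jsonld(trail):
    """Build schema.org BreadcrumbList markup from a (name, url) trail."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }

markup = breadcrumb_jsonld([
    ("Pennsylvania", "https://example.com/pa"),
    ("Philadelphia", "https://example.com/pa/philadelphia"),
    ("Home Cleaners", "https://example.com/pa/philadelphia/home-cleaners"),
])
# Drop this into the page head as a JSON-LD script tag.
script_tag = f'<script type="application/ld+json">{json.dumps(markup)}</script>'
```

Because the trail comes straight from your category hierarchy, the same template can emit breadcrumbs for every page on the site.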

Related Categories

Typically you don't just want to internally link up and down your category hierarchy, but also to pages at the same level of your hierarchy.

Sometimes this is also good for the user.  In other cases, internal linking is really just for SEO.

There are two ways companies typically go about internal linking in this situation.

The more complex solution is using the product's own usage data to give the user the best possible recommendation.  This is used by sites designed to be search engines of a sort in their own right - Yelp, Tripadvisor, and Indeed, for example.  These sites have data on what the 'best' page for the user is, so it's in their interest to show it to the user - while also funneling link equity to that page.  The downside is that some results pages may get orphaned, so make sure you design the logic such that every page gets linked to some extent.

The second solution is to just use categorization.  If you're like Thumbtack (the example above) and simply want to spread out link equity, you can use some sort of categorization to decide which links to show.  For a site targeting locations, that could be metro areas or cities within a certain radius.  This is much simpler and easier to maintain.
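A radius-based version of that categorization can be sketched in a few lines.  The coordinates below are approximate, and the 150 km radius and link limit are arbitrary choices, not a recommendation:

```python
import math

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def related_city_links(city, all_cities, radius_km=150, limit=8):
    """Pick cities within a radius, nearest first - the 'categorization'
    flavor of related-page linking."""
    others = [(name, haversine_km(all_cities[city], coords))
              for name, coords in all_cities.items() if name != city]
    nearby = [name for name, d in sorted(others, key=lambda x: x[1])
              if d <= radius_km]
    return nearby[:limit]

# Approximate coordinates for a handful of city pages.
cities = {
    "philadelphia": (39.95, -75.17),
    "wilmington": (39.75, -75.55),
    "new-york": (40.71, -74.01),
    "pittsburgh": (40.44, -80.00),
}
links = related_city_links("philadelphia", cities)
```

Because the logic is symmetric-ish (nearby cities tend to link to each other), it's hard for any city page to end up orphaned.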

Use commercial anchor text

Current best practice is to use commercial anchor text for internal linking.  Tell Google what that page is about.

Make your most linked pages link to your money pages

Ideally, your most linked pages should directly link to your most valuable pages.  This is why so many home pages link directly to big categories.  Relevance also matters here.  Make it a quarterly practice to review the pages with the most external links and ensure they link to your money pages.

Minimize clicks from home page

Try and ensure your valuable pages - usually category pages - are no more than two clicks from the homepage.  For less valuable pages that still get traffic - such as individual listing pages - try and aim for three or four clicks from the homepage.
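One way to audit click depth is a breadth-first search over your internal link graph.  A toy sketch with a hypothetical site:

```python
from collections import deque

def click_depth(links, start="/"):
    """BFS over the internal link graph to get each page's minimum
    click distance from the homepage."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

# Hypothetical internal link graph: page -> pages it links to.
site = {
    "/": ["/pa", "/nj"],
    "/pa": ["/pa/philadelphia"],
    "/pa/philadelphia": ["/pa/philadelphia/home-cleaners"],
}
depths = click_depth(site)
too_deep = [p for p, d in depths.items() if d > 2]  # pages past two clicks
```

Pages that never show up in `depths` at all are orphans - exactly what the HTML sitemap is there to prevent.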

Linking out to relevant resources matters

Internal linking isn't just about passing link equity.  You also want to demonstrate authority on topics you are trying to rank for.  A page about 'Travel Spots in Thailand' that links to individual listings and guides to different cities in Thailand demonstrates more authority than just one page that doesn't link to any others.  It's sort of the concept of a content hub.

Ensuring Google Likes Your Content

Just because you've ensured Google can find your pages with your sitemaps and internal linking doesn't mean G likes what's on the page.

The first indication is your indexation rate.  Remember the categories you created for each page type?  You should also create an XML sitemap for each of these categories (one that ideally updates automatically when new pages are created).  Then submit those to Google Search Console, and GSC will tell you which pages are indexed.  Note that the new Search Console often lies about individual pages not being indexed, but the overall indexation rate is still worth monitoring.

This indexation rate should be above 90%.  If it isn't, you probably have a problem with some of your pages.  More on fixing this later.
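The math here is trivial, but doing it per sitemap category is what makes the problem findable.  A sketch with made-up counts, using the 90% threshold from above:

```python
def indexation_rate(submitted, indexed, threshold=0.90):
    """Indexation rate per sitemap category, flagging anything under
    the threshold (90% per the rule of thumb above)."""
    report = {}
    for category, count in submitted.items():
        rate = indexed.get(category, 0) / count
        report[category] = {"rate": round(rate, 3), "healthy": rate >= threshold}
    return report

# Hypothetical per-category counts pulled from GSC sitemap reports.
submitted = {"cities": 5000, "categories": 400}
indexed = {"cities": 3900, "categories": 396}
report = indexation_rate(submitted, indexed)
```

In this example the category pages are fine, but only 78% of city pages are indexed - a sign that a chunk of those pages is probably thin.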

Another measure of your page content is the cache date.  

When it comes to sites that do well at programmatic SEO, most results will have been cached within the last week.  However, you should check the cache dates of your competition to get a benchmark.  If your time since last cache is significantly above average for your space, that's a sign your landing page content isn't good.

Finally, keep tabs on your crawl budget.  Generally if it's going up, that's a good sign.  If it is staying constant, that's not necessarily a bad thing.  If it's shrinking and the number of pages isn't, that's a bad sign.  Additionally, you want to make sure the number of pages on your site is no more than 3x your daily crawl budget.

Dealing with Thin Content

If you see signs that Google doesn't like your content, or you just look at your pages and can tell they are thin, you've got a couple options.

First, you can beef up your pages with more content.  Read more on that here.

If that's not an option, you can delete the pages.  However, if the pages are still useful to users, you'll want to noindex them instead, and likely nofollow any links to those pages.

Indeed does this with all of their individual job pages, as often they are thin, scraped, and have little SEO value.

You may not want to deindex a whole category of pages, however, as some of them may get traffic.  For example, Yelp gets a ton of traffic from people searching for 'restaurant + reviews'.  In cases like this, you may want to programmatically noindex a page if it falls below a certain threshold.  Our gut feel is that if a page doesn't have at least 3x the content that exists in the template (nav, header, footer, related articles, etc.), that page is garbage in Google's eyes.
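That threshold rule is easy to wire into a page template.  A sketch, where the page data and the 3x ratio are both just illustrations of the gut-feel heuristic above:

```python
def should_noindex(unique_chars, template_chars, min_ratio=3.0):
    """Noindex a page whose unique content is under ~3x the boilerplate
    shared by every page in the template (nav, header, footer, related
    links). The 3x ratio is a gut-feel threshold, not a Google number."""
    return unique_chars < min_ratio * template_chars

# Hypothetical pages: (unique content chars, template chars).
pages = {
    "/tx/austin/plumbers": (9500, 2000),        # lots of reviews and copy
    "/wy/lost-springs/plumbers": (800, 2000),   # nearly empty listing page
}
decisions = {url: should_noindex(u, t) for url, (u, t) in pages.items()}
```

The nice part of making this programmatic is that it self-heals: once a thin page accumulates enough reviews or listings, it crosses the threshold and gets indexed again.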

Page Speed

For sites that act as search or booking engines, you want the landing pages to have the most up-to-date information for users.  For example, you probably don't want to show a hotel that doesn't have availability.  However, you don't want to have to request that information every time somebody loads the page.  So, ideally you have a caching solution that purges and primes the cache whenever the information changes materially.  Talk to your engineering team about that because we're not going to tackle that topic in this post.

Spreading Link Equity Too Thin

As previously mentioned, PageRank is still at the heart of Google's algorithm.  Yes, relevance, RankBrain, E-A-T, etc. are all factors, but link equity is still a very real thing.  The more pages your site has, the more you are spreading that link equity out.

How can you tell if you're spreading yourself too thin?

Well, for one, if your indexation rate is low, that can be a sign.  Especially if it looks like your content is good.

Another thing to make a practice of is measuring rank on new pages as they are generated.  Typically, pages settle on a rank within a week or two of first being indexed.  If the rank that new pages settle at trends downward (for similarly competitive terms), it's a sign your link equity is getting spread thin.  Unfortunately there's no hard and fast science here, but it's worth being aware of.

Whether you see the signs of this or not, you can usually prune some pages to help put your link equity to use.  Do this by using a crawl tool like Screaming Frog to get all of your pages.  Then download traffic data by landing page, match up the two datasets, and create a Pareto chart.  If a large portion of your pages are getting zero or very little traffic, it might be worth pruning them via deletion and redirect, or via noindexing.
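The matching step is just a join between the crawl export and the analytics export.  A sketch with made-up URLs (the traffic cutoff is an arbitrary choice):

```python
def pruning_candidates(crawled_urls, sessions_by_url, min_sessions=1):
    """Match a crawl export (e.g. from Screaming Frog) against analytics
    traffic and surface pages getting (near) zero organic sessions - the
    long tail of the Pareto chart - as pruning candidates."""
    return sorted(
        url for url in crawled_urls
        if sessions_by_url.get(url, 0) < min_sessions
    )

# Hypothetical crawl list and analytics export.
crawl = ["/a", "/b", "/c", "/d"]
traffic = {"/a": 5400, "/b": 12}   # /c and /d never appear in analytics
candidates = pruning_candidates(crawl, traffic)
```

Note that pages absent from the analytics export entirely (here `/c` and `/d`) count as zero-traffic, which is exactly the tail you're looking for.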

Duplicate Content

We've all heard of duplicate content before, but it can really become an issue in programmatic SEO.  As your site grows larger, often you will have a worse page ranking in the search results than your money page.  For example, you may have a transactional page about the best lawn care companies in Portland, OR getting outranked by a blog post about lawn care tips in Portland.  Occasionally you might get two slots from this, but often you may be showing the wrong page and getting a lower rank as a result.

First, you need a mechanism to let you know this is happening.  For us, that's pretty easy.  When we work with a client, we use SerpDB to write all rankings to a database.  Then, we programmatically create a table that maps each keyword to the desired landing page, and join the two together.  A clean URL structure makes this easy.

For example, let's say we have a url structure of https://www.[domain]/[state]/[city]/[category].  It's quite easy to map the keyword 'philadelphia pa home cleaners' to the url https://www.domain.com/pa/philadelphia/home-cleaners.  Make sense?

Then, we set up our data visualization, usually in Looker or Tableau, with a field that checks whether the ranking URL is the desired URL.
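The keyword-to-URL mapping itself can be sketched like this.  It assumes keywords follow the '[city] [state] [category]' pattern from the example above - real keyword sets need far more normalization than this:

```python
def expected_url(keyword, state_abbrevs):
    """Map 'philadelphia pa home cleaners' -> '/pa/philadelphia/home-cleaners'.

    Assumes the keyword follows a '[city] [state] [category]' pattern and
    that the state abbreviation appears exactly once.
    """
    words = keyword.lower().split()
    state = next(w for w in words if w in state_abbrevs)
    i = words.index(state)
    city = "-".join(words[:i])
    category = "-".join(words[i + 1:])
    return f"/{state}/{city}/{category}"

def is_cannibalized(keyword, ranking_url, state_abbrevs):
    """True when the page actually ranking differs from the page we built."""
    return ranking_url != expected_url(keyword, state_abbrevs)

states = {"pa", "nj", "ny"}
flag = is_cannibalized("philadelphia pa home cleaners",
                       "/blog/lawn-care-tips-philadelphia", states)
```

Joined against the rank database, this flag column is exactly the field the dashboard checks.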

Now, when it comes to fixing these, you have a few options.  One is to do nothing if you're ranking and converting well - a likely decision if it's a one-off situation that doesn't affect the bottom line much.

If it does affect the bottom line, you can try rewording one of the pages to target fewer of the overlapping keywords.  You can try sending more link equity to the main page.  If both pages have links, you might consider consolidating and using a 301 redirect to send all that link juice to one page.

If it's systematic, it's likely that you need to rethink your categories.  For example, if you have pages for [city] + home cleaners and [city] + cleaning services, maybe you consolidate those programmatically, 301 redirecting one category to the other in order to preserve link juice.

Finally, it is best to show the most unique content possible to the search engines, even when duplicating content would seem natural for users.  This is a big issue for location-based searches.  For example, if you create a page to rank for 'Washington DC hotels', it might be the case that a hotel located across the river in Arlington, VA is one of the top hotels.  However, if that hotel appears on both the Washington DC page and the Arlington, VA page, and maybe even other bordering towns, each page is less unique.  One instance of this is hardly problematic, but if your listings or other on-page content are similar across several pages, you are hurting yourself.  Idealists might think 'Google is smart enough to know the difference' and that you shouldn't worry about it, but as in so many cases, these clowns are wrong.  This is a case where old best practices still very much apply, and we have seen enough data to know it.

We're on a mission to de-bullshit-ify the world of SEO. If you want more real content and less fluff, we'd be honored if you joined our mailing list below.

Programmatic SEO (Pt 4): Building Links at Scale

This is part 4 of our multipart series on programmatic SEO.

When you're trying to dominate an industry programmatically, you can't do it with dumb shit like broken link building, the skyscraper method, resource link building, and other forms of what we like to call 'begging for links'.  While those may augment a strategy, there simply aren't enough broken links out there to achieve the quantity of links you need.  You need to identify a strategy that scales really far.  Then scale it really far.

This piece is a bit different than the others in that it isn't a step by step guide.  While almost anybody could successfully follow our guide to keyword research or competitive analysis to a tee (more or less), link building is much more nuanced.  There is no exact playbook, and each company has to figure out what works in its own vertical.  If there were a playbook, everyone would do it and it would then cease to work.  So, we've included five high level strategies with examples to serve as inspiration rather than prescription.

Link building tactics that scale

1) Scalable Ego Bait

Nearly every two-sided marketplace uses some form of ego bait to get its supply side to link back.  Often this is in the form of badges.

Yelp, Houzz, Thumbtack, Tripadvisor, etc. all offer the businesses on their platforms some sort of embeddable badge for their websites.  Even Airbnb is getting into the badging game.

Typically the way it works is that you offer a business on your platform a badge or reward, complete with an HTML embed code, as soon as they join.  Sometimes it just shows that they're a '[giant aggregator] Certified Business' and other times it live-updates with their reviews or aggregate rating, a la Yelp.

Then, to get even more badges out of the supply base, aggregators often offer a 'Best of [year]' award or other special achievements, making the business feel special enough to show off the badge on their website.

Have a competitor that's succeeding at badges?  One easy way to see which businesses are linking back is to right click the badge and click 'Search Google for This Image'.

Because badges can get into shady territory, we put together an entire post that dives into badges as a link building tactic.

Software review sites like Capterra and Software Advice also get links back in a similar manner, but usually without the badge.  If you look at these companies' link profiles, you'll notice that the software vendors themselves typically link back in a more thought-out, editorial fashion.

The great part about this strategy is that your link building scales with your growth rate.  Even faster if you earn multiple badges per supplier.  Just be careful not to get a penalty with your embeddables.

2) Viral Content

Movoto, as detailed in this piece, earned their place in the search engine rankings by taking the approach of being 'Buzzfeed for real estate'.

The company had a fraction of the budget of competitors like Zillow, yet managed to carve out a valuable spot in a highly competitive niche through goofy, Buzzfeedy, listicle posts designed to go viral.  The results are quite remarkable.

Remember, this was during the heyday of social media, when list posts like this were new, ads were cheap, and you could get a post to go viral with relative ease.  In the piece by Movoto, company members describe their formula for figuring out the viral equation.  Additionally, Movoto spent a ton on paid social to get the posts to 'lift off'.  As these posts went viral and gained thousands of page views, they naturally accumulated backlinks.

Movoto did a lot of content that was locally relevant, and repeated for every city as seen below.  A great example of finding what works, and scaling to oblivion.

The results are quite impressive and the post is worth a read.  However, it is worth noting that Buzzfeedy listicles and organic social virality are not what they used to be.

3) Proprietary Data

Thumbtack was founded in 2008, coming to the home services space a decade after rivals Homeadvisor and Angieslist. Yet the company managed to appear at the top of nearly every search result for local services, making them a giant in the space worth $1.3 billion at last valuation.  It's worth noting that Angieslist and Homeadvisor combined (the two merged last year) are only worth $1.3 billion.  Shows the power of SEO, but we digress...

At the center of Thumbtack's dominance is one thing: The Annual Small Business Report.

Thumbtack spent their first 3 years getting a massive supply base of local businesses onto their platform.  Each year, they survey those suppliers about local business conditions in their region.  What do they do with that data?  If you guessed that they turn it into a piece of content, you're only sort of right.  They turn it into a piece of content for every state and metro area in the United States.

These annual reports earn links from news outlets (both local and national), chambers of commerce, and government websites.  And local links are exactly what a company trying to rank for local terms should go after.

Another example of a company leveraging proprietary data is Redfin, which uses its database of home sale prices to produce a wide variety of insights that journalists are hungry for.

Redfin regularly publishes newsworthy insights on the housing market.  And the housing market is always changing, so there's always some sort of scoop that journalists and members of the real estate industry are interested in learning.

If your company produces proprietary data as a by-product of your business operations, and you can package it in a way that's newsworthy, that's link gold right there.

4) Commissioned Surveys

Not every company produces a ton of proprietary data, but that doesn't mean you can't collect it.  Surveys are a great way to collect data and produce insights that journalists, bloggers, and even governments are interested in.

While surveys require an extra step and a higher cost than proprietary data, they have a key advantage: you can tailor the questions in a manner that hopefully elicits a certain response.  Asking somewhat leading questions around topics that are controversial or otherwise hot can lead to a juicier story, which leads to more links.  But we know you'd never bend the truth for a link...

Anyways, when we think surveys, we think of Bankrate.

Bankrate has been commissioning and publishing surveys for a long time, and they almost always get picked up by news outlets.  If you read through a few of these, you'll notice they almost seem to follow a formula.  Nearly every survey contains the following elements, often in the same order:

  • A personal anecdote highlighting the problem
  • The main finding from the survey, indicating that person is not alone and there is a larger trend
  • One or two 'drill-ins' that either break the data up by a dimension or show a related insight to expand upon the initial finding
  • A conclusion recommending what someone can do to fix the problem

That story arc is more or less present in all of Bankrate's surveys, and it works.

Bankrate typically uses professional research firms to conduct the surveys via phone.  These are expensive; cheaper alternatives include SurveyMonkey and Google Surveys.  However, many high-end news outlets won't run a story unless the data comes from a professional research firm.

5) Content with high link intent

Sometimes link building really is a function of creating great content.

I think I puked in my mouth saying that.  But really, there are a ton of sites that generate so much content that they seem to naturally pick up links.  Some of these are likely more or less by accident, while others may have put a little more thought into it.  We'll go over how to put thought into it.

In many verticals, like marketing, for example, people are writing content every single day and looking for sources to use in their stories.  You just need to get discovered as that source.

For example, if we were adding an item to this piece about infographics, I might search 'how to build links with infographics'.  And because I'm lazy, I'm probably clicking on the first or second result and using that as my source.  That's why the top couple results here have so many links.

If you're the top result, you get a lot of links.  Which further secures you as the top result, begetting you more links.  It's a positive feedback loop.

A lot of times this virtuous cycle is how companies that pioneered a vertical solidified top placements.  Hubspot comes to mind as the player in inbound marketing.  Student Loan Hero pioneered the student loan debt vertical - which is on the rise.  Just look at what ranking #1 for 'student loan debt statistics' got them.

For this strategy to work, you should make it a point to produce content with high link intent.  How to determine this?  Typically you want high search volume with informational intent.  Additionally, if you see the top couple of posts earning a lot of editorial links, you're probably in the right place.  From here, you need to crack the top few by pitching your piece, guest posting, etc. - whatever you have to do to get it to the top.  It's a lot of work without a short-term payback, but it pays dividends for years to come.

6) Guest Posting

Despite Matt Cutts scaring everyone back in 2014 with his 'stick a fork in it' post, guest posting is alive and well - for the moment, at least.  Consensus among the SEO community seems to be 'if it's for relevant, high quality sites and the anchor text is contextual, it's ok'.  Others disagree.  Whether that will remain true forever is uncertain, but there are still some companies scaling link building through massive guest posting campaigns.

For guest posting to work, you generally need an industry where there are a lot of adjacent blogs and publications that need content.  Marketing and software are prime examples of this, as nearly every marketing consultant and software company has a somewhat authoritative blog.  Less so in something like home services - there just aren't that many blogs out there that want a piece on how to unclog toilets.

Software Advice is one player that has appeared to scale guest posting massively.  The company contributes pretty useful content for all sorts of verticals.

The company almost always links back to a category page, using semantically related, yet not quite exact commercial anchor text.  A little dicey if you ask us, but it seems to be working.

Note that Software Advice's category pages have a TON of non-commercial information on them, making their backlinks a little less commercial.  We would not recommend linking to commercial pages otherwise as that's clearly manipulative and any manual review would reveal that.

Eventually, the brand takes over.

In most of the examples we analyzed, the older the companies got, the more links they seemed to get out of thin air.  Executive hires, IPO filings, earnings reports, company milestones, scandals, etc.  So when slogging away at building links brick by brick, it's important to realize that SEO compounds and eventually gets you to a point where your company is newsworthy enough to start earning mentions and links without even trying.  But until then, happy link building.

If you enjoyed this post, we'd love if you shared it or gave us your email below so we can let you know when the next post in the series is out.

Read more


Programmatic SEO (Pt 3): Creating Landing Pages at Scale

Note: This is part 3 of a multipart series on programmatic SEO.  

Once you've identified which keywords you're going after and have a good sense of the competitive landscape, it's time to start generating landing pages.  This is one of the key challenges with programmatic SEO, as each one of your thousands of landing pages needs to be highly unique.

In this guide we will:

  1. Talk about the importance of creating one page per searcher intent
  2. Touch on what 'doorway pages' are and how to avoid them
  3. Go over strategies to ensure your product creates enough content
  4. Give you a dead-simple, surefire way to choose the optimal landing page design

One page per search intent

Google is pretty good at synonyms and related topics these days.  So no need to create a landing page for every possible keyword.  Rather, you'll want to create landing pages for each searcher intent.

It's important you focus on what Google defines as intent - not what you define as intent.  Google uses user data and an algorithm to determine this, and algorithms aren't perfect.  Anybody who has seen Google's exact match substitutions knows this.

How to figure this out?  In the previous section on competitive landscape analysis, we looked at which competitor pages rank for multiple head terms.  Additionally, you can spend an hour just browsing the serps, observing which pages rank for which terms.  Pay specific attention to which words Google highlights in bold, as well as what shows under 'People also searched for'.

Do competitors have multiple pages per intent?  What about competitors that aren't the 800 lb gorillas?  Often giant sites like Yelp will have multiple pages for similar intents - that's because they have plenty of content and link equity to spare, and also don't specialize in a vertical.

At the end of the day, this is a judgment call, and you have to start somewhere.  If it turns out you guessed wrong, you can correct it later (we'll get into this in the section on technical problems).

Avoiding doorway pages

Google has taken a stance on doorway pages, defined in their own words as follows:

Doorways are "pages created to rank highly for specific search queries".  Well shit, that's exactly what we're trying to do, isn't it?

Before we freak out, let's remember that Google's guidelines forbid any form of link building other than 'creating useful content'.

Here are some questions the folks at Google have given us - a little more in line with reality - to assess whether our pages are doorway pages:

  • Is the purpose to optimize for search engines and funnel visitors into the actual usable or relevant portion of your site, or are they an integral part of your site’s user experience?
  • Are the pages intended to rank on generic terms yet the content presented on the page is very specific?
  • Do the pages duplicate useful aggregations of items (locations, products, etc.) that already exist on the site for the purpose of capturing more search traffic?
  • Are these pages made solely for drawing affiliate traffic and sending users along without creating unique value in content or functionality?
  • Do these pages exist as an “island?” Are they difficult or impossible to navigate to from other parts of your site? Are links to such pages from other pages within the site or network of sites created just for search engines?

So yes, all SEO-oriented pages are doorway pages to some extent, but Google seems fine with them assuming that they have a real reason to exist and that each page has some sort of unique value.

Ideally, you can demonstrate to the algorithm that your landing pages have a reason to exist beyond capturing searchers.  In travel, for instance, it would be perfectly natural to present a curated list of destinations in a city, even for a visitor who came to your site from somewhere other than Google.  Plenty of sites get away with a lot less - just a ton of cheap handwritten content below the fold.

As you're creating these pages, ask yourself the following:

  • Is there a plausible reason for this page to exist were it not for Google?  If not, how can I arrange the page as such?
  • Is there some sort of 'value' that this page could add, other than me making money?  If not, would an algorithm interpret it as such?
  • How will a machine be able to know that this page is different from the thousands of others on my site?
  • What can I do to make this page linkable?  If you figure this out, you win.

To sum up, at a bare minimum, you need each landing page to have enough unique content.  To be truly defensible (and good for users), you need the page to have some sort of unique value, or at least be perceived that way.  And to really win, you want to give people a reason to link to your pages.

Strategies for filling out landing pages

The key challenge in programmatic SEO is getting enough unique, useful content to fill out thousands of landing pages.

Chances are, you'll use a few of the following elements to fill out your landing pages:

  • Photos
  • Business listings
  • Reviews
  • Price sheets
  • Questions and Answers
  • Product listings
  • Statistics
  • Relevant data
  • Bios
  • Anonymized user data
  • Curation
  • Maps
  • Calculators
  • ....the list goes on

All of your landing pages need to be filled out with unique content, at a bare minimum.

Ideally, all the content is super useful to users.  But those of us who live in reality and not some whitehat fairy land understand that what's good for humans isn't always what the search engines favor.

Knowing what fills out a landing page isn't the hard part, nor is coding the landing pages.  The real challenge is generating that content.  Companies that succeed at SEO build their entire business model and product in a manner that generates content that can be put into landing pages.  The business model and the landing page content strategy are two sides of the same coin.

Here are six high level, non-mutually exclusive strategies that companies use to create enough content to fill out thousands of landing pages.

Two-sided marketplace

Examples: Houzz, Yelp, Ebay, Care.com, Rover, Expedia, Thumbtack

Two-sided online marketplaces and SEO are practically synonymous.  On one hand, you have the vendors or businesses, who are incentivized to contribute content, as that is what will help them get more business through the platform.  Vendors can provide photos, listings, prices, business info, Q&A, and more.

On the other hand, you have consumers, who leave reviews - probably the most prevalent form of user generated content.  Additionally, the data collected through the marketplace's operations, whether it's prices, quote requests, or preferences, can be used to generate more content.

Many of these marketplaces are designed around a compare and contrast experience.  That same experience can create great landing pages.  Others, like HomeAdvisor and Thumbtack, are just designed to capture a lead, but they can create the facade of a compare and contrast experience while baiting visitors away into another flow.  Sneaky?  Sure, but it works.

Community

Examples: Stack Overflow, Quora, Chegg, Pinterest, Reddit, Wallstreet Oasis, Tripadvisor, every forum out there

Online communities naturally generate a ton of unique content that can be organized into user-friendly landing pages.

Quora and Stack Overflow are built around a Q&A experience, creating questions and several answers that themselves often rank on Google.  Additionally, both sites categorize their questions into topic pages to rank for broader terms.

Forums are designed around a conversation.  Reddit and Pinterest are designed around structured submission of third-party content, and comments on that content provide more written content.

Nextdoor, while a closed network, surfaces snippets of neighbor conversations to rank for neighborhoods and local business related searches.

Proprietary Data

Examples: Zillow, Walkscore, 42 Floors, Thumbtack's Small Business Survey

If your company inherently generates data - or you have access to data nobody else has, it can create great landing page content.  Zillow is the canonical example of this, creating a page about home values in a market for nearly every city.

One clear benefit of using proprietary data to fill out your landing pages is that it is easily linkable.

Curation

Examples: Indeed, Yellowpages, Coupons.com

Indeed appears at the top for nearly every job related search, but they actually collect very little of their own content.  Rather, the company runs its own web crawlers designed to get every job and put it into curated listings.

Note that Google generally frowns upon scraper sites, but Indeed has enough links and has done a decent enough job about putting the content together in a usable way that they're fine.  It's ill-advised to rely solely on content that exists elsewhere on the web - this is why Yellowpages doesn't win in any individual space.

e-Commerce

Examples: Amazon, Ebay, Wayfair, Creditcards.com, every e-commerce store out there

When users are searching for specific products, they typically would like some sort of top list that presents the specifications, benefits, drawbacks, and prices of each.  If you sell products online, you'll want to make sure you leverage all the data that comes with it.

Reviews are pretty much table stakes for e-commerce, so make sure you're collecting these.  Also, avoid Bazaarvoice for SEO purposes, as those reviews likely show up on other sites.  Proper product categorization is also key.

Where to go from there?  Many e-commerce SEOs go to great lengths to write custom descriptions and take custom photos of their products, especially when the product is not unique to their store.

Editorial

Examples: Eater, Medium, REI, Airbnb's neighborhood guides

If all else fails, you can always hand write content.  Eater is a great example that dominates in all sorts of food niches by hand writing lists of best [food type] in [city].  Though somewhat expensive, editorial content has a ton of advantages.  For one, you don't have to wait on users to create it, making it a great way to jumpstart a new site or product category.  Additionally, you can use a tool like Clearscope to determine exactly which keywords to include in it, giving you the best possible chance to rank.

Additionally, handwritten content can be a great way to beef up pages even if you already have plenty of UGC.  That's why Houzz buries these useless walls of text at the very bottom of their pages.

The practical shortcut to choosing a landing page design

Let's drop the 'focus on user intent' mantra for a second and get really practical.

By now you know all the big players in your space.  You have a sense of who is ranking well, and who isn't.  You also probably have a sense of which players seem to be punching above their weight class.  Chances are, some of those big players have even done a good amount of testing to figure out the optimal landing page.

Why not use their learnings to your advantage?

So, take a guess at which competitors have the best landing page experience.  Pay special attention to those that have similar business models as you do.  If you want, you can rank them and choose the top few.

Then copy them all.

Have your engineering team create identical clones of all of them, as best you can.  Since you may have different content to work with than some competitors, you may have to get creative.  Or, some may have different business models than you - one site may be designed to capture leads while another may be designed for browsing.  Do your best to imitate them, while still using your own design palette.

Then, split your landing pages across these templates and get the pages out in the wild and test them against one another.  If you have an established, authoritative website, you'll probably be able to get results within a couple weeks.  If it's a new site, it may be 6-12 months before you have an answer.

If your site is established, and is already playing around on page 1, you'll be able to measure traffic differences directly.  However, if you're less established, you'll want to measure rank as your primary indicator.  Additionally, you'll want to measure user engagement metrics like bounce rate and dwell time, as these are indicative of a good user experience and are highly correlated with rank / traffic. Finally, pay attention to conversion rate.  A traffic increase may mean nothing if conversion rate drops.
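For the established-site case, the before/after math against a control group is simple.  Here's a minimal sketch - all numbers are made up, and in practice you'd pull weekly clicks per URL group from Search Console:

```python
# Hypothetical weekly organic clicks per group, e.g. summed from Search Console.
# 'before' is the baseline window; 'after' covers the weeks following the change.
control = {"before": 12_400, "after": 12_900}
test = {"before": 11_800, "after": 13_600}

control_change = control["after"] / control["before"] - 1
test_change = test["after"] / test["before"] - 1

# Relative lift: how far the test group moved beyond the control group's drift,
# which nets out seasonality and algorithm updates that hit both groups equally.
lift = test_change - control_change
print(f"control {control_change:+.1%}, test {test_change:+.1%}, lift {lift:+.1%}")
```

A lift that persists over several weeks is a much stronger signal than a one-week spike.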

Ideally, one comes out as a clear winner across the board.  From there, you can run your own experiments on the pages.  Originality is highly overrated, at least when it comes to SEO.

And that's it for landing page creation.  Next week we'll cover all the technical challenges that come along with programmatic SEO.  Enter your email below and we'll let you know when that post is live.

Read more


Programmatic SEO (Pt 2): Competitive Analysis of the Search Results at Scale

[Image: keyword research competitive metrics]

Whether your site has been around for a while or you're diving into a new vertical, competitive analysis of the SERPs reveals who the big players are and where the opportunity is.  In this post, we won't go into too much depth on basic things like link analysis, as that has been covered ad nauseam.  Rather, we're going to focus on using rank data - at scale - to gain valuable insights into the competitive landscape.

This is Part 2 of a five-part series on programmatic SEO.

Part 1 – Large scale keyword research

Part 2 – Competitive analysis (this post)

Part 3 – Creating landing pages at scale

Part 4 – Building links

Part 5 – Dealing with inventory: the technical issues you need to watch out for

Dataset

The dataset we're going to use is a single day's worth of rank data covering a few thousand keywords in the campgrounds & RV park space.  The reason we'll use this space is that it's one where there is actually some amount of opportunity.  We paired locations with the following head terms to create this dataset: RV parks, camping, campgrounds.  So the keywords are all along the lines of 'Ocean City camping', 'Seattle RV parks', etc.

Some might proclaim that rank tracking is dead, but they're wrong.  Rankings are still the most straightforward and valuable data we have when it comes to sizing up competitors.

We're not going to use any competitive keyword metrics, simply because they are usually not useful when it comes to programmatic SEO.  These metrics rely on the number of links pointing specifically to a page, and with large scale sites, it's usually the case that zero links are pointing to a given landing page.

Analysis

Let's start by looking at who the top players are, as defined by having the most keywords in the top 10.

[Image: keywords in top 10]

As you can see, the top 10 results are clearly dominated by two players: Yelp and RV Park Reviews.  Yelp is no stranger to the local business space, and RV Park Reviews appears to be the default player here.

Note how there is a pretty large drop-off between #2 and #3 - beyond the top two, these serps are quite fragmented.

Just how fragmented?  Let's find out with a Pareto chart.  The blue bars indicate the number of top 10 keywords a domain has.  The domains are sorted from largest to smallest.  The orange line represents the cumulative % of the total.
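The tally behind a chart like this is simple.  A sketch in plain Python, assuming you've exported rank data as (keyword, domain, position) rows - the sample rows here are hypothetical:

```python
from collections import Counter

# Hypothetical one-day rank export: (keyword, domain, position) rows.
rows = [
    ("seattle rv parks", "yelp.com", 1),
    ("seattle rv parks", "rvparkreviews.com", 2),
    ("seattle camping", "yelp.com", 3),
    ("seattle camping", "hipcamp.com", 12),   # not top 10, so excluded below
    ("ocean city camping", "rvparkreviews.com", 1),
]

# Count top-10 keywords per domain, then sort largest to smallest (the bars).
top10 = Counter(domain for _, domain, pos in rows if pos <= 10)
ranked = top10.most_common()

# Running share of all top-10 slots (the cumulative % line).
total = sum(top10.values())
cum = 0
for domain, n in ranked:
    cum += n
    print(f"{domain}: {n} top-10 keywords, cumulative {cum / total:.0%}")
```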

[Image: Pareto chart of top 10 keywords]

As you can see, there is a very long tail when it comes to the top 10 results in this space.  The top 20 players only take up 58% of the serps, and #20 - Facebook - only takes up about 1% of top 10 results.  This indicates that the space is wide open.

The top 10 alone doesn't tell the whole story.  As we all know, there's a big difference between being top 10 and top 3.  The chart below is filtered down to the top 10 players, showing the distribution of keywords by position.

As you can see here, only Yelp and RV Park Reviews are consistently taking the top two shelf spots.  Surprisingly, Tripadvisor - the canonical example of SEO dominance - isn't faring so well.  It's important to notice these sorts of oddities when doing analysis, then drill in.

Looking at the actual results, it's quite clear that Tripadvisor is ranking for these terms by accident.  All of the results tend to be individual listings or forums.

Back to the analysis.  Though Yelp and RV Park Reviews have a ton of #1 and #2 spots, there are plenty of instances where they do not.  Who takes the top spot in these instances?  Let's switch up the visualization to find out.

Below, we have the top 3 results on the x-axis, with the grand total of the top 3 in the fourth column.  The bars are 100% distributions of those positions by domain.

As you can see, Yelp and RV Park Reviews take up almost half of the top 3.  But a good 40% of the top 3 is taken up by a very long tail of sites, many being individual campground locations and state park websites.

Another way to look at this is the following chart, where we look at the distribution of sites based on number of keywords that domain is ranking for in the entire dataset - a proxy for the size of the site.

Clearly most of the serp is taken up by giant sites, as indicated by the red bar.  These sites show up for 1000+ keywords in the dataset.  There is also a sizable chunk of the results taken up by smaller sites.  The shape of the red bars is quite interesting - Google seems to 'prefer' the aggregators, which, based on poking around, have a ton more link juice and a better UX than many of the smaller websites.  No shocker there.

No competitive analysis would be complete without a 2x2 matrix.  Below, we've created one based on two metrics.  Each data point represents a domain.  On the X-axis, we have the % of keywords in our dataset that a domain shows up for at all.  So if our dataset has 5,000 keywords and a domain shows up for 1,000, that domain would be at 20%.  This shows how many queries the domain is attempting to rank for.  On the Y-axis, we show the % of those keywords that rank in the top 10, i.e. on the first page.  This is an indicator of how effective that domain is at ranking on page one.
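Both axes fall out of the same rank export.  A sketch with hypothetical rows, where `all_keywords` stands in for the full tracked keyword set:

```python
from collections import defaultdict

# Hypothetical rank export: (keyword, domain, position) rows.
rows = [
    ("seattle rv parks", "yelp.com", 1),
    ("seattle camping", "yelp.com", 4),
    ("ocean city camping", "yelp.com", 35),
    ("seattle rv parks", "smallcamp.com", 2),
]
all_keywords = {"seattle rv parks", "seattle camping", "ocean city camping",
                "portland rv parks", "portland camping"}  # 5 tracked keywords

ranking = defaultdict(set)   # keywords a domain shows up for at all
top10 = defaultdict(set)     # of those, which are on page 1

for kw, domain, pos in rows:
    ranking[domain].add(kw)
    if pos <= 10:
        top10[domain].add(kw)

for domain in ranking:
    coverage = len(ranking[domain]) / len(all_keywords)   # X-axis
    hit_rate = len(top10[domain]) / len(ranking[domain])  # Y-axis
    print(f"{domain}: coverage {coverage:.0%}, page-1 rate {hit_rate:.0%}")
```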

This gives us four major categories:

Nailing a Niche (Top-left): These sites aren't trying to rank for many keywords, but for those that they are, they are ranking on page one.  Lots of campgrounds that only exist in one metro area.

Dominating the Space (Top-right): Here's where Yelp and RV Park reviews sit.   They have a page created to target nearly every keyword and a majority of them are ranking on the first page.

Not even trying (Bottom-left):  These are sites that have few pages created to target the keywords, and a minority of those pages are ranking on page 1.  You see a lot of .gov sites and tangentially related sites.

Trying but struggling / indifferent (Bottom-right):  These domains have pages ranking for a majority of the keywords, but few of those pages make it to page 1.  Here you have both up and coming sites that are trying but haven't earned a lot of page 1 rankings, and those sites that are just really large but probably aren't making a concerted effort to go after the terms (Facebook being the perfect example).

Here's the chart with a few labels added.

Finally, let's see which players are going after which search terms.  Remember, in this dataset we paired the following head terms with city / state: campgrounds, camping, RV parks.

[Image: different keywords, different sites]

There are quite a few insights we can glean from this, such as:

  • Hipcamp is ignoring 'RV Parks'
  • Yellowpages is ignoring 'Camping'
  • GoodSam does really well at 'RV Parks'

As one could imagine, there's probably overlap between these three terms, but each term probably has a different intent.  You'll have to consider your business model, as well as the competitiveness of the SERPs when deciding which to focus on.

This leads us to ask a very important question: which keywords does Google consider the same intent?  It is ill-advised to create multiple pages for the same intent, as you'll be needlessly wasting link equity.  We'll cover that in the next post, on creating a landing page for each searcher intent.

Next steps

Using nothing more than rank data for 5,000 or so keywords, we were able to get a pretty clear picture of what the competitive landscape is for a whole industry.  However, there's no need to stop here.  For starters, it's important to drill into nuances or oddities you may uncover.

Additionally, you might want to incorporate additional datapoints such as domain rating, site speed, search volume, US Census data, etc. to enrich your analysis.

Finally, you should pay a lot of attention to the sites that seem to be dominating or up-and-coming.  What are common patterns in their UX?  Where are they getting their links?  What are they using for title tags?  This is the more traditional competitive analysis in SEO.

Next week we'll talk about how to go about creating your landing pages at scale, programmatically.  Subscribe below to get notified when that post comes out.

Read more


Programmatic SEO (Pt 1): How to Do Keyword Research at Scale

[Image: keyword research at scale]

When I think SEO, I think of two varieties.  One is content-first - basically producing medium to long form informational content topically related to what your target audience wants to learn.  Think Hubspot or Intercom.  This is what most of the content in the SEO world is written about.  The other is large-scale programmatic SEO, typically focused on scalably creating landing pages that rank for transactional intent.

In this series of posts, we're going to drill down and give a no-BS, step by step process for doing the more programmatic sort of SEO.  You'll mostly see this variety in consumer facing aggregators like Tripadvisor, Yelp, and Bankrate, but it's also pretty common in e-commerce, and there are a couple of B2B examples such as Software Advice.

This variety of SEO is much less about producing long-form authoritative content that educates and builds brand, and more about creating high volumes of unique, user-friendly landing pages targeted at transactional terms.

Here's what we'll cover in this series.

Part 1 - Large scale keyword research

How to do keyword research at scale, and use it to guide your landing page creation.

Part 2 - Competitive analysis

No, this isn't the same rehashed crap showing you how to use Ahrefs or SEMRush.  In fact it has nothing to do with link profile at all.

This is about using data to determine who the big players are and to discover the whitespace in your vertical.

Part 3 - Creating landing pages at scale

Here we'll dive into how to create a landing page for each searcher intent.

Part 4 - Building links

It's not so easy to build links to your money pages in programmatic SEO.  This is how you get around that.

Part 5 - Dealing with inventory: the technical issues you need to watch out for

In B2B SaaS you can get a long way with a simple WordPress theme.  With programmatic SEO, there are a number of things to watch out for, technically speaking.

Large Scale Keyword Research

Keyword research is nothing new, but here we're going to apply the same concepts at scale.  This section assumes a basic understanding and intuition around keyword research.

1) Find your head terms

In nearly every sort of programmatic SEO, there are what we call head terms.  These are the broad level categories you'll be trying to rank for.  Here are some examples:

  • Yelp: Restaurants, Gyms, Yoga Studios
  • Bankrate: Credit cards, bank accounts
  • Zillow: Real estate, homes for sale
  • Tripadvisor: Hotels, Things to Do
  • JCPenney: T-Shirts, Jeans, Polo Shirts

Typically head terms contain a great deal of search volume, but are also often searched with modifiers (more on this later).  Additionally, some head terms may be a parent or child of other terms.

To get a sense of relative volume, you can use Google Trends or a keyword research tool.  I like Google Trends for a quick glance, as it also shows seasonality.

[Image: relative search volume in Google Trends]

It's important to flesh out as many head terms as you can think of.  In the vacation rental space, you could start with the following:

  • vacation rentals
  • beach house rentals
  • beach houses
  • vacation homes

Then, try to flesh out the list as much as possible.  Include singulars and plurals separately.  It doesn't matter whether something has volume or not - we'll deal with that later.  Here are some common, well known tools / methods to flesh out those head terms:

  • Ubersuggest
  • Keywords Everywhere
  • Google Adwords Planner
  • Ahrefs, SEMRush, Moz, ect
  • People also search for

Additionally, if you already have competitors in the space, look at their category pages and the keywords they use in their title tags.  Add any terms that stand out to the list.

Finally, search for some of the keywords (feel free to add a modifier), and see what title tags are being shown.  Pay very close attention to the bolded text as Google clearly sees that as a closely related term.

2) Figure out your modifiers

While most of the head terms will have a ton of search volume, it's quite likely that the real volume comes with the head term in conjunction with one or more modifiers.

Now, we'll use the same tools and methods that are listed above to figure out all of our modifiers.

Primary vs secondary modifiers

It's important to at least have a best guess at what your primary and secondary modifiers are.  To explain this, here are some examples:

Primary Modifiers:

  • Shirts: v neck shirts, button down shirts, dress shirts
  • Credit cards: rewards credit cards, travel credit cards, cash back credit cards
  • Restaurants: Thai restaurants, Mexican restaurants, fine dining restaurants
  • House cleaner: Philadelphia house cleaner, New York house cleaner, house cleaner in Lancaster PA

Secondary Modifiers:

  • Shirts: best shirts, comfortable shirts, affordable shirts
  • Credit cards: best credit cards, credit cards for low credit
  • Restaurants: best restaurants, closest restaurants, cheapest restaurants

Primary modifiers tend to indicate a whole new category.  Thai restaurants is a category in and of itself.  Additionally, they are usually mutually exclusive.  You probably won't find many Thai Mexican restaurants.

Secondary modifiers can either modify the head term, or the head term + a primary modifier.  For example, you could easily imagine people searching best Thai restaurants or affordable button-down shirts.  Generally speaking, you don't need to worry about getting volume for or tracking keywords with secondary modifiers.

Local: the easy case

If your business is targeting local intent, it's pretty easy to figure out your modifiers as they will likely be [head term] + [location].  The nature of your business will probably dictate whether that location is state (or state equivalent if you live outside America), city, or even neighborhood.  Nobody searches for the best Thai Food in Pennsylvania, but they may search for the cheapest car insurance in PA.

If you're city level, be sure you include city and city + state in your list of modifiers (we'll show how to put this all together).

There are plenty of places you can get this data.  The US Census is a good start, but won't have every little town and won't have neighborhoods.  Wikipedia often has neighborhoods, and typically every incorporated and unincorporated city / town within a county.  Nextdoor has neighborhoods, but often they aren't the names people search.  Use a virtual assistant to find as many as possible.

Also, you will often find a great deal of volume for [head term] + "near me".  Usually Google picks up on this.

3) Putting it all together with Python

Now, we'll want to go ahead and put our modifiers together to get a large list of keywords.  If you're at a large scale, it's not unreasonable to be tracking 10-200k keywords, but you can still get some good insights with as little as 2,000.

Our goal here is to get every possible permutation of head terms & modifiers.  In most cases, you can probably skip the secondary modifiers, as Google tends to show the same pages for modified terms as for unmodified ones, though there are exceptions.

For illustrative purposes, here is a script focused on the vacation home rental space.  In reality, there would be far more cities and states, and probably more head terms.  Additionally, you'd probably want to import and export CSVs, but you get the point.
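The original script is embedded as an image in the post; here's a minimal sketch of what such a script might look like.  The head terms and city/state pairs below are illustrative placeholders, not the post's actual data:

```python
from itertools import product

# Illustrative head terms and locations -- a real run would load far
# longer lists from CSVs.
head_terms = ["vacation rentals", "cabin rentals", "beach house rentals"]
locations = [
    ("Gatlinburg", "TN"),
    ("Destin", "FL"),
    ("Lake Tahoe", "CA"),
]

keywords = []
for term, (city, state) in product(head_terms, locations):
    # Tag each permutation with its head term and location so the
    # columns can later be pivoted in Excel or SQL.
    keywords.append((term, city, state, f"{term} {city}"))
    keywords.append((term, city, state, f"{city} {term}"))
    keywords.append((term, city, state, f"{term} {city} {state}"))

for head, city, state, kw in keywords:
    print(f"{head}\t{city}\t{state}\t{kw}")
```

Each head term gets crossed with each location in three common phrasings, and every row carries its head-term and location tags along with it.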

As you can see, when you run this, you get an exhaustive list of keyword options.

Then, just paste that into Excel using the text import functionality, and you've got a comprehensive list of keywords.  Additionally, you have effectively created keyword tags.  One tag is your head term, the other is your location.

Things to be careful of

Are modifiers universal or specific to certain head terms?

Sometimes you may have modifiers that don't make sense for every one of your head terms.

In the finance space, for example, the modifier "travel" goes well with "credit cards", but doesn't go well with "checking accounts".  We'd recommend simply creating separate sets of modifiers for each base term.

Include secondary modifiers?

Including secondary modifiers can be helpful, but is not necessary.  If you do, make sure you have a column in your output file for the modifier.

Don't know Python?

It's pretty easy to do things like this, and there are plenty of places to learn.  Python comes pre-installed on Macs, and you can use this exact code as a starting point.

Prefixes vs. suffixes

In many cases, you might have to classify modifiers as prefixes or suffixes, and go from there.  For example, "rewards credit cards" has a prefix modifier, but "credit cards for travel" has a suffix.
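One way to handle this in a permutation script is to carry a position flag alongside each modifier.  The `position` convention and the finance-space modifiers below are our own illustration, not from the original post:

```python
# Each modifier carries a position flag so the combiner knows where it goes.
# These modifiers are illustrative examples from the finance space.
modifiers = [
    ("rewards", "prefix"),
    ("for travel", "suffix"),
    ("best", "prefix"),
]

def apply_modifier(head_term, modifier, position):
    """Attach a modifier before or after the head term."""
    if position == "prefix":
        return f"{modifier} {head_term}"
    return f"{head_term} {modifier}"

results = [apply_modifier("credit cards", mod, pos) for mod, pos in modifiers]
print(results)
```

This keeps the phrasing natural for every modifier without special-casing each keyword.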

4) Getting search volume

Yes, Rand Fishkin, we know the AdWords tool doesn't give accurate data, but we don't need accurate.  What we need is directional data we can use for comparison.

Most tools - including the AdWords planner - have a pretty low keyword limit.  So, you'll want to have a virtual assistant break these into chunks and run them, OR just use Keyword Keg, which allows for large-scale batches at a pretty reasonable cost.

Once you do that, you have a list of keywords with search volume, but you need to reconnect it to your tagged list of keywords.  Perfect job for a VLOOKUP.  If your dataset is getting too large for Excel (~50k rows), or you're data geeks like us, you can also upload to SQL tables and join.
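If you'd rather stay in Python than Excel, a plain dictionary lookup does the same job as VLOOKUP.  The keywords and volume numbers below are made-up illustrations of the two files you'd be joining:

```python
# Tagged keyword list: (head term, city) keyed by keyword string.
# Keywords and tags here are illustrative.
tagged = {
    "dog boarding philadelphia": ("dog boarding", "Philadelphia"),
    "dog walker pittsburgh": ("dog walker", "Pittsburgh"),
}

# Volume export from the keyword tool (illustrative numbers).
volume = {
    "dog boarding philadelphia": 1300,
    "dog walker pittsburgh": 320,
}

# The VLOOKUP equivalent: look each keyword up, defaulting to 0 volume
# for keywords the tool didn't report on.
rows = [
    (kw, head, city, volume.get(kw, 0))
    for kw, (head, city) in tagged.items()
]
print(rows)
```

From here, writing `rows` out to a CSV gives you the same tagged, volume-appended file you'd get from the spreadsheet route.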

Now you have a pretty comprehensive set of keywords, tagged with head terms and modifiers, with search volume appended so that you can pivot to gain all sorts of valuable insights.

5) Visualizing search volume

Now that we have a giant data-set of keywords and volume, let's play around and see what we can learn.  We'll be using a dummy data-set focused on dog-related services.  See our head terms and locations (we kept it to Philadelphia, Pittsburgh, and Columbus for simplicity).

And the script to put it together:
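The script itself appears as an image in the original; a minimal sketch under the same assumptions (head-term list abbreviated, keyword phrasings our own choice):

```python
from itertools import product

# Dog-services head terms and the three metros from the dummy data-set.
head_terms = ["dog boarding", "dog walker", "dog sitter", "doggy daycare"]
cities = [("Philadelphia", "PA"), ("Pittsburgh", "PA"), ("Columbus", "OH")]

rows = []
for term, (city, state) in product(head_terms, cities):
    # Three common phrasings per head term / city combination.
    for kw in (f"{term} {city}", f"{city} {term}", f"{term} {city} {state}"):
        rows.append((term, city, kw))

for term, city, kw in rows:
    print(f"{term}\t{city}\t{kw}")
```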

We used Keyword Keg to get all the volume, joined the keywords back to their head terms, and here we go.  First, let's check out which head term tends to have the most search volume.  The chart below groups all of the search volume by head term.  So volume for keywords 'philadelphia dog boarding', 'dog boarding pittsburgh pa', and 'columbus oh dog boarding' will all roll up into the 'dog boarding' category.

What can we see here?  A lot, actually.  For one, dog boarding dwarfs the rest of the terms.  Additionally, there are plenty of terms almost not worth paying attention to, or at least worth rolling up into other categories - everything from dog walking services onward.  One super interesting thing is that 'dog walkers' has slightly more volume than 'dog walker', and 'dog sitters' has nearly as much volume as 'dog sitter'.  This implies that people are searching for lists, or are in the mode to comparison shop.  That's a hint that we might want to design our landing pages around a comparison shopping experience.  It also means I'd bet money that Rover is kicking Wag's ass in the SERPs, which we'll find out in the next post.  Finally, the head term 'doggy daycare' has a sizable chunk of volume, so let's be sure to sprinkle that keyword in.

Since Google generally interprets plurals pretty well, let's group the singular and plural together.  We'll also be risky and assume Google interprets 'doggy daycare' and 'dog daycare' the same.  We haven't done anything to prove this yet, but hey, why not live life on the edge once in a while?
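In practice, that grouping is just a mapping applied before you pivot.  Both the mapping (which terms Google treats as equivalent) and the volume numbers below are our own illustrative assumptions:

```python
# Collapse plurals and spelling variants into one canonical head term.
# Which terms Google actually treats as equivalent is an assumption here.
canonical = {
    "dog walkers": "dog walker",
    "dog sitters": "dog sitter",
    "dog daycare": "doggy daycare",
}

# Illustrative volumes per head term (not real data).
volumes = {
    "dog walker": 4400, "dog walkers": 5400,
    "dog sitter": 2900, "dog sitters": 2400,
    "doggy daycare": 1900, "dog daycare": 1600,
}

grouped = {}
for term, vol in volumes.items():
    key = canonical.get(term, term)  # fall back to the term itself
    grouped[key] = grouped.get(key, 0) + vol

print(grouped)
```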

Now that's a nice manageable set of head terms.  Now you may be asking, shouldn't we do more grouping?  Such as 'dog boarding' and 'pet boarding'.  Sure, we could shoot from the hip, do a handful of searches and make that judgement call.  But wouldn't we rather use data?  We'll do that in the next post.  But for now, let's nerd out on data and see what else we can learn.

Here's a view showing the record count by search volume.  Shocker, Google is telling us that most terms have zero search volume.

Of course, we know big G is full of shit, so we'll just exclude those and move on.

Here's a great chart for visualizing large-scale keyword research.  It's known as a Pareto chart.  The blue line shows the volume for each city.  The red line shows the cumulative contribution to the total.  The way to interpret this is that the top 4 cities in the Columbus metro area account for 77.7% of the total volume.  This might inform where you focus your efforts.
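The cumulative line on a Pareto chart is easy to compute yourself.  The city volumes below are illustrative stand-ins, not the post's actual data:

```python
# Search volume per city in a metro, sorted descending (illustrative numbers).
city_volume = [
    ("Columbus", 5000),
    ("Dublin", 1200),
    ("Westerville", 900),
    ("Hilliard", 670),
    ("Grove City", 400),
    ("Gahanna", 300),
]

total = sum(vol for _, vol in city_volume)

# Walk down the sorted list, accumulating each city's share of the total.
cumulative = []
running = 0
for city, vol in city_volume:
    running += vol
    cumulative.append((city, vol, running / total))
    print(f"{city}: {vol} ({running / total:.1%} cumulative)")
```

Plotting the per-city volumes as bars and the cumulative share as a line gives you the chart described above.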

It's quite common that some head terms are super concentrated in the head, while others are a total long tail.  Do keep in mind that because Google shows lots of long tail as zero, the effects may be skewed towards the head.

Finally, which search pattern do people use most?  Is it city + state + keyword, keyword + city, or something else?

The data shows that 'keyword + city' and 'city + keyword' are the most common types.  Google is pretty good about interpreting intent these days, so this isn't something to sweat, but even in 2018, sprinkling the right phrasing in can get you itty-bitty wins.

Conclusion

Now we've created a pretty exhaustive list of potential keywords.  They all have search volume attached, and we've learned a thing or two about how people search.

In the next post, we'll really get into the meat of it, as we analyze the actual SERPs to do an assessment of the competitive landscape.  Want us to let you know when that comes out?  Be a doll and join our mailing list below.
