Whether your site has been around for a while or you’re diving into a new vertical, competitive analysis of the SERPs reveals who the big players are and where the opportunity is. In this post, we won’t go deep on basics like link analysis, which has been covered ad nauseam. Instead, we’re going to focus on using rank data, at scale, to gain valuable insights into the competitive landscape.
This is Part 2 of a five-part series on programmatic SEO.
Part 2 – Competitive analysis (this post)
Part 3 – Creating landing pages at scale
Part 4 – Building links
Part 5 – Dealing with inventory: the technical issues you need to watch out for
The dataset we’re going to use is a single day’s worth of rank data covering a few thousand keywords in the campgrounds and RV park space. We chose this space because it’s one where there is actually some opportunity. We paired locations with the following head terms to create this dataset: RV parks, camping, campgrounds. So the keywords are all along the lines of ‘Ocean City camping’, ‘Seattle RV parks’, etc.
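As a minimal sketch of how such a keyword list could be assembled, the snippet below pairs every location with every head term. The two locations are placeholders taken from the examples above, and `build_keywords` is a hypothetical helper name, not part of any rank-tracking tool.

```python
from itertools import product

# Placeholder inputs: the real dataset paired far more locations.
locations = ["Ocean City", "Seattle"]
head_terms = ["RV parks", "camping", "campgrounds"]


def build_keywords(locations, head_terms):
    """Pair every location with every head term, e.g. 'Seattle RV parks'."""
    return [f"{loc} {term}" for loc, term in product(locations, head_terms)]


keywords = build_keywords(locations, head_terms)
print(len(keywords))  # 2 locations x 3 head terms
for kw in keywords:
    print(kw)
```

The resulting list is what you would hand to whatever rank tracker you use.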
Some might proclaim that rank tracking is dead, but they’re wrong. Rankings are still the most straightforward and valuable data we have when it comes to sizing up competitors.
We’re not going to use any competitive keyword metrics, simply because they are usually not useful when it comes to programmatic SEO. These metrics rely on the number of links pointing to a specific page, and on large-scale sites it’s usually the case that zero links point to a given landing page.
Let’s start by looking at who the top players are, as defined by having the most keywords in the top 10.
As you can see, the top 10 results are clearly dominated by two players: Yelp and RV Park Reviews. Yelp is no stranger to the local business space, and RV Park Reviews appears to be the default player in this vertical.
Note how there is a pretty large drop-off between #2 and #3, which indicates that these SERPs are quite fragmented.
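For readers who want to reproduce this kind of ranking from raw rank data, here is a rough sketch. It assumes each row is a `(keyword, domain, position)` tuple. The rows and the `.example` domains are made up for illustration, and `top10_keyword_counts` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up rank rows: (keyword, domain, position).
rows = [
    ("seattle camping", "a.example", 1),
    ("seattle camping", "b.example", 4),
    ("seattle camping", "c.example", 12),
    ("ocean city camping", "a.example", 2),
    ("ocean city camping", "a.example", 7),  # second URL for the same keyword
    ("ocean city camping", "b.example", 15),
]


def top10_keyword_counts(rows, cutoff=10):
    """Count distinct keywords per domain with a result at or above the cutoff."""
    seen = defaultdict(set)
    for keyword, domain, position in rows:
        if position <= cutoff:
            seen[domain].add(keyword)
    counts = {domain: len(kws) for domain, kws in seen.items()}
    # Largest first; ties broken alphabetically so the order is stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = top10_keyword_counts(rows)
for domain, n in ranking:
    print(domain, n)
```

Counting distinct keywords rather than raw rows keeps a domain with two URLs on the same SERP from being counted twice for that keyword.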
Just how fragmented? Let’s find out with a Pareto chart. The blue bars indicate the number of top 10 keywords a domain has. The domains are sorted from largest to smallest. The orange line represents the cumulative % of the total.
As you can see, there is a very long tail when it comes to the top 10 results in this space. The top 20 players only take up 58% of the SERPs, and #20, Facebook, only takes up about 1% of top 10 results. This indicates that the space is wide open.
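The numbers behind a Pareto chart are just a descending sort plus a running total. The sketch below shows that calculation on made-up counts; the domains and figures are illustrative only and `pareto` is a hypothetical helper.

```python
from itertools import accumulate


def pareto(counts):
    """Return (domain, count, cumulative_share) tuples, largest domain first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(counts.values())
    running = accumulate(count for _, count in ordered)
    return [(d, c, cum / total) for (d, c), cum in zip(ordered, running)]


# Made-up top 10 keyword counts per domain.
counts = {"a.example": 50, "b.example": 30, "c.example": 15, "d.example": 5}
table = pareto(counts)
for domain, count, cumulative in table:
    print(f"{domain:12} {count:4} {cumulative:6.0%}")
```

Reading the cumulative share at row 20 of the real table is how you would arrive at a figure like the 58% quoted above.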
The top 10 alone doesn’t tell the whole story. As we all know, there’s a big difference between being top 10 and top 3. The chart below is filtered down to the top 10 players, showing the distribution of keywords by position.
As you can see here, only Yelp and RV Park Reviews are consistently taking the top two spots. Surprisingly, Tripadvisor, the canonical example of SEO dominance, isn’t faring so well. It’s important to notice these sorts of oddities when doing analysis, and then drill in.
Looking at the actual results, it’s quite clear that Tripadvisor is ranking for these terms by accident. Its ranking pages tend to be individual listings or forum threads.
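The position breakdown discussed above can be tallied with a counter per domain. This is a sketch on made-up rows in `(keyword, domain, position)` form; `position_distribution` is a hypothetical helper and the `.example` domains stand in for real sites.

```python
from collections import Counter

# Made-up rank rows: (keyword, domain, position).
rows = [
    ("seattle camping", "a.example", 1),
    ("seattle camping", "b.example", 2),
    ("seattle camping", "c.example", 9),
    ("ocean city camping", "b.example", 1),
    ("ocean city camping", "a.example", 2),
    ("ocean city camping", "c.example", 8),
    ("ocean city rv parks", "a.example", 1),
    ("ocean city rv parks", "c.example", 14),
]


def position_distribution(rows, domains, max_position=10):
    """For each listed domain, count results at each position up to max_position."""
    dist = {domain: Counter() for domain in domains}
    for _, domain, position in rows:
        if domain in dist and position <= max_position:
            dist[domain][position] += 1
    return dist


dist = position_distribution(rows, ["a.example", "c.example"])
for domain, counter in dist.items():
    # One column per position 1..10; missing positions count as 0.
    print(domain, [counter[p] for p in range(1, 11)])
```

A domain whose counts cluster at positions 8 to 10, like `c.example` here, is the kind of oddity worth drilling into.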
Back to the analysis. Though Yelp and RV Park Reviews have a ton of #1 and #2 spots, there are plenty of instances where they do not. Who takes the top spot in these instances? Let’s switch up the visualization to find out.
Below, we have the top 3 results on the x-axis, with the grand total of the top 3 in the fourth column. The bars are 100% distributions of those positions by domain.
As you can see, Yelp and RV Park Reviews take up almost half of the top 3. But a good 40% of the top 3 is taken up by a very long tail of sites, many being individual campground locations and state park websites.
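A 100% distribution like this one is each domain’s count at a position divided by the total results at that position. Here is a sketch of that calculation, again on made-up `(keyword, domain, position)` rows with a hypothetical `top3_share` helper.

```python
from collections import Counter

# Made-up rank rows: (keyword, domain, position).
rows = [
    ("seattle camping", "a.example", 1),
    ("seattle camping", "b.example", 2),
    ("seattle camping", "c.example", 3),
    ("ocean city camping", "b.example", 1),
    ("ocean city camping", "a.example", 2),
    ("ocean city camping", "d.example", 3),
]


def top3_share(rows):
    """Share of results per domain at positions 1, 2, 3 and across all three."""
    by_pos = {1: Counter(), 2: Counter(), 3: Counter(), "top3": Counter()}
    for _, domain, position in rows:
        if position in (1, 2, 3):
            by_pos[position][domain] += 1
            by_pos["top3"][domain] += 1
    shares = {}
    for key, counter in by_pos.items():
        total = sum(counter.values())
        shares[key] = {d: n / total for d, n in counter.items()} if total else {}
    return shares


shares = top3_share(rows)
for key, dist in shares.items():
    print(key, {d: round(s, 2) for d, s in dist.items()})
```

On real data you would likely collapse everything outside the top handful of domains into an "other" bucket before plotting, which is where the long tail shows up.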
Another way to look at this is the following chart, where we look at the distribution of sites based on the number of keywords each domain ranks for in the entire dataset, which serves as a proxy for the size of the site.
Clearly most of the SERP is taken up by giant sites, as indicated by the red bar. These sites show up for 1000+ keywords in the dataset. There is also a sizable chunk of the results taken up by smaller sites. The shape of the red bars is quite interesting: Google seems to ‘prefer’ the aggregators, which, based on poking around, have a ton more link juice and a better UX than many of the other websites. No shocker there.
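One way to build this view is to bucket each domain by how many keywords it appears for, then measure what share of top 10 results each bucket holds. In the sketch below, only the "1000+" bucket comes from the text above. The other bucket edges, the helper names, and the rows are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Only "1000+" comes from the post; the other edges are assumed for illustration.
BUCKETS = [(1000, "1000+"), (100, "100-999"), (10, "10-99"), (1, "1-9")]


def size_bucket(n_keywords):
    """Map a domain's keyword count to a size bucket label."""
    for floor, label in BUCKETS:
        if n_keywords >= floor:
            return label
    raise ValueError("a ranking domain must have at least one keyword")


def share_by_site_size(rows, cutoff=10):
    """Share of results at or above the cutoff held by each size bucket."""
    keywords_per_domain = defaultdict(set)
    for keyword, domain, _ in rows:
        keywords_per_domain[domain].add(keyword)
    results = Counter()
    for _, domain, position in rows:
        if position <= cutoff:
            results[size_bucket(len(keywords_per_domain[domain]))] += 1
    total = sum(results.values())
    return {label: n / total for label, n in results.items()}


# Made-up rows: one giant site and one small site.
rows = [(f"keyword {i}", "big.example", 1) for i in range(1000)]
rows += [(f"keyword {i}", "small.example", 2) for i in range(5)]
shares = share_by_site_size(rows)
print(shares)
```

Running the same function once per position instead of once for the whole top 10 would give one bar per position.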
No competitive analysis would be complete without a 2×2 matrix. Below, we’ve created a 2×2 matrix based on two metrics. Each data point represents a domain. On the X-axis, we have the % of keywords in our dataset that a domain shows up for at any position. So if our dataset has 5,000 keywords and a domain shows up for 1,000, that domain would be at 20%. This shows how many queries that domain is attempting to rank for. On the Y-axis, we show the % of those keywords that rank in the top 10, i.e. on the first page. This is an indicator of how effective that domain is at ranking on the first page.
This gives us four major categories:
Nailing a niche (top-left): These sites aren’t trying to rank for many keywords, but the ones they do target rank on page one. This quadrant holds lots of campgrounds that only exist in one metro area.
Dominating the space (top-right): Here’s where Yelp and RV Park Reviews sit. They have a page created to target nearly every keyword, and a majority of those pages are ranking on the first page.
Not even trying (bottom-left): These are sites that have few pages created to target the keywords, and a minority of those pages are ranking on page one. You see a lot of .gov sites and tangentially related sites.
Trying but struggling / indifferent (bottom-right): These domains have pages ranking for a majority of the keywords, but few of those pages make it to page one. Here you have both up-and-coming sites that are trying but haven’t earned a lot of page one rankings, and sites that are just really large but probably aren’t making a concerted effort to go after the terms (Facebook being the perfect example).
Here’s the chart with a few labels added.
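The two axes and four quadrants can be computed directly from rank rows. The sketch below assumes `(keyword, domain, position)` rows and splits both axes at 50%, which is an assumption based on the "majority" and "minority" wording above rather than a threshold stated for the chart. The helper names and sample rows are hypothetical.

```python
from collections import defaultdict


def matrix_points(rows, total_keywords, cutoff=10):
    """Return {domain: (coverage, first_page_rate)} for the 2x2 matrix."""
    ranked = defaultdict(set)
    first_page = defaultdict(set)
    for keyword, domain, position in rows:
        ranked[domain].add(keyword)
        if position <= cutoff:
            first_page[domain].add(keyword)
    return {
        domain: (len(kws) / total_keywords, len(first_page[domain]) / len(kws))
        for domain, kws in ranked.items()
    }


def quadrant(coverage, first_page_rate, split=0.5):
    """Name the quadrant; the 50% split is an assumed threshold."""
    if coverage >= split:
        if first_page_rate >= split:
            return "Dominating the space"
        return "Trying but struggling / indifferent"
    if first_page_rate >= split:
        return "Nailing a niche"
    return "Not even trying"


# Made-up rows over a dataset of 4 keywords.
rows = [
    ("kw1", "a.example", 1), ("kw2", "a.example", 2),
    ("kw3", "a.example", 5), ("kw4", "a.example", 30),
    ("kw1", "b.example", 3),
    ("kw1", "c.example", 40), ("kw2", "c.example", 50), ("kw3", "c.example", 8),
    ("kw2", "d.example", 30),
]
points = matrix_points(rows, total_keywords=4)
for domain, (coverage, rate) in sorted(points.items()):
    print(domain, round(coverage, 2), round(rate, 2), quadrant(coverage, rate))
```

The worked example from the text checks out the same way: a domain ranking for 1,000 of 5,000 keywords has a coverage of 1000 / 5000 = 0.2.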
Finally, let’s see which players are going after which search terms. Remember, in this dataset we used the following head terms (paired with a city or state): campgrounds, camping, RV parks.
There are quite a few insights we can glean from this, such as:
- Hipcamp is ignoring ‘RV Parks’
- Yellowpages is ignoring ‘Camping’
- GoodSam does really well at ‘RV Parks’
As one could imagine, there’s probably overlap between these three terms, but each term probably has a different intent. You’ll have to consider your business model, as well as the competitiveness of the SERPs when deciding which to focus on.
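To get a per-domain breakdown by head term, you can classify each keyword by the head term it ends with and count top 10 results. This sketch assumes keywords are formed as location followed by head term, as in this dataset. The rows, `.example` domains, and helper names are made up.

```python
from collections import Counter, defaultdict

HEAD_TERMS = ("rv parks", "campgrounds", "camping")


def head_term_of(keyword):
    """Return the head term a keyword ends with, or None if there is no match."""
    lowered = keyword.lower()
    for term in HEAD_TERMS:
        if lowered.endswith(term):
            return term
    return None


def top10_by_head_term(rows, cutoff=10):
    """Count results at or above the cutoff per domain and head term."""
    table = defaultdict(Counter)
    for keyword, domain, position in rows:
        term = head_term_of(keyword)
        if term is not None and position <= cutoff:
            table[domain][term] += 1
    return table


# Made-up rank rows: (keyword, domain, position).
rows = [
    ("Seattle RV parks", "a.example", 1),
    ("Seattle camping", "a.example", 3),
    ("Seattle campgrounds", "a.example", 25),
    ("Seattle camping", "b.example", 2),
    ("Ocean City camping", "b.example", 4),
]
table = top10_by_head_term(rows)
for domain, counter in table.items():
    print(domain, {term: counter[term] for term in HEAD_TERMS})
```

A zero in one column for an otherwise strong domain is the signal that it is ignoring that head term.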
This leads us to a very important question: which keywords does Google consider to have the same intent? It is ill-advised to create multiple pages for the same intent, as you’ll be needlessly wasting link equity. We’ll cover that in the next post on creating a landing page for each searcher intent.
Using nothing more than rank data for 5,000 or so keywords, we were able to get a pretty clear picture of what the competitive landscape is for a whole industry. However, there’s no need to stop here. For starters, it’s important to drill into nuances or oddities you may uncover.
Additionally, you might want to incorporate additional data points such as domain rating, site speed, search volume, US Census data, etc. to enrich your analysis.
Finally, you should pay a lot of attention to the sites that seem to be dominating or up-and-coming. What are common patterns in their UX? Where are they getting their links? What are they using for title tags? This is the more traditional competitive analysis in SEO.
Next week we’ll talk about how to go about creating your landing pages at scale, programmatically. Subscribe below to get notified when that post comes out.