Marie Haynes’ excellent analysis was the inspiration for me to share some thoughts I’ve alluded to on Twitter! Although I’m retired, core updates are still a part of SEO that I am curious about, and I still form hypotheses about what is happening.
Causes of SERP Volatility Unrelated to Algorithmic Updates
For the past couple of years, the tools measuring SERP volatility have been terrible at “warning” of core update activity. IMO, one of the reasons for the ineffective measurement is the unprecedented number of innovations to SERP construction being made by Google on an almost daily basis.
The tools “supposedly” measuring algorithmic activity are still measuring volatility as if the SERPs were the “ten blue links” of years past. Nothing could be further from the truth!
So the tools are now measuring the volatility caused by changes to SERP construction as much as the volatility from algorithmic updates, of which Google is constantly making smaller ones.
If my memory is correct, these experiments and algo adjustments have increased fivefold or more since the days of Universal Search. This too would cause significant flux in results. The activity could even be enough to spike volatility if the affected SERPs covered as little as 1% of queries: likely not enough to warrant announcing a core update, but enough to create a lot of volatility or flux in SERPs.
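To make that concrete, here’s a toy Python sketch of how a naive position-based volatility score gets inflated purely by SERP construction changes. The function, the snapshots, and the scoring are my own illustration and assumptions, not how any real volatility tool actually works.

```python
# Toy volatility score: compare two snapshots of the same query's top results.
# My own illustration; NOT how Semrush, Mozcast, or any real tool computes its index.

def volatility(serp_a, serp_b):
    """Fraction of positions whose occupant changed between two SERP snapshots."""
    changed = sum(1 for a, b in zip(serp_a, serp_b) if a != b)
    return changed / max(len(serp_a), 1)

# Yesterday: ten blue links. Today: the same links, but Google inserted SERP features.
yesterday = ["site1", "site2", "site3", "site4", "site5",
             "site6", "site7", "site8", "site9", "site10"]
today = ["featured_snippet", "site1", "people_also_ask", "site2", "site3",
         "video_carousel", "site4", "site5", "site6", "site7"]

organic_only = [r for r in today if r.startswith("site")]

print(volatility(yesterday, today))             # 1.0 -- "volatility" from layout changes alone
print(volatility(yesterday[:7], organic_only))  # 0.0 -- the organic order never actually moved
```

Same organic results, maximum “volatility” score, simply because the construction of the page changed.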
SERP Volatility Due to Re-Ranking
Ever since Bill Slawski and I first discussed a paper written by some Googlers titled “Out-of-Domain Semantics to the Rescue! Zero-Shot Hybrid Retrieval Models”, we had been debating whether this is in play and, if so, how we could determine it.
I have believed that what the paper describes is what Google is doing, and therefore my belief is that all pages are re-ranked after indexing and NLP processing. I strongly advise reading and understanding the paper above, as the IR (Information Retrieval) in it is pure gold!
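For anyone not ready to dig into the paper yet, the core idea of hybrid retrieval is blending a lexical (term-match) score with a dense (embedding/NLP) score for each candidate document. The sketch below is my own simplified illustration of that blending and of how it can re-order an initial lexical ranking; the documents, scores, and weights are my assumptions, not values from the paper or from Google.

```python
# Simplified hybrid retrieval scoring in the spirit of the paper: blend a lexical
# (sparse) score with a dense (embedding) score per document, then re-rank.
# Documents, scores (assumed pre-normalized to [0, 1]) and weights are my assumptions.

def hybrid_score(lexical, dense, alpha=0.5):
    """Convex combination of a lexical term-match score and a dense semantic score."""
    return alpha * lexical + (1 - alpha) * dense

candidates = {
    # doc_id: (lexical score, dense score) -- toy values
    "exact-phrase-page": (0.91, 0.62),
    "topically-related-page": (0.34, 0.88),
    "off-topic-page": (0.12, 0.15),
}

# Initial ordering if only the lexical/phrase signal is available (pre-NLP).
lexical_first = sorted(candidates, key=lambda d: candidates[d][0], reverse=True)

# Re-ranked ordering once the dense/NLP signal carries most of the weight.
reranked = sorted(candidates, key=lambda d: hybrid_score(*candidates[d], alpha=0.2),
                  reverse=True)

print(lexical_first)  # ['exact-phrase-page', 'topically-related-page', 'off-topic-page']
print(reranked)       # ['topically-related-page', 'exact-phrase-page', 'off-topic-page']
```

That flip from the lexical ordering to the blended ordering is the kind of re-ranking I believe happens once indexing and NLP processing catch up with a freshly crawled page.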
On top of what the paper outlines, there have been a few other bits of info that I feel back my hypothesis up:
- John Mueller has said recently that it can take days or longer to index a page after discovery. One must wonder how Google determines page relevance and ranking before indexing and NLP are applied.
- Google uses Protocol Buffers in crawling, which means they basically stream bytes to a server, where they are processed and a “bag of words” is created (“bag of words” being a phrase they use in the Trillions of Questions video released a few months ago).
- The “bag of words” is scored using n-grams or another ranking method, enabling Google to apply a ranking and add the page directly to the results after crawling (a toy sketch of such scoring follows below). Consider that Google wants to be first with fresh information, since it was embarrassed by being beaten to the punch by TMZ when Michael Jackson passed.
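Here is my guess at what a crude bag-of-words, n-gram style pre-index score could look like. Everything in it, including the weighting by n-gram length, is my own assumption for illustration, not a documented Google method.

```python
# A guess at a crude pre-index relevance score: weight query n-gram matches found in a
# freshly crawled page's "bag of words". Illustration only; not a documented Google method.
from collections import Counter

def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def pre_index_score(page_text, query):
    page_tokens = page_text.lower().split()
    query_tokens = query.lower().split()
    score = 0.0
    for n in range(1, len(query_tokens) + 1):
        page_counts = Counter(ngrams(page_tokens, n))
        for gram in ngrams(query_tokens, n):
            # longer n-grams (closer to the exact query phrase) weigh more
            score += n * page_counts[gram]
    return score

exact_page = ("a semantic seo audit checklist: what to include in a "
              "semantic seo audit and how to organize the audit")
related_page = "entity research and topical maps are the core of semantic seo"

print(pre_index_score(exact_page, "semantic seo audit"))    # 21.0: exact-phrase repeats drive it up
print(pre_index_score(related_page, "semantic seo audit"))  # 4.0: topical match, no exact phrase
```

If something like this is the interim score, a page stuffed with the exact phrase would do very well right after crawling, then get adjusted once richer NLP signals arrive, which is exactly the pattern I describe in the test below.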
Testing the Re-Ranking Hypothesis
To test my re-ranking hypothesis, I published “Semantic SEO Audits: What to Include & Organize the Audit” a couple of weeks ago. The day after publishing, and for a few days after that, it ranked #1. Once it was indexed it dropped to #3, and 15 days later it has moved to #5.
I wrote it like it was 1999, replacing what we used to call primary keywords with the main entities Semantics, SEO, and Audits. That’s the main reason it ranked #1: the query isn’t very competitive yet, and the pre-index relevance score is high because of the exact phrase repeats!
This small test tells me that there is some sort of phrase-based ranking going on pre-indexing. Reviewing the pages in the “semantic SEO audit” query results, you’ll find a variety of other topics, like SEO audits or Semantic SEO.
So I believe it’s safe to say some are included because of topic relevance for the entity Semantic SEO, and only a few are relevant for the full query. IMO, other than the #1 ranking page, which is included because of links and topic relevance, the others will disappear as competition for the exact phrase stiffens.
The post hasn’t been promoted because I want to see how getting links to it affects its ranking, and how adjusting entity frequency and location affects the page. I have felt for some time that links are even less effective than they were when I retired, and they will be even less effective in the future as NLP improves.
Looking at the pages that are ranking, they have far more authority for the Semantic SEO topic. I will watch the SERP, and as competition grows I will review the use of entities in new entries for the query, then adjust my current usage or add new entities I find in those entries.
BTW, I’m not really worried about competition affecting my page, as I am the “originating author”, i.e. the one everyone else must create more value than in order to rank higher. See Bill Slawski’s post “Original Content and Original Authors Identified by Google” on author expertise and originating content!
One of Bill’s strongest qualities in his reviews of patents was that when he analyzed them he tried not to make assumptions about their use by Google. I was lucky in that we discussed this particular paper at length, as well as the re-ranking theory. That post was a test of these hypotheses; I only wish I could’ve shared the results with him!
What % of affected SERPs Warrants Confirmation of a Core Update Event?
The core updates I remember affected roughly 7% or more of SERPs. So my thoughts have been that Google doesn’t deem an update to be core unless it affects 7% or more of SERPs.
I’m wondering why the % of SERPs affected in the recent update wasn’t included in this announcement. The part of me that wears a tinfoil cap (pictured above) wonders why they aren’t disclosing this important data point. I won’t be surprised if it’s considerably higher than 7%. Enough that they didn’t want to cause more panic than usual!
Below is complete conjecture on my part, in the hope it promotes re-thinking about core update rollouts and what happens in between them!
IMO, since machine learning (Hummingbird, RankBrain, and BERT) started powering all of the non-technology-driven core updates, the way core updates are rolled out has changed. One of the reasons it takes two weeks to roll out a core update is that they are also a refresh of the index, where some sites get full dampening and fixed sites get full removal of dampening.
IMO, what Marie saw in her analysis was the effect of dampening adjustments pre-rollout. My belief is that these adjustments are needed, or the dampening that occurs between core updates would be amplified by the new core update.
This happens because around 2013 (going from a bad memory) Google started doing incremental collection of data rather than “Google dance”-like events. The events are the refreshes of the index, which would mean any activity between the events would either get extra benefit for a fix, or greater dampening than intended for a problem discovered between core update events.
One of the things I’ve noticed about core updates since they added AI/ML to the mix is that, when looking at a GA or Panguin timeline of organic Google traffic, affected sites often have:
- a slow trend down, possibly during training (initial dampening of pages found during crawls)
- a spike up pre-rollout (removal of current dampening)
- a top that lasts a week or so (during rollout)
- a spike up/adjusted/down (full core update dampening applied, adjusted, or removed)
Do you see the same thing? Let me know on Twitter!