LSI keywords is a concept that has become immensely popular with SEOs over recent years. So much so that the term has almost become synonymous with semantics.
A lot of this is due to the misconception that LSI keywords should be a focus in your SEO strategy.
In this article, I'll explain why you shouldn’t waste time considering LSI and what to do instead.
Latent Semantic Indexing keywords are words that are semantically linked to your primary keyword or topic.
If the main keyword is "sport," then related terms such as "baseball," "coach," "team," or "football" are considered "LSI keywords."
LSI was introduced in the 1980s as a natural language processing (NLP) technique, and it’s hard to explain without some background in how NLP works.
Under the hood, it relies on linear algebra (a matrix factorization called singular value decomposition), not the basic math you learned in school.
Why was it developed?
Often, words have multiple meanings while sharing the same spelling.
For example, rock is a genre of music, but it’s also a stone. Linguists call such words homonyms (or polysemous words, when the meanings are related).
So how does a search engine determine which one you mean and return the relevant information?
While you and I can work out which variation of the word "rock" is being used, a computer needs some additional help.
The same goes for synonyms. If someone searches for "best publications," how does a search engine know that it should also return results mentioning "best books"?
LSI helps in both these situations.
Using the mathematical calculations mentioned previously, it can figure out across a set of documents that certain words are related.
Or that some words have different meanings.
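To make those calculations a little less abstract, here is a minimal sketch of the idea using NumPy. The four tiny "documents" are made up for illustration, and keeping two dimensions is an arbitrary choice. It builds a term-document matrix, factorizes it with singular value decomposition, and compares words in the reduced space. It is a toy illustration of the technique, not how any search engine is implemented.

```python
import numpy as np

# Made-up mini corpus: two documents about music, two about geology.
docs = [
    "rock music guitar band",
    "rock band concert guitar",
    "rock stone granite mineral",
    "granite stone quarry mineral",
]

vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}

# Term-document matrix: one row per word, one column per document.
A = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        A[index[word], j] += 1

# Singular value decomposition, keeping only the k strongest dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
term_vectors = U[:, :k] * s[:k]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def raw_similarity(a, b):
    """Similarity based only on which documents the two words share."""
    return cosine(A[index[a]], A[index[b]])


def lsi_similarity(a, b):
    """Similarity in the reduced (latent) space."""
    return cosine(term_vectors[index[a]], term_vectors[index[b]])


if __name__ == "__main__":
    for pair in [("music", "concert"), ("music", "granite")]:
        print(pair, round(raw_similarity(*pair), 2), round(lsi_similarity(*pair), 2))
```

"Music" and "concert" never appear in the same document, so their raw similarity is zero, yet they end up close together in the reduced space because they share neighbors such as "guitar" and "band." "Music" and "granite" stay far apart.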
It’s also useful for:
At the beginning of this article, I mentioned that focusing on LSI is a waste of time. Here’s one reason why: it’s not useful for modern SEO.
Google’s John Mueller has confirmed it himself.
"There's no such thing as LSI keywords -- anyone who's telling you otherwise is mistaken, sorry." (John Mueller, @JohnMu, July 30, 2019)
Still not convinced?
Here are some more reasons.
LSI only works on static datasets.
LSI is only practical on small, static datasets.
The issue is pretty obvious here. The web is far from static or small.
LSI would need to be rerun every time a new page is added to the web, which would be prohibitively expensive in processing power.
When LSI was created, the internet wasn’t like it is today, and the creators didn’t build LSI with this application in mind.
It’s old technology.
Google has multiple algorithms and ways to decipher semantics and the meaning behind words. We know they use a word vector approach with RankBrain, which is much more efficient and scalable than LSI.
More recently, we’ve seen BERT’s release, which solves common issues with natural language processing in a far superior way to LSI.
On top of that, Google has the knowledge graph, which helps it understand entities (meaning ‘things’ such as people, brands, and locations) and their relationships in a way that scales across the web.
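To give a feel for the word vector idea mentioned above, here is a minimal sketch. The three-dimensional vectors are invented for illustration (real models learn hundreds of dimensions from large text collections), and this is not RankBrain or any Google system. It only shows how cosine similarity lets a program conclude that "publication" is closer to "book" than to "guitar."

```python
import math

# Hypothetical, hand-written embeddings used purely for illustration.
vectors = {
    "book": [0.9, 0.1, 0.0],
    "publication": [0.8, 0.2, 0.1],
    "guitar": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word):
    """Return the other word whose vector points in the most similar direction."""
    others = [w for w in vectors if w != word]
    return max(others, key=lambda w: cosine(vectors[word], vectors[w]))


if __name__ == "__main__":
    print(most_similar("publication"))
```

Because comparing two vectors is cheap and adding a new word or page doesn't force a recalculation of everything else, this style of approach scales far better than rerunning a full decomposition.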
With so much hype around LSI keywords, it would be good to know if there are any benefits, especially if you have already spent a lot of time implementing them.
It’s important to say that adding related terms to your content isn’t a direct ranking factor.
Fortunately, there’s potential for a silver lining.
Adding related entities to your content could potentially help it to rank better.
If you’re writing an article about clothing and including semantically related entities to the topic, such as "jeans, boots, or sneakers," it gives Google additional information to determine relevance.
In other words, related entities act as an indirect ranking factor: they help Google better understand relevance, which can improve where you rank.
This is similar to how structured data isn’t a direct ranking factor, but adding it can help Google better understand what you’re talking about, potentially improving where you rank.
However, artificially inserting keywords into your content isn’t the best SEO practice. Doing this is more reminiscent of keyword stuffing, which doesn’t provide additional value to your site’s users.
So what should I be doing?
The answer is simple: create well-researched content written by experts on the topic area.
A well-researched topic ensures questions and multiple intents are covered.
Using an expert will ensure the content:
Having a specialist write your content will always be more valuable than spending hours researching LSI keywords.
Now you know how beneficial it is to create well-researched content, here’s a process to help you do it.
Check the search result.
You can gain a lot of information from just checking the search result for your targeted keyword, so make sure to use this to your advantage by looking at all the SERP features returned.
At the bottom of most search results pages is a list of related searches.
These can help you understand or structure your content to ensure similar searches and user intents are fulfilled.
Autocomplete is a handy feature for searchers, but it’s also a goldmine for content research.
Note the keywords and queries users type, and make sure you answer them in your content.
Rather than doing this manually, you can speed up the process with tools such as:
People also ask
The People also ask (PAA) box is another way to gain insights into questions users ask related to the topic you enter.
All you need to do is search the topic you’re covering; in this example, it’s "Rhodiola." Next, scroll down to the PAA and make a note of the results.
Select relevant PAA results and more questions are added to the PAA box.
You can use AlsoAsked to speed up this process, automating the discovery of questions around the topic you’re covering.
Finally, use the knowledge graph to spot attributes and other related entities.
The knowledge graph understands the relationship between people, things, and concepts. So it’s a great way to find related topics to include in your content.
The “People also search for” area will include entities with a strong relevance to your topic.
Check competing content.
Comparing your content against competitors and improving upon it is a great way to add relevant entities to a page.
Here are a few ways you can do that.
See what their page has covered.
If your competitor is ranking well organically, they've likely done a good job of covering the topic.
We’re not looking for specific entities and words they’ve covered; instead, we're looking at things such as:
In the example of Rhodiola, if we check Healthline’s article, we can see they’ve included:
At a minimum, you’ll need to do the same.
See what their page is ranking for.
A content gap analysis reveals keywords that you’re not ranking for, but your competitors are. You can do one easily within Ahrefs Site Explorer by adding competing articles.
Once done, a list of ranking keywords will be displayed alongside the pages that rank for them. Scan through the list and highlight long-tail keywords and related topics to explore.
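If you export keyword lists and want to see the logic behind a content gap analysis, it boils down to a set difference. Here is a minimal sketch. The function name, URLs, and keyword lists are hypothetical, and this is not how Ahrefs implements its report.

```python
def content_gap(own_keywords, competitor_keywords):
    """Return keywords competitors rank for that you don't.

    competitor_keywords maps a competitor URL to the keywords it ranks for.
    The result is sorted so keywords shared by the most competitors come first.
    """
    own = {kw.lower() for kw in own_keywords}
    gap = {}
    for url, keywords in competitor_keywords.items():
        for kw in keywords:
            kw = kw.lower()
            if kw not in own:
                gap.setdefault(kw, set()).add(url)
    ranked = sorted(gap.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(kw, sorted(urls)) for kw, urls in ranked]


if __name__ == "__main__":
    # Hypothetical exports for illustration only.
    mine = ["rhodiola benefits"]
    competitors = {
        "https://example.com/a": ["rhodiola benefits", "golden root"],
        "https://example.com/b": ["golden root", "arctic root"],
    }
    for keyword, urls in content_gap(mine, competitors):
        print(keyword, len(urls))
```

Sorting by how many competitors rank for a keyword is one simple way to prioritize: if several competing pages rank for a term you don't cover, it is more likely to be a genuine gap.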
Use research tools.
There are plenty of great research tools out there to help you on your way to creating the best possible content.
Let’s have a look at some of the best ones out there!
Referring to itself as "AI for Content," Frase is a brilliant tool that ensures your content covers everything it can.
This tool compares your content to other ranking pages and helps by:
My favorite is their topic score, which allows you to see related entities and phrases included in competing articles quickly.
Selecting an entity shows you the piece of content it was extracted from, providing valuable context to expand your research.
As well as an analysis based on entities, you can also get an overview of common header tags, helping you quickly understand common page structures.
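If you'd like to get a rough header overview yourself, here is a minimal sketch using only Python's standard library. It assumes you have already saved the HTML of the competing pages, and the class and function names are my own. It is not how Frase works internally.

```python
from collections import Counter
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the text of h1, h2, and h3 tags from an HTML document."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # (tag, list of text pieces) while inside a heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._current = (tag, [])

    def handle_data(self, data):
        if self._current is not None:
            self._current[1].append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._current[0]:
            text = " ".join("".join(self._current[1]).split())
            self.headings.append((tag, text))
            self._current = None


def extract_headings(html):
    """Return a list of (tag, text) pairs in document order."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return collector.headings


def common_headings(pages):
    """Count how many pages use each heading text (case-insensitive)."""
    counts = Counter()
    for html in pages:
        counts.update({text.lower() for _, text in extract_headings(html)})
    return counts.most_common()
```

Headings that appear on most of the top-ranking pages are a good hint about the sections readers expect to find.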
Keyword explorers make it easy to find long-tail queries related to what you input. I tend to use Ahrefs Keyword Explorer, which provides a few different useful reports.
My favorite is the phrase match report, which shows keywords that include the phrase you entered. I commonly use it with filters to find precisely what I want.
You can also use the ‘also rank for’ report to see other keywords that the top-ranking pages for the term you entered rank for. This is effectively reverse engineering what Google considers to be semantically related terms, so it’s great for finding related entities.
For Rhodiola, for example, both "golden root" and "arctic root" may be things that we should be talking about.
Another favorite is the questions report, which shows keywords that include terms like "what," "where," "when," or "how."
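If you're working from an exported keyword list rather than inside the tool, the phrase match and questions reports described above can be approximated with simple filters. Here is a minimal sketch. The function names and sample keywords are hypothetical, and the real reports draw on a keyword database rather than a list you supply.

```python
# The four question words mentioned above; extend the tuple as needed.
QUESTION_WORDS = ("what", "where", "when", "how")


def phrase_match(keywords, phrase):
    """Keep keywords that contain the given phrase (case-insensitive)."""
    phrase = phrase.lower()
    return [kw for kw in keywords if phrase in kw.lower()]


def questions(keywords):
    """Keep keywords that contain a question word as a whole word."""
    return [
        kw for kw in keywords
        if any(word in kw.lower().split() for word in QUESTION_WORDS)
    ]


if __name__ == "__main__":
    # Hypothetical keyword export for illustration only.
    exported = [
        "what is rhodiola",
        "rhodiola benefits",
        "how to take rhodiola rosea",
        "golden root tea",
    ]
    print(phrase_match(exported, "rhodiola"))
    print(questions(exported))
```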
Knowledge graph API
Alongside checking the knowledge panel that shows up, some tools will query Google’s knowledge graph API for you and return relevant terms.
This knowledge graph API tool by Carl Hendy is my favorite. For my example term ‘Rhodiola,’ it’s done a great job of returning a semantically related entity.
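If you'd rather query the API directly, here is a minimal sketch of building a request for Google's Knowledge Graph Search API and pulling the entities out of a response. It assumes you have your own API key (the placeholder below is not real), and the sample response is a made-up example that mirrors the documented response shape, not actual API output.

```python
from urllib.parse import urlencode

ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_search_url(query, api_key, limit=5):
    """Build the request URL for an entity search."""
    params = {"query": query, "key": api_key, "limit": limit}
    return ENDPOINT + "?" + urlencode(params)


def extract_entities(response):
    """Pull name, types, description, and score from a parsed JSON response."""
    entities = []
    for item in response.get("itemListElement", []):
        result = item.get("result", {})
        entities.append({
            "name": result.get("name"),
            "types": result.get("@type", []),
            "description": result.get("description"),
            "score": item.get("resultScore"),
        })
    return entities


# Made-up sample mirroring the response shape; not real API output.
SAMPLE_RESPONSE = {
    "itemListElement": [
        {
            "result": {
                "name": "Rhodiola rosea",
                "@type": ["Thing"],
                "description": "Plant",
            },
            "resultScore": 100.0,
        }
    ]
}

if __name__ == "__main__":
    print(build_search_url("Rhodiola", "YOUR_API_KEY"))
    print(extract_entities(SAMPLE_RESPONSE))
```

To use it for real, fetch the URL with the HTTP client of your choice and pass the parsed JSON to `extract_entities`.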
It might not be the first one that comes to mind, but Wikipedia is a great way to research a topic and its related entities.
Wikipedia tends to cover each topic extensively, so you’ll easily spot opportunities from a quick read of the page.
I’d recommend checking the references and using the “What links here” tool to see other wiki pages relating to the one you’re researching.
Make sure to also check links on the page to other wiki pages. These will all be related entities.
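Both of these checks can also be done through the MediaWiki API that Wikipedia exposes. Here is a minimal sketch that builds the request URLs and reads the page titles out of the responses. The function names are my own, and the code doesn't make the HTTP request itself, so fetch the URL with whatever client you prefer and pass in the parsed JSON.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def backlinks_url(title, limit=50):
    """URL for the API equivalent of "What links here" (articles only)."""
    params = {
        "action": "query",
        "list": "backlinks",
        "bltitle": title,
        "blnamespace": 0,
        "bllimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def links_url(title, limit=50):
    """URL for the links that appear on the page itself (articles only)."""
    params = {
        "action": "query",
        "prop": "links",
        "titles": title,
        "plnamespace": 0,
        "pllimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def backlink_titles(response):
    """Titles of pages that link to the queried page."""
    return [page["title"] for page in response.get("query", {}).get("backlinks", [])]


def link_titles(response):
    """Titles of pages that the queried page links out to."""
    titles = []
    for page in response.get("query", {}).get("pages", {}).values():
        titles.extend(link["title"] for link in page.get("links", []))
    return titles


if __name__ == "__main__":
    print(backlinks_url("Rhodiola rosea"))
    print(links_url("Rhodiola rosea"))
```

Long lists come back in batches, so for heavily linked pages you would also need to follow the `continue` values the API returns.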
So there you have it! Hopefully, you now have a more robust understanding of the history of LSI keywords and how to build a content optimization strategy that covers your topic thoroughly instead.