Creating or optimizing content to improve a site's visibility in search engines, studying competitors and customers, planning a marketing strategy or an advertising campaign: all of this can be done with a properly assembled semantic core. Using the example of collecting a semantic core for an online store, I will walk through the step-by-step selection of key queries and the tools you need.
- What is a semantic core?
- The structure of the semantic core.
- Collection of the site’s semantic core step by step.
- Semantic core filtering: stop words and frequency.
- Semantic core clustering.
- How to use the semantic core.
What is a semantic core?
The semantic core of a site is a set of keywords and phrases most relevant to the site's topic, combined into separate groups. It is important that the search queries in the semantic core describe the content of the site and its pages as accurately as possible, helping search engines determine which niche the site belongs to and what services and products it offers.
The semantic core is an important component of the site that affects its ranking in search results. The more precisely the content of a page is described to search engines using search queries, the more likely the page is to be shown for one of these key queries.
The structure of the semantic core
Three main rules for the formation of the semantic core:
- All key queries are combined into clusters (groups).
- A group of queries contains all the keywords and phrases corresponding to one specific page of the site.
- All queries within a cluster must match the same key query type.
Types of key phrases and words can be divided into three main groups:
- Navigational. The query contains the name of a specific store or a toponym.
- Informational. These contain question words (when, why, how much, etc.), as well as words such as harm, benefit, forum, photo, video.
- Commercial. Key queries with the words “buy”, “price”, “online store”, and precise queries with additional information about the product or service (model, brand, color, flavor and other characteristics).
| Type | Example of key queries |
| Navigational | “sports nutrition outlet”, “sports nutrition Ukraine”, “sports nutrition Dnipro” |
| Informational | “when to drink a protein shake?”, “why can’t you eat protein bars often?”, “damage from sports nutrition” |
| Commercial (transactional) | “buy sports nutrition”, “how much does sports nutrition cost?”, “sports nutrition biotech usa”, “sports supplements price” |
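The three query types can be approximated with a simple marker-word check. The following is a minimal sketch, not a production classifier: the word sets are my own assumption based on the examples above, and commercial markers are checked first so that mixed queries such as “how much does sports nutrition cost?” land in the commercial group, as in the table.

```python
import re

# Marker words are assumptions taken from the examples above; extend them for your niche.
COMMERCIAL = {"buy", "price", "cost", "store", "shop"}
INFORMATIONAL = {"when", "why", "how", "harm", "damage", "benefit", "forum", "photo", "video"}
NAVIGATIONAL = {"outlet", "ukraine", "dnipro"}  # hypothetical store names and toponyms


def classify(query: str) -> str:
    """Return the query type based on the first marker set that matches."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    if words & COMMERCIAL:
        return "commercial"
    if words & INFORMATIONAL:
        return "informational"
    if words & NAVIGATIONAL:
        return "navigational"
    return "unclassified"  # needs a manual check


print(classify("buy sports nutrition"))            # commercial
print(classify("when to drink a protein shake?"))  # informational
print(classify("sports nutrition Dnipro"))         # navigational
```

Queries that come back as "unclassified" (brand or model queries, for example) still need to be reviewed by hand.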
Different types of queries serve different user goals. There is no point in promoting a blog page about the harm of sports nutrition with the commercial keywords “buy sports nutrition”, because the content of the page does not match the query. A user who just wanted to buy sports nutrition will quickly leave the page. The search engine treats this behavior as a signal that the page does not satisfy the user’s query and may lower its ranking.
When forming a cluster, the frequency of the key query is taken into account. The frequency of a keyword or phrase indicates how many times users entered the query into the search engine during a certain period of time.
An example of frequency data from the Serpstat service:
There are three types of key query frequency:
- HF (high frequency).
- MF (medium frequency).
- LF (low frequency).
When forming a cluster of the semantic core, queries of all three frequency types are used; only queries with a frequency of “0” are excluded.
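As an illustration, frequency tiers can be assigned with a small helper. This is only a sketch: the article does not define numeric boundaries, so the thresholds `hf_from` and `mf_from` and the sample volumes below are hypothetical and should be tuned to your niche.

```python
from typing import Optional


def frequency_tier(volume: int, hf_from: int = 1000, mf_from: int = 100) -> Optional[str]:
    """Return "HF", "MF" or "LF"; None means the query has zero frequency and is dropped.

    The default thresholds are hypothetical, not an industry standard.
    """
    if volume <= 0:
        return None
    if volume >= hf_from:
        return "HF"
    if volume >= mf_from:
        return "MF"
    return "LF"


# Made-up monthly volumes for illustration only.
volumes = {
    "sports nutrition": 5400,
    "buy gainer": 320,
    "arginine tablets": 40,
    "gainer dropshipping": 0,
}

# Keep only queries with a non-zero frequency.
tiers = {kw: frequency_tier(v) for kw, v in volumes.items() if frequency_tier(v)}
print(tiers)
```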
Visualization of the structure of the semantic core looks like this:
Collection of the site’s semantic core step by step
For clarity, I will use the example of creating a semantic core for an online store: from the formation of basic queries to clustering.
Important: Do not delete the intermediate results of semantic collection. Do each step in a new spreadsheet tab so you can always go back to the original data and quickly make changes at any step.
Forming basic (marker) queries
Marker queries are the main keywords that broadly describe the subject of the site. Compiling a list of marker queries lets you build the skeleton of a semantic core for further semantic collection.
To create marker queries:
1. Use your own experience and knowledge about the product. The sports nutrition that will be sold in the online store falls into two popular, well-known categories: protein and amino acids. After a quick study of the assortment, one more category is worth highlighting: gainer. At the first stage, the list of marker queries looks like this:
- sports nutrition;
- protein;
- amino acids;
- gainer
2. Research the niche. Researching the topic using a Google search will help you get more information about the types and classification of the product. Pay attention to synonyms and alternative names used in search results for your queries.
Another quick way to delve deeper into the topic is to use AI tools (ChatGPT, Notion AI and others). The free versions are enough to get additional information or to request a list of keywords on a topic right away.
An example of keyword generation in Notion AI:
Please note: despite the development and popularity of AI tools, ChatGPT and Notion AI are not yet full-fledged tools for collecting a semantic core. They are still being refined, and their data does not include the latest information.
3. Study the competitors. Category names and filters can all become marker queries and also help you build your own site structure. Keep in mind that even if you are looking at competitors in the top positions of the search results, you should not simply copy the structure and keywords from another site without first analyzing the frequency of the key queries.
After studying the product and competitors, the list of marker queries looks like this:
- sports nutrition;
- protein;
- amino acids;
- gainer;
- sports supplements;
- creatine;
- BCAAs;
- arginine
Collection of semantic core and selection of tools
We select keywords and phrases for the semantic core by combining the following methods and services:
- analysis of competitors’ sites;
- collection of Google search suggestions for each of the marker queries;
- use of tools based on artificial intelligence;
- using SEO tools, for example, Serpstat, Semrush, Ahrefs, Keywordtool.io;
- working with the Google Ads Keyword Planner.
You can collect a semantic core for free using trial versions of SEO tools and the Google Ads Keyword Planner. The main disadvantage of this method is that some keywords will inevitably be missed.
To collect semantics, I recommend using several tools at once and combining them depending on your tasks. For example, Serpstat and Google Ads are the best combination for collecting semantics in Russian and Ukrainian, as well as for finding low-frequency and rare queries, which must also be taken into account when collecting a semantic core.
Pay attention: the exact frequency of key queries is displayed only in SEO tools or in the Google Ads Keyword Planner (provided you have an active advertising campaign in your account).
Collecting key queries using Serpstat
- Set the search region and insert one of the keywords.
- Press “Enter”. Serpstat will automatically open the summary report for the entered keyword.
- Seven main SEO reports will be available in Serpstat for the entered key query.
To collect key queries for existing sites, use the Competitors report to view competitors in your niche in a given region and the Domain Comparison tool.
Enter your website address and two niche competitors.
Serpstat will report the intersection of key queries between all analyzed domains, as well as the unique key queries for each domain.
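The logic of such a comparison can be sketched with plain set operations. This is only an illustration of the idea, not how Serpstat computes its report; the domain names and keyword sets below are made up.

```python
def compare_domains(keywords_by_domain):
    """Return (common, unique): keywords shared by all domains and keywords unique to each."""
    common = set.intersection(*keywords_by_domain.values())
    unique = {}
    for domain, keywords in keywords_by_domain.items():
        # Union of the keywords of every other domain.
        others = set().union(
            *(kws for name, kws in keywords_by_domain.items() if name != domain)
        )
        unique[domain] = keywords - others
    return common, unique


# Hypothetical exports: your site and two niche competitors.
data = {
    "my-store.example": {"buy protein", "bcaa price", "gainer"},
    "competitor-a.example": {"buy protein", "creatine", "gainer"},
    "competitor-b.example": {"buy protein", "arginine"},
}

common, unique = compare_domains(data)
print(common)                      # {'buy protein'}
print(unique["my-store.example"])  # {'bcaa price'}
```

Keywords that only competitors rank for are the most useful output here: they are candidates for expanding your own semantic core.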
For new sites, it is important to use the “Phrase Selection” and “Similar Phrases” reports. To obtain more complete semantics, it is also worth studying the “Search Tips” report.
Use Serpstat filtering to refine your search and filter out queries you don’t need.
Export reports to a Google spreadsheet or another convenient format for further work with semantics.
Collecting key queries using Google Ads
Create a Google Ads account. Detailed instructions on creating an account can be found in Google Help.
- After creating an account, you will be redirected to your advertising account. Select Tools & Settings from the top menu, then click Keyword Planner.
- Select “Discover new keywords”, enter up to 10 marker queries and click the Get Results button.
Please note: the region and language settings of the new keyword search must match the promotion region and the language of your semantics.
In the report, Google displays available synonyms and similar queries. Additional filters let you expand the search and refine both the results and the keywords themselves.
If you don’t have an active ad campaign in your ad account, you’ll see frequency data in the “from…to…” range format. If your Google Ads account is running ads, you will see the exact number of queries per month and a graph showing how the query’s popularity changes over time:
Use hints and filters to refine the keywords, and save the resulting data for further work with semantics.
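Because the semantics come from several tools, the exported lists overlap. Below is a minimal sketch of merging them into one deduplicated list; the normalization rules (lowercasing and collapsing whitespace) are my own assumption, and the sample lists are made up.

```python
def merge_keywords(*sources):
    """Merge keyword lists from several tools, keeping the first-seen order."""
    seen = set()
    merged = []
    for source in sources:
        for keyword in source:
            # Normalize: lowercase and collapse repeated or trailing whitespace.
            normalized = " ".join(keyword.lower().split())
            if normalized and normalized not in seen:
                seen.add(normalized)
                merged.append(normalized)
    return merged


serpstat_export = ["Buy protein", "gainer ", "bcaa price"]
google_ads_export = ["buy  protein", "creatine", "BCAA price"]

print(merge_keywords(serpstat_export, google_ads_export))
# ['buy protein', 'gainer', 'bcaa price', 'creatine']
```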
Forming a list of stop words
In order to correctly distribute the collected semantic core on the pages of the site, as well as to remove unnecessary garbage at the collection stage, it is necessary to compile a list of stop words.
Stop words are words or phrases that characterize a query or a group of queries that are not relevant to your semantic core.
Since the example I am considering is a commercial semantic core, I need to exclude all informational queries from the semantics:
- what;
- why;
- photo;
- video;
- review;
- forum, etc.
Please note that queries can be mixed in type and in how well they convert. For example, the queries “how much does it cost” and “where to buy” can produce search results with links to online stores as well as links to articles and reviews.
To check, enter a keyword in the search and view the first page of Google output:
If all the links on it lead to articles and blogs, this is an informational query. If the search results contain online stores or are mixed, the query can be classified as commercial.
Universal stop words in commercial semantics include the names of competitors’ stores, as well as regions in which it is impossible to place an order in your online store.
Forming a list of stop words using lemmatization
If your set of key queries and phrases is too large, use lemmatization to generate a broader list of stop words. Its essence is to bring the key queries to a uniform form by converting each word to the nominative singular. This helps you find unique words that do not belong to the semantic core you are forming.
Use the cst.dk service to bring key queries to a single form:
- Select the language of your semantics and set the following service settings:
- Copy or upload the list of key queries to the cst.dk service and click the “Submit my text” (“Behandl min tekst”) button.
- Wait for the task to complete. A list of keywords with a lemma for each will appear below:
- Copy the resulting list of lemmas into a format that is convenient for you (for example, Notepad or Google Sheets) and look through it for irrelevant words that can be added to the list of stop words.
In this case, I add words containing “dropshipping” to the list of stop words, while the word “home” has to be checked in the context of the key query. It appears both in relevant queries such as “sports nutrition for home training” and in irrelevant ones such as “how to prepare sports nutrition at home”.
That is why it is important to check the original key queries when forming stop words, as well as check the search results for queries that cause you doubt.
The stage of forming stop words is a painstaking manual process that will require your time and patience, but it will greatly facilitate the last stage of creating a semantic core – clustering.
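The review of lemmas described above can be sketched as a word inventory: count every unique word across the queries, then pull up the original queries for any word that looks suspicious. This is a simplified illustration; the tiny `LEMMAS` map is a hand-made, hypothetical stand-in for a real lemmatizer such as the cst.dk service.

```python
import re
from collections import Counter

# Hypothetical hand-made lemma map; a real lemmatizer replaces this.
LEMMAS = {"proteins": "protein", "tablets": "tablet", "acids": "acid"}


def lemmas_of(query):
    """Split a query into lowercase words and map each word to its lemma."""
    return [LEMMAS.get(word, word) for word in re.findall(r"\w+", query.lower())]


def word_inventory(queries):
    """Count how often each lemma occurs across all queries."""
    counts = Counter()
    for query in queries:
        counts.update(lemmas_of(query))
    return counts


def queries_with(lemma, queries):
    """Return the original queries containing the lemma, for a context check."""
    return [q for q in queries if lemma in lemmas_of(q)]


queries = [
    "buy proteins",
    "protein dropshipping",
    "amino acids tablets",
    "how to prepare protein at home",
]

inventory = word_inventory(queries)
print(inventory.most_common(3))
print(queries_with("home", queries))  # ['how to prepare protein at home']
```

Looking up the original queries for a doubtful lemma is the step that matters most: it shows whether the word is a true stop word or only irrelevant in some contexts.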
Semantic core filtering: stop words and frequency
To reduce the number of garbage queries in the collected semantics, before applying the list of stop words, it is worth using the frequency data and removing all queries whose frequency is equal to zero.
To do this, refresh the frequency data for your key queries through Serpstat or Google Ads. Regardless of how many sources you used to collect semantics, it is important that the frequency of all queries is checked through a single service.
Update in Google Ads
Go to the “Get search volume and forecasts” tool. You can paste your keywords and phrases directly or upload them as a CSV file.
At one time, you can check the frequency of slightly more than 20,000 queries.
To download the received data, select one of the formats in the “Historical plan performance” section.
Update in Serpstat
Go to the “Batch Analysis – Batch Keyword Analysis” menu category.
Create a new project, be sure to check the correctness of the displayed region.
Add keywords and click Create.
Use one of the formats convenient for you to export the report.
In the resulting Google Ads or Serpstat report, filter and remove all keyword queries with a frequency of “0”.
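If you prefer to do this step outside a spreadsheet, the zero-frequency filter can be sketched with Python's standard `csv` module. The column names `keyword` and `volume` are assumptions; real Google Ads and Serpstat exports use their own headers, so adjust the parameters accordingly.

```python
import csv
import io


def drop_zero_frequency(csv_text, keyword_col="keyword", volume_col="volume"):
    """Return (keyword, volume) pairs for rows whose frequency is above zero.

    Column names are hypothetical; match them to your actual export.
    """
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        volume = int(row[volume_col] or 0)  # treat an empty cell as zero
        if volume > 0:
            kept.append((row[keyword_col], volume))
    return kept


# Made-up sample standing in for an exported report.
sample = (
    "keyword,volume\n"
    "buy protein,1900\n"
    "gainer dropshipping,0\n"
    "bcaa price,320\n"
)

print(drop_zero_frequency(sample))
# [('buy protein', 1900), ('bcaa price', 320)]
```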
Now the obtained semantics must be cleaned using the list of stop words that we compiled earlier.
To do this, use a special Google sheet template that works through regular formulas.
You can download the template from the link.
- In column A, add a list of key queries.
- In column C, add a list of stop words.
- Column E automatically generates a list of keywords that contain stop words.
- In column G, a list of keywords that contain stop words will appear.
Be sure to check the resulting list of excluded queries. One of the stop words may turn out to be too broad and catch relevant queries.
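The same split can be sketched in code. This is not the template's formula logic, only an illustration of the idea under my own assumption that stop words should match whole words rather than substrings; the sample data is made up.

```python
import re


def split_by_stop_words(keywords, stop_words):
    """Split keywords into a clean list and a list of (keyword, matched stop word)."""
    if not stop_words:
        return list(keywords), []
    # Whole-word match, case-insensitive; re.escape keeps special characters literal.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(word) for word in stop_words) + r")\b",
        re.IGNORECASE,
    )
    clean, flagged = [], []
    for keyword in keywords:
        match = pattern.search(keyword)
        if match:
            flagged.append((keyword, match.group(0).lower()))
        else:
            clean.append(keyword)
    return clean, flagged


keywords = ["buy protein", "protein forum", "why drink gainer", "bcaa price"]
stop_words = ["forum", "why", "photo"]

clean, flagged = split_by_stop_words(keywords, stop_words)
print(clean)    # ['buy protein', 'bcaa price']
print(flagged)  # [('protein forum', 'forum'), ('why drink gainer', 'why')]
```

Keeping the matched stop word next to each excluded query makes the manual review faster, because you can see at a glance which stop word removed which query.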
Semantic core clustering
Services can use two types of algorithms for clustering:
- By search results. Keywords are grouped based on how similar the search engine results for them are. This method of clustering is used in Serpstat.
- By similarity of phrases. All similar queries fall into the same group, even if the search results for them are very different. This is how the free Streamlit.app service creates clusters.
The best approach is to cluster semantics based on search results, since grouping queries by phrase similarity does not always correspond to the actual distribution of user queries.
To cluster keywords in Serpstat:
- Go to the “Clustering” section in the left side menu and create a new project:
- Fill in the project data and insert no more than 1000 keywords for clustering:
- Click the “Save” button and wait for the clustering process:
Important: Automatic clustering does not guarantee a 100% result.
Distribute the clustered queries across the site pages and use them to create content and write meta tags.
An example of Serpstat key query clustering:
How to use the semantic core
After all the stages of cleaning and clustering, I received a list of keywords divided into clusters. An example of collected semantics for the topic “Sports Nutrition”:
Key queries with additional “tails” are combined into smaller groups of queries (subclusters) linked by a clarifying characteristic. The main clusters should be used for categories and subcategories by product type, and subclusters for creating new landing pages and filters.
For example, in the “Amino acids” query cluster, subclusters can be distinguished by type of amino acid (leucine, isoleucine, valine, arginine), by purpose (for women, for men) and by form (powder, tablets, liquid). All these subclusters can be used to create filters that not only improve the usability of the site but also bring traffic once the filter pages are optimized.
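Splitting a cluster into subclusters by characteristic can be sketched as simple tagging. The facet values below come from the example above; the function names and sample queries are my own and purely illustrative.

```python
import re
from collections import defaultdict

# Facet values taken from the "Amino acids" example above.
FACETS = {
    "type": ["leucine", "isoleucine", "valine", "arginine"],
    "purpose": ["for women", "for men"],
    "form": ["powder", "tablets", "liquid"],
}


def tag_query(query):
    """Return the facet values found in the query as whole words or phrases."""
    text = query.lower()
    tags = {}
    for facet, values in FACETS.items():
        for value in values:
            if re.search(r"\b" + re.escape(value) + r"\b", text):
                tags[facet] = value
                break
    return tags


def subclusters(queries, facet):
    """Group queries by the value of one facet; untagged queries are skipped."""
    groups = defaultdict(list)
    for query in queries:
        value = tag_query(query).get(facet)
        if value:
            groups[value].append(query)
    return dict(groups)


queries = [
    "amino acids isoleucine powder",
    "arginine for women tablets",
    "buy arginine powder",
    "amino acids",
]

print(subclusters(queries, "form"))
# {'powder': ['amino acids isoleucine powder', 'buy arginine powder'], 'tablets': ['arginine for women tablets']}
```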
What should be remembered?
- A correctly composed semantic core and the distribution of key queries across the site pages are among the important ranking factors.
- Do not delete intermediate steps while assembling the semantic core. Keep all stages in one document, but on different tabs, so you can easily make corrections as needed.
- To collect a semantic core in Ukrainian and Russian, use Serpstat and Google Ads; for semantics in English you can additionally use Ahrefs.
- The exact frequency of each key query is necessary for distributing key queries correctly across site pages and content. Use only one service to determine frequency.
- For clustering, it is best to use tools that group queries by search results rather than by similarity of phrases.
- There is no magic button. SEO tools, clusterizers, lemmatizers, ready-made lists of stop words are just tools that help save time and shorten some processes. At each of the stages of collecting the semantic core, the main role is played by manual verification and proofreading of the obtained results.