Keyword in URL
Keywords and phrases that appear in the page URL, outside of the domain name, aid in establishing relevance of a piece of content for a particular search query. Diminishing returns are apparently achieved as URLs become lengthier or as keywords are used more than once.
Keywords Earlier in URL
The order in which keywords appear in a URL matters. It’s been theorized that keywords appearing earlier in a URL have more weight. At minimum, it’s been confirmed by Matt Cutts that “after about five” words, the weight of a keyword dwindles.
Keyword in Title Tag
Title tags define the title of a document or page on your site, and often appear both in the SERP and as snippets for social sharing. They should be no longer than 60-70 characters, depending on the characters (Moz Tool). As with URLs, keywords closer to the beginning are widely theorized to have more weight.
Source(s): US 20070022110 A1
Keyword Density of Page
The percentage of times a keyword appears in text. Practicing SEOs once sculpted all content so that a single keyword/phrase appeared 5.5%-6% of the time. In the early-to-mid-2000s, this was very effective. Google has since improved so much with other types of content analysis that those tactics are scarcely relevant in 2015. And Keyword Density, although referenced in Google Patents, is almost certainly just a simplified concept within TF-IDF, which we’ll cover next.
Source(s): Patent US 20040083127 A1
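As a rough illustration of the percentage described above, here is a minimal sketch of a keyword density calculation. The function name, the tokenizing rule, and the choice to count every word of a multi-word phrase toward the total are all assumptions made for this example; SEO tools differ on the exact formula, and nothing here reflects how Google computes it.

```python
import re

def keyword_density(text: str, phrase: str) -> float:
    """Share of the words in `text` taken up by occurrences of `phrase` (0.0 to 1.0)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count positions where the phrase appears as a run of consecutive words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    # Each hit accounts for n words of the text.
    return hits * n / len(words)

sample = "Cheap shoes online. Buy cheap shoes today, because cheap shoes sell fast."
density = keyword_density(sample, "cheap shoes")
print(f"{density:.1%}")
```

A deliberately stuffed sample like this one scores far above the 5.5%-6% range mentioned above, which is the kind of writing the old tactic tended to produce.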
TF-IDF of Page
Think of TF-IDF, or Term Frequency-Inverse Document Frequency, like Keyword Density with context. TF-IDF weighs the density of keywords on a page against what is “normal” rather than just seeking out a flat, raw percentage. This serves to ignore words like “the” in computation and establishes how many times a literate human should probably mention a phrase like “Google Ranking Factors” in a single document that covers such a topic.
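To make the "density with context" idea concrete, here is a small sketch of one common textbook form of TF-IDF: term frequency within the page multiplied by the log of (number of documents / number of documents containing the term). The three-document corpus and the function names are invented for illustration, and Google's actual weighting is not public, so treat this as a teaching aid rather than a description of its algorithm.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def tf_idf(term, doc, corpus):
    """Textbook TF-IDF of `term` in `doc`, measured against `corpus`."""
    words = tokenize(doc)
    if not words:
        return 0.0
    tf = Counter(words)[term] / len(words)            # raw density on this page
    df = sum(1 for d in corpus if term in tokenize(d))  # pages that use the term
    if df == 0:
        return 0.0
    idf = math.log(len(corpus) / df)                  # 0 when every page uses it
    return tf * idf

# Hypothetical mini-corpus, for illustration only.
corpus = [
    "the ranking factors that matter for the search results",
    "the recipe for the best steak",
    "the history of the bicycle",
]

common = tf_idf("the", corpus[0], corpus)
topical = tf_idf("ranking", corpus[0], corpus)
print(common, topical)
```

In this sketch "the" is the densest word on the first page, yet it scores zero because every document uses it, while "ranking" scores above zero because it is rare across the corpus.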
Key Phrase in Heading Tag (H1, H2, etc.)
Keywords in Heading tags have strong weight in determining the relevant subject matter of a page. An H1 tag carries the most weight, H2 has less, and so forth. Heading tags also improve accessibility for screen readers, and clear, descriptive headings reduce bounce rates according to various studies.
Words with Noticeable Formatting
Keywords in bold, italic, underline, or larger fonts have more weight in determining the relevant subject matter of a page, but less weight than words appearing in a heading. This is confirmed by Matt Cutts, SEOs, and a patent that states: “matches in text that is of larger font or bolded or italicized may be weighted more than matches in normal text.”
Keywords in Close Proximity
The closeness of words to one another implies association. To anyone that’s ever wielded the English language, this won’t come as a surprise. One paragraph about your SEO work in Chicago will thus do more to rank for “Chicago SEO” than two paragraphs, with one about SEO and one about Chicago.
Source(s): Patents: US 20020143758 A1, US 20080313202 A1
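One simple way to reason about proximity is to measure the smallest gap, in words, between two terms. The sketch below does that with a hypothetical helper and two made-up sentences; it is an assumption-laden illustration of the concept, not the method described in the patents.

```python
import re

def min_distance(text, first, second):
    """Smallest gap, in words, between any occurrence of `first` and `second`.

    Returns None if either word is missing from the text.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == first.lower()]
    pos_b = [i for i, w in enumerate(words) if w == second.lower()]
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a in pos_a for b in pos_b)

close = "We do SEO work in Chicago for local firms."
far = "We do SEO work for many firms. Our office moved last year and is now in Chicago."

print(min_distance(close, "seo", "chicago"))
print(min_distance(far, "seo", "chicago"))
```

The first sample keeps the two terms a few words apart, while the second spreads them across separate sentences, mirroring the one-paragraph versus two-paragraph contrast above.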
Keyword in ALT Text
The ALT attribute of an image is used to describe that image to search engines and to users who are unable to display the image. This establishes relevance, especially for Image Search, while also improving accessibility.
Exact Search Phrase Match
Although Google may return search results that contain only part of a search phrase as it appears on your page (or in some cases, none at all), a patent states that a higher Information Retrieval (IR) score is given for an exact match. Specifically, stating that “a document matching all of the terms of the search query may receive a higher score than a document matching one of the terms.”
Source(s): Patent US8818982 B1
Partial Search Phrase Match
It’s established by a Google patent that when a page contains an exact match of a search phrase, its perceived relevance to that query, dubbed the Information Retrieval (IR) score, is significantly higher. In the process, the patent confirms that you may still rank for certain search queries when a page contains a search phrase not exactly as it was entered into Google. This is further verified by just doing a lot of Googling.
Source(s): Patent US8818982 B1
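The contrast between matching all of a query's terms and matching only some of them can be pictured with a toy coverage score. The function below simply reports the fraction of query terms found on a page; the name, the scoring rule, and the sample text are assumptions for illustration, and the patent does not disclose a formula this simple.

```python
import re

def term_coverage(query, document):
    """Fraction of the query's terms that appear anywhere in the document."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return 0.0
    doc_words = set(re.findall(r"[a-z0-9]+", document.lower()))
    matched = sum(1 for t in terms if t in doc_words)
    return matched / len(terms)

query = "chicago seo agency"
full_match = "Our Chicago SEO agency helps local firms."
partial_match = "A guide to SEO basics."

print(term_coverage(query, full_match))
print(term_coverage(query, partial_match))
```

A page covering every term scores higher than one covering a single term, but the partial match still scores above zero, which is consistent with pages ranking for phrases they do not contain exactly.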
Keywords Higher on Page
There’s a natural trend in how we write English: earlier is usually more important. This applies to sentences, paragraphs, pages, HTML tags. Google seems to apply this everywhere as well, with content that appears earlier and more visibly being given more weight. This is, at very least, a function of the Page Layout algorithm, which gives a lot of preference to what appears above-the-fold on your site.
Keyword stemming is the practice of taking the root or ‘stem’ of a word and finding other words that share that stem (i.e. ‘stem-ming’, ‘stem-med’, etc.). Avoiding these natural variations, such as for the sake of a keyword density score, results in poor readability and has a negative impact. This was introduced in 2003 with the Florida update.
Internal Link Anchor Text
The anchor text of a link tells the user where that link leads. It’s an important component of navigation within your website, and when not abused, helps to establish the relevance of a particular piece of content over vague alternatives such as “click here”.
Keyword is Domain Name
Also referred to as an Exact Match Domain or EMD. A powerful ranking bonus is attributed when a keyword exactly matches a domain and a search query meets Google’s definition of a “commercial query”. This was designed so that brands would rank for their own names, but was frequently exploited and as a result, made less powerful in various circumstances.
Source(s): Patent EP 1661018 A2, US 8046350 B1
Keyword in Domain Name
A ranking bonus is attributed when a keyword or phrase exists within a domain name. The weight given seems to be less significant than when the domain name exactly matches that of a particular SEO query, but more significant than when a keyword appears later in the URL.
Source(s): Patent EP 1661018 A2
Keyword Density across Domain
Krishna Bharat identified a problem with PageRank when he introduced Hilltop: “a web-site that is authoritative in general may contain a page that matches a certain query but is not an authority on the topic of the query”. Hilltop improved search by looking at the relevance of entire sites, labeled “experts”. Since TF-IDF determines page-level relevance, we make a small assumption that Hilltop defines an “expert” domain using the same tools.
TF-IDF across Domain
Saying “Keyword Density” instead of “Term Frequency” in 2015 throws a lot of SEO specialists into a rage, despite the two being perfect synonyms. What’s important when talking about “Keyword Density” factors is again the latter half of TF-IDF: Inverse Document Frequency. Google throws out words like adverbs with TF-IDF and dynamically evaluates the natural density for a topic. Metrics on “how much is natural” have apparently decreased over time.
Distribution of Page Authority
Typically, pages that are linked sitewide are given a large boost, pages linked from them get a lesser boost, and so forth. A similar effect is often seen from pages linked from the homepage, because this is commonly the most-linked page on most websites. Creating a site architecture to maximize this factor is commonly known as PageRank Sculpting.
Source(s): Patent US 6285999 B1
This is somewhat confusing since a brand new domain name may also receive a temporary boost. Older domains are given a little more trust, which Matt Cutts emphasizes is pretty minor (while in the process, acknowledging it exists). Speculatively, this may be rewarding sites that have had a chance to prove themselves not a part of short-term black hat projects.
New domains may receive a temporary boost in rankings. In a patent discussing methods of determining fresh content, it’s stated “the date that a domain with which a document is registered may be used as an indication of the inception date of the document.” That said, the impact this actually has on one’s rankings is, according to Matt Cutts, relatively small. Speculatively, this may be intended to give a brand new site, or timely niche site, just enough chance to get off the ground.
Hyphen-Separated URL Words
The ideal method of separating keywords in a URL is to use a hyphen. Underscores can work, but are not as reliable, as they can be confused with programming variables. Mashing words together in a URL is likely to cause words to not be seen as separate keywords, thus preventing any Keyword in URL bonus. Aside from these scenarios, just using a hyphen will not make a site rank higher.
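The sketch below shows one way to produce hyphen-separated URL slugs, along with a naive splitter that treats only hyphens as word boundaries. Both helpers are hypothetical, and splitting on hyphens alone is an assumption that mirrors the advice above rather than a description of Google's tokenizer.

```python
import re

def slugify(title):
    """Lowercase the title and join its words with single hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

def url_keywords(path):
    """Split the last path segment on hyphens only (an illustrative assumption)."""
    last = path.rstrip("/").rsplit("/", 1)[-1]
    return [w for w in last.split("-") if w]

print(slugify("Google Ranking Factors: The 2015 List!"))
print(url_keywords("/blog/google-ranking-factors/"))
print(url_keywords("/blog/google_ranking_factors"))
print(url_keywords("/blog/googlerankingfactors"))
```

Under this assumption the hyphenated path yields three separate keywords, while the underscore and mashed-together versions each come back as a single token.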
Keywords Earlier in Tag
An SEO theory manifested itself in the early 2000s called the first third rule. It noted that the elements of our language (sentences, titles, paragraphs, even entire web pages) are generally arranged in order of importance. Although not confirmed by Google, Northcutt’s experience with word order experiments has more frequently indicated that this is a factor.
Long Domain Registration Term
Google directly states in this patent that longer domain registration terms predict the legitimacy of a domain. Speculatively, those that engage in webspam understand that it’s a short-term, high volume game of burn/rinse/repeat and don’t purchase domains for longer than they need.
Source(s): Patent US 7346839 B2
Despite Google downplaying their ability to investigate Domain Registrant information, we know of a patent that discusses using Domain Registration Terms to single out webspam schemes. We’ve also seen Matt Cutts speak about private whois contributing to penalties, and encouraging visitors on his blog to report fake whois data. We believe that this is a wise “play it safe” card, despite only a lack thereof being confirmed as a (negative) factor.
Use of HTTPS (SSL)
SSL was officially announced as a new positive ranking factor in 2014, regardless of whether the site processed user input. Gary Illyes downplayed the significance of SSL in 2015, calling it a tiebreaker. Although, for an algorithm based on the numeric scoring of billions of web pages, we’ve found that tiebreakers very often make all of the difference on competitive search queries.
With the advent of Schema.org, a joint project between Google, Yahoo!, Bing, and Yandex to understand logical data entities over keywords, we move further away from the traditional “10 blue links” style of search. Currently, use of Structured Data can improve rankings in a massive variety of scenarios. There are also theories that schema.org can improve traditional search rankings by catering to a ranking method known as entity salience.
The full name of this one is technically “fresh content when query deserves freshness”. This term, Query Deserves Freshness (often shortened to QDF), refers to search queries that would benefit from more current content. This does not apply to every query, but it applies to quite a lot, especially those that are informational in nature. These SEO benefits are just one more reason that brand publishers tend to be very successful.
Domain-wide Fresh Content
There is unconfirmed speculation that domain-wide performance is improved by maintaining fresh content. Speculatively, this means that overall the resource that Google is recommending is less “stale” and more accurate/relevant, especially if at least some significant portion of the information has been worth a little upkeep or supplementation by the owner.
Source(s): Patent US 8549014 B2, Speculation
A Google patent states: “For some queries, older documents may be more favorable than newer ones.” It goes on to describe a scenario where a search result set may be re-ranked by the average age of documents in the retrieved results before being displayed.
Source(s): Patent US 8549014 B2
Domain-wide Old Content
Theoretically, for all we have heard about Query Deserves Freshness (QDF), which serves news-like content in a number of circumstances, there may also be some sort of “Query Deserves Oldness”. Considering that we’ve never been told about “QDO” by Google, it may be reasonable to conclude that older content is always preferred when QDF is not at play. Just like domain-wide freshness, however, we don’t have too much evidence to confirm a domain-wide seniority score.
Quality Outbound Links
Although it’s possible for outbound links to “leak PageRank”, web sites are not supposed to be dead ends. Google rewards authoritative outbound links to “good sites”. To quote the source: “parts of our system encourage links to good sites.”
Relevant Outbound Links
Given that Google analyzes your inbound links for authority, relevance, and context, it seems reasonable to suggest that outbound links should be relevant as well as authoritative. This would likely relate to the Hilltop algorithm, simply in reverse to the manner that’s widely accepted for inbound links.
Good Spelling and Grammar
This is a Bing ranking factor. Amit Singhal stated “these are the kinds of questions we ask” regarding spelling/grammar in Google’s definition of quality content. Matt Cutts said in 2011 that it was not a direct factor as of “a long time ago”, but also that rankings correlate with it anyway. Our agency’s findings have been that the first Panda update made this matter a lot. If nothing else, most content-related factors are clearly affected by spelling/grammar.
We know that Google analyzes the reading level of content, since they created such a search filter for the results page (now removed). We also know that content mills, which Google is not fond of, were rated as very basic, whereas academic writing was rated as very advanced. What we don’t have, as of yet, is a concrete source or study that directly relates reading level to rankings.
Rich media, on top of drawing more traffic from in-line image and video search, has long been considered a component of “high quality, unique content”. Video appeared to be the deciding factor with Panda 2.5. Northcutt’s work has also shown a positive correlation. Currently though, there’s no official, public source signing off on this factor.
Categorical Information Architecture has been an SEO discussion point for a long time, as it seems that Google analyzes topic coverage across entire sites. The exact ranking implications of this are unclear, but Google now refers to this as Structured Data, and at the very least, will use it to display breadcrumbs on the results page, therefore ranking more pages.
Some SEOs claim that the meta keywords tag never mattered for SEO. That’s a myth. The notion that Google ranks meta keywords in 2015 is also a myth. Both of these facts were confirmed the same way – by placing a zero-competition, made-up word in a meta keywords tag, getting that page into the index, then searching that word. Remember though, that Google is not the only search engine, and could theoretically index countless other dynamic sites that benefit from this tag.
Mobile-friendly websites are given a significant ranking advantage. For now, the ranking implications of this appear to pertain only to users searching on mobile devices. This made its way into the mainstream SEO conversation and became more severe during the Mobilegeddon update in 2015, although experts had been speculating on this topic for nearly a decade beforehand.
A good meta description functions as a search ad. Considering how many AdWords agencies exist almost entirely on A/B testing AdWords ads, the marketing value here can’t be overstated. Although keywords used in meta descriptions were once widely considered a direct ranking factor, Matt Cutts stated in 2009 that they no longer are.
Many have suggested that Google Analytics is or may become a Google ranking factor. All evidence at present, as well as very clear statements from Matt Cutts, indicates that any ranking benefits coming from Google Analytics, now or ever in the future, are an absolute myth. That said, it’s an amazingly powerful tool in the right marketer’s hands.
Google Webmaster Tools
Just like Google Analytics, there are no confirmed ranking benefits to using Google Webmaster Tools in any way. Webmaster Tools is still useful in unearthing problems related to other ranking factors on this page, especially those related to manual penalties and certain crawler errors.
ccTLD in National Ranking
Country code TLDs such as .uk and .br are believed to carry with them a ranking bonus to searches from the same country, which is especially useful for internationalization. They should also perform far better in contrast to a ccTLD from another country.
Sitemaps can be useful, though not required, for the purpose of getting more pages of your site into the Google index. The notion that an XML sitemap will improve rankings within Google is a myth. This comes straight from Google and is confirmed by various studies.
Salience of Entities
As time goes on, Google seems to do more to analyze ideas and logical entities in preference to words and phrases. It analyzes how we say things in preference to exact search queries that appear on a page. This process, in simple terms, is what’s making it possible to search for “how to cook meat”, and be returned results for steak recipes that might not mention the word “meat” directly anywhere.
Phrasing and Context
As keyword density is now virtually a non-factor, a basic understanding of Phrase-Based Indexing tells us that if you write about content thoroughly and elaborately, you stand a far better chance of ranking compared to writing generic content that just happens to drop a lot of keywords. A clear component of one Google patent describes this as the “identification of related phrases and clusters of related phrases”.
Source(s): Patent US 7536408 B2
Web Server Near Users
Google functions differently on many local queries, supplementing traditional results with Google Maps results, and potentially altered organic listings as well. The same is true for national and international searches. By hosting your site at least loosely near to your users, such as within the same country, you are likely to enjoy better rankings.
Authorship was an experiment that Google ran from 2011 to 2014, which thrived upon bloggers using the rel="author" tag to establish the reputation of particular authors. Google directly confirmed both the creation and the demise of Authorship. Eric Enge did a nice eulogy on the rise and fall of authorship on Search Engine Land.
The rel="canonical" tag suggests the ideal URL for a page. This can avert duplicate content devaluations and penalties when multiple URLs might result in the same content. Our experience is that this is only a suggestion to Google and one that is often ignored. According to Google it does not directly improve rankings. Despite all of this, it’s a very good idea.
Using rel="author" was once widespread SEO advice and hypothesized as a positive ranking factor, but Google’s use of this factor at all went away along with an entire practice known as Authorship. The notion that rel="author" is beneficial for any reason whatsoever is now regarded as a myth.
Just like rel="author", using rel="publisher" was once widespread SEO advice and also hypothesized as a positive ranking factor. And, just like rel="author", Google’s use of rel="publisher" at all went away along with an entire practice known as Authorship.
URL uses “www” Subdomain
A common misconception propagated by SEO bloggers suggests that a site may rank better if your URLs start with “www”. This originates from the idea that we often force all pages on a site to resolve at “www”. The reason that we actually do this is simply to avoid two URLs serving the same content, which would bring about a negative ranking factor.
Dedicated IP Address
Web server IP addresses can be useful for geo-targeting certain demographics. They can be negative ranking factors when they sit amidst a significant private webspam operation, or are used by the Hilltop algorithm to identify two sites as being from differing owners. But, the notion that just having a dedicated IP address provides a direct ranking advantage has been repeatedly debunked.
Subdomains (thing.yoursite.com) are often viewed as separate websites by Google, as compared to subfolders (yoursite.com/thing/), which are not. This has obvious implications with many other factors on this page. Matt Cutts called subfolders/subdomains “roughly equivalent” in 2012, confirming this now happens less often, but still happens. Panda recovery stories post-2012, such as HubPages’ migration from subfolders to subdomains, prove that it still can be a major factor.
Number of Subdomains
The number of subdomains on a site appears to be the most significant factor in determining whether subdomains are each treated as their own sites (as occurs in nature with free web hosting services and hybrid hosting/social sites like HubPages), or just portions of a common site. Presumably, thousands of subdomains means that they don’t all belong to a single thematic site and are likely each websites in their own right.
Although SEO paranoia seems to make this frequent advice, it’s directly denied by Google. We’ve also found no real evidence to support it, and have seen no noticeable effects when assisting with optimizations for media monetization, which is something that our agency frequently does. We’re therefore prepared to firmly declare this factor a myth.
Keywords in HTML Comments
This is an early SEO theory that’s very easily debunked by a ten second experiment and a little patience. In the cited example, we place an extremely non-competitive made-up word in our source code, then link to it prominently so that it gets indexed. If that word appears in search, we have evidence that Google ranks by that word. In this case, it doesn’t.
Another twist on an early SEO theory that’s very easily debunked by a ten second experiment and a little patience. In the cited example, we place an extremely non-competitive made-up word in our source code, then link to it prominently so that it gets indexed. If that word appears in search, we have evidence that Google ranks by that word. In this case, it doesn’t.
Keywords in CLASSes, NAMEs, and IDs
Once again, we can debunk theories as to whether or not words in an odd place have any impact on search engines by putting a non-competitive phrase there and waiting. It’s not worth even speculating at what Google tells us or what’s in a patent. And again here, we can confirm that this factor is a myth, at least at the time of writing this.
A physical address is theorized as a mark of legitimacy in standard search rankings. This is loosely supported by the notion that Google looks at citations for local SEO (also known as Google Maps SEO) as mentions of Name, Address, Phone (sometimes shortened to “NAP”) together. “Highly satisfying contact information” is also something that Google quality control auditors are instructed to seek out.
Verifiable Phone Number
A phone number is theorized as a mark of legitimacy in standard search rankings. This is loosely supported by the notion that Google looks at citations for local SEO (also known as Google Maps SEO) as mentions of Name, Address, Phone (sometimes shortened to “NAP”) together. “Highly satisfying contact information” is also something that Google quality control auditors are instructed to seek out.
Accessible Contact Page
Theorized as a mark of legitimacy. It appears that this may have originated, or is at least best-supported, from a document called Google’s Quality Rater Guidelines. In this document, Google asks quality control auditors to search for “highly satisfying contact information.”
Low Code-to-Content Ratio
This SEO theory seemed to become widespread in 2011, suggesting that more content and less code is good. Here’s what we know: 1.) Speed is a confirmed factor, 2.) Google’s own PageSpeed Insights tool really presses even a 5Kb reduction in payload size, 3.) Minor code mistakes can cause devaluations and penalties. So at minimum (and more likely) this is an issue of indirect correlation. But to add my own account to several others, I’ve more than once seen this seem to really matter.
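For anyone who wants to test this on their own pages, here is a rough sketch that estimates how much of a page's markup is visible text. The class and function names are invented for this example, and dividing text length by total length is only one plausible definition of the ratio; it is not a metric that Google has published.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

def text_share(html):
    """Characters of visible text divided by total characters of markup."""
    if not html:
        return 0.0
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not count as content.
    text = " ".join("".join(parser.parts).split())
    return len(text) / len(html)

lean = "<p>Hello world</p>"
bloated = "<div><script>var x = 1;</script><p>Hello world</p></div>"
print(text_share(lean), text_share(bloated))
```

A higher share means less code per unit of content. Since any ranking effect may be an indirect result of page speed, a measurement like this is best used to compare versions of the same page.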
Meta Source Tag
The Meta Source Tag was created for Google News in 2010 to better-attribute sources. It comes in two forms: syndication-source (if syndicating a 3rd party) and original-source (you’re the source). In situations where content is syndicated, this may theoretically help avoid duplicate content penalties. If you’re the original-source, this tag is overridden by rel="canonical" anyway.
More Content Per Page
SerpIQ conducted an interesting correlation study comparing the length of content to top rankings, which decidedly favors content with 2,000-2,500 words. It’s not clear if this is an indirect function of other factors, such as these pages being better-liked and therefore drawing more links/shares, or growing popular by ranking for more, longer search query variations.
Meta Geo Tag
Unlike IP address and ccTLDs, Matt Cutts states that they “barely look at this tag, if at all”, although he did suggest that this tag might be considered if you were to use it on a gTLD site (such as “.com”), and attempt to restrict it to a country. So, while this is confirmed to be almost useless, it was suggested that Google does at least look at it and may consider it a factor very, very rarely for internationalization.
Keywords Earlier in Display Title
More than a decade of studies and correlation research suggests that titles that begin with a keyword usually (but not always) rank better than titles ending in a keyword. It’s easy to test and usually confirms: earlier keywords are better. But our chosen source for this suggests more. Thumback.com conducted a study where title word order changed traffic by 20%-30%. Their best-performing titles didn’t begin with a keyword, but were altered (as Google sometimes does) to do so in Google’s results.
Keywords Earlier in Headings
Heading tags are another place where word order appears to really matter. Again, something known as the “first third rule” has been often thrown around on this topic – suggesting that words appearing earlier have more weight. Usually our findings have confirmed this, but regardless, it’s well-worth testing, especially in the H1 position.
Novel Content against Web
A Google Patent and this SEO’s working experience seem to indicate that Google devalues a lot more than just directly similar content. Google has literally patented methods for calling your content uninteresting. Once determining that a set of articles are related, this patent suggests various methods for determining which content is descriptive, unique, and/or weird (in a good way) when compared to others on the same topic.
Novel Content against Self
Google patents suggest that the genuine uniqueness/weirdness of content, as well as how elaborately that content speaks, determines something known as a “novelty score”. This is done by quantifying/qualifying “information nuggets” within text. We pretty much know only that Google’s methods for novelty scoring require comparing many individual documents. Considering that duplicate content is weighed both internally and externally, however, novelty scores likely are as well.
Source(s): Patent US 8140449 B1
Sitewide Average Novelty Score
Kumar and Bharat’s patent titled “Detecting novel document content” describes how single documents may be scored on how “novel” (that’s an adjective) they are. Assigning an average novelty score sitewide also appears to fit the narrative of other known sitewide factors such as sitewide thin content (Panda algorithm behavior) and sitewide expert relevance (Hilltop algorithm behavior).
Source(s): Patents US 8140449 B1, US 8825645 B1, Speculation
Quantity of Comments
We know from countless sources and even certain Webmaster Tools messages that Google can separate user-generated content and analyze it differently. One theory suggests that Google might look at quantities of comments on content to help rate content quality. At present, however, there is no clear evidence for this factor beyond maybe fitting an “if I were Google” narrative. Speculatively, it would also be one of the easiest factors to game.
Positive Sentiment in Comments
It’s theorized that Google looks at blog comment opinions to determine the quality of content. There is a patent and confirmation from Google that they score the sentiment expressed towards an entire site in product reviews. But according to Amit Singhal, they’re not able to apply this to content, because “if we demoted web pages that have negative comments against them, you might not be able to find information about many elected officials”.