Wednesday, May 31, 2017
Sad to say our next three shows on...
Sad to say our next three shows on the All-American Road Show have been canceled. Details below: Unfortunately, the next three weeks of “The All-American Road Show” have been postponed. Below is the full list of dates we are working to reschedule. We apologize for any inconvenience this may cause. Hold on to your tickets for now, and stay tuned for updates. June 1 – Fresno, CA – Save Mart Center June 2 – Mt. View, CA – Shoreline Amphitheatre June 3 – Wheatland, CA – Toyota Amphitheatre June 9 – Southaven, MS – Bankplus Amphitheater at Snowden Grove June 10 – Birmingham, AL – Oak Mountain Amphitheatre June 15 – Charleston, WV – Charleston Civic Center June 16 – Cincinnati, OH – Riverbend Music Center June 17 – Noblesville, IN – Klipsch Music Center Unfortunately, Chris’ appearance at the CMA Music Festival on Sunday, June 11 is cancelled. Thank you for your patience and support. -Team CS
Sad to say our next three shows on...
Sad to say our next three shows on the All-American Road Show have been canceled. Details below: Unfortunately, the next three weeks of “The All-American Road Show” have been postponed. Below is the full list of dates we are working to reschedule. We apologize for any inconvenience this may cause. Hold on to your tickets for now, and stay tuned for updates. June 1 – Fresno, CA – Save Mart Center June 2 – Mt. View, CA – Shoreline Amphitheatre June 3 – Wheatland, CA – Toyota Amphitheatre June 9 – Southaven, MS – Bankplus Amphitheater at Snowden Grove June 10 – Birmingham, AL – Oak Mountain Amphitheatre June 15 – Charleston, WV – Charleston Civic Center June 16 – Cincinnati, OH – Riverbend Music Center June 17 – Noblesville, IN – Klipsch Music Center Unfortunately, Chris’ appearance at the CMA Music Festival on Sunday, June 11 is cancelled. Thank you for your patience and support. -Team CS
Gearing up backstage for the 1st show on 15 in a 30...
3 months ago, my brother was losing a battle...
Gearing up backstage for the 1st show on 15 in a 30...
3 months ago, my brother was losing a battle...
Optimizing AngularJS Single-Page Applications for Googlebot Crawlers
Posted by jrridley
It’s almost certain that you’ve encountered AngularJS on the web somewhere, even if you weren’t aware of it at the time. Here’s a list of just a few sites using Angular:
- Upwork.com
- Freelancer.com
- Udemy.com
- Youtube.com
Any of those look familiar? If so, it’s because AngularJS is taking over the Internet. There’s a good reason for that: Angular- and other React-style frameworks make for a better user and developer experience on a site. For background, AngularJS and ReactJS are part of a web design movement called single-page applications, or SPAs. While a traditional website loads each individual page as the user navigates the site, including calls to the server and cache, loading resources, and rendering the page, SPAs cut out much of the back-end activity by loading the entire site when a user first lands on a page. Instead of loading a new page each time you click on a link, the site dynamically updates a single HTML page as the user interacts with the site.
Why is this movement taking over the Internet? With SPAs, users are treated to a screaming fast site through which they can navigate almost instantaneously, while developers have a template that allows them to customize, test, and optimize pages seamlessly and efficiently. AngularJS and ReactJS use advanced Javascript templates to render the site, which means the HTML/CSS page speed overhead is almost nothing. All site activity runs behind the scenes, out of view of the user.
Unfortunately, anyone who’s tried performing SEO on an Angular or React site knows that the site activity is hidden from more than just site visitors: it’s also hidden from web crawlers. Crawlers like Googlebot rely heavily on HTML/CSS data to render and interpret the content on a site. When that HTML content is hidden behind website scripts, crawlers have no website content to index and serve in search results.
Of course, Google claims they can crawl Javascript (and SEOs have tested and supported this claim), but even if that is true, Googlebot still struggles to crawl sites built on a SPA framework. One of the first issues we encountered when a client first approached us with an Angular site was that nothing beyond the homepage was appearing in the SERPs. ScreamingFrog crawls uncovered the homepage and a handful of other Javascript resources, and that was it.
Another common issue is recording Google Analytics data. Think about it: Analytics data is tracked by recording pageviews every time a user navigates to a page. How can you track site analytics when there’s no HTML response to trigger a pageview?
After working with several clients on their SPA websites, we’ve developed a process for performing SEO on those sites. By using this process, we’ve not only enabled SPA sites to be indexed by search engines, but even to rank on the first page for keywords.
5-step solution to SEO for AngularJS
- Make a list of all pages on the site
- Install Prerender
- “Fetch as Google”
- Configure Analytics
- Recrawl the site
1) Make a list of all pages on your site
If this sounds like a long and tedious process, that’s because it definitely can be. For some sites, this will be as easy as exporting the XML sitemap for the site. For other sites, especially those with hundreds or thousands of pages, creating a comprehensive list of all the pages on the site can take hours or days. However, I cannot emphasize enough how helpful this step has been for us. Having an index of all pages on the site gives you a guide to reference and consult as you work on getting your site indexed. It’s almost impossible to predict every issue that you’re going to encounter with an SPA, and if you don’t have an all-inclusive list of content to reference throughout your SEO optimization, it’s highly likely you’ll leave some part of the site un-indexed by search engines inadvertently.
One solution that might enable you to streamline this process is to divide content into directories instead of individual pages. For example, if you know that you have a list of storeroom pages, include your /storeroom/ directory and make a note of how many pages that includes. Or if you have an e-commerce site, make a note of how many products you have in each shopping category and compile your list that way (though if you have an e-commerce site, I hope for your own sake you have a master list of products somewhere). Regardless of what you do to make this step less time-consuming, make sure you have a full list before continuing to step 2.
2) Install Prerender
Prerender is going to be your best friend when performing SEO for SPAs. Prerender is a service that will render your website in a virtual browser, then serve the static HTML content to web crawlers. From an SEO standpoint, this is as good of a solution as you can hope for: users still get the fast, dynamic SPA experience while search engine crawlers can identify indexable content for search results.
Prerender’s pricing varies based on the size of your site and the freshness of the cache served to Google. Smaller sites (up to 250 pages) can use Prerender for free, while larger sites (or sites that update constantly) may need to pay as much as $200+/month. However, having an indexable version of your site that enables you to attract customers through organic search is invaluable. This is where that list you compiled in step 1 comes into play: if you can prioritize what sections of your site need to be served to search engines, or with what frequency, you may be able to save a little bit of money each month while still achieving SEO progress.
3) "Fetch as Google"
Within Google Search Console is an incredibly useful feature called “Fetch as Google.” “Fetch as Google” allows you to enter a URL from your site and fetch it as Googlebot would during a crawl. “Fetch” returns the HTTP response from the page, which includes a full download of the page source code as Googlebot sees it. “Fetch and Render” will return the HTTP response and will also provide a screenshot of the page as Googlebot saw it and as a site visitor would see it.
This has powerful applications for AngularJS sites. Even with Prerender installed, you may find that Google is still only partially displaying your website, or it may be omitting key features of your site that are helpful to users. Plugging the URL into “Fetch as Google” will let you review how your site appears to search engines and what further steps you may need to take to optimize your keyword rankings. Additionally, after requesting a “Fetch” or “Fetch and Render,” you have the option to “Request Indexing” for that page, which can be handy catalyst for getting your site to appear in search results.
4) Configure Google Analytics (or Google Tag Manager)
As I mentioned above, SPAs can have serious trouble with recording Google Analytics data since they don’t track pageviews the way a standard website does. Instead of the traditional Google Analytics tracking code, you’ll need to install Analytics through some kind of alternative method.
One method that works well is to use the Angulartics plugin. Angulartics replaces standard pageview events with virtual pageview tracking, which tracks the entire user navigation across your application. Since SPAs dynamically load HTML content, these virtual pageviews are recorded based on user interactions with the site, which ultimately tracks the same user behavior as you would through traditional Analytics. Other people have found success using Google Tag Manager “History Change” triggers or other innovative methods, which are perfectly acceptable implementations. As long as your Google Analytics tracking records user interactions instead of conventional pageviews, your Analytics configuration should suffice.
5) Recrawl the site
After working through steps 1–4, you’re going to want to crawl the site yourself to find those errors that not even Googlebot was anticipating. One issue we discovered early with a client was that after installing Prerender, our crawlers were still running into a spider trap:
As you can probably tell, there were not actually 150,000 pages on that particular site. Our crawlers just found a recursive loop that kept generating longer and longer URL strings for the site content. This is something we would not have found in Google Search Console or Analytics. SPAs are notorious for causing tedious, inexplicable issues that you’ll only uncover by crawling the site yourself. Even if you follow the steps above and take as many precautions as possible, I can still almost guarantee you will come across a unique issue that can only be diagnosed through a crawl.
If you’ve come across any of these unique issues, let me know in the comments! I’d love to hear what other issues people have encountered with SPAs.
Results
As I mentioned earlier in the article, the process outlined above has enabled us to not only get client sites indexed, but even to get those sites ranking on first page for various keywords. Here’s an example of the keyword progress we made for one client with an AngularJS site:
Also, the organic traffic growth for that client over the course of seven months:
All of this goes to show that although SEO for SPAs can be tedious, laborious, and troublesome, it is not impossible. Follow the steps above, and you can have SEO success with your single-page app website.
Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don't have time to hunt down but want to read!
Tuesday, May 30, 2017
Hey y’all, I’m playing at #HGTVLodge during...
Hey y’all, I’m playing at #HGTVLodge during...
No, Paid Search Audiences Won’t Replace Keywords
Posted by PPCKirk
I have been chewing on a keyword vs. audience targeting post for roughly two years now. In that time we have seen audience targeting grow in popularity (as expected) and depth.
“Popularity” is somewhat of an understatement here. I would go so far as to say that I've heard it lauded in messianic-like “thy kingdom come, thy will be done” reverential awe by some paid search marketers. as if paid search were lacking a heartbeat before the life-giving audience targeting had arrived and 1-2-3-clear’ed it into relevance.
However, I would argue that despite audience targeting’s popularity (and understandable success), we have also seen the revelation of some weaknesses as well. It turns out it’s not quite the heroic, rescue-the-captives targeting method paid searchers had hoped it would be.
The purpose of this post is to argue against the notion that audience targeting can replace the keyword in paid search.
Now, before we get into the throes of keyword philosophy, I’d like to reduce the number of angry comments this post receives by acknowledging a crucial point.
It is not my intention in any way to set up a false dichotomy. Yes, I believe the keyword is still the most valuable form of targeting for a paid search marketer, but I also believe that audience targeting can play a valuable complementary role in search bidding.
In fact, as I think about it, I would argue that I am writing this post in response to what I have heard become a false dichotomy. That is, that audience targeting is better than keyword targeting and will eventually replace it.
I disagree with this idea vehemently, as I will demonstrate in the rest of this article.
One seasoned (age, not steak) traditional marketer’s point of view
The best illustration I've heard on the core weakness of audience targeting was from an older traditional marketer who has probably never accessed the Keyword Planner in his life.
“I have two teenage daughters.” He revealed, with no small amount of pride.
“They are within 18 months of each other, so in age demographic targeting they are the same person.”
“They are both young women, so in gender demographic targeting they are the same person.”
“They are both my daughters in my care, so in income demographic targeting they are the same person.”
“They are both living in my house, so in geographical targeting they are the same person.”
“They share the same friends, so in social targeting they are the same person.”
“However, in terms of personality, they couldn’t be more different. One is artistic and enjoys heels and dresses and makeup. The other loves the outdoors and sports, and spends her time in blue jeans and sneakers.”
If an audience-targeting marketer selling spring dresses saw them in his marketing list, he would (1) see two older high school girls with the same income in the same geographical area, (2) assume they are both interested in what he has to sell, and (3) only make one sale.
The problem isn’t with his targeting, the problem is that not all those forced into an audience persona box will fit.
In September of 2015, Aaron Levy (a brilliant marketing mind; go follow him) wrote a fabulously under-shared post revealing these weaknesses in another way: What You Think You Know About Your Customers’ Persona is Wrong
In this article, Aaron first bravely broaches the subject of audience targeting by describing how it is far from the exact science we all have hoped it to be. He noted a few ways that audience targeting can be erroneous, and even *gasp* used data to formulate his conclusions.
It’s OK to question audience targeting — really!
Let me be clear: I believe audience targeting is popular because there genuinely is value in it (it's amazing data to have… when it's accurate!). The insights we can get about personas, which we can then use to power our ads, are quite amazing and powerful.
So, why the heck am I droning on about audience targeting weaknesses? Well, I’m trying to set you up for something. I’m trying to get us to admit that audience targeting itself has some weaknesses, and isn’t the savior of all digital marketing that some make it out to be, and that there is a tried-and-true solution that fits well with demographic targeting, but is not replaced by it. It is a targeting that we paid searchers have used joyfully and successfully for years now.
It is the keyword.
Whereas audience targeting chafes under the law of averages (i.e., “at some point, someone in my demographic targeted list has to actually be interested in what I am selling”), keyword targeting shines in individual-revealing user intent.
Keyword targeting does something an audience can never, ever, ever do...
Keywords: Personal intent powerhouses
A keyword is still my favorite form of targeting in paid search because it reveals individual, personal, and temporal intent. Those aren’t just three buzzwords I pulled out of the air because I needed to stretch this already obesely-long post out further. They are intentional, and worth exploring.
Individual
A keyword is such a powerful targeting method because it is written (or spoken!) by a single person. I mean, let’s be honest, it’s rare to have more than one person huddled around the computer shouting at it. Keywords are generally from the mind of one individual, and because of that they have frightening potential.
Remember, audience targeting is based off of assumptions. That is, you're taking a group of people who “probably” think the same way in a certain area, but does that mean they cannot have unique tastes? For instance, one person preferring to buy sneakers with another preferring to buy heels?
Keyword targeting is demographic-blind.
It doesn’t care who you are, where you’re from, what you did, as long as you love me… err, I mean, it doesn’t care about your demographic, just about what you're individually interested in.
Personal
The next aspect of keywords powering their targeting awesomeness is that they reveal personal intent. Whereas the “individual” aspect of keyword targeting narrows our targeting from a group of people to a single person, the “personal” aspect of keyword targeting goes into the very mind of that individual.
Don’t you wish there was a way to market to people in which you could truly discern the intentions of their hearts? Wouldn’t that be a powerful method of targeting? Well, yes — and that is keyword targeting!
Think about it: a keyword is a form of communication. It is a person typing or telling you what is on their mind. For a split second, in their search, you and they are as connected through communication as Alexander Graham Bell and Thomas Watson on the first phone call. That person is revealing to you what's on her mind, and that's a power which cannot be underestimated.
When a person tells Google they want to know “how does someone earn a black belt,” that is telling your client — the Jumping Judo Janes of Jordan — this person genuinely wants to learn more about their services and they can display an ad that matches that intent (Ready for that Black Belt? It’s Not Hard, Let Us Help!). Paid search keywords officiate the wedding of personal intent with advertising in a way that previous marketers could only dream of. We aren’t finding random people we think might be interested based upon where they live. We are responding to a person telling us they are interested.
Temporal
The final note of keyword targeting that cannot be underestimated, is the temporal aspect. Anyone worth their salt in marketing can tell you “timing is everything”. With keyword targeting, the timing is inseparable from the intent. When is this person interested in learning about your Judo classes? At the time they are searching, NOW!
You are not blasting your ads into your users lives, interrupting them as they go about their business or family time hoping to jumpstart their interest by distracting them from their activities. You are responding to their query, at the very time they are interested in learning more.
Timing. Is. Everything.
The situation settles into stickiness
Thus, to summarize: a “search” is done when an individual reveals his/her personal intent with communication (keywords/queries) at a specific time. Because of that, I maintain that keyword targeting trumps audience targeting in paid search.
Paid search is an evolving industry, but it is still “search,” which requires communication, which requires words (until that time when the emoji takes over the English language, but that’s okay because the rioting in the streets will have gotten us first).
Of course, we would be remiss in ignoring some legitimate questions which inevitably arise. As ideal as the outline I've laid out before you sounds, you're probably beginning to formulate something like the following four questions.
- What about low search volume keywords?
- What if the search engines kill keyword targeting?
- What if IoT monsters kill search engines?
- What about social ads?
We’ll close by discussing each of these four questions.
Low search volume terms (LSVs)
Low search volume keywords stink like poo (excuse the rather strong language there). I’m not sure if there is any data on this out there (if so, please share it below), but I have run into low search volume terms far more in the past year than when I first started managing PPC campaigns in 2010.
I don’t know all the reasons for this; perhaps it’s worth another blog post, but the reality is it’s getting harder to be creative and target high-value long-tail keywords when so many are getting shut off due to low search volume.
This seems like a fairly smooth way being paved for Google/Bing to eventually “take over” (i.e., “automate for our good”) keyword targeting, at the very least for SMBs (small-medium businesses) where LSVs can be a significant problem. In this instance, the keyword would still be around, it just wouldn’t be managed by us PPCers directly. Boo.
Search engine decrees
I’ve already addressed the power search engines have here, but I will be the first to admit that, as much as I like keyword targeting and as much as I have hopefully proven how valuable it is, it still would be a fairly easy thing for Google or Bing to kill off completely. Major boo.
Since paid search relies on keywords and queries and language to work, I imagine this would look more like an automated solution (think DSAs and shopping), in which they make keyword targeting into a dynamic system that works in conjunction with audience targeting.
While this was about a year and a half ago, it is worth noting that at Hero Conference in London, Bing Ads’ ebullient Tor Crockett did make the public statement that Bing at the time had no plans to sunset the keyword as a bidding option. We can only hope this sentiment remains, and transfers over to Google as well.
But Internet of Things (IoT) Frankenstein devices!
Finally, it could be that search engines won’t be around forever. Perhaps this will look like IoT devices such as Alexa that incorporate some level of search into them, but pull traffic away from using Google/Bing search bars. As an example of this in real life, you don’t need to ask Google where to find (queries, keywords, communication, search) the best price on laundry detergent if you can just push the Dash button, or your smart washing machine can just order you more without a search effort.
On the other hand, I still believe we're a long way off from this in the same way that the freak-out over mobile devices killing personal computers has slowed down. That is, we still utilize our computers for education & work (even if personal usage revolves around tablets and mobile devices and IoT freaks-of-nature… smart toasters anyone?) and our mobile devices for queries on the go. Computers are still a primary source of search in terms of work and education as well as more intensive personal activities (vacation planning, for instance), and thus computers still rely heavily on search. Mobile devices are still heavily query-centered for various tasks, especially as voice search (still query-centered!) kicks in harder.
The social effect
Social is its own animal in a way, and why I believe it is already and will continue to have an effect on search and keywords (though not in a terribly worrisome way). Social definitely pulls a level of traffic from search, specifically in product queries. “Who has used this dishwasher before, any other recommendations?” Social ads are exploding in popularity as well, and in large part because they are working. People are purchasing more than they ever have from social ads and marketers are rushing to be there for them.
The flip side of this: a social and paid search comparison is apples-to-oranges. There are different motivations and purposes for using search engines and querying your friends.
Audience targeting works great in a social setting since that social network has phenomenally accurate and specific targeting for individuals, but it is the rare individual curious about the ideal condom to purchase who queries his family and friends on Facebook. There will always be elements of social and search that are unique and valuable in their own way, and audience targeting for social and keyword targeting for search complement those unique elements of each.
Idealism incarnate
Thus, it is my belief that as long as we have search, we will still have keywords and keyword targeting will be the best way to target — as long as costs remain low enough to be realistic for budgets and the search engines don’t kill keyword bidding for an automated solution.
Don’t give up, the keyword is not dead. Stay focused, and carry on with your match types!
I want to close by re-acknowledging the crucial point I opened with.
It has not been my intention in any way to set up a false dichotomy. In fact, as I think about it, I would argue that I am writing this in response to what I have heard become a false dichotomy. That is, that audience targeting is better than keyword targeting and will eventually replace it…
I believe the keyword is still the most valuable form of targeting for a paid search marketer, but I also believe that audience demographics can play a valuable complementary role in bidding.
A prime example that we already use is remarketing lists for search ads, in which we can layer on remarketing audiences in both Google and Bing into our search queries. Wouldn’t it be amazing if we could someday do this with massive amounts of audience data? I've said this before, but were Bing Ads to use its LinkedIn acquisition to allow us to layer on LinkedIn audiences into our current keyword framework, the B2B angels would surely rejoice over us (Bing has responded, by the way, that something is in the works!).
Either way, I hope I've demonstrated that far from being on its deathbed, the keyword is still the most essential tool in the paid search marketer’s toolbox.
Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don't have time to hunt down but want to read!
Monday, May 29, 2017
I'm thankful for those who fight...
I'm thankful for those who fight for the freedom of total strangers like you and I! I'm thankful for those like Colton Rusk who put their life on the line for our freedom! Unfortunately some of them won't ever get to come back home. Shake the hands, and thank those men and women in uniform every chance you get. Keep them in your prayers, and ask the man upstairs to guide them, and keep them safe! I'm a proud supporter of the men and women who fight for this country! God Bless the United States Military!
I'm thankful for those who fight...
I'm thankful for those who fight for the freedom of total strangers like you and I! I'm thankful for those like Colton Rusk who put their life on the line for our freedom! Unfortunately some of them won't ever get to come back home. Shake the hands, and thank those men and women in uniform every chance you get. Keep them in your prayers, and ask the man upstairs to guide them, and keep them safe! I'm a proud supporter of the men and women who fight for this country! God Bless the United States Military!
Evidence of the Surprising State of JavaScript Indexing
Posted by willcritchlow
Back when I started in this industry, it was standard advice to tell our clients that the search engines couldn’t execute JavaScript (JS), and anything that relied on JS would be effectively invisible and never appear in the index. Over the years, that has changed gradually, from early work-arounds (such as the horrible escaped fragment approach my colleague Rob wrote about back in 2010) to the actual execution of JS in the indexing pipeline that we see today, at least at Google.
In this article, I want to explore some things we've seen about JS indexing behavior in the wild and in controlled tests and share some tentative conclusions I've drawn about how it must be working.
A brief introduction to JS indexing
At its most basic, the idea behind JavaScript-enabled indexing is to get closer to the search engine seeing the page as the user sees it. Most users browse with JavaScript enabled, and many sites either fail without it or are severely limited. While traditional indexing considers just the raw HTML source received from the server, users typically see a page rendered based on the DOM (Document Object Model) which can be modified by JavaScript running in their web browser. JS-enabled indexing considers all content in the rendered DOM, not just that which appears in the raw HTML.
There are some complexities even in this basic definition (answers in brackets as I understand them):
- What about JavaScript that requests additional content from the server? (This will generally be included, subject to timeout limits)
- What about JavaScript that executes some time after the page loads? (This will generally only be indexed up to some time limit, possibly in the region of 5 seconds)
- What about JavaScript that executes on some user interaction such as scrolling or clicking? (This will generally not be included)
- What about JavaScript in external files rather than in-line? (This will generally be included, as long as those external files are not blocked from the robot — though see the caveat in experiments below)
For more on the technical details, I recommend my ex-colleague Justin’s writing on the subject.
A high-level overview of my view of JavaScript best practices
Despite the incredible work-arounds of the past (which always seemed like more effort than graceful degradation to me) the “right” answer has existed since at least 2012, with the introduction of PushState. Rob wrote about this one, too. Back then, however, it was pretty clunky and manual and it required a concerted effort to ensure both that the URL was updated in the user’s browser for each view that should be considered a “page,” that the server could return full HTML for those pages in response to new requests for each URL, and that the back button was handled correctly by your JavaScript.
Along the way, in my opinion, too many sites got distracted by a separate prerendering step. This is an approach that does the equivalent of running a headless browser to generate static HTML pages that include any changes made by JavaScript on page load, then serving those snapshots instead of the JS-reliant page in response to requests from bots. It typically treats bots differently, in a way that Google tolerates, as long as the snapshots do represent the user experience. In my opinion, this approach is a poor compromise that's too susceptible to silent failures and falling out of date. We've seen a bunch of sites suffer traffic drops due to serving Googlebot broken experiences that were not immediately detected because no regular users saw the prerendered pages.
These days, if you need or want JS-enhanced functionality, more of the top frameworks have the ability to work the way Rob described in 2012, which is now called isomorphic (roughly meaning “the same”).
Isomorphic JavaScript serves HTML that corresponds to the rendered DOM for each URL, and updates the URL for each “view” that should exist as a separate page as the content is updated via JS. With this implementation, there is actually no need to render the page to index basic content, as it's served in response to any fresh request.
I was fascinated by this piece of research published recently — you should go and read the whole study. In particular, you should watch this video (recommended in the post) in which the speaker — who is an Angular developer and evangelist — emphasizes the need for an isomorphic approach:
Resources for auditing JavaScript
If you work in SEO, you will increasingly find yourself called upon to figure out whether a particular implementation is correct (hopefully on a staging/development server before it’s deployed live, but who are we kidding? You’ll be doing this live, too).
To do that, here are some resources I’ve found useful:
- Justin again, describing the difference between working with the DOM and viewing source
- The developer tools built into Chrome are excellent, and some of the documentation is actually really good:
- The console is where you can see errors and interact with the state of the page
- As soon as you get past debugging the most basic JavaScript, you will want to start setting breakpoints, which allow you to step through the code from specified points
- This post from Google’s John Mueller has a decent checklist of best practices
- Although it’s about a broader set of technical skills, anyone who hasn’t already read it should definitely check out Mike’s post on the technical SEO renaissance.
Some surprising/interesting results
There are likely to be timeouts on JavaScript execution
I already linked above to the ScreamingFrog post that mentions experiments they have done to measure the timeout Google uses to determine when to stop executing JavaScript (they found a limit of around 5 seconds).
It may be more complicated than that, however. This segment of a thread is interesting. It's from a Hacker News user who goes by the username KMag and who claims to have worked at Google on the JS execution part of the indexing pipeline from 2006–2010. It’s in relation to another user speculating that Google would not care about content loaded “async” (i.e. asynchronously — in other words, loaded as part of new HTTP requests that are triggered in the background while assets continue to download):
“Actually, we did care about this content. I'm not at liberty to explain the details, but we did execute setTimeouts up to some time limit.
If they're smart, they actually make the exact timeout a function of a HMAC of the loaded source, to make it very difficult to experiment around, find the exact limits, and fool the indexing system. Back in 2010, it was still a fixed time limit.”
What that means is that although it was initially a fixed timeout, he’s speculating (or possibly sharing without directly doing so) that timeouts are programmatically determined (presumably based on page importance and JavaScript reliance) and that they may be tied to the exact source code (the reference to “HMAC” is to do with a technical mechanism for spotting if the page has changed).
It matters how your JS is executed
I referenced this recent study earlier. In it, the author found:
Inline vs. External vs. Bundled JavaScript makes a huge difference for Googlebot
The charts at the end show the extent to which popular JavaScript frameworks perform differently depending on how they're called, with a range of performance from passing every test to failing almost every test. For example here’s the chart for Angular:
It’s definitely worth reading the whole thing and reviewing the performance of the different frameworks. There's more evidence of Google saving computing resources in some areas, as well as surprising results between different frameworks.
CRO tests are getting indexed
When we first started seeing JavaScript-based split-testing platforms designed for testing changes aimed at improving conversion rate (CRO = conversion rate optimization), their inline changes to individual pages were invisible to the search engines. As Google in particular has moved up the JavaScript competency ladder through executing simple inline JS to more complex JS in external files, we are now seeing some CRO-platform-created changes being indexed. A simplified version of what’s happening is:
- For users:
- CRO platforms typically take a visitor to a page, check for the existence of a cookie, and if there isn’t one, randomly assign the visitor to group A or group B
- Based on either the cookie value or the new assignment, the user is either served the page unchanged, or sees a version that is modified in their browser by JavaScript loaded from the CRO platform’s CDN (content delivery network)
- A cookie is then set to make sure that the user sees the same version if they revisit that page later
- For Googlebot:
- The reliance on external JavaScript used to prevent both the bucketing and the inline changes from being indexed
- With external JavaScript now being loaded, and with many of these inline changes being made using standard libraries (such as JQuery), Google is able to index the variant and hence we see CRO experiments sometimes being indexed
I might have expected the platforms to block their JS with robots.txt, but at least the main platforms I’ve looked at don't do that. With Google being sympathetic towards testing, however, this shouldn’t be a major problem — just something to be aware of as you build out your user-facing CRO tests. All the more reason for your UX and SEO teams to work closely together and communicate well.
Split tests show SEO improvements from removing a reliance on JS
Although we would like to do a lot more to test the actual real-world impact of relying on JavaScript, we do have some early results. At the end of last week I published a post outlining the uplift we saw from removing a site’s reliance on JS to display content and links on category pages.
A simple test that removed the need for JavaScript on 50% of pages showed a >6% uplift in organic traffic — worth thousands of extra sessions a month. While we haven’t proven that JavaScript is always bad, nor understood the exact mechanism at work here, we have opened up a new avenue for exploration, and at least shown that it’s not a settled matter. To my mind, it highlights the importance of testing. It’s obviously our belief in the importance of SEO split-testing that led to us investing so much in the development of the ODN platform over the last 18 months or so.
Conclusion: How JavaScript indexing might work from a systems perspective
Based on all of the information we can piece together from the external behavior of the search results, public comments from Googlers, tests and experiments, and first principles, here’s how I think JavaScript indexing is working at Google at the moment: I think there is a separate queue for JS-enabled rendering, because the computational cost of trying to run JavaScript over the entire web is unnecessary given the lack of a need for it on many, many pages. In detail, I think:
- Googlebot crawls and caches HTML and core resources regularly
- Heuristics (and probably machine learning) are used to prioritize JavaScript rendering for each page:
- Some pages are indexed with no JS execution. There are many pages that can probably be easily identified as not needing rendering, and others which are such a low priority that it isn’t worth the computing resources.
- Some pages get immediate rendering – or possibly immediate basic/regular indexing, along with high-priority rendering. This would enable the immediate indexation of pages in news results or other QDF results, but also allow pages that rely heavily on JS to get updated indexation when the rendering completes.
- Many pages are rendered async in a separate process/queue from both crawling and regular indexing, thereby adding the page to the index for new words and phrases found only in the JS-rendered version when rendering completes, in addition to the words and phrases found in the unrendered version indexed initially.
- The JS rendering also, in addition to adding pages to the index:
- May make modifications to the link graph
- May add new URLs to the discovery/crawling queue for Googlebot
The idea of JavaScript rendering as a distinct and separate part of the indexing pipeline is backed up by this quote from KMag, who I mentioned previously for his contributions to this HN thread (direct link) [emphasis mine]:
“I was working on the lightweight high-performance JavaScript interpretation system that sandboxed pretty much just a JS engine and a DOM implementation that we could run on every web page on the index. Most of my work was trying to improve the fidelity of the system. My code analyzed every web page in the index.
Towards the end of my time there, there was someone in Mountain View working on a heavier, higher-fidelity system that sandboxed much more of a browser, and they were trying to improve performance so they could use it on a higher percentage of the index.”
This was the situation in 2010. It seems likely that they have moved a long way towards the headless browser in all cases, but I’m skeptical about whether it would be worth their while to render every page they crawl with JavaScript given the expense of doing so and the fact that a large percentage of pages do not change substantially when you do.
My best guess is that they're using a combination of trying to figure out the need for JavaScript execution on a given page, coupled with trust/authority metrics to decide whether (and with what priority) to render a page with JS.
Run a test, get publicity
I have a hypothesis that I would love to see someone test: That it’s possible to get a page indexed and ranking for a nonsense word contained in the served HTML, but not initially ranking for a different nonsense word added via JavaScript; then, to see the JS get indexed some period of time later and rank for both nonsense words. If you want to run that test, let me know the results — I’d be happy to publicize them.
Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don't have time to hunt down but want to read!