Getting into the index
Websites that load much of their content dynamically through AJAX can have trouble getting that content indexed. Since the content is not present in the initial source code and does not have its own hyperlinked location, the search engine crawler cannot find it, let alone index it. This is especially a problem for websites that load their navigation and article deep links using AJAX, since it can prevent the entire website from appearing in the index.
This difficulty is the result of Google’s crawler not being able to fully cope with the complexities that come with adding an AJAX engine in between the client and server. Because AJAX makes it possible to present an entire website under a single URL, the crawler may see only one page where visitors see many, and everything beyond that page risks being excluded from the index.
Feed the spider
Google would not be Google had they not devised a protocol to deal with these issues. The little-known AJAX crawling specification provides two important mechanisms to help Google understand and properly index an AJAX-enabled web page or site:
- The exclamation mark URL: AJAX URLs usually look like this: http://www.example.com/#key=value. By changing this to http://www.example.com/#!key=value, you tell the crawler that this is an AJAX URL. Upon encountering this URL, the crawler re-fetches it as http://www.example.com/?_escaped_fragment_=key=value (more on that later).
- The fragment meta tag: some pages are AJAX-enabled but have no #! in their URL, because they are linked to in the traditional way. Think of the homepage. To tell the crawler that this is an AJAX-enabled page, include the following meta tag in the head section:
<meta name="fragment" content="!">. Upon encountering this meta tag, the crawler re-fetches the URL as http://www.example.com/?_escaped_fragment_= (the parameter is present, but its value is empty).
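To make the two rewrites above concrete, here is a minimal Python sketch of how a crawler might map a URL to its _escaped_fragment_ form. The function name and the has_fragment_meta flag are made up for illustration, and the escaping is a simplification: it percent-encodes more characters than the specification strictly requires, which is harmless as long as the server decodes the value again.

```python
from urllib.parse import urlsplit, urlunsplit, quote


def to_escaped_fragment_url(url: str, has_fragment_meta: bool = False) -> str:
    """Return the URL a crawler would request instead of `url` (illustrative only)."""
    parts = urlsplit(url)

    if parts.fragment.startswith("!"):
        # "#!key=value" case: everything after the "!" moves into the query string.
        # "=" is kept readable; other special characters are percent-encoded.
        param = "_escaped_fragment_=" + quote(parts.fragment[1:], safe="=")
    elif has_fragment_meta and not parts.fragment:
        # Meta tag case: no hash fragment, so the parameter gets an empty value.
        param = "_escaped_fragment_="
    else:
        # Plain URL (or a "#" without "!"): nothing to rewrite.
        return url

    # Existing query parameters stay in place; the new one is appended.
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(to_escaped_fragment_url("http://www.example.com/#!key=value"))
    print(to_escaped_fragment_url("http://www.example.com/", has_fragment_meta=True))
```

Note that the host, path and any existing query parameters are left untouched; only the hash fragment changes place.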
What to do behind the scenes
Both of these mechanisms require the web server to process the _escaped_fragment_ URL parameter that the crawler adds to its request. The idea is that the server returns a so-called ‘HTML snapshot’ of the requested page whenever the parameter is present. This HTML snapshot is basically a non-AJAX version of the requested page: the same content, already rendered into the HTML. For example, to get index.php#page=2 indexed as a separate page, change the URL to index.php#!page=2, catch the ?_escaped_fragment_=page=2 parameter on the server and use its value to serve up index.php with the correct content loaded.
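The server side of this can be sketched in a few lines of Python. This is an illustration under assumptions, not a drop-in implementation: the PAGES dictionary, the function names and the app.js script are all hypothetical, and a real site would render its snapshots from its own templates or from a headless browser.

```python
import html
from urllib.parse import urlsplit, parse_qs

# Hypothetical content store standing in for the real site content.
PAGES = {"1": "First page", "2": "Second page"}


def render_ajax_shell() -> str:
    """Regular response: an empty shell that app.js (hypothetical) fills in via AJAX."""
    return ('<html><head><meta name="fragment" content="!"></head>'
            '<body><div id="app"></div><script src="app.js"></script></body></html>')


def render_snapshot(state: dict) -> str:
    """HTML snapshot: the same content, rendered directly into the HTML."""
    page = state.get("page", "1")  # assumed default when no page is given
    body = PAGES.get(page, "Not found")
    return (f"<html><body><h1>Page {html.escape(page)}</h1>"
            f"<p>{html.escape(body)}</p></body></html>")


def handle_request(url: str) -> str:
    # keep_blank_values=True so that "?_escaped_fragment_=" (empty value) is detected too.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if "_escaped_fragment_" not in query:
        return render_ajax_shell()

    # parse_qs has already percent-decoded the value, e.g. "page=2".
    fragment = query["_escaped_fragment_"][0]
    # Read the fragment itself as key=value pairs to recover the AJAX state.
    state = {key: values[0] for key, values in parse_qs(fragment).items()}
    return render_snapshot(state)


if __name__ == "__main__":
    print(handle_request("http://www.example.com/index.php?_escaped_fragment_=page=2"))
```

Ordinary visitors still get the AJAX shell; only requests carrying the parameter receive the pre-rendered snapshot.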
Easy? No. Necessary? Yes.
Note that this is just the tip of the iceberg. For more in-depth information and detailed guidance, make sure to check out the AJAX section on Google Developers.