Optimize your URLs - Best practices for crawling & indexing
Search engine marketing best practices
1. Context: Why do you care?
2. Reduce inefficient crawling of your site
3. Avoid maverick coding practices
4. Remove user-specific details from URLs
5. Optimize dynamic URLs
6. Rein in infinite spaces
7. Disallow actions Googlebot can’t perform
8. Get your preferred URLs indexed
9. Resources
The web is big. Really big. How does a search engine spider deal with it all? To cope, search engines must:
· Discover unique content
· Prioritize crawling
o Crawl new content
o Refresh old content
o Crawl fewer duplicates
· Keep all the good stuff in the index
· Return relevant search results
Focus on efficiency in these steps
Funnel your crawling “budget” toward your most important content
Avoid maverick coding practices
· Discourage alternative encodings
o shop.example.com/items/Periods-Styles__end-table_W0QQ_catrefZ1QQ_dmptZAntiquesQ5fFurnitureQQ_flnZ1QQ_npmvZ3QQ_sacatZ100927QQ_trksidZp3286Q2ec0Q2em282
o Here the site invents its own encoding, where W0 stands for "?" and QQ stands for "&"; crawlers cannot reliably decode such substitutions
· Eliminate expand/collapse "parameters"
o www.example.com/ABN/GPC.nsf/MCList?OpenAgent&expand=1,3,15
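A minimal sketch of the alternative: build query strings with the standard "?" and "&" delimiters so every crawler can parse them. The parameter names and values below are illustrative, not taken from a real site.

```python
from urllib.parse import urlencode

# Hypothetical item attributes; keys and values are illustrative only.
params = {"catref": "1", "dmpt": "Antiques_Furniture", "sacat": "100927"}

# Standard encoding: '?' starts the query string and '&' separates pairs,
# so any crawler can recognize each name/value pair.
url = "http://shop.example.com/items/end-table?" + urlencode(params)
print(url)
# http://shop.example.com/items/end-table?catref=1&dmpt=Antiques_Furniture&sacat=100927
```

Letting the standard library do the encoding also guarantees special characters in values are percent-escaped consistently.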
Remove user-specific details from URLs
· Remove from the URL path
o www.example.com/cancun+hotel+zone-hotels-1-23-a7a14a13a4a23.html
o www.example.com/ikhgqzf20amswbqg1srbrh55/index.aspx?tpr=4&act=ela
o Creates infinite URLs to crawl
o Difficult to understand algorithmically
· Keywords in name/value pairs are just as good as in the path
o www.example.com/skates/riedell/carrera/
o www.example.com/skates.php?brand=riedell&model=carrera
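One way to audit a site for this problem is a simple heuristic that flags path segments looking like session tokens, as in the second example above. The threshold and regex here are assumptions you would tune per site, not a definitive rule.

```python
import re

# Heuristic sketch: flag path segments that look like session tokens
# (long runs of mixed letters and digits with no word-like structure).
SESSION_SEGMENT = re.compile(r"^[a-z0-9]{20,}$")

def has_user_token(url_path: str) -> bool:
    """Return True if any path segment resembles a per-user session ID."""
    return any(SESSION_SEGMENT.match(seg)
               for seg in url_path.strip("/").split("/"))

print(has_user_token("/ikhgqzf20amswbqg1srbrh55/index.aspx"))  # True
print(has_user_token("/skates/riedell/carrera/"))              # False
```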
Optimize dynamic URLs
· Dynamic URLs contain name/value pairs
o skates.php?size=6&brand=riedell
· Create patterns for crawlers to understand
o www.example.com/article.php?category=1&article=3&sid=123
o www.example.com/article.php?category=1&article=3&sid=456
o www.example.com/article.php?category=2&article=3&sid=789
· Use cookies to hide user-specific details
o www.example.com/skates.php?query=riedell+she+devil&id=9823576
o www.example.com/skates.php?ref=www.fastgirlskates.com&color=red
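The three article URLs above can be sketched collapsing to one crawlable pattern by dropping the session parameter. "sid" comes from the slide's example; the drop-list is an assumption you would tailor to your own site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameters that identify a session, not content (site-specific assumption).
DROP = {"sid"}

def canonicalize(url: str) -> str:
    """Rebuild the URL without session-style parameters."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in DROP]
    return urlunsplit(parts._replace(query=urlencode(kept)))

a = canonicalize("http://www.example.com/article.php?category=1&article=3&sid=123")
b = canonicalize("http://www.example.com/article.php?category=1&article=3&sid=456")
print(a)       # http://www.example.com/article.php?category=1&article=3
print(a == b)  # True
```

Three URLs that once looked distinct now resolve to one, so the crawler fetches the content once instead of three times.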
Rein in infinite spaces
· Uncover issues in CMS
o www.example.com/wiki/index.php?title=Special:Ipblocklist&limit=250&offset=423780&ip=
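A server-side guard is one way to close an infinite space like the paginated list above: reject offsets beyond the real data so a crawler following "next" links hits a dead end. `TOTAL_ROWS` and the function shape are hypothetical, a sketch rather than any particular CMS's fix.

```python
# Hypothetical size of the real dataset behind the paginated listing.
TOTAL_ROWS = 1200

def list_page(limit: int, offset: int):
    """Return a status plus the rows for one page of the listing."""
    if offset >= TOTAL_ROWS:
        return ("404 Not Found", [])  # nothing here: the crawl stops
    rows = range(offset, min(offset + limit, TOTAL_ROWS))
    return ("200 OK", list(rows))

print(list_page(250, 423780)[0])  # 404 Not Found
print(list_page(250, 1000)[0])    # 200 OK
```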
Disallow actions Googlebot can’t perform
· Googlebot is too cheap to ‘Add to cart’
o Disallow shopping carts
o http://www.example.com/index.php?page=EComm.AddToCart&Pid=3301674647606&returnTo=L2luZGV4LnBocD9wYWdlPUVDb21tLlByb2QmUGlkPTMzMDE2NzQ2NDc2OTI=
· Googlebot is too shy to ‘Contact us’
o Disallow contact forms, especially if they have unique URLs
o http://www.example.com/bb/posting.zsp?mode=newtopic&f=2&subject=Seeking%20information%20about%20roller%20derby%20training
· Googlebot forgets his password a lot
o Disallow login pages
o https://www.example.com/login.asp?er=43d9257de47d8b08a91069cccb584ab83ff21140bd46e81656dab3507f45d1ab079cd77244231e557d724dc1df1a641
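The three cases above can be blocked with a few robots.txt rules; the standard-library parser can verify they behave as intended. The paths mirror the example URLs and would need adapting to your own site.

```python
import urllib.robotparser

# robots.txt rules matching the cart, contact-form, and login examples above.
rules = """\
User-agent: *
Disallow: /index.php?page=EComm.AddToCart
Disallow: /bb/posting.zsp
Disallow: /login.asp
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Cart URLs are blocked; ordinary product URLs stay crawlable.
print(rp.can_fetch("*", "http://www.example.com/index.php?page=EComm.AddToCart&Pid=1"))
# False
print(rp.can_fetch("*", "http://www.example.com/skates.php?brand=riedell"))
# True
```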
Get your preferred URLs indexed
· Set your preferred domain in Webmaster Tools
o www.example.com vs. example.com
· Put canonical URLs in your Sitemap
· Use rel="canonical" on any duplicate URLs
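A minimal sketch of the last point: every duplicate page carries a link element in its head pointing at the one preferred URL. The preferred URL here is an assumption for illustration.

```python
from html import escape

# Hypothetical preferred URL that all parameter variants should point to.
CANONICAL = "http://www.example.com/skates.php?brand=riedell&model=carrera"

def canonical_tag(preferred_url: str) -> str:
    """Build the <link> element to place in the <head> of each duplicate."""
    return '<link rel="canonical" href="%s" />' % escape(preferred_url, quote=True)

print(canonical_tag(CANONICAL))
# <link rel="canonical" href="http://www.example.com/skates.php?brand=riedell&amp;model=carrera" />
```

Note the "&" is escaped to "&amp;", as required inside an HTML attribute.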
Webmaster Central
www.google.com/webmasters
· Help Center: Documentation, FAQs, webmaster guidelines
· Blog: Hot topics & best practices
· Help Forum: Ask questions, engage with others