Tuesday, August 11, 2009

Optimize your URLs - Best practices for crawling & indexing

Search engine marketing best practices
1. Context: Why do you care?
2. Reduce inefficient crawling of your site
3. Avoid maverick coding practices
4. Remove user-specific details from URLs
5. Optimize dynamic URLs
6. Rein in infinite spaces
7. Disallow actions Googlebot can’t perform
8. Get your preferred URLs indexed
9. Resources

The internet is big. Really big. How do search engines deal with it?

· Discover unique content
· Prioritize crawling
o Crawl new content
o Refresh old content
o Crawl fewer duplicates
· Keep all the good stuff in the index
· Return relevant search results

Focus on efficiency in these steps
Funnel your crawling “budget” toward your most important content


Avoid maverick coding practices
· Discourage alternative encodings
§ shop.example.com/items/Periods-Styles__end-table_W0QQ_catrefZ1QQ_dmptZAntiquesQ5fFurnitureQQ_flnZ1QQ_npmvZ3QQ_sacatZ100927QQ_trksidZp3286Q2ec0Q2em282
§ Where [W0 = ?] and [QQ = &]
· Eliminate expand/collapse "parameters"
o www.example.com/ABN/GPC.nsf/MCList?OpenAgent&expand=1,3,15

Remove user-specific details from URLs

· Remove from the URL path
o www.example.com/cancun+hotel+zone-hotels-1-23-a7a14a13a4a23.html
o www.example.com/ikhgqzf20amswbqg1srbrh55/index.aspx?tpr=4&act=ela
o Creates infinite URLs to crawl
o Difficult to understand algorithmically
· Keywords in name/value pairs are just as good as in the path
o www.example.com/skates/riedell/carrera/
o www.example.com/skates.php?brand=riedell&model=carrera

Optimize dynamic URLs
· Dynamic URLs contain name/value pairs
o skates.php?size=6&brand=riedell
· Create patterns for crawlers to understand
o www.example.com/article.php?category=1&article=3&sid=123
o www.example.com/article.php?category=1&article=3&sid=456
o www.example.com/article.php?category=2&article=3&sid=789
· Use cookies to hide user-specific details
o www.example.com/skates.php?query=riedell+she+devil&id=9823576
o www.example.com/skates.php?ref=www.fastgirlskates.com&color=red
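
As a rough illustration of the pattern above, the sketch below strips user-specific name/value pairs so that URLs differing only in those pairs collapse to one address. The parameter names in USER_SPECIFIC_PARAMS (sid, id, ref) are an assumption read off the example URLs, and strip_user_params is a hypothetical helper, not part of any Google tool.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: on this hypothetical site, these names carry user-specific data.
USER_SPECIFIC_PARAMS = {"sid", "id", "ref"}


def strip_user_params(url, drop=USER_SPECIFIC_PARAMS):
    """Return url with the user-specific query parameters removed."""
    parts = urlsplit(url)
    # Keep every name/value pair whose name is not in the drop set,
    # preserving the original order of the remaining pairs.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in drop
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    for url in (
        "www.example.com/article.php?category=1&article=3&sid=123",
        "www.example.com/article.php?category=1&article=3&sid=456",
        "www.example.com/skates.php?ref=www.fastgirlskates.com&color=red",
    ):
        print(strip_user_params(url))
```

On the site itself, the dropped values would be carried in a cookie rather than in the URL, as the last bullet suggests.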

Rein in infinite spaces
· Uncover issues in CMS
o www.example.com/wiki/index.php?title=Special:Ipblocklist&limit=250&offset=423780&ip=

Disallow actions Googlebot can’t perform

· Googlebot is too cheap to ‘Add to cart’
o Disallow shopping carts
o http://www.example.com/index.php?page=EComm.AddToCart&Pid=3301674647606&returnTo=L2luZGV4LnBocD9wYWdlPUVDb21tLlByb2QmUGlkPTMzMDE2NzQ2NDc2OTI=
· Googlebot is too shy to ‘Contact us’
o Disallow contact forms, especially if they have unique URLs
o http://www.example.com/bb/posting.zsp?mode=newtopic&f=2&subject=Seeking%20information%20about%20roller%20derby%20training
· Googlebot forgets his password a lot
o Disallow login pages
o https://www.example.com/login.asp?er=43d9257de47d8b08a91069cccb584ab83ff21140bd46e81656dab3507f45d1ab079cd77244231e557d724dc1df1a641
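
One way to check such disallow rules before publishing them is to run them through Python's standard-library robots.txt parser. The sketch below is only an illustration: the two Disallow paths are assumptions taken from the example URLs above, and the shopping-cart URL is left out because it would need a rule that matches on the query string, while this parser does plain prefix matching on the rule path.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a hypothetical robots.txt for www.example.com covering the
# login page and the contact/posting form from the examples above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /login.asp
Disallow: /bb/posting.zsp
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    for url in (
        "https://www.example.com/login.asp?er=43d9257de47d8b08",
        "http://www.example.com/bb/posting.zsp?mode=newtopic&f=2",
        "http://www.example.com/skates/riedell/carrera/",
    ):
        print(url, parser.can_fetch("Googlebot", url))
```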

Get your preferred URLs indexed

· Set your preferred domain in Webmaster Tools
o www.example.com vs. example.com
· Put canonical URLs in your Sitemap
· Use the new rel="canonical" on any duplicate URLs
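
To make the last two bullets concrete, here is a minimal sketch that rewrites a URL onto the preferred host and renders the link element a duplicate page would carry in its head. It assumes www.example.com is the preferred domain; canonical_link and PREFERRED_HOST are hypothetical names introduced only for this example.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumption: the site owner prefers the www form of the domain.
PREFERRED_HOST = "www.example.com"


def canonical_link(url, preferred_host=PREFERRED_HOST):
    """Return a rel="canonical" link element pointing at the preferred host."""
    parts = urlsplit(url)
    # Work out the bare (non-www) form of the preferred host.
    if preferred_host.startswith("www."):
        bare = preferred_host[4:]
    else:
        bare = preferred_host
    # Only rewrite hosts that are the www / non-www variants of the same site.
    if parts.netloc in (bare, "www." + bare):
        parts = parts._replace(netloc=preferred_host)
    # Escape the URL so characters such as & are valid inside the attribute.
    href = escape(urlunsplit(parts), quote=True)
    return '<link rel="canonical" href="%s">' % href


if __name__ == "__main__":
    print(canonical_link("http://example.com/skates.php?brand=riedell&model=carrera"))
```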

Webmaster Central
www.google.com/webmasters

· Help Center: Documentation, FAQs, webmaster guidelines
· Blog: Hot topics & best practices
· Help Forum: Ask questions, engage with others
