|
|
#1 (permalink) |
|
Top Level Poster
Join Date: Feb 1975
Location: Irvine, CA
Posts: 445
Thanks: 2
Thanked 6 Times in 6 Posts
|
I have heard that a robots.txt file will stop a crawler from following links. What are the benefits of not having a robot scan a certain page? Is anyone currently using it?
__________________
www.affiliateprograms.com |
|
|
|
|
#2 (permalink) |
|
Senior Member
Join Date: Apr 2008
Posts: 22
Thanks: 0
Thanked 0 Times in 0 Posts
|
Hi
You can enter whatever rules you want there. For example, I use it to stop all agents from crawling the templates directory, and to stop linksmanager from crawling pages it doesn't need. See the example below:

User-Agent: *
Disallow: /yourdomain.com/Templates
Allow: /

User-agent: linksmanager
Disallow: /cgi-bin/
Disallow: /cp/
Disallow: /css/
Disallow: /EN/
Disallow: /images/
Disallow: /modlogan/
Disallow: /PT/
Disallow: /webalizer/
Disallow: /widgets/
Disallow: /secure/
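If you want to check what a set of rules like this actually blocks, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules are a shortened version of the ones in this post; the `/Templates/` path, the `example.com` host, and the `SomeBot` agent name are assumptions made up for the illustration. Python's parser is only one interpretation of the rules, and real crawlers may resolve edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Shortened rules in the spirit of the example above.
# Assumption: the templates directory sits at /Templates/ on the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Templates/

User-agent: linksmanager
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /secure/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    # example.com is a placeholder host; only the path matters here.
    return parser.can_fetch(agent, "http://example.com" + path)


for agent, path in [
    ("SomeBot", "/Templates/header.html"),
    ("SomeBot", "/index.html"),
    ("linksmanager", "/cgi-bin/form.pl"),
    ("linksmanager", "/Templates/header.html"),
]:
    print(agent, path, allowed(agent, path))
```

One thing this shows: an agent that has its own group (linksmanager here) follows only that group, so the `/Templates/` rule from the `*` group does not apply to it unless you repeat it.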
|
|
|
|
#3 (permalink) |
|
Junior Member
|
Robots.txt lets you allow or disallow crawlers and bots from going through a specified page or directory.
__________________
Coderea Technologies - Build your outsourcing strategy now! |
|
|
|
|
#4 (permalink) |
|
Top Level Poster
Join Date: Feb 1975
Location: Irvine, CA
Posts: 445
Thanks: 2
Thanked 6 Times in 6 Posts
|
Why would we want the crawler not to see a page, though? It might find something on that page that helps the site rank.
__________________
www.affiliateprograms.com |
|
|
|
|
#5 (permalink) |
|
Senior Member
Join Date: Feb 2009
Posts: 44
Thanks: 0
Thanked 2 Times in 2 Posts
|
Not always. For example, there may be a duplicate version of a page (such as a print version), or pages that are taken from other places and are known to be duplicates, so you want to block that content right away. You can also "break" a link by having a redirect go through a blocked folder.
That saves you some PageRank as well, since Google won't count it as a link from your site (let me remind you: when your site links to bad or banned sites, it can harm you as well). |
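As a rough illustration of both ideas, the sketch below blocks a print-version folder and a redirect folder, then checks them with Python's standard `urllib.robotparser`. The folder names `/print/` and `/out/`, the `url` query parameter, and the `example.com` host are all hypothetical. It only shows that a crawler obeying robots.txt would not fetch those URLs; how a search engine then treats the link is not something the code can demonstrate.

```python
from urllib.parse import quote
from urllib.robotparser import RobotFileParser

# Hypothetical layout: print versions under /print/, outbound redirects under /out/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /out/
"""


def outbound_link(target_url):
    """Build a local redirect URL that passes through the blocked /out/ folder."""
    return "/out/?url=" + quote(target_url, safe="")


def crawlable(path, agent="AnyBot"):
    """Return True if a robots.txt-obeying crawler may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "http://example.com" + path)


print(crawlable("/articles/cats.html"))        # normal page
print(crawlable("/print/cats.html"))           # duplicate print version
print(crawlable(outbound_link("http://other.example/page")))  # redirect link
```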
|
|
|
|
#6 (permalink) |
|
Top Level Poster
Join Date: Feb 1975
Location: Irvine, CA
Posts: 445
Thanks: 2
Thanked 6 Times in 6 Posts
|
I didn't think about duplicate content. I guess that makes sense.
__________________
www.affiliateprograms.com |
|
|
|
|
#7 (permalink) |
|
Member
Join Date: Feb 2009
Posts: 82
Thanks: 0
Thanked 1 Time in 1 Post
|
If you don't want robots to follow a link to a "bad" site, does a nofollow link do the same thing?
Also, can you robots.txt a section of a page? <body> I like cats <robots.txt> No I don't </robots.txt> </body> |
|
|
|
|
#8 (permalink) |
|
Senior Member
Join Date: Feb 2009
Posts: 44
Thanks: 0
Thanked 2 Times in 2 Posts
|
Yes, nofollow does the same thing, with some exceptions in how PageRank is passed.
I don't have any evidence that a <robots.txt> tag exists, especially for Google. Also, syntax like that doesn't make any sense, as it contains no rules.
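To make the difference concrete: robots.txt is a separate file with rules for whole paths, while nofollow is set per link inside the page. Here is a minimal sketch, using Python's standard `html.parser`, of how a crawler that honors `rel="nofollow"` might sort the links on a page. The class name, the sample HTML, and the `good.example` / `bad.example` hosts are made up for the illustration; this is not how Google or any particular crawler is implemented.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect <a href> targets, split by whether rel contains "nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, nofollowed) link lists for an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.followed, collector.nofollowed


SAMPLE = (
    '<body>I like cats <a href="http://good.example/">ok</a> '
    '<a rel="nofollow" href="http://bad.example/">no</a></body>'
)
followed, nofollowed = split_links(SAMPLE)
print(followed, nofollowed)
```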
Last edited by Sandis; 04-03-2009 at 11:08 AM. |
|
|