What is a Robots.txt File?
In the beginning, search engine robots spidered every page, every file, attached to the Web. This caused problems for both the search engines and the people using them. Pages that really aren't worth looking at, such as header files meant to be included in every page on a site, were being spidered and showed up in search results. Have you ever searched on Google and gotten a partial page as a result?

The solution was for Google and other search engines to begin looking for a robots.txt file in the root folder of each site (http://www.mydomain.com/robots.txt) to determine what should and shouldn't be searched. This convention is called the Robots Exclusion Standard. This simple text file, created with Notepad or another simple text editor, gives you control over what compliant robots index by telling them not to spider certain folders on your site. The result is happier visitors who come to your site from search engines and get only the full pages you want them to see, not the partial, test or script pages you don't want them to see.

Let's look at some examples to get started:

This allows all spiders to spider all pages on your site. The * is a wildcard that means "all spiders."

    User-agent: *
    Disallow:

This is the opposite of the above example. This one tells all spiders NOT to spider any of your site. You might want this if you have a test site, for example, that is not live yet.

    User-agent: *
    Disallow: /

This example tells all robots to stay out of the cgi-bin and images folders.

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /images/

This example tells only the WebFerret robot not to spider the page ferret.htm. It's only an example. I have nothing against WebFerret. The user agent code for Google is googlebot.

    User-agent: WebFerret
    Disallow: /ferret.htm

It is important that the file is a simple text file, so do not use Microsoft Word to create it. Be careful how you type, too: follow the format of the examples above, with one directive per line and each path written exactly as it appears on your site. A poorly done robots.txt file could harm your site more than help it.
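One way to check a robots.txt file before uploading it is to run the rules through a parser and ask what a given robot may fetch. The following is a minimal sketch, assuming Python and its standard-library urllib.robotparser module. The helper name build_parser, the constant names, and the sample page URLs are illustrative assumptions, not part of the article. The rules are the four examples above, with the ferret.htm path written with a leading slash.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Hypothetical helper: load robots.txt rules from a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The four example files from the article, as plain strings.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_FOLDERS = "User-agent: *\nDisallow: /cgi-bin/\nDisallow: /images/\n"
BLOCK_ONE_ROBOT = "User-agent: WebFerret\nDisallow: /ferret.htm\n"

# Sample URLs are assumptions used only for illustration.
SITE = "http://www.mydomain.com"

if __name__ == "__main__":
    checks = [
        ("allow all", ALLOW_ALL, "googlebot", SITE + "/index.htm"),
        ("block all", BLOCK_ALL, "googlebot", SITE + "/index.htm"),
        ("block folders", BLOCK_FOLDERS, "googlebot", SITE + "/images/logo.gif"),
        ("block folders", BLOCK_FOLDERS, "googlebot", SITE + "/index.htm"),
        ("block one robot", BLOCK_ONE_ROBOT, "WebFerret", SITE + "/ferret.htm"),
        ("block one robot", BLOCK_ONE_ROBOT, "googlebot", SITE + "/ferret.htm"),
    ]
    for label, rules, agent, url in checks:
        allowed = build_parser(rules).can_fetch(agent, url)
        print(f"{label}: {agent} -> {url}: {'allowed' if allowed else 'blocked'}")
```

This only reports how Python's parser reads the rules. Individual search engine robots may interpret edge cases differently, so treat it as a sanity check rather than a guarantee.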