Problems with Googlebot
  • Hi

    First, I'm still using Campsite (3.4.3), and I'm running it for 4 domains (websites) in 4 different languages. So far so good, but lately I discovered that something is blocking Google from indexing my sites. I tried various solutions but no luck so far. Neither my firewall nor my ISP is blocking Google. Can someone please point me in the right direction? My sites are up but invisible to Google.

    br/grega
  • 5 Comments
  • Andrey Podshivalov (Member, Administrator, Sourcefabric Team)
    You should use Google Webmaster Tools: https://www.google.com/webmasters/tools/

    Add your sites to your account; after that you can analyze Googlebot errors/reports and set up search options.
  • Thanks for your reply, Andrey. I'm already using Google Webmaster Tools, and when I run Fetch as Googlebot I get a timeout or unreachable status. As I said, my site is totally invisible to Google. I'm assuming my Google problems are caused by the .htaccess file. Should I add something to .htaccess, since I can't just remove it?
  • Andrey Podshivalov (Member, Administrator, Sourcefabric Team)
    It could be a network issue. If the site is public, there are no restrictions on crawlers by default.
  • Our system admin checked that and assured me that nothing is blocking web crawlers. I also just found out that robots.txt is unreachable too, and if I remove .htaccess I can see it.

    My .htaccess file looks like this:

    # There might be no access to apache config, set options here
    DirectoryIndex index.php index.html
    Options -Indexes FollowSymLinks -MultiViews

    # Options might not allowed here, use Rewrite rules instead
    <IfModule mod_rewrite.c>
        RewriteEngine On

        RewriteCond %{REQUEST_URI} /+get_img$
        RewriteRule . get_img.php [L]

        RewriteCond %{REQUEST_URI} /+attachment/+
        RewriteRule . attachment.php [L]

        RewriteCond %{REQUEST_URI} !\.swf$
        RewriteCond %{REQUEST_URI} !\.php$
        RewriteCond %{REQUEST_URI} !\.html$
        RewriteCond %{REQUEST_URI} !\.css$
        RewriteCond %{REQUEST_URI} !\.js$
        RewriteCond %{REQUEST_URI} !/+admin
        RewriteCond %{REQUEST_URI} !/+install
        RewriteCond %{REQUEST_URI} !(/+plugins/[^/]*)?/+javascript
        RewriteCond %{REQUEST_URI} !(/+plugins/[^/]*)?/+css
        RewriteCond %{REQUEST_URI} !(/+plugins/[^/]*)?/+images
        RewriteCond %{REQUEST_URI} !/+templates
        RewriteRule . index.php [L]

        RewriteCond %{REQUEST_URI} .tpl$
        RewriteRule . index.php [L]

        RewriteCond %{REQUEST_URI} /+admin$|/+admin/+.*|/+admin-files
        RewriteRule . admin.php [L]
    </IfModule>

    # Uncomment it for gui backup/restore process
    # NOTE: It can be incompatible on some shared hosting
    # php_value output_buffering Off
  • Andrey Podshivalov (Member, Administrator, Sourcefabric Team)
    As I said, it's not a problem with .htaccess.

    robots.txt is not mandatory for crawlers. Btw, if you want robots.txt to be accessible, add this to .htaccess:

    RewriteCond %{REQUEST_URI} !\.txt$
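
    A RewriteCond only applies to the RewriteRule that immediately follows it, so the condition presumably belongs with the other exclusions in front of the catch-all "RewriteRule . index.php [L]". A minimal sketch of that block, assuming the .htaccess quoted earlier in this thread is otherwise unchanged; the exact placement is an assumption and was not confirmed in the thread:

    ```apache
    # Catch-all block from the quoted .htaccess, with one added exclusion.
    # All RewriteCond lines are ANDed and guard only the next RewriteRule.
    RewriteCond %{REQUEST_URI} !\.txt$
    RewriteCond %{REQUEST_URI} !\.swf$
    RewriteCond %{REQUEST_URI} !\.php$
    RewriteCond %{REQUEST_URI} !\.html$
    RewriteCond %{REQUEST_URI} !\.css$
    RewriteCond %{REQUEST_URI} !\.js$
    RewriteCond %{REQUEST_URI} !/+admin
    RewriteCond %{REQUEST_URI} !/+install
    RewriteCond %{REQUEST_URI} !(/+plugins/[^/]*)?/+javascript
    RewriteCond %{REQUEST_URI} !(/+plugins/[^/]*)?/+css
    RewriteCond %{REQUEST_URI} !(/+plugins/[^/]*)?/+images
    RewriteCond %{REQUEST_URI} !/+templates
    # Requests ending in .txt (e.g. /robots.txt) now skip this rule
    # and are served as static files instead of being sent to index.php.
    RewriteRule . index.php [L]
    ```

    If only robots.txt should bypass index.php, a narrower condition such as "RewriteCond %{REQUEST_URI} !/+robots\.txt$" could be used in place of the first line; that variant is likewise an untested suggestion.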
    Post edited by Andrey Podshivalov at 2011-11-17 04:04:08