[campsite-dev] Fwd: content management systems and google wisdoms.
  • >
    >Subject: The Spider of Doom (Alex Papadimoulis)
    >
    >The Daily WTF: Curious Perversions in Information Technology,
    >Alex Papadimoulis, 28 Mar 2006
    >http://www.thedailywtf.com/
    >
    >Josh Breckman worked for a company that landed a contract to develop a
    >content management system for a fairly large government website. Much of the
    >project involved developing a content management system so that employees
    >would be able to build and maintain the ever-changing content for their
    >site.
    >
    >Because they already had an existing website with a lot of content, the
    >customer wanted to take the opportunity to reorganize and upload all the
    >content into the new site before it went live. As you might imagine, this
    >was a fairly time-consuming process. But after a few months, they had
    >finally put all the content into the system and opened it up to the
    >Internet.
    >
    >Things went pretty well for a few days after going live. But, on day six,
    >things went not-so-well: all of the content on the website had completely
    >vanished and all pages led to the default "please enter content" page.
    >Whoops.
    >
    >Josh was called in to investigate and noticed that one particularly
    >troublesome external IP had gone in and deleted *all* of the content on the
    >system. The IP didn't belong to some overseas hacker bent on destroying
    >helpful government information. It resolved to googlebot.com, Google's very
    >own web crawling spider. Whoops.
    >
    >After quite a bit of research (and scrambling around to find a non-corrupt
    >backup), Josh found the problem. A user copied and pasted some content from
    >one page to another, including an "edit" hyperlink to edit the content on
    >the page. Normally, this wouldn't be an issue, since an outside user would
    >need to enter a name and password. But, the CMS authentication subsystem
    >didn't take into account the sophisticated hacking techniques of Google's
    >spider. Whoops.
    >
    >As it turns out, Google's spider doesn't use cookies, which means that it
    >easily passes a check that only rejects an "isLoggedOn" cookie set to
    >"false". It also doesn't run JavaScript, which would normally prompt and
    >redirect users who are not logged on. It does, however, follow every
    >hyperlink on every page it finds, including those with "Delete Page" in the
    >title. Whoops.
    >
    >After all was said and done, Josh was able to restore a fairly old version
    >of the site from backups. He brought up the root cause -- that security
    >could be beaten by disabling cookies and JavaScript -- but management didn't
    >quite see what was wrong with that. Instead, they told the client to NEVER
    >copy and paste content from other pages.
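
The failure described above comes down to two things: the login check lived on the client (a cookie value and a JavaScript redirect), and a plain hyperlink, fetched with GET, could delete a page. What follows is a minimal sketch of the usual countermeasures, not the code of the CMS in the story. The names Request, Cms, handle, the "/delete/" path and the session store are all hypothetical, invented for illustration. It assumes sessions are tracked on the server, so a request without a known session ID is refused, and that destructive actions are only accepted through a non-safe method such as POST.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Safe methods must never change server state; crawlers only follow these.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


@dataclass
class Request:
    """Hypothetical, framework-free stand-in for an HTTP request."""
    method: str
    path: str
    session_id: Optional[str] = None  # value of the session cookie, if any


@dataclass
class Cms:
    """Hypothetical CMS core: page store plus a server-side session store."""
    pages: Dict[str, str]
    sessions: Set[str] = field(default_factory=set)

    def handle(self, req: Request) -> int:
        """Return an HTTP status code for the request."""
        if req.path.startswith("/delete/"):
            # 1. A followed link (GET) can never delete anything.
            if req.method in SAFE_METHODS:
                return 405  # Method Not Allowed
            # 2. Fail closed: a missing or unknown session is rejected.
            #    The absence of a cookie proves nothing about the caller.
            if req.session_id not in self.sessions:
                return 401  # Unauthorized
            name = req.path[len("/delete/"):]
            return 200 if self.pages.pop(name, None) is not None else 404
        # Everything else in this sketch is a read-only page view.
        if req.method not in SAFE_METHODS:
            return 405
        return 200 if req.path.lstrip("/") in self.pages else 404


cms = Cms(pages={"home": "Welcome"}, sessions={"editor-session"})
crawler_status = cms.handle(Request("GET", "/delete/home"))
anonymous_status = cms.handle(Request("POST", "/delete/home"))
editor_status = cms.handle(
    Request("POST", "/delete/home", session_id="editor-session")
)
```

In this sketch the crawler's GET is refused with 405 and the cookieless POST with 401; only the request carrying a session ID known to the server removes the page. A real deployment would also want CSRF protection and unguessable session IDs, which are left out here.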


    Micz Flor - micz@mi.cz

    content and media development http://mi.cz
    --------------------------------------------------------
    http://www.campware.org -- http://www.suemi.de
    http://www.redaktionundalltag.de
    --------------------------------------------------------
  • That's a good one

    We had our share of security bugs, but nothing like this. What they did is really stupid, and they call themselves a software development company!

    "isLoggedOn" cookie: we should use cookies like this more often

    Mugur



