[campsite-dev] Re: [campsite-core] the tracker and the search engine
  • Hi,

    I took a look at the technical spec and at phpOpenTracker's (which I
    confess I've never used) website. There are a couple of points for
    discussion on implementing Tracker that I'd like to run by you guys.

    (1) phpOpenTracker v. awStats. I've used awStats in the past, and
    judging by the spec, it looks like this might've been the better way
    to go.

    Problems with phpOpenTracker:
    (a) Stores stuff in the database, which, when dealing with millions of
    hits, is just rather silly.
    (b) Doesn't seem to have a caching mechanism like awStats.
    (c) No log rotation.

    Awstats addresses these three problems, which is why I think going
    with awstats is a good idea. However, I haven't been "keeping up" on
    awstats progress in the last, say, 3 years. Any comments?

    Problems with awStats:
    According to John's email, it looks like most of the tracker is
    already implemented on top of phpOpenTracker. (I haven't yet gone
    through his code, but a quick glance at CVS shows that a lot is
    already there.) So doing it over again with awstats would be a waste
    of labor, especially since we could go with IMPLEMENTATION OPTION #2,
    see below.

    IMPLEMENTATION OPTION #1: AWSTATS
    Implementing with awstats, rough draft.

    (a) Awstats only has available what Apache gives it, and this usually
    is limited to the URL. However, IIRC, we can modify Apache to store
    other useful information, and if Apache can't get at some information,
    we write up a modified Log() function, or something along these lines,
    to append to the Apache log file the pertinent information, e.g., User
    who is logged in, hidden variables that aren't in the URL, etc.

    (b) We then utilize awstats's templating tools to render the output
    the way we want it etc.
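
    A minimal sketch of what reading such an extended log could look like,
    written in Python purely for illustration. The log layout (common log
    format plus one extra quoted field holding the logged-in user) and the
    names LOG_RE and parse_hit are assumptions, not anything awstats or
    Campsite defines.

```python
import re

# Hypothetical layout: common log format followed by one extra quoted field
# carrying the logged-in user. The real format would depend on the Apache setup.
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<user>[^"]*)"'
)


def parse_hit(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


sample = ('127.0.0.1 - - [07/Jun/2006:10:15:32 +0000] '
          '"GET /article?id=5 HTTP/1.1" 200 5120 "alice"')
hit = parse_hit(sample)
print(hit["user"], hit["url"])
```

    Anything that cannot be squeezed into the log line this way would still
    need the custom Log() function mentioned in (a).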

    Note Benes:
    (a) awStats is in perl. That's another language to be thrown in, but,
    according to their website, awStats doesn't require any "strange" perl
    libraries, just standard ones.
    (b) We'll also need to modify Apache's logfile rules, which throws in
    another pseudo language as well, and will make installation that much
    more complicated.


    IMPLEMENTATION OPTION #2
    We use the current phpOpenTracker code, but write a bunch of scripts
    to split up the database by day, by month, by year, etc. like a
    logrotate. An RDBMS indexed properly can take a lot of data, and
    should actually be better than text files (which is what apache log
    would be), but even it can get overloaded with too much data.
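
    A rough sketch of the "logrotate for the database" idea, assuming a
    single table named hits with url and hit_at columns. That schema, the
    archive table naming and the function rotate_month are made up for
    illustration; phpOpenTracker's real tables will differ, and SQLite
    stands in for MySQL only so the example runs on its own.

```python
import re
import sqlite3


def rotate_month(conn, month):
    """Move all rows of one month ('YYYY-MM') from `hits` into an archive table.

    Returns the number of rows moved out of `hits`.
    """
    if not re.fullmatch(r"\d{4}-\d{2}", month):
        raise ValueError("month must look like 'YYYY-MM'")
    archive = "hits_" + month.replace("-", "_")  # e.g. hits_2006_05
    with conn:  # the copy and the delete are committed together
        conn.execute(
            "CREATE TABLE IF NOT EXISTS %s (url TEXT, hit_at TEXT)" % archive
        )
        conn.execute(
            "INSERT INTO %s SELECT url, hit_at FROM hits "
            "WHERE substr(hit_at, 1, 7) = ?" % archive,
            (month,),
        )
        deleted = conn.execute(
            "DELETE FROM hits WHERE substr(hit_at, 1, 7) = ?", (month,)
        )
    return deleted.rowcount
```

    Run from cron once a month, something like this would keep the live
    table small while older hits stay queryable in their own tables.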


    Anyway, these are preliminary comments before I dig even deeper. What
    are some of your thoughts?

    Peter

    On 6/7/06, Aleksandar Brajanoski wrote:
    > here is the last written document on the tracker specs. it is obvious
    > that some parts relate to the long url 2.1.x CS version. so don't mind
    > that.
    >
    >


    --
    Hartman's Brain Consulting | Openflows Networks Ltd. | Campware.org

    gpg 1024D/ED6EF59B (7D1A 522F D08E 30F6 FA42 B269 B860 352B ED6E F59B)
    gpg --keyserver pgp.mit.edu --recv-keys ED6EF59B
  • IMHO, adding another language (Perl) to the list of Campware languages would make things more complex and, frankly, we don't have a Perl developer. It would also be harder to integrate with Campsite, since we couldn't reuse Campsite code.

    Storing the log in the database may not be such a stupid idea; after all, MySQL was built for this kind of application in the first place. The only issue is designing the database structure correctly. Also, storing the log in a text file makes processing more complicated: reports would take longer and use more system resources.

    "Awstats only has available what Apache gives it, and this usually is limited to the URL. However, IIRC, we can modify Apache to store other useful information, and if Apache can't get at some information, we write up a modified Log() function, ... to append to the Apache log file the pertinent information, e.g., User ..."

    Come on, we're trying to get rid of C++ code, should we start programming in C now? Not to mention the complexity it adds to the installation. With so many configuration issues dependent on Apache, our clients would surely run into trouble installing such an application. And we would have to spend countless hours configuring and debugging our clients' systems instead of implementing new features in Campsite. Yes, we have a complicated install system now, but we're trying to get rid of it in 3.0; it makes no sense to add another one.

    These are my humble opinions :)

    Mugur

  • Hi,

    to show/test the poll module, I replaced the campsite-dev installation on
    edge-server. If anybody misses it, let me know; there are backups, of course.

    Sebastian
  • I need to report that our product Docmint has a security hole, which was
    used to break into two of our servers.

    The vulnerability is located in the engine/require.php script and is quite
    easy to exploit.
    The attackers specifically targeted a vulnerability in this script; it was
    not a generic attack! So any Docmint installation is in danger.

    A detailed report will follow.


    Emergency plan:

    - Turn register_globals off in php.ini.
    - Again, make sure register_globals is off! Take care: even if it is
    disabled in php.ini, it could be turned on in an .htaccess file.
    - Don't forget to restart Apache. Use phpinfo() to make sure
    register_globals is really turned off.
    - If you cannot switch it off globally, put a .htaccess file containing
    "php_flag register_globals off" into your Docmint base folder. Or move the
    whole Docmint installation out of Apache's reach! Please wait until we
    can offer a solution that is safe with register_globals turned on.
    Be aware: register_globals = on is a common risk.
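
    For anyone who wants to scan several installations quickly, here is a
    minimal sketch of a static check over the text of a php.ini or
    .htaccess file. The function name and the set of values treated as
    "on" are assumptions for illustration; it does not replace the
    phpinfo() check above, which shows what PHP actually uses at runtime.

```python
import re

# Values assumed to mean "enabled" for a boolean PHP setting.
_TRUTHY = {"1", "on", "yes", "true"}

# php.ini style:   register_globals = Off
_INI_RE = re.compile(r'^\s*register_globals\s*=\s*"?(\w+)"?', re.IGNORECASE)
# .htaccess style: php_flag register_globals off
_HTACCESS_RE = re.compile(
    r'^\s*php_(?:flag|value)\s+register_globals\s+"?(\w+)"?', re.IGNORECASE
)


def register_globals_enabled(text, kind):
    """Return True/False for the last setting found in `text`, or None if absent.

    kind is "ini" for php.ini syntax, anything else for .htaccess syntax.
    """
    pattern = _INI_RE if kind == "ini" else _HTACCESS_RE
    result = None
    for line in text.splitlines():
        if line.strip().startswith((";", "#")):  # skip commented-out settings
            continue
        match = pattern.match(line)
        if match:
            # A later line overrides an earlier one in the same file.
            result = match.group(1).lower() in _TRUTHY
    return result
```

    A result of None only means the file does not mention the setting, so
    the value then comes from elsewhere (another config file or the PHP
    default).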

    Best,
    Sebastian
  • Just so everyone knows, we were not vulnerable to this since we don't
    have register_globals on in our php.ini files.


  • i wanted to add that docmint does not require register_globals to be
    on :) sorry.

    At 22:54 10.10.2006, you wrote:
    >Just so everyone knows, we were not vulnerable to this since we dont
    >have register_globals on in our php.ini files.

    Micz Flor - micz@mi.cz

    content and media development http://mi.cz
    ------------------------------------------------------------------
    http://www.campware.org -- http://www.redall.de -- http://suemi.de
    ------------------------------------------------------------------
  • This report is much better worded than mine, so read here:
    http://advisories.echo.or.id/adv/adv51-K-159-2006.txt


    Sava, don't you have some strong Yugoslavian friends in Sydney?
  • Serbs (and other ex-Yugoslavs) are a bunch of tree-hugging woosies. We
    could send them one of their homegrown ninjas, John Pye, to sort them out. :)



    "Sebastian Goebel" wrote on 11/10/2006 05:47 PM
    (Subject: RE: [campsite-dev] !! Docmint vulnerability report !!):
    > This report is much better worded than mine, so read here:
    > http://advisories.echo.or.id/adv/adv51-K-159-2006.txt
    >
    > Sava, don't you have some strong Yugoslavian friends in Sydney?
  • hi developers,

    i am trying to update docmint to
    - Apache 2.0.58
    - PHP 5.1.6
    - MySQL 4.1.15

    and one of the MySQL issues is: what *type* of UTF8 do i need to
    use... the current docmint install comes up with latin1_swedish_ci when i
    build the database... that does not sound too good.

    which one do i choose from all of these? check the mysql site:
    http://dev.mysql.com/doc/refman/4.1/en/charset-unicode-sets.html

    Micz Flor - micz@mi.cz

  • From what I read on that page it seems to me that you should use utf8_unicode_ci.

    Mugur

  • At 18:18 16.10.2006, you wrote:
    > From what I read on that page it seems to me that you should use
    > utf8_unicode_ci.

    what is campsite using?


    Micz Flor - micz@mi.cz

  • hi Micz !

    Campsite uses the MySQL default charset, that is latin1, and the
    corresponding default collation for it, latin1_swedish_ci.

    which charset to use depends on what your needs are. in my case i use
    the default latin1 charset because it suits English well.

    the default collation for the utf8 charset is utf8_general_ci.
    reading the link you provided, there is no big difference between the
    utf8_general_ci and utf8_unicode_ci collations, and the first one is
    faster.

    i have used utf8 with the utf8_general_ci collation to support, for
    example, only German characters, and it works quite well.

    whatever you choose (utf8_general_ci or utf8_unicode_ci), you have to
    pay attention not only to your server, but also to the clients
    connecting to it. usually people set up the server correctly, but
    charset and collation affect both storage and communication. so the
    best you can do is make sure your client supports the charset set up
    on the server, and tell your server to communicate using the same
    charset it uses for storage (because the default one here is latin1
    too). you can do this either with SET NAMES 'utf8' or by setting the
    'character_set_client', 'character_set_results' and
    'character_set_connection' system variables directly (plus
    'collation_connection' if you want a non-default collation).
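
    To illustrate why the connection charset matters, here is a small
    sketch that needs no database at all. It only simulates the mismatch:
    bytes stored as utf8 are decoded the way a latin1 connection would read
    them. The function name simulate_round_trip is made up for this example.

```python
def simulate_round_trip(text, storage_charset="utf-8", connection_charset="latin-1"):
    """Encode `text` as it would be stored, then decode the same bytes
    the way a client with a different connection charset would read them."""
    raw = text.encode(storage_charset)
    return raw.decode(connection_charset)


original = "Müller"
garbled = simulate_round_trip(original)  # connection still talks latin1
matched = simulate_round_trip(original, connection_charset="utf-8")

print(garbled)  # the two utf8 bytes of "ü" show up as two latin1 characters
print(matched)  # identical to the original
```

    When storage and connection agree, the text survives unchanged, which
    is what SET NAMES 'utf8' is meant to guarantee on the MySQL side.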

    and one last thing (very important too): if you are going to show
    your stored data on the web, label your pages explicitly:

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

    if you do not do this, your pages may still show bad characters,
    because the default HTTP charset is ISO-8859-1 (a.k.a. latin1).

    btw, a meta tag is not the only way to set this up, take a look at:
    http://www.w3.org/International/O-HTTP-charset
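
    As a sketch of the HTTP header route, here is one way to keep the
    declared charset and the actual bytes in sync. The helper
    build_response is hypothetical and not tied to any framework; it just
    returns header pairs and the encoded body.

```python
def build_response(html_text, charset="utf-8"):
    """Encode the page and declare the same charset in the Content-Type header."""
    body = html_text.encode(charset)
    headers = [
        ("Content-Type", "text/html; charset=%s" % charset),
        ("Content-Length", str(len(body))),  # length in bytes, not characters
    ]
    return headers, body


headers, body = build_response("<p>Grüße</p>")
for name, value in headers:
    print("%s: %s" % (name, value))
```

    The point of doing both in one place is that the header can never
    claim one charset while the body is encoded in another.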



    On 10/16/06, Micz Flor wrote:
    > At 18:18 16.10.2006, you wrote:
    > > From what I read on that page it seems to me that you should use
    > > utf8_unicode_ci.
    >
    > what is campsite using?

    --
    /holman
  • hi holman,

    thanks for the detailed reply. if campsite happily uses
    latin1_swedish_ci, docmint should probably do the same, since it might
    be a campware product at some point. that might also imply an API for
    other campware products (e.g. floating DIVs with docmint help in an
    admin interface or something like that).

    which then spills over to the general mailing list: what charset are
    the campware products using?

    At 02:19 17.10.2006, you wrote:
    >hi Micz !
    >
    >Campsite uses the MySQL default charset, that is latin1, and the
    >corresponding default collation for it, latin1_swedish_ci.

    Micz Flor - micz@mi.cz

  • hi folks,

    i need to write a date in the RFC 822 format so that it validates in
    an RSS feed (link see below), but i cannot figure out from the manual
    how to assemble the matching format string. any ideas?

    http://feedvalidator.org/docs/error/InvalidRFC2822Date.html
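
    For reference, a minimal sketch of what the target string looks like,
    using Python's standard library only as an illustration (the wrapper
    name rfc822_date is made up). It shows the expected shape: day name,
    day, month name, four-digit year, time, numeric timezone offset.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def rfc822_date(dt):
    """Format a timezone-aware datetime the way RSS pubDate expects it."""
    return format_datetime(dt)


stamp = datetime(2006, 10, 17, 2, 19, 0, tzinfo=timezone.utc)
print(rfc822_date(stamp))  # Tue, 17 Oct 2006 02:19:00 +0000
```

    Whatever templating or date function is used in the end, the output
    has to match that pattern for the feed validator to accept it.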

    Micz Flor - micz@mi.cz
