[campsite-dev] Campsite Volunteer: Article Ranking (most read) ; TAGS View, e.g., http://www.flickr.
  • Hi all,

    This just came in, so I'm forwarding it to the list. -- doug


    ----- Forwarded by Douglas Arellanes/Mdlf on 07/19/2007 11:44 AM -----


    "Vladimir Ernesto Diaz Aviles"
    07/19/2007 07:02 AM


    To: contact@campware.org
    cc:
    Subject: Campsite Volunteer: Article Ranking (most read) ; TAGS View, e.g.,
    http://www.flickr.com/photos/tags/


    Dear Campware Team,

    Your products are great, and I would like to contribute by developing
    the functionality Campsite (CS) needs to keep statistics that help
    determine the most-read articles and topics. That would make it
    possible to get lists ordered by this criterion and to build a view
    like the one at http://www.flickr.com/photos/tags/, where the topics
    associated with the articles are displayed at a size proportional to
    their frequency.

    Where should I start? Perhaps by modifying the Article components to
    increment a counter every time an article is read and persist it in
    a database table that takes care of the statistics. From there it
    would be possible to create or modify a CS special tag to get an
    ordered list based on a ranking of the most-read articles.

    Any suggestions on how to start, and any recommendations, are welcome.

    Regards,

    Ernesto
  • Dear Vladimir,
    getting proper statistics is a problem that needs to be divided into
    several parts:
    1) defining the stats you would like to have
    2) collecting the data you can collect
    3) building an admin interface to set up reports and so on

    We tried something in this area in TOL. The result was that if you
    store all the information in the database and then try to generate
    reports in real time, most of them become too slow (20 minutes) after
    a while, because the amount of data is huge. My suggestion was
    therefore to design standard static stats that are generated every
    day on an incremental basis.
    For other stats, you can keep the data in the database for n days,
    where n is a number that anyone can change in the admin interface
    preferences. Data collected earlier is dropped from the database (so
    that it does not grow huge), and all you have left for that period
    are the static reports.
    For the latest n days you can use a report designer to create any
    report you wish.
    And of course, if the process that generates the daily static reports
    is the same as the one for dynamic on-demand reports, it should be
    possible to let the user say in the admin interface that a given
    "on demand" report should be added to the static daily set.

    I think the collecting part is half done, and it is easy to add any
    remaining things you might want to collect.
    The rest (the administration part) is not complete, and the main
    developer who started it is no longer working on it. It might be
    interesting for you to have a look at it, but starting from scratch
    may be a better idea. The reason is that it was written as a
    completely independent application, and I believe it is better for
    marketing reasons to have this included in the Campsite admin
    interface.

    Ondra




  • Holman, you were mentioning some stats being built into 3.0? Can you say
    more?

    Sava



  • Dear Ondra,

    Right now I am concerned with providing the following basic functionality:

    (1) A top-n list of the most-read articles. That would imply extending the
    list tag with something like:



    where freq is the number of times an article has been read.

    (2) The same as (1), but for topics.

    (3) A way to obtain how many times an article has been read, with something
    like:


    (4) The same as (3), but for topics.

    These four points require some basic statistics to be collected: we have to
    keep counters for articles and their topics.

    Every time an article is read, we increment a counter in a database table
    that holds the article ID as key (possibly with the publication ID, etc.;
    whatever is necessary to identify the article uniquely), the counter itself,
    and the date of the last read.

    Once the article's counter has been incremented, we get the topics
    associated with the article and increment their individual counters. These
    are stored in another database table that holds a unique identifier for the
    topic (possibly a composite primary key), the topic's counter, and the date
    of the last update.
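
    As a rough illustration of the two counter tables described above,
    here is a minimal sketch in Python with SQLite. It is not Campsite
    code: the table names article_reads and topic_reads, the column names,
    and the use of a single integer key (instead of a composite key with
    publication ID, etc.) are all assumptions made to keep the example
    short.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one counter row per article and per topic.
SCHEMA = """
CREATE TABLE IF NOT EXISTS article_reads (
    article_id INTEGER PRIMARY KEY,
    reads      INTEGER NOT NULL DEFAULT 0,
    last_read  TEXT
);
CREATE TABLE IF NOT EXISTS topic_reads (
    topic_id   INTEGER PRIMARY KEY,
    reads      INTEGER NOT NULL DEFAULT 0,
    last_read  TEXT
);
"""


def _bump(conn, table, key_column, key, now):
    """Increment one counter row, creating it on the first read."""
    # table and key_column are fixed internal names, never user input.
    updated = conn.execute(
        f"UPDATE {table} SET reads = reads + 1, last_read = ? "
        f"WHERE {key_column} = ?",
        (now, key),
    ).rowcount
    if updated == 0:
        conn.execute(
            f"INSERT INTO {table} ({key_column}, reads, last_read) "
            f"VALUES (?, 1, ?)",
            (key, now),
        )


def record_read(conn, article_id, topic_ids):
    """Count one read of an article and of each topic attached to it."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # commit everything together, or roll back on error
        _bump(conn, "article_reads", "article_id", article_id, now)
        for topic_id in topic_ids:
            _bump(conn, "topic_reads", "topic_id", topic_id, now)


def top_articles(conn, n):
    """Return the n most-read articles as (article_id, reads) pairs."""
    return conn.execute(
        "SELECT article_id, reads FROM article_reads "
        "ORDER BY reads DESC, article_id LIMIT ?",
        (n,),
    ).fetchall()
```

    After conn.executescript(SCHEMA), calling record_read(conn, 10, [1, 2])
    on every article view keeps both tables current, and
    top_articles(conn, 5) gives the data for a top-5 list.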

    At the beginning there would be no reports and no modifications to the admin
    interface. I just want to provide these four points, which would help to
    create top-n lists of articles, like the top five most popular articles (of
    the issue, or of all time), and something like a tags view, e.g.,
    http://www.flickr.com/photos/tags/, based on the frequency of topics.
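
    For the tags view, the step from topic counters to display sizes might
    look like the sketch below. It uses simple linear scaling between a
    minimum and a maximum font size, which is one common choice and not
    necessarily what Flickr does; the function name tag_sizes and the
    pixel defaults are assumptions made for the example.

```python
def tag_sizes(topic_counts, min_px=10, max_px=32):
    """Map each topic's read count to a font size in pixels.

    The least-read topic gets min_px, the most-read gets max_px, and
    everything in between is scaled linearly.
    """
    if not topic_counts:
        return {}
    lo = min(topic_counts.values())
    hi = max(topic_counts.values())
    if hi == lo:
        # All topics are equally popular: use one middle size for all.
        mid = (min_px + max_px) // 2
        return {topic: mid for topic in topic_counts}
    span = hi - lo
    return {
        topic: round(min_px + (count - lo) * (max_px - min_px) / span)
        for topic, count in topic_counts.items()
    }
```

    A template would then print each topic name with the returned size as
    its font size.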

    In a second phase it would be great to:

    (a) Keep track of article ratings provided by the readers, something like
    the ones on Amazon.com, and then obtain the top-n best articles according
    to the rating.

    (b) Count how many comments an article has received. This might be easy to
    implement with the database as it is, but is perhaps not as interesting as
    the points mentioned above.

    (c) Include tools in the admin interface for reporting, etc.

    For all of the above, could you please suggest:

    (1) Which files should I start studying in order to add the insert/update
    of the tables in the DB? I would like to keep track of these statistics
    behind the scenes and not in the article's template. The same goes for
    topics.

    (2) Which components should I study in order to modify or extend the
    functionality of CS tags?

    (3) Are there any considerations I have to follow when adding the
    aforementioned tables to the database?

    Regards,

    VEDAX




    --
    Ing. Vladimir Ernesto Diaz-Aviles, M.Sc.
    Servicios de Consultoría / Consulting Services
    bluecacao.com
    Tel. (503) 794-03281

  • Hi Ernesto!

    I am very glad to hear from you and about your interest in contributing
    to Campsite development. Statistics is a very important area which we
    definitely need to work on. There is a specification draft on this
    topic (I think it is the same one Douglas mentioned) at
    http://code.campware.org/projects/campsite/wiki/CampsiteStatsDesign

    First of all, I hope you have not started any coding work on this,
    have you? Please say no =)
    We are currently developing Campsite 3.0 (a major version), and the
    biggest change will be the replacement of the entire template
    engine. We are implementing a full PHP template engine in place of the
    C++ one you can find in the current release.

    We are doing this for multiple reasons. One of the most important is
    that more developers (like you) will be interested in contributing
    and helping us to improve Campsite.

    So, since we decided to use the Smarty template system and extend it
    for our specific needs, you could start talking about something like
    this:

    {{ list_article name="mytoparticles" length="5" order="byfreq" }}

    instead of:



    as you just mentioned.

    Well, I have not asked you something important: are you familiar with
    PHP? I hope you are =)

    Statistics is a point we have drafted for Campsite 3.0, but we have
    not had the time to even think about it. Right now our one and only
    priority is the template engine and everything it involves.

    Having said that, I think this is definitely a very good opportunity
    to get you involved in the project by developing for 3.x.



  • Dear Holman,

    > There is a specification draft about this
    > topic (i think it is the same Douglas mentioned) at
    > http://code.campware.org/projects/campsite/wiki/CampsiteStatsDesign
    >

    I read the spec and it's very interesting, but to get there I would
    suggest starting with a narrow scope. If we could start with the
    issues I proposed, it would let us incorporate useful statistics more
    easily and cope with performance issues gradually.

    > First of all, i hope you have not started any coding work on this,
    > have you?, please say no =)

    More or less, but more less than more... I have added an
    independent (very rudimentary) counter to my article template that
    keeps track of the number of times an article has been read. This was
    done in PHP, without touching CS code.

    But I would like to incorporate the functionality directly into CS.

    > well, i have not asked you something important, are you familiar with
    > PHP? i hope you are =)
    >

    I am familiar with it, but not like you guys, for sure... I have more
    experience developing middleware components with Java EE and web
    components such as portlets, servlets and JSPs, but I'm comfortable
    following PHP code, making modifications, and extending it...

    > Statistics is a point we have got drafted for Campsite 3.0 but we have
    > not had the time to even think about it, right now our major and
    > unique priority is the template engine and all what it means.
    >
    > Having said that, i think this is definitely a very good opportunity
    > to get you involved in the project in terms of developing for 3.x
    >

    I understand the priority, and as I said before, I would like to start
    with some basic statistics and evolve from there. Could you please
    suggest where I should start, or should I wait for the template engine
    to be finished? Could we work in parallel? How do things work around
    here: should I start planning and coding, or is a task opened and
    assigned to me?

    Cheers,

    VEDAX

  • The major problem with every "start simple first" approach is that
    within a short time the development gets out of hand and you need to
    modify a huge amount of code. You will also need to modify the
    database structure to get more stats. In general, if you start with
    something simple, you will do more work than if you think about the
    whole problem up front.

    Therefore I suggested starting with the definition of "collecting
    data", where you put all the data you can collect and might
    potentially want to measure into some simple database structure.
    The data collection will certainly require putting extra code into
    the templates, so you will create a template that is included in the
    proper places.

    This was the simple part.

    However, from real production experience, the collected data in the
    database will grow very quickly and the database will then be huge,
    so it is vital to find a proper method for reducing its size.

    This was the second part: simple, but more complicated.

    Then you can start playing with the data. To begin with, that means
    creating some simple way to define various reports. You need to
    distinguish between one-time reports and continual reports that are
    generated incrementally.

    This was the most difficult part, from my point of view.

    And after all this is done, you need to create the admin interface,
    which should be a Campsite module.

    The first three parts are totally independent of the Campsite
    version, so you can start now.
    The last one is tied to the Campsite admin interface, so my advice
    is to wait for Campsite 3.0, which I believe will be released before
    you finish the first three parts.

    Ondra


    On Jul 20, 2007, at 7:51 AM, Vladimir Ernesto Diaz Aviles wrote:

    > Dear Holman,
    >
    >> There is a specification draft about this
    >> topic (i think it is the same Douglas mentioned) at
    >> http://code.campware.org/projects/campsite/wiki/CampsiteStatsDesign
    >>
    >
    > I read the spec and it's very interesting, but to get there I would
    > suggest to start with a narrow scope. If we could start with the
    > issues I proposed, it would let us incorporate useful statistics more
    > easily, coping with performance issues gradually.
    >
    >> First of all, i hope you have not started any coding work on this,
    >> have you?, please say no =)
    >
    > More or less, but more less than more...I have incorporated an
    > independent counter into my article template (very rudimentary), that
    > keeps track of the number of times an article has been read...this has
    > been done with PHP, and without touching CS code.
    >
    > But I would like to incorporate the functionality directly to CS.
    >
    >> well, i have not asked you something important, are you familiar with
    >> PHP? i hope you are =)
    >>
    >
    > I am familiar, but not like you guys, for sure...I have more
    > experience developing middleware components with Java JEE, and Web
    > components such as Portlets, Servlets and JSPs, but I'm comfortably
    > following PHP code, making modifications, and extending it...
    >
    >> Statistics is a point we have got drafted for Campsite 3.0 but we
    >> have not had the time to even think about it, right now our major
    >> and unique priority is the template engine and all what it means.
    >>
    >> Having said that, i think this is definitely a very good opportunity
    >> to get you involved in the project in terms of developing for 3.x
    >>
    >
    > I understand the priority, and as I said before, I would like to start
    > with some basic statistics and evolve from there...could you please
    > suggest where I should start, or shall I wait for the template engine
    > to be finished? Could we work in parallel? How do things work around
    > here: shall I start planning or coding, or is a task opened and
    > assigned to me?
    >
    > Cheers,
    >
    > VEDAX
    >
    >>
    >>
    >> On 7/19/07, Vladimir Ernesto Diaz Aviles wrote:
    >> > Dear Ondra,
    >> >
    >> > Right now I am concerned with providing the following basic
    >> > functionality:
    >> >
    >> > (1) A top-n list of most read articles, that would imply to
    >> > extend the tag, having something like:
    >> >
    >> >
    >> >
    >> > Where freq is the times or frequency that an article has been read.
    >> >
    >> > (2) Similar to (1) but for Topics
    >> >
    >> > (3) Obtain how many times an article has been read, with
    >> > something like:
    >> >
    >> > (4) Similar to (3) but for Topics
    >> >
    >> > These 4 points require basic statistics to be collected:
    >> >
    >> > We have to keep track of counters for articles and their topics.
    >> >
    >> > Every time an article is read, we have to increase by one a
    >> > counter in a database table that includes the article Id as key
    >> > (possibly publication id, etc., anything necessary to identify
    >> > the article uniquely), the counter itself, and a date of last read.
    >> >
    >> > Once the article's counter is increased, we get the Topics
    >> > associated to the article and increase their individual counters,
    >> > which are stored in another database table that includes a unique
    >> > identifier for the Topic (possibly a composite primary key), the
    >> > counter of the topic, and a date of last update.
    >> >
    >> > At the beginning there would be no reports and no modifications to
    >> > the Admin interface. I just want to provide these points, which
    >> > would help to create top-n lists of articles, like the top-5 most
    >> > popular articles (of the issue, or of all time), and something
    >> > like Tags View, e.g., http://www.flickr.com/photos/tags/ , based
    >> > on information on the frequency of topics.
    >> >
    >> > In a second phase it would be great to:
    >> >
    >> > (a) Keep track of an article rating, which would be provided by
    >> > the readers, something like the ones in Amazon.com, and then
    >> > obtain the top-n best articles according to the rating.
    >> >
    >> > (b) One could also be interested in how many comments an article
    >> > has gotten. This might be easy to implement with the database as
    >> > it is, but perhaps is not as interesting as the points mentioned
    >> > above.
    >> >
    >> > (c) Include tools in the Admin interface for reporting, etc.
    >> >
    >> > For all above, could you please suggest:
    >> >
    >> > (1) What files should I start studying to include the insert/
    >> > update of the tables in the DB? I would like to keep track of these
    >> > statistics behind the scenes and not in the Article's template.
    >> > Same for topics.
    >> >
    >> > (2) What components should I study in order to modify or extend
    >> > the functionality of CS tags and
    >> >
    >> > (3) Is there any consideration that I have to follow to include
    >> > the aforementioned tables in the database?
    >> >
    >> > Regards,
    >> >
    >> > VEDAX
    >> >
    >>
    >> --
    >> /holman
    >>
  • Dear Vladimir,

    We are actively working on the PHP template engine so it's not something
    stable you can use to implement the article ranking. In order to make things
    easier you should implement this feature completely separately from us but
    in a way that will allow us to integrate it in Campsite once we're done with
    the PHP template engine.

    First, I strongly suggest reading the following documentation:

    http://code.campware.org/projects/campsite/wiki/CampsiteNewbieGuide

    especially coding standards and database naming conventions. Be aware that
    we didn't convert the old code to the new coding standards but from now on
    all the new code should respect these rules.

    You should create separate tables that hold data about article rating and
    create an API that will allow data handling for article rating. If you write
    this feature in this way then we can even integrate it in older Campsite
    versions. Please let me know if you need more help.
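
    As a rough idea of what "separate tables plus an API" could look like,
    here is a minimal sketch of the read counters proposed earlier in the
    thread. It is written in Python with SQLite only to keep the example
    short and runnable; a real implementation would be PHP against the
    Campsite database. The class name ArticleStats and the table names
    article_reads and topic_reads are assumptions made for this example and
    do not follow the Campsite naming conventions.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Iterable, List, Optional, Tuple


class ArticleStats:
    """Small API over two stand-alone counter tables (hypothetical schema)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.executescript(
            """
            CREATE TABLE IF NOT EXISTS article_reads (
                article_id INTEGER PRIMARY KEY,
                reads      INTEGER NOT NULL,
                last_read  TEXT    NOT NULL
            );
            CREATE TABLE IF NOT EXISTS topic_reads (
                topic_id   INTEGER PRIMARY KEY,
                reads      INTEGER NOT NULL,
                last_read  TEXT    NOT NULL
            );
            """
        )

    def record_read(self, article_id: int, topic_ids: Iterable[int],
                    when: Optional[datetime] = None) -> None:
        """Count one read for the article and for each of its topics."""
        stamp = (when or datetime.now(timezone.utc)).isoformat()
        with self.conn:  # single transaction for all counters
            self._bump("article_reads", "article_id", article_id, stamp)
            for topic_id in topic_ids:
                self._bump("topic_reads", "topic_id", topic_id, stamp)

    def _bump(self, table: str, key: str, value: int, stamp: str) -> None:
        # table and key are only ever the fixed names passed in above.
        cur = self.conn.execute(
            f"UPDATE {table} SET reads = reads + 1, last_read = ? "
            f"WHERE {key} = ?",
            (stamp, value),
        )
        if cur.rowcount == 0:
            self.conn.execute(
                f"INSERT INTO {table} ({key}, reads, last_read) "
                f"VALUES (?, 1, ?)",
                (value, stamp),
            )

    def article_reads(self, article_id: int) -> int:
        """Return how many times one article has been read."""
        row = self.conn.execute(
            "SELECT reads FROM article_reads WHERE article_id = ?",
            (article_id,),
        ).fetchone()
        return row[0] if row else 0

    def top_articles(self, n: int) -> List[Tuple[int, int]]:
        """Return the n most read articles as (article_id, reads) pairs."""
        return self.conn.execute(
            "SELECT article_id, reads FROM article_reads "
            "ORDER BY reads DESC, article_id LIMIT ?",
            (n,),
        ).fetchall()

    def top_topics(self, n: int) -> List[Tuple[int, int]]:
        """Return the n most read topics as (topic_id, reads) pairs."""
        return self.conn.execute(
            "SELECT topic_id, reads FROM topic_reads "
            "ORDER BY reads DESC, topic_id LIMIT ?",
            (n,),
        ).fetchall()
```

    Because nothing here touches existing tables, the same API could be
    called from a template today and from a template tag later on.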

    Regards,
    Mugur
