[campsite-dev] [livesupport-dev] File Storage RFC
  • Hi all,

    Over at our sister mailing list for LiveSupport, Tomas Hlava has posted
    his RFC for file storage. Please take a look at it, both with an eye to
    using it in Campsite 3.0 and for any other general comments you may
    have.

    doug

    ----- Forwarded by Douglas Arellanes/Mdlf on 07/24/2004 02:53 AM -----


    "Tomas Hlava"
    07/23/2004 10:09 PM
    Please respond to livesupport-dev


    To: livesupport-dev@campware.org
    cc:
    Subject: [livesupport-dev] File Storage RFC


    Hi,
    To synchronize development, I am sending a "File Storage RFC"
    with a brief explanation of how metadata is stored and a quick API overview.
    I hope it reflects the specifications sent by Akos.

    Comments are welcome

    --
    Tomas Hlava
    th@red2head.com

    P.S. My responses may be delayed next week due to a problematic connection ...




    ------------------------------------------
    Posted to Phorum via PhorumMail
  • Hi Doug and everyone

    I thought I'd chip in with the following comments:

    *video/images/pdfs etc.*
    The spec is of course centred on audio content. But for the sake of
    other projects which might share this file-repository code, wouldn't it
    be better to allow for video, image, and other content as well? A
    print newspaper, for example, could find this repository useful as a
    full-resolution image archive. The RPC interface looks suitable for
    general-purpose storage of any large files with metadata, with the
    content type of the data just being another part of the metadata.

    *native metadata formats*
    Is there any intention to use the native metadata formats available with
    most media files, e.g. MP3 ID3 tags, JPEG IPTC/EXIF tags, PDF metadata,
    etc.? At least when you're uploading music to be played, this would
    cut down a lot on the metadata that would have to be entered (tools like
    cdex and musicbrainz can automatically add metadata to MP3 music
    files). I realise you would need plenty of metadata fields in addition
    to, for example, the standard ID3 ones, so you'd still need the
    metadata get/set RPC calls.
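
    To make the idea concrete, here is a rough sketch of reading an ID3v1
    tag (the fixed 128-byte block at the end of an MP3 file) and turning
    it into Dublin Core style fields. The mapping table and the function
    names are my own assumptions for illustration; nothing like this is
    defined in the RFC, and ID3v2 tags would need a proper library.

```python
import struct

# Assumed mapping from ID3v1 fields to Dublin Core element names
# (illustrative only; the RFC does not define such a mapping).
ID3V1_TO_DC = {"title": "dc:title", "artist": "dc:creator", "year": "dc:date"}


def read_id3v1_block(path):
    """Return the last 128 bytes of a file, where an ID3v1 tag would live."""
    with open(path, "rb") as f:
        f.seek(0, 2)              # jump to the end to learn the file size
        if f.tell() < 128:
            return b""
        f.seek(-128, 2)
        return f.read(128)


def parse_id3v1(tag):
    """Decode a 128-byte ID3v1 block; return {} if it is not a valid tag."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        return {}
    title, artist, album, year, _comment, _genre = struct.unpack(
        "<30s30s30s4s30sB", tag[3:]
    )

    def clean(raw):
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": clean(title),
        "artist": clean(artist),
        "album": clean(album),
        "year": clean(year),
    }


def id3v1_to_dc(tag):
    """Map the non-empty ID3v1 fields onto Dublin Core element names."""
    fields = parse_id3v1(tag)
    return {dc: fields[name] for name, dc in ID3V1_TO_DC.items() if fields.get(name)}
```

    Something like id3v1_to_dc(read_id3v1_block(path)) could pre-fill the
    metadata on upload, with the get/set calls still used for everything
    ID3 does not cover.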

    *streaming to the general public*
    You're probably not planning on allowing the general public to download
    media content from your server (although I saw the reference to
    'darkice'). But I read yesterday that the BBC is using an adapted
    version of BitTorrent for a new TV-over-IP network that is being
    trialled; they have solved the digital rights management issues and will
    massively reduce their required bandwidth by serving content through
    BitTorrent sharing. You might want to add that capability at some
    point.

    Once you talk about sharing files with the general public, you need to
    think about allowing for multiple formats of each media item,
    server-side audio and video transcoding, image resizing, etc. In the
    case of a link with Campsite I can see that you would want to make
    audio/video content available at varying bitrates, and images at
    different sizes and levels of compression, for example a thumbnail image
    on the homepage and a larger image inside a news story. In some cases
    you might want to serve up a video version of a story as well as an
    audio-only version. The MIME 'multipart/alternative' standard comes to mind.

    I just looked at your XML namespace and I see that it's Dublin Core,
    which includes provision for alternative versions of the same content,
    and for the content type as part of the metadata. So that will help.
    Maybe all you need to do for the moment is implement the DC 'relation'
    metadata field in your app.
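
    As a rough illustration of what I mean, the sketch below builds a
    metadata document for an alternative version of a clip, using the
    Dublin Core 'format' element for the content type and 'relation' to
    point back at the original. The dc:metadata wrapper is copied from
    the RFC's example; the function name and the choice of putting the
    original's gunid in dc:relation are my assumptions.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def describe_variant(title, mime_type, original_gunid):
    """Build metadata for an alternative version of an existing item."""
    root = ET.Element(f"{{{DC_NS}}}metadata")
    ET.SubElement(root, f"{{{DC_NS}}}title").text = title
    # dc:format carries the content type of this particular version.
    ET.SubElement(root, f"{{{DC_NS}}}format").text = mime_type
    # dc:relation links back to the item this version was derived from
    # (using the gunid as the reference is an assumption).
    ET.SubElement(root, f"{{{DC_NS}}}relation").text = original_gunid
    return ET.tostring(root, encoding="unicode")


xml_text = describe_variant(
    "Jingle bells (low bitrate)",
    "audio/ogg",
    "ea510749debe80e1a7c3e021a79b9288",
)
```

    A Campsite template could then pick the version whose dc:format and
    size suit the page, without the storage API needing to change.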


    Looks like good stuff, I'll be watching with interest.

    JP

    BTW I couldn't find out anything on 'Alib' (except for some assembly
    language library), do you have a URL for that, or is it another part of
    the LiveSupport project?

    Also, what is the relationship between this project and your
    Charlie/Sablotron-powered media storage project?

    > ----- Forwarded by Douglas Arellanes/Mdlf on 07/24/2004 02:53 AM -----
    >
    >
    > GreenBox - Livesupport file storage - request for comments
    >
    > 23.7.2004
    >
    >
    > Overview
    >
    > GreenBox extends the Alib class, which handles authentication and
    > authorization checking.
    > Alib provides a tree hierarchy of objects with permission management,
    > grouping of objects into classes, and user management with recursive
    > groups/roles.
    > GreenBox uses the Alib tree and extends it with a database of
    > file-specific information and an RDF-style metadata database with URI
    > identification of property types and values.
    >
    >
    > Basic model for metadata
    >
    > Subject --- Predicate(property) ---> Object
    >
    > Where:
    >
    > * Subject
    >   may be:
    >   1. a file in storage (identified by gunid - global unique id),
    >   2. a metadata record (mostly a blank node for the metadata hierarchy) or
    >   3. a literal (for some definitions, e.g. a namespace prefix)
    > * Predicate
    >   identifies the metadata property, i.e. namespace prefix and tag name
    > * Object
    >   is a blank node or the literal value of the property
    >
    >
    > Reserved namespace prefixes:
    >
    > _L      literal value
    > _G      gunid (global unique id) of file
    > _I      metadata record ID
    > _blank  blank node
    >
    >
    > Reserved predicates:
    >
    > _namespace (with empty prefix) namespace prefix definition
    >
    >
    > Example of metadata records:
    >
    > subject namespace: _G (means gunid of file)
    > subject: ea510749debe80e1a7c3e021a79b9288
    > predicate namespace: dc
    > predicate: metadata
    > object namespace: _blank
    > object: NULL
    >
    >
    > subject namespace: _I (means id of parent metadata record)
    > subject: 1234
    > predicate namespace: dc
    > predicate: title
    > object namespace: _L (means literal value)
    > object: Jingle bells
    >
    >
    > subject namespace: _L (means literal value)
    > subject: dc
    > predicate namespace:
    > predicate: _namespace
    > object namespace: _L (means literal value)
    > object: http://purl.org/dc/elements/1.1/
    >
    >
    >
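    > XML serialization of the previous example:
    >
    > <?xml version="1.0" encoding="utf-8"?>
    > <dc:metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    >   <dc:title>Jingle bells</dc:title>
    > </dc:metadata>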
    >
    >
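
    To check my reading of the model, here is a minimal sketch (not part
    of the RFC) that holds the three example records in memory and
    rebuilds the XML from them. The Record type, the rec_id field and the
    assumption that the first record has ID 1234 (so that the _I subject
    of the second record refers to it) are mine.

```python
from typing import NamedTuple, Optional
import xml.etree.ElementTree as ET


class Record(NamedTuple):
    subj_ns: str
    subj: str
    pred_ns: str
    pred: str
    obj_ns: str
    obj: Optional[str]
    rec_id: Optional[int] = None  # assumed: the ID that _I subjects refer to


RECORDS = [
    Record("_G", "ea510749debe80e1a7c3e021a79b9288", "dc", "metadata", "_blank", None, rec_id=1234),
    Record("_I", "1234", "dc", "title", "_L", "Jingle bells"),
    Record("_L", "dc", "", "_namespace", "_L", "http://purl.org/dc/elements/1.1/"),
]


def serialize(records, gunid):
    """Rebuild the XML metadata document for the file with this gunid."""
    # _namespace records map a prefix (subject) to a namespace URI (object).
    namespaces = {r.subj: r.obj for r in records if r.pred == "_namespace"}
    for prefix, uri in namespaces.items():
        ET.register_namespace(prefix, uri)

    def build(record):
        elem = ET.Element(f"{{{namespaces[record.pred_ns]}}}{record.pred}")
        if record.obj_ns == "_L":
            elem.text = record.obj
        else:
            # Blank node: its children are the records whose _I subject
            # is this record's ID.
            for child in records:
                if child.subj_ns == "_I" and child.subj == str(record.rec_id):
                    elem.append(build(child))
        return elem

    root_record = next(r for r in records if r.subj_ns == "_G" and r.subj == gunid)
    return ET.tostring(build(root_record), encoding="unicode")


xml_text = serialize(RECORDS, "ea510749debe80e1a7c3e021a79b9288")
```

    If that matches what you intend, each nesting level in the XML costs
    one blank-node record plus one record per child element.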
    >
    > XMLRPC API for local storage
    >
    > * |login(/*string*/ $login, /*string*/ $pass)| returns string
    > returns auth token or error
    > * |logout(/*string*/ $sessid)| returns boolean
    > * |authenticate(/*string*/ $login, /*string*/ $pass)| returns boolean
    > basic authentication check
    > * |existsAudioClip(/*string*/ $gunid)| returns boolean
    > Check if an Audio clip with the specified id is stored in local
    > storage.
    > * |storeAudioClip(/*string*/ $gunid, /*string*/ $mediaFileLP,
    > /*string*/ $mdataFileLP, /*string*/ $sessid)| returns gunid
    > Store a new audio clip or replace an existing one.
    > * |deleteAudioClip(/*string*/ $gunid, /*string*/ $sessid)| returns
    > boolean
    > Delete an existing Audio clip.
    > * |updateAudioClipMetadata(/*string*/ $gunid, /*string*/
    > $newMetaData, /*string*/ $sessid)| returns boolean
    > Update the metadata of an Audio clip stored in Local storage.
    > * |accessRawAudioData(/*string*/ $gunid, /*string*/ $sessid)|
    > returns string
    > Get access to raw audio data of an AudioClip.
    > * |releaseRawAudioData(/*string*/ $gunid, /*string*/ $sessid)|
    > returns boolean
    > Release access for raw audio data.
    > * |searchMetadata(/*string*/ $criteria, /*string*/ $sessid)|
    > returns array
    > Search through the metadata of stored AudioClips, and return all
    > matching clip ids.
    >
    > Methods may return an XMLRPC error response if they fail ...
    >
    > Common parameters:
    > |$sessid| - session id (token returned by login method)
    > |$gunid| - global unique id of file
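
    For what it's worth, this is how I picture a client using the calls
    listed above: log in, check whether the clip is already stored, store
    it if not, and always log out. It is only a sketch. The helper names
    and the skip-if-present behaviour are my assumptions (the RFC says
    storeAudioClip also replaces an existing clip), and any endpoint URL
    passed to connect() would be hypothetical.

```python
import xmlrpc.client


def connect(url):
    """Create an XML-RPC proxy; no request is sent until a method is called."""
    return xmlrpc.client.ServerProxy(url)


def store_clip(storage, login, password, gunid, media_path, mdata_path):
    """Store a clip unless it is already present; return its gunid.

    `storage` is any object exposing the RFC's method names, for example
    the proxy returned by connect(). A failed call on a real proxy
    surfaces as xmlrpc.client.Fault, which is left to the caller.
    """
    sessid = storage.login(login, password)
    try:
        if storage.existsAudioClip(gunid):
            return gunid
        return storage.storeAudioClip(gunid, media_path, mdata_path, sessid)
    finally:
        # Release the session token even if a call above failed.
        storage.logout(sessid)
```

    Passing the proxy in as a parameter keeps the workflow testable
    without a running storage server.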
    >
    >
    > XMLRPC API of interface to central archive
    >
    > * |downloadRawAudioData(/*string*/ $gunid, /*int*/ $offset,
    > /*string*/ $sessid)| returns boolean
    > * |searchMetadata(/*string*/ $criteria, /*string*/ $sessid)|
    > returns array
    > Search through the metadata of stored AudioClips, and return all
    > matching clip ids.
    >
    >
    > Main part of PHP API:
    >
    > * |GreenBox(&$dbc, $config)|
    > class constructor
    > * |createFolder($parid, $folderName, $sessid)|
    > * |putFile($parid, $fileName, $mediaFileLP, $mdataFileLP, $sessid)|
    > $mediaFileLP and $mdataFileLP contain the local paths to the media
    > and metadata files
    > * |getFile($id, $sessid)|
    > * |analyzeFile($id, $sessid)|
    > * |access($id, $sessid)|
    > * |moveFile($id, $did, $sessid)|
    > * |copyFile($id, $did, $sessid)|
    > * |deleteFile($id, $sessid)|
    > * |createReplica($id, $did, $replicaName, $sessid)|
    > * |createVersion($id, $did, $versionLabel, $sessid)|
    > * |updateMetadata($id, $mdataFile, $sessid)|
    > * |updateMetadataRecord($id, $mdid, $object, $sessid)|
    > * |addMetaDataRecord($id, $propertyNamespace, $propertyName,
    > $propertyValue, $sessid)|
    > * |getMdata($id, $sessid)|
    > * |localSearch($searchData, $sessid)|
    > * |uploadFile($id, $sessid)|
    > * |downloadFile($id, $parid, $sessid)|
    > * |getTransferStatus($transferId, $sessid)|
    > * |globalSearch($searchData, $sessid)|
    > * |getSearchResults($transferId, $sessid)|
    > * |listFolder($id, $sessid)|
    > * |getMetadata($id, $sessid)|
    >
    > Common parameters:
    > |$sessid| - session id (token returned by login method)
    > |$id| - id in object tree
    > |$parid| - parent id in object tree
    > |$mdid| - id in metadata table
    > |$transferId| - id of transfer or search job (returned by initiating
    > method)
    >
    >
    > Connection between local storage and other components
    >
    > Standard API calls will be made through the XMLRPC interface.
    > Large media files will be provided to other components directly using
    > temporary symlinks (filesystem sharing has to be set up in a
    > distributed installation of Livesupport).
    > An HTML interface written in PHP could include and call the API
    > methods directly.
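
    My understanding of the temporary-symlink idea, as a sketch only: the
    access call creates a link to the stored file in a directory the
    other components can read, and the release call removes it. The class
    name, the directory layout (one file per gunid) and the method names
    are assumptions loosely modelled on accessRawAudioData and
    releaseRawAudioData.

```python
import os


class LocalStorage:
    """Hands out temporary symlinks to stored media files."""

    def __init__(self, storage_dir, access_dir):
        self.storage_dir = storage_dir  # where media files live (assumed: named by gunid)
        self.access_dir = access_dir    # shared directory visible to other components

    def access_raw_audio_data(self, gunid):
        """Create a symlink to the stored file and return the link path."""
        target = os.path.join(self.storage_dir, gunid)
        if not os.path.isfile(target):
            raise FileNotFoundError(gunid)
        link = os.path.join(self.access_dir, gunid)
        if not os.path.islink(link):
            os.symlink(os.path.abspath(target), link)
        return link

    def release_raw_audio_data(self, gunid):
        """Remove the symlink; the stored file itself is left untouched."""
        link = os.path.join(self.access_dir, gunid)
        if os.path.islink(link):
            os.unlink(link)
            return True
        return False
```

    With one link per gunid, two components accessing the same clip would
    share a link, so a real implementation would probably need per-session
    links or a reference count.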
    >
    > ------------------------------------------------------------------------
    > Tomas Hlava
    > th@red2head.com
    >
    > P.S.: sorry for my English ... ;-)
    >

