[campsite-dev] questions: HTMLArea
  • i posted this on the wrong list, sorry. i guess it belongs on the dev list,
    in case someone here has any ideas on what to do. the issue: using
    HTMLArea instead of campfire as the wysiwyg text editor in campsite.

    ---

    i have a question concerning copy and paste into HTMLArea. for those
    with HTML email, view source from now on, if this does not come out correctly.

    the following workflow:

    --------------------------------------
    1. i open MS word (yes, i do ;-)

    --------------------------------------
    2. i type the following with default font settings:

    "This is to see what HTMLArea does with the Word font style specs when
    copying and pasting straight from document to document.

    I added a paragraph break here by pressing 'enter' twice.
    And now only once.
    IMPORTANT: I italicised the words 'HTMLArea' and 'Word' and made 'font
    style specs' bold."

    --------------------------------------
    3. i copy and paste straight into HTMLArea. then i click on the '<>' button
    to see the source and this is what i get:

    style="mso-ansi-language: EN-GB">This is to see what style="mso-bidi-font-style: normal">HTMLArea does with the style="mso-bidi-font-style: normal">Word style="mso-bidi-font-weight: normal">font style specs when copying and
    pasting straight from document to document.

    class="MsoNormal" style="MARGIN: 0cm 0cm 0pt"> style="mso-ansi-language: EN-GB">

    style="MARGIN: 0cm 0cm 0pt">I added a paragraph break here by pressing 'enter'
    twice.

    And now only
    once.

    lang="EN-GB" style="mso-ansi-language: EN-GB">IMPORTANT: I italicised the
    words 'HTMLArea' and 'Word' and made 'font style specs' bold.

    />



    --------------------------------------

    this is way too much HTML coming from word. even if we cleaned it up
    with a regular expression like
    "/(style|face|class|size)=(\'|\")(.|\n)*?(\'|\")/i"
    we would still be left with too much HTML. if i use the regex, the source
    comes out like this, which already looks much, much better:

    This is to see what HTMLArea does with the
    Word font style specs when copying and pasting straight from
    document to document.

    />

    I added a paragraph break here by pressing 'enter'
    twice.

    And now only once.

    />

    IMPORTANT: I italicised the words 'HTMLArea' and
    'Word' and made 'font style specs' bold.
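
For anyone who wants to try the attribute-stripping step outside the editor, here is a minimal sketch in Python (chosen only for brevity; it is not how campsite does it). The pattern is a slightly tightened variant of the one quoted above: a backreference forces the closing quote to match the opening one, which the quoted pattern does not, and the leading whitespace is consumed too. The function name and the sample markup are made up for illustration; the `p` and `span` tags are an assumption about what Word actually pasted.

```python
import re

# Variant of the regex quoted above: drop style/face/class/size attributes.
# \2 makes the closing quote match the opening one; \s+ swallows the space
# before the attribute so the tag is not left with a gap.
ATTR_RE = re.compile(
    r'\s+(style|face|class|size)=(["\']).*?\2',
    re.IGNORECASE | re.DOTALL,
)

def strip_word_attributes(html: str) -> str:
    """Remove the presentational attributes Word adds to pasted markup."""
    return ATTR_RE.sub("", html)

# Hypothetical sample resembling one pasted paragraph.
sample = (
    '<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt">'
    '<span lang="EN-GB" style="mso-ansi-language: EN-GB">'
    'And now only once.</span></p>'
)
print(strip_word_attributes(sample))
```

Note that the `span` wrapper and its `lang` attribute survive, which is the leftover markup complained about above: a regex on attributes alone does not remove redundant tags.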



    any ideas? could we pipe it through HTML Tidy? but that's not a php/pear
    class, it's a compiled PECL extension (for good reason, i guess ;-)
    http://pecl.php.net/package/tidy

    houston, i think we've got a problem... if we don't find a way to limit the
    amount of info that comes from word into HTMLArea.
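
One way to limit what comes through, sketched here under assumptions rather than proposed as the fix, is an allow-list: keep only a handful of tags, drop every attribute, and let the text pass. The sketch below uses Python's standard `html.parser`; the class name, the function name, and the set of allowed tags are all hypothetical and would need adjusting per site.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: the only markup worth keeping from a Word paste.
ALLOWED_TAGS = {"p", "br", "b", "i", "strong", "em"}
VOID_TAGS = {"br"}  # tags that take no closing tag

class WordPasteFilter(HTMLParser):
    """Re-emit allowed tags without attributes; drop all other markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # attributes are discarded

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is kept; re-escape it because the parser unescaped entities.
        self.out.append(escape(data, quote=False))

def clean_paste(html: str) -> str:
    parser = WordPasteFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

# Hypothetical sample resembling one pasted paragraph.
pasted = (
    '<p class="MsoNormal" style="MARGIN: 0cm 0cm 0pt"><span lang="EN-GB">'
    'made <b style="mso-bidi-font-weight: normal">font style specs</b> bold.'
    '<o:p></o:p></span></p>'
)
print(clean_paste(pasted))
```

Unlike the attribute regex, this also removes the `span` and `o:p` wrappers, at the cost of losing any formatting that is not on the allow-list.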

    what do the others think?



    Micz Flor - micz@mi.cz

    content and media development http://mi.cz
    -----------------------------------------------------------------
    http://www.campware.org -- http://crash.mi.cz -- http://sue.mi.cz
    "Das kommt in beiden Faellen Blau. Von Lila keine Spur."
    (Heike Bruysten)
    -----------------------------------------------------------------

    ------------------------------------------
    Posted to Phorum via PhorumMail
  • You might want to look at this

    http://www.fourmilab.ch/webtools/demoroniser/

    JP

  • Also this:

    http://www.microsoft.com/downloads/details.aspx?FamilyID=209ADBEE-3FBD-482C-83B0-96FB79B74DED&displaylang=EN

    I got these links from

    http://philip.greenspun.com/wtr/word.html

    and it looks like there is more there too

    JP
