This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
From: kmself@ix.netcom.com
Date: Tue, 19 Dec 2000 15:14:22 -0800
To: Silicon Valley Users Group <svlug@svlug.org>
Subject: [svlug] MS Word 2000 HTML rationalizer?

Cross-platform documentation project, partner in crime uses MS Word.
"HTML" output does not print in Netscape 4.7x, crashes Mozilla, is
printable (sans formatting) from Lynx and w3m, which may be a blessing
in disguise.

I'm familiar with John Walker's demoroniser, however Word2k appears to
have taken noncompliance to entirely new heights.

Is anyone familiar with a postprocessor which will dump rational,
simple, HTML from Word2K output?
===
Date: Tue, 19 Dec 2000 16:28:25 -0800
From: Brian Bilbrey <bilbrey@orbdesigns.com>
To: kmself@ix.netcom.com
Cc: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

On Tue, Dec 19, 2000 at 03:14:22PM -0800, kmself@ix.netcom.com wrote:
[snip]
> Is anyone familiar with a postprocessor which will dump rational,
> simple, HTML from Word2K output?

You might try importing the W2K output into SO (aka OpenOffice) -
while it doesn't print, it does allow you to re-export in HTML. I
haven't tried this specifically, but was looking at the SO / O2K
compatibility stuff a few weeks back, and found that the W2K
documents imported fairly well (though there are a couple of kinks
with tables??? or was that KWord that had such problems with the
tables? Hmmm.)

I certainly cain't vouch for the compliance or non- of the SO HTML
output, but it can't be worse, can it?

.brian

--
bilbrey@orbdesigns.com www.orbdesigns.com
"You have not experienced Shakespeare until you have read it in the
original Klingon." Gorkon: Stardate 9522.6, STVI
===
Date: Tue, 19 Dec 2000 16:28:16 -0800
From: Aaron Lehmann <aaronl@vitelus.com>
To: kmself@ix.netcom.com
Cc: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

On Tue, Dec 19, 2000 at 03:14:22PM -0800, kmself@ix.netcom.com wrote:
> Is anyone familiar with a postprocessor which will dump rational,
> simple, HTML from Word2K output?

Have you tried wvHTML? There's a CGI version at
http://www.freeviewer.com/.
===
From: kmself@ix.netcom.com
Date: Tue, 19 Dec 2000 16:33:33 -0800
To: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

Brian Bilbrey (bilbrey@orbdesigns.com) wrote:
> On Tue, Dec 19, 2000 at 03:14:22PM -0800, kmself@ix.netcom.com wrote:
> [snip]
> > Is anyone familiar with a postprocessor which will dump rational,
> > simple, HTML from Word2K output?
> You might try importing the W2K output into SO (aka OpenOffice) -
> while it doesn't print, it does allow you to re-export in HTML. I
> haven't tried this specifically, but was looking at the SO / O2K
> compatibility stuff a few weeks back, and found that the W2K
> documents imported fairly well (though there are a couple of kinks
> with tables??? or was that KWord that had such problems with the
> tables? Hmmm.)
> I certainly cain't vouch for the compliance or non- of the SO HTML
> output, but it can't be worse, can it?

Good idea....but.

I'm not sure if it's worse. It's certainly not much (if any) better.
===
Date: Tue, 19 Dec 2000 16:58:06 -0800
To: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?
From: Rick Moen <rick@linuxmafia.com>

begin Brian Bilbrey quotation:
> I certainly cain't vouch for the compliance or non- of the SO HTML
> output, but it can't be worse, can it?

It's pretty brain-dead. I had to do a tremendous amount of
post-StarOffice pruning, on a MS-Word2k document I recently tried that
with. It's http://linuxmafia.com/pub/jordan/Humor/abridged.html ,
actually. Pity I threw away the StarOffice-generated HTML mess it
started out being: It was really wretched.
===
Date: Tue, 19 Dec 2000 17:16:38 -0800
From: Brian Bilbrey <bilbrey@orbdesigns.com>
To: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

On Tue, Dec 19, 2000 at 04:58:06PM -0800, Rick Moen wrote:
> begin Brian Bilbrey quotation:
>
> > I certainly cain't vouch for the compliance or non- of the SO HTML
> > output, but it can't be worse, can it?
>
> It's pretty brain-dead. I had to do a tremendous amount of
> post-StarOffice pruning, on a MS-Word2k document I recently tried that
> with. It's http://linuxmafia.com/pub/jordan/Humor/abridged.html ,
> actually. Pity I threw away the StarOffice-generated HTML mess it
> started out being: It was really wretched.

I can draw a line with those two data points. Bad idea discarded. Be
interested to hear of any successes - as I migrate more functions at
work to Linux, I start to get the inquiries about how we might go
about going whole-hog away from MS, while retaining the ability to
collaborate with our customers. Tom and I also found this challenging
when working with IDG... Hmmm.
===
Date: Tue, 19 Dec 2000 17:24:51 -0800
To: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?
From: Rick Moen <rick@linuxmafia.com>

begin Brian Bilbrey quotation:
> I can draw a line with those two data points. Bad idea discarded. Be
> interested to hear of any successes - as I migrate more functions at
> work to Linux, I start to get the inquiries about how we might go
> about going whole-hog away from MS, while retaining the ability to
> collaborate with our customers.

Don't forget: Microsoft is the company that did its best to sabotage
the RTF format, when it discovered that far too many people were using
it for meaningful formatted-text compatibility. (I seem to recall them
doing this in MS Office 4.2, but it could have been Office 95.)
===
From: duperron@charter.net (Vince Duperron)
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?
To: kmself@ix.netcom.com
Date: Tue, 19 Dec 2000 19:17:14 -0600 (CST)
Cc: svlug@svlug.org (Silicon Valley Users Group)

Hello;

This isn't quite on topic (but close). Have you checked out
http://www.antiword.org ?
===
Date: Tue, 19 Dec 2000 18:03:10 -0800
From: hvrietsc@yahoo.com
To: Brian Bilbrey <bilbrey@orbdesigns.com>
Cc: kmself@ix.netcom.com, Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

i've had some good results along this line:

on windoze: create with word
rest on linux:
load .doc into staroffice
save as html

netscape seems to render it just fine.
===
Date: Wed, 20 Dec 2000 02:48:05 -0500
From: Bill Jonas <bill@billjonas.com>
To: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

On Tue, Dec 19, 2000 at 04:58:06PM -0800, Rick Moen wrote:
> It's pretty brain-dead. I had to do a tremendous amount of
> post-StarOffice pruning, on a MS-Word2k document I recently tried that
> with. It's http://linuxmafia.com/pub/jordan/Humor/abridged.html ,

Hmm, I had to create a resume in the company template for my new job.
Of course, the attachment was one of those dreaded ".dot" files. The
Word template itself was nothing fancy, so YMMV, but AbiWord imported
"pretty okay" and the HTML output was fairly reasonable HTML. AbiWord's
starting to get pretty not bad.

(The epilogue is that HTML that looked "close" wasn't good enough, and I
had to borrow someone's machine to do it in Word, if you're interested.
<rant>You'd think that an Internet consulting company that wanted
resumes to show potential clients would leverage the power of the
'net itself and, say, put HTML versions in a password-protected
area of the site, and create a password for a client so they could
peruse them online...</rant>)
===
Date: Wed, 20 Dec 2000 12:58:41 -0800 (PST)
From: Deirdre Saoirse <deirdre@deirdre.net>
To: Brian Bilbrey <bilbrey@orbdesigns.com>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

On Tue, 19 Dec 2000, Brian Bilbrey wrote:
> You might try importing the W2K output into SO (aka OpenOffice) -
> while it doesn't print, it does allow you to re-export in HTML. I
> haven't tried this specifically, but was looking at the SO / O2K
> compatibility stuff a few weeks back, and found that the W2K documents
> imported fairly well (though there are a couple of kinks with
> tables??? or was that KWord that had such problems with the tables?
> Hmmm.)

Unfortunately, one of the issues we discovered at the office was this
scenario:

1) User saves a doc in Word.
2) User makes changes, which are fast-saved.
3) Doc is imported into Star Office.

The problem is that you'll more likely see doc #1 than doc #2.

And, as I'm about to start a Master's degree in creative writing, and as
everyone sends their documents in Word and as realistic critiques are a
significant part of my grade.... I am going to be using MacOS on the
desktop -- with Word -- for the next two years. Feel my pain.

I myself will be using my ancient, but still personal favorite, Word
5.1a to format my own documents -- after composing and editing them in
html on bbedit (yes, I use CVS for revision control on fiction and
prefer html for that).

For one thing, Star Office, for all its faults, has one that is
particularly annoying: it is incapable of printing headers and footers
in the standard manuscript format.

At least I'll be able to run MacOS X and get *some* of the advantages of
BSD out of the experience.
===
Date: Wed, 20 Dec 2000 14:15:31 -0800
To: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?
From: Rick Moen <rick@linuxmafia.com>

begin Dire Red quotation:
> Unfortunately, one of the issues we discovered at the office was this
> scenario:
>
> 1) User saves a doc in Word.
> 2) User makes changes, which are fast-saved.
> 3) Doc is imported into Star Office.
>
> The problem is that you'll more likely see doc #1 than doc #2.

Kill the [l]user.
Problem solved.
===
To: Bill Jonas <bill@billjonas.com>,
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?
Date: Wed, 20 Dec 2000 15:18:31 -0800
From: J C Lawrence <claw@kanga.nu>

On Wed, 20 Dec 2000 02:48:05 -0500 Bill Jonas <bill@billjonas.com> wrote:
> <rant>You'd think that an Internet consulting company that wanted
> resumes to show potential clients would leverage the power of the
> 'net itself and, say, put HTML versions in a password-protected
> area of the site, and create a password for a client so they could
> peruse them online...</rant>)

I follow a simple rule: I don't work for or with companies that either
require MS-based files (as versus say flat text, PDF, or HTML), or,
more simply, which consider their time so much more valuable than mine.
===
Date: Wed, 20 Dec 2000 16:12:38 -0800 (PST)
From: Deirdre Saoirse <deirdre@deirdre.net>
To: Rick Moen <rick@linuxmafia.com>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

On Wed, 20 Dec 2000, Rick Moen wrote:
> begin Dire Red quotation:
>
> > Unfortunately, one of the issues we discovered at the office was this
> > scenario:
> >
> > 1) User saves a doc in Word.
> > 2) User makes changes, which are fast-saved.
> > 3) Doc is imported into Star Office.
> >
> > The problem is that you'll more likely see doc #1 than doc #2.
>
> Kill the [l]user. Problem solved.

When the [l]user in question is a customer who wants to spend money, it's
not so easily rationalised. :)

While our company uses a lot of Linux, almost none of our customers do.
Also, outside of Engineering, you rarely see Linux on the desktop.
===
Date: Wed, 20 Dec 2000 16:14:54 -0800
From: Rick Moen <rick@linuxmafia.com>
To: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

begin Dire Red quotation:
> When the [l]user in question is a customer who wants to spend money, it's
> not so easily rationalised. :)

Take his money, and _then_ kill him. See .signature block.

--
Cheers,                The Viking's Reminder:
Rick Moen              Pillage first, _then_ burn.
rick@linuxmafia.com
===
From: kmself@ix.netcom.com
Date: Thu, 21 Dec 2000 00:52:05 -0800
To: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

kmself@ix.netcom.com (kmself@ix.netcom.com) wrote:
> Cross-platform documentation project, partner in crime uses MS Word.
> "HTML" output does not print in Netscape 4.7x, crashes Mozilla, is
> printable (sans formatting) from Lynx and w3m, which may be a blessing
> in disguise.
> I'm familiar with John Walker's demoroniser, however Word2k appears to
> have taken noncompliance to entirely new heights.
> Is anyone familiar with a postprocessor which will dump rational,
> simple, HTML from Word2K output?

From the DocBook mailing list, kudos to Dave Pawson for suggesting
'tidy'. It has a somewhat eccentric arguments syntax -- you apparently
*have* to feed it a config file -- but it nicely trimmed all the crap
out of the monster which had landed on my doorstep.
===
Date: Fri, 22 Dec 2000 20:33:13 -0800 (PST)
From: fdj <mrlocomojo@yahoo.com>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?
To: Silicon Valley Users Group <svlug@svlug.org>

fyi - HTML Tidy is a wonderful little command line tool that can clean
up and convert your html. Tidy is endorsed by the w3c. It does have an
option that can be placed in a .rc-style file or invoked on the command
line to clean up word2000 documents.

From the html Tidy page < http://www.w3.org/People/Raggett/tidy/ >:

  word-2000: bool
    If set to yes, Tidy will go to great pains to strip out all the
    surplus stuff Microsoft Word 2000 inserts when you save Word
    documents as "Web pages". The default is no. Note that Tidy doesn't
    yet know what to do with VML markup from Word, but in future I hope
    to be able to map VML to SVG.

The above would be invoked as:

  tidy --word-2000 true msdoc.html > gooddoc.html

This will not only correct broken html (ala msword or hand-coded
hanging tags, open tags), it will produce warnings about recommended
standards that are not complied with, such as using an ALT attribute
with an IMG. No custom setup files are required.

Tidy will also do pretty-printing with indenting, making all your tags
the same case, etc.... It has limited support for php, and facilitates
creating custom tags.

Finally, tidy is an excellent tool to aid you in the move from html to
xml, as it has options to produce both xml and xhtml from html
documents.

I realize that someone else on the list mentioned tidy, but I'm not
sure they did it justice. It is an excellent html validator, and a
whole lot more.
===
From: kmself@ix.netcom.com
Date: Sat, 23 Dec 2000 00:29:47 -0800
To: Silicon Valley Users Group <svlug@svlug.org>
Subject: Re: [svlug] MS Word 2000 HTML rationalizer?

fdj (mrlocomojo@yahoo.com) wrote:
> fyi
> HTML Tidy is a wonderful little command line tool
> that can clean up and convert your html. Tidy is
> endorsed by the w3c. It does have an option that can
> be placed in a .rc-style file or invoked on the
> command line to clean up word2000 documents. From the
> html Tidy page <
> http://www.w3.org/People/Raggett/tidy/ >:

Found it, commented earlier. It did a bang-up job on one file. On the
second, not only does MS HTML not validate, but it kills the validator.
Go figure.

To appear shortly at an ecommerce vendor's release near you: either
DocBook generated materials, or something which looks suspiciously as
if it's been passed through w3m and pr.

Now how would I know that....?
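
For anyone curious what "stripping out the surplus stuff" amounts to,
here is a minimal sketch in Python of the kind of cleanup the word-2000
option is described as doing. It is not Tidy's algorithm and no
substitute for running Tidy: the function name strip_word2000_markup
and the regular expressions are illustrative assumptions, and they only
target a few well-known Word 2000 constructs (conditional comments,
namespaced tags such as <o:p>, Mso* class names, and mso- style
declarations).

```python
import re

# Conditional comments, e.g. <!--[if gte mso 9]> ... <![endif]-->
_CONDITIONAL = re.compile(
    r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", re.DOTALL | re.IGNORECASE
)
# Namespaced tags, e.g. <o:p> or </w:WordDocument> (tag only, not content)
_NAMESPACED_TAG = re.compile(r"</?[A-Za-z][\w-]*:[\w-]+[^>]*>")
# Namespace declarations, e.g. xmlns:o="urn:schemas-microsoft-com:office:office"
_XMLNS_ATTR = re.compile(r"\s+xmlns(?::\w+)?=(\"[^\"]*\"|'[^']*')", re.IGNORECASE)
# Word's own class names, e.g. class=MsoNormal or class="MsoBodyText"
_MSO_CLASS = re.compile(
    r"\s+class=(\"Mso[^\"]*\"|'Mso[^']*'|Mso\w+)", re.IGNORECASE
)
# Any quoted style attribute; its contents are filtered in _clean_style
_STYLE_ATTR = re.compile(r"\s+style=(\"[^\"]*\"|'[^']*')", re.IGNORECASE)


def _clean_style(match):
    """Drop mso-* declarations; drop the attribute if nothing is left."""
    quoted = match.group(1)
    quote = quoted[0]
    declarations = [d.strip() for d in quoted[1:-1].split(";")]
    kept = [d for d in declarations if d and not d.lower().startswith("mso-")]
    if not kept:
        return ""
    return " style=" + quote + "; ".join(kept) + quote


def strip_word2000_markup(html):
    """Remove a few Word 2000 specific constructs from an HTML string."""
    html = _CONDITIONAL.sub("", html)
    html = _NAMESPACED_TAG.sub("", html)
    html = _XMLNS_ATTR.sub("", html)
    html = _MSO_CLASS.sub("", html)
    html = _STYLE_ATTR.sub(_clean_style, html)
    return html


if __name__ == "__main__":
    sample = (
        '<html xmlns:o="urn:schemas-microsoft-com:office:office"><body>'
        "<!--[if gte mso 9]><xml><o:Author>x</o:Author></xml><![endif]-->"
        "<p class=MsoNormal style='mso-pagination:none;color:red'>"
        "Hello<o:p></o:p></p></body></html>"
    )
    print(strip_word2000_markup(sample))
```

A sketch like this does not repair broken markup or touch VML, which is
the part Tidy itself is for; it only shows how much of the noise is
mechanical.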
===
To: Erik Steffl <steffl@bigfoot.com>
Subject: Re: [svlug] PS2 -> USB
Date: Sat, 23 Dec 2000 11:15:30 -0800
From: J C Lawrence <claw@kanga.nu>

Erik Steffl <steffl@bigfoot.com> wrote:
> neither does logitech, for some reason they make USB wireless
> keyboard&mouse combo but urge you to use provided USB->ps/2
> convertors:-) kinda strange.

Happily I've found there are several vendors of USB->PS/2 converters
ala:

  http://www.provantage.com/FP_48274.HTM

Suggesting that my model Ms and I will be able to happily survive the
move to USB.
===