balug-talk mailing list

Subject: Re: How do I serialize/persist html w/ links
From: Brian Sroufek <brian_sroufek@msdesigninc.com>
Date: Tue, 18 Apr 2000 09:40:16 -0700

Godfrey Hobbs wrote:
> Hi,
>
> I want to be able to take an HTML page and allow users to upload it to a
> remote system. Then the users can view it on the remote system, or
> download it and view it on the original system.
>
> The problem is the links: relative, absolute, and http. I want images to
> be bundled. I also want to be able to handle HTML errors.
>
> Are there any RFCs on this, or any standards? Is this the same as web
> caching?

Perl can do this for you dynamically. Least troublesome approach.

===

Subject: RE: How do I serialize/persist html w/ links
From: "Godfrey Hobbs" <godfrey.hobbs@cubus.net>
Date: Wed, 19 Apr 2000 14:12:57 -0700

> But only http-links in document at present and no 'bundled' images.
> These are features in a next version which is being developed at present.

Yes, the bundling of linked objects (images, other pages, etc.) is the
challenge I am facing. There are two really hard parts:

1. Allowing HTML pages with syntax errors, or in-progress pages, to be
   uploaded.
   - Hopefully these pages can be contained so as not to allow the errors
     to cascade to other elements.
   - Also, the user text can include HTML tags (say they are writing a
     "how to build web pages" manual in HTML) which are never modified.
     Basically, grepping without context won't work.

2. Allowing the user to download their page and have it still work using
   the local links.

The second part is a particular challenge. For instance, a page has a link
'../../myImage.gif'. After upload, that link is now xyz123.gif so it can
be viewed without local context.
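The rename-and-restore scheme described here is exactly where, as later replies note, grepping without context breaks down and a real HTML parser helps: only genuine tags are rewritten, while escaped markup in the page text passes through untouched. A minimal sketch in Python's standard library (the Perl analogue would be HTML::Parser; the `LinkRewriter` class and the `objN_` naming scheme are invented for illustration, not part of any standard):

```python
import posixpath
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkRewriter(HTMLParser):
    """Rewrite relative src/href attributes to flat names, recording the
    original URL -> flat name mapping so a download-time pass can invert it."""

    REWRITE = {("img", "src"), ("a", "href"), ("link", "href")}

    def __init__(self, mapping):
        super().__init__(convert_charrefs=False)
        self.mapping = mapping
        self.out = []

    def _flat_name(self, url):
        # Invented naming scheme: sequential id plus the original basename.
        if url not in self.mapping:
            self.mapping[url] = "obj%d_%s" % (
                len(self.mapping), posixpath.basename(url))
        return self.mapping[url]

    def handle_starttag(self, tag, attrs):
        parts = []
        for name, value in attrs:
            if value is None:          # boolean attribute, e.g. <input disabled>
                parts.append(name)
                continue
            if (tag, name) in self.REWRITE and not urlparse(value).scheme:
                value = self._flat_name(value)   # relative link: flatten it
            parts.append('%s="%s"' % (name, escape(value)))
        self.out.append("<%s%s>" % (tag, "".join(" " + p for p in parts)))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    # Text and entities pass through untouched, so an escaped <img> in a
    # "how to build web pages" manual is never rewritten.
    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

def rewrite(html, mapping):
    p = LinkRewriter(mapping)
    p.feed(html)
    p.close()
    return "".join(p.out)
```

Absolute URLs (anything with a scheme, http/ftp/etc.) are left alone; on download, a second pass with the inverted mapping puts '../../myImage.gif' back in place of the flat name.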
Now, when the user wants to download it, the user expects the xyz123.gif
to revert back to ../../myImage.gif.

I realize there are lots of ways to hack most of this, but I am more
interested in a standard protocol or RFC. Considering how long HTML has
been around, there must be some kind of standard libraries. I know there
are some web caching protocols and RFCs. Is web caching similar enough to
this to be used?

===

Subject: Re: How do I serialize/persist html w/ links
From: Huub Schuurmans <twasm@aimnet.com>
Date: Wed, 19 Apr 2000 12:55:17 -0700

Brian Sroufek wrote:
> Godfrey Hobbs wrote:
> > I want to be able to take an HTML page and allow users to upload it to
> > a remote system. Then the users can view it on the remote system, or
> > download it and view it on the original system.
> >
> > The problem is the links: relative, absolute, and http. I want images
> > to be bundled. I also want to be able to handle HTML errors.
> >
> > Are there any RFCs on this, or any standards? Is this the same as web
> > caching?
>
> Perl can do this for you dynamically. Least troublesome approach.
>
> Brian

Could you elaborate a bit on what is available already in Perl? I use a
relatively simple Perl CGI script which allows users to update web pages:
input via a web form, conversion to HTML, and automatic adjustment of
links on other pages. But only http-links in the document at present, and
no 'bundled' images. These are features in a next version which is being
developed at present.

Huub

===

Subject: Re: How do I serialize/persist html w/ links
From: Brian Sroufek <brian_sroufek@msdesigninc.com>
Date: Thu, 20 Apr 2000 14:17:55 -0700

Huub Schuurmans wrote:
> Brian Sroufek wrote:
> > Godfrey Hobbs wrote:
> > > I want to be able to take an HTML page and allow users
> > > to upload it to a remote system. Then the users can
> > > view it on the remote system, or download and view it
> > > on the original system.
> > >
> > > The problem is the links: relative, absolute, and
> > > http. I want images to be bundled. I also want to be
> > > able to handle HTML errors.
> > >
> > > Are there any RFCs on this, or any standards? Is this
> > > the same as web caching?
> >
> > Perl can do this for you dynamically. Least troublesome approach.
> >
> > Brian
>
> Could you elaborate a bit on what is available already in Perl?
> I use a relatively simple Perl CGI script which allows users to update
> web pages: input via a web form, conversion to HTML, and automatic
> adjustment of links on other pages. But only http-links in the document
> at present, and no 'bundled' images. These are features in a next
> version which is being developed at present.
>
> Huub

Perl can be used quite easily as a script to dynamically bundle
files/images for subsequent download/presentation. If these are
pre-bundled, then Perl can simply present the user with the appropriate
link in an HTML form/page, which only needs one link image, or several,
but these get modified based on user selections/profile.

===

Subject: Re: How do I serialize/persist html w/ links
From: Huub Schuurmans <twasm@aimnet.com>
Date: Fri, 21 Apr 2000 08:28:44 -0700

I forward this reaction that I received from Frank Schuurmans:

> > > > I want to be able to take a html page and allow users to upload it
> > > > to a remote system.

This can be done using a form with an <INPUT TYPE="file"> tag. The form
can be processed using CGI.pm, which supports this tag. I don't know the
users, but it is generally a bad idea to allow unchecked/unfiltered
HTML/scripts to be uploaded.

> > > > Then the users can view it on the remote system or download
> > > > and view it on the original system.

Every web browser I know supports those requirements :)

> > > > The problem is the links both relative, absolute and http.
Absolute URLs (starting with http/ftp/gopher/news/telnet) are no problem.
If a document contains relative URLs, you just have to ask the user to
upload the file the relative URL points to (and rewrite the URL if you
rename the file).

> > > > I want images to
> > > > be bundled.

Images can be handled like URLs.

> > > > Also I want to be able to handle html errors.

You can use an HTML validator to check whether an HTML file contains
errors (and probably most uploaded files will contain errors). Some
errors can be corrected by software. Don't expect too much: AI is not
advanced enough yet ;).

> > > > Are there any rfc's on this or any standards.

World Wide Web Consortium (http://www.w3.org/) for HTML standards

URI RFC: http://www.ietf.org/rfc/rfc1630.txt
URL RFC: http://www.ietf.org/rfc/rfc1808.txt

Relevant Perl modules:
URI::URL (http://search.cpan.org/doc/RSE/lcwa-1.0.0/lib/lwp/lib/URI/URL.pm)
HTML::Parser (http://search.cpan.org/search?dist=HTML-Parser)
CGI.pm (http://search.cpan.org/search?dist=CGI.pm)

To handle links and images you can use HTML::LinkExtor (a subclass of
HTML::Parser).

===

Subject: RE: How do I serialize/persist html w/ links
From: "Godfrey Hobbs" <godfrey.hobbs@cubus.net>
Date: Fri, 21 Apr 2000 10:59:08 -0700

Thanks for the info, Huub. I wasn't planning to involve a browser in the
upload process, but something more like when Netscape publishes to the
web.

Anyway, I found RFC 2557, "MIME Encapsulation of Aggregate Documents,
such as HTML (MHTML)", is pretty good. It talks about many of the issues,
such as invalid links, that I am facing.

http://www.ietf.org/rfc/rfc2557.txt

Thanks

===
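The MHTML encapsulation Godfrey settles on can be tried out directly: RFC 2557 bundles a page and its objects into one multipart/related message, with a Content-Location header on each part so relative links keep resolving, which also gives the download side the original names back. A sketch in Python's standard library (the thread is Perl-centric, so this is an illustration of the format, not the thread's own code; the URLs and the crude type-guessing are invented for the example):

```python
from email.message import EmailMessage

def bundle_mhtml(base_url, html, images):
    """Pack an HTML page plus its images (a dict of relative URL -> bytes)
    into one multipart/related message, RFC 2557 style: each part carries
    a Content-Location header so relative links keep resolving."""
    msg = EmailMessage()
    msg["Subject"] = "Archived web page"
    msg.set_content(html, subtype="html",
                    headers=["Content-Location: " + base_url])
    for rel_url, data in images.items():
        subtype = rel_url.rsplit(".", 1)[-1].lower()  # crude type guess
        msg.add_related(data, maintype="image", subtype=subtype,
                        headers=["Content-Location: " + rel_url])
    return msg

def unbundle(msg):
    """Reverse step: Content-Location tells the download side which
    original (possibly relative) name each object should get back."""
    return {str(part["Content-Location"]): part.get_content()
            for part in msg.iter_parts()}
```

Because the original relative URL travels with each part, nothing in the page body has to be rewritten at all: unbundling restores each object under its original name, which is exactly the revert-on-download behaviour asked for earlier in the thread.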