web_caching_etc

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



balug-talk mailing list

Subject: Re: How do I serialize/persist html w/ links
From: Brian Sroufek <brian_sroufek@msdesigninc.com>
Date: Tue, 18 Apr 2000 09:40:16 -0700

Godfrey Hobbs wrote:

> Hi,
>
> I want to be able to take an HTML page and allow users to upload it to a
> remote system.  Then the users can view it on the remote system or download
> and view it on the original system.
>
> The problem is the links: relative, absolute, and http.  I want images to
> be bundled.  Also, I want to be able to handle HTML errors.
>
> Are there any RFCs or standards for this?  Is this the same as web
> caching?
>
> ________________________________________________________________________
> This message was sent by the balug-talk mailing list. To unsubscribe:
> echo unsubscribe | mail -s '' balug-talk-request@balug.org

Perl can do this for you dynamically.  Least troublesome approach.

===
Subject: RE: How do I serialize/persist html w/ links
From: "Godfrey Hobbs" <godfrey.hobbs@cubus.net>
Date: Wed, 19 Apr 2000 14:12:57 -0700

> But only http-links in document at present and no 'bundled' images.
> These are features in a next version which is being developed at present.

Yes, the bundling of linked objects (images, other pages, etc.) is the
challenge I am facing.

There are two really hard parts:

	1. Allowing HTML pages with syntax errors, or in-progress pages, to be
uploaded.
		- Hopefully these pages can be contained so that the errors do not
cascade to other elements.
		- Also, the user's text can include literal HTML tags (say they are
writing a 'how to build web pages' manual in HTML) which must never be
modified.  Basically, grepping without context won't work.
	2. Allowing the user to download their page and have it still work using
the local links.


	The second part is a particular challenge.  For instance, a page has a
link '../../myImage.gif'.  After upload, that link becomes 'xyz123.gif' so
the page can be viewed without local context.  When the user wants to
download it, the user expects 'xyz123.gif' to revert back to
'../../myImage.gif'.

	I realize there are lots of ways to hack most of this, but I am more
interested in a standard protocol or RFC.  Considering how long HTML has
been around, there must be some kind of standard libraries.

I know there are some web caching protocols and RFCs.  Is web caching
similar enough to this to be used?
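The round trip described above (rewrite relative links to flat names on upload, invert the mapping on download) can be sketched language-neutrally. The function names and the obj### naming scheme below are invented for illustration, and a real implementation would use an HTML parser rather than a regex, for exactly the "grepping without context" reason given earlier:

```python
import re

def flatten_links(html, mapping):
    """Replace relative src/href targets with generated flat names,
    recording the original -> flat mapping for later reversal."""
    def repl(m):
        attr, url = m.group(1), m.group(2)
        if url.startswith(("http:", "ftp:", "mailto:")):
            return m.group(0)  # absolute links are left untouched
        # generate a flat name, keeping the original extension
        flat = mapping.setdefault(
            url, "obj%03d%s" % (len(mapping), url[url.rfind("."):]))
        return '%s="%s"' % (attr, flat)
    return re.sub(r'(src|href)="([^"]+)"', repl, html)

def restore_links(html, mapping):
    """Invert the mapping so a downloaded page uses its original links."""
    reverse = {flat: orig for orig, flat in mapping.items()}
    return re.sub(
        r'(src|href)="([^"]+)"',
        lambda m: '%s="%s"' % (m.group(1),
                               reverse.get(m.group(2), m.group(2))),
        html)
```

The key design point is that the mapping must be stored alongside the uploaded bundle; otherwise the download step cannot recover the original relative links.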


===

Subject: Re: How do I serialize/persist html w/ links
From: Huub Schuurmans <twasm@aimnet.com>
Date: Wed, 19 Apr 2000 12:55:17 -0700

Brian Sroufek wrote:
> 
> Godfrey Hobbs wrote:
> 
> > I want to be able to take an HTML page and allow users to upload it to a
> > remote system.  Then the users can view it on the remote system or download
> > and view it on the original system.
> >
> > The problem is the links: relative, absolute, and http.  I want images to
> > be bundled.  Also, I want to be able to handle HTML errors.
> >
> > Are there any RFCs or standards for this?  Is this the same as web
> > caching?
> >
> 
> Perl can do this for you dynamically.  Least troublesome approach.
> 
> Brian
> 

Could you elaborate a bit on what is available already in Perl?
I use a relatively simple Perl CGI script that allows users to update
webpages: input via a webform, conversion to HTML, and automatic
adjustment of links on other pages. But only http-links in the document at
present, and no 'bundled' images. These are features in a next version
which is being developed at present.
Huub
Huub

===

Subject: Re: How do I serialize/persist html w/ links
From: Brian Sroufek <brian_sroufek@msdesigninc.com>
Date: Thu, 20 Apr 2000 14:17:55 -0700

Huub Schuurmans wrote:

> Brian Sroufek wrote:
> >
> > Godfrey Hobbs wrote:

> > > I want to be able to take an HTML page and allow users
> > > to upload it to a remote system.  Then the users can
> > > view it on the remote system or download and view it
> > > on the original system.
> > >
> > > The problem is the links: relative, absolute, and
> > > http.  I want images to be bundled.  Also, I want to
> > > be able to handle HTML errors.
> > >
> > > Are there any RFCs or standards for this?  Is this
> > > the same as web caching?

> >
> > Perl can do this for you dynamically.  Least troublesome approach.
> >
> > Brian
> >
>
> Could you elaborate a bit on what is available already in Perl?
> I use a relatively simple Perl CGI script that allows users to update
> webpages: input via a webform, conversion to HTML, and automatic
> adjustment of links on other pages. But only http-links in the document at
> present, and no 'bundled' images. These are features in a next version
> which is being developed at present.
> Huub
>

Perl can be used quite easily to dynamically bundle files and images
for subsequent download or presentation.

If the files are pre-bundled, then a Perl script can simply present the
user with the appropriate download link in an HTML form or page; the
link (or links) can be modified based on the user's selections or
profile.
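The bundling step itself is straightforward with any archive format. As a minimal, language-neutral sketch (file names here are invented for illustration), a page and its referenced images can be packed into a single zip for download:

```python
import io
import zipfile

def bundle(page_name, page_html, images):
    """Return the bytes of a zip archive containing the page plus its
    referenced images; `images` maps archive names to raw bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr(page_name, page_html)
        for name, data in images.items():
            z.writestr(name, data)
    return buf.getvalue()
```

A CGI script would then emit this with an appropriate Content-Type header, so the browser offers it as a download.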

===
Subject: Re: How do I serialize/persist html w/ links
From: Huub Schuurmans <twasm@aimnet.com>
Date: Fri, 21 Apr 2000 08:28:44 -0700

I forward this reaction that I received from Frank Schuurmans 

 > > > > I want to be able to take a html page and allow users to upload
 > > > > it to a remote system.

 This can be done using a form with an <INPUT TYPE="file"> tag.  The form
 can be processed using CGI.pm, which supports this tag.  I don't know the
 users, but it is generally a bad idea to allow unchecked/unfiltered
 HTML/scripts to be uploaded.
 
 > > > > Then the users can view it on the remote system or download
 > > > > and view it on the original system.
 
 Every web browser I know supports those requirements :)
 
 > > > > The problem is the links both relative, absolute and http.
 
 Absolute URLs (starting with http/ftp/gopher/news/telnet) are no problem;
 if a document contains relative URLs, you just have to ask the user to
 upload the file the relative URL points to (and rewrite the URL if you
 rename the file).
 
 > > > > I want images to
 > > > > be bundled.
 
 Images can be handled like URLs.
 
 > > > > Also I want to be able to handle html errors.
 
 You can use an HTML validator to check whether an HTML file contains
 errors (and probably most uploaded files will contain errors).  Some
 errors can be corrected by software.  Don't expect too much: AI is not
 advanced enough yet ;).
 
 > > > > Are there any rfc's on this or any standards.
 
 World Wide Web Consortium (http://www.w3.org/) for HTML standards
 URI RFC: http://www.ietf.org/rfc/rfc1630.txt
 URL RFC: http://www.ietf.org/rfc/rfc1808.txt
 
 relevant Perl modules:
 
 URI::URL (http://search.cpan.org/doc/RSE/lcwa-1.0.0/lib/lwp/lib/URI/URL.pm)
 HTML::Parser (http://search.cpan.org/search?dist=HTML-Parser)
 CGI.pm (http://search.cpan.org/search?dist=CGI.pm)

 To handle links and images you can use HTML::LinkExtor (a subclass of
 HTML::Parser).
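The HTML::LinkExtor approach walks the document with a real parser, so literal tags written as text (escaped entities in a "how to build web pages" manual) are not confused with markup the way naive grepping would be. The same idea in Python's standard library, as a sketch (the tag/attribute table below is a partial, illustrative subset):

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class LinkExtractor(HTMLParser):
    """Collect link attributes, split into absolute and relative URLs."""
    # which attribute carries the link for each tag (illustrative subset)
    LINK_ATTRS = {"a": "href", "img": "src", "link": "href", "script": "src"}

    def __init__(self):
        super().__init__()
        self.absolute, self.relative = [], []

    def handle_starttag(self, tag, attrs):
        wanted = self.LINK_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # a URL with a scheme (http:, ftp:, ...) is absolute
                (self.absolute if urlsplit(value).scheme
                 else self.relative).append(value)
```

The relative list is exactly the set of files the user must also upload (and whose URLs must be rewritten if the files are renamed), per the advice above.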

===

Subject: RE: How do I serialize/persist html w/ links
From: "Godfrey Hobbs" <godfrey.hobbs@cubus.net>
Date: Fri, 21 Apr 2000 10:59:08 -0700

Thanks for the info Huub.

I wasn't planning to involve a browser in the upload process, but something
more like what Netscape does when it publishes to the web.

Anyway, I found RFC 2557, "MIME Encapsulation of Aggregate Documents, such
as HTML (MHTML)", which is pretty good.  It talks about many of the issues,
such as invalid links, that I am facing.
http://www.ietf.org/rfc/rfc2557.txt
Thanks
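RFC 2557's core idea is a multipart/related MIME message where each part carries a Content-Location header giving its original URL, so links inside the HTML still resolve. A minimal sketch using Python's standard email package (the URLs and content are illustrative only):

```python
from email.message import EmailMessage

def make_mhtml(html, html_location, parts):
    """Build a multipart/related (MHTML-style) message.
    `parts` maps Content-Location URLs to (maintype, subtype, bytes)."""
    msg = EmailMessage()
    msg.set_content(html, subtype="html")
    msg.make_related()  # wrap the HTML as the first related part
    msg.get_payload(0)["Content-Location"] = html_location
    for location, (maintype, subtype, data) in parts.items():
        msg.add_related(data, maintype=maintype, subtype=subtype)
        # tag the part with the URL its links refer to
        msg.get_payload(-1)["Content-Location"] = location
    return msg
```

Serializing such a message yields a single self-contained file, which is essentially what browsers later shipped as .mht "web archive" saving.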

===



the rest of The Pile (a partial mailing list archive)

doom@kzsu.stanford.edu