[nycphp-talk] fetch page script

Hans Zaunere zaunere at yahoo.com
Wed Jan 29 16:29:29 EST 2003


Hello,

--- Carlos A Hoyos <cahoyos at us.ibm.com> wrote:
>
> Hi,
> I need to write a function that, given a URL, will open it and "rip it" to
> my local server, (i.e. save the page's code to a specific directory, queue
> all of the page's assets -images, css, js, etc- save them to that directory
> as well and fix all paths to make it available for local browsing).

Connecting and ripping pages isn't a big deal, but there may be some added
complexity to consider:

-- Are cookies involved?  Will the remote server give you the content you
expect?

-- I think parsing the code for all required assets would be horrible
(especially dealing with JS and the like in really complex pages).  The same
goes for fixing paths.  Then again, I despise client-side code, so I'm
probably biased.

> Not really a big one, but if somebody already knows about a script that
> does this, I wouldn't mind the head start.

I'm sure something exists that you could at least start with; maybe try
http://phpclasses.mirrors.nyphp.org or hotscripts.com.

Also, maybe the good old wget command-line tool would come in handy.
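
As a rough sketch of the wget route (not a tested recipe for any particular
site): the helper name rip_page, the default directory ./ripped, and the WGET
override variable are made up for illustration. The flags themselves are
standard wget options.

```shell
#!/bin/sh
# rip_page URL [DEST_DIR]
# Hypothetical helper: save one page plus the assets it needs into DEST_DIR.
rip_page() {
    rip_url=$1
    rip_dest=${2:-./ripped}    # assumed default target directory

    # --page-requisites  : also fetch the images, CSS and JS the page needs
    # --convert-links    : rewrite links in the saved files for local browsing
    # --span-hosts       : allow requisites that live on other hosts
    # --directory-prefix : save everything under the target directory
    # WGET can be overridden (e.g. WGET=echo) to see the command without
    # touching the network.
    ${WGET:-wget} --page-requisites --convert-links --span-hosts \
        --directory-prefix="$rip_dest" "$rip_url"
}

# Example call (placeholder URL and directory):
#   rip_page "http://example.com/page.html" /var/www/ripped
```

If the remote site depends on cookies, wget can also read them from a file
with --load-cookies, which may or may not be enough depending on how the site
sets them.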

> Lastly, it was great to have met some of you last night... great meeting.

Good seeing you again, Carlos.


=====
Hans Zaunere
President, New York PHP
http://nyphp.org
hans at nyphp.org


