[nycphp-talk] Some comments on the XML Talk

Elliotte Harold elharo at metalab.unc.edu
Thu Nov 1 07:21:05 EDT 2007


Kenneth Downs wrote:

> Finally, I would have liked to hear more of Rusty's ideas about the 
> relationship between the file system, the web server, and the database.  
> Rusty, do you want to expand on that here?
> 

Well, in most applications, the database stores its data in the file 
system. However, it's just one or a few files. The structure is inside 
the files, just as it is with MySQL. The file system is just a 
convenient interface to the hard drive. I suppose it's possible a big 
XML DB might talk to the hard drive directly and bypass the file 
system, just as Oracle does sometimes, but that's an implementation detail.

The web server is the part I'm still thinking about. In practice today 
the web server is designed as an interface to the file system. URLs are 
converted into paths which are used to serve files. Sometimes those 
files are further processed by PHP or similar tools and what's served 
isn't quite what's in the file. Sometimes we use mod_rewrite or similar 
tools to remap some URLs to different file paths. However, the basic 
design is that the URL structure mirrors one or more file system 
hierarchies, and everything's layered on top of that.

However, I'm starting to uncover a lot of applications where this 
URL==filesystem design doesn't work very well. I want to map URLs to 
something other than filesystems; for instance to database queries and 
templates. I've been building one such system lately as an internal 
controller for another application. All URLs are served by invoking 
certain methods in a running program. It's a special purpose system, but 
it's one for which the file system doesn't make sense.

I'm considering how one might genericize such a system. That is, what 
would a general purpose web server that doesn't necessarily serve files 
look like? How would one configure it, and tell it what to serve for 
each URL requested? How does one tell it that http://www.example.com/foo 
is a file but http://www.example.com/bar is a database query? Existing 
solutions like PHP, Java servlets, and mod_rewrite are too inflexible 
for what I envision. They're also too hard to use and too confusing. 
That may be partially a result of poor design, but I suspect it's mostly 
because they still implicitly assume that what we're doing is serving a 
file system with a few small tweaks. Perhaps we can do better if we get 
rid of the assumption that there must be a file system in place.
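
One way to picture the question above is a table that maps URL paths 
directly to handlers, with no file system assumed underneath. The 
following is only a rough sketch in Python, not a description of any 
existing server; the names serve_file, run_query, ROUTES, and dispatch 
are all invented for illustration, and the handlers just return 
placeholder strings.

```python
from typing import Callable, Dict

# Hypothetical handlers: each takes the request path and returns a body.
def serve_file(path: str) -> str:
    return f"contents of static file for {path}"

def run_query(path: str) -> str:
    return f"result of database query for {path}"

# Explicit table: the URL itself says nothing about where content lives.
ROUTES: Dict[str, Callable[[str], str]] = {
    "/foo": serve_file,   # http://www.example.com/foo is a file
    "/bar": run_query,    # http://www.example.com/bar is a database query
}

def dispatch(path: str) -> str:
    """Look up the handler for a path; unknown paths get a 404 string."""
    handler = ROUTES.get(path)
    if handler is None:
        return "404 Not Found"
    return handler(path)
```

A flat table like this is easy to read, but it only covers exact paths; 
it says nothing yet about ranges of URLs or exceptions to them.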

I don't have an answer yet. I'm mostly just musing on some 
possibilities, and letting the ideas cook in my head for now. The tricky 
bit is figuring out how to design this so that there aren't a lot of 
confusing precedence rules for resolving conflicts between different 
mappings, while still allowing arbitrary mappings. For instance, one 
should be able to say that http://www.example.com/foo/bar/baz1 through 
http://www.example.com/foo/bar/baz100 are all database queries except 
for http://www.example.com/foo/bar/baz23 which is a static file, or that 
http://www.example.com/foo/baz1 through 
http://www.example.com/foo/baz100 are database queries unless there's a 
static 23.html file in directory /baz, in which case that should be used 
instead.
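
To make the precedence problem concrete, here is one possible (and 
deliberately simplistic) rule: exact paths always beat patterns, and a 
pattern match can still be overridden when a static file is known to 
exist. This is a Python sketch under my own assumptions; EXACT, 
PATTERNS, resolve, and the static_overrides argument are hypothetical 
names, and the set of overrides stands in for a real check of the disk.

```python
import re
from typing import Callable, Dict, FrozenSet, List, Tuple

Handler = Callable[[str], str]

def static_file(path: str) -> str:
    return f"static:{path}"

def db_query(path: str) -> str:
    return f"query:{path}"

# Rule 1: exact paths win over everything else.
EXACT: Dict[str, Handler] = {"/foo/bar/baz23": static_file}

# Rule 2: patterns are tried in order; this one matches baz1 .. baz100.
PATTERNS: List[Tuple[re.Pattern, Handler]] = [
    (re.compile(r"^/foo/bar/baz([1-9]\d?|100)$"), db_query),
]

def resolve(path: str, static_overrides: FrozenSet[str] = frozenset()) -> str:
    """Resolve a path: exact match, then static override, then pattern."""
    if path in EXACT:
        return EXACT[path](path)
    for pattern, handler in PATTERNS:
        if pattern.match(path):
            # Assumed stand-in for "a static file exists for this URL".
            if path in static_overrides:
                return static_file(path)
            return handler(path)
    return "404 Not Found"
```

With these rules, resolve("/foo/bar/baz23") goes to the static handler 
and resolve("/foo/bar/baz1") goes to the query handler. Even this small 
example already needs two precedence rules, which is the sort of thing 
that gets confusing once mappings are arbitrary.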

It's possible I'm being too demanding. There may be a really clean 80/20 
cut somewhere, but so far I don't see it. I may need to build a few more 
applications along these lines first, just to see which features are 
really needed and which are just paint on the lilies. In any case, I 
don't have the answer yet, just the question.

This is orthogonal to the issue of whether the backend is an XML DB, a 
SQL DB, or something else.


-- 
Elliotte Rusty Harold  elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/


