NYCPHP Meetup

NYPHP.org

[nycphp-talk] Translating XML to SQL

Eric Rank erank at isthmus.com
Thu Feb 12 10:34:59 EST 2004


After I went home last night my wheels were still spinning about this idea,
and I started re-thinking the design from the ground up, using what I've
learned from this project so far. Currently the use of rulesets makes the
design a little bit too complex. The ruleset itself consists of a list of
tables in the database, within it are children tags containing <if> and
<else> tags. Within those condition tags are statements that would need to
occur to get the value for a specific column. This gets extremely messy and
confusing. Not only do you have to figure out what column relies on the
inserted id from another, but you also have to know what kind of SQL
statements to make. Confused? yeah, me too.

After thinking about it, all this ruleset is, is a combination of
XML-to-table.column mappings and primary key-foreign key pairs. Combining
them gets sloppy, but seperating them seems the way to go. It's much easier
to write 2 seperate files. 1. an xml file with Primary key - Foreign key
pairs and 2. an xml-to-table.column map xml file. I can also say that it'd
be much easier to write a parser for those files. Right now, parsing through
a ruleset xml file with pseudo logic embedded is ugly. Parsing this
psuedo-logic xml is half the battle. Once it has been incorporated into the
rule object, that rule object has to follow that logic. Yuck.

So, since I was off the clock, I cracked open a beer and started putting
together a new design for this project (beer, I find, is the only way to
really justify thinking about work stuff when I'm at home). A rough UML
diagram is here:

http://lo-fi.net/~eric/default.gif

All the objects are there, all the methods and properties are not.

The idea is this, the xml translator has a database property which holds
regular database stuff, but also an xml-to-table.column map and a primary
key to foreign key map. The tables hold record objects. the record objects
are created when the data file gets parsed by the data xml parser. When the
parser gets to an element that holds data, it looks at the xml-db map on the
database object, and knows what table to create a record for. After the data
has all been parsed into record objects for the tables, it's time to insert
the records. At that point, the translator goes through all the tables,
making inserts for records that have not yet been saved, ignoring those that
already exist. If an incomplete record is found during this process, where
an id is needed from a different record insertion, the translator->database
object knows where look by virtue of it's Primary Key - Foreign Key map. If
the id that we need is there, great, grab it and put it in the record. If
not, try to insert the record (or SELECT the record ID if it already exists,
eg. an authors name) for the table that we need the id from. Then you can go
back and save the first record. This process would need to be a recursive
one.

What kind of problems do you think I might run into using a design like
this? Is anything missing?

I'll consider the article idea. At this point, I've done enough head
scratching that I think it'd be nice to save anyone else the grief. I'll
work on collecting and writing up my experiences.

Eric Rank



----- Original Message ----- 
From: "Hans Zaunere" <hans not junk at nyphp.com>
To: "NYPHP Talk" <talk at lists.nyphp.org>
Sent: Wednesday, February 11, 2004 5:00 PM
Subject: RE: [nycphp-talk] Translating XML to SQL


>
> This is interesting, Eric.  If you write your emails anywhere as complete
as you'd write an article, this might make a good Advanced PHundamentals  :)
>
> H
>
> > -----Original Message-----
> > From: talk-bounces at lists.nyphp.org
> > [mailto:talk-bounces at lists.nyphp.org] On Behalf Of Eric Rank
> > Sent: Wednesday, February 11, 2004 5:40 PM
> > To: NYPHP Talk
> > Subject: Re: [nycphp-talk] Translating XML to SQL
> >
> > Thanks for your suggestions Dan (other Dan too). I should
> > apologize for
> > being a little vague in my previous post about the details of
> > the project
> > I'm working on. I mainly wanted to see if any of y'all had
> > faced similar
> > XML->SQL challenges and if there might be packaged solution
> > was out there to
> > solve it.
> >
> > Parsing the XML and encapsulating it into objects is the easy
> > part in this
> > project. It's trying to unpack it all that gets a little screwy.
> >
> > My solution is to do the following:
> >
> > 1. First parse the XML content into a multidimensional array
> > that reflects
> > the xml structure (After thinking about it, this should
> > probably be a more
> > structured object)
> > 2. Create a database object to encapsulate the database structure
> > 3. parse an xml formatted ruleset file into an object that
> > will determine
> > how to execute sql statements.
> >
> > This ruleset object is the key to making the whole thing come
> > together.
> > Essentially, it does the same thing as a stored procedure would. The
> > likeness is obvious now that you bring up the topic. I hadn't
> > thought of
> > that before. Basically, this creates a 'rule' for each table, and
> > contigencies for the columns and tables as needed. This also
> > serves as a
> > place to map out the relationship between an xml file and a database
> > structure. The cool thing is that I should be able to take
> > just about any
> > database & XML pair, create a ruleset file defining the
> > relationships &
> > contigencies, and it should all work. Although a new xml
> > parser would need
> > to be created for the xml file holding the data in order to create the
> > multidimensional struct the translator object expects.
> >
> > Right now, I'm to the point where:
> > 1. I'm looping through the list of tables from the database object
> >     1.1 checking the rules for that table
> >         1.1.1 if the rule passes, the insert query happens and returns
> >                 success
> >         1.1.2 if the rule does NOT pass, i need to look at a
> > different table
> >                 so i go through the process recursively.
> > checking table
> > rules,
> >                 inserting where possible until all records
> > have been saved,
> >                 or, at this point, ad infinitum
> >
> > I can insert data, and to a certain degree it gets inserted
> > correctly, but
> > I'm winding up with some corruption of data, which I'm trying
> > to pin down.
> > Infinite loops are happening right now... Recursion is cool
> > when it works. I
> > kind of feel like I'm working magic when it happens, but
> > until that magical
> > point, it's fairly hair raising.
> >
> >
> >
> >
> > ----- Original Message ----- 
> > From: "Dan Cech" <dcech at phpwerx.net>
> > To: "NYPHP Talk" <talk at lists.nyphp.org>
> > Sent: Wednesday, February 11, 2004 1:13 PM
> > Subject: Re: [nycphp-talk] Translating XML to SQL
> >
> >
> > > From the work I've done on adodb-xmlschema, I would advise an object
> > > oriented approach, it really helps to break up the logic.
> > >
> > > I would create a class to hold the methods for parsing a
> > document, etc,
> > > and another class to represent an article.
> > >
> > > Once you have a class for the article you can build up all
> > the article
> > > details from the xml, and finally generate and run the SQL
> > to insert all
> > > the required records.  You could also use the class for
> > accessing the
> > > data once it is in the database....
> > >
> > > You may also want to create classes for author, etc which
> > could handle
> > > the logic for updating/inserting the relevant details into
> > those tables.
> > >
> > > Anyway, it appears you are looking at a procedural solution
> > right now,
> > > which will work too....honestly I'd have to have more
> > details about the
> > > specific issues you are working around in order to give you
> > any concrete
> > > advice.
> > >
> > > Dan
> > >
> > > Eric Rank wrote:
> > > > I'm working on a PHP app that will translate XML to SQL based on a
> > > > configuration file with rules, and as I'm getting to the
> > end of this
> > > > project, all the quirks and airbubbles (yes, ok, they're BUGS) are
> > > > getting squeezed out. And man, it's no fun. So now I'm looking for
> > > > solutions that may have been created before me.
> > > >
> > > > Specifically, the xml contains information from articles
> > in a newspaper.
> > > > Author, issue, volumne, section, category, headline,
> > story, etc. This
> > > > xml we get from an export from QuarkXPress (which is its
> > own nightmare)
> > > >
> > > > I'm trying to  take this xml structure that contains NO database
> > > > specific information and place it into a MySQL database
> > with about a
> > > > dozen related tables. The trick is that in going through
> > an xml file,
> > > > there are a lot of contigencies to cover before finally
> > reaching the
> > > > database. For example, some INSERT statements require the
> > inserted id of
> > > > a previous insert statement in order to make the records
> > relational (eg.
> > > > we need to insert the 'story' info into the 'article'
> > table to get the
> > > > inserted id for the article, which then gets inserted
> > into the 'photos'
> > > > table in the 'articleID' column.). Some insert statements
> > might not need
> > > > to happen because the information already exists (eg. an
> > author that
> > > > already exists in the database). You get the idea.
> > > >
> > > > Anyone have experience doing this? Is there an open
> > source project out
> > > > there that's working on this? Third party software even?
> > > >
> > > > Thanks for your help,
> > > >
> > > > Eric Rank
> > >
> > > _______________________________________________
> > > talk mailing list
> > > talk at lists.nyphp.org
> > > http://lists.nyphp.org/mailman/listinfo/talk
> >
> > _______________________________________________
> > talk mailing list
> > talk at lists.nyphp.org
> > http://lists.nyphp.org/mailman/listinfo/talk
> >
> >
> _______________________________________________
> talk mailing list
> talk at lists.nyphp.org
> http://lists.nyphp.org/mailman/listinfo/talk


>From hans not junk at nyphp.com  Thu Feb 12 10:38:33 2004
Return-Path: <hans not junk at nyphp.com>
Received: from ehost011-1.exch011.intermedia.net (unknown [64.78.21.3])
	by virtu.nyphp.org (Postfix) with ESMTP id 0F13EA8628
	for <talk at lists.nyphp.org>; Thu, 12 Feb 2004 10:38:33 -0500 (EST)
X-MimeOLE: Produced By Microsoft Exchange V6.5.6944.0
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Subject: RE: [nycphp-talk] Session question
Date: Thu, 12 Feb 2004 07:38:30 -0800
Message-ID: <41EE526EC2D3C74286415780D3BA9F87772AAD at ehost011-1.exch011.intermedia.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: [nycphp-talk] Session question
Thread-Index: AcPxJHy9pDAV8hSzTQe+pOowrTE7iwAWaxUQ
From: "Hans Zaunere" <hans not junk at nyphp.com>
To: "NYPHP Talk" <talk at lists.nyphp.org>
X-BeenThere: talk at lists.nyphp.org
X-Mailman-Version: 2.1.2
Precedence: list
Reply-To: NYPHP Talk <talk at lists.nyphp.org>
List-Id: NYPHP Talk  <talk.lists.nyphp.org>
List-Unsubscribe: <http://lists.nyphp.org/mailman/listinfo/talk>,
	<mailto:talk-request at lists.nyphp.org?subject=unsubscribe>
List-Archive: <http://lists.nyphp.org/pipermail/talk>
List-Post: <mailto:talk at lists.nyphp.org>
List-Help: <mailto:talk-request at lists.nyphp.org?subject=help>
List-Subscribe: <http://lists.nyphp.org/mailman/listinfo/talk>,
	<mailto:talk-request at lists.nyphp.org?subject=subscribe>
X-List-Received-Date: Thu, 12 Feb 2004 15:38:33 -0000



> In createTireImg.php, code like following:
> -------
> <?php
> include ("img_text_config.inc.php");
> include ("tbContent.php");
>    session_start( );
> ?>

Are you also doing a session_start(); in the other pages?
session_start() has to be called at every page, or in a file that is
included by all your pages.

Also..

>       Text color: &nbsp;&nbsp;&nbsp;&nbsp;
>       <select name=3D"tt_color">
>       <?php
>          if (isSet($_POST['tt_color']))
>          {
>             $tt_color_def =3D ($_POST['tt_color']);
>=20
>          }
>             $_SESSION['tt_color'] =3D $_POST['tt_color'];

Keep in mind, that even if $_POST['tt_color'] is not set, you are
assigning it to $_SESSION['tt_color'];

You might want to add a else in there.

HTH,
H




More information about the talk mailing list