[nycphp-talk] Best way to accomplish this task

Gary Mort garyamort at gmail.com
Sun Feb 14 22:35:23 EST 2010


First off, always process in the smallest increment you can.  So don't grab
EVERY row; grab ONE row, process it, then grab the next row and process
it.  That lets you run multiple concurrent processors, with each one
grabbing only the data it is about to work on.
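
A rough sketch of the one-row-at-a-time idea, in Python with SQLite just to
keep it self-contained (the same SQL should carry over to PHP with PDO).
The table name "jobs", the "worker" column, and the function names are made
up for illustration; the status values are the ones from Anthony's message.
The point is the conditional UPDATE: a worker only owns a row if its UPDATE
actually changed it.

```python
import sqlite3


def claim_one(conn, worker_id):
    """Claim a single pending row; return its id, or None if none are left."""
    while True:
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The WHERE clause re-checks the status, so if another worker
        # claimed this row in the meantime, zero rows are updated.
        cur = conn.execute(
            "UPDATE jobs SET status = 'processing', worker = ? "
            "WHERE id = ? AND status = 'pending'",
            (worker_id, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
        # Lost the race for this row; loop and try the next pending one.


def demo():
    """Create three pending jobs and work through them one at a time."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT, worker TEXT)"
    )
    conn.executemany("INSERT INTO jobs (status) VALUES (?)", [("pending",)] * 3)
    conn.commit()
    done = []
    while True:
        job_id = claim_one(conn, "worker-1")
        if job_id is None:
            break
        # The real per-record processing would happen here.
        conn.execute("UPDATE jobs SET status = 'complete' WHERE id = ?", (job_id,))
        conn.commit()
        done.append(job_id)
    return done
```

Because each worker claims exactly one row before touching it, the next
scheduled pass never sees that row as 'pending', and you can start as many
workers as the machine can handle.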

Secondly, there are many queueing services already written; see
http://en.wikipedia.org/wiki/Message_queue

As an oddball idea, instead of using a database store: to add an item to the
queue, JSON-encode it and email it to a special address on the processing
server(s).  Then configure your mail server to trigger a PHP script for
each email received and process the item from there.  Your mail server
then gives you queueing for free.
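
To make the email idea a bit more concrete, here is a minimal sketch of the
two ends, again in Python only for illustration.  The addresses, the subject
line, and the function names are placeholders I made up, and actually sending
the message and wiring the mail server to the handler are left out.  It only
shows a job going into a message body as JSON and coming back out.

```python
import json
from email import message_from_string
from email.message import EmailMessage


def job_to_email(job, to_addr):
    """Producer side: wrap a job (a plain dict) in an email as a JSON body."""
    msg = EmailMessage()
    msg["From"] = "queue@example.com"  # placeholder sender address
    msg["To"] = to_addr
    msg["Subject"] = "job"
    msg.set_content(json.dumps(job))
    return msg.as_string()


def email_to_job(raw_message):
    """Consumer side: parse one raw email and decode its JSON body."""
    msg = message_from_string(raw_message)
    # decode=True undoes any transfer encoding applied to the body.
    body = msg.get_payload(decode=True).decode("utf-8")
    return json.loads(body)
```

In a real setup the handler would most likely read the raw message from
standard input, since that is how mail servers usually hand a message to a
piped script, and then call something like email_to_job() on it.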

On Sun, Feb 14, 2010 at 8:49 PM, Anthony Papillion <papillion at gmail.com> wrote:

>  Hello Everyone,
>
> I'm designing a system that will work on a schedule. Users will submit data
> for processing into the database and then, every minute, a PHP script will
> pass through the db looking for unprocessed rows (marked pending) and
> process them.
>
> The problem is, I may eventually have a few million records to process at a
> time. Each record could take anywhere from a few seconds to a few minutes to
> perform the required operations on. My concern is making sure that the
> script, on the next scheduled pass, doesn't grab the records currently being
> processed and start processing them again.
>
> Right now, I'm thinking of accomplishing this by updating a 'status' field
> in the database. So unprocessed records would have a status of 'pending',
> records being processed would have a status of 'processing', and completely
> processed records would have a status of 'complete'.
>
> For some reason, I see this as ugly but that's the only way I can think of
> making sure that records aren't processed twice. So when I select
> records to process, I'm ONLY selecting ones with the status of 'pending',
> which means they are new and unprocessed.
>
> Is there a better, more elegant way of doing this or is this pretty much
> it?
>
> Thanks!
> Anthony Papillion



-- 
----
Hudson Valley Sudbury School
What GPL is for application users
Our school is for students
Help your children grow, change, and learn
Let your child direct, control, amend
Check out http://www.sudburyschool.org