[nycphp-talk] Best way to accomplish this task

Anthony Papillion papillion at gmail.com
Sun Feb 14 21:00:51 EST 2010


Andy,

Thanks for the input. I'm going to look into that too.
Sounds reasonable and takes care of a few problems that popped up in my 
head.

Thanks,
Anthony

----- Original Message ----- 
From: "Andy Dirnberger" <dirn at dirnonline.com>
To: "NYPHP Talk" <talk at lists.nyphp.org>
Sent: Sunday, February 14, 2010 7:56 PM
Subject: Re: [nycphp-talk] Best way to accomplish this task


> On Sun, Feb 14, 2010 at 8:49 PM, Anthony Papillion <papillion at gmail.com> 
> wrote:
>>
>> Hello Everyone,
>>
>> I'm designing a system that will work on a schedule. Users will submit 
>> data for processing into the database and then, every minute, a PHP 
>> script will pass through the db looking for unprocessed rows (marked 
>> pending) and process them.
>>
>> The problem is, I may eventually have a few million records to process at 
>> a time. Each record could take anywhere from a few seconds to a few 
>> minutes to perform the required operations on. My concern is making sure 
>> that the script, on the next scheduled pass, doesn't grab the records 
>> currently being processed and start processing them again.
>>
>> Right now, I'm thinking of accomplishing this by updating a 'status' 
>> field in the database. So unprocessed records would have a status of 
>> 'pending', records being processed would have a status of 'processing' 
>> and completely processed records would have a status of 'complete'.
>>
>> For some reason, I see this as ugly but that's the only way I can think 
>> of making sure that records aren't processed twice. So when I select 
>> records to process, I'm ONLY selecting ones with the status of 'pending', 
>> which means they are new and unprocessed.
>>
>> Is there a better, more elegant way of doing this or is this pretty much 
>> it?
>>
>> Thanks!
>> Anthony Papillion
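
The status-field idea in the message above can be sketched as a guarded
UPDATE: a row is claimed only if it is still 'pending', so a second pass that
reaches the same row gets nothing. This is a minimal illustration, not code
from the thread. The table name jobs, its columns, and the helper names are
all hypothetical, and Python with an in-memory SQLite database is used only
to keep the sketch self-contained; the same SQL could be issued from PHP.

```python
import sqlite3


def make_status_db():
    """Build a throwaway in-memory table (hypothetical schema) with three pending rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, payload TEXT, "
        "status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.executemany("INSERT INTO jobs (payload) VALUES (?)", [("a",), ("b",), ("c",)])
    conn.commit()
    return conn


def try_claim(conn, job_id):
    """Move one row from 'pending' to 'processing'.

    Returns True only if this call changed the row, i.e. nobody claimed it first.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE jobs SET status = 'processing' "
            "WHERE id = ? AND status = 'pending'",
            (job_id,),
        )
    return cur.rowcount == 1


def mark_complete(conn, job_id):
    """Move a claimed row from 'processing' to 'complete'."""
    with conn:
        conn.execute(
            "UPDATE jobs SET status = 'complete' "
            "WHERE id = ? AND status = 'processing'",
            (job_id,),
        )
```

A worker would call try_claim() for a row and skip it when the result is
False. The check and the change happen in one statement, so there is no gap
between reading the status and updating it.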
>
> I would generate a unique value for each pass (whether that's an auto
> increment field in a table, a UUID, or something else is up to you). I
> would add a field to your table(s), just like you were going to do
> with status, to store this value.
>
> When processing, do something along the lines of UPDATE table SET
> new_field = unique_value WHERE new_field IS NULL (make sure new_field
> is indexed). Then you process any records with a matching value in the
> new field. This also has the benefit that, should a pass fail, you can
> identify which records were part of that pass.
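
The per-pass token approach described above might look roughly like the
following. It is a sketch under assumptions rather than anyone's actual
implementation: the table jobs, the column batch_id (standing in for
new_field), and the function names are hypothetical, a UUID is picked as the
unique value, and Python with in-memory SQLite is used only so the example
runs on its own. The UPDATE and SELECT statements carry over to PHP unchanged.

```python
import sqlite3
import uuid


def make_demo_db():
    """Build a throwaway in-memory table (hypothetical schema) with three unclaimed rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT, batch_id TEXT)"
    )
    # The claim column is indexed, as the message recommends.
    conn.execute("CREATE INDEX idx_jobs_batch ON jobs (batch_id)")
    conn.executemany("INSERT INTO jobs (payload) VALUES (?)", [("a",), ("b",), ("c",)])
    conn.commit()
    return conn


def claim_batch(conn):
    """Stamp every unclaimed row with a fresh token and return (token, rows).

    Rows already stamped by an earlier pass no longer match
    'batch_id IS NULL', so they are never picked up a second time.
    """
    token = uuid.uuid4().hex  # unique value for this pass
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE jobs SET batch_id = ? WHERE batch_id IS NULL", (token,)
        )
    rows = conn.execute(
        "SELECT id, payload FROM jobs WHERE batch_id = ? ORDER BY id", (token,)
    ).fetchall()
    return token, rows
```

Each scheduled pass calls claim_batch() once and processes only the rows it
gets back. If a pass dies partway through, its token still marks exactly
which rows belonged to it.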



