NYCPHP Meetup

NYPHP.org

[nycphp-talk] importing 650,000 records

Kenneth Downs ken at secdat.com
Sat Dec 31 13:05:44 EST 2005


three things.

First, As a general rule, never post code with the password, even if it is
a throwaway.

Second, if the suggestions to directly import w/mysql work out, that is
probably the best.

Third, if you cannot import w/mySQL tools, a better processing routine
would look like this:

$F=fopen("filename","r");
$kount=0;
while (!feof($F)) {
  $kount++;
  if ($kount % 100 == 0) {
     echo "On line $kount \n";
  }

  $line=fgets(F);  // beware! contains trailing newline char
  ... all processing here ...

}
fclose($F);



> Hey everyone.
>
> I have a client who has a comma delimited file each line being a new
> record.
> currently i am trying to import this data into their new system.  The file
> is 40mb in size and contains 650,000 records.
>
> I am using file() and i am sure this is not the best method as the import
> process is going dead slow.  What i am doing is this
>
>
> <?php
> ini_set('max_execution_time', 99200);
> $link = mysql_connect('localhost', 'root', '8rittany');
> $db = mysql_select_db('12all_db');
> $lines = file('import.txt');
> $count = 0;
> $skipped = 0;
> foreach ($lines as $l) {
>     $line = explode(',', $l);
>     if($line[0] == 'Email') continue;
>     $sql = "SELECT id FROM 12all_listmembers WHERE
> email='".trim($line[0])."'";
>     $res = mysql_query($sql);
>     if(mysql_num_rows($res) == 0) {
>         $count++;
>         $dt = explode(' ', $line[4]);
>         $sql = "INSERT INTO 12all_listmembers (sdate, email, name, nl,
> stime) VALUES('".trim(addslashes($dt[0]))."',
> '".trim(addslashes($line[0]))."', '".trim(addslashes($line[1]))."
> ".trim(addslashes($line[2]))."', 2, '".trim(addslashes($dt[1]))."')";
>         $res = mysql_query($sql) or die(mysql_error());
>         $id = mysql_insert_id($link);
>         $query[] = "INSERT INTO 12all_fieldsd (fid, eid, val, nl)
> VALUES(1,
> ".$id.", '".trim(addslashes($line[3]))."', 2)";
>         $query[] = "INSERT INTO 12all_fieldsd (fid, eid, val, nl)
> VALUES(2,
> ".$id.", '".trim(addslashes($line[4]))."', 2)";
>         $query[] = "INSERT INTO 12all_fieldsd (fid, eid, val, nl)
> VALUES(3,
> ".$id.", '".trim(addslashes($line[1]))."', 2)";
>         $query[] = "INSERT INTO 12all_fieldsd (fid, eid, val, nl)
> VALUES(4,
> ".$id.", '".trim(addslashes($line[2]))."', 2)";
>         foreach($query as $q) {
>             $res = mysql_query($q) or die(mysql_error());
>         }
>     } else $skipped++;
> }
> echo 'imported '.$count.' records and skipped '.$skipped.' duplicates.';
> ?>
>
> Any suggestions on how to speed things up?
>
> --
> Joseph Crawford Jr.
> Zend Certified Engineer
> Codebowl Solutions, Inc.
> 1-802-671-2021
> codebowl at gmail.com
> _______________________________________________
> New York PHP Talk Mailing List
> AMP Technology
> Supporting Apache, MySQL and PHP
> http://lists.nyphp.org/mailman/listinfo/talk
> http://www.nyphp.org


-- 
Kenneth Downs
Secure Data Software
631-379-0010
ken at secdat.com
PO Box 708
East Setauket, NY 11733




More information about the talk mailing list