[nycphp-talk] OT: Apache access_log integrity

pswebcode, nyc psaw at
Thu Oct 2 05:30:26 EDT 2003

Have you looked at Webalizer? It is an app with many startup config options
that let you remove or massage info to generate focused reports and charts from
Apache logs. You could also use cron to grab, analyze and archive your logs.
Then maybe you could do some set-it-and-forget-it type of processing. PSaw
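
As a rough sketch of the "archive" step that such a cron job might run,
assuming a plain gzip copy into a dated file is enough (archive_log and the
directory layout are made-up names for illustration, not anything Webalizer
or Apache provides):

```shell
#!/bin/sh
# archive_log SRC DEST_DIR
# Hypothetical helper: writes a gzip-compressed, date-stamped copy of SRC
# into DEST_DIR and prints the path of the copy. SRC itself is not modified.
archive_log() {
    src=$1
    dest_dir=$2
    stamp=$(date +%Y%m%d)                      # e.g. 20031002
    out="$dest_dir/$(basename "$src").$stamp.gz"

    mkdir -p "$dest_dir" || return 1

    # Compress to the dated file; drop the partial file if gzip fails.
    gzip -c "$src" > "$out" || { rm -f "$out"; return 1; }

    printf '%s\n' "$out"
}
```

A nightly cron entry could call a script that runs the report step first and
then something like archive_log /var/log/apache/access_log /var/log/apache/archive
(the archive directory is an assumption).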

-----Original Message-----
From: talk-bounces at [mailto:talk-bounces at] On
Behalf Of D C Krook
Sent: Thursday, October 02, 2003 12:59 AM
To: talk at
Subject: [nycphp-talk] OT: Apache access_log integrity


I'm trying to trim the fat from Apache's access_log: removing my own IP 
addresses from the log; stripping referer spam; bots; etc.

While I understand that I can exclude known IP addresses and other common 
patterns via mod_setenvif, I'd like to be able to do this on an ad hoc basis 
when I notice certain spikes in useless records in the log and/or when my IP 
changes when hitting my own site from various wireless points.

My first idea was to grep the logs by using a shell or Perl script that I 
could add to my daily cron or call arbitrarily like so:

grep -v "" /var/log/apache/access_log > 
mv /var/log/apache/access_log.tmp /var/log/apache/access_log

Of course, this has the drawback of restarting Apache every time the 
access_log is changed by the script, but the second or two of down time is 
acceptable if it means logs that can be regularly analyzed for useful 
reports that don't have "http://jeff-knights-online-viagra-megastore" as my 
top referer.
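
A minimal sketch of the ad hoc filtering idea, assuming the unwanted IPs and
referer strings are kept one per line in a separate patterns file and that the
script works on a copy of the log rather than the live file (filter_log and
its arguments are hypothetical names, not part of Apache or any package):

```shell
#!/bin/sh
# filter_log PATTERNS INPUT OUTPUT
# Hypothetical helper: copies INPUT to OUTPUT, leaving out every line that
# contains any of the fixed strings listed (one per line) in PATTERNS.
# INPUT is never modified.
filter_log() {
    patterns=$1
    input=$2
    output=$3
    status=0

    # With no patterns there is nothing to strip, so just copy the log.
    if [ ! -s "$patterns" ]; then
        cp "$input" "$output"
        return
    fi

    # -F treats each pattern as a literal string, so the dots in an IP are
    # not regex wildcards. Matching is by substring: "192.0.2.1" would also
    # drop "192.0.2.10", and a blank line in PATTERNS would drop every record.
    grep -v -F -f "$patterns" "$input" > "$output" || status=$?

    # grep exits 1 when no line survives the filter; only >1 is a real error.
    if [ "$status" -gt 1 ]; then
        rm -f "$output"
        return "$status"
    fi
    return 0
}
```

Run against a rotated or copied log, this leaves the file Apache is writing
to untouched, so the cleaned output can be fed to a report tool without a
restart; appending a new IP or spam domain to the patterns file is the only
ad hoc step.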

I'd like to know if anyone else has addressed this problem in a successful 
way or has any best practice, either via mod_setenvif, Perl, CLI PHP, cron 

I've Googled the following topics without any good results: "strip line from
access_log perl" "clean access_log" "setEnvIf" "eliminate referer spam
access_log" "remove line from log"

Thanks in advance for any tips.

