[nycphp-talk] Many pages: one script SEO

inforequest 1j0lkq002 at sneakemail.com
Sun Aug 5 16:45:50 EDT 2007


Not everyone cares about search engines indexing unique URLs, but if you 
do, you have to consider how the server response codes are generated for 
various URLs. Aliasing content under different URLs is akin to asking 
the search engines not to index it properly, and/or not to rank it 
highly for relevant searches.

If you are using Apache, Michael Sims has it correct. Apache makes the
decisions about URL dispatching, and since mod_rewrite is Apache's
extension for controlling how URLs are dispatched, it is the logical
tool for the job. As far as search engines go, you always need to be
aware of (and work around) how the default dispatching is handled
(trailing slashes, file not found, etc).
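
As a rough illustration of the kind of mod_rewrite setup I mean, here is
a minimal sketch. It assumes mod_rewrite is enabled, that the rules live
in an .htaccess file in the document root, and that /index.php is your
dispatcher script (that name is only a placeholder). Which trailing-slash
form you treat as canonical is your call; this sketch picks "no slash".

```apache
# Sketch only: per-directory (.htaccess) context in the document root.
RewriteEngine On

# Real files and directories (images, CSS, etc.) are served untouched.
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]

# Choose one URL form and 301 the other to it, so the same content
# is not indexed under both /page and /page/.
RewriteRule ^(.+)/$ /$1 [R=301,L]

# Everything else goes to the (hypothetical) dispatcher script. The
# requested path is still visible to PHP in $_SERVER['REQUEST_URI'].
RewriteRule ^ /index.php [L]
```

Note that once every URL lands on one script, Apache will answer 200
for anything that script does not explicitly reject, so the dispatcher
has to send the 404 itself for URLs it does not recognize.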

If you want cross-platform (server) scripts written in PHP, I'm not sure
you can ever break free of the web server completely and still stay in
compliance with search engine indexing best practices. Of course, once
you are at that level of analysis, I think some of the PHP gurus can
highlight other problems you'll encounter that are more difficult than
how to properly manage the URL dispatch.

I think the Zend Framework is one case where they really would like to
avoid needing Apache's mod_rewrite. As it has evolved, they have added
several rewrite routers to manage the same infrastructure issues that
mod_rewrite is typically used for. The only one I like (for SEO
purposes) is the full-blown regex rewrite router, which is certainly
no "simpler" than mod_rewrite.

-=john andrews

Hans Zaunere lists-at-zaunere.com |nyphp dev/internal group use| wrote:

>Elliotte Harold wrote on Sunday, August 05, 2007 1:43 PM:
>  
>
>>I'm considering a simple site that I may design in PHP. PHP is
>>probably the simplest solution except for one thing: it carries a
>>very strong coupling between pages and scripts. As far as I've ever
>>been able to tell PHP really, really, really wants there to be a
>>single primary .php file for each URL that does not contain a query
>>string (though that file may of course invoke others).
>>    
>>
>
>PHP doesn't actually care, but...
>
>  
>
>>For the system I'm designing that simply won't work. In Java servlet
>>environments it's relatively trivial to map one servlet to an entire
>>directory structure, so that it handles all requests for all pages
>>within that hierarchy.
>>    
>>
>
>It has to do with the way PHP reaches into the request processing stack in
>Apache (assuming Apache).  Basically PHP doesn't reach as far up the request
>stack as other things do, like mod_perl, Java, etc, which of course could be
>argued as a good or bad thing.
>
>  
>
>>Is there any *reasonable* way to do this in PHP? The only way I've
>>ever seen is what WordPress does: use mod_rewrite to redirect all
>>requests within the hierarchy to a custom dispatcher script that
>>converts actual hierarchy components into query string variables. I
>>am impressed by this hack, but it's way too kludgy for me to be
>>comfortable with. For one thing, I don't want to depend on
>>mod_rewrite if I don't have to.
>>    
>>
>
>A lot of people use mod_rewrite, but I never was a big fan either.  However,
>you can implement this "fuse-box" style processing quite elegantly in pure
>Apache.  There are a number of ways, most of which are covered in these
>results:
>
>http://www.google.com/search?q=mediawiki+url+rewrite
>
>There are other options as well, including the ErrorDocument hack and
>playing with ForceType, but I'm not much of a fan of those either.  I find
>the following to be the most elegant:
>
>    Alias /support/ "/var/www/www.something.com/support/"
>    Alias /Test/ "/var/www/www.something.com/Test/"
>    AliasMatch /(.*) "/var/www/www.something.com/index.php"
>
>/support becomes a nice place to throw static stuff, like images, CSS, etc.
>
>/Test can be used as a test bed, to test PHP scripts outside of the
>fuse-box.
>
>And then index.php is where the action is, getting called on every request.
>Of course, the above can be adjusted as needed to have an unlimited number
>of fuse-boxes at different URLs, etc.  Combined with things like Apache's
>AddType, the possibilities are endless.  I think you can even use AddType
>for a directory (or maybe it's ForceType).
>
>I actually leave index.php empty, and use auto_prepend_file to call a PHP
>file that handles the heavy lifting.  This typically allows for better
>delegation of responsibility and keeps PHP code outside of the
>DocumentRoot.
>
>And all of the above can be combined with various combinations of
><Directory> and <Location> directives in Apache, making it really flexible
>and dizzying.
>
>But, I generally keep it simple and use something like the above, and then
>have a request processor in PHP do the URL mapping in a style akin to the
>Java world.  Straightforward, none of the PATH_INFO confusions, and
>setup-and-forget.
>
>  
>
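
For reference, the auto_prepend_file arrangement Hans describes above
might look roughly like the following when added alongside his Alias
lines. This is only a sketch: it assumes PHP runs as an Apache module
(so php_value is available), and the request_processor.php path is a
made-up placeholder, not something from his setup.

```apache
# Sketch only: assumes mod_php; request_processor.php is hypothetical.
<Directory "/var/www/www.something.com">
    # Run the real request processor, kept outside the DocumentRoot,
    # before the (empty) index.php on every request.
    php_value auto_prepend_file "/var/www/lib/request_processor.php"
</Directory>

<Directory "/var/www/www.something.com/Test">
    # Keep test scripts outside the fuse-box: the special value
    # "none" disables auto-prepending for this directory.
    php_value auto_prepend_file none
</Directory>
```

The same SEO caveat applies here as with mod_rewrite: because the
AliasMatch sends every URL to index.php, the request processor is
responsible for returning the right status code for each one.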


-- 
-------------------------------------------------------------
Your web server traffic log file is the most important source of web business information available. Do you know where your logs are right now? Do you know who else has access to your log files? When they were last archived? Where those archives are? --John Andrews Competitive Webmaster and SEO Blogging at http://www.johnon.com



