[nycphp-talk] Related to "How to create custom URLs"

inforequest 1j0lkq002 at sneakemail.com
Mon Mar 19 15:49:50 EDT 2007


tedd tedd-at-sperling.com |nyphp dev/internal group use| wrote:

> Hi gang:
>
> I got the idea of how to change things so that if the user types in a 
> url to a part of your site, then you can redirect the user to where 
> you want.
>
> For example, long ago I had my site setup with folders and indexes. 
> For example, one folder was named "contact" with my contact page 
> residing inside named as index.php.
>
> For SEO considerations I changed everything to root level. For 
> example, the "contact folder index page" combination was changed to 
> just contact.php.
>
> However, that left legacy link problems where people/SE's/etc had 
> previously linked to my site using the old links.
>
> To solve that problem, I simply used htaccess like so:
>
> RewriteEngine on
> RewriteRule ^contact$ contact.php
>
> That meant that anyone who had a link to my previous contact, would be 
> redirected to my new contact.php. Everything is cool.
>
> However, I liked the way the url looked before. I liked:
>
> [1] http://sperling.com/contact
>
> instead of the new:
>
> [2] http://sperling.com/contact.php
>
> How can I get the url displayed to look like number [1] regardless if 
> the user adds ".php" or not?
>
> In other words, I know how to redirect the user to where I want, I 
> just don't know how to change the url the user sees -- how do you do 
> that?
>
> Thanks,
>
> tedd


Tedd,

You say you are motivated by "SEO" concerns. SEO is a dynamic field, 
with "rules" changing frequently, but more importantly the rules are 
applied differently in different situations. Certain markets (certain 
search keywords and phrases) are handled by search engines differently 
than others. So when one thing seems "better for SEO" it might only be 
better for that particular situation... not in general. Often the rules 
follow competitiveness, with more competitive search markets getting the 
most restrictions.

To my knowledge and in my experience it is not better to have /contact 
than to have /contact/index.php. Google considers those two forms of URL 
the same and has published that their algo is careful with /file and 
/file.ext and /file/index.ext, so that they can be sure they have the 
right "canonicalization" or root paths to the resource. They have a 
strong desire to eliminate duplicate routes to the same data in the 
search index because they don't want to index /contact as a resource, 
and also an identical /contact/index.php as a second resource showing 
the same content.

And of course, thinking competitively, that is precisely why I think 
it's best to have /contact/index.html. By presenting a clean, 
no-monkey-business URL structure, you may avoid the extra scrutiny (and 
all possible consequences of caution that may ensue if flags are 
triggered). Of course I would use important and relevant keywords in 
the URL path, and usually I use a Front Controller and/or mod rewrite 
solution to actually deliver the URLs and web pages. But I keep the URLs 
as "standard" as possible.

It is commonly observed that if you leave duplicate paths to the same 
resources, and Google doesn't properly detect it, you may suffer 
filtering such that one path gets included and others get filtered out. 
You may not exercise clean control over which one they keep, so if they 
keep "A" and you have tons of quality incoming back links to "B" which 
Google chose to filter out, you suffer. Certain very competitive SEOs 
may notice that you have left a duplicate path to a popular resource in 
place on your site, and link to it from a high page rank page in hope of 
getting the "other way in" indexed and ranked ahead of the more popular 
route. I'm not saying it works, but if you also have a mistake that 
causes a blip in the indexing and inclusion of your site, it can work as 
a competitive tactic to get your resource removed or demoted.

This idea of maintaining unique routes within your site is known as 
"strict URLs" -- no two URLs go to the same exact content. When you use 
a front controller, it becomes important (difficult?) to ensure strict 
URLs because of the default behavior of Apache and all the possible 
incoming URLs.  So if you want to deploy a PHP front controller for 
example, then plan your URL structure so that you properly redirect or 
404 all routes except the one you intend to use as the strict URL to the 
resource.
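
Applied to the original question, a minimal .htaccess sketch of one 
strict URL could look like the following. It assumes /contact is the one 
route you decide to keep (any other form could be chosen instead), that 
contact.php sits in the document root, and that mod_rewrite is enabled. 
Treat it as a starting point, not a tested configuration for any 
particular server.

```apache
RewriteEngine On

# Send a 301 only when the browser itself asked for /contact.php.
# THE_REQUEST holds the original request line, so the internal rewrite
# further down does not re-trigger this rule and cause a redirect loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/contact\.php[?\s]
RewriteRule ^contact\.php$ /contact [R=301,L]

# Serve the canonical /contact from contact.php internally;
# the URL shown in the browser stays /contact.
RewriteRule ^contact$ contact.php [L]
```

With rules like these the old /contact.php links still work, but they 
all end up at the single /contact URL.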

If you have a controller called "contact" and a method called 
"contact_us" then your URL is /contact/contact_us/ but you need to be 
sure that you also trap /contact/contact_us and 301 redirect it to 
/contact/contact_us/ if Apache doesn't do it (and Apache won't do it if 
there is no actual folder in place on the web server). You also need to 
make sure your controller is "strict" so it doesn't accept 
/contact/whatever without throwing a 404 error (which Apache wouldn't 
have done, because it passed everything to the front controller as a 
valid resource locator).
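
A rough .htaccess sketch of both points follows. It assumes the front 
controller is a file named index.php (a hypothetical name), that no real 
/contact folder exists on disk, and that mod_rewrite is enabled. Note 
that it issues the 404 from Apache, which is a simplification; a strict 
controller would make the same decision in PHP.

```apache
RewriteEngine On

# 301 the slashless form to the canonical trailing-slash URL.
RewriteRule ^contact/contact_us$ /contact/contact_us/ [R=301,L]

# Hand only the canonical URL to the front controller
# (index.php is an assumed file name).
RewriteRule ^contact/contact_us/$ index.php [L]

# Every other path under /contact/ is not a valid route: answer 404
# instead of passing it to the controller.
RewriteRule ^contact/ - [R=404,L]
```

Each additional controller/method pair would need its own pair of rules, 
which is one reason to do the strict check inside the controller once 
the number of routes grows.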

With frameworks, I think it's the same old rules applied to a new, more 
complex yet elegant solution for web app development. Before it was 
really hard to manage everything. Now we have some great tools (like 
frameworks) for managing everything, but out of the box they seem to 
default to the same old behaviors and so, need optimizing.

Of course your situation may be different than my experience.

-=john andrews
http://www.johnon.com/php-seo/





