[nycphp-talk] A good PCRE expression for matching URLs

Michael B Allen ioplex at gmail.com
Fri Jul 25 02:27:20 EDT 2008


On Thu, Jul 24, 2008 at 7:50 PM, Michael B Allen <ioplex at gmail.com> wrote:
> But it would be nice to exclude end-of-sentence punctuation from the
> capture output. I tried the following minimalistic expression just to
> get the trailing condition right, but I'm not able to distinguish
> between a dot that is part of the URL and a period at the end.

Got it. I just needed to negate the end-of-sentence punctuation character
class, so that the match has to end on a character that is not sentence
punctuation.

This seems to be handling all cases properly:

  $expr = '([a-zA-Z0-9]{1,10}://[a-zA-Z0-9.-]+[\p{L}0-9"!#$%&\\()+,\\./:;=?\\@\\\\^_{}~-]*)[^,\\.?!:;"\'\\s]';
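
For anyone who wants to try it, here is a minimal sketch of one way to run
the expression. The '|' delimiter, the 'u' modifier and the extract_urls()
helper are assumptions made for the example and are not part of the
expression above. It reads the full match rather than group 1, because the
trailing character class sits outside the parentheses and group 1 therefore
stops one character short of the URL.

```php
<?php
// The expression from this message, unchanged.
$expr = '([a-zA-Z0-9]{1,10}://[a-zA-Z0-9.-]+[\p{L}0-9"!#$%&\\()+,\\./:;=?\\@\\\\^_{}~-]*)[^,\\.?!:;"\'\\s]';

// Hypothetical helper: returns every URL found in $text.
function extract_urls($expr, $text)
{
    // Assumption: '|' is used as the delimiter because it does not occur
    // in $expr, and 'u' is added so that \p{L} is applied to UTF-8 input.
    $count = preg_match_all('|' . $expr . '|u', $text, $matches);
    if ($count === false) {
        return array();
    }
    // $matches[0] holds the full matches. $matches[1] (the capture group)
    // would be missing the last character of each URL.
    return $matches[0];
}

$text = 'See http://www.ioplex.com/. Is http://example.com/a?b=1&c=2? Yes.';
print_r(extract_urls($expr, $text));
```

With that input the period after the first URL and the question mark after
the second should both be left out of the results. Note that the last class
accepts any character that is not listed in it, so a character such as '<'
directly after a URL would still be picked up.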

Thanks,
Mike

-- 
Michael B Allen
PHP Active Directory SPNEGO SSO
http://www.ioplex.com/


