[nycphp-talk] Regex for P Elements

Dan Cech dcech at phpwerx.net
Wed Jan 12 09:20:51 EST 2011


On 1/12/2011 9:00 AM, Donald J. Organ IV wrote:
> $blockpattern='/<p*[^>]>.*?/m';
>
> Notice the m after the last /   this says it can span multiple
> lines....

Good call, I missed the multiple-line thing.  In this situation, though,
you'd actually want /s, like:

$blockpattern='/<p[^>]*>.*?<\/p>/s';

From the manual
[http://php.net/manual/en/reference.pcre.pattern.modifiers.php]:

> s (PCRE_DOTALL) If this modifier is set, a dot metacharacter in the
> pattern matches all characters, including newlines. Without it,
> newlines are excluded. This modifier is equivalent to Perl's /s
> modifier. A negative class such as [^a] always matches a newline
> character, independent of the setting of this modifier.
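
To see the difference, here is a minimal sketch that runs the same
pattern with /s and with /m. The $html sample string and the variable
names are made up for illustration; only preg_match_all() and the two
modifiers come from PHP itself.

```php
<?php
// Hypothetical sample input: the second paragraph spans two lines.
$html = "<p>one</p>\n<p class=\"x\">two\nthree</p>";

$dotall    = '/<p[^>]*>.*?<\/p>/s'; // dot also matches newlines
$multiline = '/<p[^>]*>.*?<\/p>/m'; // only affects ^ and $, which this pattern lacks

preg_match_all($dotall, $html, $withS);
preg_match_all($multiline, $html, $withM);

echo count($withS[0]), "\n"; // 2: both paragraphs are found
echo count($withM[0]), "\n"; // 1: the two-line paragraph is missed
```

One caveat with the pattern as written: <p[^>]*> will also match the
opening of tags such as <pre>. Writing <p\b[^>]*> is one way to tighten
that up, if it matters for your input.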

The /m modifier is used to control how the ^ and $ characters match:

> m (PCRE_MULTILINE) By default, PCRE treats the subject string as
> consisting of a single "line" of characters (even if it actually
> contains several newlines). The "start of line" metacharacter (^)
> matches only at the start of the string, while the "end of line"
> metacharacter ($) matches only at the end of the string, or before a
> terminating newline (unless D modifier is set). This is the same as
> Perl. When this modifier is set, the "start of line" and "end of
> line" constructs match immediately following or immediately before
> any newline in the subject string, respectively, as well as at the
> very start and end. This is equivalent to Perl's /m modifier. If
> there are no "\n" characters in a subject string, or no occurrences
> of ^ or $ in a pattern, setting this modifier has no effect.

So /m isn't really applicable in this case, since the pattern contains
no ^ or $ for it to act on.
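
For comparison, a small sketch of where /m does change the result. The
$text sample and the variable names are assumptions for illustration;
the pattern here is anchored with ^, which is what /m acts on.

```php
<?php
// Hypothetical sample input with two lines.
$text = "first line\nsecond line";

// Without /m, ^ matches only at the very start of the string.
preg_match_all('/^\w+/', $text, $plain);

// With /m, ^ also matches immediately after each newline.
preg_match_all('/^\w+/m', $text, $multi);

print_r($plain[0]); // contains only "first"
print_r($multi[0]); // contains "first" and "second"
```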

Dan


