[nycphp-talk] PCRE expression for tokenizing?

Michael B Allen ioplex at gmail.com
Mon Jul 21 19:53:39 EDT 2008


On Mon, Jul 21, 2008 at 6:45 PM, Dan Cech <dcech at phpwerx.net> wrote:
> Michael B Allen wrote:
>>
>> On Mon, Jul 21, 2008 at 6:08 PM, Dan Cech <dcech at phpwerx.net> wrote:
>>>
>>> Michael B Allen wrote:
>>
>> So is there any way to say "capture anything that didn't match" (aside
>> from creating a sub-expression that explicitly excludes all of the
>> tokens)?
>
> AFAIK no, but you could probably do something like:
>
> preg_match('@^(.*?)(~|\*\*|//|=====|====|===|==|=|$)@',$string,$m);
>
> Which would give you anything before the first token (or end if there are no
> more tokens) in $m[1] and the first token (or nothing if there are no more
> tokens) in $m[2].

Actually, matching two tokens at a time is what I started with and
probably what I'll end up using. Except I don't try to match the
string. I match only tokens, capture each one separately, and then use
the offset of the captured value to determine which token was matched.
That way I get a numeric value representing which token matched
(otherwise I'd have to map strings to token values after the
preg_match). Then I use the offset of the matched token to determine
whether I also got a string and, if so, save the token for the next
iteration. It's probably more efficient to process two tokens at a
time anyway.
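
Roughly, and only as a sketch of the idea rather than my actual code:
each token gets its own capture group, PREG_OFFSET_CAPTURE reports an
offset of -1 for groups that did not take part in the match, and the
offset of the whole match tells me whether plain text precedes the
token. The names T_TEXT, TOKEN_RE and tokenize() are made up for this
example, and the token IDs are simply the capture group numbers. For
brevity it collects the text and the token in one pass instead of
saving the token for the next iteration.

```php
<?php
// Hypothetical ID for plain text; token IDs are the capture group numbers.
const T_TEXT = 0;

// One capture group per token. Longer alternatives come first so that
// "=====" is not matched as a shorter run of "=".
const TOKEN_RE = '@(~)|(\*\*)|(//)|(=====)|(====)|(===)|(==)|(=)@';

function tokenize(string $s): array
{
    $out = [];
    $pos = 0;
    $len = strlen($s);

    while ($pos < $len) {
        // Search for the next token starting at byte offset $pos.
        if (!preg_match(TOKEN_RE, $s, $m, PREG_OFFSET_CAPTURE, $pos)) {
            // No more tokens: everything that is left is plain text.
            $out[] = [T_TEXT, substr($s, $pos)];
            break;
        }

        [$tok, $start] = $m[0];

        // A match that starts after $pos means text sits in between.
        if ($start > $pos) {
            $out[] = [T_TEXT, substr($s, $pos, $start - $pos)];
        }

        // The group with a non-negative offset is the one that matched.
        $id = 0;
        for ($i = 1; $i < count($m); $i++) {
            if ($m[$i][1] >= 0) {
                $id = $i;
                break;
            }
        }

        $out[] = [$id, $tok];
        $pos = $start + strlen($tok);
    }

    return $out;
}
```

Calling tokenize('a**b==c') yields text "a", group 2 ("**"), text "b",
group 7 ("=="), then text "c".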

Thanks,
Mike

-- 
Michael B Allen
PHP Active Directory SPNEGO SSO
http://www.ioplex.com/


