Modifiers in regular expression patterns

We all like regex patterns because they are a great way to select, filter or replace strings, numbers or complete code blocks. But how about PCRE modifiers: do you use them very often? A few weeks ago I needed one because I ran into trouble with line breaks in a string I had to select from a transaction. In this article I will show some examples of the “most important” PCRE modifiers.

Note that each modifier applies to the whole pattern!

Upper and lowercase letters

Using the “i” modifier makes the pattern match both upper and lowercase letters. Use '/[a-z]*/i' to match strings that contain letters in either case.
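A minimal sketch of the difference the modifier makes. The anchors and the “+” quantifier are my additions to the pattern above, because '[a-z]*' on its own also matches an empty string and would hide the effect; the variable names are only for illustration.

```php
<?php
// Without "i" the character class only matches lowercase letters,
// so the uppercase "H" makes the test fail.
$caseSensitive = preg_match('/^[a-z]+$/', 'Hello');

// With "i" the same class also matches uppercase letters.
$caseInsensitive = preg_match('/^[a-z]+$/i', 'Hello');

var_dump($caseSensitive);   // int(0)
var_dump($caseInsensitive); // int(1)
```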

Match patterns across multiple lines

Normally “^” and “$” match only at the very start and end of the string (PCRE also lets “$” match just before a final newline). With the “m” modifier they also match at the start and end of every line within the string. Example:

$str = 'Hello 
World';
if (preg_match('/^World$/', $str)) echo 'Yes!'; // this will not work
if (preg_match('/^World$/m', $str)) echo 'Yes, a multi-line!';

The first test fails because, without a modifier, “^” and “$” are anchored to the start and end of the whole string. The second test finds the word “World” on the second line.

Match all characters, even newline characters

Sometimes the multi-line modifier is not enough. That is the moment the “s” modifier will help. Using this modifier, the “.” (dot) in your pattern matches every character, including newlines.
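A small sketch of the dot behaviour, reusing the two-line string idea from the previous example. The variable names are only for illustration.

```php
<?php
$str = "Hello\nWorld";

// By default "." does not match a newline, so the pattern
// cannot reach from "Hello" to "World".
$withoutS = preg_match('/Hello.World/', $str);

// With "s" the dot also matches the newline between the words.
$withS = preg_match('/Hello.World/s', $str);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
```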

Ignore whitespace!

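Complicated patterns, for example one that selects a string within a particular HTML tag, are easier to read when you can spread them over several lines. With the “x” modifier, whitespace in the pattern (not in the string you test) is ignored, except when it is escaped or sits inside a character class. You can also insert comments into complicated patterns: everything from an unescaped “#” outside a character class up to the next newline is ignored as well.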
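As an illustration, here is a sketch of a commented pattern. The date format and the variable names are hypothetical and only chosen to keep the example short. Take care that the comments do not contain the pattern delimiter (“/” here), because that would end the pattern early.

```php
<?php
// Hypothetical pattern for a date such as 2024-01-31.
// Thanks to "x", the line breaks, the indentation and the
// "#" comments are not part of the actual pattern.
$pattern = '/
    ^(\d{4})   # year
    -(\d{2})   # month
    -(\d{2})$  # day
/x';

preg_match($pattern, '2024-01-31', $m);

var_dump($m[1]); // string(4) "2024"
var_dump($m[2]); // string(2) "01"
var_dump($m[3]); // string(2) "31"
```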

Switch to “Ungreedy”

Normally quantifiers such as “*” and “+” are greedy: they match as much of the string as possible. This is painful if you need to collect characters in subpatterns. Take a regular expression that collects one or more href attributes from a regular web page: this gets very difficult if the page contains more than one link element, because a greedy pattern runs from the first link to the last one. The modifier “U” makes the quantifiers in your pattern “ungreedy”, so they match as little as possible.

There are many more modifiers mentioned in the PHP manual. Check this link to get more information. Note that modifiers are cool options to optimize or extend your pattern; they are never base functionality! Check this regular expression example page with a few useful REGEX patterns.
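To come back to the “U” modifier and the href attributes: the following is a minimal sketch, assuming a simplified HTML snippet with double-quoted attributes. The snippet and the variable names are made up for illustration, and real web pages are usually messier than this.

```php
<?php
$html = '<a href="/first">One</a> <a href="/second">Two</a>';

// Greedy (default): ".*" runs up to the last double quote in the string,
// so both links end up in a single match.
preg_match_all('#href="(.*)"#', $html, $greedy);

// Ungreedy with "U": ".*" stops at the first closing quote,
// so every href is collected separately.
preg_match_all('#href="(.*)"#U', $html, $ungreedy);

print_r($greedy[1]);   // one element: /first">One</a> <a href="/second
print_r($ungreedy[1]); // two elements: /first and /second
```

Keep in mind that “U” inverts the greediness of all quantifiers in the pattern, so a quantifier followed by “?” becomes greedy again while the modifier is set.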

Published in: PHP Scripts

9 Comments

  1. Good article, short and very much to the point, regular expressions have always been a pain and you explain these modifiers in plain english so thumbs up to you.

    Have a pint on me aye?!

  2. Hey Olaf,

    first of all, nice article but this test will not find the word World in your example string ;)

    if (preg_match('/^World$/m', $str)) echo 'Yes, a multiline!';

  3. Hi,
    normally I’m not very good in writing test cases ;), but this time I did a test and pasted the code into this article after the browser shows me the string “Yes, a multiline!”

    maybe you can try it by yourself?

  4. okay i trust u :)

    i thought it, because ^ means string-start and $ means string-end, and the string is hello world.

    okay, the m modifier interpret every line new. that was new for me :)

    thank you!

  5. Thanks so much for this. I’ve always had a problem with the greediness of regex, not to mention ignoring newlines (which I would have to prepare the string by stripping out all newline characters). This is definitely a lifesaver!
