PHP Regular Expressions
Switching from ereg to preg
Problem
You want to convert from using ereg functions to preg functions.
Solution
First, you have to add delimiters to your patterns:
preg_match('/pattern/', 'string');
For case-insensitive matching, replace eregi() with preg_match() and the i modifier:
preg_match('/pattern/i', 'string');
When using integers instead of strings as patterns or replacement values, convert the number to hexadecimal and specify it using an escape sequence:
$hex = dechex($number);
preg_match("/\x$hex/", 'string');
Discussion
There are a few major differences between ereg and preg. First, when you use preg functions, the pattern isn’t just the pattern string; it also needs delimiters, as in Perl, so it’s /pattern/ instead. So:
ereg('pattern', 'string');
becomes:
preg_match('/pattern/', 'string');
When choosing your pattern delimiters, pick a character that doesn’t appear inside the regular expression pattern, or the first occurrence will close the pattern early. If you can’t avoid this, escape every instance of the delimiter in the pattern with a backslash. Instead of doing this by hand, call addcslashes().
For example, if you use / as your delimiter:
$ereg_pattern = '<b>.+</b>';
$preg_pattern = addcslashes($ereg_pattern, '/');
the value of $preg_pattern is now <b>.+<\/b>.
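The escaping step above can be wrapped in a small helper. This is only a sketch: ereg_to_preg() is a hypothetical name, not a PHP built-in, and it assumes the incoming pattern contains no delimiter that is already escaped (addcslashes() would add a second backslash in front of it). It also does nothing about syntax differences between POSIX and Perl-compatible patterns.

```php
<?php
// Hypothetical helper (not part of PHP): wrap an ereg-style pattern in
// delimiters, escaping any occurrence of the delimiter inside the pattern.
// Assumes the pattern does not already contain an escaped delimiter.
function ereg_to_preg(string $eregPattern, string $delimiter = '/', string $modifiers = ''): string
{
    return $delimiter . addcslashes($eregPattern, $delimiter) . $delimiter . $modifiers;
}

// The </b> in the pattern contains the delimiter, so it gets escaped.
$pattern = ereg_to_preg('<b>.+</b>');
$matched = preg_match($pattern, 'some <b>bold</b> text', $m);

// Often simpler: pick a delimiter that never appears in the pattern.
$altMatched = preg_match('#<b>.+</b>#', 'some <b>bold</b> text');
```

When you control the pattern text, choosing a different delimiter such as # usually reads better than escaping; the helper is mainly useful when patterns arrive from elsewhere.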
The preg functions don’t have a parallel series of case-insensitive functions. They have a case-insensitive modifier instead. To convert, change:
eregi('pattern', 'string');
to:
preg_match('/pattern/i', 'string');
Adding the i after the closing delimiter makes the change.
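As a quick illustration of the modifier, the following sketch runs the same pattern with and without i against a sample string (the subject text is made up for the example):

```php
<?php
$subject = 'Hello WORLD';

// Without the modifier the match is case-sensitive, as with ereg().
$sensitive = preg_match('/world/', $subject);

// With the i modifier the match ignores case, as with eregi().
$insensitive = preg_match('/world/i', $subject);
```

Here $sensitive is 0 and $insensitive is 1, because preg_match() returns the number of matches it found (at most one).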
Finally, there is one obscure difference. If you use a number (not a string) as a pattern or replacement value in ereg_replace(), it’s assumed you are referring to the ASCII value of a character. Therefore, because 9 is the ASCII code for tab (i.e., \t), this code inserts tabs at the beginning of each line:
$tab = 9;
$replaced = ereg_replace('^', $tab, $string);
Likewise, this replaces every linefeed (ASCII 10) with a form feed (ASCII 12); to produce carriage returns, use 13 instead:
$converted = ereg_replace(10, 12, $text);
To avoid this feature in ereg functions, use this instead:
$tab = '9';
On the other hand, preg_replace() treats the number 9 as the one-character string '9', not as a tab. To use these character codes with preg_replace(), convert them to hexadecimal and prefix them with \x. For example, 9 becomes \x9 or \x09, and 12 becomes \x0c. Alternatively, you can use \t, \r, and \n for tabs, carriage returns, and linefeeds, respectively. PCRE understands these escapes inside a pattern; in a replacement value, write them in a double-quoted PHP string so that PHP itself produces the character.
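A minimal sketch of that conversion follows. char_code_to_pcre() is a hypothetical helper name, and the sample text is invented for the example. It assumes codes in the range 0 to 255 and pads to two hex digits so that a hex digit following the escape in the pattern can't be absorbed into it. For the replacement side it uses chr() and a double-quoted "\t", since preg_replace() does not interpret \x escapes in replacement strings.

```php
<?php
// Hypothetical helper: turn an integer character code (0-255) into a
// two-digit PCRE hex escape such as \x09.
function char_code_to_pcre(int $code): string
{
    return sprintf('\x%02x', $code);
}

$text = "one\ntwo\n";

// Pattern side: 10 (linefeed) becomes the escape \x0a.
$pattern = '/' . char_code_to_pcre(10) . '/';

// Replacement side: build the actual character with chr().
$converted = preg_replace($pattern, chr(13), $text);

// Insert a tab at the start of every line; in PCRE the m modifier is
// what lets ^ match after each internal newline.
$lines = "one\ntwo";
$indented = preg_replace('/^/m', "\t", $lines);
```

After this runs, $converted holds "one\rtwo\r" and $indented holds "\tone\n\ttwo".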