PHP Regular Expressions
Escaping Special Characters in a Regular Expression
Problem
You want to have characters such as * or + treated as literals, not as metacharacters, inside a regular expression. This is useful when allowing users to type in search strings you want to use inside a regular expression.
Solution
Use preg_quote() to escape PCRE metacharacters:
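// $book_rank is assumed to already hold the text to search, in the form "Title:rank".
// Passing '/' as the second argument also escapes the pattern delimiter.
$pattern = preg_quote('The Education of H*Y*M*A*N K*A*P*L*A*N', '/') . ':(\d+)';
if (preg_match("/$pattern/", $book_rank, $matches)) {
    print "Leo Rosten's book ranked: " . $matches[1];
}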
Discussion
Here are the characters that preg_quote() escapes:
. \ + * ? ^ $ [ ] ( ) { } < > = ! | :
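It escapes each of these metacharacters by placing a backslash in front of it. Newer PHP versions escape a few more characters as well: - since PHP 5.3 and # since PHP 7.3.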
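To make the escaping concrete, here is a minimal sketch that prints what preg_quote() returns for a string containing several metacharacters. The input string is made up for illustration. The quoted result is then used as a pattern to confirm it matches the original text literally.

```php
<?php
// Illustrative input only; not taken from the recipe above.
$raw = 'H*Y*M*A*N (2+2=4?)';

// Each metacharacter gets a backslash in front of it.
$quoted = preg_quote($raw);
print $quoted . "\n";   // H\*Y\*M\*A\*N \(2\+2\=4\?\)

// The quoted string, used as a pattern, matches the raw text literally.
// (No delimiter argument is needed here because $raw contains no slash.)
$matched = preg_match('/^' . $quoted . '$/', $raw);
print $matched . "\n";  // 1
```

Characters that are not metacharacters, such as letters, digits, and spaces, pass through unchanged.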
preg_quote() does not escape the pattern delimiter by default. To have it escaped too, pass the delimiter (usually /) as the second argument. This matters whenever you incorporate user input into a regular expression pattern. The following code expects $_GET['search_term'] from a web form and searches the string $s for words beginning with that search term:
$search_term = preg_quote($_GET['search_term'], '/');
if (preg_match("/\b$search_term/i", $s)) {
    print 'match!';
}
Using preg_quote() ensures the regular expression is interpreted properly if, for example, a Magnum, P.I. fan enters t.c as a search term. Without preg_quote(), this matches tic, tucker, and any other words whose first letter is t and third letter is c.
Passing the pattern delimiter to preg_quote() as well makes sure that user input with forward slashes in it, such as CP/M, is also handled correctly.
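The two cases above can be checked with a short sketch. The helper name starts_word() and the sample strings are hypothetical, introduced only for this illustration. The helper wraps the same preg_quote() and preg_match() calls used in the search example.

```php
<?php
// Hypothetical helper, not part of the original recipe: returns true when
// $text contains a word that begins with the literal string $term.
function starts_word(string $term, string $text): bool
{
    // Escape metacharacters and the / delimiter in the user-supplied term.
    $quoted = preg_quote($term, '/');
    return preg_match("/\b$quoted/i", $text) === 1;
}

var_dump(starts_word('t.c', 'a tic in the eye'));   // bool(false)
var_dump(starts_word('t.c', 'ask t.c. about it'));  // bool(true)
var_dump(starts_word('CP/M', 'runs on CP/M 2.2'));  // bool(true)

// Without quoting, the dot matches any character, so "tic" matches.
$unquoted = preg_match('/\bt.c/i', 'a tic in the eye');
var_dump($unquoted);                                // int(1)
```

If the delimiter were not passed to preg_quote(), the slash in CP/M would end the pattern early and preg_match() would fail with a warning instead of matching.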