PHP Regular Expressions
Preventing Parentheses from Capturing Text
Problem
You’ve used parentheses for grouping in a pattern, but you don’t want the text that matches what’s in the parentheses to show up in your array of captured matches.
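Solution
Put ?: at the beginning of the parenthesized subpattern to make it noncapturing, as in the following example.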
Example Preventing text capture
$html = '<link rel="icon" href="http://www.example.com/icon.gif"/>
<link rel="prev" href="http://www.example.com/prev.xml"/>
<link rel="next" href="http://www.example.com/next.xml"/>';
preg_match_all('/rel="(prev|next)" href="([^"]*?)"/', $html, $bothMatches);
preg_match_all('/rel="(?:prev|next)" href="([^"]*?)"/', $html, $linkMatches);
print '$bothMatches is: '; var_dump($bothMatches);
print '$linkMatches is: '; var_dump($linkMatches);
$bothMatches contains the values of both the rel and the href attributes. $linkMatches, however, contains only the values of the href attributes, because its rel subpattern is noncapturing. The code prints:
$bothMatches is: array(3) {
[0]=>
array(2) {
[0]=>
string(49) "rel="prev" href="http://www.example.com/prev.xml""
[1]=>
string(49) "rel="next" href="http://www.example.com/next.xml""
}
[1]=>
array(2) {
[0]=>
string(4) "prev"
[1]=>
string(4) "next"
}
[2]=>
array(2) {
[0]=>
string(31) "http://www.example.com/prev.xml"
[1]=>
string(31) "http://www.example.com/next.xml"
}
}
$linkMatches is: array(2) {
[0]=>
array(2) {
[0]=>
string(49) "rel="prev" href="http://www.example.com/prev.xml""
[1]=>
string(49) "rel="next" href="http://www.example.com/next.xml""
}
[1]=>
array(2) {
[0]=>
string(31) "http://www.example.com/prev.xml"
[1]=>
string(31) "http://www.example.com/next.xml"
}
}
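With the rel alternation made noncapturing, the href value is the only captured piece of each match. As a minimal sketch of how that might be used, the following collects the URLs with one match per array element. The PREG_SET_ORDER flag and the $linkSets and $hrefs names are choices made for this illustration, not part of the original example.

```php
<?php
$html = '<link rel="icon" href="http://www.example.com/icon.gif"/>
<link rel="prev" href="http://www.example.com/prev.xml"/>
<link rel="next" href="http://www.example.com/next.xml"/>';

// PREG_SET_ORDER groups the results by match instead of by subpattern.
preg_match_all('/rel="(?:prev|next)" href="([^"]*?)"/', $html, $linkSets, PREG_SET_ORDER);

$hrefs = array();
foreach ($linkSets as $match) {
    // $match[0] is the full match; $match[1] is the href, the only capture.
    $hrefs[] = $match[1];
}
print implode("\n", $hrefs) . "\n";
```

Each element of $linkSets holds just two entries, so the href can be read from index 1 without skipping over the rel value.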
Discussion
Preventing capturing is particularly useful when a subpattern is optional. Because it might not show up in the array of captured text, an optional subpattern can change the number of pieces of captured text. This makes it hard to reference a particular matched piece of text at a given index. Making optional subpatterns noncapturing prevents this problem.
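The changing count can be seen in a small sketch. It assumes preg_match() is called without the PREG_UNMATCHED_AS_NULL flag, in which case a trailing subpattern that did not take part in the match is left out of the results. The pattern and the $withTitle and $withoutTitle names are made up for this illustration.

```php
<?php
// The optional title subpattern is the last capturing group in this pattern.
$pattern = '/rel="(prev|next)"( title="[^"]+?")?/';

preg_match($pattern, '<link rel="prev" title="Previous" href="http://www.example.com/prev.xml"/>', $withTitle);
preg_match($pattern, '<link rel="next" href="http://www.example.com/next.xml"/>', $withoutTitle);

print count($withTitle) . "\n";    // 3: full match, rel value, title attribute
print count($withoutTitle) . "\n"; // 2: the unmatched trailing group is left out
```

Code that reads a fixed index from the result has to allow for both shapes unless the optional group is made noncapturing.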
Example A noncapturing optional subpattern
$html = '<link rel="icon" href="http://www.example.com/icon.gif"/>
<link rel="prev" title="Previous" href="http://www.example.com/prev.xml"/>
<link rel="next" href="http://www.example.com/next.xml"/>';
preg_match_all('/rel="(?:prev|next)"(?: title="[^"]+?")? href="([^"]*?)"/',
$html, $linkMatches);
print '$linkMatches is: '; var_dump($linkMatches);
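To check what the noncapturing optional subpattern buys, the sketch below runs the same pattern next to a variant whose title group captures. The capturing variant and the $shifted name are assumptions added for contrast; they are not part of the original example.

```php
<?php
$html = '<link rel="icon" href="http://www.example.com/icon.gif"/>
<link rel="prev" title="Previous" href="http://www.example.com/prev.xml"/>
<link rel="next" href="http://www.example.com/next.xml"/>';

// Noncapturing optional title: href is the only capture, always at index 1.
preg_match_all('/rel="(?:prev|next)"(?: title="[^"]+?")? href="([^"]*?)"/',
               $html, $linkMatches);

// Capturing optional title, for contrast: title takes index 1, href moves to index 2.
preg_match_all('/rel="(?:prev|next)"( title="[^"]+?")? href="([^"]*?)"/',
               $html, $shifted);

print count($linkMatches) . "\n"; // 2
print count($shifted) . "\n";     // 3
print implode("\n", $linkMatches[1]) . "\n";
```

In the noncapturing version, both href values land in $linkMatches[1] whether or not a title attribute is present. In the capturing version, $shifted[1] holds the title text (an empty string for the link that has none) and the href values sit in $shifted[2].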