Groovy Regex & Lookahead/Lookbehind revisited.

def t = '134 445 887 6667 456-987 fdcf'
def p1 = /(?!^)\d{3}\b/            // No good: also selects 667 (the tail of 6667)
def p2 = /(?!^)\b\d{3}\b/          // Great: the leading \b rejects the 667 inside 6667
def p3 = /((\s)(\d{3})(?=\D))/     // No good: omits 987, and it needs 3 groups
def p4 = /((\s|\-)(\d{3})(?=\D))/  // Good again, but 3 groups: the digits are group 3, i.e. it[3] (it[0] is the full match)
def p5 = /((?!^)(?<!\-)(?=\b)(\d{3})(?!\-)(?=\b))/  // Great if you want to omit the ###-### form
def m1 = (t =~ p1)
def m2 = (t =~ p2)
def m3 = (t =~ p3)
def m4 = (t =~ p4)
def m5 = (t =~ p5)
assert m1.size() == 5
m1.each{assert it in ['445','887','667','456','987']}
assert m2.size() == 4
m2.each{assert it in ['445','887','456','987']}
assert m3.size() == 3
m3.each{assert it[3] in ['445','887','456']}
assert m4.size() == 4
m4.each{assert it[3] in ['445','887','456','987']}
assert m5.size() == 2
m5.each{assert it[0] in ['445','887']}
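
If the group indexing gets in the way, one possible alternative is to let the GDK collect the values for you. This is only a sketch: it repeats the test string and patterns from above, and it assumes the standard GDK behaviour where findAll on a String returns the full match of every hit, while iterating a Matcher whose pattern has groups yields a list per match.

```groovy
def t = '134 445 887 6667 456-987 fdcf'

// findAll with just a regex returns the full match (group 0) of every hit
assert t.findAll(/(?!^)\b\d{3}\b/) == ['445', '887', '456', '987']

// With groups, each match is a list: [full match, group 1, group 2, group 3]
def m4 = (t =~ /((\s|\-)(\d{3})(?=\D))/)
assert m4.collect { it[3] } == ['445', '887', '456', '987']
```

The first form needs no capturing groups at all, which is one reason pattern 2 is nicer to work with than pattern 4.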

This is very similar to my earlier post,
"Groovy Regular Expressions to abbreviate compass directions with look ahead and look behind".
Breaking pattern 5 apart piece by piece, we have:

  • (?!^) : negative lookahead for the start of the string (i.e. not at the start)
  • (?<!\-) : negative lookbehind for '-' (i.e. exclude -###)
  • (?=\b) : positive lookahead for a word boundary
  • (\d{3}) : three digits (###)
  • (?!\-) : negative lookahead for '-' (i.e. exclude ###-)
  • (?=\b) : positive lookahead for a word boundary
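
To see what each piece contributes, here is a rough sketch that removes one piece at a time and checks which numbers come back. It is written without the capturing groups, and it uses a bare \b in place of (?=\b), since a lookahead around a zero-width assertion tests the same thing as the assertion itself. The variable name full is just for illustration.

```groovy
def t = '134 445 887 6667 456-987 fdcf'

// Pattern 5 without the groups: only the free-standing ### values survive
def full = /(?!^)(?<!-)\b\d{3}(?!-)\b/
assert t.findAll(full) == ['445', '887']

// Without the lookbehind for '-', the -### half (987) is matched again
assert t.findAll(/(?!^)\b\d{3}(?!-)\b/) == ['445', '887', '987']

// Without the lookahead for '-', the ###- half (456) is matched again
assert t.findAll(/(?!^)(?<!-)\b\d{3}\b/) == ['445', '887', '456']

// Without (?!^), the leading 134 is matched as well
assert t.findAll(/(?<!-)\b\d{3}(?!-)\b/) == ['134', '445', '887']
```

In every variant the two word boundaries stay in place, which is what keeps 666 and 667 (the pieces of 6667) out of the results.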