Sunday 28 June 2015

tutorial on regular expressions

Hi all,


Website to test regular expressions: http://regexpal.com/

Have you ever had to validate a string and ended up treating it as an array of characters to check a pattern? Do you write 50 lines of code to validate a password? Do you want to read about regular expressions? If the answer to any of these is yes, read on; otherwise this blog is for techies like me. (:P)


Grammar of regular expressions:

. matches any single character (letter, digit, ...) except a line break
\w matches a word character (A-Z, a-z, 0-9, _)
\d matches a digit
[] character set: matches any one of the characters listed inside
() groups characters so that you can put a repetition condition on the whole group
{} repetition of characters: {x,y} means a minimum of x and a maximum of y repetitions
| alternation metacharacter, which works as an or operator
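
Here is a small sketch of the grammar above in action. It assumes Python's re module, and the sample text is made up purely for illustration.

```python
import re

text = "cat cot cut c_t 7 42 2015"

# . stands for any single character (except a line break)
dot_matches = re.findall(r"c.t", text)

# [ao] is a character set: exactly one of the listed characters
set_matches = re.findall(r"c[ao]t", text)

# \d{2,4} is a run of two to four digits
digit_runs = re.findall(r"\d{2,4}", text)

# | is alternation: either side may match
alt_matches = re.findall(r"cat|cut", text)

print(dot_matches, set_matches, digit_runs, alt_matches)
```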


meta characters
? 0 or 1 times
* 0 or more times
+ 1 or more times
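
To see how the three repetition metacharacters differ, here is a quick sketch using Python's re module (an assumption, any engine behaves the same way here). The sample strings are invented for the example.

```python
import re

samples = ["zl", "zel", "zeel", "zeeel"]

# e? : zero or one e
optional = [s for s in samples if re.fullmatch(r"ze?l", s)]

# e* : zero or more e
star = [s for s in samples if re.fullmatch(r"ze*l", s)]

# e+ : one or more e
plus = [s for s in samples if re.fullmatch(r"ze+l", s)]

print(optional, star, plus)
```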

In an alternation, the leftmost alternative gets precedence over the ones to its right.

regular expressions are eager

pattern: zeel|zeelshah
string: zeelshah
match: zeel

It returns the first match it finds. It does not go on to check the second alternative.
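
A minimal sketch of this eagerness, assuming Python's re module. Swapping the order of the alternatives changes the result, which shows that the leftmost one wins.

```python
import re

# The left alternative is tried first and wins as soon as it matches
short_first = re.search(r"zeel|zeelshah", "zeelshah").group()

# With the longer alternative on the left, the longer text is returned
long_first = re.search(r"zeelshah|zeel", "zeelshah").group()

print(short_first, long_first)
```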

regular expressions are greedy

pattern: zeel(shah)?
string: zeelshah
match: zeelshah

Even though shah is optional, the engine prefers to return the result with shah.
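
The same greedy behaviour can be checked with a short sketch (Python's re module assumed, second sample string invented for contrast).

```python
import re

# The optional group is taken whenever it can be
with_shah = re.search(r"zeel(shah)?", "zeelshah").group()

# When shah is absent, the pattern still matches without it
without_shah = re.search(r"zeel(shah)?", "zeel only").group()

print(with_shah, without_shah)
```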

global flag in regular expressions
By default a regular expression returns only the first match rather than finding every possible match in the string. You can ask for all matches by providing the global flag, g.
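
Python's re module has no g flag, so this sketch assumes re.search stands in for the default first-match behaviour and re.findall / re.finditer for a global search. The sample text is made up.

```python
import re

text = "zeel shah, zeel patel"

# First match only (like a pattern without the g flag)
first_only = re.search(r"zeel", text).group()

# Every match in the string (like a pattern with the g flag)
all_matches = re.findall(r"zeel", text)
positions = [m.start() for m in re.finditer(r"zeel", text)]

print(first_only, all_matches, positions)
```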


Repetition applied to an alternation repeats the whole alternation that many times rather than repeating a single element, so each repetition may pick a different alternative.

Start and end anchors ($, ^, \A) refer to a position, not to an actual character.
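
Both points can be tried out with a few lines. This is only a sketch, assuming Python's re module and invented sample strings.

```python
import re

# (a|b){3} repeats the choice three times; each round may pick a or b
mixed = re.fullmatch(r"(a|b){3}", "aba") is not None

# Without the group, {3} applies to b only, so "aba" is not matched
ungrouped = re.fullmatch(r"a|b{3}", "aba") is not None

# Anchors match a position, so the match itself is empty
start_span = re.search(r"^", "zeel").span()
end_span = re.search(r"$", "zeel").span()

print(mixed, ungrouped, start_span, end_span)
```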

single-line tools
In single-line mode, ^ and $ do not match at line breaks, only at the start and end of the whole string.

In JavaScript, add the m flag (/.../m) to tell it to use multiline mode.
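
For comparison, here is the same idea sketched in Python, where the multiline switch is the re.M flag rather than /m. The two-line string is just an example.

```python
import re

text = "zeel\nshah"

# Default mode: ^ and $ only match at the ends of the whole string,
# and \w+ cannot cross the line break, so nothing matches
single_line = re.findall(r"^\w+$", text)

# Multiline mode: ^ and $ also match at each line break
multi_line = re.findall(r"^\w+$", text, flags=re.M)

print(single_line, multi_line)
```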


word boundary
\b word boundary
\B non-word boundary

A space is not a word boundary. A word boundary has zero length: like the anchors, it refers to a position, not a character.
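
A short sketch of \b and \B, assuming Python's re module and a made-up sentence.

```python
import re

text = "cat concat cats cat."

# \bcat\b : cat as a whole word only
whole_words = re.findall(r"\bcat\b", text)

# \Bcat\b : cat at the end of a longer word (here, inside concat)
word_endings = re.findall(r"\Bcat\b", text)

# The boundary itself is a position, so its match is empty
boundary_span = re.search(r"\b", "cat").span()

print(whole_words, word_endings, boundary_span)
```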


back references
A regular expression stores the text matched by each group as it is evaluated.

So z(e{2})l stores ee, which you can access using
\1, ..., \9
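
To make this concrete, here is a sketch in Python (assumed for illustration). The second pattern, which finds a doubled word, is my own example and not from any particular tool.

```python
import re

# The group (e{2}) stores what it matched
stored = re.search(r"z(e{2})l", "zeel").group(1)

# \1 inside the pattern means "the same text that group 1 matched"
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group()

print(stored, doubled)
```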




You can do find and replace in Sublime Text using regular expressions with groups.

Situation:
Suppose you have one column containing firstname lastname and you want to change it to lastname firstname.


Regular expressions can be helpful in such a case.

Press ctrl + h to open the find and replace panel and make sure regular expression mode is turned on.

Put \b([\w;,]+)\b\s([\w;,]+) in the find field

and put

\2 \1 in the replace field.
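
The same find and replace can be tried outside the editor. This sketch assumes Python's re.sub, which also understands \1 and \2 in the replacement. The names are sample data.

```python
import re

# Same find pattern and replacement as in the editor steps above
find = r"\b([\w;,]+)\b\s([\w;,]+)"
replace = r"\2 \1"

column = "Zeel Shah\nJohn Smith"
swapped = re.sub(find, replace, column)

print(swapped)
```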


non-capturing groups:

To turn off the default behavior of storing what a group matched, add ?: at the beginning of the group. You can use this to save matching time as well as storage, which leaves the numbered slots free for the groups you actually need.
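
A quick check of what gets stored with and without ?:, sketched with Python's re module.

```python
import re

# A normal group stores its match
capturing = re.search(r"zeel(shah)?", "zeelshah").groups()

# A (?:...) group matches the same text but stores nothing
non_capturing = re.search(r"zeel(?:shah)?", "zeelshah").groups()

print(capturing, non_capturing)
```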

positive look-ahead assertion inside a group

?=

It is an assertion of what ought to be ahead.
The ? says there is something special about this group and the = says that it is a look-ahead assertion.

It works like an if condition: if the look-ahead matches, matching goes ahead from the same position; otherwise nothing matches at that position.

/(?=seashore)sea/ applied to seashore will match sea. It first checks that seashore is present and only then goes ahead and matches sea.

The look-ahead has zero length, so it does not consume any characters.

Suppose you want every word that is followed by a comma.

\b\w+\b(?=,) will match each word followed by a comma, and it will not include the comma in the match.
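
Both look-ahead examples can be checked with a small sketch, assuming Python's re module. The strings seaside and the fruit sentence are invented for the test.

```python
import re

# The look-ahead checks for seashore but only sea is returned
sea = re.search(r"(?=seashore)sea", "seashore")
sea_text, sea_span = sea.group(), sea.span()

# If the look-ahead fails, there is no match at all
no_shore = re.search(r"(?=seashore)sea", "seaside")

# Words followed by a comma; the comma is not part of the match
before_comma = re.findall(r"\b\w+\b(?=,)", "apples, pears and plums, then figs")

print(sea_text, sea_span, no_shore, before_comma)
```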



You can put multiple look-aheads one after another, so that each one checks its own condition.

Using this concept you can create a regular expression for a website password.

Suppose you want passwords of 8 to 15 characters which must contain at least one capital letter, one small letter, one digit and one special character. The following regular expression does that:


^(?=.*\d)(?=.*[A-Z])(?=.*[a-z])(?=.*[&?!@#$%^*()_-]).{8,15}$
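
Here is how the pattern above might be wired into a check. This is only a sketch: Python is assumed, and the name is_valid_password and the sample passwords are hypothetical.

```python
import re

# The password pattern from above, compiled once
PASSWORD_RE = re.compile(
    r"^(?=.*\d)(?=.*[A-Z])(?=.*[a-z])(?=.*[&?!@#$%^*()_-]).{8,15}$"
)


def is_valid_password(candidate: str) -> bool:
    """Return True if candidate satisfies every look-ahead and the length rule."""
    return PASSWORD_RE.fullmatch(candidate) is not None


for sample in ["Passw0rd!", "password", "Pw0!", "Passw0rdd"]:
    print(sample, is_valid_password(sample))
```

Only the first sample passes: the others lack the required character kinds or are too short.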



One thing to notice here: it's better to use look-aheads than a plain character set, because you don't have to think about which kind of character comes first in the sequence. Each look-ahead works independently, without depending on the rest of the regular expression.
