Regular expressions

Pattern matching with the Perl scripting language

http://arraylist.blogspot.com
We usually talk about regular expressions and
pattern matching in the context of a scripting
language such as Perl or shell script.

Let us look at pattern matching using regular
expressions in the Perl scripting language.

Pattern matching in Perl uses the match
operator, written with a choice of delimiters
such as m// or m:: or m,,

Example - m/simple/

Here the text "simple" is matched against $_

$_ is the default scalar variable in Perl.
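
As a minimal sketch of the match operator described above (the sample strings and variable names are made up for illustration), the following loop tests each line against $_ and then repeats the match with a different delimiter:

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the slides.
my @lines = ("a simple test", "nothing here", "simple again");

my @hits;
for (@lines) {                      # each element is aliased to $_ in turn
    push @hits, $_ if m/simple/;    # m// without =~ tests $_
}
print "$_\n" for @hits;

# The same pattern with comma delimiters, bound to a named variable with =~.
my $text  = "keep it simple";
my $found = ($text =~ m,simple,) ? 1 : 0;
print "found: $found\n";
```

When no variable is bound with =~, the match runs against $_, which is why the loop body does not have to name the string it is testing.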

Metacharacters have to be preceded by a \
to be matched literally.

Metacharacters ^ $ ( ) \ | @ [ { ? . + *

So to match the literal text $10 we write m/\$10/
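
A short sketch of escaping, assuming a made-up price string; the single quotes keep Perl from interpolating $10 when the sample string itself is built:

```perl
use strict;
use warnings;

# Hypothetical sample text; single quotes mean no interpolation.
my $price = 'The ticket costs $10 today';

# \$ matches a literal dollar sign. Unescaped, $ would be read as an
# anchor or as the start of a variable name.
my $has_price = ($price =~ m/\$10/) ? 1 : 0;

# Other metacharacters are escaped the same way, e.g. literal parentheses.
my $has_paren = ('f(x)' =~ m/f\(x\)/) ? 1 : 0;

print "has_price=$has_price has_paren=$has_paren\n";
```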

If we use // as delimiters, we can omit
the character m from the match operator.
So m/simple/ can be written /simple/

To match against the contents of a variable,
simply use /$varname/
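
The sketch below shows a variable interpolated into a pattern; the word list is invented for illustration. The second half goes slightly beyond the slides: it assumes you may want the variable's contents treated literally, which Perl supports with \Q...\E.

```perl
use strict;
use warnings;

my $varname = 'cat';
my @words   = ('concatenate', 'dog', 'cat');   # hypothetical sample data

# The contents of $varname are interpolated into the pattern.
my @matches = grep { /$varname/ } @words;
print "@matches\n";

# If the variable holds metacharacters, they keep their special meaning
# unless the interpolated text is quoted with \Q...\E.
my $needle  = 'a.b';
my $loose   = ('axb' =~ /$needle/)     ? 1 : 0;   # . matches any character
my $literal = ('axb' =~ /\Q$needle\E/) ? 1 : 0;   # . must be a real dot
print "loose=$loose literal=$literal\n";
```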

Metacharacters . ^ $ ( ) \ | @ [ { ? + *

. matches any single character (except a
newline, by default)

Example: /d.t/ matches dot, dit, d t

If we want . to behave as a non-metacharacter
we precede it with a \

Thus /d\.t/ matches d.t
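
To make the difference between . and \. concrete, here is a small sketch that filters a made-up list of candidates with both patterns:

```perl
use strict;
use warnings;

my @candidates = ('dot', 'dit', 'd t', 'dt', 'd.t');   # hypothetical samples

# Unescaped dot: any single character may sit between d and t.
my @any = grep { /d.t/ } @candidates;

# Escaped dot: only a literal "." may sit between d and t.
my @lit = grep { /d\.t/ } @candidates;

print join('|', @any), "\n";
print join('|', @lit), "\n";
```

Note that 'dt' matches neither pattern, because . always needs one character to consume.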

Special characters
\n – newline
\r – carriage return
\t – tab
\f – formfeed

Special characters take the same meaning
inside // during pattern matching
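
A brief sketch of the special characters inside a pattern, using an invented tab-separated record:

```perl
use strict;
use warnings;

my $record = "name\tvalue\n";   # hypothetical tab-separated record

# \t and \n mean tab and newline inside // just as in a double-quoted string.
my $has_tab     = ($record =~ /\t/) ? 1 : 0;
my $has_newline = ($record =~ /\n/) ? 1 : 0;

# The same escape works in other places that take a pattern, such as split.
my @fields = split /\t/, "a\tb\tc";

print "tab=$has_tab newline=$has_newline fields=@fields\n";
```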

Quantifiers tell the regex how many times a
pattern should be matched.
“+” matches the preceding character 1 or more times
Example: /go+d/ matches god and good but not gd
“*” matches the preceding character 0 or more times
Example: /hik*e/ matches hike and hie (k occurs 0
or more times between hi and e)
“?” matches the preceding character 0 or 1 times but
not more.
Example: /^h?ello$/ matches hello or ello but not
hhello (unanchored, /h?ello/ would still find
hello inside hhello)
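
The three quantifiers can be checked side by side with a small sketch; the word lists are made up, and the last pattern is anchored so that the whole string has to match:

```perl
use strict;
use warnings;

# + : one or more o between g and d
my @plus  = grep { /go+d/ }     qw(gd god good goood);

# * : zero or more k between hi and e
my @star  = grep { /hik*e/ }    qw(hie hike hikke hire);

# ? : zero or one h; ^ and $ force the whole string to match
my @quest = grep { /^h?ello$/ } qw(ello hello hhello);

print "plus:  @plus\n";
print "star:  @star\n";
print "quest: @quest\n";
```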

{} matches the preceding character a specified number of times
/a{5,10}/ - matches the character a at least 5 times,
but no more than 10 times
/a{5,}/ - matches 5 or more times.
/a{0,2}/ - matches 0 to 2 times.
/a{5}/ - matches exactly 5 times

.* - matches anything between two sets of characters
/hello.*world/ matches “hello Joe welcome to the
world”
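
A sketch of the counted quantifiers, assuming test strings built from repeated a characters. The patterns are anchored with ^ and $ so the counts apply to the whole string; without anchors, /a{5}/ would also succeed inside a longer run of a's.

```perl
use strict;
use warnings;

# Strings of 4, 5, 10 and 11 a's (hypothetical test data).
my @strings = ('a' x 4, 'a' x 5, 'a' x 10, 'a' x 11);

# Record the lengths of the strings that each pattern accepts.
my @range   = map { length($_) } grep { /^a{5,10}$/ } @strings;
my @atleast = map { length($_) } grep { /^a{5,}$/ }   @strings;
my @exact   = map { length($_) } grep { /^a{5}$/ }    @strings;

print "5 to 10: @range\n";
print "5 or more: @atleast\n";
print "exactly 5: @exact\n";

# .* consumes whatever lies between the two fixed parts; the parentheses
# capture it so it can be inspected.
my $greeting = 'hello Joe welcome to the world';
my ($middle) = ($greeting =~ /hello(.*)world/);
print "middle: [$middle]\n";
```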

Square brackets [] and character classes

[abcd] - match any one of the characters a, b, c, d
[a-d] - also means the same thing
[Aa] - match uppercase A or lowercase a
[0-9] - match a digit
[A-Za-z]{5} - match any group of 5 alphabetic
characters
[^a-z] - match any character that is not a
lowercase letter; a leading ^ negates the class
[*!\@#\$%&()] - match any of these characters
(in Perl, escape @ and $ so that they are not
interpolated as variables)
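
The character classes above can be tried out with a sketch like the following; all sample strings are invented, and @ and $ are escaped inside the last class so that Perl does not treat them as the start of a variable:

```perl
use strict;
use warnings;

# A digit anywhere in the string.
my $has_digit = ('room 7' =~ /[0-9]/) ? 1 : 0;

# Exactly five alphabetic characters (anchored to the whole string).
my $five_letters = ('Hello' =~ /^[A-Za-z]{5}$/) ? 1 : 0;

# Negated class: keep words containing any character that is not a-z.
my @non_lower = grep { /[^a-z]/ } qw(abc aBc a1c a-c);

# A class of punctuation characters, with @ and $ escaped.
my @punct = grep { /[*!\@#\$%&()]/ } ('hi!', 'a@b', 'plain', 'cost$');

print "digit=$has_digit five=$five_letters\n";
print "non_lower: @non_lower\n";
print "punct: @punct\n";
```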

Special character classes

\w - match a word character, same as [A-Za-z0-9_]
for ASCII text
\W - match a non-word character
\d - match a digit [0-9]
\D - match a non-digit
\s - match a whitespace character
\S - match a non-whitespace character
Example: /\d{3}/ - match 3 digits
/\s\w+\s/ - match a word surrounded
by whitespace
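
As a rough illustration of the shorthand classes, the sketch below pulls pieces out of made-up strings; the parentheses are capture groups added here so the matched text can be printed:

```perl
use strict;
use warnings;

# \d{3}: the first run of three digits.
my ($area) = ('call 555 now' =~ /(\d{3})/);

# \s\w+\s: a word with whitespace on both sides.
my ($word) = ('one two three' =~ /\s(\w+)\s/);

# \w also accepts the underscore.
my $underscore_ok = ('_' =~ /^\w$/) ? 1 : 0;

# \S+ with the g modifier: every run of non-whitespace characters.
my @tokens = ('a  b c' =~ /\S+/g);

print "area=$area word=$word underscore=$underscore_ok tokens=@tokens\n";
```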

Alternation and Anchors
Alternation uses | which means “or”
Eg. /tea|coffee/ - check if the string contains tea or
coffee
Grouping with alternation
Eg. /(fr|bl|cl)og/ - check if the string contains frog,
blog or clog

Anchors let you say where in the string the
pattern must match
^ - caret. Eg. /^tea/ matches tea only if it occurs
at the beginning of the line
$ - dollar sign. Eg. /sample$/ matches sample only
at the end of the line.
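
A small sketch of alternation, grouping and the two anchors, using invented sample strings:

```perl
use strict;
use warnings;

# Alternation: either word may appear anywhere in the string.
my @drinks = grep { /tea|coffee/ } ('green tea', 'black coffee', 'water');

# Grouping: the alternatives share the common ending "og".
my @ogs = grep { /(fr|bl|cl)og/ } qw(frog blog clog dog);

# Anchors: ^ ties the match to the start, $ to the end.
my @starts = grep { /^tea/ }    ('tea time', 'green tea');
my @ends   = grep { /sample$/ } ('a sample', 'sample text');

print join('|', @drinks), "\n";
print "@ogs\n";
print join('|', @starts), "\n";
print join('|', @ends), "\n";
```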

Substitution
Syntax - s///
s/searchstring/replacementstring/
Eg. $_ = "lies does not make sense";
s/lies/truth/; gives "truth does not make sense"

Instead of / you can use another delimiter
such as # with the substitution operator
Example: s#lies#truth#;
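
The substitution example can be run as the sketch below. The g modifier in the last line is an addition that the slides do not cover: it replaces every occurrence instead of only the first.

```perl
use strict;
use warnings;

# Substitution on the default variable $_.
$_ = 'lies does not make sense';
s/lies/truth/;
my $first = $_;
print "$first\n";

# The same operator with # as the delimiter, applied to copies of a
# hypothetical string so the original stays unchanged.
my $text = 'lies, more lies';
(my $once = $text) =~ s#lies#truth#;    # first occurrence only
(my $all  = $text) =~ s#lies#truth#g;   # every occurrence
print "$once\n$all\n";
```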
