UNIX Special Characters

Unix Filename Wildcards – Metacharacters
#1) ‘*’ – any number of characters:

This wildcard selects all files whose names match the expression when the
asterisk is replaced with any string of zero or more characters.
 Example1: List all files whose names start with ‘file’. E.g. file, file1,
file2, filenew
 $ ls file*
 Example2: List all files whose names end with ‘file’. E.g. file, afile,
bfile, newfile
 $ ls *file
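The two examples above can be tried safely in a scratch directory. This is a minimal sketch: the directory and the file names are made up for the demonstration, and `echo` is used in place of `ls` to make visible that the shell itself expands the pattern before the command runs.

```shell
#!/bin/sh
export LC_ALL=C                      # fixed sort order for the expansions
demo_dir=$(mktemp -d) || exit 1      # throwaway directory (assumption: mktemp is available)
( cd "$demo_dir" && touch file file1 file2 filenew afile bfile newfile notes.txt )

# file* : names that start with "file"; the * may also match nothing.
starts_with=$(cd "$demo_dir" && echo file*)
echo "$starts_with"    # file file1 file2 filenew

# *file : names that end with "file".
ends_with=$(cd "$demo_dir" && echo *file)
echo "$ends_with"      # afile bfile file newfile

rm -rf "$demo_dir"
```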
#2) ‘?’ – single character:

This wildcard selects all files whose names match the expression when the
question mark is replaced with exactly one character.
 Example1: List all files that have exactly one character after ‘file’. E.g.
file1, file2, filea
 $ ls file?
 Example2: List all files that have exactly two characters before ‘file’. E.g.
dofile, tofile, a1file
 $ ls ??file
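A short sketch of the ‘?’ wildcard, again in a scratch directory with made-up file names. It includes a few names that should not match, to show that each ‘?’ stands for exactly one character.

```shell
#!/bin/sh
export LC_ALL=C
demo_dir=$(mktemp -d) || exit 1      # throwaway directory for the demo
( cd "$demo_dir" && touch file file1 file2 filea file10 dofile tofile a1file xfile )

# file? : exactly one character after "file" (so neither "file" nor "file10").
one_after=$(cd "$demo_dir" && echo file?)
echo "$one_after"     # file1 file2 filea

# ??file : exactly two characters before "file" (so not "xfile").
two_before=$(cd "$demo_dir" && echo ??file)
echo "$two_before"    # a1file dofile tofile

rm -rf "$demo_dir"
```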
#3) ‘[’ range ‘]’ – single character from a range:

This wildcard selects all files whose names match the expression when the
bracketed range is replaced with exactly one character from the range.
 Example1: List all files that have a single digit after ‘file’. E.g. file1,
file2
 $ ls file[0-9]
 Example2: List all files that have any one lowercase letter before ‘file’.
E.g. afile, zfile
 $ ls [a-z]file
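The range wildcard can be checked the same way. In this sketch the file names are invented, and `LC_ALL=C` is set as an assumption so that `[a-z]` covers only the lowercase ASCII letters; in other locales a range may behave differently.

```shell
#!/bin/sh
export LC_ALL=C                      # make [a-z] mean lowercase ASCII only
demo_dir=$(mktemp -d) || exit 1
( cd "$demo_dir" && touch file1 file2 filea file10 afile zfile Afile 1file )

# file[0-9] : "file" followed by exactly one digit.
one_digit=$(cd "$demo_dir" && echo file[0-9])
echo "$one_digit"     # file1 file2

# [a-z]file : exactly one lowercase letter before "file".
one_letter=$(cd "$demo_dir" && echo [a-z]file)
echo "$one_letter"    # afile zfile

rm -rf "$demo_dir"
```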
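#4) ‘[’ range ‘]*’ – a character from a range, then any characters:

This combines two of the wildcards above: the bracketed range is replaced
with exactly one character from the range, and the asterisk with zero or more
characters of any kind. The characters matched by the asterisk do not have to
come from the range.
 Example1: List all files that have a digit right after ‘file’, followed by
anything. E.g. file1, file2, file33, file1abc
 $ ls file[0-9]*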
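The following sketch shows what `file[0-9]*` selects in practice, using invented file names. It also shows one possible way to keep only names made of ‘file’ plus digits, by filtering the list with an extended regular expression; that filter is an addition for illustration, not part of the wildcard syntax.

```shell
#!/bin/sh
export LC_ALL=C
demo_dir=$(mktemp -d) || exit 1
( cd "$demo_dir" && touch file file1 file2 file33 file1abc filex )

# file[0-9]* : one digit after "file", then anything at all.
glob_matches=$(cd "$demo_dir" && echo file[0-9]*)
echo "$glob_matches"   # file1 file1abc file2 file33

# Digits only after "file": filter the names with a regular expression.
digits_only=$(cd "$demo_dir" && printf '%s\n' file* | grep -E '^file[0-9]+$' | paste -s -d ' ' -)
echo "$digits_only"    # file1 file2 file33

rm -rf "$demo_dir"
```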
Overview of Regular expressions in Unix:

This tutorial introduces regular expressions. A regular expression is a
powerful tool for specifying search patterns in text.
A regular expression uses special characters to describe the text that a
line must contain in order to match.
The pattern is built from ordinary characters and special characters that
act as anchors, character sets, and modifiers.
Regular expressions can be used with text processing commands like vi,
grep, sed, awk, and others. Note that although some regular-expression
patterns look similar to filename-matching patterns, the two follow
different rules.
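To see why the two kinds of pattern should not be confused, the sketch below applies the same text, `ab*`, once as a filename-style pattern (through the shell's `case` statement) and once as a regular expression (through `grep`). The sample string is an arbitrary choice.

```shell
#!/bin/sh
name='abXYZ'

# Filename-style pattern: in ab*, the * stands for any run of characters.
case "$name" in
  ab*) glob_result=match ;;
  *)   glob_result=no-match ;;
esac

# Regular expression: in ab*, the * repeats the preceding 'b'.
# grep -x only reports lines where the whole line matches.
if printf '%s\n' "$name" | grep -qx 'ab*'; then
  regex_result=match
else
  regex_result=no-match
fi

echo "glob: $glob_result, regex: $regex_result"   # glob: match, regex: no-match
```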
#1) ‘^’ – anchor character for start of line:
If the caret is the first character in an expression, it anchors the remainder
of the expression to the start of the line.
 Example1: Match all lines that start with ‘A’. E.g. “A plane”
 Pattern: ‘^A’
 Example2: Match all lines that start with ‘hello’. E.g. “hello there”
 $ grep '^hello' file1
#2) ‘$’ – anchor character for end of line:
If the dollar sign is the last character in an expression, it anchors the
preceding part of the expression to the end of the line.
 Example1: Match all lines that end with ‘Z’. E.g. “The BUZZ”
 Pattern: ‘Z$’
 Example2: Match all lines that end with ‘done’. E.g. “well done”
 $ grep 'done$' file1
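Both anchors can be tried on a small sample file. This is a minimal sketch: the file contents are invented, and the patterns are single-quoted so that the shell passes ‘^’ and ‘$’ to grep unchanged.

```shell
#!/bin/sh
# Sample input in a temporary file; the contents are illustrative only.
file1=$(mktemp) || exit 1
printf '%s\n' 'hello there' 'say hello' 'well done' 'done deal' 'A plane' > "$file1"

# ^ anchors the match to the start of the line.
starts=$(grep '^hello' "$file1")
echo "$starts"      # hello there

# $ anchors the match to the end of the line.
ends=$(grep 'done$' "$file1")
echo "$ends"        # well done

rm -f "$file1"
```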
#3) ‘.’ – any single character:
The ‘.’ character matches any single character except the end-of-line.
 Example1: Match all lines that consist of exactly one character. E.g. “a”
 Pattern: ‘^.$’
 Example2: Match all lines that contain ‘d’, then any single character, then
‘ne’. E.g. “well done”
 $ grep 'd.ne' file1
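A small sketch of the ‘.’ character in use. The sample lines and the pattern `d.ne` are illustrative assumptions; the empty line and ‘dne’ are included to show that the dot must match exactly one character.

```shell
#!/bin/sh
file1=$(mktemp) || exit 1
printf '%s\n' 'a' 'ab' '' 'done' 'dine' 'dne' > "$file1"

# ^.$ : lines that consist of exactly one character.
single=$(grep '^.$' "$file1")
echo "$single"      # a

# d.ne : "d", any one character, then "ne".
dot_mid=$(grep 'd.ne' "$file1" | paste -s -d ' ' -)
echo "$dot_mid"     # done dine

rm -f "$file1"
```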
#4) ‘[’ range ‘]’ – a range of characters:
This pattern matches any one character from the set specified between the
square brackets.
 Example1: Match all lines that consist of a single digit. E.g. “8”
 Pattern: ‘^[0-9]$’
 Example2: Match all lines that contain any of the letters ‘a’, ‘b’, ‘c’, ‘d’
or ‘e’, listing each letter.
 $ grep '[abcde]' file1
 Example3: Match the same lines, writing the set as a range.
 $ grep '[a-e]' file1
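The sketch below runs the three bracket patterns against an invented sample file and confirms that the listed set and the range select the same lines. `LC_ALL=C` is set as an assumption so the range `a-e` means the five ASCII letters.

```shell
#!/bin/sh
export LC_ALL=C
file1=$(mktemp) || exit 1
printf '%s\n' '8' '42' 'x' 'bed' 'zzz' '7 up' > "$file1"

# ^[0-9]$ : the whole line is one digit.
one_digit=$(grep '^[0-9]$' "$file1")
echo "$one_digit"   # 8

# [abcde] and [a-e] describe the same set of characters.
listed=$(grep '[abcde]' "$file1")
ranged=$(grep '[a-e]' "$file1")
echo "$listed"      # bed
echo "$ranged"      # bed

rm -f "$file1"
```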
#5) ‘[^’ range ‘]’ – a range of characters to be excluded:
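This pattern matches any one character that is not in the set specified
between the square brackets. Because it still has to match a character, a
bare ‘[^0-9]’ finds lines that contain at least one non-digit. To require
that every character on the line is outside the set, anchor the pattern at
both ends and repeat it with ‘*’.
 Example1: Match all lines that do not contain a digit. E.g. “hello”
 Pattern: ‘^[^0-9]*$’
 Example2: Match all lines that do not contain a lowercase vowel
 $ grep '^[^aeiou]*$' file1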
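An excluded set matches one character, so a bare `[^0-9]` selects every line that has at least one non-digit, not only lines free of digits. The sketch below contrasts it with the anchored form on invented sample lines; `grep -v`, which inverts the selection, is shown as an alternative that is not taken from the text above.

```shell
#!/bin/sh
file1=$(mktemp) || exit 1
printf '%s\n' 'hello' 'abc123' '42' 'rhythm' > "$file1"

# [^0-9] alone: lines with at least one character that is not a digit.
has_non_digit=$(grep '[^0-9]' "$file1" | paste -s -d ' ' -)
echo "$has_non_digit"   # hello abc123 rhythm

# ^[^0-9]*$ : every character on the line is a non-digit.
no_digit=$(grep '^[^0-9]*$' "$file1" | paste -s -d ' ' -)
echo "$no_digit"        # hello rhythm

# Alternative: select lines that do NOT match [0-9].
inverted=$(grep -v '[0-9]' "$file1" | paste -s -d ' ' -)
echo "$inverted"        # hello rhythm

rm -f "$file1"
```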
#6) ‘*’ – ‘zero or more’ modifier:
This modifier matches zero or more instances of the preceding character or
character set.
 Example1: Match all lines that contain ‘ha’ followed by zero or more
instances of ‘p’ and then followed by ‘y’. E.g. “happpy” or “hay”
 Pattern: ‘hap*y’
 Example2: Match all lines that start with zero or more spaces followed by a
digit. E.g. “ 1” or “2.”
 $ grep '^ *[0-9]' file1
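The ‘*’ modifier can be checked with the sketch below, which uses invented sample lines. It also counts matches with and without the leading ‘^’, since without the anchor the pattern matches a digit anywhere on the line.

```shell
#!/bin/sh
file1=$(mktemp) || exit 1
printf '%s\n' 'hay' 'hapy' 'happpy' 'hazy' '  1 item' '2. item' 'item 3' > "$file1"

# hap*y : "ha", zero or more "p", then "y".
hap_lines=$(grep 'hap*y' "$file1" | paste -s -d ' ' -)
echo "$hap_lines"    # hay hapy happpy

# ^ *[0-9] : optional leading spaces, then a digit, at the start of the line.
anchored=$(grep -c '^ *[0-9]' "$file1")
echo "$anchored"     # 2

# Without ^ the digit may appear anywhere, so "item 3" matches as well.
unanchored=$(grep -c ' *[0-9]' "$file1")
echo "$unanchored"   # 3

rm -f "$file1"
```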
#7) ‘?’ – ‘zero or one’ modifier:
This modifier matches zero or one instance of the preceding character or
character set. It belongs to extended regular expressions, so grep needs the
-E option; in a basic regular expression ‘?’ is an ordinary character.
 Example1: Match all lines that contain ‘hap’ followed by zero or one
instance of ‘p’ and then followed by ‘y’. E.g. “hapy” or “happy”
 Pattern: ‘happ?y’
 Example2: Match all lines that start with a digit followed by zero or
one ‘:’ characters. E.g. “1” or “2:”
 $ grep -E '^[0-9]:?' file1
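With grep, ‘?’ acts as a modifier only in extended regular expressions (`grep -E`); in the default basic syntax it is an ordinary character. The sketch below shows both behaviours on invented sample lines.

```shell
#!/bin/sh
file1=$(mktemp) || exit 1
printf '%s\n' 'hapy' 'happy' 'happpy' '1' '2: two' 'x1' > "$file1"

# Extended syntax: "hap", an optional "p", then "y".
ere_lines=$(grep -E 'happ?y' "$file1" | paste -s -d ' ' -)
echo "$ere_lines"     # hapy happy

# Basic syntax: ? is a literal question mark, so nothing matches here.
bre_count=$(grep -c 'happ?y' "$file1")
echo "$bre_count"     # 0

# A digit at the start of the line, optionally followed by ":".
digit_start=$(grep -c -E '^[0-9]:?' "$file1")
echo "$digit_start"   # 2

rm -f "$file1"
```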
Metacharacters
Metacharacters consist of non-alphanumeric symbols such as:
. \ | ( ) [ { ^ $ * + ?
To match metacharacters literally in R you need to escape them with a double
backslash “\\”. The following displays the general escape syntax for the
most common metacharacters:
# substitute $ with !
sub(pattern = "\\$", "\\!", "I love R$")
## [1] "I love R!"
# substitute ^ with carrot
sub(pattern = "\\^", "carrot", "My daughter has a ^ with almost every meal!")
## [1] "My daughter has a carrot with almost every meal!"
# substitute \\ with whitespace
gsub(pattern = "\\\\", " ", "I\\need\\space")
## [1] "I need space"
Sequences
To match a sequence of characters we can apply short-hand notation which
captures the fundamental types of sequences. The following displays the
general syntax for these common sequences:
# substitute any digit with an underscore
gsub(pattern = "\\d", "_", "I'm working in RStudio v.0.99.484")
## [1] "I'm working in RStudio v._.__.___"
# substitute any non-digit with an underscore
gsub(pattern = "\\D", "_", "I'm working in RStudio v.0.99.484")
## [1] "_________________________0_99_484"
# substitute any whitespace with an underscore
gsub(pattern = "\\s", "_", "I'm working in RStudio v.0.99.484")
## [1] "I'm_working_in_RStudio_v.0.99.484"
# substitute any word character with an underscore
gsub(pattern = "\\w", "_", "I'm working in RStudio v.0.99.484")
## [1] "_'_ _______ __ _______ _._.__.___"
