
FUNCTIONS

Function definition: A FUNCTION returns a value from a computation or system manipulation that requires zero or more arguments.

Like most programming languages, the SAS System provides an extensive library of "built-in" functions. SAS has more than 190 functions for a variety of programming tasks. This tutorial covers the syntax for invoking functions, an overview of the functions available, examples of commonly used functions, selected character-handling and numeric functions, and some tricks and applications of functions that will surprise you.

Character Functions:
A major strength of SAS is its ability to work with character data. The SAS character functions are essential to this. The collection of functions and call routines in this chapter allows you to do extensive manipulation on all sorts of character data.

Functions That Change the Case of Characters


1) UPCASE  2) LOWCASE  3) PROPCASE

Two old functions, UPCASE and LOWCASE, change the case of characters. A new function (as of Version 9), PROPCASE (proper case), capitalizes the first letter of each word.

1) Function: UPCASE

Purpose: To change all letters to uppercase. Note: The corresponding function LOWCASE changes uppercase to lowercase.

Syntax: UPCASE(character-value)

Where character-value is any SAS character expression. If a length has not been previously assigned, the length of the resulting variable will be the length of the argument.

Program 1:-

Changing lowercase to uppercase for all character variables in a data set

DATA MIXED;
   LENGTH A B C D E $ 1;
   INPUT A B C D E X Y;
DATALINES;
M f P p D 1 2
m f m F M 3 4
;
DATA UPPER;
   SET MIXED;
   ARRAY ALL_C[*] _CHARACTER_;
   DO I = 1 TO DIM(ALL_C);
      ALL_C[I] = UPCASE(ALL_C[I]);
   END;
   DROP I;
RUN;
PROC PRINT DATA=UPPER NOOBS;
   TITLE 'Listing of Data Set UPPER';
RUN;

Explanation:-

Remember that upper- and lowercase values are represented by different internal codes, so if you are testing for a value such as Y for a variable and the actual value is y, you will not get a match. Therefore, it is often useful to convert all character values to either upper- or lowercase before doing your logical comparisons. In this program, _CHARACTER_ is used in the ARRAY statement to represent all the character variables in the data set MIXED. Inspecting the resulting listing verifies that all lowercase values were changed to uppercase.
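The point about internal codes can be checked with a short DATA step. This is a minimal sketch, not part of the original program: the variable ANSWER and the messages written to the log are hypothetical names chosen for illustration.

```sas
DATA _NULL_;
   LENGTH ANSWER $ 1;
   ANSWER = 'y';
   * A direct comparison is case-sensitive and does not match here;
   IF ANSWER = 'Y' THEN PUT 'Direct comparison: match';
   ELSE PUT 'Direct comparison: no match';
   * Converting to uppercase first makes the test case-insensitive;
   IF UPCASE(ANSWER) = 'Y' THEN PUT 'UPCASE comparison: match';
   ELSE PUT 'UPCASE comparison: no match';
RUN;
```

The same idea applies with LOWCASE; what matters is that both sides of the comparison use the same case.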

2) Function: LOWCASE

Purpose: To change all letters to lowercase. Note: The corresponding function UPCASE changes lowercase to uppercase.

Syntax: LOWCASE(character-value)

Where character-value is any SAS character expression.

Program 1:-

Program to capitalize the first letter of the first and last name (using SUBSTR)

DATA CAPITALIZE;
   INFORMAT FIRST LAST $30.;
   INPUT FIRST LAST;
   FIRST = LOWCASE(FIRST);
   LAST = LOWCASE(LAST);
   SUBSTR(FIRST,1,1) = UPCASE(SUBSTR(FIRST,1,1));
   SUBSTR(LAST,1,1) = UPCASE(SUBSTR(LAST,1,1));
DATALINES;
ronald cODy
THomaS eDISON
albert einstein
;
PROC PRINT DATA=CAPITALIZE NOOBS;
   TITLE "Listing of Data Set CAPITALIZE";
RUN;

Explanation:-

This program capitalizes the first letter of the two character variables FIRST and LAST. The same technique could have other applications. The first step is to set all the letters to lowercase using the LOWCASE function. The first letter of each name is then turned back to uppercase: the SUBSTR function on the right side of the equal sign selects the first letter of the first and last names, and the UPCASE function capitalizes it. The SUBSTR function on the left side of the equal sign places this letter in the first position of each of the variables.
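The left-side use of SUBSTR can be isolated in a few lines. The sketch below assumes a single hypothetical variable NAME and writes the result to the log rather than to a data set.

```sas
DATA _NULL_;
   LENGTH NAME $ 10;
   * Start from an all-lowercase value;
   NAME = LOWCASE('cODy');
   * SUBSTR on the left of the equal sign replaces one character in place;
   SUBSTR(NAME,1,1) = UPCASE(SUBSTR(NAME,1,1));
   PUT NAME=;
RUN;
```

Only the first position of NAME is overwritten; the remaining characters keep the lowercase values set by LOWCASE.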

3) Function: PROPCASE

Purpose: To capitalize the first letter of each word in a string.

Syntax: PROPCASE(character-value)

Program:-

Capitalizing the first letter of each word in a string

DATA PROPER;
   INPUT NAME $60.;
   NAME = PROPCASE(NAME);
DATALINES;
ronald cODy
THomaS eDISON
albert einstein
;
PROC PRINT DATA=PROPER NOOBS;
   TITLE "Listing of Data Set PROPER";
RUN;

Explanation:-

In this program, you use the PROPCASE function to capitalize the first letter of the first and last names.

Functions That Remove Characters from Strings

1) COMPBL (compress blanks) can replace multiple blanks with a single blank.
2) COMPRESS can remove not only blanks, but also any characters you specify from a string.

Function: COMPBL

Purpose: To replace all occurrences of two or more blanks with a single blank character. This is particularly useful for standardizing addresses and names where multiple blanks may have been entered.

Syntax: COMPBL(character-value)
Program:-

Using the COMPBL function to convert multiple blanks to a single blank

DATA SQUEEZE;
   INPUT #1 @1 NAME $20.
         #2 @1 ADDRESS $30.
         #3 @1 CITY $15.
            @20 STATE $2.
            @25 ZIP $5.;
   NAME = COMPBL(NAME);
   ADDRESS = COMPBL(ADDRESS);
   CITY = COMPBL(CITY);
DATALINES;
RON    CODY
89   LAZY BROOK ROAD
FLEMINGTON         NJ   08822
BILL     BROWN
28 CATHY     STREET
NORTH CITY         NY   11518
;
PROC PRINT DATA=SQUEEZE;
   TITLE 'Listing of Data Set SQUEEZE';
   ID NAME;
   VAR ADDRESS CITY STATE ZIP;
RUN;

Explanation:-

Each line of the addresses was passed through the COMPBL function to replace any sequence of two or more blanks with a single blank.
Function: COMPRESS

Purpose: To remove specified characters from a character value.

Syntax: COMPRESS(character-value <,'compress-list'>)

Where character-value is any SAS character expression.

compress-list is an optional list of the characters you want to remove. If this argument is omitted, the default character to be removed is a blank. If you include a list of values to remove, only those characters will be removed. If a blank is not included in the list, blanks will not be removed.

Examples:-

In the examples below, CHAR = "A C123XYZ"

Function                               Returns
COMPRESS("A C XYZ")                    "ACXYZ"
COMPRESS("(908) 777-1234"," (-)")      "9087771234"
COMPRESS(CHAR,"0123456789")            "A CXYZ"
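The three table entries can be reproduced in a DATA _NULL_ step. This is only a sketch for checking the examples; the result variables R1 through R3 are hypothetical names, and the values written to the log should agree with the Returns column above.

```sas
DATA _NULL_;
   CHAR = "A C123XYZ";
   * No second argument, so only blanks are removed;
   R1 = COMPRESS("A C XYZ");
   * Blank, parentheses, and dash are all listed for removal;
   R2 = COMPRESS("(908) 777-1234"," (-)");
   * Only digits are listed, so the blank in CHAR is kept;
   R3 = COMPRESS(CHAR,"0123456789");
   PUT R1= / R2= / R3=;
RUN;
```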

Program 1:-

Removing dashes and parentheses from phone numbers

DATA PHONE_NUMBER;
   INPUT PHONE $ 1-15;
   PHONE1 = COMPRESS(PHONE);
   PHONE2 = COMPRESS(PHONE,'(-) ');
DATALINES;
(908)235-4490
(201) 555-77 99
;
PROC PRINT DATA=PHONE_NUMBER;
   TITLE 'Listing of Data Set PHONE_NUMBER';
RUN;

Explanation:-

For the variable PHONE1, the second argument is omitted from the COMPRESS function; therefore, only blanks are removed. For PHONE2, left and right parentheses, dashes, and blanks are listed in the second argument, so all of these characters are removed from the character value.

Program 2:-

Converting social security numbers from character to numeric

DATA SOCIAL;
   INPUT @1 SS_CHAR $11.
         @1 MIKE_ZDEB COMMA11.;
   SS_NUMERIC = INPUT(COMPRESS(SS_CHAR,'-'),9.);
   SS_FORMATTED = SS_NUMERIC;
   FORMAT SS_FORMATTED SSN.;
DATALINES;
123-45-6789
001-11-1111
;
PROC PRINT DATA=SOCIAL NOOBS;
   TITLE "Listing of Data Set SOCIAL";
RUN;

Explanation:-

The COMPRESS function is used to remove the dashes from the social security number, and the INPUT function does the character-to-numeric conversion. Note that the social security number, including dashes, can also be read directly into a numeric variable using the COMMA11. informat, as is done for the variable MIKE_ZDEB. Here the variable SS_FORMATTED is set equal to the variable SS_NUMERIC so that you can see the effect of adding the SSN. format. (Note: SSN. is equivalent to SSN11.) This format prints numeric values with leading zeros and dashes in the proper places.

Functions That Search for Characters

Functions in this category allow you to search a string for specific characters or for a character category (such as a digit). Some of these functions can also locate the first position in a string where a character does not meet a particular specification.

The "ANY" functions (ANYALNUM, ANYALPHA, ANYDIGIT, ANYPUNCT, and ANYSPACE)

This group of functions is described together because of the similarity of their use. New as of Version 9, these functions return the location of the first alphanumeric, letter, digit, punctuation, or space in a character string. Note that there are other "ANY" functions besides those presented here; these are the most common ones (see the SAS OnlineDoc 9.1 for a complete list).

It is important to note that it may be necessary to use the TRIM function (or STRIP function) with the ANY and NOT functions, since leading or, especially, trailing blanks will affect the results. For example, if X = "ABC   " (ABC followed by three blanks), Y = NOTALNUM(X) will be 4, the location of the first blank.
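The effect of trailing blanks described above can be seen side by side in a small step. This is a sketch under the same assumption as the text (a 6-byte variable X holding ABC followed by three blanks); Y_RAW and Y_TRIM are hypothetical names.

```sas
DATA _NULL_;
   LENGTH X $ 6;
   X = "ABC";                      * stored as ABC followed by three blanks;
   Y_RAW  = NOTALNUM(X);           * the first trailing blank is not alphanumeric;
   Y_TRIM = NOTALNUM(TRIM(X));     * TRIM removes the trailing blanks first;
   PUT Y_RAW= Y_TRIM=;
RUN;
```

Y_RAW picks up the first trailing blank in position 4, while Y_TRIM is 0 because every remaining character is alphanumeric.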

Function: ANYALNUM

Purpose: To locate the first occurrence of an alphanumeric character (any upper- or lowercase letter or number) and return its position. If none is found, the function returns a 0. With the use of an optional parameter, this function can begin searching at any position in the string and can also search from right to left, if desired.

Syntax: ANYALNUM(character-value <,start>)

Where character-value is any SAS character expression.

start is an optional parameter that specifies the position in the string to begin the search. If it is omitted, the search starts at the beginning of the string. If it is non-zero, the search begins at the position in the string of the absolute value of the number (starting from the left-most position in the string). If the start value is positive, the search goes from left to right; if the value is negative, the search goes from right to left. A negative value whose absolute value is larger than the length of the string results in a scan from right to left, starting at the end of the string. If the value of start is a positive number larger than the length of the string, or if it is 0, the function returns a 0.

Examples:-

For these examples, STRING = "ABC 123 ?xyz_n_"

Function               Returns
ANYALNUM(STRING)       1   (the position of "A")
ANYALNUM("??$$%%")     0   (no alphanumeric characters)
ANYALNUM(STRING,5)     5   (the position of "1")
ANYALNUM(STRING,-4)    3   (the position of "C")
ANYALNUM(STRING,6)     6   (the position of "2")

Function: ANYALPHA

Purpose: To locate the first occurrence of an alpha character (any upper- or lowercase letter) and return its position. If none is found, the function returns a 0. With the use of an optional parameter, this function can begin searching at any position in the string and can also search from right to left, if desired.

Syntax: ANYALPHA(character-value <,start>)

Examples:-

For these examples, STRING = "ABC 123 ?xyz_n_"

Function               Returns
ANYALPHA(STRING)       1   (position of "A")
ANYALPHA("??$$%%")     0   (no alpha characters)
ANYALPHA(STRING,5)     10  (position of "x")
ANYALPHA(STRING,-4)    3   (position of "C")
ANYALPHA(STRING,6)     10  (position of "x")

Function: ANYDIGIT

Purpose: To locate the first occurrence of a digit (numeral) and return its position. If none is found, the function returns a 0. With the use of an optional parameter, this function can begin searching at any position in the string and can also search from right to left, if desired.

Syntax: ANYDIGIT(character-value <,start>)

Examples:-

For these examples, STRING = "ABC 123 ?xyz_n_"

Function               Returns
ANYDIGIT(STRING)       5   (position of "1")
ANYDIGIT("??$$%%")     0   (no digits)
ANYDIGIT(STRING,5)     5   (position of "1")
ANYDIGIT(STRING,-4)    0   (no digits from position 4 to 1)
ANYDIGIT(STRING,6)     6   (position of "2")

Function: ANYPUNCT

Purpose: To locate the first occurrence of a punctuation character and return its position. If none is found, the function returns a 0. With the use of an optional parameter, this function can begin searching at any position in the string and can also search from right to left, if desired. In the ASCII character set, the following characters are considered punctuation:

! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~

Syntax: ANYPUNCT(character-value <,start>)

Examples:-

For these examples, STRING = "A!C 123 ?xyz_n_"

Function               Returns
ANYPUNCT(STRING)       2   (position of "!")
ANYPUNCT("??$$%%")     1   (position of "?")
ANYPUNCT(STRING,5)     9   (position of "?")
ANYPUNCT(STRING,-4)    2   (starts at position 4 and goes left, position of "!")
ANYPUNCT(STRING,-3)    2   (starts at "C" and goes left, position of "!")

Function: ANYSPACE

Purpose: To locate the first occurrence of a white space character (a blank, horizontal or vertical tab, carriage return, linefeed, or form-feed) and return its position. If none is found, the function returns a 0. With the use of an optional parameter, this function can begin searching at any position in the string and can also search from right to left, if desired.

Syntax: ANYSPACE(character-value <,start>)

Examples:-

For these examples, STRING = "ABC 123 ?xyz_n_"

Function               Returns
ANYSPACE(STRING)       4   (position of the first blank)
ANYSPACE("??$$%%")     0   (no spaces)
ANYSPACE(STRING,5)     8   (position of the second blank)
ANYSPACE(STRING,-4)    4   (position of the first blank)
ANYSPACE(STRING,6)     8   (position of the second blank)

Program 1:-

Demonstrating the "ANY" character functions

DATA ANYWHERE;
   INPUT STRING $CHAR20.;
   ALPHA_NUM = ANYALNUM(STRING);
   ALPHA_NUM_9 = ANYALNUM(STRING,-999);
   ALPHA = ANYALPHA(STRING);
   ALPHA_5 = ANYALPHA(STRING,-5);
   DIGIT = ANYDIGIT(STRING);
   DIGIT_9 = ANYDIGIT(STRING,-999);
   PUNCT = ANYPUNCT(STRING);
   SPACE = ANYSPACE(STRING);
DATALINES;
Once upon a time 123
HELP!
987654321
;
PROC PRINT DATA=ANYWHERE NOOBS HEADING=H;
   TITLE "Listing of Data Set ANYWHERE";
RUN;

Explanation:-

Each of these "ANY" functions works in a similar manner, the only difference being the type of character value it searches for. The two statements using a starting value of -999 demonstrate an easy way to search from right to left without having to know the length of the string (assuming that you don't have any strings longer than 999, in which case you could choose a larger number). Functions such as ANYALPHA and ANYDIGIT can be very useful for extracting values from strings where the positions of digits or letters are not fixed.
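The right-to-left search with a large negative start value can be tried on a single literal. This sketch assumes a hypothetical 7-character value with no trailing blanks; LAST_DIGIT and LAST_ALPHA are illustrative names.

```sas
DATA _NULL_;
   STRING = 'ABC 123';
   * A start of -999 exceeds the string length, so the scan begins at the last character;
   LAST_DIGIT = ANYDIGIT(STRING,-999);
   LAST_ALPHA = ANYALPHA(STRING,-999);
   PUT LAST_DIGIT= LAST_ALPHA=;
RUN;
```

Because the scan moves leftward, each function reports the position of the last matching character in the string rather than the first. With a fixed-length variable padded by blanks, the scan still works, since trailing blanks are neither digits nor letters.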

Program 2:-

Using the functions ANYDIGIT and ANYSPACE to find the first number in a string

DATA SEARCH_NUM;
   INPUT STRING $60.;
   START = ANYDIGIT(STRING);
   END = ANYSPACE(STRING,START);
   IF START NE 0 THEN
      NUM = INPUT(SUBSTR(STRING,START,END-START),9.);
DATALINES;
This line has a 56 in it
two numbers 123 and 456 in this line
No digits here
;
PROC PRINT DATA=SEARCH_NUM NOOBS;
   TITLE "Listing of Data Set SEARCH_NUM";
RUN;

Explanation:-

This program identifies the first number in any line of data that contains a numeric value (followed by one or more blanks). The ANYDIGIT function determines the position of the first digit of the number; the ANYSPACE function searches for the first blank following the number (the starting position of this search is the position of the first digit). The SUBSTR function extracts the digits (starting at the value of START, with a length determined by the difference between END and START). Finally, the INPUT function performs the character-to-numeric conversion.

The "NOT" functions (NOTALNUM, NOTALPHA, NOTDIGIT, and NOTUPPER)

This group of functions is similar to the "ANY" functions (such as ANYALNUM, ANYALPHA, etc.), except that each function returns the position of the first character that is not of a particular type (alphanumeric, alphabetic, digit, or uppercase character). Note that this is not a complete list of the "NOT" functions. As with the "ANY" functions, there is an optional parameter that specifies where to start the search and in which direction to search.

Function: NOTALNUM

Purpose: To determine the position of the first character in a string that is not alphanumeric (any upper- or lowercase letter or a number). If none is found, the function returns a 0. With the use of an optional parameter, this function can begin searching at any position in the string and can also search from right to left, if desired.

Syntax: NOTALNUM(character-value <,start>)

Where character-value is any SAS character expression.

start is an optional parameter that specifies the position in the string to begin the search. If it is omitted, the search starts at the beginning of the string. If it is non-zero, the search begins at the position in the string of the absolute value of the number (starting from the left-most position in the string). If the start value is positive, the search goes from left to right; if the value is negative, the search goes from right to left. A negative value whose absolute value is larger than the length of the string results in a scan from right to left, starting at the end of the string. If the value of start is a positive number larger than the length of the string, or if it is 0, the function returns a 0.

Examples:-

For these examples, STRING = "ABC 123 ?xyz_n_"

Function                 Returns
NOTALNUM(STRING)         4   (position of the 1st blank)
NOTALNUM("Testing123")   0   (all alphanumeric values)
NOTALNUM("??$$%%")       1   (position of the "?")
NOTALNUM(STRING,5)       8   (position of the 2nd blank)
NOTALNUM(STRING,-6)      4   (position of the 1st blank)
NOTALNUM(STRING,8)       8   (position of the 2nd blank, where the search starts)

Function: NOTALPHA

Purpose: To determine the position of the first character in a string that is not an upper- or lowercase letter (alpha character). If none is found, the function returns a 0. With the use of an optional parameter, this function can begin searching at any position in the string and can also search from right to left, if desired.

Syntax: NOTALPHA(character-value <,start>)

Examples:-

For these examples, STRING = "ABC 123 ?xyz_n_"

Function                 Returns
NOTALPHA(STRING)         4   (position of 1st blank)
NOTALPHA("ABCabc")       0   (all alpha characters)
NOTALPHA("??$$%%")       1   (position of first "?")
NOTALPHA(STRING,5)       5   (position of "1")
NOTALPHA(STRING,-10)     9   (start at position 10 and search left, position of "?")
NOTALPHA(STRING,2)       4   (position of 1st blank)

Function: NOTDIGIT

Purpose: To determine the position of the first character in a string that is not a digit. If none is found, the function returns a 0. With the use of an optional parameter, this function can begin searching at any position in the string and can also search from right to left, if desired.

Syntax: NOTDIGIT(character-value <,start>)

Examples:-

For these examples, STRING = "ABC 123 ?xyz_n_"

Function                 Returns
NOTDIGIT(STRING)         1   (position of "A")
NOTDIGIT("123456")       0   (all digits)
NOTDIGIT("??$$%%")       1   (position of "?")
NOTDIGIT(STRING,5)       8   (position of 2nd blank)
NOTDIGIT(STRING,-6)      4   (position of 1st blank)
NOTDIGIT(STRING,6)       8   (position of 2nd blank)

Function: NOTUPPER

Purpose: To determine the position of the first character in a string that is not an uppercase letter. If none is found, the function returns a 0. With the use of an optional parameter, this function can begin searching at any position in the string and can also search from right to left, if desired.

Syntax: NOTUPPER(character-value <,start>)

Examples:-

For these examples, STRING = "ABC 123 ?xyz_n_"

Function                 Returns
NOTUPPER("ABCDabcd")     5   (position of "a")
NOTUPPER("ABCDEFG")      0   (all uppercase characters)
NOTUPPER(STRING)         4   (position of 1st blank)
NOTUPPER("??$$%%")       1   (position of "?")
NOTUPPER(STRING,5)       5   (position of "1")
NOTUPPER(STRING,-6)      6   (position of "2")
NOTUPPER(STRING,6)       6   (position of "2")

Program 1:-

Demonstrating the "NOT" character functions

DATA NEGATIVE;
   INPUT STRING $5.;
   NOT_ALPHA_NUMERIC = NOTALNUM(STRING);
   NOT_ALPHA = NOTALPHA(STRING);
   NOT_DIGIT = NOTDIGIT(STRING);
   NOT_UPPER = NOTUPPER(STRING);
DATALINES;
ABCDE
abcde
abcDE
12345
:#$%&
ABC
;
PROC PRINT DATA=NEGATIVE NOOBS;
   TITLE "Listing of Data Set NEGATIVE";
RUN;

Explanation:-

This straightforward program demonstrates each of the "NOT" character functions. As with most character functions, be careful with trailing blanks. Notice that the last observation ("ABC") contains only three characters, but since STRING is read with a $5. informat, there are two trailing blanks following the letters 'ABC'. That is the reason you obtain a value of 4 for all the functions except NOTDIGIT, which returns a 1 (the first character is not a digit).

FIND and FINDC

This pair of functions shares some similarities with the INDEX and INDEXC functions. FIND and INDEX both search a string for a given substring. FINDC and INDEXC both search for individual characters. However, both FIND and FINDC have some additional capability over their counterparts. For example, this pair of functions can declare a starting position for the search, set the direction of the search, and ignore case or trailing blanks.

Function: FIND

Purpose: To locate a substring within a string. With optional arguments, you can define the starting point for the search, set the direction of the search, and ignore case or trailing blanks.

Syntax: FIND(character-value, find-string <,'modifiers'> <,start>)

Where character-value is any SAS character expression.

find-string is a character variable or string literal that contains one or more characters that you want to search for. The function returns the first position in the character-value that contains the find-string. If the find-string is not found, the function returns a 0.

The following modifiers (in upper- or lowercase), placed in single or double quotation marks, may be used with FIND:

i   ignore case.
t   ignore trailing blanks in both the character variable and the find-string.

Examples:-

For these examples, STRING1 = "Hello hello goodbye" and STRING2 = "hello"

Function                           Returns
FIND(STRING1, STRING2)             7
FIND(STRING1, STRING2, 'I')        1
FIND(STRING1,"bye")                17
FIND("abcxyzabc","abc",4)          7
FIND(STRING1, STRING2, "i", -99)   7
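The interaction of the modifier and start arguments can be tried out with the same two strings. This is a minimal sketch for checking the table; P1 through P3 are hypothetical variable names, and the values written to the log should agree with the corresponding rows above.

```sas
DATA _NULL_;
   STRING1 = "Hello hello goodbye";
   STRING2 = "hello";
   P1 = FIND(STRING1, STRING2);            * case-sensitive search from the left;
   P2 = FIND(STRING1, STRING2, 'I');       * the I modifier ignores case;
   P3 = FIND(STRING1, STRING2, 'I', -99);  * a negative start searches right to left;
   PUT P1= P2= P3=;
RUN;
```

With the I modifier and a left-to-right search, the capitalized "Hello" at the start of the string matches first; searching from the right reaches the lowercase "hello" first.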

Function: FINDC

Purpose: To locate a character that appears or does not appear within a string. With optional arguments, you can define the starting point for the search, set the direction of the search, ignore case or trailing blanks, or look for characters other than the ones listed.

Syntax: FINDC(character-value, find-characters <,'modifiers'> <,start>)

Where character-value is any SAS character expression.

find-characters is a list of one or more characters that you want to search for. The function returns the first position in the character-value that contains one of the find-characters. If none of the characters are found, the function returns a 0. With an optional argument, you can have the function return the position in a character string of a character that is not in the find-characters list.

modifiers (in upper- or lowercase), placed in single or double quotation marks, may be used with FINDC as follows:

i   ignore case.
t   ignore trailing blanks in both the character variable and the find-characters.
v   count only characters that are not in the list of find-characters.
o   process the modifiers and find-characters only once for a specific call to the function. In subsequent calls, changes to these arguments will have no effect.

Examples:-

For these examples, STRING1 = "Apples and Books" and STRING2 = "abcde"

Function                           Returns
FINDC(STRING1, STRING2)            5
FINDC(STRING1, STRING2, 'i')       1
FINDC(STRING1,"aple",'vi')         6
FINDC("abcxyzabc","abc",4)         7

Program:-

Using the FIND and FINDC functions to search for strings and characters

DATA FIND_VOWEL;
   INPUT @1 STRING $20.;
   PEAR = FIND(STRING,"Pear");
   POS_VOWEL = FINDC(STRING,"aeiou",'I');
   UPPER_VOWEL = FINDC(STRING,"aeiou");
   NOT_VOWEL = FINDC(STRING,"AEIOU",'IV');
DATALINES;
XYZABCabc
XYZ
Apple and Pear
;
PROC PRINT DATA=FIND_VOWEL NOOBS;
   TITLE "Listing of Data Set FIND_VOWEL";
RUN;

Explanation:-

The FIND function returns the position of the characters "Pear" in the variable STRING. Since the i modifier is not used, the search is case-sensitive. The first use of the FINDC function looks for any upper- or lowercase vowel in the string (because of the i modifier). The next statement, without the i modifier, locates only lowercase vowels. Finally, the v modifier in the last FINDC function reverses the search to look for the first character that is not a vowel (upper- or lowercase, because of the i modifier).

INDEX, INDEXC, and INDEXW

This group of functions all search a string for a substring of one or more characters. INDEX and INDEXW are similar, the difference being that INDEXW looks for a word (defined as a string bounded by spaces or the beginning or end of the string), while INDEX simply searches for the designated substring. INDEXC searches for one or more individual characters and always searches from left to right. Note that these three functions are all case-sensitive.

Function: INDEX

Purpose: To locate the starting position of a substring in a string.

Syntax: INDEX(character-value, find-string)

Where character-value is any SAS character expression.

find-string is a character variable or string literal that contains the substring for which you want to search. The function returns the first position in the character-value that contains the find-string. If the find-string is not found, the function returns a 0.

Examples:-

For these examples, STRING = "ABCDEFG"

Function               Returns
INDEX(STRING,'C')      3   (the position of the 'C')
INDEX(STRING,'DEF')    4   (the position of the 'D')
INDEX(STRING,'X')      0   (no "X" in the string)
INDEX(STRING,'ACE')    0   (no "ACE" in the string)

Program:-

Converting numeric values of mixed units (e.g., kg and lbs) to a single numeric quantity

DATA HEAVY;
   INPUT CHAR_WT $ @@;
   WEIGHT = INPUT(COMPRESS(CHAR_WT,'KG'),8.);
   IF INDEX(CHAR_WT,'K') NE 0 THEN WEIGHT = 2.22 * WEIGHT;
   WEIGHT = ROUND(WEIGHT);
   DROP CHAR_WT;
DATALINES;
60KG 155 82KG 54KG 98
;
PROC PRINT DATA=HEAVY NOOBS;
   TITLE "Listing of Data Set HEAVY";
   VAR WEIGHT;
RUN;

Explanation:-

The data lines contain numbers in kilograms, followed by the abbreviation KG, or in pounds (no units used). As with most problems of this type, when you are reading a combination of numbers and characters, you usually need to first read the value as a character. Here the COMPRESS function is used to remove the letters KG from the character value. The INPUT function does its usual job of character-to-numeric conversion. If the INDEX function returns any value other than 0, the letter K was found in the string, and the WEIGHT value is converted from kilograms to pounds. Finally, the value is rounded to the nearest pound using the ROUND function.

Function: INDEXC

Purpose: To search a character string for one or more characters. The INDEXC function works in a similar manner to the INDEX function, with the difference being that it can be used to search for any one in a list of character values.

Syntax: INDEXC(character-value, 'char1','char2','char3', ...)
        or
        INDEXC(character-value, 'char1char2char3. . .')

Where character-value is any SAS character expression.

char1, char2, ... are individual character values that you wish to search for in the character-value. The INDEXC function returns the first occurrence of any of the char1, char2, etc. values in the string. If none of the characters is found, the function returns a 0.

Program:-

Searching for one of several characters in a character variable

DATA CHECK;
   INPUT TAG_NUMBER $ @@;
   ***If the tag number contains an X, Y, or Z, it indicates an
      international destination, otherwise, the destination is domestic;
   IF INDEXC(TAG_NUMBER,'X','Y','Z') GT 0 THEN
      DESTINATION = 'INTERNATIONAL';
   ELSE DESTINATION = 'DOMESTIC';
DATALINES;
T123 TY333 1357Z UZYX 888 ABC
;
PROC PRINT DATA=CHECK NOOBS;
   TITLE "Listing of Data Set CHECK";
   ID TAG_NUMBER;
   VAR DESTINATION;
RUN;

Explanation:-

Rather than use three statements with the INDEX function, you can use the INDEXC function, which allows you to check for any one of a number of character values. Here, if an X, Y, or Z is found in the variable TAG_NUMBER, the function returns a number greater than 0, and DESTINATION will be set to INTERNATIONAL.

Program:-

Reading dates in a mixture of formats

DATA MIXED_DATES;
   INPUT @1 DUMMY $15.;
   IF INDEXC(DUMMY,'/-:') NE 0 THEN DATE = INPUT(DUMMY,MMDDYY10.);
   ELSE DATE = INPUT(DUMMY,DATE9.);
   FORMAT DATE WORDDATE.;
   DROP DUMMY;
DATALINES;
10/21/1946
06JUN2002
5-10-1950
7:9:57
;
PROC PRINT DATA=MIXED_DATES NOOBS;
   TITLE "Listing of Data Set MIXED_DATES";
   VAR DATE;
RUN;

Explanation:-

In this somewhat trumped-up example, dates are entered either in mm/dd/yyyy or ddMONyyyy form. Also, besides a slash, dashes and colons are used as separators. Any string that includes a slash, dash, or colon is a date that needs the MMDDYY10. informat. Otherwise, the DATE9. informat is used.

Function: INDEXW

Purpose: To search a string for a word, defined as a group of letters separated on both ends by a word boundary (a space, the beginning of a string, or the end of the string). Note that punctuation is not considered a word boundary.

Syntax: INDEXW(character-value, find-string)

Where character-value is any SAS character expression.

find-string is the word for which you want to search.

The function returns the first position in the character-value that contains the find-string. If the find-string is not found, the function returns a 0.

Examples:-

For these examples, STRING1 = "there is a the here" and STRING2 = "end in the."

Function                  Result
INDEXW(STRING1,"the")     12  (the word "the")
INDEXW("ABABAB","AB")     0   (no word boundaries around "AB")
INDEXW(STRING1,"er")      0   (not a word)
INDEXW(STRING2,"the")     0   (punctuation is not a word boundary)

Program:-

Searching for a word using the INDEXW function

DATA FIND_WORD;
   INPUT STRING $40.;
   POSITION_W = INDEXW(STRING,"the");
   POSITION = INDEX(STRING,"the");
DATALINES;
there is a the in this line
ends in the
ends in the.
none here
;
PROC PRINT DATA=FIND_WORD;
   TITLE "Listing of Data Set FIND_WORD";
RUN;

Explanation:-

This program demonstrates the difference between INDEX and INDEXW. Notice that in the first observation, the INDEX function returns a 1 because the letters "the", as part of the word "there", begin the string. Since the INDEXW function needs white space or the beginning or end of the string to delimit a word, it returns a 12, the position of the word "the" in the string. Observation 3 emphasizes the fact that a punctuation mark does not serve as a word separator. Finally, since the string "the" does not appear anywhere in the fourth observation, both functions return a 0.

Function: VERIFY

Purpose: To check if a string contains any unwanted values.

Syntax: VERIFY(character-value, verify-string)

Where character-value is any SAS character expression.

verify-string is a SAS character variable or a list of character values in quotation marks.

This function returns the first position in the character-value that is not present in the verify-string. If the character-value does not contain any characters other than those in the verify-string, the function returns a 0. Be especially careful to think about trailing blanks when using this function. If you have an 8-byte character variable equal to 'ABC' (followed by five blanks), and the verify-string is equal to 'ABC', the VERIFY function returns a 4, the position of the first blank (which is not present in the verify-string). Therefore, you may need to use the TRIM function on the character-value, the verify-string, or both.

Examples:-

For these examples, STRING = "ABCXABD" and V = "ABCDE"

Function                        Returns
VERIFY(STRING,V)                4   ("X" is not in the verify string)
VERIFY(STRING,"ABCDEXYZ")       0   (no "bad" characters in STRING)
VERIFY(STRING,"ACD")            2   (position of the "B")
VERIFY("ABC ","ABC")            4   (position of the 1st blank)
VERIFY(TRIM("ABC "),"ABC")      0   (no invalid characters)
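The trailing-blank caution can be reproduced with the 8-byte case described above. This is a sketch only; CODE, P_RAW, and P_TRIM are hypothetical names.

```sas
DATA _NULL_;
   LENGTH CODE $ 8;
   CODE = 'ABC';                          * padded with five trailing blanks;
   P_RAW  = VERIFY(CODE,'ABC');           * the blank in position 4 is not in the list;
   P_TRIM = VERIFY(TRIM(CODE),'ABC');     * trailing blanks removed before the check;
   PUT P_RAW= P_TRIM=;
RUN;
```

An alternative, if blanks are acceptable values, is to include a blank in the verify-string instead of trimming.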

Program:-

Using the VERIFY function to check for invalid character data values

DATA VERY_FI;
   INPUT ID $ 1-3
         ANSWER $ 5-9;
   P = VERIFY(ANSWER,'ABCDE');
   OK = P EQ 0;
DATALINES;
001 ACBED
002 ABXDE
003 12CCE
004 ABC E
;
PROC PRINT DATA=VERY_FI NOOBS;
   TITLE "Listing of Data Set VERY_FI";
RUN;

Explanation:-

In this example, the only valid values for ANSWER are the uppercase letters A through E. Any time there are one or more invalid values, the result of the VERIFY function (variable P) will be a number from 1 to 5. The SAS statement that computes the value of the variable OK needs a word of explanation. First, the logical comparison P EQ 0 returns a value of true or false, which is equivalent to a 1 or 0. This value is then assigned to the variable OK. Thus, the variable OK is set to 1 for all valid values of ANSWER and to 0 for any invalid values.
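The assignment of a logical comparison to a variable can be seen in isolation. In this sketch, P is set by hand to hypothetical values rather than computed by VERIFY.

```sas
DATA _NULL_;
   P = 0;
   OK = P EQ 0;          * the comparison is true, so OK is 1;
   PUT P= OK=;
   P = 3;
   OK = P EQ 0;          * the comparison is false, so OK is 0;
   PUT P= OK=;
RUN;
```

This form is equivalent to an IF-THEN-ELSE pair that sets OK to 1 or 0, just more compact.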

Functions That Extract Parts of Strings

The functions described in this section can extract parts of strings. When used on the left-hand side of the equal sign, the SUBSTR function can also be used to insert characters into specific positions of an existing string.

Function: SUBSTR

Purpose: To extract part of a string. When the SUBSTR function is used on the left side of the equal sign, it can place specified characters into an existing string.

Syntax: SUBSTR(character-value, start <,length>)

Where character-value is any SAS character expression.

start is the starting position within the string.

length, if specified, is the number of characters to include in the substring. If this argument is omitted, the SUBSTR function will return all the characters from the start position to the end of the string.

If a length has not been previously assigned, the length of the resulting variable will be the length of the character-value.

Examples:-

For these examples, let STRING = "ABC123XYZ"

Function                          Returns
SUBSTR(STRING,4,2)                "12"
SUBSTR(STRING,4)                  "123XYZ"
SUBSTR(STRING,LENGTH(STRING))     "Z"   (last character in the string)

Program:-

Extracting portions of a character value and creating a character variable and a numeric value

DATA SUBSTRING;
INPUT ID $ 1-9;
LENGTH STATE $ 2;
STATE = SUBSTR(ID,1,2);
NUM = INPUT(SUBSTR(ID,7,3),3.);
DATALINES;
NYXXXX123
NJ1234567
;
PROC PRINT DATA=SUBSTRING NOOBS;
TITLE 'Listing of Data Set SUBSTRING';
RUN;

Explanation:-
In this example, the ID contains both state and number information. The first two characters of the ID variable contain the state abbreviations, and the last three characters represent numerals that you want to use to create a numeric variable. Extracting the state codes is straightforward. To obtain a numeric value from the last 3 bytes of the ID variable, it is necessary to first use the SUBSTR function to extract the three characters of interest and then use the INPUT function to do the character-to-numeric conversion.

Program:-

Extracting the last two characters from a string, regardless of the length

DATA EXTRACT;
INPUT @1 STRING $20.;
LAST_TWO = SUBSTR(STRING,LENGTH(STRING)-1,2);
DATALINES;
ABCDE
AX12345NY
126789
;
PROC PRINT DATA=EXTRACT NOOBS;
TITLE "Listing of Data Set EXTRACT";
VAR STRING LAST_TWO;
RUN;

Explanation:-
This program demonstrates how you can use the LENGTH and SUBSTR functions together to extract portions of a string when the strings are of different or unknown lengths. To see how this program works, take a look at the first line of data. The LENGTH function will return a 5, and (5 - 1) = 4, the position of the next to the last (penultimate) character in STRING.
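One edge case may be worth a sketch: for a one-character string, LENGTH(STRING) - 1 is 0, which is not a valid start position for SUBSTR. The SUBSTRN function, described later in this section, accepts a start position of 0, so it can stand in here. The data set name LAST_TWO_SAFE and the sample data are assumptions made for illustration.

```sas
/* Sketch: last two characters, tolerating one-character strings. */
/* Data set name and data lines are illustrative assumptions.     */
DATA LAST_TWO_SAFE;
   INPUT @1 STRING $20.;
   LENGTH LAST_TWO $ 2;
   /* For a one-character string the start position is 0;  */
   /* SUBSTRN accepts that without an invalid-argument note */
   LAST_TWO = SUBSTRN(STRING, LENGTH(STRING) - 1, 2);
   DATALINES;
ABCDE
Z
;
PROC PRINT DATA=LAST_TWO_SAFE NOOBS;
   TITLE "Listing of Data Set LAST_TWO_SAFE";
RUN;
```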

Program:-

Using the SUBSTR function to "unpack" a string

DATA PACK;
INPUT STRING $ 1-5;
DATALINES;
12345
8 642
;
DATA UNPACK;
SET PACK;
ARRAY X[5];
DO J = 1 TO 5;
X[J] = INPUT(SUBSTR(STRING,J,1),1.);
END;
DROP J;
RUN;
PROC PRINT DATA=UNPACK NOOBS;
TITLE "Listing of Data Set UNPACK";
RUN;

Explanation:-
There are times when you want to store a group of one-digit numbers in a compact, space-saving way. In this example, you want to store five one-digit numbers. If you stored each one as an 8-byte numeric, you would need 40 bytes of storage for each observation. By storing the five numbers as a 5-byte character string, you need only 5 bytes of storage. However, you need to use CPU time to turn the character string back into the five numbers. The key here is to use the SUBSTR function with the starting value as the index of a DO loop. As you pick off each of the numerals, you can use the INPUT function to do the character-to-numeric conversion. Notice that the ARRAY statement in this program does not include a list of variables, so SAS creates the variables X1 through X5 automatically.
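The reverse operation, packing five one-digit numbers back into a 5-byte string, can be sketched with the CATS function (covered later in this section). This is an illustration rather than part of the original program; the data set name REPACK and the data lines are assumptions. Note that a missing numeric value would appear as a period in the packed string, so the sketch uses only non-missing digits.

```sas
/* Sketch: pack five one-digit numbers into a 5-byte character string. */
/* REPACK and the data lines are illustrative assumptions.             */
DATA REPACK;
   INPUT X1-X5;
   LENGTH PACKED $ 5;
   /* CATS converts each number to text and strips the blanks */
   PACKED = CATS(OF X1-X5);
   DATALINES;
1 2 3 4 5
9 0 7 0 1
;
PROC PRINT DATA=REPACK NOOBS;
   TITLE "Listing of Data Set REPACK";
RUN;
```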

Function: SUBSTR (on the left-hand side of the equal sign)

As we mentioned in the description of the SUBSTR function, there is an interesting and useful way it can be used: on the left-hand side of the equal sign.
Purpose: To place one or more characters into an existing string.
Syntax: SUBSTR(character-value, start <,length>) = character-value

Examples:-
In these examples, EXISTING = "ABCDEFGH", NEW = "XY"

Function                             Returns
SUBSTR(EXISTING,3,2) = NEW           EXISTING is now = "ABXYEFGH"
SUBSTR(EXISTING,3,1) = "*"           EXISTING is now = "AB*DEFGH"

Program:-

Demonstrating the SUBSTR function on the left-hand side of the equal sign

DATA STARS;
INPUT SBP DBP @@;
LENGTH SBP_CHK DBP_CHK $ 4;
SBP_CHK = PUT(SBP,3.);
DBP_CHK = PUT(DBP,3.);
IF SBP GT 160 THEN SUBSTR(SBP_CHK,4,1) = '*';
IF DBP GT 90 THEN SUBSTR(DBP_CHK,4,1) = '*';
DATALINES;
120 80 180 92 200 110
;
PROC PRINT DATA=STARS NOOBS;
TITLE "Listing of Data Set STARS";
RUN;

Explanation:-
In this program, you want to "flag" high values of systolic and diastolic blood pressure by placing an asterisk after the value. Notice that the variables SBP_CHK and DBP_CHK are both assigned a length of 4 by the LENGTH statement. The fourth position needs to be there in case you want to place an asterisk in that position to flag the value as abnormal. The PUT function places the numerals of the blood pressures into the first 3 bytes of the corresponding character variables. Then, if the value is above the specified level, an asterisk is placed in the fourth position of these variables.
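The same left-hand-side technique can overwrite several positions at once. As a minimal sketch (the variable name ID and its value are invented for illustration), the following masks the first five characters of an identifier while leaving the rest untouched:

```sas
/* Sketch: overwrite positions 1-5 of an existing string in place. */
/* ID and its value are illustrative assumptions.                  */
DATA _NULL_;
   ID = '123456789';
   SUBSTR(ID, 1, 5) = 'XXXXX';   /* replaces positions 1 through 5 */
   PUT ID=;                      /* ID=XXXXX6789                   */
RUN;
```

The storage length of ID does not change; only the characters in the named positions are replaced.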

Function: SUBSTRN

Purpose: This function serves the same purpose as the SUBSTR function, with a few added features. Unlike the SUBSTR function, the starting position and the length arguments of the SUBSTRN function can be 0 or negative without causing an error. In particular, if the length is 0, the function returns a string of 0 length. This is particularly useful when you are using regular expression functions, where the length parameter may be 0 when a pattern is not found.
Syntax: SUBSTRN(character-value, start <,length>)
character-value is any SAS character variable.
start is the starting position in the string. If this value is non-positive, the function returns a substring starting from the first character in character-value (the length of the substring is computed by counting from the value of start).
length is the number of characters in the substring. If this value is non-positive (in particular, 0), the function returns a string of length 0. If this argument is omitted, the SUBSTRN function will return all the characters from the start position to the end of the string.

Examples:-
For these examples, STRING = "ABCDE"

Function                             Returns
SUBSTRN(STRING,2,3)                  "BCD"
SUBSTRN(STRING,-1,4)                 "AB"
SUBSTRN(STRING,4,5)                  "DE"
SUBSTRN(STRING,3,0)                  string of zero length

Program:-

Demonstrating the unique features of the SUBSTRN function

DATA HOAGIE;
STRING = 'ABCDEFGHIJ';
LENGTH RESULT $ 5;
RESULT = SUBSTRN(STRING,2,5);
SUB1 = SUBSTRN(STRING,-1,4);
SUB2 = SUBSTRN(STRING,3,0);
SUB3 = SUBSTRN(STRING,7,5);
SUB4 = SUBSTRN(STRING,0,2);
FILE PRINT;
TITLE "Demonstrating the SUBSTRN Function";
PUT "Original String =" @25 STRING /
"SUBSTRN(STRING,2,5) =" @25 RESULT /
"SUBSTRN(STRING,-1,4) =" @25 SUB1 /
"SUBSTRN(STRING,3,0) =" @25 SUB2 /
"SUBSTRN(STRING,7,5) =" @25 SUB3 /
"SUBSTRN(STRING,0,2) =" @25 SUB4;
RUN;

Explanation:-
In data set HOAGIE (sub-strings, get it?), the storage lengths of the variables SUB1 through SUB4 are all equal to the length of STRING (which is 10). Since a LENGTH statement was used to define the length of RESULT, it has a length of 5.
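The remark about regular expression functions can be illustrated with a short sketch. PRXPARSE and CALL PRXSUBSTR are SAS regular expression routines that are not covered in this section; the pattern, the variable names (RE, START, LEN, DIGITS), and the data lines are assumptions chosen for illustration. When the pattern is not found, CALL PRXSUBSTR sets both the position and the length to 0, and SUBSTRN then returns a zero-length string instead of raising an invalid-argument note.

```sas
/* Sketch: SUBSTRN paired with CALL PRXSUBSTR.              */
/* Pattern, names, and data are illustrative assumptions.   */
DATA _NULL_;
   LENGTH DIGITS $ 20;
   RETAIN RE;
   INPUT TEXT $CHAR20.;
   /* Compile the pattern (one or more digits) once */
   IF _N_ = 1 THEN RE = PRXPARSE('/\d+/');
   /* START and LEN are both 0 when no digits are found */
   CALL PRXSUBSTR(RE, TEXT, START, LEN);
   DIGITS = SUBSTRN(TEXT, START, LEN);
   PUT TEXT= START= LEN= DIGITS=;
   DATALINES;
ABC123XYZ
NODIGITS
;
```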

Functions That Join Two or More Strings Together

There are three call routines and four functions that concatenate character strings. Although you can use the || concatenation operator in combination with the STRIP, TRIM, or LEFT functions, these routines and functions make it much easier to put strings together and, if you wish, to place one or more separator characters between the strings. The three call routines are discussed first, followed by the four concatenation functions.

Call Routines

These three call routines concatenate two or more strings. Note that there are four concatenation functions as well (CAT, CATS, CATT, and CATX). The differences among these routines involve the handling of leading and/or trailing blanks as well as spacing between the concatenated strings. The traditional concatenation operator (||) is still useful, but it sometimes takes extra work to strip leading and trailing blanks (LEFT and TRIM functions, or the new STRIP function) before performing the concatenation.

Function: CALL CATS
Purpose: To concatenate two or more strings, removing both leading and trailing blanks before the concatenation takes place. To help you remember that this call routine is the one that strips the leading and trailing blanks before concatenation, think of the S at the end of CATS as "strip blanks."
Note: To call our three cats, I usually just whistle loudly.
Syntax: CALL CATS(result, string-1 <,string-n>)
where
result is the concatenated string. It can be a new variable or, if it is an existing variable, the other strings will be added to it. Be sure that the length of result is long enough to hold the concatenated results. If not, the resulting string will be truncated, and you will see an error message in the log.
string-1 and string-n are the character strings to be concatenated. Leading and trailing blanks will be stripped prior to the concatenation.

Examples:-
For these examples,
A = "Bilbo" (no blanks)
B = "   Frodo" (leading blanks)
C = "Hobbit   " (trailing blanks)
D = "   Gandalf   " (leading and trailing blanks)

Function                             Returns
CALL CATS(RESULT, A, B)              "BilboFrodo"
CALL CATS(RESULT, B, C, D)           "FrodoHobbitGandalf"
CALL CATS(RESULT, "Hello", D)        "HelloGandalf"

Function: CALL CATT

Purpose: To concatenate two or more strings, removing only trailing blanks before the concatenation takes place. To help you remember this, think of the T at the end of CATT as "trailing blanks" or "trim blanks."
Syntax: CALL CATT(result, string-1 <,string-n>)
result is the concatenated string. It can be a new variable or, if it is an existing variable, the other strings will be added to it. Be sure that the length of result is long enough to hold the concatenated results. If not, the resulting string will be truncated, and you will see a message in the log.
string-1 and string-n are the character strings to be concatenated.

Examples:-
For these examples,
A = "Bilbo" (no blanks)
B = "   Frodo" (leading blanks)
C = "Hobbit   " (trailing blanks)
D = "   Gandalf   " (leading and trailing blanks)

Function                             Returns
CALL CATT(RESULT, A, B)              "Bilbo   Frodo"
CALL CATT(RESULT, B, C, D)           "   FrodoHobbit   Gandalf"
CALL CATT(RESULT, "Hello", D)        "Hello   Gandalf"

Function: CALL CATX

Purpose: To concatenate two or more strings, removing both leading and trailing blanks before the concatenation takes place, and place a single space, or one or more characters of your choice, between each of the strings. To help you remember this, think of the X at the end of CATX as "add eXtra blank."
Syntax: CALL CATX(separator, result, string-1 <,string-n>)
where
separator is one or more characters, placed in single or double quotation marks, that you want to use to separate the strings.

Examples:-
For these examples,
A = "Bilbo" (no blanks)
B = "   Frodo" (leading blanks)
C = "Hobbit   " (trailing blanks)
D = "   Gandalf   " (leading and trailing blanks)

Function                                 Returns
CALL CATX(" ", RESULT, A, B)             "Bilbo Frodo"
CALL CATX(",", RESULT, B, C, D)          "Frodo,Hobbit,Gandalf"
CALL CATX(":", RESULT, "Hello", D)       "Hello:Gandalf"
CALL CATX(", ", RESULT, "Hello", D)      "Hello, Gandalf"
CALL CATX("***", RESULT, A, B)           "Bilbo***Frodo"

Program:-

Demonstrating the three concatenation call routines

DATA CALL_CAT;
STRING1 = "ABC"; * No spaces;
STRING2 = "DEF   "; * Three trailing spaces;
STRING3 = "   GHI"; * Three leading spaces;
STRING4 = "   JKL   "; * Three leading and trailing spaces;
LENGTH RESULT1-RESULT4 $ 20;
CALL CATS(RESULT1,STRING2,STRING4);
CALL CATT(RESULT2,STRING2,STRING1);
CALL CATX(" ",RESULT3,STRING1,STRING3);
CALL CATX(",",RESULT4,STRING3,STRING4);
RUN;
PROC PRINT DATA=CALL_CAT NOOBS;
TITLE "Listing of Data Set CALL_CAT";
RUN;

Explanation:-
The three concatenation call routines each perform concatenation operations. The CATS call routine strips leading and trailing blanks; the CATT call routine removes trailing blanks before performing the concatenation; the CATX call routine is similar to the CATS call routine, except that it inserts a separator character (specified as the first argument) between each of the concatenated strings.
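Because the call routines add to an existing result rather than replacing it, they can build up a string in several steps. The following is a minimal sketch under that reading of the syntax description; the variable name RESULT and the string values are illustrative.

```sas
/* Sketch: CALL CATS appends to a result that already has a value. */
/* RESULT and the literals are illustrative assumptions.           */
DATA _NULL_;
   LENGTH RESULT $ 20;                      /* long enough for the final string */
   RESULT = 'ABC';
   CALL CATS(RESULT, '  DEF  ', 'GHI   ');  /* blanks stripped before joining   */
   PUT RESULT=;
RUN;
```

This differs from the CAT functions described next, where the new value is assigned on the left-hand side of the equal sign and any earlier value of that variable is overwritten.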

The "CAT" Functions (CAT, CATS, CATT, and CATX)

These four concatenation functions are very similar to the concatenation call routines described above. However, since they are functions and not call routines, you need to name the new character variable to be created on the left-hand side of the equal sign and the function, along with its arguments, on the right-hand side of the equal sign.

Function: CAT

Purpose: To concatenate (join) two or more character strings, leaving leading and/or trailing blanks unchanged. This function accomplishes the same task as the concatenation operator (||).
Syntax: CAT(string-1, string-2 <,string-n>)
string-1, string-2 <,string-n> are the character strings to be concatenated. These arguments can also be written as CAT(OF C1-C5), where C1 to C5 are character variables.

Note: It is very important to set the length of the resulting character string, using a LENGTH statement (or other method), before using any of the concatenation functions. Otherwise, the length of the resulting string will default to 200.

Examples:-
For these examples,
A = "Bilbo" (no blanks)
B = "   Frodo" (leading blanks)
C = "Hobbit   " (trailing blanks)
D = "   Gandalf   " (leading and trailing blanks)
C1-C5 are five character variables, with the values of 'A', 'B', 'C', 'D', and 'E', respectively.

Function                             Returns
CAT(A, B)                            "Bilbo   Frodo"
CAT(B, C, D)                         "   FrodoHobbit      Gandalf   "
CAT("Hello", D)                      "Hello   Gandalf   "
CAT(OF C1-C5)                        "ABCDE"

Function: CATS

Purpose: To concatenate (join) two or more character strings, stripping both leading and trailing blanks.
Syntax: CATS(string-1, string-2 <,string-n>)
string-1, string-2, and string-n are the character strings to be concatenated. These arguments can also be written as CATS(OF C1-C5), where C1 to C5 are character variables.

Examples:-
For these examples,
A = "Bilbo" (no blanks)
B = "   Frodo" (leading blanks)
C = "Hobbit   " (trailing blanks)
D = "   Gandalf   " (leading and trailing blanks)
C1-C5 are five character variables, with the values of 'A', 'B', 'C', 'D', and 'E', respectively.

Function                             Returns
CATS(A, B)                           "BilboFrodo"
CATS(B, C, D)                        "FrodoHobbitGandalf"
CATS("Hello", D)                     "HelloGandalf"
CATS(OF C1-C5)                       "ABCDE"

Function: CATT

Purpose: To concatenate (join) two or more character strings, stripping only trailing blanks.
Syntax: CATT(string-1, string-2 <,string-n>)
string-1, string-2, and string-n are the character strings to be concatenated. These arguments can also be written as CATT(OF C1-C5), where C1 to C5 are character variables.

Examples:-
For these examples,
A = "Bilbo" (no blanks)
B = "   Frodo" (leading blanks)
C = "Hobbit   " (trailing blanks)
D = "   Gandalf   " (leading and trailing blanks)
C1-C5 are five character variables, with the values of 'A', 'B', 'C', 'D', and 'E', respectively.

Function                             Returns
CATT(A, B)                           "Bilbo   Frodo"
CATT(B, C, D)                        "   FrodoHobbit   Gandalf"
CATT("Hello", D)                     "Hello   Gandalf"
CATT(OF C1-C5)                       "ABCDE"

Function: CATX

Purpose: To concatenate (join) two or more character strings, stripping both leading and trailing blanks and inserting one or more separator characters between the strings.
Syntax: CATX(separator, string-1, string-2 <,string-n>)
separator is one or more characters, placed in single or double quotation marks, to be used as separators between the concatenated strings.
string-1, string-2 <,string-n> are the character strings to be concatenated. These arguments can also be written as CATX(" ", OF C1-C5), where C1 to C5 are character variables.

Examples:-
For these examples,
A = "Bilbo" (no blanks)
B = "   Frodo" (leading blanks)
C = "Hobbit   " (trailing blanks)
D = "   Gandalf   " (leading and trailing blanks)
C1-C5 are five character variables, with the values of 'A', 'B', 'C', 'D', and 'E', respectively.

Function                             Returns
CATX(" ", A, B)                      "Bilbo Frodo"
CATX(":", B, C, D)                   "Frodo:Hobbit:Gandalf"
CATX("***", "Hello", D)              "Hello***Gandalf"
CATX(",", OF C1-C5)                  "A,B,C,D,E"

Program:-

Demonstrating the four concatenation functions

DATA CAT_FUNCTIONS;
STRING1 = "ABC"; * No spaces;
STRING2 = "DEF   "; * Three trailing spaces;
STRING3 = "   GHI"; * Three leading spaces;
STRING4 = "   JKL   "; * Three leading and trailing spaces;
LENGTH JOIN1-JOIN5 $ 20;
JOIN1 = CAT(STRING2,STRING3);
JOIN2 = CATS(STRING2,STRING4);
JOIN3 = CATT(STRING2,STRING1);
JOIN4 = CATX(" ",STRING1,STRING3);
JOIN5 = CATX(",",STRING3,STRING4);
RUN;
PROC PRINT DATA=CAT_FUNCTIONS NOOBS;
TITLE "Listing of Data Set CAT_FUNCTIONS";
RUN;

Explanation:-
Notice that each of the STRING variables differs with respect to leading and trailing blanks.

The CAT function is identical to the || operator. The CATS function removes both leading and trailing blanks and is equivalent to TRIM(LEFT(STRING2)) || TRIM(LEFT(STRING4)). The CATT function trims only trailing blanks. The last two statements use the CATX function, which removes leading and trailing blanks. It is just like the CATS function but adds one or more separator characters (specified as the first argument) between each of the strings to be joined.
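The equivalence between CATS and the TRIM(LEFT( )) idiom can be checked directly. This is a small sketch, not part of the original program; the variable names OLD_WAY, NEW_WAY, and SAME are invented for illustration.

```sas
/* Sketch: CATS versus the older TRIM(LEFT()) idiom.          */
/* OLD_WAY, NEW_WAY, and SAME are illustrative assumptions.   */
DATA _NULL_;
   STRING2 = 'DEF   ';                /* three trailing blanks              */
   STRING4 = '   JKL   ';             /* three leading and trailing blanks  */
   LENGTH OLD_WAY NEW_WAY $ 20;
   OLD_WAY = TRIM(LEFT(STRING2)) || TRIM(LEFT(STRING4));
   NEW_WAY = CATS(STRING2, STRING4);
   SAME = (OLD_WAY = NEW_WAY);        /* 1 if the two results match         */
   PUT OLD_WAY= NEW_WAY= SAME=;       /* both are DEFJKL, SAME=1            */
RUN;
```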

Functions That Remove Blanks from Strings

There are times when you want to remove blanks from the beginning or end of a character string. The two functions LEFT and RIGHT merely shift the characters to the beginning or the end of the string, respectively. The TRIM, TRIMN, and STRIP functions are useful when you want to concatenate strings (although the new concatenation functions will do this for you).

LEFT and RIGHT

These two functions left- or right-align text. Remember that the length of a character variable will not change when you use these two functions. If there are leading blanks, the LEFT function will shift the first non-blank character to the first position and move the extra blanks to the end; if there are trailing blanks, the RIGHT function will shift the non-blank text to the right and move the extra blanks to the left.

Function: LEFT

Purpose: To left-align text values. A subtle but important point: LEFT doesn't "remove" the leading blanks; it moves them to the end of the string. Thus, it doesn't change the storage length of the variable, even when you assign the result of LEFT to a new variable. The LEFT function is particularly useful if values were read with the $CHAR informat, which preserves leading blanks. Note that the STRIP function removes both leading and trailing blanks from a string.
Syntax: LEFT(character-value)
character-value is any SAS character expression.

Examples:-
In these examples, STRING = "   ABC"

Function                             Returns
LEFT(STRING)                         "ABC   "
LEFT("   123   ")                    "123      "

Program:-

Left-aligning text values from variables read with the $CHAR informat

DATA LEAD_ON;
INPUT STRING $CHAR15.;
LEFT_STRING = LEFT(STRING);
DATALINES;
ABC
   XYZ
  Ron Cody
;
PROC PRINT DATA=LEAD_ON NOOBS;
TITLE "Listing of Data Set LEAD_ON";
FORMAT STRING LEFT_STRING $QUOTE17.;
RUN;

Explanation:-
If you want to work with character values, you will usually want to remove any leading blanks first. The $CHARw. informat differs from the $w. informat: $CHARw. maintains leading blanks; $w. left-aligns the text. Programs involving character variables sometimes fail to work properly because careful attention was not paid to either leading or trailing blanks. Notice the use of the $QUOTE format in the PRINT procedure. This format adds double quotation marks around the character value.

Function: RIGHT

Purpose: To right-align a text string. Note that if the length of a character variable has previously been defined and it contains trailing blanks, the RIGHT function will move the characters to the end of the string and add the blanks to the beginning, so that the final length of the variable remains the same.
Syntax: RIGHT(character-value)
character-value is any SAS character expression.

Examples:-
In these examples, STRING = "ABC   "

Function                             Returns
RIGHT(STRING)                        "   ABC"
RIGHT("   123   ")                   "      123"

Program:-

Right-aligning text values

DATA RIGHT_ON;
INPUT STRING $CHAR10.;
RIGHT_STRING = RIGHT(STRING);
DATALINES;
   ABC
   123 456
Ron Cody
;
PROC PRINT DATA=RIGHT_ON NOOBS;
TITLE "Listing of Data Set RIGHT_ON";
FORMAT STRING RIGHT_STRING $QUOTE12.;
RUN;

Explanation:-
Data lines one and two both contain three leading blanks; lines one and three contain trailing blanks. Notice the use of the $QUOTE format in the PRINT procedure. This format adds double quotation marks around the character value. This is especially useful in debugging programs involving character variables, since it allows you to easily identify leading blanks in a character value.

TRIM, TRIMN, and STRIP

This group of functions trims trailing blanks (TRIM and TRIMN) and both leading and trailing blanks (STRIP). The two functions TRIM and TRIMN are similar: they both remove trailing blanks from a string. The functions work identically except when the argument contains only blanks. In that case, TRIM returns a single blank (length of 1) and TRIMN returns a null string with a length of 0. The STRIP function removes both leading and trailing blanks.

Function: TRIM

Purpose: To remove trailing blanks from a character value. This is especially useful when you want to concatenate several strings together and each string may contain trailing blanks.
Syntax: TRIM(character-value)
character-value is any SAS character expression.

Important note: The length of the variable returned by the TRIM function will be the same length as the argument, unless the length of this variable has been previously defined. If the result of the TRIM function is assigned to a variable with a length longer than the trimmed argument, the resulting variable will be padded with blanks.

Examples:-
For these examples, STRING1 = "ABC   " and STRING2 = "   XYZ"

Function                             Returns
TRIM(STRING1)                        "ABC"
TRIM(STRING2)                        "   XYZ"
TRIM("A B C   ")                     "A B C"
TRIM("A ") || TRIM("B ")             "AB"

Program:-

Creating a program to concatenate first, middle, and last names into a single variable

DATA PUT_TOGETHER;
LENGTH NAME $ 45;
INFORMAT NAME1-NAME3 $15.;
INFILE DATALINES MISSOVER;
INPUT NAME1 NAME2 NAME3;
NAME = TRIM(NAME1) || ' ' || TRIM(NAME2) || ' ' || TRIM(NAME3);
WITHOUT = NAME1 || NAME2 || NAME3;
KEEP NAME WITHOUT;
DATALINES;
Ronald Cody
Julia Child
Henry Ford
Lee Harvey Oswald
;
PROC PRINT DATA=PUT_TOGETHER NOOBS;
TITLE "Listing Of Data Set PUT_TOGETHER";
RUN;

Explanation:-

This program reads in three names, each up to 15 characters in length. Note the use of the INFILE option MISSOVER. This option sets the value of NAME3 to missing when there are only two names. To put the names together, you use the concatenation operator (||). The TRIM function is used to trim trailing blanks from each of the words (which are all 15 bytes in length) before putting them together. Without the TRIM function, there are extra spaces between each of the names (see the variable WITHOUT).
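For comparison, the same task can be sketched with the CATX function described earlier in this section, which strips the blanks and inserts the separator in one call. The data set name PUT_TOGETHER_CATX is an assumption made for illustration; the other names follow the program above.

```sas
/* Sketch: building NAME with CATX instead of TRIM and ||.  */
/* PUT_TOGETHER_CATX is an illustrative data set name.      */
DATA PUT_TOGETHER_CATX;
   LENGTH NAME $ 45;
   INFORMAT NAME1-NAME3 $15.;
   INFILE DATALINES MISSOVER;
   INPUT NAME1 NAME2 NAME3;
   /* One blank between names; leading and trailing blanks are stripped */
   NAME = CATX(' ', NAME1, NAME2, NAME3);
   KEEP NAME;
   DATALINES;
Ronald Cody
Lee Harvey Oswald
;
PROC PRINT DATA=PUT_TOGETHER_CATX NOOBS;
   TITLE "Listing of Data Set PUT_TOGETHER_CATX";
RUN;
```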

Function: TRIMN

Purpose: To remove trailing blanks from a character value. This is especially useful when you want to concatenate several strings together and each string may contain trailing blanks. The difference between TRIM and TRIMN is that the TRIM function returns a single blank for a blank string, while TRIMN returns a null string (zero blanks).
Syntax: TRIMN(character-value)
character-value is any SAS character expression.

Examples:-
For these examples, STRING1 = "ABC   " and STRING2 = "   XYZ"

Function                             Returns
TRIMN(STRING1)                       "ABC"
TRIMN(STRING2)                       "   XYZ"
TRIMN("A B C   ")                    "A B C"
TRIMN("A ") || TRIMN("B ")           "AB"
TRIMN(" ")                           ""  (length = 0)

Program:-

Demonstrating the difference between the TRIM and TRIMN functions

DATA ALL_THE_TRIMMINGS;
A = "AAA";
B = "BBB";
LENGTH_AB = LENGTHC(A || B);
LENGTH_AB_TRIM = LENGTHC(TRIM(A) || TRIM(B));
LENGTH_AB_TRIMN = LENGTHC(TRIMN(A) || TRIMN(B));
LENGTH_NULL = LENGTHC(COMPRESS(A,"A") || COMPRESS(B,"B"));
LENGTH_NULL_TRIM = LENGTHC(TRIM(COMPRESS(A,"A")) || TRIM(COMPRESS(B,"B")));
LENGTH_NULL_TRIMN = LENGTHC(TRIMN(COMPRESS(A,"A")) || TRIMN(COMPRESS(B,"B")));
PUT A= B= /
LENGTH_AB= LENGTH_AB_TRIM= LENGTH_AB_TRIMN= /
LENGTH_NULL= LENGTH_NULL_TRIM= LENGTH_NULL_TRIMN=;
RUN;

Explanation:-
First, remember that the LENGTHC function returns the length of its argument, including trailing blanks. As the output written to the SAS log shows, the two functions TRIM and TRIMN yield identical results when there are no null strings involved. When you compress an 'A' from the variable A or 'B' from variable B, the result is null. Notice that when you trim these compressed values and concatenate the results, the length is 2 (1 + 1); when you use the TRIMN function, the length is 0.

Function: STRIP

Purpose: To strip leading and trailing blanks from character variables or strings. STRIP(CHAR) is equivalent to TRIMN(LEFT(CHAR)), but more convenient.
Syntax: STRIP(character-value)
character-value is any SAS character expression.
Note:- If the STRIP function is used to create a new variable, the length of that new variable will be equal to the length of the argument of the STRIP function. If leading or trailing blanks were trimmed, trailing blanks will be added to the result to pad out the length as necessary. The STRIP function is useful when using the concatenation operator. However, note that there are several new concatenation functions and call routines that also perform trimming before concatenation.

Examples:-
For these examples, let STRING = "   abc   "

Function                                  Returns
STRIP(STRING)                             "abc"  (if result was previously assigned a length of three; otherwise, trailing blanks would be added)
STRIP("   LEADING AND TRAILING   ")       "LEADING AND TRAILING"

Program:-

Using the STRIP function to strip both leading and trailing blanks from a string

DATA _NULL_;
ONE = "   ONE   "; ***Note: three leading and trailing blanks;
TWO = "   TWO   "; ***Note: three leading and trailing blanks;
CAT_NO_STRIP = ":" || ONE || "-" || TWO || ":";
CAT_STRIP = ":" || STRIP(ONE) || "-" || STRIP(TWO) || ":";
PUT ONE= TWO= / CAT_NO_STRIP= / CAT_STRIP=;
RUN;

Explanation:-
Without the STRIP function, the leading and trailing blanks are maintained in the concatenated string. The STRIP function, as advertised, removed the leading and trailing blanks.

Functions That Compare Strings (Exact and "Fuzzy" Comparisons)

Functions in this section allow you to compare strings that are exactly alike (or alike except for case) or close (not exact matches). Programmers find this latter group of functions useful in matching names that may be spelled differently in separate files.

Function: COMPARE
Purpose: To compare two character strings. When used with one or more modifiers, this function can ignore case, remove leading blanks, truncate the longer string to the length of the shorter string, and strip quotation marks from SAS n-literals.
Syntax: COMPARE(string-1, string-2 <,'modifiers'>)
where
string-1 is any SAS character expression.
string-2 is any SAS character expression.
modifiers are one or more modifiers, placed in single or double quotation marks, as follows:
i or I    ignore case.
l or L    remove leading blanks.
n or N    remove quotation marks from any argument that is an n-literal and ignore case. An n-literal is a string in quotation marks followed by an 'n', useful for non-valid SAS names.
: (colon) truncate the longer string to the length of the shorter string.

Examples:-
For these examples, String1 = "AbC", String2 = "   ABC", String3 = "   'ABC'n", String4 = "ABCXYZ"

Function                             Returns
COMPARE(String1,String4)             2  ("B" comes before "b")
COMPARE(String4,String1)             -2
COMPARE(String1,String2,'i')         1
COMPARE(String1,String4,':I')        0
COMPARE(String1,String3,'nl')        4
COMPARE(String1,String3,'ln')        1

Program:-

Comparing two strings using the COMPARE function

DATA COMPARE;
INPUT @1 STRING1 $CHAR3. @5 STRING2 $CHAR10.;
IF UPCASE(STRING1) = UPCASE(STRING2) THEN EQUAL = 'YES';
ELSE EQUAL = 'NO';
IF UPCASE(STRING1) =: UPCASE(STRING2) THEN COLON = 'YES';
ELSE COLON = 'NO';
COMPARE = COMPARE(STRING1,STRING2);
COMPARE_IL = COMPARE(STRING1,STRING2,'IL');
COMPARE_IL_COLON = COMPARE(STRING1,STRING2,'IL:');
DATALINES;
Abc    ABC
abc ABCDEFGH
123    311
;
PROC PRINT DATA=COMPARE NOOBS;
TITLE "Listing of Data Set COMPARE";
RUN;

Explanation:-
The first two variables, EQUAL and COLON, use the UPCASE function to convert all the characters to uppercase before the comparison is made. The colon modifier following the equal sign (the variable COLON) is an instruction to truncate the longer variable to the length of the shorter variable before a comparison is made. The three COMPARE functions demonstrate the coding efficiency of using this function with its many modifiers. Using the IL and colon modifiers allows you to compare the two strings, ignoring case, removing leading blanks, and truncating the two strings to a length of 3 (the length of STRING1).

CALL COMPCOST, COMPGED, and COMPLEV

The two functions COMPGED and COMPLEV are both used to determine the similarity between two strings. The COMPCOST call routine allows you to customize the scoring system when you are using the COMPGED function. COMPGED computes a quantity called generalized edit distance, which is useful in matching names that are not spelled exactly the same. The larger the value, the more dissimilar the two strings. COMPLEV performs a similar function but uses a method called the Levenshtein edit distance. It is more efficient than the generalized edit distance but may not be as useful in name matching.

Function: CALL COMPCOST
Purpose: To set the costs that the COMPGED function uses when it determines the similarity between two strings with a method called the generalized edit distance. The cost is computed based on the difference between the two strings. You need to call this routine only once in a DATA step. Since this is a very advanced and complicated routine, only a few examples of its use will be explained.
Syntax: CALL COMPCOST('operation-1', cost-1 <,'operation-2', cost-2 ...>)
where
operation is a keyword, placed in quotation marks. A few keywords are listed here for explanation purposes, but see the SAS OnlineDoc 9.1 documentation for a complete list of operations.
Partial List of Operations
DELETE=
REPLACE=
SWAP=
TRUNCATE=
cost is a value associated with the operation. Valid values for cost range from -32,767 to +32,767.

Examples:-
CALL COMPCOST('REPLACE=', 100, 'SWAP=', 200);
CALL COMPCOST('SWAP=', 150);

Note: Operation can be upper- or lowercase.

Function: COMPGED
Purpose: To compute the similarity between two strings, using a method called the generalized edit distance. See SPEDIS for a discussion of the possible uses of this function. This function can be used in conjunction with CALL COMPCOST if you want to alter the default costs for each type of spelling error.
Syntax: COMPGED(string-1, string-2 <,maxcost> <,'modifiers'>)
string-1 is any SAS character expression.
string-2 is any SAS character expression.
maxcost, if specified, is the maximum cost that will be returned by the COMPGED function. If the cost computation results in a value larger than maxcost, the value of maxcost will be returned.
modifiers are placed in single or double quotation marks, as follows:
i or I    ignore case.
l or L    remove leading blanks.
n or N    remove quotation marks from any argument that is an n-literal and ignore case. An n-literal is a string in quotation marks followed by an 'n', useful for non-valid SAS names.
: (colon) truncate the longer string to the length of the shorter string.
Note: If multiple modifiers are used, the order of the modifiers is important. They are applied in the same order as they appear.

Program:-

Using the COMPGED function with a SAS n-literal

OPTIONS VALIDVARNAME=ANY;
DATA N_LITERAL;
STRING1 = "'INVALID#'N";
STRING2 = 'INVALID';
COMP1 = COMPGED(STRING1,STRING2);
COMP2 = COMPGED(STRING1,STRING2,'N:');
RUN;
PROC PRINT DATA=N_LITERAL NOOBS;
TITLE "Listing of Data Set N_LITERAL";
RUN;

Explanation:-
This program demonstrates the use of the COMPGED function with a SAS n-literal. Starting with Version 7, SAS variable names could contain characters not normally allowed in SAS names. The system option VALIDVARNAME is set to ANY, and the name is placed in quotation marks followed by the letter N. Using the N modifier (which strips quotation marks and the 'n' from the string) and the colon modifier (which truncates the longer string to the length of the shorter string) results in a value of 0 for the variable COMP2.

Function: COMPLEV
Purpose: To compute the similarity between two strings, using a method called the Levenshtein edit distance. It is similar to the COMPGED function, except that it uses fewer computer resources but may not do as good a job of matching misspelled names.
Syntax: COMPLEV(string-1, string-2 <,maxcost> <,'modifiers'>)

Examples:-

STRING1   STRING2   Function                               Returns
SAME      SAME      COMPLEV(STRING1, STRING2)              0
case      CASE      COMPLEV(STRING1, STRING2)              4
case      CASE      COMPLEV(STRING1,STRING2,'I')           0
case      CASE      COMPLEV(STRING1, STRING2, 999, 'I')    0
Ron       Run       COMPLEV(STRING1, STRING2)              1
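The effect of the 'I' modifier and of maxcost can be checked with a small DATA step. This is a minimal sketch, not part of the original text: the variable names D_PLAIN, D_NOCASE, and D_CAPPED are hypothetical, and the cap of 2 is an arbitrary illustrative value.

```sas
/* Sketch: COMPLEV with and without the 'I' modifier and a maxcost.  */
/* D_PLAIN, D_NOCASE, D_CAPPED are illustrative names (assumptions). */
DATA _NULL_;
   LENGTH STRING1 STRING2 $ 10;
   INPUT STRING1 STRING2;
   D_PLAIN  = COMPLEV(STRING1, STRING2);       /* case-sensitive distance    */
   D_NOCASE = COMPLEV(STRING1, STRING2, 'I');  /* case ignored               */
   D_CAPPED = COMPLEV(STRING1, STRING2, 2);    /* result is capped at 2      */
   PUT STRING1= STRING2= D_PLAIN= D_NOCASE= D_CAPPED=;
DATALINES;
case CASE
Ron Run
;
```

The values written to the log can be compared against the table above; the capped column shows how maxcost limits the returned distance.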

Program:-

Changing the effect of the call to COMPCOST on the result from COMPGED

DATA _NULL_;
   TITLE "Program without Call to COMPCOST";
   INPUT @1 STRING1 $CHAR10.
         @11 STRING2 $CHAR10.;
   DISTANCE = COMPGED(STRING1, STRING2);
   PUT STRING1= STRING2= / DISTANCE=;
DATALINES;
Ron       Run
ABC       AB
;
DATA _NULL_;
   TITLE "Program with Call to COMPCOST";
   INPUT @1 STRING1 $CHAR10.
         @11 STRING2 $CHAR10.;
   IF _N_ = 1 THEN CALL COMPCOST('APPEND=',33);
   DISTANCE = COMPGED(STRING1, STRING2);
   PUT STRING1= STRING2= / DISTANCE=;
DATALINES;
Ron       Run
ABC       AB
;

Explanation:- The first DATA _NULL_ program is a simple comparison of STRING1 to STRING2, using the COMPGED function. The second DATA _NULL_ program makes a call to COMPCOST (note the use of _N_ = 1, so the call is made only once) before the COMPGED function is used. In the SAS logs, you can see that the distance for the second observation in the first program is 50, while in the second program it is 33. That is the result of overriding the default value of 50 points for an appending error and setting it equal to 33.

Functions That Divide Strings into "Words"

These extremely useful functions and call routines can divide a string into words. Words can be characters separated by blanks or by other delimiters that you specify.

SCAN and SCANQ

The two functions SCAN and SCANQ are similar. They both extract "words" from a string, words being defined as characters separated by a set of specified delimiters. Pay particular attention to the fact that the SCAN and SCANQ functions use different sets of default delimiters. The SCANQ function also has some additional useful features. Programs demonstrating both of these functions follow the definitions.

Function: SCAN
Purpose: To extract a specified word from a character expression, where a word is defined as the characters separated by a set of specified delimiters. The length of the returned variable is 200, unless previously defined.
Syntax: SCAN(character-value, n-word <,'delimiter-list'>)
where
character-value is any SAS character expression.
n-word is the nth "word" in the string. If n is greater than the number of words, the SCAN function returns a value that contains no characters. If n is negative, the character value is scanned from right to left. A value of zero is invalid.
delimiter-list is an optional argument. If it is omitted, the default set of delimiters is (for ASCII environments):
   blank . < ( + & ! $ * ) ; ^ - / , % |
For EBCDIC environments, the default delimiters are:
   blank . < ( + | & ! $ * ) ; ¬ - / , % ¦ ¢
If you specify any delimiters, only those delimiters will be active. Delimiters before the first word have no effect. Two or more contiguous delimiters are treated as one.

Examples:-
For these examples, STRING1 = "ABC DEF" and STRING2 = "ONE?TWO THREE+FOUR|FIVE". This is an ASCII example.

Function               Returns
SCAN(STRING1,2)        "DEF"
SCAN(STRING1,-1)       "DEF"
SCAN(STRING1,3)        no characters
SCAN(STRING2,4)        "FIVE"
SCAN(STRING2,2," ")    "THREE+FOUR|FIVE"
SCAN(STRING1,0)        An error in the SAS log

Function: SCANQ
Purpose: To extract a specified word from a character expression, a word being defined as characters separated by a set of specified delimiters. The basic differences between this function and the SCAN function are the default set of delimiters (see syntax below) and the fact that a value of 0 for the word count does not result in an error message. SCANQ also ignores delimiters enclosed in quotation marks (SCAN recognizes them).
Syntax: SCANQ(character-value, n-word <,'delimiter-list'>)
character-value is any SAS character expression.
n-word is the nth "word" in the string (a word being defined as one or more characters separated by a set of specified delimiters). If n is negative, the scan proceeds from right to left. If n is greater than the number of words, or 0, the SCANQ function returns a blank value. Delimiters located before the first word or after the last word are ignored. If two or more delimiters are located between two words, they are treated as one. If the character value contains sets of quotation marks, any delimiters within these marks are ignored.
delimiter-list is an optional argument. If it is omitted, the default set of delimiters is the white space characters (blank, horizontal and vertical tab, carriage return, line feed, and form feed).

Examples:-
For these examples, STRING1 = "ABC DEF", STRING2 = "ONE TWO THREE FOUR FIVE", STRING3 = "'AB CD' 'X Y'", and STRING4 = "ONE# ::TWO".

Function                  Returns
SCANQ(STRING1,2)          "DEF"
SCANQ(STRING1,-1)         "DEF"
SCANQ(STRING1,3)          no characters
SCANQ(STRING2,4," ")      "FOUR"
SCANQ(STRING3,2)          "'X Y'"
SCANQ(STRING1,0)          no characters
SCANQ(STRING4,2," #:")    "TWO"

Program:-

Using the SCAN function to convert mixed numbers to decimal values

DATA PRICES;
   INPUT @1 STOCK $3.
         @5 MIXED $6.;
   INTEGER = SCAN(MIXED,1,'/ ');
   NUMERATOR = SCAN(MIXED,2,'/ ');
   DENOMINATOR = SCAN(MIXED,3,'/ ');
   IF NUMERATOR = ' ' THEN VALUE = INPUT(INTEGER,8.);
   ELSE VALUE = INPUT(INTEGER,8.) + (INPUT(NUMERATOR,8.) / INPUT(DENOMINATOR,8.));
   KEEP STOCK VALUE;
DATALINES;
ABC 14 3/8
XYZ 8
TWW 5 1/8
;
PROC PRINT DATA=PRICES NOOBS;
   TITLE "Listing of Data Set PRICES";
RUN;

Explanation:- The SCAN function has many uses besides merely extracting selected words from text expressions. In this program, you want to convert numbers such as 23 5/8 into a decimal value (23.625). An elegant way to accomplish this is to use the SCAN function to separate the mixed number into three parts: the integer, the numerator of the fraction, and the denominator. Once this is done, all you need to do is convert each piece to a numerical value (using the INPUT function) and add the integer portion to the fractional portion. If the number being processed does not have a fractional part, the SCAN function returns a blank value for the two variables NUMERATOR and DENOMINATOR.
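The arithmetic behind the conversion can be traced for a single value. The following is a minimal sketch under stated assumptions: the hard-coded value '23 5/8' and the variable names WHOLE, NUM, and DEN are illustrative and do not appear in the program above.

```sas
/* Sketch: split one mixed number into its parts and recombine them. */
/* WHOLE, NUM, DEN are hypothetical names used only for illustration. */
DATA _NULL_;
   MIXED = '23 5/8';
   WHOLE = INPUT(SCAN(MIXED,1,'/ '),8.);   /* integer part: 23  */
   NUM   = INPUT(SCAN(MIXED,2,'/ '),8.);   /* numerator: 5      */
   DEN   = INPUT(SCAN(MIXED,3,'/ '),8.);   /* denominator: 8    */
   VALUE = WHOLE + NUM / DEN;              /* 23 + 5/8 = 23.625 */
   PUT VALUE=;
RUN;
```

Both the slash and the blank are listed as delimiters, so the same SCAN call works for the integer, the numerator, and the denominator.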

CALL SCAN and CALL SCANQ

The SCAN and SCANQ call routines are similar to the SCAN and SCANQ functions. But both call routines return the position and length of the nth word (to be used, perhaps, in a subsequent SUBSTR function) rather than the actual word itself. The differences between CALL SCAN and CALL SCANQ are the same as the differences between the two functions SCAN and SCANQ.

Function: CALL SCAN
Purpose: To break up a string into words, where words are defined as the characters separated by a set of specified delimiters, and to return the starting position and the length of the nth word.
Syntax: CALL SCAN(character-value, n-word, position, length <,'delimiter-list'>)
character-value is any SAS character expression.
n-word is the nth "word" in the string. If n is greater than the number of words, the SCAN call routine returns a value of 0 for position and length. If n is negative, the scan proceeds from right to left.
position is the name of the numeric variable to which the starting position in the character-value of the nth word is returned.
length is the name of a numeric variable to which the length of the nth word is returned.
delimiter-list is an optional argument. If it is omitted, the default set of delimiters is (for ASCII environments):
   blank . < ( + & ! $ * ) ; ^ - / , % |
For EBCDIC environments, the default delimiters are:
   blank . < ( + | & ! $ * ) ; ¬ - / , % ¦ ¢
If you specify any delimiters, only those delimiters will be active. Delimiters are slightly different in ASCII and EBCDIC systems.

Examples:-
For these examples, STRING1 = "ABC DEF" and STRING2 = "ONE?TWO THREE+FOUR|FIVE".

Function                                     Position   Length
CALL SCAN(STRING1,2,POSITION,LENGTH)         5          3
CALL SCAN(STRING1,-1,POSITION,LENGTH)        5          3
CALL SCAN(STRING1,3,POSITION,LENGTH)         0          0
CALL SCAN(STRING2,1,POSITION,LENGTH)         1          7
CALL SCAN(STRING2,4,POSITION,LENGTH)         20         4
CALL SCAN(STRING2,2,POSITION,LENGTH," ")     9          15
CALL SCAN(STRING1,0,POSITION,LENGTH)         missing    missing

Program:-

Demonstrating the SCAN call routine

DATA WORDS;
   INPUT STRING $40.;
   DELIM = 'Default';
   N = 2;
   CALL SCAN(STRING,N,POSITION,LENGTH);
   OUTPUT;
   N = -1;
   CALL SCAN(STRING,N,POSITION,LENGTH);
   OUTPUT;
   DELIM = '#';
   N = 2;
   CALL SCAN(STRING,N,POSITION,LENGTH,'#');
   OUTPUT;
DATALINES;
ONE TWO THREE
One*#Two Three*Four
;
PROC PRINT DATA=WORDS NOOBS;
   TITLE "Listing of Data Set WORDS";
RUN;

Explanation:- The SCAN routine is called three times in this program, twice with default delimiters and once with the pound sign (#) as the delimiter. Notice that using a negative argument results in a scan from right to left.
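Because CALL SCAN returns a position and a length rather than the word itself, the two values are typically passed on to SUBSTR. A minimal sketch, assuming a hard-coded string and a hypothetical result variable WORD:

```sas
/* Sketch: use the position and length from CALL SCAN in SUBSTR. */
/* WORD is an illustrative variable name, not from the original. */
DATA _NULL_;
   STRING = 'ONE TWO THREE';
   CALL SCAN(STRING, 2, POSITION, LENGTH);
   /* LENGTH is 0 when the requested word does not exist, */
   /* so guard the SUBSTR call against an invalid length. */
   IF LENGTH > 0 THEN WORD = SUBSTR(STRING, POSITION, LENGTH);
   PUT POSITION= LENGTH= WORD=;
RUN;
```

The guard matters because SUBSTR with a length of 0 produces an invalid-argument note in the log.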

Function: CALL SCANQ
Purpose: To break up a string into words, where words are defined to be the characters separated by a set of specified delimiters, and to return the starting position and the length of the nth word. The basic differences between this call routine and CALL SCAN are that CALL SCANQ uses white space characters as default delimiters and that it can accept a value of 0 for the n-word argument. In addition, the SCANQ call routine ignores delimiters within quotation marks.
Syntax: CALL SCANQ(character-value, n-word, position, length <,'delimiter-list'>)

Examples:-
For these examples, STRING1 = "ABC DEF", STRING2 = "ONE TWO THREE FOUR FIVE", and STRING3 = "'AB CD' 'X Y'".

Function                                      Position   Length
CALL SCANQ(STRING1,2,POSITION,LENGTH)         5          3
CALL SCANQ(STRING1,-1,POSITION,LENGTH)        5          3
CALL SCANQ(STRING1,3,POSITION,LENGTH)         0          0
CALL SCANQ(STRING2,4,POSITION,LENGTH)         15         4
CALL SCANQ(STRING2,2,POSITION,LENGTH," ")     5          3
CALL SCANQ(STRING1,0,POSITION,LENGTH)         0          0
CALL SCANQ(STRING3,2,POSITION,LENGTH)         9          5

Functions That Substitute Letters or Words in Strings

TRANSLATE can substitute one character for another in a string. TRANWRD is more flexible: it can substitute a word or several words for one or more words.

Function: TRANSLATE
Purpose: To exchange one character value for another. For example, you might want to change the values 1 to 5 to the values A to E.
Syntax: TRANSLATE(character-value, to-1, from-1 <, ... to-n, from-n>)
character-value is any SAS character expression.
to-n is a single character or a list of character values.
from-n is a single character or a list of characters.
Each character listed in from-n is changed to the corresponding value in to-n. If a character value is not listed in from-n, it will be unaffected.

Examples:-
In these examples, CHAR = "12X45" and ANS = "Y".

Function                                                       Returns
TRANSLATE(CHAR,"ABCDE","12345")                                "ABXDE"
TRANSLATE(CHAR,'A','1','B','2','C','3','D','4','E','5')        "ABXDE"
TRANSLATE(ANS,"10","YN")                                       "1"

Program:-

Converting values of '1','2','3','4', and '5' to 'A','B','C','D', and 'E' respectively

DATA MULTIPLE;
   INPUT QUES : $1. @@;
   QUES = TRANSLATE(QUES,'ABCDE','12345');
DATALINES;
1 4 3 2 5
5 3 4 2 1
;
PROC PRINT DATA=MULTIPLE NOOBS;
   TITLE "Listing of Data Set MULTIPLE";
RUN;

Explanation:- In this example, you want to convert the character values of 1 to 5 to the letters A to E. The two arguments in this function seem backwards to this author. You would expect the order to be "from, to" rather than the other way around. I suppose others at SAS felt the same way, since a more recent function, TRANWRD (next example), uses the "from, to" order for its arguments. While you could use a format along with a PUT function to do this translation, the TRANSLATE function is more compact for a simple one-to-one character mapping like this.
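For comparison, the alternative of using a format along with a PUT function might look like the following sketch. The format name $QTOLET and the variable LETTER are hypothetical names chosen for illustration; they are not from the original program.

```sas
/* Sketch: the same 1-5 to A-E mapping done with a user-defined format. */
/* $QTOLET and LETTER are illustrative names (assumptions).             */
PROC FORMAT;
   VALUE $QTOLET '1' = 'A'
                 '2' = 'B'
                 '3' = 'C'
                 '4' = 'D'
                 '5' = 'E';
RUN;

DATA _NULL_;
   QUES = '3';
   LETTER = PUT(QUES, $QTOLET.);   /* look up the formatted value */
   PUT QUES= LETTER=;
RUN;
```

The format approach takes more lines, but it keeps the mapping in one reusable place when many programs need the same translation.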

Program:-

Converting the values "Y" and "N" to 1's and 0's

DATA YES_NO;
   LENGTH CHAR $ 1;
   INPUT CHAR @@;
   X = INPUT(TRANSLATE(UPCASE(CHAR),'01','NY'),1.);
DATALINES;
N Y n y A B 0 1
;
PROC PRINT DATA=YES_NO NOOBS;
   TITLE "Listing of Data Set YES_NO";
RUN;

Explanation:- In this program, the UPCASE function converts lowercase values of "n" and "y" to their uppercase equivalents. The TRANSLATE function then converts the Ns and Ys to the characters "0" and "1", respectively. Finally, the INPUT function does the character-to-numeric conversion. Note that the data values of "1" and "0" do not get translated, but do get converted to numeric values.

Function: TRANWRD
Purpose: To substitute one or more words in a string with a replacement word or words. It works like the find-and-replace feature of most word processors.
Syntax: TRANWRD(character-value, from-string, to-string)
character-value is any SAS character expression.
from-string is one or more characters that you want to replace with the character or characters in the to-string.
to-string is one or more characters that replace the entire from-string.

Examples:-
For these examples, STRING = "123 Elm Road", FROM = "Road", and TO = "Rd.".

Function                                        Returns
TRANWRD(STRING,FROM,TO)                         "123 Elm Rd."
TRANWRD("Now is the time","is","is not")        "Now is not the time"
TRANWRD("one two three","four","4")             "one two three"
TRANWRD("Mr. Rogers","Mr."," ")                 "  Rogers"
TRANWRD("ONE TWO THREE","ONE TWO","A B")        "A B THREE"

Program:-

Converting words such as Street to their abbreviations such as St. in an address

DATA CONVERT;
   INPUT @1 ADDRESS $20. ;
   *** Convert Street, Avenue, and Road to their abbreviations;
   ADDRESS = TRANWRD(ADDRESS,'Street','St.');
   ADDRESS = TRANWRD(ADDRESS,'Avenue','Ave.');
   ADDRESS = TRANWRD(ADDRESS,'Road','Rd.');
DATALINES;
89 Lazy Brook Road
123 River Rd.
12 Main Street
;
PROC PRINT DATA=CONVERT;
   TITLE 'Listing of Data Set CONVERT';
RUN;

Explanation:- TRANWRD is one of the relatively new SAS functions, and it is enormously useful. This example uses it to help standardize a mailing list, substituting abbreviations for full words. Another use for this function is to make to-string a blank, thus allowing you to remove words such as Jr. or Mr. from an address.
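Removing words by substituting a blank can be sketched as follows. The sample name and the cleanup step with COMPBL and LEFT are assumptions added for illustration; the original describes only the TRANWRD substitution itself.

```sas
/* Sketch: remove titles from a name by replacing them with a blank. */
/* The sample value and the COMPBL/LEFT cleanup are illustrative.    */
DATA _NULL_;
   LENGTH NAME $ 30;
   NAME = 'Mr. John Smith Jr.';
   NAME = TRANWRD(NAME, 'Mr.', ' ');
   NAME = TRANWRD(NAME, 'Jr.', ' ');
   /* Collapse runs of blanks to one and left-align the result */
   NAME = LEFT(COMPBL(NAME));
   PUT NAME=;
RUN;
```

Each substitution leaves a blank where the word was, which is why a cleanup pass is usually needed afterward.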

Functions That Compute the Length of Strings

Function: LENGTH
Purpose: To determine the length of a character value, not counting trailing blanks. A null argument returns a value of 1.
Syntax: LENGTH(character-value)
character-value is any SAS character expression.

Examples:-
For these examples, CHAR = "ABC   " (ABC followed by three blanks).

Function          Returns
LENGTH("ABC")     3
LENGTH(CHAR)      3
LENGTH(" ")       1

Function: LENGTHC
Purpose: To determine the length of a character value, including trailing blanks.
Syntax: LENGTHC(character-value)
character-value is any SAS character expression.

Examples:-
For these examples, CHAR = "ABC   ".

Function           Returns
LENGTHC("ABC")     3
LENGTHC(CHAR)      6
LENGTHC(" ")       1

Function: LENGTHM
Purpose: To determine the length of a character variable in memory.
Syntax: LENGTHM(character-value)
character-value is any SAS character expression.

Examples:-
For these examples, CHAR = "ABC   ".

Function           Returns
LENGTHM("ABC")     3
LENGTHM(CHAR)      6
LENGTHM(" ")       1

Function: LENGTHN
Purpose: To determine the length of a character value, not counting trailing blanks. A null argument returns a value of 0.
Syntax: LENGTHN(character-value)
character-value is any SAS character expression.

Examples:-
For these examples, CHAR = "ABC   ".

Function           Returns
LENGTHN("ABC")     3
LENGTHN(CHAR)      3
LENGTHN(" ")       0
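The four length functions can be compared side by side on one variable. A minimal sketch, assuming a variable CHAR declared with a length of 6; the result names L, LC, LM, and LN are illustrative.

```sas
/* Sketch: compare LENGTH, LENGTHC, LENGTHM, and LENGTHN.     */
/* L, LC, LM, LN are hypothetical names used for the results. */
DATA _NULL_;
   LENGTH CHAR $ 6;
   CHAR = 'ABC';            /* stored as ABC plus three blanks      */
   L  = LENGTH(CHAR);       /* trailing blanks not counted          */
   LC = LENGTHC(CHAR);      /* trailing blanks counted              */
   LM = LENGTHM(CHAR);      /* storage length of the variable       */
   LN = LENGTHN(CHAR);      /* like LENGTH, but 0 for a blank value */
   PUT L= LC= LM= LN=;
   CHAR = ' ';              /* an all-blank value                   */
   L  = LENGTH(CHAR);
   LN = LENGTHN(CHAR);
   PUT L= LN=;
RUN;
```

The second PUT statement shows the only case where LENGTH and LENGTHN differ: a value that is entirely blank.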

Functions That Count the Number of Letters or Substrings in a String

The COUNT function counts the number of times a given substring appears in a string. The COUNTC function counts the number of times specific characters occur in a string.

Function: COUNT
Purpose: To count the number of times a given substring appears in a string. With the use of a modifier, case can be ignored. If no occurrences of the substring are found, the function returns a 0.
Syntax: COUNT(character-value, find-string <,'modifiers'>)
character-value is any SAS character expression.
find-string is a character variable or SAS string literal to be counted.
The following modifiers, placed in single or double quotation marks, may be used with COUNT:
i or I   ignore case.
t or T   ignore trailing blanks in both the character value and the find-string.

Examples:-
For these examples, STRING1 = "How Now Brown COW" and STRING2 = "ow".

Function                              Returns
COUNT(STRING1, STRING2)               3
COUNT(STRING1,STRING2,'I')            4
COUNT(STRING1, "XX")                  0
COUNT("ding and dong","g ")           1
COUNT("ding and dong","g ","T")       2

Program:-

Using the COUNT function to count the number of times the word "the" appears in a string

DATA DRACULA;
   INPUT STRING $CHAR60.;
   NUM = COUNT(STRING,"the");
   NUM_NO_CASE = COUNT(STRING,"the",'I');
DATALINES;
The number of times "the" appears is the question
THE the
None on this line!
There is the map
;
PROC PRINT DATA=DRACULA NOOBS;
   TITLE "Listing of Data Set Dracula";
RUN;

Explanation:- In this program, the COUNT function is used with and without the I (ignore case) modifier. In the first observation, the first "The" has an uppercase T, so it does not match the substring and is not counted for the variable NUM. But when the I modifier is used, it does count. The same holds for the second observation. When there are no occurrences of the substring, as in the third observation, the function returns a 0. The fourth line of data demonstrates that COUNT ignores word boundaries when searching for strings.

Function: COUNTC
Purpose: To count the number of individual characters that appear or do not appear in a string. With the use of a modifier, case can be ignored. Another modifier allows you to count characters that do not appear in the list of characters. If no specified characters are found, the function returns a 0.
Syntax: COUNTC(character-value, characters <,'modifiers'>)
character-value is any SAS character expression.
characters is one or more characters to be counted. It may be a string literal (letters in quotation marks) or a character variable.
The following modifiers, placed in quotation marks, may be used with COUNTC:
i or I   ignore case.
o or O   If this modifier is used, COUNTC processes the character or characters and modifiers only once. If the COUNTC function is used again in the same DATA step, the previous character and modifier values are used and the current values are ignored.
t or T   ignore trailing blanks in the character-value or the characters. Note that this modifier is especially important when looking for blanks or when you are using the v modifier (below).
v or V   count only the characters in the character-value that do not appear in characters. Remember that this count will include trailing blanks unless the t modifier is used.

Examples:-
For these examples, STRING1 = "How Now Brown COW" and STRING2 = "wo".

Function                               Returns
COUNTC("AaBbbCDE","CBA")               3
COUNTC("AaBbbCDE","CBA",'i')           6
COUNTC(STRING1, STRING2)               6
COUNTC(STRING1,STRING2,'i')            8
COUNTC(STRING1, "XX")                  0
COUNTC("ding and dong","g ")           4 (2 g's and 2 blanks)
COUNTC("ding and dong","g ","t")       2 (blanks trimmed)
COUNTC("ABCDEabcde","BCD",'vi')        4 (A, E, a, and e)

Program:-

Demonstrating the COUNTC function to find one or more characters

DATA COUNT_CHAR;
   INPUT STRING $20.;
   NUM_A = COUNTC(STRING,'A');
   NUM_Aa = COUNTC(STRING,'a','i');
   NUM_A_OR_B = COUNTC(STRING,'AB');
   NOT_A = COUNTC(STRING,'A','v');
   NOT_A_TRIM = COUNTC(STRING,'A','vt');
   NOT_Aa = COUNTC(STRING,'A','iv');
DATALINES;
UPPER A AND LOWER a
abAB
BBBbbb
;
PROC PRINT DATA=COUNT_CHAR;
   TITLE "Listing of Data Set COUNT_CHAR";
RUN;

Explanation:- This program demonstrates several features of the COUNTC function. The first use of the function simply looks for the number of times the uppercase letter A appears in the string. Next, by adding the i modifier, the number of upper- or lowercase A's is counted. Next, when you place more than one character in the list, the function returns the total number of the listed characters. The v modifier is interesting. The first time it is used, COUNTC is counting the number of characters in the string that are not uppercase A's.

Function: MISSING
Purpose: To determine if the argument is a missing (character or numeric) value. This is a handy function to use, since you don't have to know whether the variable you are testing is character or numeric. The function returns a 1 (true) if the value is a missing value, and a 0 (false) otherwise.
Syntax: MISSING(variable)
variable is a character or numeric variable or expression.

Examples:-
For these examples, NUM1 = 5, NUM2 = ., CHAR1 = "ABC", and CHAR2 = " ".

Function          Returns
MISSING(NUM1)     0
MISSING(NUM2)     1
MISSING(CHAR1)    0
MISSING(CHAR2)    1

Function: RANK
Purpose: To obtain the relative position of the ASCII (or EBCDIC) characters. This can be useful if you want to associate each character with a number so that an ARRAY subscript can point to a specific character.
Syntax: RANK(letter)
letter can be a string literal or a SAS character variable. If the literal or variable contains more than one character, the RANK function returns the collating sequence of the first character in the string.

Examples:-
For these examples, STRING1 = "A" and STRING2 = "XYZ".

Function         Returns
RANK(STRING1)    65
RANK(STRING2)    88
RANK("X")        88
RANK("a")        97

Function: REPEAT
Purpose: To make multiple copies of a string.
Syntax: REPEAT(character-value, n)
character-value is any SAS character expression.
n is the number of repetitions. The result of this function is the original string plus n repetitions. Thus, if n equals 1, the result will be two copies of the original string. If you do not declare the length of the character variable holding the result of the REPEAT function, it will default to 200.

Examples:-
For these examples, STRING = "ABC".

Function              Returns
REPEAT(STRING,1)      "ABCABC"
REPEAT("HELLO ",3)    "HELLO HELLO HELLO HELLO "
REPEAT("*",5)         "******"

Program:-

Using the REPEAT function to underline output values

DATA _NULL_;
   FILE PRINT;
   TITLE "Demonstrating the REPEAT Function";
   LENGTH DASH $ 50;
   INPUT STRING $50.;
   IF _N_ = 1 THEN PUT 50*"*";
   DASH = REPEAT("-",LENGTH(STRING) - 1);
   PUT STRING / DASH;
DATALINES;
Short line
This is a longer line
Bye
;

Explanation:- The program above underlines each string with the same number of dashes as there are characters in the string. Since you want the line of dashes to be the same length as the string, you subtract one from the length, remembering that the REPEAT function results in n + 1 copies of the original string (the original plus n repetitions). The two important points to remember when using the REPEAT function are: always make sure you have defined a length for the resulting character variable, and the result of the REPEAT function is n + 1 copies of the original string.

Function: REVERSE
Purpose: To reverse the order of text of a character value.
Syntax: REVERSE(character-value)
character-value is any SAS character expression.

Examples:-
For these examples, STRING1 = "ABCDE" and STRING2 = "XYZ   ".

Function            Returns
REVERSE(STRING1)    "EDCBA"
REVERSE(STRING2)    "   ZYX"
REVERSE("1234")     "4321"

Program:-

Using the REVERSE function to create backwards writing

DATA BACKWARDS;
   INPUT @1 STRING $CHAR10.;
   GNIRTS = REVERSE(STRING);
DATALINES;
Ron Cody
XYZ
ABCDEFG
x
1234567890
;
PROC PRINT DATA=BACKWARDS NOOBS;
   TITLE "Listing of Data Set BACKWARDS";
RUN;

Explanation:- It is important to realize that if you don't specify the length of the result, it will be the same length as the argument of the REVERSE function. Also, if there were trailing blanks in the original string, there will be leading blanks in the reversed string.

Date and Time Functions

Functions That Create SAS Date, Datetime, and Time Values

The first three functions in this group create SAS date values, datetime values, and time values from the constituent parts (month, day, year, hour, minute, second). The DATE and TODAY functions are equivalent, and they both return the current date. The DATETIME and TIME functions return the current SAS datetime and time values, respectively.

Function: MDY
Purpose: To create a SAS date from the month, day, and year.
Syntax: MDY(month, day, year)
month is a numeric variable or constant representing the month of the year (a number from 1 to 12).
day is a numeric variable or constant representing the day of the month (a number from 1 to 31).
year is a numeric variable or constant representing the year.

Examples:-
For these examples, M = 11, D = 15, Y = 2003.

Function            Returns
MDY(M,D,Y)          16024 (formatted value: 15NOV2003)
MDY(10,21,1980)     7599 (formatted value: 21OCT1980)
MDY(1,1,1950)       -3652 (formatted value: 01JAN1950)
MDY(13,01,2003)     numeric missing value

Program:-

Creating a SAS date value from separate variables representing the day, month, and year of the date

data funnydate;
   input @1 Month 2. @7 Year 4. @13 Day 2.;
   Date = mdy(Month,Day,Year);
   format Date mmddyy10.;
datalines;
05    2000  25
11    2001  02
;
title "Listing of FUNNYDATE";
proc print data=funnydate noobs;
run;

Explanation:- Here the values for month, day, and year were not in a form where any of the standard date informats could be used. Therefore, the day, month, and year values were read into separate variables and the MDY function was used to create a SAS date.

Program:-

Program to read in dates and set the day of the month to 15 if the day is missing from the date

data missing;
   input @1 Dummy $10.;
   Day = scan(Dummy,2,'/');
   if not missing(Day) then Date = input(Dummy,mmddyy10.);
   else Date = mdy(input(scan(Dummy,1,'/'),2.),
                   15,
                   input(scan(Dummy,3,'/'),4.));
   format Date date9.;
datalines;
10/21/1946
1/ /2000
01/ /2002
;
title "Listing of MISSING";
proc print data=missing noobs;
run;

Explanation:- This program reads in a date and, when the day of the month is missing, uses the 15th of the month. The entire date is first read as a character string, as the variable DUMMY. Next, the SCAN function is executed with the slash character (/) as the "word" delimiter. The second word is the day of the month. If this is not missing, the INPUT function is used to convert the character string into a SAS date. If DAY is missing, the MDY function is used to create the SAS date, with the value of 15 representing the day of the month.

Function: DHMS
Purpose: To create a SAS datetime value from a SAS date value and a value for the hour, minute, and second.
Syntax: DHMS(date, hour, minute, second)
date is a SAS date value, either a variable or a date constant.
hour is a numerical value for the hour of the day. If hour is greater than 24, the function will return the appropriate datetime value.
minute is a numerical value for the number of minutes.
second is a numerical value for the number of seconds.
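A minimal sketch of DHMS follows, assuming an illustrative date of June 4, 2003 and a time of 20:10:00 (values chosen here for illustration). The variable DT_CHECK is a hypothetical name; it recomputes the same datetime from the definition that a datetime value counts seconds while a date value counts days.

```sas
/* Sketch: build a datetime with DHMS and verify it arithmetically. */
/* DT_CHECK is an illustrative variable, not from the original.     */
DATA _NULL_;
   DATE = '04JUN2003'D;
   DT = DHMS(DATE, 20, 10, 0);
   /* 86400 seconds per day, plus the time of day in seconds */
   DT_CHECK = DATE * 86400 + HMS(20, 10, 0);
   PUT DT= DATETIME20. DT_CHECK= DATETIME20.;
RUN;
```

Both values should format to the same datetime, which shows that DHMS is a convenience over the day-to-second conversion.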

Function: HMS
Purpose: To create a SAS time value from the hour, minute, and second.
Syntax: HMS(hour, minute, second)
hour is the value corresponding to the number of hours.
minute is the value corresponding to the number of minutes.
second is the value corresponding to the number of seconds.

Examples:-
For these examples, H = 1, M = 30, S = 15.

Function        Returns
HMS(H,M,S)      5415 (formatted value: 1:30:15)
HMS(0,0,23)     23 (formatted value: 0:00:23)

Function: DATE and TODAY (equivalent functions)
Purpose: To return the current date.
Syntax: DATE() or TODAY()
Note that the parentheses are needed even though these functions do not take any arguments.

Examples:-

Function    Returns
DATE()      15860 (formatted value: 04JUN2003)
TODAY()     15860 (formatted value: 04JUN2003)

Function: DATETIME
Purpose: To return the datetime value for the current date and time.
Syntax: DATETIME()

Examples:-

Function      Returns
DATETIME()    1370376600 (formatted value: 04JUN03:20:10:00)


Function: TIME
Purpose: To return the time of day when the program was run.
Syntax: TIME()

Examples:-

Function    Returns
TIME()      72600 (formatted value: 20:10:00)

Program:-

Determining the date, datetime value, and time of day

data test;
   Date = today();
   DT = datetime();
   Time = time();
   DT2 = dhms(Date,8,15,30);
   Time2 = hms(8,15,30);
   DOB = '01jan1960'd;
   Age = int(yrdif(DOB,Date,'actual'));
   format Date DOB date9. DT DT2 datetime. Time Time2 time.;
run;
title "Listing of Data Set TEST";
proc print data=test noobs;
run;

Explanation:- The variable DT2 is a SAS datetime value created from the current date and specified values for the hour, minute, and second. TIME2 is a SAS time value created from three values for hour, minute, and second. Finally, the age was computed using the YRDIF function. The INT function was used to compute age as of the last birthday (it throws away all digits to the right of the decimal point).

Functions That Extract the Year, Month, Day, etc. from a SAS Date

This group of functions takes a SAS date value and returns parts of the date, such as the year, the month, or the day of the week. Since these functions are demonstrated in a single program, let's first supply the syntax and examples.

Function: YEAR
Purpose: To extract the year from a SAS date.
Syntax: YEAR(date)
date is a SAS date value.

Examples:-

Function              Returns
YEAR('16AUG2002'd)    2002
YEAR('16AUG02'd)      2002

Function: QTR
Purpose: To extract the quarter (January to March = 1, April to June = 2, etc.) from a SAS date.
Syntax: QTR(date)
date is a SAS date value.

Examples:-

Function             Returns
QTR('05FEB2003'd)    1
QTR('01DEC2003'd)    4

Function: MONTH
Purpose: To extract the month of the year from a SAS date (1 = January, 2 = February, etc.).
Syntax: MONTH(date)
date is a SAS date value.

Examples:-

Function               Returns
MONTH('16AUG2002'd)    8

Function: WEEK
Purpose: To extract the week number of the year from a SAS date (the week-number value is a number from 0 to 53 or 1 to 53, depending on the optional modifier).
Syntax: WEEK(<date> <,'modifier'>)
date is a SAS date value. If date is omitted, the WEEK function returns the week number of the current date.
modifier is an optional argument that determines how the week-number value is determined. If modifier is omitted, the first Sunday of the year is week 1.

Examples:-

Function                  Returns
WEEK('16AUG2002'd)        32
WEEK('01JAN1960'd)        0
WEEK('03JAN1960'd)        1
WEEK('01JAN1960'd,'V')    53

Function: WEEKDAY
Purpose: To extract the day of the week from a SAS date (1 = Sunday, 2 = Monday, etc.).
Syntax: WEEKDAY(date)
date is a SAS date value.

Examples:-

Function                 Returns
WEEKDAY('16AUG2002'd)    6 (Friday)

Function: DAY
Purpose: To extract the day of the month from a SAS date, a number from 1 to 31.
Syntax: DAY(date)
date is a SAS date value.

Examples:-

Function             Returns
DAY('16AUG2002'd)    16

Program:-

Demonstrating the functions YEAR, QTR, MONTH, WEEK, DAY, and WEEKDAY

data date_functions;
   set dates(drop=Date2);
   Year = year(Date1);
   Quarter = qtr(Date1);
   Month = month(Date1);
   Week = week(Date1);
   Day_of_month = day(Date1);
   Day_of_week = weekday(Date1);
run;
title "Listing of Data Set DATE_FUNCTIONS";
proc print data=date_functions noobs;
run;

Explanation:- These basic date functions are straightforward. They all take a SAS date as the single argument and return the year, the quarter, the month, the week, the day of the month, or the day of the week. Remember that the WEEKDAY function returns the day of the week, while the DAY function returns the day of the month (it's easy to confuse these two functions).

Functions That Extract Hours, Minutes, and Seconds from SAS Datetime and Time Values

The HOUR, MINUTE, and SECOND functions work with SAS datetime or time values in much the same way as the MONTH, YEAR, and WEEKDAY functions work with SAS date values.

Function: HOUR
Purpose: To extract the hour from a SAS datetime or time value.
Syntax: HOUR(time or dt)
time or dt is a SAS time or datetime value.

Examples:-
For these examples, DT = '02JAN1960:5:10:15'dt and T = '5:8:10'T.

Function            Returns
HOUR(DT)            5
HOUR(T)             5
HOUR(HMS(5,8,9))    5

Function: MINUTE
Purpose: To extract the minute value from a SAS datetime or time value.
Syntax: MINUTE(time or dt)
time or dt is a SAS time or datetime value.

Examples:-
For these examples, DT = '02JAN1960:5:10:15'dt and T = '5:8:10'T.

Function              Returns
MINUTE(DT)            10
MINUTE(T)             8
MINUTE(HMS(5,8,9))    8

Function: SECOND
Purpose: To extract the second value from a SAS datetime or time value.
Syntax: SECOND(time or dt)
time or dt is a SAS time or datetime value.

Examples:-
For these examples, DT = '02JAN1960:5:10:15'dt and T = '5:8:10'T.

Function              Returns
SECOND(DT)            15
SECOND(T)             10
SECOND(HMS(5,8,9))    9
Program:-

Demonstrating the HOUR, MINUTE, and SECOND functions

data time;
   DT = '01jan1960:5:15:30'dt;
   T = '10:05:23't;
   Hour_dt = hour(DT);
   Hour_time = hour(T);
   Minute_dt = minute(DT);
   Minute_time = minute(T);
   Second_dt = second(DT);
   Second_time = second(T);
   format DT datetime.;
run;
title "Listing of Data Set TIME";
proc print data=time noobs heading=h;
run;

Explanation:- The variable DT is a SAS datetime value (computed as a SAS datetime constant) and T is a SAS time value (computed as a SAS time constant). The program demonstrates that the HOUR, MINUTE, and SECOND functions can take either SAS datetime or time values as arguments.
Functions That Extract the Date or Time from SAS Datetime Values

The DATEPART and TIMEPART functions extract either the date or the time from a SAS datetime value (the number of seconds from January 1, 1960).
Function: DATEPART

Purpose: To compute a SAS date from a SAS datetime value.
Syntax: DATEPART(date-time-value)
date-time-value is a SAS datetime value.
Function: TIMEPART

Purpose: To extract the time part of a SAS datetime value.
Syntax: TIMEPART(date-time-value)
date-time-value is a SAS datetime value.
Program:-

Extracting the date part and time part of a SAS datetime value

data pieces_parts;
   DT = '01jan1960:5:15:30'dt;
   Date = datepart(DT);
   Time = timepart(DT);
   format DT datetime. Time time. Date date9.;
run;
title "Listing of Data Set PIECES_PARTS";
proc print data=pieces_parts noobs;
run;

Explanation:-

The DATEPART and TIMEPART functions extract the date and the time from the datetime value, respectively. These two functions are especially useful when you import data from other sources.
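As a rough illustration of how the two parts relate to the original value, the sketch below splits a datetime and then reassembles it with the DHMS function. The variable names Rebuilt and Match are hypothetical, and the approach is one common idiom rather than the only way to do this.

```sas
/* Sketch: split a datetime into date and time, then rebuild it */
data _null_;
   DT   = '01jan1960:5:15:30'dt;
   Date = datepart(DT);                /* SAS date: days since January 1, 1960 */
   Time = timepart(DT);                /* SAS time: seconds since midnight */
   /* DHMS(date, hour, minute, second): passing the whole time value
      as the seconds argument rebuilds the original datetime */
   Rebuilt = dhms(Date, 0, 0, Time);
   Match = (Rebuilt = DT);             /* 1 when the round trip is exact */
   put Date= date9. Time= time8. Rebuilt= datetime18. Match=;
run;
```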
Functions That Work with Date, Datetime, and Time Intervals

Functions in this group work with date or time intervals. The INTCK function, when used with date or datetime values, can determine the number of interval boundaries crossed between two dates. When used with SAS time values, it can determine the number of hour, minute, or second boundaries between two time values. The INTNX function, when used with SAS date or datetime values, determines the date after a given number of intervals have passed. When used with SAS time values, it computes the time after a given number of time interval units have passed.
Function: INTCK

Purpose: To return the number of intervals between two dates, two times, or two datetime values. To be more accurate, the INTCK function counts the number of times a boundary has been crossed going from the first value to the second. For example, if the interval is YEAR, the starting date is January 1, 2002, and the ending date is December 31, 2002, the function returns a 0. The reason for this is that the boundary for YEAR is January 1, and even though the starting date is on a boundary, no boundaries are crossed in going from the first date to the second.
Syntax: INTCK('interval<Multiple><.shift>', start-value, end-value)
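The boundary-counting behavior is easy to misread as elapsed time, so a short sketch may help. The date constants are illustrative assumptions chosen to sit just before or just after a boundary.

```sas
/* Sketch: INTCK counts boundaries crossed, not elapsed time */
data _null_;
   Same_year  = intck('year',   '01JAN2002'd, '31DEC2002'd);  /* 0: no January 1 is crossed */
   Cross_year = intck('year',   '31DEC2002'd, '01JAN2003'd);  /* 1: one day apart, one boundary crossed */
   Months     = intck('month',  '31JAN2003'd, '01FEB2003'd);  /* 1: the February 1 boundary is crossed */
   Fiscal     = intck('year.4', '31MAR2003'd, '01APR2003'd);  /* 1: yearly boundary shifted to April 1 */
   put Same_year= Cross_year= Months= Fiscal=;
run;
```

When a measure of elapsed time is wanted instead of a boundary count, a function such as YRDIF is usually a better fit.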

interval can be date units, time units, or datetime units.
multiple is an optional modifier in the interval. You can specify multiples of an interval. For example, MONTH2 specifies two-month intervals; DAY50 specifies 50-day intervals.
.shift is an optional parameter that determines the starting point in an interval. For example, YEAR.4 specifies yearly intervals starting from April 1.

Shift value for SAS date and datetime values:

Interval     Shift Value
YEAR         Month
SEMIYEAR     Month
QTR          Month
MONTH        Month
SEMIMONTH    Semimonth*
TENDAY       Tenday
WEEKDAY      Day
WEEK         Day
DAY          Day

Shift value for SAS time intervals:

Interval     Shift Value
HOUR         Hour*
MINUTE       Minute*
SECOND       Second*

Function: INTNX

Purpose: To return the date after a specified number of intervals have passed.
Syntax: INTNX('interval', start-date, increment <, 'alignment'>)
interval is one of the same values that are used with the INTCK function (placed in quotation marks).
start-date is a SAS date.
increment is the number of intervals between the start date and the date returned by the function.
alignment is an optional argument and has a value of BEGINNING (B), MIDDLE (M), END (E), or SAMEDAY (S). The default is BEGINNING.
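The effect of the alignment argument is easiest to see side by side. The following sketch advances one assumed start date by one month under three alignments, and also uses an increment of 0 to find the end of the current month; the variable names are hypothetical.

```sas
/* Sketch: INTNX with different alignment values (start date is an assumed example) */
data _null_;
   Start = '15MAR2003'd;
   Next_begin = intnx('month', Start, 1);             /* 01APR2003: default BEGINNING alignment */
   Next_end   = intnx('month', Start, 1, 'e');        /* 30APR2003: END of the next month */
   Next_same  = intnx('month', Start, 1, 'sameday');  /* 15APR2003: same day of the next month */
   Month_end  = intnx('month', Start, 0, 'e');        /* 31MAR2003: end of the current month */
   format Next_begin Next_end Next_same Month_end date9.;
   put Next_begin= Next_end= Next_same= Month_end=;
run;
```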
Function: YRDIF

Purpose: To return the difference in years between two dates (includes fractional parts of a year).
Syntax: YRDIF(start-date, end-date, 'basis')
start-date is a SAS date value.
end-date is a SAS date value.

basis is an argument that controls how SAS computes the result. It is written as two values separated by a slash, such as '30/360', 'ACT/360', 'ACT/365', or 'ACT/ACT'. The first value specifies the number of days in a month; the second value (after the slash) specifies the number of days in a year.
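A small sketch can show how the choice of basis changes the result for the same pair of dates. The dates are illustrative assumptions, and the comments spell out the arithmetic each basis implies.

```sas
/* Sketch: the same dates under different YRDIF basis values */
data _null_;
   Start = '01JAN2002'd;
   Full_year = yrdif(Start, '01JAN2003'd, 'ACT/ACT');  /* 365 actual days in a 365-day year: 1 */
   Half_30   = yrdif(Start, '01JUL2002'd, '30/360');   /* six 30-day months / 360 days: 0.5 */
   Half_act  = yrdif(Start, '01JUL2002'd, 'ACT/365');  /* 181 actual days / 365 days */
   put Full_year= Half_30= Half_act=;
run;
```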
Program:-

To demonstrate the date interval functions

data period;
   set dates;
   Interval_month = intck('month', Date1, Date2);
   Interval_year = intck('year', Date1, Date2);
   Year_diff = yrdif(Date1, Date2, 'actual');
   Interval_qtr = intck('qtr', Date1, Date2);
   Next_month = intnx('month', Date1, 1);
   Next_year = intnx('year', Date1, 1);
   Next_qtr = intnx('qtr', Date1, 1);
   Six_month = intnx('month', Date1, 6);
   format Next: Six_month date9.;
run;
title "Listing of Data Set PERIOD";
proc print data=period heading=h;
   id date1 date2;
run;

Function That Computes Dates of Standard Holidays
Function: HOLIDAY

Purpose: Returns a SAS date, given a holiday name and a year.
Syntax: HOLIDAY(holiday, year)
holiday is a holiday name (for example, 'CHRISTMAS' or 'THANKSGIVING').
year is a numeric variable or constant that represents the year.

Functions That Work with Julian Dates

This group of functions involves Julian dates. Julian dates are commonly used in computer applications and represent a date as a two- or four-digit year followed by a three-digit day of the year (1 to 365, or 366 if it is a leap year). For example, January 3, 2003, in Julian notation would be either 2003003 or 03003. December 31, 2003 (a non-leap year), would be either 2003365 or 03365.
Function: DATEJUL

Purpose: To convert a Julian date into a SAS date.
Syntax: DATEJUL(jul-date)
jul-date is a numerical value representing the Julian date in the form yyddd or yyyyddd.
Examples:-

For these examples, JDATE = 1960123.

Function            Returns
DATEJUL(1960001)    0 (01JAN1960 formatted)
DATEJUL(2003365)    16070 (31DEC2003 formatted)
DATEJUL(JDATE)      122 (02MAY1960 formatted)
Function: JULDATE

Purpose: To convert a SAS date into a Julian date.
Syntax: JULDATE(date)
date is a SAS date.
Examples

For these examples, DATE = '31DEC2003'D.

Function                 Returns
JULDATE(DATE)            3365
JULDATE('01JAN1960'D)    60001
JULDATE(122)             60123
Function: JULDATE7

Purpose: To convert a SAS date into a seven-digit Julian date.
Syntax: JULDATE7(date)
date is a SAS date.
Examples

For these examples, DATE = '31DEC2003'D.

Function                  Returns
JULDATE7(DATE)            2003365
JULDATE7('01JAN1960'D)    1960001
JULDATE7(122)             1960123
Program:-

Demonstrating the three Julian date functions

data julian;
   input Date : date9. Jdate;
   Jdate_to_sas = datejul(Jdate);
   Sas_to_jdate = juldate(Date);
   Sas_to_jdate7 = juldate7(Date);
   format Date Jdate_to_sas mmddyy10.;
datalines;
01JAN1960 2003365
15MAY1901 1905001
21OCT1946 5001
;
title "Listing of Data Set JULIAN";
proc print data=julian noobs;
   var Date Sas_to_jdate Sas_to_jdate7 Jdate Jdate_to_sas;
run;

Explanation:-

It is important to realize that Julian dates without four-digit years will be converted to SAS dates based on the value of the YEARCUTOFF system option. To avoid any problems, it is best to use seven-digit Julian dates.
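The dependence on YEARCUTOFF can be sketched by converting the same five-digit Julian date under two settings. The two cutoff values below are assumptions chosen for illustration, and the OPTIONS statements change the setting for the rest of the SAS session, so restore your site's value afterward.

```sas
/* Sketch: the same short Julian date under two assumed YEARCUTOFF settings */
options yearcutoff=1920;        /* two-digit years map to 1920-2019 */
data _null_;
   Short = datejul(5001);       /* yy=05, ddd=001: read as 2005 under this setting */
   Long  = datejul(1905001);    /* four-digit year: 01JAN1905 regardless of YEARCUTOFF */
   put Short= date9. Long= date9.;
run;

options yearcutoff=1900;        /* two-digit years map to 1900-1999 */
data _null_;
   Short = datejul(5001);       /* the same value is now read as 1905 */
   put Short= date9.;
run;
```

The seven-digit form gives the same answer in both steps, which is the reason for preferring it.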
