
STATISTICAL ANALYSIS SYSTEM

SAS, the Statistical Analysis System, is not only a reporting tool; it also reads, manipulates and analyzes data.
A SAS program has two basic building blocks: the DATA step and the PROC step.
An application can have any number of DATA and PROC steps. Coding generally starts
with a DATA step. SAS keywords and names are not case sensitive, so statements can be written in lower case,
upper case or mixed case.

The DATA statement creates a temporary dataset to hold the input records. Once all the input has been
read, control passes to the PROC step for analysis and reporting. SAS is flexible, though: analysis is
not confined to the PROC step and can also be done in the DATA step.

There is no separate compile step for the source code. In batch mode we simply submit the JCL, and
SAS compiles and executes the application automatically. If it encounters any
syntax or run-time errors, it reports them in the SAS log. The keyword RUN tells SAS that the
preceding statements are ready for execution.

Against this background, let us analyze the following program.

*******************************************************************************************************************
First SAS Program

*******************************************************************************************************************
//STEP7A EXEC SCSAS,REGION=5000K,OPTIONS='CAPSOUT'
//FT11F001 DD SYSOUT=*
//FT12F001 DD SYSOUT=*
//WORK DD UNIT=SYSDA,SPACE=(CYL,(10,10))
//SORTMSG DD SYSOUT=*
//ISUSP DD DSN=RCPRC26.RC.RC1098C4.PWL.SUSPENSE.MASTER,DISP=SHR
//ASIFILE DD DSN=RCPRC26.RC.RC1127DR.SML.ASI.DEMAND.MSTR,DISP=SHR
//VEHSUSP DD DSN=RCPRC26.RC.RC1046PB.VEHICLE.SUSPENSE,DISP=SHR
//PENDNG DD DSN=RCPRC26.RC.RC1046PB.PWL.PENDING.FILE,DISP=SHR
//SYSIN DD *

DATA ISUSP;
INFILE ISUSP;
INPUT @ 1 RECTYP$ CHAR1.
@ 2 DOCNO$ CHAR6.
@ 12 BASIZE$ CHAR2.
@ 14 PARTNO$ CHAR18.
@ 32 RCPTQTY PD4.
@ 36 STRQTY PD4.
@ 40 SHIPQTY PD4.
@ 44 RCPTDT PD4.
@ 48 GENDATE PD4.
@ 52 UNITISS$ CHAR4.
@ 56 SRC_CD$ CHAR2.
@ 58 CPT$ CHAR3.
@ 61 STATUS$ CHAR2.
@ 63 VENDCD$ CHAR5.
@ 68 SHIPTO$ CHAR5.
@ 73 ASNTYP$ CHAR1.
@ 74 MODE$ CHAR2.
@ 76 REL_NBR$ CHAR6.
@ 82 CAR_TRK$ CHAR11.
@ 93 DUPCONV$ CHAR3.
@ 96 SHIPDT PD3.
@ 99 XMITAL PD9.
@ 108 IC_FLAG$ CHAR1.
@ 109 TL_GATE$ CHAR1.
@ 110 INVOICE$ CHAR11.
@ 121 QCQTY PD2.
@ 123 MERCHCD$ CHAR3.
@ 126 XESSQTY PD4.
@ 130 XESS_FLG$ CHAR1.
@ 131 C1_CONF$ CHAR1.
@ 132 C3_CONF$ CHAR1.
@ 133 C1STRQT PD4.
@ 137 C3STRQT PD4.
@ 141 C1SHIP PD4.
@ 145 C3RECV PD4.
@ 149 REASON$ CHAR1.
@ 150 SDR_NO$ CHAR6.
@ 156 SDR_QTY PD4.
@ 160 CONFVEND$ CHAR5.
@ 165 AGEFLAG PD2.
@ 167 CONFQTY PD4.
@ 171 CONFSTAT$ CHAR2.
@ 173 ARRDT PD5.
@ 178 ARRTIME$ 4.
@ 182 RECLOC$ 1.
@ 183 CONFDATE PD4.
@ 187 MACHCD$ CHAR2.
@ 189 LABRHRS PD3.
@ 192 PKGSTK$ CHAR1.
@ 193 SMLSTAT$ CHAR2.
@ 195 CLKQTY PD4.
@ 199 C1CLKQTY PD4.
@ 203 DLRNET PD4.
@ 207 SHIP_TM$ CHAR4.;

DATA ASIFILE;
INFILE ASIFILE;
INPUT @ 1 CARTRK$ CHAR11.
@ 12 DUPCONV$ CHAR3.
@ 15 SUP_CD$ CHAR5.
@ 20 MODE$ CHAR2.
@ 22 ASN_NBR$ CHAR11.
@ 33 ASN_TYP$ CHAR1.
@ 34 ARRDT$ 8.
@ 42 ARRTIME$ CHAR4.
@ 46 SHIPDT$ 8.
@ 54 PARTNO$ CHAR18.
@ 72 ASIQTY$ 7.
@ 79 SHIPTO$ CHAR5.
@ 84 PACKSLIP$ CHAR11.
@ 95 BILL$ CHAR6.
@ 101 SHIPDT 6.
@ 107 INP_DT$ CHAR8.
@ 117 PLACE_DT$ CHAR6.
@ 123 DISTREQ$ CHAR1.
@ 124 PROC_CD$ CHAR1.
@ 125 CPT$ CHAR3.
@ 128 MDSR_CD$ CHAR3.
@ 131 STATUS$ CHAR2.
@ 133 STD_COST$ CHAR1.
@ 134 MSQ$ CHAR3.
@ 137 PROC_FLG$ CHAR1.
@ 138 RECLOC$ CHAR1.
@ 139 RAIL_CD$ CHAR1.
@ 140 NOREQCD$ CHAR1.
@ 141 WONBR$ CHAR6.
@ 147 XSQTY$ 6.
@ 153 XSFLAG$ CHAR1.
@ 154 SHIP_TM$ CHAR4.
@ 158 C_200_1$ CHAR200.
@ 358 C_200_2$ CHAR200.
@ 558 C_200_3$ CHAR200.
@ 758 C_200_4$ CHAR200.
@ 958 C_200_5$ CHAR200.
@1158 C_200_6$ CHAR62.;
DATA VEHSUSP;
INFILE VEHSUSP;
INPUT @1 CARTRK10$ CHAR10.
@11 DUPCONV$ CHAR3.
@14 ARRDT 8.
@22 ARRTIME 4.
@26 PRINTDT 8.
@34 MODE$ CHAR2.
@36 SHIPDT 8.
@44 SHIP_TM$ CHAR4.
@48 EOCDT 8.
@56 EOCTIME 4.
@60 SHIPTO$ CHAR5.
@65 BADGE$ CHAR4.
@69 RSEOCDT 8.
@77 RSEOCTM 4.
@81 CAR_TRK$ CHAR11.
@92 FILLER$ CHAR34.;
DATA PENDING;
* THE DD NAME IN THE JCL IS PENDNG;
INFILE PENDNG;
* READ THE RECORD TYPE AND HOLD THE RECORD WITH A TRAILING @;
INPUT @1 RECTYPE$ CHAR1. @;
* KEEP ONLY THE HEADER RECORDS;
IF RECTYPE='H';
INPUT @2 CONVEY$ CHAR7.
@9 DUPCONV 3.
@12 MODE$ CHAR2.
@14 SCAC$ CHAR4.
@18 SHIPTO$ CHAR5.
@23 NO_PALT 9.
@32 NO_RACK 9.
@41 BULK 9.
@50 BIN 9.
@59 CAROUSEL 9.
@68 CONTAINR 9.
@77 RECLOC$ CHAR1.
@78 C_200_1$ CHAR200.
@278 C_200_2$ CHAR200.
@478 C_24$ CHAR24.;
CAR_TRK=SCAC||CONVEY;
* LAYOUTS OF THE OTHER RECORD TYPES FOLLOW AS COMMENTS.
  THEY ARE NOT READ BECAUSE ONLY H RECORDS ARE KEPT.
@1 RECTYPE$ CHAR1.
@2 ASN_NO$ CHAR11.
@13 ASN_TYP$ CHAR1.
@14 SHIP_TO$ CHAR5.
@19 SHIP_FR$ CHAR5.
@24 SHIP_DT$ CHAR8.
@32 SHIP_TM$ CHAR4.
@36 GROSS_WT 12.
@48 C_200_1$ CHAR200.
@248 C_200_2$ CHAR200.
@448 C_54$ CHAR54.;
* @1 RECTYPE$ CHAR1.
@2 PART_NO$ CHAR18.
@20 PARTTYP$ CHAR2.
@22 SHIPQTY 9.
@31 CNTRLBL$ CHAR9.
@40 CONTTYP$ CHAR5.
@45 PARTLBL$ CHAR9.
@54 PACKSLIP$ CHAR11.
@65 GDS_NO$ CHAR6.
@71 DLR_DEST$ CHAR5.
@76 GDSORDLN$ CHAR9.
@85 CUM_SHIP$ 9.
@94 INYARDTS$ CHAR26.
@76 C_200_1$ CHAR200.
@76 C_177_1$ CHAR177.;
* @1 RECTYPE$ CHAR1.
@2 TOTPARTS 9.
@11 TOTSHIP 9.
@20 C_200_1$ CHAR200.
@220 C_200_2$ CHAR200.
@420 C_82$ CHAR82.;
RUN;

PROC SORT DATA=ISUSP;
BY CAR_TRK DOCNO;

PROC SORT DATA=PENDING;
BY CAR_TRK;

DATA PENDSUSP;
MERGE PENDING (IN=INPUTA) ISUSP (IN=INPUTB);
BY CAR_TRK;
IF INPUTA & INPUTB ;
RUN;

DATA PENDASI;
MERGE PENDING (IN=INPUTC) ASIFILE (IN=INPUTD);
BY CAR_TRK;
IF INPUTC & INPUTD;
RUN;
DATA PENDREP;
MERGE PENDSUSP (IN=INPUTE) PENDASI (IN=INPUTF);
BY CAR_TRK;
IF INPUTF & NOT INPUTE;
RUN;

PROC PRINT DATA=PENDSUSP;
TITLE 'FORD MOTOR COMPANY - PARTS REDISTRIBUTION CENTER';
TITLE2 'OUTSTANDING RAIL/TRUCK S';
VAR CAR_TRK DOCNO ARRDT ARRTIME CONFQTY;

*******************************************************************************************************************
END
*******************************************************************************************************************

The SAS application reads four flat files, allocated in the JCL under the DD names ISUSP, ASIFILE,
VEHSUSP and PENDNG. The SAS program itself starts after SYSIN DD *:

-DATA ISUSP -- tells SAS to create a temporary dataset named ISUSP.
-INFILE ISUSP -- specifies the external input file, i.e. the flat file allocated to the DD name ISUSP in the JCL.
-INPUT -- describes the layout of the input record and reads its fields into the variables of the
temporary dataset.
-@n -- moves the pointer to column n of the raw input record, where the next variable starts.
-$CHARn. -- reads a character field n bytes long (RECTYP, for example, is a 1-byte character field
starting in column 1).
-PD4. -- reads a packed decimal field that occupies 4 bytes.

In this way the data from the flat file ISUSP is read into the temporary dataset ISUSP.
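
As a minimal sketch of column-pointer input, the step below reads three in-stream records instead of a flat file. The dataset name DEMO, the variable QTY and the sample values are made up for illustration. Packed decimal fields are left out because they cannot be typed as in-stream text.

```sas
* Hypothetical example. Names and data are illustrative only;
DATA DEMO;
   INPUT @1  RECTYP $CHAR1.    /* 1 character starting in column 1    */
         @2  DOCNO  $CHAR6.    /* 6 characters starting in column 2   */
         @8  QTY    4.;        /* 4-digit number starting in column 8 */
   CARDS;
HA123450012
DB234560340
HC345671500
;
RUN;

PROC PRINT DATA=DEMO;
RUN;
```

Each @n moves the pointer to an absolute column, so the fields can be listed in any order and unused columns can simply be skipped.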

In the same way, the data from the flat files ASIFILE, VEHSUSP and PENDNG are read into the
temporary datasets ASIFILE, VEHSUSP and PENDING respectively.

Only the PENDING records whose RECTYPE is 'H' are kept, and for these CAR_TRK is set to the
concatenation of SCAC and CONVEY.

In the first PROC SORT step, the temporary dataset ISUSP is sorted by the primary key CAR_TRK and
the secondary key DOCNO.

In the second PROC SORT step, the temporary dataset PENDING is sorted by the key CAR_TRK.

The two temporary datasets PENDING and ISUSP are then merged by the key CAR_TRK
and the result is stored in the temporary dataset PENDSUSP. The 'IN=' option creates a variable that
shows whether a dataset contributed data to the current observation: INPUTA is 1 if PENDING
contributed to the current observation and 0 otherwise, and INPUTB does the same for ISUSP. The
statement IF INPUTA & INPUTB; keeps only those observations to which both datasets contributed.

In the same way, the observations of the temporary datasets PENDING and ASIFILE are merged by the
key CAR_TRK and stored in PENDASI.

For the temporary dataset PENDREP, PENDSUSP and PENDASI are merged by CAR_TRK, and only the
observations that come from PENDASI and not from PENDSUSP are kept (IF INPUTF & NOT INPUTE;).
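
The IN= logic can be tried on a small scale. The sketch below uses two made-up datasets, A and B, with a hypothetical key variable KEY. It is not part of the application, but it applies the same two subsetting conditions.

```sas
* Hypothetical data sets A and B, both already in KEY order;
DATA A;
   INPUT KEY $ AVAL;
   CARDS;
K1 10
K2 20
K3 30
;
DATA B;
   INPUT KEY $ BVAL;
   CARDS;
K2 200
K3 300
K4 400
;
* Keep keys present in both inputs (like PENDSUSP and PENDASI);
DATA BOTH;
   MERGE A (IN=INA) B (IN=INB);
   BY KEY;
   IF INA & INB;
RUN;
* Keep keys present in B but not in A (like PENDREP);
DATA ONLYB;
   MERGE A (IN=INA) B (IN=INB);
   BY KEY;
   IF INB & NOT INA;
RUN;
```

BOTH holds the observations for K2 and K3, and ONLYB holds the single observation for K4. The IN= variables exist only while the step runs and are not written to the output datasets.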

The VAR statement names the variables to be printed. In our application,

VAR CAR_TRK DOCNO ARRDT ARRTIME CONFQTY;

specifies that only five fields, namely CAR_TRK, DOCNO, ARRDT, ARRTIME and CONFQTY, are to be
printed in the output.

TITLE '.....' gives the title to the printed output.

Some other points:

***********************

JCL PROCs used to execute SAS programs in the following regions:

TSOXB : SASTEST ( SYS1.HFCP )
TSOXC : SAS7 ( SYST.SAS.V700.PROCLIB(SAS7) )
        SAS609 ( SYST.SAS.V609.PROCLIB(SAS609) )

1. The SAS program from SAS: Fundamentals uses the following statements.

- The DATA statement tells SAS to create a data set.
- The INPUT statement tells SAS how the data is organized.
- The LABEL statement allows you to print more descriptive column headings.
- The CARDS statement tells SAS to look for instream data.
- The PROC PRINT statement requests a basic printed report.
- The TITLE statement adds a descriptive heading to the entire report.
- The RUN statement tells SAS that the preceding statements are ready to be executed.
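
A short program that uses each of these statements might look like the sketch below. It is an illustration only, not the program from SAS: Fundamentals, and the dataset name PAY, the variables and the values are invented.

```sas
* Illustrative program with made-up names and data;
DATA PAY;
   INPUT NAME $ BASIC HRA;
   LABEL NAME  = 'Employee Name'
         BASIC = 'Basic Pay'
         HRA   = 'HRA Amount';
   CARDS;
ANN 1000 200
BOB 1500 300
;
RUN;

PROC PRINT DATA=PAY LABEL;
   TITLE 'Pay Listing';
RUN;
```

The LABEL option on PROC PRINT is what makes the report show the labels instead of the variable names as column headings.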

2. To bring raw data in an external file into your DATA step, use INFILE in the DATA step
in place of CARDS.

The syntax of the INFILE statement is INFILE 'external-data-filename';
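
As a sketch of the switch from CARDS to INFILE, the example below first writes two records to a scratch file and then reads them back. RAWIN is a hypothetical fileref, and FILENAME with the TEMP device is assumed to be available in the SAS release in use. In a batch job the fileref would normally be a DD name from the JCL instead.

```sas
* RAWIN is a hypothetical fileref. TEMP makes a scratch file;
FILENAME RAWIN TEMP;

* Write two sample records so the example is self-contained;
DATA _NULL_;
   FILE RAWIN;
   PUT 'ANN 1000';
   PUT 'BOB 1500';
RUN;

* INFILE replaces CARDS. The INPUT statement is unchanged;
DATA PAY2;
   INFILE RAWIN;
   INPUT NAME $ BASIC;
RUN;
```

INFILE accepts either a quoted physical file name, as in the syntax above, or a fileref such as RAWIN.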

3. To save a SAS program to an external file, use the FILE command on the PROGRAM EDITOR
window command line.

The syntax of the FILE command is FILE 'external-program-filename'

4. To bring an external program file into your SAS session, issue the INCLUDE command on the
PROGRAM EDITOR window command line.

The syntax of the INCLUDE command is INCLUDE 'external-program-filename'

5. Here are the rules for coding SAS statements.

- SAS statements end with a semicolon.
- SAS statements can be coded in uppercase, lowercase or in mixed case.
- More than one SAS statement can be coded on a line.
- A SAS statement can be continued on the next line as long as no words are split.
- SAS statements can be coded in free form. That is, you can start and end in any
column on the statement line.
- Blanks and special characters separate words in SAS statements.

6. Here are the rules for SAS names such as data set names or variable names.

- A name can be up to eight characters in length. (This is the Version 6 limit; Version 7 and
later allow up to 32 characters for data set and variable names.)
- The first character must be a letter or an underscore.
- Later characters may include letters, numbers or underscores
(no other characters are permitted).
- SAS names cannot contain blanks.

When you submit a DATA step to SAS, it is first compiled. The compile phase
- checks the syntax of the SAS statements and compiles the statements (i.e. it translates them
into code the computer can understand).
- creates the areas of memory (the input buffer and the program data vector, or PDV) where SAS builds
the data set, together with the descriptor information.

The input buffer is an area of memory set aside to hold one record of data from the raw data file. At
compile time, SAS creates the input buffer, but it doesn't put anything into it.

The program data vector is another buffer-like entity that will hold all the variables used in the DATA
step. At compile time, SAS defines the PDV, but, like the input buffer, it doesn't put anything into it.

Note: The main difference between the input buffer and the PDV is that the input buffer holds one record
of raw data, whereas the PDV holds intermediate variables in addition to the input data. Consider the
following case:

Suppose the raw input data contains three fields: basic pay, HRA and deductions.
If the application needs to compute the gross and net totals, we do it as follows:

gross = basic + hra;
net = gross - deductions;

Here gross and net are two intermediate variables (just like WS variables). So the PDV contains
basic, hra and deductions as well as gross and net.
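
To see the PDV at work, the sketch below reads two made-up pay records and writes the PDV to the SAS log on each iteration. The dataset name SALARY and the values are invented, and deductions is shortened to DEDUCT to stay within the eight-character name limit.

```sas
* Hypothetical example with made-up data;
DATA SALARY;
   INPUT BASIC HRA DEDUCT;
   GROSS = BASIC + HRA;
   NET   = GROSS - DEDUCT;
   * Write the current contents of the PDV to the SAS log;
   PUT _ALL_;
   CARDS;
1000 200 150
1500 300 250
;
RUN;
```

PUT _ALL_ lists every variable in the PDV, including the automatic variables _N_ and _ERROR_, which are held in the PDV but are not written to the output dataset.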

As SAS builds the temporary dataset, the PDV holds one observation at a time.

The optional question mark (?) and double question mark (??) format modifiers control what SAS
writes to the log when invalid data values are read. The ? modifier
suppresses the invalid data message. The ?? modifier suppresses both the invalid data message
and the listing of the input line and, in addition, prevents the automatic variable _ERROR_ from
being set to '1' when invalid data are read.
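
A minimal sketch of the ?? modifier follows. The dataset name CHECK, the variables and the records are made up, and the second record deliberately holds XX where a number is expected.

```sas
* Hypothetical example. The second record has XX in a numeric field;
DATA CHECK;
   INPUT @1 ID $CHAR1.
         @3 AMT ?? 3.;
   CARDS;
A 100
B XX
C 300
;
RUN;
```

AMT is set to missing for record B, no invalid data message is written to the log, and _ERROR_ stays at 0. Replacing ?? with ? would still suppress the message, but _ERROR_ would be set to 1 and the input line would be listed.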

Two automatic variables are generated for each DATA step: _N_ and _ERROR_.
_N_ is initially set to '1'. Each time the DATA step loops past the DATA statement, _N_ is
incremented by '1'. Its value represents the number of times the DATA step has iterated.

_ERROR_ is '0' by default but is set to '1' whenever an error is encountered, such as an input data
error or a conversion error. We can test this variable to help locate errors in data records and to
write an error message to the SAS log.

E.g.: if _ERROR_ = 1 then put _infile_ ;

or, more briefly:

      if _ERROR_ then put _infile_;

PUT _INFILE_ writes the last record read from the file currently being used as input to the SAS log.
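
Both automatic variables can be used together to report bad records. In the sketch below the dataset name READS, the variables and the records are made up, and the second record is deliberately invalid.

```sas
* Hypothetical example. The second record has XX in a numeric field;
DATA READS;
   INPUT @1 ID $CHAR1.
         @3 AMT 3.;
   * On bad data, log the iteration number and the raw record;
   IF _ERROR_ THEN DO;
      PUT 'Bad record at iteration ' _N_;
      PUT _INFILE_;
   END;
   CARDS;
A 100
B XX
C 300
;
RUN;
```

All three observations are still written to READS; the bad record simply gets a missing value for AMT and a message in the log.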

*******************************************************************************************************************
End

*******************************************************************************************************************
