Course Description:
Introduction (through both lecture and supervised work, integrated in a practicum format) to elementary use and overview of SAS Version 8.2 for Windows, including data file organization, data management, data import and export (from/to other formats and operating systems), and basic analysis. Use of SAS on other platforms supported by ITC (Mac, Unix) will be addressed but not explicitly instructed.
This document is the first part of the Introduction to SAS workshop; the second part is also available online.
Prerequisites
Familiarity with DOS (file paths and directory structures) and Microsoft Windows (booting, menus, mouse, scrolling, saving, etc.).
Table of Contents
- Workshop Basics
- Workshop Structure
- License Warning
- Program Overview
- Syntax Conventions
- Obtaining the Files Used in this Tutorial
- Getting Started
- Starting SAS for Windows
- Managing a Dataset
- Making a Dataset Active
- Saving Commands
- Examining Log and Output
- Data Preparation
- Running Frequencies
- Missing Values
- Variable Labels
- Data Analysis
- Examining Differences
- Recoding Variables
- Saving your Data
- Generating Cross tabulations
- Documentation and Help
- SAS Tutorial
- SAS Manuals
- Sample Syntax
- Web Documentation
- Consulting Services
Workshop Basics
Workshop Structure
The goals of this document (and workshop) are to provide a brief introduction to SAS, explain a few fundamental commands, and practice some of the many features of SAS for Windows.
In this first session, we will focus on basic data management procedures, including how to:
- open and manage a data file,
- define data (resolve missing values, and create variable labels and value labels),
- run basic descriptive statistics (frequencies and descriptives),
- transform data (recoding, and computing new variables), and
- perform basic analytical procedures (cross-tabulations and regression analysis).
In the next session, we will look at additional procedures you may need in order to be productive in a SAS environment, including how to:
- read other data formats (e.g. raw ASCII data, Excel files, SPSS files),
- save the current dataset as a SAS permanent dataset, and
- use several advanced features, including IF-THEN statements, merging, and macros.
Both sessions will concentrate on introductory manipulation of SAS for Windows. They will touch on interpretation of output and invocation of particular procedures, but they are not a complete beginner's guide. The SAS for Windows Tutorial is highly recommended for such issues, and covers topics beyond the scope of these documents.
License Warning
SAS for Windows is a product of SAS Institute Inc. Information Technology and Communication (ITC) has a site license that permits ITC to distribute SAS to faculty, staff, and students for use in Charlottesville only. In addition, ITC agrees to provide user support: if you have a question regarding the SAS software, call the ITC Research Computing Support Center at 243-8800. Unauthorized copying and use of SAS software violates the copyright and site license and will result in ITC losing its site license. Please use SAS legally.
ITC provides access to a number of other general purpose statistical packages for faculty, staff, and graduate students. For a listing of available software and associated information, please see our Researchers website.
Program Overview
Variation: The use of SAS can vary from simple to complex, depending on the condition of the data and sophistication of the analysis. SAS programs may include a variety of activities: data input, data transformation, elementary or advanced statistical analysis, creation of data sets, or custom output.
Routine: Any data analysis (whether with SAS or another program) has five steps:
- prepare data,
- prepare commands,
- invoke commands,
- examine output, and
- save your work.
In practice, you will typically repeat several of these steps, particularly as you correct errors and re-invoke commands.
Statements: SAS programs consist of SAS statements, largely contained in DATA and PROC steps. DATA steps introduce and prepare data for use in SAS. PROC steps issue procedures, to actually perform analysis and generate desired output. Each PROC step calls a program to process (e.g. list, sort, compute, plot, and/or print) the information stored in a data set using keywords such as PROC FREQ for frequencies, PROC MEANS for descriptive statistics, and PROC REG for regression.
Order: A typical SAS run will encompass both DATA steps and PROC steps. You can use several DATA steps to create multiple data sets, and these in turn can be processed by several PROC steps. These steps may occur in any order, though a DATA step must precede a PROC step using that data.
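As a minimal sketch of this ordering, the program below uses one DATA step to create a small data set, followed by two PROC steps that process it. The data set name staff, its variables, and its values are hypothetical and are not part of the workshop files.

```sas
/* DATA step: creates a temporary data set named staff (hypothetical values) */
data staff;
   input name $ dept $ salary;
   datalines;
Ann Sales 30000
Bob Sales 28000
Cy Audit 35000
;
run;

/* PROC steps: each one processes the data set created above */
proc freq data = staff;
   tables dept;
run;

proc means data = staff;
   var salary;
run;
```

The DATA step comes first because both PROC steps read the data set it creates; the two PROC steps could be swapped without affecting each other.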
Errors: By default, SAS does not write error messages and warnings to your terminal as it executes. When SAS completes execution, it writes a summary of the errors encountered to the LOG file. You should always begin examining the results of your SAS program by checking the LOG file for ERROR, WARNING, and NOTE messages.
Etc: SAS has many capabilities that this course does not have time to cover. More than just a statistical analysis tool, it is capable of complex report generation, database management, and graphics. For further details, see the SAS Language Usage and SAS Language Reference manuals (available at the Research Computing Support Center in 244 Wilson).
Syntax Conventions
Keywords begin most statements and are recognized as commands (e.g. DATA or PROC). They are reserved as commands and should not be part of file or variable names. (More on names later.)
Case sensitivity: SAS does not care if commands are typed in UPPERCASE or lowercase. (Unix users should note, however, that the RS/6000 AIX operating system is case-sensitive.)
Spaces: Commas, spaces, and the "equals" sign are NOT used interchangeably in SAS. When items are separated by spaces, the number of spaces is not important, since "extra" spaces are ignored.
Lines: Every statement must end with a semi-colon, although a statement may be spread over several lines and can begin anywhere on a line. All statements must end by column 80, although you do not have to use all 80 columns. It is also possible (but not recommended for debugging purposes) to have multiple statements on a single line.
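The case, spacing, and line conventions above can be illustrated with a short sketch. The data set demo and its variables are hypothetical, created only so the example stands on its own.

```sas
/* Hypothetical two-row data set, created only for this illustration */
data demo;
   input id score;
   datalines;
1 10
2 20
;
run;

/* Keywords in uppercase; the statement is spread over several lines
   and padded with extra spaces. It ends at the semicolon. */
PROC PRINT
     DATA   =   demo
     NOOBS ;
RUN ;

/* The same step in lowercase, with both statements on one line:
   legal, but harder to debug */
proc print data=demo noobs; run;
```

Both PROC PRINT steps should produce the same listing, since SAS ignores case in keywords and treats any run of spaces or line breaks between items as a single separator.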
Errors: Common errors include omitting a semi-colon at the end of a statement, and omitting a period in a format modifier (which we'll discuss in the second session).
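A sketch of the missing semi-colon error follows. It assumes the sasclass library and the bankdata data set described later in this document are available. The faulty version is shown inside a comment so that the block can be submitted as written.

```sas
/* WRONG (do not run): no semicolon after the PROC PRINT statement,
   so SAS reads RUN as part of that statement and reports a syntax
   error in the LOG:

      PROC PRINT DATA = sasclass.bankdata
      RUN ;
*/

/* RIGHT: each statement ends with its own semicolon */
PROC PRINT DATA = sasclass.bankdata ;
RUN ;
```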
Obtaining the Files Used in this Tutorial
Several files have been created for use in this tutorial. The tutorial assumes that the files are saved on the hard drive of your PC in an area named C:/Temp (of course, you may choose to save these files elsewhere). If you have problems downloading the files with Netscape, using Internet Explorer may resolve the problem.
To save the following files to the C:/Temp area of your hard drive, right-click on the link corresponding to each file and choose the Save As... option from the menu. When the Save As... dialog box opens, use the navigation tools to designate the C:/Temp area as the "Save in" area, then click the Save button to save the file there.
Using the above method, save the following files in your C:/Temp directory.
- bank.dat - The ASCII file containing the raw bank data
- bankdata.sas7bdat - The SAS dataset (for Part 1)
- bankend1.sas7bdat - The SAS dataset (for Part 2)
- course0.sas - A file containing the SAS commands to read bank.dat
- course1.sas - A file containing the SAS commands for Part 1 of the tutorial
- course2.sas - A file containing the SAS commands for Part 2 of the tutorial
- bank.xls - An Excel file containing the bank data
- sasdata.dat - A raw data file
Getting Started
Starting SAS for Windows
To start SAS, click on each of these five selections, in this order:
- the START button on the screen.
- the PROGRAMS listing.
- the STATISTICAL listing.
- the SAS folder listing.
- the SAS 8 listing (or listing for other version, as instructed).
SAS for Windows consists of five sub-windows. By default, the EXPLORER, ENHANCED EDITOR, and LOG windows are initially open, while the OUTPUT and RESULTS windows are hidden under these three.
You can position and resize these windows any way you wish. To bring a partially hidden window to the front, click once anywhere on it. To find a window that is not showing at all, select its item on the SAS window bar. You can also right-click anywhere in the LOG or OUTPUT windows, choose VIEW from the pop-up context menu, and select any window from the list at the top.
The Enhanced Editor window provides a number of useful editing features, including color coding and syntax checking of SAS language.
The Results window helps you navigate and manage output from SAS programs that you submit. You can view, save, and print individual items of output. By default, the Results window is positioned behind the Explorer window and it is empty until you submit a SAS program that creates output. Then it moves to the front of your display.
In the Explorer window, you can view and manage your SAS files, and create shortcuts to non-SAS files. Use this window to create new libraries and SAS files, to open any SAS file, and to perform most file management tasks such as moving, copying, and deleting files.
To create a new library, sasclass, with directory C:/Temp, click the Explorer window, then select FILE, NEW in the pull-down menu. In the New Library dialog box, type "sasclass" next to Name and "C:/Temp" next to Path, as shown here:
Now click OK. A new library called sasclass will show up in the Explorer window.
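The same library can also be assigned from a program rather than through the dialog box. The sketch below uses the LIBNAME statement and assumes that the C:/Temp folder already exists and holds the tutorial files.

```sas
/* Assign the libref sasclass to the folder C:/Temp.
   The folder must already exist; LIBNAME does not create it here. */
LIBNAME sasclass 'C:/Temp' ;
```

A libref assigned this way lasts for the current SAS session, so the statement is usually placed at the top of a program that refers to sasclass.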
Note that SAS uses a single icon on the start bar, whereas SPSS, for example, has a separate icon for each syntax, output, and data window.
In this session, we will enter commands by typing directly into the ENHANCED EDITOR window and run commands directly from this window. (Alternately, you could create such a file in any text editor or word processor. Note, however, that you must save the data or command file as an ASCII text or DOS file, not in the format of the word processor you are using.) You may submit a single command, an entire command file, or merely part of the file; we will try each in this session.
First, let's look at some data.
Managing a Dataset
In many statistical programs, data is typically created and displayed in a conventional spreadsheet format (a grid of rows and columns, where each row is a case and each column is a variable). SAS does not typically display data in this format, but there is a VIEWTABLE option (effective with version 6.12) that will let you look at data in this way.
Opening a SAS data file using the VIEWTABLE option
Step 1: From the drop-down menus, choose TOOLS, TABLE EDITOR, as shown here:
Step 2: To view a data set, you will need to get to the Open File dialog box. Select FILE and OPEN as shown here:
Step 3: Next, select the location of the dataset you would like to manage. Select the library where the file is, then select the bankdata SAS dataset. (Note that only SAS datasets are listed.) In the image above, the file bankdata.sas7bdat is listed as BANKDATA. Once you've navigated to the correct path and see the file you want listed, click once on that filename to highlight it.
Step 4: Click OPEN, and the VIEWTABLE window should open as follows:
Step 5: There are two modes in SAS for looking at the table. The "browse" mode only allows you to look at the data. So that we can also manipulate the data set in VIEWTABLE, choose EDIT, Edit Mode, as shown here:
If you wanted or needed to, you could edit the actual data in this window -- changing a particular value or values, for instance. You can also investigate and manipulate labels given to particular variables.
Labeling a Variable
To associate a label with a variable (within VIEWTABLE) follow these steps:
Step 1: Click at the top of the third column to highlight that column's variable name.
Step 2: Double-click that variable name (bdate) to see the SAS variable dialog box.
Step 3: Type "Birth date of the respondent" in the white box marked "Label"
Step 4: Click on APPLY then CLOSE.
Then close VIEWTABLE by selecting FILE > CLOSE.
Using VIEWTABLE is fine to look at or edit data, but won't make the data file active (i.e. available for you to perform statistical analysis on it).
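The label can also be assigned with SAS statements instead of the VIEWTABLE dialog. The following is a sketch using PROC DATASETS; it assumes the sasclass library has been created as described above and that bankdata contains the variable bdate.

```sas
/* Attach a label to the variable bdate in sasclass.bankdata.
   NOLIST suppresses the directory listing in the LOG. */
proc datasets library = sasclass nolist;
   modify bankdata;
   label bdate = 'Birth date of the respondent';
run;
quit;
```

Because this changes the stored dataset itself, the label will appear the next time the file is opened in VIEWTABLE or described with PROC CONTENTS.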
Making a Dataset Active
To create a SAS dataset, data may be included directly in the SAS command file (using the cards syntax) but is more often read into SAS from a separate data file. Frequently, this data will be in some non-SAS format, such as a columnar ASCII (text or DOS) file or an Excel file. In the second session, we will discuss how to create and import these other formats. We'll begin today with the bankdata dataset, which is already in SAS format.
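For reference, a minimal sketch of the cards syntax mentioned above is shown here. The data set name inline, the variables, and the values are all hypothetical; they are typed directly into the program rather than read from a file.

```sas
/* Hypothetical values, included directly in the command file */
DATA inline ;
   INPUT id salary ;
   CARDS ;
1 8400
2 24000
3 10200
;
RUN ;

/* List the three observations that were just read */
PROC PRINT DATA = inline ;
RUN ;
```

The data lines begin immediately after the CARDS statement and end at the line containing only a semicolon. DATALINES is an alternative keyword for the same statement.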
Below we demonstrate the use of a series of SAS commands to operate on the bankdata data set.
Step 1: Clear the data from the ENHANCED EDITOR window by clicking in the ENHANCED EDITOR window to make it the active window (or by selecting it from the Windows menu), then select Edit > Clear All from the pop-up menu, as shown here:
Step 2: Type the following statements in the ENHANCED EDITOR window. (Remember that you are only setting up the commands here, which is not the same as invoking those commands.)
PROC CONTENTS DATA = sasclass.bankdata ;
PROC PRINT DATA = sasclass.bankdata ;
RUN ;
What these commands do:
1) The first line runs the procedure "contents" to view the contents of the bankdata SAS data set (which resides in the library area called sasclass).
2) The second line prints the data contained in the bankdata data set.
3) The last line tells SAS to process the previous two lines.
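A common variation, sketched below, ends each step with its own RUN statement and limits the listing to the first few observations. The OBS= data set option and the value 10 are illustrative choices, not part of the workshop's command files, and the sasclass library is assumed to exist.

```sas
/* Describe the data set: variable names, types, and labels */
PROC CONTENTS DATA = sasclass.bankdata ;
RUN ;

/* Print only the first 10 observations rather than the whole file */
PROC PRINT DATA = sasclass.bankdata (OBS = 10) ;
RUN ;
```

Limiting the printout this way keeps the OUTPUT window manageable when a dataset has many observations.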
Saving Commands
You have just written your first SAS program! Now, save it with the following steps:
Step 1: Choose FILE, SAVE AS from the menu bar
Step 2: Click the down arrow, scroll up, and choose C:
Step 3: Double-click on the Temp directory icon
Step 4: Type practice.sas in the filename box, and
Step 5: