Last updated: February 13, 2018
SAS Programming: Mine Your Spreadsheets for Statistical Data
SAS is a specialized programming language primarily designed for performing statistical analysis of data from spreadsheets or databases. SAS is used to compile such data, analyze it, and output the results to tables, graphs, and other text or web-based documents. Unlike built-in tools available from programs such as Microsoft Excel, SAS allows users to retrieve and manage data from a variety of sources, and offers a far greater degree of control and freedom when manipulating and compiling that data.
The SAS programming language was designed specifically for the SAS System software suite. The suite provides a graphical interface for non-programmers, as well as several advanced options that are only available through the SAS language.
SAS programming uses a two-step approach to handling data. In the DATA step, the program retrieves data from its source and uses it to create a SAS data set. In the PROC step, the program analyzes that data. Each of these steps is broken down into a series of statements. In the DATA step, statements instruct the software to perform an action, read a data set, or alter the data's appearance. In the PROC step, statements call named procedures, sort data, or display results.
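The two steps described above can be sketched in a short program. This is a minimal illustration rather than a recommended workflow: the data set name `yields`, its variables, and the inline values are all made up for the example, and a real program would more likely read from a file or database than from `datalines`.

```sas
/* DATA step: read raw values and build a SAS data set.          */
/* The name "yields" and every value below are hypothetical.     */
data yields;
    input plot $ fertilizer $ yield;   /* $ marks a character variable */
    datalines;
A1 none 41.2
A2 none 39.8
B1 nitro 47.5
B2 nitro 49.1
;
run;

/* PROC step: call a named procedure to sort the data set ...    */
proc sort data=yields;
    by fertilizer;
run;

/* ... then another to display summary statistics for each group */
proc means data=yields mean min max;
    by fertilizer;
    var yield;
run;
```

Each block ends with `run;`, and the data set created in the DATA step is what the later PROC steps refer to through `data=yields`. The sort comes first because `BY`-group processing in `PROC MEANS` expects the data to be ordered by the `BY` variable.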
Work on SAS began in 1966 at North Carolina State University (NCSU), through funding provided by the National Institutes of Health. At that time, the newly re-hired programmer Anthony Barr was tasked with developing analysis of variance and regression software that could run on IBM System/360 computers and would be used to analyze agricultural data. Barr, along with an NCSU student, James Goodnight, released the first version of SAS in 1972, but the project lost funding almost immediately afterward. Barr and Goodnight continued working on the project, and it picked up funding from the University Statisticians of the Southern Experiment Stations in 1973. Several new members joined the team at this time, introducing new features such as econometrics, matrix algebra, and new programming functionality.
In 1976, the team pulled the project from North Carolina State University and incorporated it as SAS Institute Inc. Throughout the 1980s and 1990s, SAS was introduced to several new platforms and its features were further expanded and refined. In the 2000s, the company began developing a number of new products specifically aimed at business data analysis, including its Text Miner software, which analyzes data from text sources such as company emails, and its CRM software. In 2010, the company introduced a free version of SAS for students. As of 2013, SAS had the largest market share of any advanced analytics software product.
In the 2000s, the UK company World Programming Limited (WPL) released its own SAS compiler, World Programming System (WPS), which can be used to create, edit, and run SAS programs and includes many of the same features as the SAS System.
SAS Institute vs World Programming Limited
Since 2010, SAS Institute Inc. has filed multiple lawsuits against World Programming Limited (WPL), claiming that WPL infringed upon SAS Institute copyrights and reverse engineered SAS software.
The EU Court of Justice found that WPL did not infringe upon the copyright of SAS software, because WPL did not have access to SAS source code and merely used the SAS software to determine the functionality for its own product. The ruling is significant for the software world, because it sets the precedent that copyright protection does not extend to software functionality.
WPL was, however, found to be in violation of copyright laws for its use of the SAS manual, sections of which were copied nearly verbatim in its own manual. A US federal court also found WPL liable for unfair and deceptive trade practices, holding that WPL violated the terms of the SAS software agreement, which limited the SAS Learning Edition software to non-commercial use, when it used that software to create its own commercial product.
As with most programming books, make sure you pick the one geared toward your level of knowledge. In the case of SAS, books tend to be written specifically for experienced programmers, data analysts, or both. And there's a good reason for that. SAS has a limited focus, so without a background in one of these areas, chances are you wouldn't be looking to learn the language in the first place.
- The Little SAS Book by Delwiche and Slaughter: this book is designed for beginning and experienced SAS programmers. It breaks down topics into short, self-contained lessons with plenty of examples and visuals.
- SAS Essentials: Mastering SAS for Data Analytics by Elliott and Woodward: while designed for beginning SAS programmers, this book takes a more advanced approach than others, as it is geared primarily for upper-level undergraduates and master's students studying programming, data analysis, or analytics. In addition to teaching common SAS procedures, the book provides an overview of current statistical techniques and data manipulation methodology.
- SAS for Dummies by McDaniel and Hemedinger: on the opposite end of the spectrum, this book takes a fun, simple approach to SAS programming. It provides similar information to SAS Essentials (background knowledge on statistical analysis, an overview of the SAS System, and common SAS procedures), but it takes an easy-to-follow, absolute beginner approach to the language.
- Learning SAS by Example: A Programmer's Guide by Ron Cody: if you learn through doing, this is the text for you. It breaks SAS down by specific techniques, provides real-world examples, and then dissects the code to show you step-by-step how it works. Each chapter ends with test problems to check what you've learned.
- SAS Certification Prep Guide: for programmers looking to be certified for career development, this is the official test-prep guide released by the SAS Institute.
SAS training ranges from complex, statistics-driven tutorials to ultra-techy, program-specific guides, and even some very basic, new-to-programming tools. If you can't afford a degree in statistical analysis, or you already have one and want extra training, there are plenty of options available:
- SAS Certification: the SAS Institute offers several worldwide certifications in basic and advanced SAS programming, statistical analysis, business intelligence, data management, and SAS administration.
- Learn Analytics: geared toward analysts, this SAS certification training can be done in a class or through their collection of online video lectures.
- SAS Training Videos: posted by YouTuber Tamirat Chulta, these short training videos cover a wide range of common applications and programming tips, such as combining data sets, formatting input, and managing SAS email.
- SAS Tutorials: the Study SAS blog provides links to dozens of free SAS notes and video tutorials provided by UCLA, Texas A&M University, and Virginia Commonwealth University. Topics range from general discussions on modifying and exploring data to specific functions and language logic.
When it comes to analyzing spreadsheet data, the majority of us just rely on the tools provided within our favorite spreadsheet program. However, programmers looking to get the most from their data will find SAS programming to be a vital tool for analysis. Whether you're just getting started with SAS or looking to improve your knowledge, these community resources can help:
If you're looking for a tool to perform complex data analysis, the SAS System is the market leader, and understanding how the SAS language functions will give you a big leg up in the world of business analytics.
The SAS language has a very specific focus, so it's unlikely general programmers are going to pick it up and decide to learn it for fun. On the other hand, students of statistics and data analysis who have had no prior interest in programming may want to make an exception here.
There is, however, a financial consideration to be made when choosing SAS as your language of choice. While the language itself is freely available, the two major compilers both require licenses. A student edition of SAS System is available for free, but you'll need to start paying if you want to continue to use a personal copy after you graduate.