Database Administration - ANSI SQL Standards and Guidelines
SQL: ANSI Standards for Database Administration
Structured Query Language, or SQL, is the standard language of database administrators (DBA) for accessing and managing databases. Throughout the years, SQL has evolved, and many versions and languages began to emerge. To unify SQL for best practices, the American National Standards Institute (ANSI) created specific standards for database query languages.
The SQL Timeline
- 1970: Dr. Edgar Frank Codd, a computer working for IBM publishes his famous paper, "A Relational Model of Data for Large Shared Data Banks" in June 1970, in the Association of Computer Machinery (ACM) journal. Dr. Codd's model remains the definitive model for relational database management systems (RDBMS).
- 1978: IBM Corporation develops the database system, System/R, along with a database language, Structured English Query Language (SEQUEL) at their San Jose Research Center in California. They base it on Dr. Codd’s original model, and they call it SEQUEL.
- 1979: Relational Software, Inc., releases the first relational database management system, and name it Oracle. Their RDBMS functions on a minicomputer using SQL as the main query language. The product becomes so popular, the company changes its name to Oracle.
- 1982: IBM releases its first commercial SQL-based RDBMS they name the SQL/Data System, or SQL/DS, and in 1985 they release the Database 2 system, or DB2. Both systems run on an IBM mainframe computer. IBM ported DB2 to additional systems later, including those running on Windows and UNIX operating systems.
ANSI Standards Year by Year
- 1986: SQL-87 was initially formalized by ANSI in 1986.
- 1989: The American National Standards Institute (ANSI) publishes the first set of standards for database query languages, known as SQL-89 or FIPS 127-1.
- 1992: ANSI publishes their revised standards, ANSI/ISO SQL-92 or SQL2, which were stricter than SQLI, adding some new features. These standards introduce levels of compliance that indicated the extent to which a dialect meets ANSI standards.
- 1999: ANSI publishes SQL3, or ANSI/ISO SQL: 1999, with new features, like support for objects. The replaced the levels of compliance with core specifications, as well as additional specifications for nine more packages.
- 2003: ANSI publishes SQL: 2003, introducing standardized sequences, XML-related features and identity columns. The creator of the first RDBMS, Dr. E.F.Codd, passes away on April 18 of the same year.
- 2006: ANSI publishes SQL: 2006, defining how to use SQL with XML and enabling applications to integrate XQuery into their existing SQL code.
- 2008: ANSI publishes SQL: 2008, introducing INSTEAD OF triggers, as well as the TRUNCATE statement.
- 2011: ANSI publishes SQL: 2011 or ISO/IEC 9075:2011, the seventh revision of the ISO (1987) and ANSI (1986) standard for the SQL database query language.
They split the new ANSI SQL standards into nine parts, including:
- Part 1 – ISO/IEC 9075-1:2011 SQL/Framework, which offers logical concepts.
- Part 2 – ISO/IEC 9075-2:2011 SQL/Foundation, includes central elements SQL.
- Part 3 ISO/IEC 9075-3:2008 SQL/Call-Level Interface (CLI), explains interfacing components, like procedures, structures and variable bindings for executing SQL statements in various coding languages, like COBOL and C++, for example.
- Part 4 ISO/IEC 9075-4:2011 SQL/Persistent Stored Modules (PSM), outlines standards for procedural extensions for SQL, including condition handling and the flow of control, as well as statement condition signals and resignals, and local variables and cursors. Assigns expressions to parameters and variables. Addresses the maintenance of persistent database language routines, like stored procedures.
- Part 9 ISO/IEC 9075-9:2008 SQL/Management of External Data (MED), includes extensions to SQL to define datalink types and foreign-data wrappers that enable SQL to manage external data. External data is accessible, but not managed by an SQL-based DBMS.
- Part 10 ISO/IEC 9075-10:2008 SQL/Object Language Bindings (OLB), defines the semantics and syntax of SQLJ. SQLJ is the embedding of SQL into Java. The standard specifies mechanisms for the binary portability of SQLJ applications. It also defines several Java packages along with their contained classes.
- Part 11 ISO/IEC 9075-11:2011 SQL/Information and Definition Schemata, specifies the Information Schema and Definition Schema, offering tools to make SQL objects and databases self-describing. Includes the SQL object identifier, security and authorization specifications, as well as the security and integrity constraints. Supports the features and packages of ISO/IEC 9075, as well as the features provided by SQL-based DBMS implementations.
- Part 13 – ISO/IEC 9075-13:2008 ISO/IEC 9075-13:2008: SQL Routines and Types Using the Java Programming Language (JRT), Specifies the capacity to utilize Java classes as SQL structured user-defined types, as well as invoke static Java methods as routines from within an SQL application, known as Java in the Database.
- Part 14 – ISO/IEC 9075-14:2011 SQL/XML-Related Specifications, defines SQL-based extensions when using XML with SQL. It outlines XML data types, routines and functions. It also describes XML to SQL data type mappings for the storage and managing of XML inside an SQL database.
The ISO/IEC 13249 SQL Multimedia and Application Packages work alongside ISO/IEC 9075 as a separate, yet relevant standard that specifies various interfaces and packages based on SQL. The goal of this package is to afford centralized access to the most common database applications, such as pictures, text, spatial data and data mining.
How to Find a Copy of the Standards
Hard copies of SQL standards specifications are only available to buy, because of ISO and ANSI copyright restrictions. You can find an electronic copy at the ANSI website, by selecting "Access Standards - eStandards Store" and search for "SQL Language."
There are two variants of each document, but they are identical:
- INCITS/ISO/IEC 9075-*-2011
- ISO/IEC 9075-*:2011
Every database creator aims to comply with these standards, causing the various implementations of SQL to become similar; however, each implementation still has a unique dialect for using SQL that includes extensions or additions to the standards.
Putting SQL to the Test
The fundamental SQL commands and statements are similar for all SQL dialects, so once a DBA knows how to use one, they can learn others easily.
Although most SQL database programs have their own proprietary extensions, in order to comply with ANSI standards, they must, at the very least, support these five main commands in the same way:
SQL enables a DBA to conduct the following actions in a database:
- Execute queries
- Retrieve data
- Insert, update and delete records
- Create new tables and views
- Build new databases
- Produce stored procedures
- Set permissions for views, tables and procedures
To produce a website that can access data from a database, a Relational Database Management System (RDBMS) database program is necessary. Some popular RDBMS programs include:
- SQL Server
- MS Access
- Postgre SQL
- Oracle Database
In order to create a database, the database administrator will also need to utilize the following:
- A standard markup language like CSS/HTML
- A server side scripting language, such as ASP or PHP.
- Understand how to use SQL to retrieve the data you request.
The Anatomy of a Database
The relational database management system (RDBMS) forms the foundation of the database. Using RDBMS allows a DBA to store data in database objects, called tables. Tables consist of columns and rows of related data organized in sections.
Identifying Database Tables
Most databases contain more than one table, so the DBA must identify each table with a name. Each table will contain rows of records containing information and data. For example, a business would use a database with a customer table containing the following data for each customer:
- Customer Name
- Contact Name
- Customer ID
- Customer Address – street, city, zip code and country
- Customer Contact Information
In this case, the table contains five records - one for each customer - and eight columns, one for each chunk of data: Customer Name, Contact Name, Customer ID, Customer Address including City, Zip Code, and Country, and Customer Contact Information.
Communicating with a SQL Database
Database administrators perform all the actions necessary in a database using SQL statements. The results are then stored in a result table, called the result-set. SQL is usually not case-sensitive, and some database systems require the use a semicolon at the end of every SQL statement. A semicolon has become the standard method of separating SQL statements from each other in a database system. This enables a DBA to execute multiple SQL statements in one call to the server.
Some of SQL commands include:
- ALTER TABLE Modifies tables.
- ALTER DATABASE Modifies databases.
- CREATE DATABASE Creates new databases.
- CREATE INDEX Creates index/search keys.
- CREATE TABLE Creates new tables.
- DELETE Deletes data from databases.
- DROP INDEX Deletes indexes.
- DROP TABLE Deletes tables
- INSERT INTO Inserts new data into databases.
- SELECT Extracts data from databases.
- UPDATE Updates data in databases.
Codd’s 12 Rules
When he created the first relational database management system, Dr. Codd included 13 rules, suggesting that if a Database Management System meets all of these rules, it is a true Relational Database Management System. Because he numbered them from zero to 12, they are called Codd’s 12 Rules:
- Rule Zero: Foundations States that a basic system must meet three basic requirements: It must be relational and include a database and a management system. It also has to leverage relational facilities exclusively to manage the database to be considered a true RDBMS.
- Rule One: Information Representation Represent all information in the database in a singular way, specifically by placing values in column positions within rows of tables.
- Rule Two: Guaranteed Access All data has to be accessible, like the fundamental requirement for the primary keys. Every individual scalar value in a database should be sensibly addressable by defining the name of the containing table and column, as well as the primary key value for the containing row.
- Rule Three: Handling Null Values The DBMS must let each field stay empty or null. This means it has to support the manipulation of a representation of any inapplicable or missing information in a systematic way that is different from all regular values, as well as independent of the type of data.
- Rule Four: Active Online Catalog The system must support an inline, online, relational structure or catalog accessible to authorized users using their normal query language. Users should be able to access the database's catalog using the same relational model and query language they used to access the data inside the database.
- Rule Five: Comprehensive Data Sub Language The system has to support a minimum of one relational language with a linear syntax. Users should be able to leverage it within application programs and interactively, as well. It also must support data manipulation operations, like updating and retrieval, and data definition operations, like view definitions, as well as transaction management operations like commit, begin and rollback. It should also have integrity and security constraints in place.
- Rule Six: Updating Any views that can be updated theoretically must be updated by the system.
- Rule Seven: High Level Update, Insert and Delete The system should support set-at-a-time update, insert and delete operators, so that users can retrieve data from a relational database in sets constructed of data from multiple tables and/or rows. Update, insert and delete operations should be enabled for any retrievable set instead of for a single row in a single table.
- Rule Eight: Physical Data Independence Updates to the physical level should not require an update to an application based on the structure. For example, changes in how the data is stored, like whether to place it in arrays or linked lists.
- Rule Nine: Logical Data Independence Updates to the logical level, such as columns, tables and rows should not require an update to an application based on the structure. It is harder to achieve logical data independence than physical data independence.
- Rule 10: Integrity Independence Specify integrity constraints must be stored separately from application programs and in the catalog. It has to be possible to update such constraints when appropriate without affecting any currently existing applications.
- Rule 11: Distribution Independence The distribution of portions of the database to various locations should not be visible to database users. All existing applications should be able to operate continuously and efficiently in the face of an introduction of a distributed version of the DBMS or when current distributed data gets redistributed throughout the system.
- Rule 12: Non Subversion If the system provides a one-record-at-a-time or low-level interface, it can’t be leveraged to abate another system. An example of this would be the act of bypassing a relational integrity or security constraint.
Structured Query Language, or SQL has come a long way, but the foundations used to create it still stand strong. Database administrators may discover more ways and places to use it, as cloud computing becomes the norm; however, ANSI will remain in place to specify standards to unify database query languages.