Relational Databases and Structured Query Language (SQL)
All websites require some way to store content and data. The earliest websites did not make a distinction between content and presentation, so content was simply stored in pages or documents which were accessed by a web browser.
As the websites became more complex and the technology to deal with this complexity advanced, developers began to separate content from presentation.
Content is now usually held in some form of a distinct data store. More often than not, this is a relational database.
In this article, we will cover what a relational database (or a relational database management system) is, the language used to interact with these types of databases, and what you need to know about both when purchasing web hosting.
Relational Database Basics
A relational database stores information in tables. Generally, there is a table for each class of "real" things which you are trying to keep track of. For example, in a library database, there would be a table for Books and a table for Authors (among many others).
Each row of a table represents one instance of that class — a single book, or a single author. Each column of the table is an attribute — title, year published.
Why is it Called Relational?
The database is called "relational" because the tables can be related to each other.
For example, authors are related to books in that each book has an author.
So a column on the Books table might be "Author," and the contents of that column would be references to rows on the Authors table (that is, reference to specific authors).
Unique and Foreign Keys
References are made by the use of unique keys and foreign keys.
Each row on the Authors table (and every other table as well) has a Unique ID, usually a number.
In the author column on the Books table, the attribute for the author isn't stored as the author's name, but rather as a numerical reference to the appropriate row on the Authors table.
The Single Source of Truth Rule
This way of referencing rows in other tables helps enforce the rule of Single Source of Truth.
In our example, the Authors table holds all information about the Author, including how his or her name is spelled.
If a change needs to be made, the change is made in only one place, and the referring rows don't have to make any changes.
Data Modeling in Relational Databases
The art and science of designing a database — deciding what tables are needed and how they will relate to each other — is called data modeling.
It's more difficult than it may initially seem.
Understanding Data Modelling
For example, let's explore this library database further.
Should there really be a table for Books? A library might have several different types of media available.
Should there be a table for Books and a separate one for Magazines, DVDs, Manuscripts, and CDs? Probably not.
A more sensible design would probably be to have a table called Items, which includes as an attribute a column called media or type.
This would then be a foreign key reference to a table that enumerates the different types of media (books, magazines, DVDs, etc.).
What about Authors? Should there be a table for Authors?
What about cast members in movies, publishers of magazines, producers of albums?
If someone has done all those different things for different projects, should they show up in four or five different tables?
Violation of the Single Source of Truth
That would violate Single Source of Truth.
Maybe we should just have a table called People. But then, what about the author column on the Books (Items) table?
What if a book has more than one author? What if a book has an editor, but no single author (like a collection of poetry)?
The standard solution here is to have a table that defines a relationship between Items and People, perhaps called Contributors.
This would have three columns — item, person, and role.
The role would then need to be a foreign key reference to a table which enumerates the different possible roles a person might have — author, editor, producer, actor, singer.
How Is Good Data Modelling Done?
With data modeling, each piece of the puzzle — each "real-world" class of items you need to keep track of — creates a layer of potential complexity.
Good data modeling is a process of finding the simplest solution which will allow for a maintainable and consistent dataset.
Furthermore, any possible question about the data should have one and only one answer.
Relationships Between Database Tables
What kind of relationships exists between database tables?
There are three kinds of relationships between tables. The most basic is the one-to-many relationship.
In our phone call database, there is a one-to-many relationship between calls and caller tables. A single caller can show up any number of times in the calls table.
Different Data Relationships
The one-to-one relationship is just what it sounds like. It is rarely used because it is just a way of separating what would normally be a single table into two tables.
But it is occasional used when there are a lot of data elements that are rarely accessed.
The many-to-many relationship is where there are many corresponding items in each table.
An example of this would occur in our call log if there were conference calls with more than one caller for a single call.
This is managed in database design by creating an intermediary table. This new table then has a one-to-many relationship with the two tables it is linking.
What is SQL?
SQL, which stands for Structured Query Language, provides a set of commands for designing and manipulating database tables and retrieving information (running queries) from them.
When working with SQL tables in our databases, we (or our web application) need a way to "talk" to the server. For example, our web application might need to ask the database for a user's profile. To that end, we use SQL.
Most relational databases — and all relational databases which are popular for web applications — use SQL.
What is CRUD (Create, Read, Update, Delete)?
CRUD is a popular acronym that refers to the common database functions and commands to create, read, update, and delete records.
Please note that, though many different RDBMS use SQL, many use variants that are not compatible.
- Queries that run on Microsoft's SQL Server may not be valid against MySQL databases.
- Queries that run on MySQL databases might not be valid against PostgreSQL.
This means that moving from one RDBMS to another is not commonly done, since exporting, moving, and importing data requires quite a bit of work.
Is SQL Difficult to Learn? A Simple SQL Query
What does SQL look like? Is it difficult to learn?
Well, let's say that we want to return all rows from a table called "Books" where the "authorLn" (which is our shorthand for author's last name) is "Greene." Our SQL query would, therefore, look like this:
SELECT * FROM books WHERE authorLn = "Greene";
Depending on the SQL variant you are using, you might see some small syntactic differences in the above query.
You can see that SQL is quite intuitive, simple, and straightforward.
More often than not, though, there are more similarities than differences.
Other SQL Actions
In addition to running queries against your database, SQL is used to manage your database and its tables. To create a new table, you would use the CREATE command:
CREATE TABLE books (id INT, title VARCHAR(100), author VARCHAR(250));
To drop a table (which is like deleting a table, but you are also removing anything associated with the table, such as permissions), you would use the DROP command.
DROP TABLE books;
You get the idea. SQL, as its name implies, provides you with a lot of structure when it comes to interacting with your database
Even though all SQL databases use the same query language for programming, each database management system implements the storage and management layers differently.
This means that each has particular strengths and weaknesses in the area of performance of different types of queries.
What are the Most Popular SQL Relational Database Management Systems?
The following is a list of the most popular and most commonly-used relational database management systems (RDBMS).
Drop-in replacement for MySQL with improved performance. MariaDB is a community-developed fork of MySQL and is highly sought after.
Desktop database application-building tool. Not generally considered appropriate for web use. It combines the Microsoft Jet Database Engine with a GUI.
Microsoft SQL Server
Sometimes abbreviated MSSQL or MS SQL Server, this is an RDBMS for use with other Microsoft applications and tools such as Sharepoint and the .NET Framework.
One of the most popular database systems. Oracle is a proprietary option often used by enterprise-level clients.
Community Open Source database that is seen by many as a better alternative to MySQL. PostgreSQL is an object-relational database.
Database-management library that can be included in applications. Stores databases as files. Useful for prototyping and embedded applications.
The cost of the options listed above vary.
For example, MySQL, Maria DB, and PostgreSQL are open source, and SQLite is in the public domain.
Using these options are definitely budget-friendly. Oracle and Microsoft's database options, however, are proprietary, so be sure to factor these costs into your budget when shopping around.
Cloud-Based Database Options
With the popularity of cloud computing, there are cloud-based options as well. These options make it easy for you to deploy a database to a cloud-based environment.
- Amazon Web Services (AWS) Aurora
- Amazon Web Services (AWS) Relational Database Service (RDS)
- Google Cloud Platform
- Microsoft Azure SQL Database
Amazon Web Services (AWS) Aurora
A cloud-based relational database compatible with MySQL and PostgreSQL offering the performance and availability of a commercial product but with the simplicity and cost-effectiveness of open source options
Amazon's Aurora is compatible with MySQL but provides far greater performance even on the same hardware. It can scale up to handle millions of transactions a minute.
Amazon Web Services (AWS) Relational Database Service (RDS)
Google Cloud Platform
Google's cloud option for users of MySQL and PostgreSQL; can be considered the equivalent to AWS' Aurora.
Microsoft Azure SQL Database
A relational cloud database service compatible with Microsoft's SQL Server offerings; is compatible with other Microsoft tools you might be using for development, such as SQL Server Management Studio, SQL Server Express, or Visual Studio
Cloud-based databases can become costly, but depending on your situation, they may be cheaper than setting up your own physical infrastructure.
Choosing Your SQL Relational Database Management System
When purchasing web hosting, you will have some choice over the database system you use. Typically, you will not need to seek specialized SQL database hosting — databases typically come with your web hosting package.
Shared Hosting Plans
If you opt for a shared hosting plan, you will likely be using a MySQL database (though you might occasionally find hosts that offer PostgreSQL as well). Both options are compatible with either Linux-based hosting or Windows-based hosting.
If you purchase a Windows hosting plan, you might also have the option of using MS SQL Server. The downside to this is that you might have to pay a bit extra since Microsoft's products are neither open source nor free to use.
Virtual Private Servers (VPS) and Dedicated Servers
If you opt for a virtual private server (VPS) or dedicated server, you have many more options. You can pretty much use whatever you would like, as long as everything compatible (that is, you cannot run an MS SQL Server database on a Linux-based server, even if it is a dedicated instance).
Which RDBMS Option Should I Choose?
For most users, however, the open source, readily available options (MySQL and PostgreSQL) are perfectly sufficient.
There are certainly downsides along with the upsides when it comes to using these RDBMS, but for many people, the downsides will not have a noticeable impact on day-to-day operations.
Only when you start working with highly-trafficked, resource-intensive applications will you begin to see performance differences among the various database systems.
If you are getting databases with your web hosting (which is frequently the case with shared hosting plans), a cloud-based option is probably not on your radar.
In addition to having the additional databases be overkill, you may not even have the option of using an external database environment.
Though, as always, this depends on your specific web host.
For more advanced hosting options, such as VPS and dedicated instances, a cloud-based database is a great way of separating your hosting/website from your data, as well as gaining additional storage space.
Many such servers have strictly defined storage limits, and as your site grows, moving the portion that is likely growing the fastest — your database — elsewhere is wise. Furthermore, such storage options also tend to be cheaper than purchasing additional disk space from your web host.
What are NoSQL Databases?
Unless you have been living under a rock, you have probably heard of NoSQL databases.
NoSQL is a catch-all term for databases that do not follow the RDBMS model.
Basically, rather than structuring your data using a model that you have put together, your data is stored in whatever manner you (or your database) deems appropriate.
For in-depth information, please see our page on NoSQL databases.
Summary of Relational Databases and SQL
You will need a place to store your web application's data, and this is commonly done using relational database management systems (RDBMS) that utilize Structured Query Language (SQL).
In this article, we covered what RDBMS and SQL are, as well as commonly-used database options.
We also briefly discussed the considerations you will need to keep in mind as you set up your web application and its database management system.
Hosting Frequently Asked Questions
What is a relational database?
A relational database is one that is divided into different tables, which have set relationships to each other.
The critical factor is that no bit of information 'belongs' to any other bit of information. As a result, it is easy to search for arbitrary information.
What were there before relational databases?
Before relational databases, there were hierarchical databases. In these kinds of systems, data is organized in such a way that records belong to other records.
Hierarchical databases result in more complicated searching -- following from parent to child and on and on.
Who is Edgar F Codd and what did he do in 1969?
Edgar F Codd was a computer scientist who worked for IBM in the 1960s and 1970s.
In 1969, he invented the idea of the relational database, which he shared with the world in the 1970 paper, "A Relational Model of Data for Large Shared Data Banks."
What is the Single Source of Truth?
The Single Source of Truth (SSoT) is the practice of only storing information once in a database.
Consider a database of phone calls. It would display who called at what time. If there were five calls from Dennis Callsalot, the database would store the name "Dennis Callsalot" in just one place, and the call log would refer to that entry.
Or to put it in database language, the call table would have a caller column. The caller column would include a number that references a people table.
What is the point of SSoT?
SSoT makes databases far more efficient.
In our phone call database, we would waste a lot of space by entering "Dennis Callsalot" over and over.
The smaller the database, the faster it is to search. It may not matter that much in a very small database, but imagine one where there are a million references to "Dennis Callsalot."
Another important factor is correction. Suppose the name "Dennis Callsalot" is wrong, and it is actually, "Dennis Calster."
With a SSoT database, it is just a matter of changing the name in one place and then it is automatically and instantly changed in all applications that use the database.
What is SQL?
SQL is an acronym for "Structured Query Language."
It is a specialized programming language that can be used to extract data from a relational database. The SQL queries can be very simple, just reading data from a single table.
Or they can be very complex, reading data linked across various tables with numerous qualifiers.
How do you pronounce "SQL"?
Some people pronounce SQL as "S-Q-L" and others pronounce it like "sequel." Which is right? There is no definitive answer.
Originally, SQL was called "Structured English Query Language," and was known as SEQUEL. So it was pronounced "sequel."
But there was a trademark issue, so the name was changed to just "Structured Query Language," and known as SQL.
Even after this, Oracle still referred to it as "sequel." But the developers of MySQL were very clear that the name of their database was pronounced "My-S-Q-L." Both pronunciations are perfectly acceptable.
Why is MySQL so widely used?
MySQL is an established and powerful database system. It was was the first major free database system to become widely used.
It is the engine behind much of the software that makes up the internet, such as WordPress -- the most popular blogging software in the world.
But none of this should be taken to mean that MySQL's popularity rests on its market position. It is also easy to work with, secure, scalable, and quite fast.
Then why do many people claim PostgreSQL is so great?
PostgreSQL has a few advantages over MySQL.
For one thing, MySQL does not implement the full SQL standard which, for most applications, doesn't matter.
However, for certain kinds of work, a fuller implementation is important.
Also PostgreSQL is an object-relational database, which is an object-oriented database model -- making certain types of development easier.
What is SQLite?
SQLite is the most popular database management system in the world because it is used on all Android and iPhones. It is not a client-server database like MySQL and PostgreSQL. It is a database library where it is used in individual programs.
What about Oracle Database? Microsoft SQL Server?
These are commercial database products with their own advantages and disadvantages. The primary advantage is support and the primary disadvantage is cost.
The Oracle Database is object-relational whereas Microsoft SQL Server is a traditional relational database.
What about Microsoft Access?
Access isn't actually a database. It comes with the Jet Database Engine. Access is just the interface. It is possible to use it to access other databases.
Are there other commercial options besides offerings from Oracle and Microsoft?
There are many other commercial options. There are even commercial versions of MySQL.
Which database should I use?
This will depend upon your needs.
If you are creating an Android app, you will almost certainly use SQLite.
If you have a database project that is fairly straightforward but requires speed, MySQL may be the way to go.
If you need a more feature rich database, PostgreSQL could be what you are looking for.
If you need service and rock solid reliability, a commercial product may be the best choice.
It all depends upon you and what you want to do.