PostgreSQL Introduction and Resources

PostgreSQL (sometimes just called Postgres) is an open source object-relational database management system (ORDBMS). The intention of the PostgreSQL development community is to make it "the world's most advanced open source database." It implements the majority of the SQL standard, but also adds a number of advanced features not found in other relational database systems.

Brief History of PostgreSQL

In 1973, two computer scientists at UC Berkeley, Eugene Wong and Michael Stonebraker, began working on a database system they called Ingres (INteractive GRaphics REtreival System). The work was originally intended for use by the University's economists, but Stonebraker and Wong expanded the project to work on a relational database system, inspired by IBM's System R and the concepts in the new Structured Query Language.

In 1982, Stonebraker left Berkeley to work on a proprietary relational database system, which he called POSTGRES (Post-Ingres). POSTGRES used many ideas from Ingres, but was not a fork of its codebase. Stonebraker was intent on adding explicit support for data types and relationships into the database.

The first prototype for the new system was demonstrated in 1988, and Version 1 release came in 1989. Version 2 came the next year and Version 3 the next year, each time with re-written query rules. By 1993, there were so many users that the POSTGRES team could not keep up with support requests. In 1994, the team released Version 4 and ended the project. They released all of the work under an open (MIT-style) license, allowing other developers to take over the project.

That same year, two UC Berkeley students replaced POSTGRES's query language with SQL and released their new version (Postgres95) on the web. In 1996, Marc Fournier of Hub.org provided a development server and Postgres became an open source development project. It was renamed PostgreSQL ( post-gres-cue-ell ). Development has continued since then by a large community of volunteers and developers known as The PostgreSQL Global Development Group.

About PostgreSQL

PostgreSQL does all the things you would expect a relational database system to do. Data is stored in user-created tables, with each entry being a row of a specific table. Columns on a table correspond to various attributes. Tables can be related to each other in order to create complex data structures. Postgres, like most RDBMSs, uses SQL for queries. What makes PostgreSQL interesting is the number of additional features it offers.

Object Relational

PostgreSQL is not just a relational database, it is an object-relational database. This means that is supports object oriented programming principles such as inheritance and polymorphism. These features are an attempt to address some of the difficulties frequently encountered when mapping relational data structures to objects.

Data Types

PostgreSQL provides an expanded set of data types. All RDBMSs support several varieties of numeric, floating-point, string, boolean and date types. PostgreSQL also provides:

  • uuid, Universally Unique IDs, which are essential for distributed database systems;
  • monetary, a fixed-decimal type that eliminates the rounding and calculation errors found in floating point numbers;
  • enumerated, a static set of options;
  • geometric types: point, box, line segment, line, path, polygon, and circle;
  • binary, similar to SQL's blob object;
  • network addresses, according to several protocols;
  • bit string, fixed-length binary strings which can be used as bit masks;
  • text vectors useful for full text search functions;
  • data representation types like XML, JSON, and arrays;
  • composite types, which group several other types together;
  • range types, including numerical ranges, date ranges, and time ranges;
  • user-defined types.

Data Size and Integrity

PostgreSQL can handle huge amounts of data. The the data itself can be huge — the row size limit is 1.6 TB, and a single field can hold 1 GB of data.

The system is also ACID (Atomicity, Consistency, Isolation and Durability) compliant, with very strong transactional and referential integrity.

Virtual Tables

When you run an SQL query against a set of database tables, the result is another table. Virtual Tables are a database feature that allows you to create those result tables and then run additional queries against them. This allows for more complex querying and opportunities for read-performance enhancements.

PostgreSQL supports several Virtual Table features. Two that are unique to PostgreSQL are recursion and materialized views. Recursion allows the same query to be run on the result set that it already produced. This can be done multiple times until all the possible aggregated results are found. Recursion is useful for hierarchically structured data. Materialized views create persistent (stored) virtual tables, which can be updated whenever needed (once a day, on write). For often-run queries with complex joins, this can speed up performance tremendously.

PostgreSQL Resources

PostgreSQL bills itself as "the world's most advanced open source database," which is probably true. This means that when it comes to PostgreSQL, there's a lot to learn. Here are the best resources for going from Postgres novice to expert.

Online

The first place to start is the official PostgreSQL website.

Tutorials

Video Tutorials

Reference

Tools

Drivers and Libraries

Community and Ongoing Learning

Books

General SQL and Relational Database Principles

You will get a lot more out of PostgreSQL if you have a good understanding of basic RDBMS principles, especially data modeling (the art and science of figuring out how to represent information as relational data).

Comparing Databases

PostgreSQL is one of several options for relational database management systems (RDBMS). The primary alternatives to PostgreSQL are:

Of course, for most projects you don't really have a choice between these. Oracle is a closed source, proprietary application used for large enterprises; if that isn't you, you won't be needing it. MS SQL Server is also closed source, and is only an advantage if you are building and deploying in a Windows environment. SQLite is a great database for lots of unconventional purposes, but is not well-suited for most database-driven applications.

So, how can you choose between PostgreSQL and MySQL? Here are a few resources to help:

NoSQL Databases

Comparing Postgres to MySQL leaves out a fast-growing category of database systems: non-relational, or "NoSQL," databases. Here the comparison isn't between two relatively similar technologies, but between completely different ways of handling and managing data.

The most popular (by far) NoSQL database is MongoDB. The other relatively popular choices are Cassandra, Couchbase, and Redis.

Here are some resources to help you think through deciding between PostgreSQL and a NoSQL alternative:

Interestingly, PostgreSQL has recently added some NoSQL features, while MongoDB is using PostgreSQL for some of its analytics tools.

Do I Need to Learn PostgreSQL?

"Learning" PostgreSQL involves (at least) two things: general relational database principles (SQL, data modeling), and PostgreSQL's unique approach to these things.

For a majority of the features built on relational database systems, there is little difference from the developer's standpoint between working with PostgreSQL and another RDBMS like MySQL. This SQL core is important to know for most web developers, and will serve you well for most jobs.

Do you need to know the specific things that make PostgreSQL unique? That depends on what other skills you already have, and what types of system you're likely to work on.

In general, MySQL is preferred by PHP developers — it is the default RDBMS for WordPress, Joomla!, and Drupal. .NET developers and others working on Microsoft servers will typically favor MS SQL Server. Large enterprise developers are more likely to need to know Oracle.

So who is really into PostgreSQL? It seems to be the preferred database for most of the Ruby and Python communities, especially with Ruby on Rails and Django. Node.js developers frequently use Mongo, but when a relational datastore is called for it is usually PostgreSQL. Cloud service providers tend to prefer Postgres to MySQL, even providing WordPress deployments on top of it.

Thanks to ORM and the database abstraction provided by frameworks like Ruby on Rails, Django, and Laravel, you can get quite a lot done without knowing PostgreSQL specifically, so you probably want to focus on other languages first. But if you've already gotten your head around a few programming languages, and you want to dig into database technology, PostgreSQL is a great place to start.


Further Reading and Resources

We have more guides, tutorials, and infographics related to coding and development:

Ultimate Guide to Web Hosting

If you are going to be creating database driven applications, you are going to need to host them somewhere. Check out our Ultimate Guide to Web Hosting. It will explain everything you need to know in order to make an informed choice.