CGI Primer

When the web was born, it was static: a collection of HTML documents linked together by hyperlinks. That arrangement didn't last.

Before too long, web servers were developed that could perform work on the server before returning a webpage to the browser. Initially, different web servers implemented this functionality in different ways. As a result, applications written for one web server couldn't be used with most other web servers.

In 1993, web server developers agreed on a standard mechanism that all web servers could implement to allow this interaction, and the first version of the Common Gateway Interface (CGI) was born. A workgroup formed in the late 1990s later formalized it as RFC 3875.

CGI was the earliest standardized mechanism provided to enable server-side work in response to input from a client browser. While its use isn't as widespread as it once was, CGI is still supported by most modern web hosts, and its descendants, like FastCGI, are central to the operation of the modern web.

In this primer, we'll take a look at how CGI scripts are executed and walk through the basic implementation of a simple CGI script so that you can get one up and running on your own web server. We'll conclude with links to more complex CGI scripts that you can use to expand your working knowledge of CGI and put it to good use.

We'll use Python for the examples in this tutorial, but many languages can be used to write CGI scripts.

FastCGI: CGI's High-Performance Cousin

One of the limitations inherent to the architecture of CGI is that every time a web server receives a request for a CGI script, it must start a new process to handle the request, run the script, and generate the result. That process exits once the response is sent, which means that CGI scripts cannot reuse resources such as database connections or cached queries from one request to the next.

This isn't a big deal if the script in question is simple and the load on the server light. However, under heavy load — either due to heavy traffic or a complex script — this architecture means that CGI scripts are highly inefficient and can really drag down server performance.
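To make that limitation concrete, here is a minimal sketch that imitates what a server does for each CGI request: it starts a fresh Python process, runs a tiny stand-in script, and reads the output. The stand-in script and the run_like_cgi name are illustrative assumptions, not part of any real server.

```python
import subprocess
import sys

# Stand-in for a CGI script: it counts the requests it has seen, then exits.
SCRIPT = "requests_seen = 0; requests_seen += 1; print(requests_seen)"

def run_like_cgi():
    """Start a fresh interpreter, run the script, and return the count it printed."""
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())

# Every "request" starts from scratch, so the counter never gets past 1.
print([run_like_cgi() for _ in range(3)])  # prints [1, 1, 1]
```

Because each run is a separate process, nothing the script stores in memory survives to the next request, and the startup cost is paid every time.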

FastCGI was developed to mimic the simplicity of CGI while increasing the efficiency of script processing on a web server. It does this by keeping script processes running between requests instead of starting a new one each time. As a result, FastCGI allows web servers to handle many more requests than CGI. Today, CGI is still used on occasion, but FastCGI has become one of the industry-standard tools for processing server-side scripts.

CGI Basics

CGI is a standardized communication protocol — a formally defined method for exchanging information. Information comes from a browser and is received as input into a CGI script. The script generates a response which the server delivers back to the browser.
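As a rough sketch of that exchange: the server hands request details to the script through environment variables, and the script writes its response to standard output. REQUEST_METHOD and QUERY_STRING are standard CGI variables that the text above does not name; the respond function is a hypothetical helper used only for illustration.

```python
import os
import sys

def respond(environ):
    """Build a complete CGI response: header, blank line, then the body."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    body = "Method: {}\nQuery string: {}\n".format(method, query)
    return "Content-Type: text/plain\r\n\r\n" + body

if __name__ == "__main__":
    # When run by a web server, os.environ carries the request details.
    sys.stdout.write(respond(os.environ))
```

Passing the environment in as an argument keeps the function easy to try out without a web server, for example with respond({"REQUEST_METHOD": "GET", "QUERY_STRING": "pie=apple"}).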

CGI scripts are usually simple affairs, and CGI was designed as a way to build dynamic websites at a time when websites were much simpler than they are today. CGI scripts can be written in many different programming languages, but the most common languages for CGI scripts are Perl and Python.

CGI scripts are usually stored in a designated directory called cgi-bin. In most cases, this directory is contained in the root directory designated for storing webpage resources. On many hosting accounts running Apache, this root directory is called public_html, and the CGI bin can generally be found at public_html/cgi-bin.

The normal arrangement is for a domain to point at the web root directory (/public_html). So, if the domain http://example.com were configured to pull documents from the public_html folder, scripts in the CGI bin would be accessed by going to http://example.com/cgi-bin/script.py, where script.py is the name of the CGI script to be executed.
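The mapping from URL to file can be sketched in a few lines. The web root path below is a hypothetical example, and a real server does more work than this (such as normalizing the path and rejecting attempts to climb out of the web root).

```python
import posixpath
from urllib.parse import urlsplit

WEB_ROOT = "/home/user/public_html"  # hypothetical location of the web root

def script_path_for(url, web_root=WEB_ROOT):
    """Return the file under the web root that a URL would point at."""
    url_path = urlsplit(url).path            # e.g. "/cgi-bin/script.py"
    return posixpath.join(web_root, url_path.lstrip("/"))

print(script_path_for("http://example.com/cgi-bin/script.py"))
# prints /home/user/public_html/cgi-bin/script.py
```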

Running a CGI Script

Web servers recognize CGI scripts based on two factors: location and extension.

  • Location, as we've already discussed, is usually public_html/cgi-bin.
  • Extension refers to the suffix at the end of the name of the file that contains the script.

In the case of a Python script, the usual extension is .py while Perl scripts are designated as such by the .pl extension. However, some CGI programmers simply default to the use of .cgi for all CGI scripts.

CGI scripts are run, or executed, when the script extension and location both indicate that the file in question is in fact a CGI script. Most commonly, as we already mentioned, this is done by simply calling the script from its position in the CGI bin by requesting a URL that points to the CGI script.
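The two-factor check can be pictured with a small sketch. This is only an illustration of the idea, not how any particular web server implements it; the directory and extension lists are assumptions drawn from the examples in this primer.

```python
import posixpath

CGI_DIRECTORIES = ("/cgi-bin/",)          # locations where scripts may run
CGI_EXTENSIONS = (".py", ".pl", ".cgi")   # extensions treated as scripts

def looks_like_cgi(url_path):
    """Return True when both the location and the extension mark a CGI script."""
    in_cgi_directory = any(url_path.startswith(d) for d in CGI_DIRECTORIES)
    extension = posixpath.splitext(url_path)[1].lower()
    return in_cgi_directory and extension in CGI_EXTENSIONS

print(looks_like_cgi("/cgi-bin/script.py"))   # True: right place, right extension
print(looks_like_cgi("/cgi-bin/logo.png"))    # False: wrong extension
print(looks_like_cgi("/images/script.py"))    # False: wrong location
```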

However, it isn't always possible or desirable to place the script in the CGI bin. In that case, adding the following directives to a .htaccess file on an Apache web server will cause the server to treat that directory (and every subdirectory) as a CGI bin.


Options +ExecCGI
AddHandler cgi-script .py

As you may have guessed, that code is specific to Python scripts — the .py extension is a dead giveaway. To do the same thing with Perl scripts or those identified by the .cgi extension, add the following instructions to a .htaccess file in the directory where you wish to store and execute CGI scripts:


Options +ExecCGI
AddHandler cgi-script .pl .cgi

Let's say that you wanted to keep all of your CGI scripts in a subdirectory called /process. All you would have to do is create a text file titled .htaccess and paste in the code from the examples above, and then drop it into a directory located at public_html/process. Then you'd be able to place your scripts in that directory and execute them by using the following URL format: http://example.com/process/script.py.

Lastly, if you don't want to call a script directly, you can run a script by calling it from within an HTML document. For example, if you want to pass data from an HTML form to a CGI script you could use the following syntax to do so.


<form method="get" action="cgi-bin/form.py">
  <!--form elements added here-->
</form>

When the form is submitted, the data will be passed to the CGI script named form.py in the /cgi-bin directory. That script will process the form input and send a response back to the web browser.

Sending Data to a Script

The keen-eyed programmers among you may have noticed the use of the method="get" attribute to send data back to the server using the HTTP GET method. If you use this method, the data submitted with the form will be attached to the URL and passed to the CGI script in the form of URL arguments.
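To see what "attached to the URL" looks like, this short sketch builds the URL a browser would request for a submitted form. The field values are made up for illustration; urlencode is the standard library function for this kind of encoding.

```python
from urllib.parse import urlencode

form_data = {"name": "Ada", "pie": "key lime"}   # hypothetical form values
url = "http://example.com/cgi-bin/form.py?" + urlencode(form_data)
print(url)
# prints http://example.com/cgi-bin/form.py?name=Ada&pie=key+lime
```

Each field becomes a name=value pair, pairs are joined with &, and characters such as spaces are encoded so that the URL stays valid.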

The GET method is usually the first method programmers use because it is generally perceived as an easier method to learn how to use. While that may be true with some technologies, with CGI, data is handled the exact same way whether you submit it using GET or the alternative, POST.

The real difference between the two is that the GET method attaches the data to the URL while POST does not. Instead, POST adds the data to the message body sent along with the HTTP request.
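From the script's side, that difference comes down to where the raw data is read from. The sketch below assumes the standard CGI variables REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH, and a simple URL-encoded form containing only plain ASCII text; read_form_data is a hypothetical helper name.

```python
import io
from urllib.parse import parse_qs

def read_form_data(environ, stdin):
    """Return submitted fields as a dict of lists, for either GET or POST."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # POST: the data arrives in the request body, on standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the data arrives in the URL, exposed as QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)

payload = "name=Ada&pie=apple"
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": payload}
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(payload))}

print(read_form_data(get_env, io.StringIO("")))
print(read_form_data(post_env, io.StringIO(payload)))
```

Both calls yield the same fields, which is why the rest of a script can treat the two methods identically once the data has been parsed.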

One practical reason to use GET is that GET URLs can be bookmarked since all of the data required to process the script is contained in the URL. Scripts processed with data submitted with POST cannot be bookmarked. However, many browsers and servers limit URLs to around 2,000 characters in practice, so there is a limit to how much data you can send using the GET method.
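As a rough illustration of that trade-off, the sketch below measures the URL a form submission would produce and suggests a method. The 2,000-character figure is the approximate limit mentioned above rather than a hard standard, and suggested_method is a hypothetical helper.

```python
from urllib.parse import urlencode

URL_LIMIT = 2000  # approximate practical limit, not a formal standard

def suggested_method(action_url, fields, limit=URL_LIMIT):
    """Suggest "get" when the full URL fits within the limit, else "post"."""
    full_url = action_url + "?" + urlencode(fields)
    return "get" if len(full_url) <= limit else "post"

action = "http://example.com/cgi-bin/form.py"
print(suggested_method(action, {"pie": "apple"}))       # get
print(suggested_method(action, {"essay": "x" * 5000}))  # post
```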

Writing a Basic Script with HTML and Python

Now that we know the basic mechanics behind CGI scripts, let's write a simple script. Fire up your text editor and create a new empty file called hello.py. Next, paste or type the following code into the file:


#!/usr/bin/python3

print("Content-type:text/html")  # HTTP header describing the output
print()                          # blank line that ends the header

Every CGI script must begin with a line telling the server where to find the interpreter that should run it. On Apache servers with support for Python, the interpreter can usually be found at the location shown in the script above, but if you're unsure, check with your hosting provider.

The next line begins the definition of the script output. Every CGI script sends something back to the browser. Before any other output, the script must print a line declaring the output as HTML (or whatever alternate format you're using). This information will be sent back to your browser in an HTTP header.

The last line prints an extra blank line following the content type declaration. This blank line signals the end of the HTTP header; everything after it is treated as the body of the response. If you don't like seeing the empty print line, you can instead end the content type string with a line break. Combined with the line break that print adds on its own, this produces the same blank line, like this:


print("Content-type:text/html\r\n")
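To show why the blank line matters, here is a small sketch of how a response is assembled and then split back into header and body at the first blank line. Both function names are hypothetical helpers written for this illustration.

```python
def build_response(headers, body):
    """Join the header lines, add the blank separator line, then the body."""
    header_block = "".join(
        "{}: {}\r\n".format(name, value) for name, value in headers.items()
    )
    return header_block + "\r\n" + body

def split_response(response):
    """Split a response at the first blank line into header lines and body."""
    header_block, _, body = response.partition("\r\n\r\n")
    return header_block.split("\r\n"), body

response = build_response({"Content-type": "text/html"}, "<h1>Hello, World!</h1>")
header_lines, body = split_response(response)
print(header_lines)  # ['Content-type: text/html']
print(body)          # <h1>Hello, World!</h1>
```

Without the blank line there would be no way to tell where the header stops and the page content begins.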

Let's throw a few more lines of output into the mix so that we can get this script running. Here's a complete script that will print out a classic programming exercise:


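#!/usr/bin/python3

# HTTP header, followed by the blank line that ends the header.
print("Content-type:text/html\r\n")

# The HTML document that makes up the body of the response.
print("""\
<html>
  <head>
    <title>Learn CGI | Hello Py</title>
  </head>
  <body>
    <h1>Hello, World!</h1>
  </body>
</html>
""")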

Save all of that code into your hello.py file and use an FTP client to upload it to the CGI bin on your hosting account. Make sure the file is executable by setting permissions to 755. Finally, access the file by going to http://yourdomain.com/cgi-bin/hello.py. When you do, a page that looks something like this will greet you.

Hello, World!

Once you get that simple script working you will have run your first CGI script.
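If you would rather set the 755 permissions from Python than from an FTP client, a sketch along these lines should work on Unix-like systems. It uses a temporary file as a stand-in for hello.py, and make_executable is a hypothetical helper name.

```python
import os
import stat
import tempfile

def make_executable(path):
    """Set permissions to 755 and return the resulting permission bits."""
    os.chmod(path, 0o755)  # owner: read/write/execute; everyone else: read/execute
    return stat.S_IMODE(os.stat(path).st_mode)

# A temporary file stands in for the uploaded script.
with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as handle:
    script_path = handle.name

mode = make_executable(script_path)
print(oct(mode))
os.remove(script_path)
```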

Congratulations! Let's keep moving.

Working with Form Data

One common use for CGI scripts is to use them to receive data from an HTML form and manipulate that data in some way. Let's do that next. First, we'll need to code a simple HTML form.

Create a new file, name it form.html and paste the following code into it:

<!doctype html>
<html>
  <head>
    <title>Learn CGI | Favorite Py</title>
  </head>
  <body>
    <form method="get" action="cgi-bin/favorite.py">
      <p>Hi! What is your name?<br><input type="text" name="name"></p>
      <p>What is your favorite kind of pie?<br><input type="text" name="pie"></p>
      <p><input type="submit" value="Submit"></p>
    </form>
  </body>
</html>

Upload that file to the root directory of your website. It should be in the same directory as your cgi-bin.

Next, create another new file, name it favorite.py, and paste this bit of code into it:

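#!/usr/bin/python3

# Note: the cgi module was removed in Python 3.13, so this needs an older Python 3.
import cgi
import html

# HTTP header, followed by the blank line that ends the header.
print("Content-type:text/html\r\n")

form = cgi.FieldStorage()

# Read each field and escape it so that user input can't inject HTML into the page.
name = html.escape(form.getfirst('name', ''))
pie = html.escape(form.getfirst('pie', ''))

if name == "" or pie == "":
    print("<p>Oops!</p>")
    print("<p>You didn't fill out all the fields!</p>")
    print("<p><a href='../form.html'>Try again</a>?</p>")
else:
    print("<p>Hi, " + name + "!</p>")
    print("<p>Thanks for letting us know that your favorite kind of pie is " + pie + " pie.</p>")
    print("<p>Care to <a href='../form.html'>change your answer</a>?</p>")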

Now upload that file into the cgi-bin directory.

When you have things working properly you will be able to access your form by going to http://yourdomain.com/form.html. Then, when you enter your details and hit the Submit button you'll be greeted either by a message letting you know that you need to fill out all of the fields or by a greeting that thanks you for sharing your pie preferences.
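The cgi module used above was deprecated in Python 3.11 and removed in Python 3.13. As one possible alternative for a GET form like this one, the same logic can be sketched with only urllib.parse and html from the standard library. The favorite_pie_page function is a hypothetical helper, and its wording is shortened from the script above.

```python
import html
from urllib.parse import parse_qs

def favorite_pie_page(query_string):
    """Return the HTML body for the pie form, given the raw query string."""
    fields = parse_qs(query_string)        # blank fields are dropped by default
    name = fields.get("name", [""])[0]
    pie = fields.get("pie", [""])[0]
    if name == "" or pie == "":
        return "<p>Oops!</p>\n<p>You didn't fill out all the fields!</p>"
    # Escape user input so it is displayed as text rather than treated as HTML.
    return "<p>Hi, {}!</p>\n<p>Your favorite kind of pie is {} pie.</p>".format(
        html.escape(name), html.escape(pie)
    )

print(favorite_pie_page("name=Ada&pie=apple"))
```

In a real script, the query string would come from the QUERY_STRING environment variable, and the result would be printed after the content type header.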

While this particular example may seem trivial, getting it to work and digging through the code will teach you several valuable skills:

  • How to pass data to a CGI script (favorite.py) from a website file (form.html).
  • How to use an if statement to check for form input and modify the script output based on the results.
  • How to access data submitted with a form within a CGI script.

With those basic lessons under your belt you'll be prepared to begin writing and implementing genuinely useful CGI scripts.

Additional Examples

  • Python Scripts from Real Python: this GitHub repository includes 33 Python scripts that you can adapt for use in a CGI bin. Available scripts include some useful tools, such as an income tax calculator and JSON to YAML converter, as well as fun examples, like a Twitter bot website link crawler. These scripts aren't set up to work as CGI scripts right out of the gate, but using what you've learned in this primer you will be able to get them up and running.
  • BoboMail: a webmail application written entirely in Python. This free script isn't quite up to production standards — the current release version is 0.6 — but get it up and running in a local environment and use it as a learning tool to see how you can build an entire web application using Python and CGI.
  • Tutorial: How to Send Email with Python: one of the most common tasks tackled with CGI scripts is to send an email based on a specific action, such as a form submission. This tutorial includes practical code samples you can use to add email-sending capability to your scripts and even shows you how to encode attachments properly for transmission.
  • Read and Parse JSON via URL: if you want to pull data from an API that generates JSON data, this simple script will give you the basic building blocks to pull down the data you need.
  • What Is My IP -- Python CGI Script: a simple script that generates a "What's my IP?" website that will detect and display a website visitor's IP address.
  • Wunder: a script that loads local weather information pulled from the Weather Underground API based on an iPhone user's GPS location.
  • Check XMPP DNS: a CGI script that checks and displays the DNS SRV records associated with any URL.

Summary

CGI is a mechanism you can use to easily execute simple server-side scripts. Broad adoption of CGI means that scripts written in languages like Python and Perl can be readily used on virtually all web servers. While CGI is simple and useful, keep in mind that it is not the most efficient mechanism available for the execution of server-side scripts, and busy websites will want to make use of modern alternatives such as FastCGI.
