Sorting Algorithms Introduction and Resources
Have you ever wondered how a computer puts things in order? When you click the A-Z button in Excel, what's really going on?
Computers sort lists by following procedures called "sorting algorithms." An algorithm is a set of instructions that can be followed over and over, mechanistically. There are several different sorting algorithms — different procedures for putting lists into their proper order. Each one has its strengths and weaknesses.
Here are the most popular sorting algorithms.
The bubble sort is an inefficient, but simple, algorithm. It is rarely used in practice, but it is easy to understand.
The bubble sort works by comparing successive pairs of elements in a list. The first and second elements are compared, then the second and third, then the third and fourth, and so on through the list. After each comparison, the two elements are swapped if they are out of order. After a single iteration, the largest element will have "bubbled up" to the end of the list. This procedure is repeated over and over until the list is sorted.
In the bi-directional variant, comparisons are made alternately forward through the list and then backwards through the list. This grows a sorted section on each end of the list. This is also called the "cocktail shaker sort."
When to Use Bubble Sort
Never, other than than for demonstration.
More Information About Bubble Sort
- Bubble Sort explanation from the Algorithmist, with example code;
- Bubble Sort Algorithm coded in over 100 different programming languages;
- Bubble Sort illustrated with Legos;
- Bubble Sort illustrated with Hungarian folk dancing.
The insertion sort is how most people intuitively sort things manually — for example, when alphabetizing papers. It involves moving each element, in turn, into its proper place within an already-sorted portion.
The first element of the list is considered "sorted." Then the second element is considered. If it is larger than the element before it, it is in the correct place and forms part of the "sorted" section. Otherwise, it is moved backwards one slot and compared with the element behind it. The element continues moving backward until it is in the correct position within the sorted portion of the list. This procedure is then repeated with the third, fourth, fifth elements, and so on.
When to Use Insertion Sort
Insertion sort is generally the fastest solution for sorting small lists (10 or fewer elements), and performs at least adequately in medium-sized lists. It can become prohibitively slow in larger sets, however.
Insertion sort also performs well, even in large sets, if the list is already mostly sorted. This makes it ideal for cleaning up legacy human-sorted data, or for using in conjunction with other sorting algorithms. For example, it is common to begin with a quick sort, and then switch to insertion sort after a few iterations.
More Information About Insertion Sort
- Insertion Sort explanation and coding exercise, at Khan Academy;
- Insertion Sort Algorithm coded in nearly 100 different programming languages;
- Insertion Sort illustrated with styrofoam cups and narrated by a Harvard CS student;
- Insertion Sort illustrated with Hungarian folk dancing.
The heap sort is a two-part sort that uses a binary heap data structure as an intermediate holder for the elements in the list.
A binary heap is a tree structure in which each node has up to two child nodes. The largest element is at the top of the tree. Each child node is smaller than its parent node. This means that the process of building a heap partially sorts the list. Once the heap is built, the largest element is removed and placed into the sorted array — then the largest of the remaining heap, and so on.
When to Use Heap Sort
On average, the heap sort provides good performance. In addition, its worst-case scenario is much better than the worst-case scenario for nearly any other algorithm. It is therefore often used with large and distributed data sets.
More Information on Heap Sort
- Detailed Explanation of Heap Sort from a CS professor at Kent State;
- Heaps and Heap Sort Video Lecture from MIT;
- For an in-depth understanding of heaps and heap sort, see this Five-Part Video Series.
Merging together two already-sorted lists into a new already-sorted list is relatively simple: compare the first element of each and place the smaller into the new list, repeat. This is the basis for the merge sort.
To begin a merge sort, a list is divided into a set of 1-element "lists." Then, pairs of these lists are merged into 2-element lists, which are then merged into 4-element lists, and so on until the entire list is merged back together.
When to Use Merge Sort
Merge sort is very efficient. The only problem is that merge sort usually requires enough active memory (RAM) to hold the list twice. It can therefore become unfeasible in certain circumstances.
More Information about Merge Sort
- Merge Sort from Khan Academy is an excellent six-part lesson with programming exercises included;
- Merge Sort coded in over 80 different programming languages;
- Merge Sort with German Folk Dance provides a dance routine demonstrating the sorting algorithm.
Quick sort is not very intuitive. It is, however, very efficient.
The first step in a quick sort is to select a pivot element. This can be any element in the list. Next, all values greater than the pivot are placed after it, while all values less than are placed before it. This creates two partitions which are not sorted, but are in correct relation to each other. This same procedure is done on each partition, then on each sub-partition, and so on. When the partitions reach a size of one element each, the list is sorted.
When to Use Quick Sort
Quick sort is one of the most popular sorting algorithms for practical use. Average speed is very fast. However, the worst case scenario (which happens rarely in natural data sets) becomes problematic with very large sets.
More Information about Quick Sort
- Quick Sort coded in over 100 programming languages;
- Quick Sort Explained with Blocks from Harvard's CS-50 class;
- Quick Sort Tutorial from Tutorials Point has a lot of detail, sample code, and animated images.
General Sorting Algorithm Resources
- Data Structure — Sorting Techniques is a very detailed series from Tutorials Point;
- Sorting Algorithms has visualizations of every sort, under different conditions — you can even watch different algorithms race each other;
- 15 Sorting Algorithms in 6 Minutes is a fascinating video visualization;
- Getting Sorted is video exploring the problem of computerized sorting;
- What Different Sorting Algorithms Sound Like is an incredible "audibilization" video translating several sorting algorithms into sound.
Most of the time, when programming, if you need something trivial sorted — a short array of values, a column of data — you'll just use whatever method or function is built into your programming language or favorite library.
But when working on large data applications, its important to think about how your choice of algorithm will affect the performance of your system. Beyond that, sorting algorithms are a fundamental aspect of computer science, and should be well understood by anyone working in development or programming.
Further Reading and Resources
We have more guides, tutorials, and infographics related to coding and development:
- C++ Developer Resources: one of the most popular programming languages, good for most kinds of applications.
- Unicode Introduction and Resources: learn all about character encoding.
- MantisBT Introduction and Resources: MantisBT is one of the most popular bug tracking programs.
What Code Should You Learn?
Confused about what programming language you should learn to code in? Check out our infographic, What Code Should You Learn? It not only discusses different aspects of the languages, it answers important questions such as, "How much money will I make programming Java for a living?"