SNOBOL Text Processing and Programming Language
SNOBOL (StriNg Oriented and symBOlic Language) is a family of programming languages originally developed in the mid-1960s, primarily for text processing and string analysis.
A Quick Note About Versions and Implementations
The last stable release of SNOBOL by the original developers was SNOBOL4, in 1967. Books and websites variously write "SNOBOL," "SNOBOL4," and sometimes "Snobol." In anything published after 1967, these all refer to the same (final) version of the language.
There were also a handful of extensions and implementations. Snocone is a preprocessor that adds syntactic sugar to SNOBOL, making it easier to use. SPITBOL is a compiler for SNOBOL, which is of particular interest because SNOBOL was originally thought to be uncompilable. There is also the Snowball programming language, which was inspired by and named after SNOBOL.
Because of these and other extensions, some people use the phrase "Vanilla SNOBOL" when referring to code that relies only on the original SNOBOL4 specification, and not on any additional features.
About the Language
SNOBOL was created specifically for text and string manipulation. Because of this, it has an unusual feature: patterns are first-class data types. Patterns themselves can be stored, combined, and manipulated just like any other data structure. Additionally, strings can be treated as code and evaluated. Together, these features allow recursive use of patterns and highly complex string processing and analysis. A SNOBOL program can even change its own source code.
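To make the idea of patterns as ordinary values more concrete, here is a rough Python analogy. It is not SNOBOL code, and it is far simpler than SNOBOL's pattern matcher (there is no backtracking, for instance). The names `lit`, `alt`, `seq`, and `matches` are hypothetical helpers invented for this sketch.

```python
from typing import Callable, Optional

# In this sketch a "pattern" is a plain function value:
# it takes (text, position) and returns the position after the match,
# or None if it does not match there.
Pattern = Callable[[str, int], Optional[int]]


def lit(s: str) -> Pattern:
    """Pattern that matches the literal string s."""
    def match(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return match


def alt(*choices: Pattern) -> Pattern:
    """Pattern that tries each alternative in order (first success wins)."""
    def match(text: str, pos: int) -> Optional[int]:
        for p in choices:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return match


def seq(*parts: Pattern) -> Pattern:
    """Pattern that matches each part one after another."""
    def match(text: str, pos: int) -> Optional[int]:
        for p in parts:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return match


def matches(p: Pattern, text: str) -> bool:
    """True if pattern p consumes the whole of text."""
    return p(text, 0) == len(text)


# Patterns are stored in variables and reused to build bigger patterns.
color = alt(lit("red"), lit("green"), lit("blue"))
item = alt(lit("car"), lit("bike"))
phrase = seq(color, lit(" "), item)

print(matches(phrase, "green bike"))  # True
print(matches(phrase, "pink car"))    # False
```

Because `color` and `item` are just values, they can be passed to functions, kept in lists, or swapped at run time, which is the property the paragraph above describes.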
Patterns in SNOBOL can be simple, like short spans of text or regex-like character-type strings. But they can also be exceedingly complex, like a complete formal description of the grammar of a language. Programming language interpreters can be written in SNOBOL, as can natural language grammar analyzers, spell checkers, and (in theory) translation engines.
SNOBOL was very popular in computer science academia in the 1960s and 70s, and was used extensively in the humanities through the 1980s. It has largely fallen out of use at this point, in favor of less powerful regular expression programming in languages like Awk and Perl. There are still a handful of loyal SNOBOL developers out there, and the language has the potential to be just as useful as ever.
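As a small illustration of a pattern that describes a grammar, the sketch below recognizes arithmetic expressions with nested parentheses, something classic regular expressions cannot do because the nesting depth is unbounded. It is written in Python rather than SNOBOL, the grammar is an assumption chosen for the example, and the function names are hypothetical. Whitespace is deliberately not handled.

```python
from typing import Optional

# Grammar assumed for this sketch:
#   expr   := term   (("+" | "-") term)*
#   term   := factor (("*" | "/") factor)*
#   factor := number | "(" expr ")"
# Each function returns the position after a match, or None on failure.


def number(text: str, pos: int) -> Optional[int]:
    """Match one or more decimal digits."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return end if end > pos else None


def factor(text: str, pos: int) -> Optional[int]:
    """Match a number or a parenthesized expression."""
    end = number(text, pos)
    if end is not None:
        return end
    if text.startswith("(", pos):
        end = expr(text, pos + 1)  # the grammar refers back to itself here
        if end is not None and text.startswith(")", end):
            return end + 1
    return None


def term(text: str, pos: int) -> Optional[int]:
    """Match factors joined by * or /."""
    end = factor(text, pos)
    while end is not None and end < len(text) and text[end] in "*/":
        end = factor(text, end + 1)  # None here means a dangling operator
    return end


def expr(text: str, pos: int) -> Optional[int]:
    """Match terms joined by + or -."""
    end = term(text, pos)
    while end is not None and end < len(text) and text[end] in "+-":
        end = term(text, end + 1)  # None here means a dangling operator
    return end


def is_expression(text: str) -> bool:
    """True if the whole string is a valid expression."""
    return expr(text, 0) == len(text)


print(is_expression("2*(3+4)-5"))  # True
print(is_expression("2*(3+4"))     # False
```

The recursion from `factor` back into `expr` is what lets the recognizer follow arbitrarily deep nesting; SNOBOL expresses the same kind of self-reference directly inside its pattern values.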
- A Snobol4 Tutorial, a tutorial from 1985 by Mark Emmer;
- Emmer also wrote Vanilla Snobol4: Tutorial and Reference Manual (PDF) and Macro SPITBOL: The High Performance SNOBOL4 Language (PDF);
- Using SNOBOL/SITBOL on TWENEX.ORG: a tutorial for the SITBOL implementation of SNOBOL, for use at the SDF Public Access TOPS-20 system. There are some oddly specific platform instructions here, but also a good tutorial on SNOBOL itself;
- SNOBOL4 Powerpoint Presentation: just the slides from a presentation on SNOBOL. It is not a great stand-alone introduction, but worth a look for an overview of key concepts;
- Using SNOBOL on MTS, a guide to using the language on the mainframe Michigan Terminal System — this can be useful, along with the Hercules emulator, if you need to work on a legacy SNOBOL system.
- SPITBOL, a compiled implementation of SNOBOL, available on GitHub;
- SnoPy, a Python library that lets you use SNOBOL-based text patterns;
- Mini SNOBOL Interpreter, written in F#;
- Macro implementation of SNOBOL4 in C.
Community and Ongoing Learning
- Yahoo Email Group, for SNOBOL developers and people working with similar text-processing technology;
- SNOBOL4.com, a website about the language from a company founded by Mark Emmer, writer of several books and tutorials on the language;
- The SNOBOL listserv.
Books about SNOBOL
- General Books on the Language:
- A Snobol4 Primer, by Ralph Griswold: a basic introduction to the language, written by one of its inventors;
- The Snobol4 Programming Language, by Ralph Griswold: called "the Green Book," this is the classic book on the language;
- String and List Processing in Snobol 4: Techniques and Applications, by Ralph Griswold;
- Programmer's Introduction to Snobol, by Ward Douglas Maurer.
- Special Topics in SNOBOL Programming:
- SNOBOL Programming for the Humanities, by Susan Hockey;
- Algorithms in Snobol 4, by James Gimpel;
- The Macro Implementation of Snobol 4: A Case Study of Machine-Independent Software Development, by Ralph Griswold.
Should I learn SNOBOL?
SNOBOL is not a terribly popular language, and there aren't a lot of employers looking for SNOBOL developers. So, from a career advancement standpoint, you are better off focusing on more in-demand languages.
However, if you are interested in text-centric computing (search, translation, natural-language processing, literary analysis), you might want to spend some time with SNOBOL, especially if you've already pushed the boundaries of what can be accomplished with regular expressions.
Other Text Tools
If you're interested in SNOBOL, you'll want to check out some of these other tools for processing and analyzing text.
- Natural Language Toolkit, a Python platform for working with human language data;
- Stanford CoreNLP, a suite of Java-based tools for natural language analysis;
- Awk, a scripting language designed specifically for text processing;
- Perl, another scripting language, widely considered to have the best regular expression implementation available;
- ANTLR (ANother Tool for Language Recognition), which can be used for parsing both natural and artificial (computer) languages;
- Apache OpenNLP, a machine learning toolkit for natural language processing;
- Apache Lucene, a suite of search software tools in Java and Python;
- GATE, General Architecture for Text Engineering, a framework for "solving almost any text processing problem";
- Prolog, a logic programming language invented for natural language processing;
- Icon, another text-processing language created by Ralph Griswold after his work on SNOBOL.
You might also want to read Taming Text: How to Find, Organize, and Manipulate It, by Ingersoll, Morton, and Farris. The book provides a great overview of text processing, with examples using several of the software tools listed above.
Finally, check out TAPoR3, a website and online community dedicated to tools for analyzing text.
Further Reading and Resources
We have more guides, tutorials, and infographics related to coding and development:
- Perl Guide and Resources: this is an excellent guide to getting started with this powerful scripting language.
- Awk Resources: learn this powerful scripting language available on most computers.
- Prolog Resources: this will get you started with this iconic logic programming language.
Natural Language Processing Comes to Life!
The science of natural language processing has come a long way since the days of SNOBOL. Find out all about it in our infographic, How to Avoid Falling in Love with a Chatbot. It covers the long history of "thinking" computers — and might even save you from a broken heart!