Cron is a software utility on Linux and other Unix-based systems which runs commands based on timing rules defined in cron tables (crontab). This can be used for automating common tasks that need to be repeated on a scheduled basis.
Do you want to run regularly scheduled smoke testing on your production server every night? Do you want to clear the page cache once an hour? Do you need to back up the database three times a day and save the output file on another machine?
You don’t need a calendar reminder and a low-paid intern. You need a cron job.
How does cron work?
Cron is a utility that runs in the background of the operating system and executes commands as scheduled.
This is accomplished a few different ways on different systems, but they tend to follow the same pattern:
- The cron process checks the crontab file(s) to find the soonest command it is scheduled to run.
- Sets itself an “alarm clock” and sleeps until that task needs to be run.
- Wakes up in time to run the task.
- Upon completing the task, checks the schedule for the next soonest activity. The cycle repeats.
Naturally, the implementation details (how the “sleep” and “wakeup” processes are handled, for example) vary from cron tool to cron tool, as well as from operating system to operating system. The notion of “cron” encompasses the idea of scheduling and the way users interact with the system, not the details of implementation.
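The check, sleep, wake, run cycle described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not how any particular cron daemon is written: the function name `run_schedule`, the job tuples, and the `FakeClock` helper are all made up for this example, and jobs here repeat at a fixed interval rather than following crontab rules.

```python
import heapq


def run_schedule(jobs, now, sleep, max_runs):
    """Run jobs in time order, sleeping until each one is due.

    jobs: list of (first_run_time, interval_seconds, name, action) tuples.
    now / sleep: clock functions (hypothetical hooks so the loop can be tested).
    Returns the list of (due_time, name) pairs that were run.
    """
    # The index breaks ties so the heap never has to compare two actions.
    queue = [(first, i, interval, name, action)
             for i, (first, interval, name, action) in enumerate(jobs)]
    heapq.heapify(queue)
    ran = []
    while queue and len(ran) < max_runs:
        # 1. Find the soonest scheduled job.
        due, order, interval, name, action = heapq.heappop(queue)
        # 2. Sleep until it is due.
        wait = due - now()
        if wait > 0:
            sleep(wait)
        # 3. Wake up and run it.
        action()
        ran.append((due, name))
        # 4. Put it back on the schedule and repeat.
        heapq.heappush(queue, (due + interval, order, interval, name, action))
    return ran


class FakeClock:
    """A simulated clock so the example finishes instantly."""

    def __init__(self):
        self.t = 0

    def now(self):
        return self.t

    def sleep(self, seconds):
        self.t += seconds


clock = FakeClock()
demo_jobs = [(60, 60, "clear-cache", lambda: None),
             (90, 120, "backup", lambda: None)]
print(run_schedule(demo_jobs, clock.now, clock.sleep, max_runs=4))
```

With a real clock you would pass `time.time` and `time.sleep` instead of the simulated ones.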
How to use a cron scheduler
If you have access to the server’s file system, setting up cron jobs is fairly easy.
The scheduling files are called “crontabs” or “cron tables.” There is one file for the entire system (in the /etc/ folder, usually), as well as (in newer systems) one for each user. User crontabs run commands as that user, and so are dependent on that user’s permissions. The system-wide crontab runs as an administrator, and so that file can only be edited by a user with admin privileges.
A crontab file is simply a plain text file with a single line for each scheduled job. It might look like this:
30 08 10 06 * /home/backup/backup.rb
00 11,16 * * * /home/python-tests/smoke.py
00 09-18 * * * /home/emailer/notifications.php
00 09-18 * * 1-5 /home/gps/dispatch.ping.js
This may look confusing, but it’s fairly simple. Each line represents a single scheduled job. The numbers and asterisks represent the schedule (when to do something) and the text afterward is a shell command. At the scheduled time, cron runs the command exactly as if a user typed that command into a terminal window.
In the example above, as is usual, these aren’t specific commands to do something on their own, but rather scripts that will be run. The logic of actually backing up, or running smoke tests, or emailing notifications, or pinging the gps server is all contained in files stored elsewhere.
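To make the "schedule first, command after" layout concrete, here is a minimal Python sketch that splits crontab lines into their two halves. It assumes the five-field user crontab format and is not a full parser: `split_crontab_line` and `parse_crontab` are hypothetical helper names, and things like environment variable lines or the extra user column found in some system crontabs are not handled.

```python
def split_crontab_line(line):
    """Split one crontab line into (schedule, command)."""
    # Split on whitespace at most five times: five schedule fields,
    # then everything that is left over is the command.
    parts = line.split(None, 5)
    if len(parts) < 6:
        raise ValueError(f"not a complete crontab entry: {line!r}")
    return " ".join(parts[:5]), parts[5].strip()


def parse_crontab(text):
    """Parse crontab text, skipping blank lines and # comments."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        entries.append(split_crontab_line(stripped))
    return entries


SAMPLE = """\
# the example crontab from this article
30 08 10 06 * /home/backup/backup.rb
00 11,16 * * * /home/python-tests/smoke.py
00 09-18 * * * /home/emailer/notifications.php
00 09-18 * * 1-5 /home/gps/dispatch.ping.js
"""

for schedule, command in parse_crontab(SAMPLE):
    print(f"{schedule:<18} -> {command}")
```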
If you wanted to do something simple with cron, there’s no reason you couldn’t just type the shell commands directly into the crontab file:
0 0 * * * mv /home/app/error.log /home/errors/$(date +\%F).log
This renames the error log to a filename based on the current date and moves it to a special directory for such logs. (The backslash before the percent sign is needed because cron treats an unescaped % in a command as a line break.) Setting up this job is an easy way to make sure error logs don’t accumulate into one giant file.
(Of course, another way would be to create date-based files from within an application’s error reporting, as the errors are generated. But you might need to do it this way.)
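If the job grows beyond a one-liner, the same rename-and-move step can live in a script that cron calls. The following is a rough Python equivalent of the command above, offered as a sketch: `rotate_log` is a hypothetical function name, and the script path in the comment is an assumption, not something from a real setup.

```python
import shutil
from datetime import date
from pathlib import Path


def rotate_log(source, archive_dir, today=None):
    """Move `source` into `archive_dir`, renamed to <YYYY-MM-DD>.log."""
    today = today or date.today()
    archive_dir = Path(archive_dir)
    # Create the archive directory if it does not exist yet.
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{today.isoformat()}.log"
    shutil.move(str(source), str(target))
    return target


# A script run by cron would end with a call like this one
# (paths taken from the crontab example in the article):
#
#     rotate_log("/home/app/error.log", "/home/errors")
#
# and the crontab line would point at the script, for instance
# (hypothetical script location):
#
#     0 0 * * * python3 /home/app/rotate_log.py
```

Keeping the date as an optional parameter makes the function easy to try out without waiting for midnight.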
The weird numbers at the beginning of every line refer to the schedule. The notation is a little hard to get used to, but it’s manageable (and you can always look it up).
There are five standard “slots,” each separated by a space, and each representing a unit of time: minute, hour, day of the month, month, and day of the week. Some cron implementations also accept an optional sixth slot for the year.
<minute> <hour> <month-day> <month> <week-day> <year>
* * * * * *
| | | | | |
| | | | | +-- Year (optional, not supported by every cron; range: 1900-3000)
| | | | +---- Day of the Week (range: 0-7, 1 standing for Monday; 0 and 7 both stand for Sunday in most implementations)
| | | +------ Month of the Year (range: 1-12)
| | +-------- Day of the Month (range: 1-31)
| +---------- Hour (range: 0-23)
+------------ Minute (range: 0-59)
The cron scheduler is looking for pattern matches between the numbers and the system’s clock. An asterisk means any value matches.
So for example:
* * * * *
This will match EVERY check against the clock, so it will run every single minute.
If you need to run something every five minutes, you can do this:
*/5 * * * *
To run the command once a year, you could do:
0 0 1 1 *
That means that the schedule matches when the minute is 0 and the hour is 0 (midnight), on the first day of the first month. The asterisk means it doesn’t matter what day of the week it is. This job will run once a year, on January 1.
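The pattern matching described in this section can be illustrated with a short Python sketch. It handles the notations used in this article (asterisks, single values, comma lists, ranges, and `*/n` steps) for the five standard fields only. The names `expand` and `matches` are made up for the example, and the sketch simply requires every field to match, which is a simplification: real cron implementations have extra rules, for example when both the day of the month and the day of the week are restricted.

```python
from datetime import datetime

# (name, lowest, highest) for the five standard fields, in crontab order.
FIELDS = [("minute", 0, 59), ("hour", 0, 23), ("day", 1, 31),
          ("month", 1, 12), ("weekday", 0, 7)]


def expand(field, low, high):
    """Return the set of integers that one crontab field allows."""
    allowed = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start, end = (int(x) for x in base.split("-"))
        else:
            start = end = int(base)
        allowed.update(range(start, end + 1, step))
    return allowed


def matches(expression, moment):
    """True if a five-field cron expression matches the given datetime."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five fields")
    # isoweekday() is 1..7 for Monday..Sunday; % 7 turns Sunday into 0.
    actual = [moment.minute, moment.hour, moment.day, moment.month,
              moment.isoweekday() % 7]
    for field, (name, low, high), value in zip(fields, FIELDS, actual):
        allowed = expand(field, low, high)
        if name == "weekday" and 7 in allowed:
            allowed.add(0)  # treat 7 as another way to write Sunday
        if value not in allowed:
            return False
    return True


print(matches("*/5 * * * *", datetime(2024, 3, 5, 14, 10)))
print(matches("0 0 1 1 *", datetime(2025, 1, 1, 0, 0)))
```

A scheduler built this way would call `matches` once a minute for every line in the crontab and run the commands whose schedule matches.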
How to access cron scheduling
You need access to the server or computer’s operating system itself. If this is a remote server used for a web hosting environment, this means you’ll need to access it using ssh or a remote server admin panel like Ajenti.
Some web hosting control panels (like cPanel) also provide access to a cron scheduler. Often, these control panel tools provide a GUI that simplifies the task of setting up the schedule (so you don’t have to remember what the numbers mean).
Not all web hosts provide this kind of access, though. This is essentially an administrative function, and some hosting companies restrict their customers from it.
If you will need access to some kind of automated cron scheduling, be sure to check if the hosting company provides it before selecting one.