URLs are one of the most commonly used technology concepts today. Essentially, they’re the addresses you’d use to access various resources — most of the time, you’re using one to access a particular website available on the internet. Because URLs are "handled" so frequently by users, it’s important to take care when choosing your domain, as well as the folder structure of your website, since the decisions you make are apparent to users and impact their experience in navigating through your site. URLs are also significant when it comes to search engine optimization.
What is a URL?
URL is an acronym for Uniform Resource Locator, and it is one of the core concepts of modern computing. By definition, a URL is a formatted text string referring to the location of a resource on a computer network (most commonly the web). Typically, these resources are web pages, but they can also be text documents, graphics, programs, or pretty much anything that can be stored digitally.
In addition to the "address" of the resource, a full URL also specifies the method (or protocol) by which the resource will be retrieved.
A basic URL consists of three parts, or substrings, separated by defining characters: the protocol, the host name or address, and the resource location, in that order.
Today, the URL has become a common part of our computer vocabulary, as ubiquitous as the internet itself. But that wasn't always the case.
ARPANET was introduced in the late 1960s. It was one of the first networks to implement TCP/IP, which it adopted in 1983, and it became the basis for the internet. With it, it was possible to move files and documents between computers through the network. Unfortunately, at the time, access to those documents could require any one of a number of different protocols. What was needed was a unifying principle that would allow files and documents to be easily linked, identified, and retrieved on demand.
ARPANET gained its first international connections in the 1970s, a step toward what became the internet. But the system remained much the same.
In the early 1990s, the web was built on top of the internet. This made it easier to find documents and link them to one another. There were three fundamental building blocks of the web: HTTP, HTML, and the URL. HTTP is an acronym for Hypertext Transfer Protocol, a method for sending and receiving documents. HTML is, of course, the Hypertext Markup Language, which describes text-based documents so that they can be rendered on a computer screen. And the URL is a consistent way to describe where a document is and how it should be delivered.
The Anatomy of a URL
The basic anatomy of a URL consists of several parts. Some of these parts are mandatory, while others are optional and can be included when something other than the default operation is needed. For example, HTTP uses port 80 by default, but it doesn't have to.
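The idea of a default port can be illustrated with a short Python sketch. The `DEFAULT_PORTS` table and the `effective_port` helper are hypothetical names introduced only for this example; the sketch assumes the standard-library `urllib.parse` module and covers just HTTP and HTTPS.

```python
from urllib.parse import urlsplit

# Well-known default ports for the two most common web schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url):
    """Return the explicit port if the URL names one, else the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    # Returns None for schemes that are not in the table.
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("http://whatever.com/"))       # 80
print(effective_port("https://whatever.com/"))      # 443
print(effective_port("http://whatever.com:8080/"))  # 8080
```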
The best way to examine the anatomy of a URL is through example, so for our purposes we will use the following mock URL and break it down into its various components:
http://www.whatever.com:80/whatever/whatever.html?this=that&that=this#fn2
- http:// — this is the scheme, or protocol substring, and it indicates which protocol must be used to fetch the desired file or document. While HTTP is the most common, it is by no means the only option. Other schemes include https: (the secured version of HTTP), mailto: (to open a mail client), ftp: (to handle a basic file transfer), and others. The colon (:) is the URI scheme separator, and the paired forward slashes (//) mark the start of the authority component, which contains the host name.
- www. — by convention, this label indicates a host that serves the world wide web, but technically it is just a subdomain like any other. This portion of a URL can name any subdomain. For instance, we might alter our example to http://support.whatever.com to access a support page on the target website.
- whatever.com — this is the domain name, and is used to indicate the targeted host or web server. The last part of our domain name, the .com, is the domain suffix (or top-level domain) and is used to identify the type or location of the website in question. Other domain suffixes include .org, .net, and region-specific suffixes such as .co.uk. There are more than a thousand top-level domains in existence, including generic ones (gTLDs) such as .com and country-code ones (ccTLDs) such as .uk.
- :80 — this is the port, and it indicates the "gate" used to access resources on the intended web server. This part of a URL is often omitted when the web server is using standard ports for the HTTP or HTTPS protocols. If a non-standard port is in use, this section must be included in the URL. Again, the colon (:) acts as a separator.
- whatever/whatever.html — this indicates the path to the resource on the server. Originally this section pointed to a physical location on a specific server, though now it more typically indicates an abstract location of the data being fetched. The forward slash again acts as a separator to maintain the integrity of the URL hierarchical syntax.
- ?this=that&that=this — this is the query string. It consists of a question mark followed by one or more parameters which a web server can use to return specific content, or a specific version of the requested content. URLs with query strings are commonly referred to as "dynamic URLs." The parameters used in dynamic URLs are not necessarily universal, and every web server has its own rules regarding their use.
- #fn2 — the last part of the URL is the optional fragment or "anchor." It is indicated by a hash (#) and is followed by some text. The browser uses it to scroll the webpage to a particular location.
Taken all together, these substrings form a full URL. It defines: the protocol necessary to retrieve a file or document; the server; the location of that content on that server; the gateway used to access that server; server-related information about the content; and client-related information about the content's display.
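As a rough illustration of this breakdown, the sketch below uses Python's standard-library `urllib.parse` module to split a URL into the same components. The URL string is assembled here from the pieces listed above, so treat it as an assumed reconstruction of the mock URL.

```python
from urllib.parse import urlsplit, parse_qs

# Mock URL assembled from the components discussed above.
url = "http://www.whatever.com:80/whatever/whatever.html?this=that&that=this#fn2"

parts = urlsplit(url)

print(parts.scheme)    # http
print(parts.hostname)  # www.whatever.com
print(parts.port)      # 80
print(parts.path)      # /whatever/whatever.html
print(parts.query)     # this=that&that=this
print(parts.fragment)  # fn2

# parse_qs turns the query string into a dict that maps each
# parameter name to a list of its values.
params = parse_qs(parts.query)
print(params)  # {'this': ['that'], 'that': ['this']}
```

Note that the parsed path includes its leading slash, and that the fragment is returned without the hash character.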
Designing an Optimal URL
Now that we know what a URL is, let's talk about why it's important to have a well-designed URL. We'll then dive into the things that contribute to the making of a "good" URL.
Why URL Design Matters
First, URLs are one of the few things that are consistently used by everyone who accesses the internet, regardless of the browser, operating system, or device being used. In some cases, your URL transcends the internet and is shared using analog methods, such as memos and other official documents. URLs are navigation aids (and more!) used by real people, not just machines, so their design is another method by which you can reach your audience.
Most of all, URLs are an unstated agreement between you and your users. Given a particular URL, a person should be able to use it now and at a later date to return to a specific resource (or subsection of that resource). As such, you should avoid changing the URLs of your pages if at all possible. If you must, set up redirects (but be aware that each redirect adds an extra request and therefore adds to your page load time). Because of this, designing your URLs carefully at the beginning means that you don't have to worry as much about needing to change them later.
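One way to limit the cost of redirects is to make sure every retired URL points straight at its final destination instead of passing through a chain of hops. The Python sketch below shows the idea under simple assumptions: the paths, the `REDIRECTS` table, and the `flatten` and `resolve` helpers are all hypothetical, and a real site would configure this in its web server or framework.

```python
from typing import Dict, Optional, Tuple

# Hypothetical redirect table that has grown a two-hop chain over time.
REDIRECTS = {
    "/old-catalog": "/catalog",
    "/catalog": "/books",
}

def flatten(redirects: Dict[str, str]) -> Dict[str, str]:
    """Point every old path directly at its final destination."""
    flat = {}
    for src, dst in redirects.items():
        seen = {src}
        # Follow the chain; the 'seen' set stops the walk if a loop exists.
        while dst in redirects and dst not in seen:
            seen.add(dst)
            dst = redirects[dst]
        flat[src] = dst
    return flat

FLAT = flatten(REDIRECTS)

def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (301, new_path) for a moved page, or (200, None) otherwise."""
    target = FLAT.get(path)
    if target is not None:
        return 301, target
    return 200, None

print(resolve("/old-catalog"))  # (301, '/books')
print(resolve("/books"))        # (200, None)
```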
General Guidelines for the Design of URLs
When thinking about how you want to structure your URLs, keep the following principles in mind:
Pay careful attention to top-level sections. The top-level section forms the base of all your URLs; twitter.com, for example, is a top-level section. The market for unique yet memorable names that are still on the shorter side is becoming limited, but the importance of this choice demands that you pay careful attention to its selection.
The availability of unique extensions (as an example, one of the most commonly used extensions is .com) opens up new possibilities, but use these carefully, as many people will automatically assume that your site uses a more familiar extension.
Keep your URLs clean. What this boils down to is, "Can your users type your URL with ease?" This is something for you to look into, especially if you have a content management system or blog engine that auto-generates your URLs. In most cases, less is certainly more. Bonus: if your URLs are easy to type, they're probably easy to remember, which is useful for ensuring repeat visits to your site.
Use a single domain, and avoid subdomains if at all possible. The primary reason for both is that search engines may treat content on a subdomain and content at example.com/sub as belonging to two different sites, even though you intended the pages in the sub section to be a subset of your main site. This can negatively affect your search engine rankings.
Use keywords in your URLs. By using keywords in your URL, you convey crucial information to both your visitors and search engines. For example, mystore.com/books/nonfiction tells your viewer exactly what the page is: a list of nonfiction books for sale. Conversely, mystore.com/category038/234823 doesn't tell your visitor anything of use. Because your URLs are shown in search engine results, users are more likely to click your link if your URL conveys to the user that your site is legitimate and of interest.
However, be wary of using too many keywords. At one point, you could improve your search engine rankings by cramming your URLs with as many keywords as possible (e.g., mystore.com/books/non-fiction/non-fiction-books/realistic-fiction), but this is no longer the case. Doing this also has the downside of making your URL look spammy.
Match URLs to titles. Not only does this help you in following the suggestion directly above, but this also has the benefit of providing an additional signpost, so to speak, to your viewer. By looking at your URL, they develop an expectation for your page, and when they visit and see the title, you deliver something that meets their expectations.
If you have a lengthy title for your article, you don't have to craft a similarly lengthy URL; just create one that contains the key terms. For example, if your title is "How to Wear Your Scarf in Forty Different Styles," your URL needs only the most important of those words.
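One possible way to derive a short URL segment from a long title is sketched below in Python. The `slugify` helper, the `STOP_WORDS` list, and the five-word limit are all assumptions made for this example; many content management systems ship their own version of this logic.

```python
import re

# Hypothetical list of filler words to drop; adjust to suit your content.
STOP_WORDS = {"a", "an", "and", "how", "in", "of", "the", "to", "your"}

def slugify(title: str, max_words: int = 5) -> str:
    """Build a lowercase, hyphen-separated slug from a title's key terms."""
    # Keep only runs of ASCII letters and digits, in lowercase.
    words = re.findall(r"[a-z0-9]+", title.lower())
    key_terms = [word for word in words if word not in STOP_WORDS]
    return "-".join(key_terms[:max_words])

print(slugify("How to Wear Your Scarf in Forty Different Styles"))
# wear-scarf-forty-different-styles
```

Lowering `max_words` gives an even shorter result, at the cost of dropping terms a reader might search for.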
Keep your folder structure simple. Generally, people assume that each additional / indicates an additional layer of depth, so a URL like example.com/sports/teams/volleyball leads to the Volleyball folder, which is the third nested layer under the primary domain. However, nesting pages too deeply means that they're less likely to get views.
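To audit how deeply your pages are nested, you could count the path segments of each URL. The Python sketch below is a minimal illustration; `path_depth` is a hypothetical helper, and it assumes each URL includes its scheme so that the host is not mistaken for part of the path.

```python
from urllib.parse import urlsplit

def path_depth(url: str) -> int:
    """Count the non-empty path segments in a URL that includes its scheme."""
    path = urlsplit(url).path
    return len([segment for segment in path.split("/") if segment])

print(path_depth("http://example.com/sports/teams/volleyball"))  # 3
print(path_depth("http://example.com/"))                         # 0
```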
Tips for Designing a Great URL
Now that we've discussed the best practices for designing your URL, here are some practical tips to help you implement them.
Use hyphens over underscores. Search engines such as Google treat hyphens as word separators and recommend them over underscores. This, combined with the fact that hyphens are easier to use, makes hyphens the better choice for URLs.
Use short, easy-to-remember words. This ties directly into having shorter URLs, which affects user experience. Shorter URLs are also easier to copy and paste, to share, and to embed into other websites.
Make your URLs case insensitive. People will most likely use all lowercase letters when typing out your URL, but you certainly don't want to lose users who type /HOME when your URL only works with the lowercase form.
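A common way to achieve case insensitivity is to redirect any path containing uppercase letters to its lowercase form. The Python sketch below shows only the decision logic; `lowercase_redirect` is a hypothetical helper, and wiring it into a server or framework is left out.

```python
from typing import Optional

def lowercase_redirect(path: str) -> Optional[str]:
    """Return the lowercase path to redirect to, or None if no redirect is needed."""
    lowered = path.lower()
    return lowered if lowered != path else None

print(lowercase_redirect("/HOME"))  # /home
print(lowercase_redirect("/home"))  # None
```

Redirecting to one canonical form, instead of serving the same page at several spellings, also keeps a single URL for each resource.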
Avoid non-ASCII characters. Using only ASCII characters improves user experience, since they're easier to type. In addition, using non-ASCII characters means that your URL is less likely to convey information to the user on what they can expect to see on your page.
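To see why non-ASCII characters make URLs harder to read and type, consider what happens when they are percent-encoded for transmission. The Python sketch below uses the standard-library `quote` and `unquote` functions on a made-up path.

```python
from urllib.parse import quote, unquote

# Made-up path containing two non-ASCII characters.
path = "/café/menü"

# quote() percent-encodes each character as its UTF-8 bytes
# and leaves the slashes untouched by default.
encoded = quote(path)
print(encoded)           # /caf%C3%A9/men%C3%BC

# unquote() reverses the encoding.
print(unquote(encoded))  # /café/menü
```

The encoded form is what often appears when such a URL is copied and pasted, and it tells the reader very little about the page.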
Avoid file extensions. File extensions tend not to be forward-compatible, so if standards change, you'll need to rework all of your URLs so that existing links don't break.
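If your existing URLs already carry file extensions, a first step toward extension-free URLs is computing the cleaned path for each page. The Python sketch below is one way to do that; `strip_extension` is a hypothetical helper, and you would still need redirects from the old URLs to the new ones.

```python
import posixpath

def strip_extension(path: str) -> str:
    """Drop a trailing file extension from the last segment of a URL path."""
    # splitext only inspects the final segment, so dots in earlier
    # folder names are left alone.
    root, _extension = posixpath.splitext(path)
    return root

print(strip_extension("/whatever/whatever.html"))  # /whatever/whatever
print(strip_extension("/archive.v2/page"))         # /archive.v2/page
```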
While we may have come to take the URL for granted, it is an integral part of the modern computing landscape. To learn more about URLs, their development and how to create and use them effectively, we suggest the following materials:
- The History of the URL: this article by Zack Bloom provides insight into the development of the URL, and how it led to the internet as we think of it today.
- Working with URLs: from Oracle comes this extensive tutorial on URLs for Java programmers. Topics include basic URL definitions, creating effective URLs that meet basic web standards, parsing a URL, and reading from and writing to the URLConnection class.
- What is a URL: this beginner's guide to URLs is hosted by the Mozilla Developer Network. In addition to providing a basic overview of how to create and use URLs effectively, this article delves into the value of semantic URLs, and the difference between absolute and relative URLs.
- Understanding URLs: written to appeal to those with minimal computing experience, this basic guide outlines the components of a URL, and how they can be read to identify the file path and retrievable content.
URLs, one of the most commonly used concepts in computing today, are text strings designed to help locate the resources in which you're interested. While you're most likely looking for a specific website on the internet, URLs can be used to locate any resource on a given computer network. Because they are so commonly handled, URL design is one way that you can make an impact on user experience. By taking the time to make your URLs as informative and easy to use as possible, you'll encourage more page views for your resources.
Further Reading and Resources
We have more guides, tutorials, and infographics related to coding and website development:
- Composing Good HTML: this is a solid introduction to writing well-formed HTML and using HTML validator software.
- CSS3 — Intro, Guides & Resources: this is a great place to start learning webpage layout.
- ASP.NET Resources: this guide will get you going with Microsoft's .NET framework for creating webpages.
HTML for Beginners — Ultimate Guide
If you really want to learn HTML, we've created a book-length article, HTML for Beginners — Ultimate Guide. And it really is the ultimate guide; it will take you from the very beginning to mastery.