Class 10 – Fancy URLs: Customizing Your Site’s URLs Using Mod_Rewrite

December 4th, 2009

Now that you know all the basic techniques of web development, it’s time to start thinking about aesthetics. One of the most obvious aesthetic choices you can make on your site is what domain name you choose, and what you name the folders and files on that site. Domain names are something I can’t help you with, but the rest of the URL after the domain name, including the folder and file names, is something I can help you beautify.

This is an advanced topic, but one that can provide polish to your sites if you are comfortable with all we have covered so far.

The problem: ugly URLs

As you know, depending on what we call our files and how we use the query string to pass data from one page to another, we sometimes end up with URLs that look like this:

http://onepotcooking.com/index.php?post=19&view=rss

But you might rather have URLs that look like this:

http://onepotcooking.com/rss/post/19/

And actually, search engines sometimes prefer more descriptive URLs, so they can more easily determine what a page is about:

http://onepotcooking.com/rss_feed/why_urls_should_be_pretty.html

But you don’t want to change the structure of your folders and file names, and change the entire way you use the $_GET, $_POST, and $_REQUEST variables in PHP just to make the URLs pretty. When you’re coding the site, you’re usually thinking about functionality and getting the job done, not aesthetics.

The solution to URL woes: mod_rewrite

Apache, the most popular software used by web servers to handle the requests and responses for web pages (and the software used by our class server and most other UNIX web servers) comes with a module called mod_rewrite that is used for creating custom URLs.

mod_rewrite lets you publish fancy URLs like:

http://onepotcooking.com/isnt_this_a_prety_url.html

But have them actually get converted internally into ugly URLs like this, without the user ever seeing it:

http://onepotcooking.com/process_something.php?id=1884&to_do=something&this_is=ugly

You will be able to use the fancy URLs for any links to your pages, but your folders, filenames, and PHP code will not have to change, so long as you use mod_rewrite correctly.
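
To make this concrete, here is a minimal sketch of a rule that would turn the pretty URL from earlier, /rss/post/19/, into the ugly one. It assumes the rule lives in an .htaccess file in the root folder of the site, that your host has mod_rewrite enabled, and that the server allows .htaccess files to contain rewrite rules. The exact pattern is just one way to write it.

```apache
# Hypothetical /.htaccess for the onepotcooking.com example above.

# Switch on the rewriting engine for this folder.
RewriteEngine On

# Pattern: "rss/post/" followed by one or more digits, with an optional
# trailing slash. The digits in parentheses are captured and reused as $1.
# [L] tells Apache to stop processing further rules in this pass once
# this one matches.
RewriteRule ^rss/post/([0-9]+)/?$ index.php?post=$1&view=rss [L]
```

With a rule like this in place, a request for /rss/post/19/ would be handled by index.php, which would find 19 in $_GET['post'] and rss in $_GET['view'], just as if the ugly URL had been requested directly.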

Rewriting vs. Redirecting

This process of having fancy URLs that get internally converted by the server into ugly URLs is known as URL rewriting. With a rewrite, since it only happens internally in the server, the user only ever sees the fancy URL. They will never see the ugly URL in the browser address bar.

However, the term redirect generally refers to the technique where the client, meaning the web browser, handles the change of URL: the server responds by telling the browser to go to a different URL, and the browser then makes a second request for that URL. In the case of a client-side redirect, the user can see the final destination URL in the browser’s address bar after the redirect occurs. So if you redirect to an ugly URL, they will ultimately see that ugly URL clearly in the address bar of the browser.
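
In mod_rewrite, the difference between the two often comes down to a single flag on the rule. The sketch below shows one rule of each kind. The URL patterns, the index.php script, and the post parameter are made up for illustration.

```apache
# Hypothetical /.htaccess showing a rewrite next to a redirect.
RewriteEngine On

# Internal rewrite: Apache quietly serves index.php?post=19 for
# /recipes/19/. The address bar keeps showing /recipes/19/.
RewriteRule ^recipes/([0-9]+)/?$ index.php?post=$1 [L]

# External redirect: the R flag makes Apache answer with a 302 status
# code and the new location. The browser then requests that URL itself,
# so the address bar changes to /index.php?post=19.
RewriteRule ^old-recipes/([0-9]+)/?$ /index.php?post=$1 [R=302,L]
```

The redirect target starts with a slash so that it is treated as a path from the root of the site rather than a path relative to the current folder.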

Another look at the client/server request/response relationship

To understand how mod_rewrite works, it’s important to understand where it fits into the whole request/response relationship. Here’s a very broad overview of just the relevant steps of what happens when a client requests a file from a server:

  • a user tries to load a web page in the browser (whether by going directly to a URL, clicking a link, submitting a form, or making an AJAX request)
  • the browser sends an HTTP request (typically GET or POST) for the file to the server.
  • the server receives the request, and launches Apache’s request handler
  • Apache tries to figure out how to respond to the request
  • Apache first checks mod_rewrite settings to see if it should do any fancy processing of the URL of the file that the user is requesting
  • Then, if Apache determines that the requested file is a PHP script, it launches the PHP engine and sends any data that was passed along with the request to the PHP script that the browser requested
  • The PHP script runs and sends its output back to Apache
  • Apache sends a response to the web browser. The response contains an HTTP status code indicating whether the request was processed properly or not, as well as any content that was output by the requested file, regardless of whether it’s a PHP script, HTML file, CSS file, JavaScript file, or any other type of file.
  • The browser receives the response from Apache, and figures out how to display whatever content it received back from the server to the user.

As you can see, the mod_rewrite technique we will be discussing that allows sites to use fancy URLs will occur after the server has received the request from the browser, but before it has passed that request on to the PHP processor. It will be written in a language that Apache can understand, not in PHP, since when it is processed, the PHP engine hasn’t even been launched yet.

Apache configuration files: httpd.conf and .htaccess

When a user requests a URL like this:

http://onepotcooking.com/spring2009/test.php

the Apache server checks two sets of configuration files to see whether it should do something fancy with that URL.

First, Apache checks its main configuration file, called httpd.conf, which is usually buried somewhere obscure in the deep recesses of the server filesystem. httpd.conf has global settings that apply to your entire site. If you have a shared hosting plan for your site, as most of you will, you do not have access to this file.

After it has checked httpd.conf for any relevant settings, Apache then checks the directory-specific configuration files called .htaccess, which have settings that apply only to specific folders.

With the example URL above, Apache would check for the existence of both of these .htaccess files:

/.htaccess
/spring2009/.htaccess

Since the requested file is nested inside the spring2009/ folder, which is inside of the root / folder, either of those settings files could have an effect on how the request for the file is handled by the server.

We will be focusing on settings in the .htaccess files since these are the ones you will always have access to, regardless of your hosting setup. However, the same URL rewriting techniques will be applicable to settings in the httpd.conf file, with slight modifications.
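
One practical consequence of the folder-specific nature of .htaccess files is that the pattern in a rule is matched against the part of the URL below the folder that contains the file. The sketch below shows the same rewrite written two ways, depending on which of the two .htaccess files it is placed in. The fancy URL hello.html is made up for illustration, and you would use one option or the other, not both.

```apache
# Option A: the rule is placed in /.htaccess
# Here the pattern is matched against "spring2009/hello.html".
RewriteEngine On
RewriteRule ^spring2009/hello\.html$ spring2009/test.php [L]

# Option B: the rule is placed in /spring2009/.htaccess
# Here the folder name has already been stripped off, so the pattern
# is matched against just "hello.html".
RewriteEngine On
RewriteRule ^hello\.html$ test.php [L]

# For comparison: in httpd.conf (server or virtual host context) the
# pattern is matched against the full path including the leading slash,
# so it would begin with ^/spring2009/ instead.
```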

How to use .htaccess files to rewrite URLs

Rather than write an entire tutorial on how to rewrite URLs (which I initially started to do), I will point you to an excellent tutorial already written which covers all the basic types of rewriting you are likely to do:

http://corz.org/serv/tricks/htaccess2.php

Note: Although I don’t think it’s clearly described on that site, all of the example code written there is meant to go into a file called “.htaccess” located in the root folder of your project. So if your project is at http://onepotcooking.com/johnhancock/final_project/, you should create an .htaccess file located at /johnhancock/final_project/.htaccess, so you can create fancy URLs like http://onepotcooking.com/johnhancock/final_project/this-is-a-fancy-url.html.

In other words, fancy URLs only work at the level at which you put an .htaccess file. If you want a fancy URL like http://onepotcooking.com/this-is-a-fancy-url.html, you need to put an .htaccess file in the root folder of the server, /.htaccess.
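
As a rough sketch, an .htaccess file for the johnhancock project above might look something like this. The index.php script and the page parameter are hypothetical. The RewriteBase line tells Apache which URL prefix the folder corresponds to; many servers work without it, but it can help when rewritten URLs come out pointing at the wrong place.

```apache
# Hypothetical /johnhancock/final_project/.htaccess
RewriteEngine On

# The URL path that corresponds to the folder this file lives in.
RewriteBase /johnhancock/final_project/

# /johnhancock/final_project/this-is-a-fancy-url.html
#   -> /johnhancock/final_project/index.php?page=fancy
RewriteRule ^this-is-a-fancy-url\.html$ index.php?page=fancy [L]
```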

I highly recommend you read that otherwise well-written document linked above if you wish to use fancy URLs on your own sites.

An example page

I have created a single example PHP script which can be accessed by a number of fancy URLs by taking advantage of rewriting rules found in a .htaccess file in the same folder. The PHP script just outputs whatever data was passed to it in the query string along with the GET request.

In other words, there is an .htaccess file which is allowing a variety of fancy URLs to all internally point to the same PHP script. Each URL is meant to exhibit a slightly different aspect of URL rewriting that may be useful to you. Several of them focus on passing data through the query string even though there is no query string in the fancy URL.
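
To give a flavor of what such rules can look like, here is a sketch of one way to pass two pieces of data to a script through a URL that has no query string of its own. This is not the actual .htaccess file from the example. The category and item parameter names and the URL shape are assumptions made up for illustration.

```apache
# Hypothetical .htaccess in the same folder as index.php
RewriteEngine On

# Only rewrite when the request does not match a real file or folder,
# so index.php itself, images, and stylesheets stay reachable.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# fruit/apple/ -> index.php?category=fruit&item=apple
# QSA appends any query string from the original request, so
# fruit/apple/?sort=asc would also pass sort=asc to the script.
RewriteRule ^([a-z]+)/([a-z0-9_-]+)/?$ index.php?category=$1&item=$2 [QSA,L]
```

Inside index.php, the values would then show up in $_GET['category'] and $_GET['item'], even though the URL the user sees contains no question mark at all.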

You will definitely want to read that tutorial linked above before going in to read the code in this example.

The direct URL to the example script is http://onepotcooking.com/amosbloomberg/spring2009/class12/mod_rewrite/index.php

The fancy URLs that internally rewrite to that same script are:

And the following URL uses mod_rewrite to do a client-side redirect (not a rewrite):

Reminder: all the rules that allow these URLs to point to and pass data to the same index.php script are found in the .htaccess file in the same folder as the PHP script.
