URL Design
What is URL Design?
When designing a website, the first thing you should do is think about the URLs. How will the URL paths be laid out? What extension will the pages have: .html, .php, .do, .pl, or none at all?
It's easily overlooked, and quite often the URL structure simply evolves. But if you don't plan your URL structure at the beginning, you'll probably end up changing it later on. Changing URLs after a website goes live always causes problems: sites that link to yours will contain broken links, anyone who has bookmarked pages on your site will find the bookmarks are invalid, and search engines will still list pages that no longer exist, which might have a negative effect on your search engine rankings. Of course, you could use mod_rewrite to redirect old URLs to new URLs, but managing lots of redirects can be complex and messy, and there will come a point when old redirects need to be deleted, which once again may cause broken links.
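To illustrate why piles of redirects get messy, here is a minimal Python sketch of a redirect table that has grown over two restructurings. The paths and the resolve function are hypothetical and only model the idea; a real site would express this in its web server configuration.

```python
# Hypothetical redirect table: old path -> new path.
REDIRECTS = {
    "/articles.php": "/kb",      # first restructuring
    "/kb": "/articles",          # second restructuring
    "/about.html": "/about",
}


def resolve(path, redirects=REDIRECTS):
    """Follow redirect entries until a path with no entry is reached.

    Returns (status, final_path): 301 when the path has moved, 200 otherwise.
    Raises ValueError if the table contains a loop.
    """
    seen = {path}
    current = path
    while current in redirects:
        current = redirects[current]
        if current in seen:
            raise ValueError(f"redirect loop at {current!r}")
        seen.add(current)
    return (301 if current != path else 200), current


print(resolve("/articles.php"))  # (301, '/articles')
print(resolve("/articles"))      # (200, '/articles')
```

Note that /articles.php only reaches /articles through the intermediate /kb entry. Deleting that "old" entry would silently break the older URL as well, which is the kind of hidden dependency that makes redirects hard to retire.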
Designing URLs
So, what extension should your pages have? Usually you want to avoid tying your website to a particular technology. A URL that ends in .php will make things harder if you later want to migrate your website to ASP. Using .html is one of the safer choices because, no matter what technology your website uses, the pages will usually be generating HTML. But the best option is to have no extension at all. This also hides the technology that your website uses, which is a small security benefit.
Also think about the domain name and subdomains. You don't have to use a www subdomain, but people are familiar with websites beginning with www, so it's a good idea to follow the tradition, at least for your homepage.
Next, think about the URL path structure. What structure will logically divide your site and allow for future expansion?
Newer web technologies like Ruby on Rails do a lot of the hard work of URL design for you. But you still need to plan your URL structure before you start!
Dynamic Pages
Dynamically generated pages may take parameters, like an article or knowledge base ID. In this situation you could use Apache's mod_rewrite engine to convert page names into URL parameters. This hides the URL parameters from the user, who only sees a clean URL. For instance, mod_rewrite can turn the last path element into the value of a URL parameter named after the preceding path element.
For example, the user will see:
http://www.example.com/articles/mod_rewrite
But on the server, the request gets transformed into:
http://www.example.com/script.pl?article=mod_rewrite
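The transformation above can be sketched in Python. This is not mod_rewrite's own rule syntax, only a stand-in for the mapping it would perform; the pattern, the rewrite function, and the set of characters allowed in an article name are assumptions made for illustration.

```python
import re
from urllib.parse import urlencode

# Assumed rule: /articles/<name> is served internally by /script.pl?article=<name>.
ARTICLE_PATH = re.compile(r"^/articles/([A-Za-z0-9_-]+)/?$")


def rewrite(path):
    """Return the internal path for a clean URL path, or None if no rule matches."""
    match = ARTICLE_PATH.match(path)
    if match is None:
        return None
    # urlencode escapes the value so it is safe to place in a query string.
    return "/script.pl?" + urlencode({"article": match.group(1)})


print(rewrite("/articles/mod_rewrite"))  # /script.pl?article=mod_rewrite
print(rewrite("/about"))                 # None
```

Restricting the captured name to letters, digits, hyphens, and underscores keeps unexpected characters out of the script's parameter, at the cost of rejecting article names that fall outside that set.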
Key Points
Here are a few points to remember when designing URLs:
- Use the www subdomain, at least for your homepage. For example, use www.example.com rather than example.com.
- Domain names are case-insensitive and conventionally written in lower case, but paths can be case-sensitive on many servers, so be consistent and make the whole of your URLs lower case.
- Make URLs readable. Use relevant words rather than abstract codes or numbers. A meaningful URL is more descriptive and easier to remember.
- Make your URLs shorter rather than longer (but not at the expense of readability). A short URL is easier to remember and easier to type.
- Avoid special characters other than hyphens or underscores in your URLs. Spaces and some other special characters get percent-encoded as hex values. For example, a space becomes %20, which looks messy.
- Paths that contain /cgi-bin/ never look good. Scripts don't need to be in the cgi-bin directory: you can create another ScriptAlias directory to put your scripts in, or use mod_rewrite rules to hide it.
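Several of these points (lower case, readable words, no special characters) can be applied automatically when a path element is generated from a page title. The following is a minimal Python sketch; the slugify helper is hypothetical, and stripping accents and joining words with hyphens are illustrative choices rather than requirements.

```python
import re
import unicodedata
from urllib.parse import quote


def slugify(title):
    """Lower-case a title and join its words with hyphens."""
    # Decompose accented letters and drop the accents, so "Café" becomes "Cafe".
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of characters other than letters, digits, and
    # underscores with a single hyphen, then trim hyphens from the ends.
    return re.sub(r"[^a-z0-9_]+", "-", ascii_text.lower()).strip("-")


print(quote("Key Points"))                     # Key%20Points
print(slugify("Designing URLs: Key Points!"))  # designing-urls-key-points
print(slugify("mod_rewrite"))                  # mod_rewrite
```

The first line of output shows the percent-encoding that a raw title with a space would receive; the slug avoids it by never letting a space or punctuation mark reach the URL.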