Class 12 – Exporting & Importing Database Tables

Posted: May 14th, 2009 | Author: | Filed under: mysql | Tags: , , | No Comments »

Now that the course is over, you are free to keep using the class database at your own risk for as long as you like.  However, I will be making changes on the server in preparation for the next course, and these will most likely affect your database tables.  So I recommend you make a backup of your database tables, as explained here.  If you eventually set up your own hosting account with your own domain, you will be able to import these tables into the MySQL server, as detailed below.

Exporting your tables

To export a copy of your tables from the database, follow these steps:

Step 1. Select the database that holds the tables you want to export.

Do this by selecting “classdb” in the top left of the page.  This will take you to the “Database View” that will show you a list of all the tables in that database.

select database

select database

Step 2. Select the “Export” tab at the top of the “Database View” page

insert tab

export tab

Step 3. Select the tables you want to export

Do this by holding down the Control key and clicking the name of each database you want to select in the list. (On Mac, hold down the Command key instead of the Control Key).

select tables to export

select tables to export

The default settings should be fine for the rest of the controls on this part of the page.  The important points are to make sure both the Structure and the Data of your tables are getting exported.  So the checkboxes next to the words “Structure” and “Data” should be checked.

make sure Structure and Data are both selected

make sure Structure and Data are both selected

Step 4. Click the “Save as file” checkbox

At the bottom of the “Export” page, there is a checkbox next to the words, “Save as file”.  Make sure this is checked.

In this same area, you can optionally select to have the database export file compressed into a zip or gzip file.  This will make the file smaller, which is perfectly okay, but not necessary.  It’s your choice whether to compress the file or not.

save settings

save as file settings

Step 5. Click “Go” to export

At the bottom right of the “Export” page is the submit button with the word, “Go”.  Click it to complete the export.

go!

go!

Step 6. Save file

After a few seconds, you may see a dialog box asking you whether you want to “Open” or “Save” the file.  Make sure you select “Save”, and as always, make sure you remember which folder the file is getting saved into on your hard drive.

save file

save file

Congratulations!  You have successfully exported your tables.

Importing your tables

Assuming you have exported your tables from the classdb database, you may one day want to import them your own database on your own server.  Doing this is even simpler than exporting was.

Before you begin, you must have a database set up.

It’s important to remember that we only exported the tables, not the actual database.  So before you can import your tables, you will have to first create a database in which to import them.  You can call this database any name you want.  It does not have to be named, “classdb”, like the original database.

Most web hosts will have a control panel that allows you to create databases on your server.  Usually they disable this feature from phpMyAdmin.  As an example, here’s a screenshot of what the “Create a database” control panel looks like with Dreamhost, the host for the server we used in class.

create database in Dreamhost control panel

create database in Dreamhost control panel

Assuming you have created a database on your server, or you are using an existing database, here are the steps necessary to import your tables.

Step 1. Select the database that you want to import the tables into

Do this in phpMyAdmin by selecting your database name in the top left of the page.  This will take you to the “Database View” that will show you a list of all the tables in that database, if any exist.

select database

select database

Step 2. Select the “Import” tab at the top of the “Database View” page

import tab

import tab

Step 3. Select the file with the tables you want to import

The “Import” page will ask you for the “File to import”.  Click the “Browse” button to browse your hard drive for the file you exported.  Once that file is selected, you can leave all the other settings at their defaults.

file to import

file to import

Step 4. Click “Go” to import

Click the “Go” button to complete the import.

go!

go!

Congratulations!  You have imported tables into your database.

As a confirmation, you should now see the new table names show up on the left side of the page, where all tables are listed that are contained within the selected database.

list of tables

updated list of tables


Class 11 – Introduction to Search Engine Optimization

Posted: May 6th, 2009 | Author: | Filed under: search | Tags: , , | No Comments »

The techniques website developers and marketers use to promote their web sites are many and varied.  Promotions on the web are not so different from promotions in any other medium – you need to use any and all channels available to you for getting the word out.  What used to be known as guerrilla marketing is now the norm online.

If a tree falls in the woods…

If your site doesn’t show up in the first page of Google results, does it really exist?  In some cases, getting your site listed near the top of a search for a particular word, or phrase, is imperative to the success of your web site and/or your business. Hence the interest marketers have in Search Engine Optimization (SEO).

The search engines have a monopoly.  Many users will not bother to look at sites that are not listed on the first page of search results for a particular term.  Many will not even bother with sites that are not in the top 3 results.

An excellent introduction

This site has an excellent introduction to the concept of Search Engine Optimization.  I will highlight what I consider to be the key aspects of the information in that tutorial.

SEO is “politics by other means”

How you place in the search results depends in a large part upon how the search engines work.  Each has a set of secret algorithms that ultimately determine how far up your site falls in the search results for any search term.  However, each search engine also regularly modifies these algorithms.  So just because you are high up in the search results today doesn’t mean that you will be there tomorrow.  Large, well-funded sites will try to detect each change in the search engines’ algorithms, and will modify their own sites accordingly.

“Politics by other means” was how General von Clausewitz described war.  You should generally consider SEO to be akin to war, and should think strategically.  Given the huge number of websites on just about any topic, all vying for the attention of a finite group of potential viewers, how will your site get noticed?  Everyone in the game is battling to show up at the top of a search result for the relevant keywords, so your chances of winning any particular battle are slim.

You need to consider SEO a sustained campaign of attrition.  Unless your site is very niche-oriented, and involves very obscure keywords, a one-time shock-and-awe marketing strategy may work for you at first, but you will slowly slide down in the search results as the search algorithms evolve, and as the other players in the game indefatiguably try to climb up to the top, pushing you down along the way.

It’s all about semantics

At a high level, the key to SEO is to make what your site is about clear to the search engines.  If your site is about cars, but you don’t use the word “car” in any headings or titles of pages, you will not be making a search engine’s job easy.

The search engines should be able to discover the main themes of your site automatically by crawling through the code of your site, seeing what other sites link to your site, seeing where your site links, and detecting the main words you use for things like the titles of pages, headings, and the text used in links.

So here are some very general but easy-to-implement tips:

  • inbound links: make partnerships, or friendships, with other sites and get them to link to your site.  You can even buy them.  The more thematically related the linking site is to your site, the better.  And ask them nicely to make the copy in the link text meaningful in some way to the content of your site.
  • outbound links: don’t be afraid to link to other related sites.  You want to show the search engines that you are part of the community of sites related to a specific topic.
  • picking keywords: if your site is about animals, you will need to come up with alternative keywords to use.  There are so many sites about animals that you will never make it to the top of the search results by optimizing for the word, “animals”.  Find variations or more specific keywords to use instead.
  • keyword density: if your site is about porpoise feeding habits, be sure to use the phrase “porpose feeding habits” in as many places in your content as possible.
  • meaningful page titles: If your site is about mold colonies, put the words “mold colonies”, or related words, in the <title> tag of every page
  • meaningful page headings:  Make sure to use the word “cultural perspectives on aging”, or related keywords and phrases in the <h1> – <h6> tags on your pages, if your site is about the cultural perspectives of the aging process.
  • meaningful link copy:  If your page about the health benefits of flax seed oil links to a page about bio-diesel car engines, put the words “flax seed oil will make your bio-diesel engine run quicker” somewhere in the link copy.  Of course, I’m being facetious, but you need to find creative ways to throw in the major keywords anywhere possible, even in the text you use for links.
  • semantic tags: use XHTML tags for what they were meant to be used for – don’t try to game the system (for now).  Use <h1> – <h6> tags for things that are truly headings of the content of your pages.  Use <p> tags for paragraphs, <th> tags for table headings, surround important words with the <strong> tag, use <label> tags for labels, etc.
  • don’t bury the content: use as few XHTML tags as possible to get the job done.  If you wrap <h1> tags within <divs> within <divs> within <divs> within <divs>, the search engine spider may give up trying to get to the real content of your page as it drills down through all the levels of your code.  Of course, efficient use of XHTML and CSS code comes with practice.
  • use meaningful URLs: if you feel comfortable with mod_rewrite and .htaccess files, convert your URLs to be semantically meaningful. For example, a page about artichoke recipes that has a URL like http://onepotcooking.com/recipes/artichokes is much more search engine friendly than http://onepotcooking.com/spring2009/class12/assigment6/recipes.php?cat=12
  • use <meta> tags in the <head> section of your document to explicitly include a description and keywords of your site.  Most search engines will actually ignore these when indexing your site, but it doesn’t hurt.

As you can see, there are some very practical things you can do to make your site more likely to be noticed by search engines.  How much you sacrifice in terms of design and creativity in order to appease the search engine gods is up to you and your specific needs.

More information

There are dozens of books available about this topic, and any of them will go into more detail about exactly what the differences are between the different search engines.  But each of them will most likely be focused at a high level on these fundamental concepts.

Furthermore, a simple search with the keywords, “search engine optimization” will bring up thousands of pages, blogs, message boards, and sites devoted to the topic.  Feel free to pick one from the top of the list.


Class 11 – Search in MySQL

Posted: May 4th, 2009 | Author: | Filed under: mysql, php, search | Tags: | No Comments »

As we discussed in class, there are two ways of implementing search functionality on your sites using built-in MySQL commands.

The first technique takes advantage of the WHERE LIKE clause in SQL.  The second technique takes advantage of the FULLTEXT search feature built into MySQL.

Both these methods, of course, presume that the content you want to be searching is stored in a MySQL database.  If the content you want to search is hard-coded in an XHTML page, rather than stored in a database, then these methods will not be useful to you, and you would do better to use Google Custom Search, Yahoo Search, or some other service that provides search functionality for regular XHTML documents.

Search using WHERE LIKE

This is the simplest form of MySQL search. You can see this example live in your browser here.

Let’s assume that we have a table, called “abloomberg_blogposts” that stores a bunch of blog posts.  The table has the fields, “id”, “title”, “message”, and “created”.

As you know, the SQL query to read all the rows from the table would be:

SELECT * FROM abloomberg_blogposts WHERE 1

The “WHERE 1″ part of the query tells MySQL to return all the rows in the table.

To read only those rows that had the title “cat”, we could run this query:

SELECT * FROM abloomberg_blogposts WHERE title LIKE '%cat%'

In this case, the “WHERE title LIKE ‘%cat%’” part of the query tells MySQL to return only those rows in the table where the “title” field contains the string, “cat”.  The “%” percent signs indicate wild card that matches any character or bunch of characters.

Because of the use of the wild cards before and after the search term, this query will match any of the following strings:

cat
catamaran
bobcat
concatenate

If we only wanted to match those rows where the “title” field began with the word, “cat”, we could use the following query with only a wild card after the search term:

SELECT * FROM abloomberg_blogposts WHERE title LIKE 'cat%'

Likewise, if we wanted to only match those rows where the “title” field ended with the word, “cat”, we could use this query:

SELECT * FROM abloomberg_blogposts WHERE title LIKE '%cat'

Finally, if we wanted to search more than one field, we can combine multiple queries together using the UNION keyword in SQL.  For example, if we wanted to search both the “title” and the “message” fields at once, we could use the following query:

SELECT * FROM abloomberg_blogposts WHERE title LIKE '%cat%' UNION SELECT * FROM abloomberg_blogposts WHERE message LIKE '%cat%'

We could chain together as many queries as we like using the UNION keyword, and MySQL will return the combined results of all of them.

Assuming you are running queries like this in PHP, rather than directly in SQL, you will probably want to store the search term in a variable rather than hard-code it in the SQL command.  Replacing the search term with a PHP variable called $searchTerm will make your queries look something like this:

$myQuery = "SELECT * FROM abloomberg_blogposts WHERE title LIKE '%{$searchTerm}%' UNION SELECT * FROM abloomberg_blogposts WHERE message LIKE '%{$searchTerm}%'";

Search using FULLTEXT search

The alternative method of searching in MySQL is to use the FULLTEXT search feature built into MySQL. You can see this example live in the browser here.

This method allows you to search more than one field in a single query.  It also allows you to be a bit more flexible with your search, and it is smart enough to ignore common words like “a”, “the”, “of”, etc.

Here is an example of using MATCH AGAINST to search for the word “cat” in both the “title” and “message” fields of our “abloomberg_blogposts” table:

SELECT * FROM abloomberg_blogposts WHERE MATCH (title, message) AGAINST ('+"cat"' IN BOOLEAN MODE)

As you can see, the fields you want to search go inside the MATCH(…) section, and the search term you want to find goes inside of the AGAINST(…) section of the query.

The plus sign, “+”, in the query indicates that you want to search for all rows that contain the word “cat”.  If you replaced this with a minus sign, “-”, the query would return all rows that did not contain that word.

Assuming you are running this query in PHP, rather than directly in SQL, you will most likely want to replace the search term with a variable.  And your PHP code would look something like this:

$myQuery = "SELECT * FROM abloomberg_blogposts WHERE MATCH (title, message) AGAINST ('+\"{$searchTerm}\"' IN BOOLEAN MODE)";

Important Note: In order to use this FULLTEXT search feature, MySQL requires you to create a FULLTEXT index on any text field (including varchar and text fields) that you want to use in your search.  The FULLTEXT index is a way of optimizing those fields for search.

If you look at the structure of the abloomberg_blogposts table in phpMyAdmin, you will see that there is a FULLTEXT index on the title and message fields, which allows this example search to work.

Indexes used by the abloomberg_blogposts table

Indexes used by the abloomberg_blogposts table

Creating a FULLTEXT index in phpMyAdmin

You must create such an index on the fields you intend to search in your own code in order to use the FULLTEXT search technique.   Creating such an index is simple in phpMyAdmin.  First go to the “Structure” tab of the table you will be searching.  In the Indexes box on that page, in the “Create an index on … columns” text field, fill in how many fields you wish to create the index for (i.e. how many fields you will be searching), and then click Go.  The next page will ask you which fields you want to index (i.e. which fields you will be searching), and it also asks you to name this index.  You can name the index anything, but it’s probably best to use a name that includes the field names the index uses.  In my example, the index is named, “title_message”.

More information on FULLTEXT search
http://www.onlamp.com/pub/a/onlamp/2003/06/26/fulltext.html

http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

http://dev.mysql.com/doc/refman/5.1/en/fulltext-boolean.html


Class 11 – Brief Intro to Facebook Application Development

Posted: May 2nd, 2009 | Author: | Filed under: facebook | Tags: | No Comments »

As we saw in class today, Facebook Application development is not very much different from the sort of XHTML, CSS, PHP, and MySQL development we have been covering in class.

The main difference is that in addition to data that you store and retrieve from your own database, you have access to “social graph data” that comes from the Facebook Platform.

Here is the link to the official documentation for building Facebook Apps at http://developers.facebook.com

Application setup

To initially set up an application on Facebook, you’ll need to go to your personalized developer home page at http://facebook.com/developers.  There you need to click the “Set Up New Application” button, which will bring you to a page where you fill in a few details about your application.

Set up new application button

Set up new application button

The most important two bits of information that you need to fill in are your application’s “canvas URL”, and the “callback URL”.

Canvas settings

Canvas settings

Canvas URL

The term “canvas URL” refers to the URL that your application will have on the facebook site.  For example, http://apps.facebook.com/webdevspring/

Callback URL

The term “callback URL” refers to the actual location of your application if you were to access it directly in the browser.  For example, http://onepotcooking.com/examples/class11/facebook/.  However, users will not ever actually go to this page directly in their browser.  Instead, they will view your application as if it were on the Facebook site itself at the canvas URL.

When a user loads the canvas URL in their browser, Facebook serves as a sort of proxy.  Behind the scenes, Facebook loads the page from your callback URL, parses the code that it finds there and replaces any FBML it finds in that code with its XHTML equivalent.  Then it places the result of that parsing process into the main section of the Facebook page template.  So your page looks like it is hosted on the Facebook site, although you and I know that it is on our own server.

API Key & Secret

Once you have filled in all the required fields for the Application setup, Facebook will show a page that has your new application’s API Key and Application Secret.

API key and application secret

API key and application secret

These two bits of information are used for authentication, and are necessary for allowing your application’s code to make requests to Facebook’s Platform server.

You will need to use these two encoded strings to make a secure connection to Facebook’s servers on every page where your code interacts with the Facebook Platform.

FBML

Facebook’s proprietary markup language, FBML, is a subset of XML, like XHTML.  From the prefix, “fb:” that is prepended to every FBML tag, you should recognize that the tag names use their own XML namespace.

One of the reasons Facebook  created FBML is to make it easy for independent developers to place users’ photos and first and last names, and other bits of “social data” on a page, without Facebook having to give any random developer access to the database where that data is actually stored.  Giving independent developer’s database access would obviously be bad for security and performance of their site.

As an example of a typical use of FBML, to place a user’s profile photo on one of your application pages, you would use FBML like this somewhere embedded in your XHTML:

<fb:profile-pic uid=”112233″ size=”square” />

When Facebook parses the code from your callback URL, it will see that you are using FBML code, and it will replace this bit of code with the profile photo of user #112233, assuming there is such a user.

So when a user views the source code of any Facebook Application page, they will not see the FBML code – it will be parsed and removed by the Facebook server.

Facebook PHP Client

To get your code talking to the Facebook Platform, you will need to download the Facebook PHP Client Library, which is an object-oriented set of classes that provide some easy-to-use methods and properties for accessing Facebook user data, and some other common tasks that relate to your application interacting with their site.

You will need to include the Facebook PHP Client Library into your scripts by using the same require_once() function that we have been using to include our own files into our scripts.

Our example page

Let’s go line by line through a very simple application page that will display the profile photos of the logged-in user’s list of friends.
The first command just includes the Facebook PHP Client Library.

//include the Facebook API Client
require_once("facebook_client/facebook.php");

The next few lines set up our basic communication channel with the Facebook Platform, using the API Key and Application Secret that we got when we set up the application on Facebook.

You will recognize that we are creating an object from the Facebook class that is defined in the Facebook PHP Client we downloaded and include in this script.

//when we set up a new application on Facebook, they give us an API Key and API Secret for this app
//these will be different for each app
//we store them in variables and use these to set up communication with the Facebook API
$fb_api_key = "5a8c964c7f38f6aa53035f91133321d5";
$fb_api_secret = "00d997c03f7d342906b3aaa5da7956e5";

//INSTANTIATE FB API
//this creates a Facebook object, which we call $FB
//you can see that we are passing the API Key and Application Secret as required parameters to the Facebook class's constructor function
//this essentially authenticates a connection to the Facebook Platform and allows us to communicate with it in code
$FB = new Facebook($fb_api_key, $fb_api_secret);

So now we have an object called $FB, which is a Facebook object.  This object has all the properties and methods that Facebook has defined in their Facebook class definition.

One of those methods of the Facebook object is a method that requires the current user to be actually logged-in to the Facebook site.  We run that to make sure that user’s are logged in.

//REQUIRE USER TO BE LOGGED IN
$FB->require_login();

Then, we call a built-in method of the Facebook object that returns an array of all the user ids of the current user’s friends.  We store that array of friends’ ids in a variable called $arrFriendIds.

//GET A LIST OF THIS USER'S FRIENDS
$arrFriendIds = $FB->api_client->friends_get();

If you wanted to see the raw data that is contained within the Facebook object, you could, of course, use print_r() to output the contents of the $FB object.

//print_r($arrFriendIds);

In our example, we then use the data we have gathered from the $FB object and dislpay it using XHTMl and FBML:

<div class="container">
  <h1>Welcome, <fb:name uid="<?= $FB->user ?>" useyou="false" /></h1>
  <p>Here are your friends</p>
  <div id="friend_container">
    <?php foreach ($arrFriendIds as $friendId) : ?>
      <fb:profile-pic uid="<?= $friendId ?>" size="square" />  
    <?php endforeach ?>
  </div>
</div>

The Facebook property, $FB->user always holds the user id of the current user.  You can see that I have highlighted PHP code in green, and FBML code in red, to easily identify how each is being used in this simple example.
Notice that this XHTML & FBML code is just a code “snippet”, not a full XHTML document.  This is because Facebook will take this code and stick it inside their own XHTML document, so we should not redefine the <head>, <body>, or other basic tags that they will be creating for us on the page that shows our application.
We wrap the XHTML & FBML snippet inside of a div with id=”container”, just as we would on any other web page for the same reasons: this makes our part of the page easier to style and the layout easier to manage.

Conclusion

Obviously, this is just the beginning of developing for Facebook.  The interesting part of developing applications occurs when you combine the data that Facebook gives you through its Platform APIs with the user-generated content that you store in your own database.

For example, you could port your blog assignments to become Facebook applications by changing your code so that every time a user makes a post, you are storing the post along with their Facebook user id, found in the $FB->user property, not the user id that you automatically assigned to users when you made your stand-along blog site.

You would not have to perform any user authentication (i.e. registration or login), since Facebook would do all that for you when a user signs up for their site, so you would not need a “users” table at all in your database.  All user information is obtained through the Facebook Platform, and your database just stores everything else except that.

As another example, our earlier homemade social network example would be totally redundant if you ported it over to Facebook, since all “friend” information for Facebook apps is handled by the Facebook Platform, not by your own code.  So you wouldn’t have to keep track in your database of who is friends with whom.  You could leave those tasks to Facebook and concentrate on building out more interesting and compelling functionality on top of that.

You will be surprised at how many apps are just glorified message boards that store data in much the same way as some of your previous assignments.

For further reading, I recommend exploring the documentation linked to from the “Get Started” section of the Facebook Developers site.


Class 9 – Pagination in PHP

Posted: May 2nd, 2009 | Author: | Filed under: mysql, php | Tags: | No Comments »

Here is an example of how to do pagination of data in PHP.

Let’s say you have a database table full of animals.  You only want to show 10 animals per page, with “previous” and “next” buttons to allow users to go to the previous or next page of results.

There are a few different things we need to keep track of in our code to make this type of pagination work.  Here’s a quick run-down of the most important pieces of code in the controller:

  • $pageNum = the page number of the page the user is viewing
  • $numRowsPerPage = the number of results we want to show on each page
  • $startIndex = the index of the first row we want to display on this page
  • $endIndex = the index of the last row we want to display on this page
  • $numRowsTotal = the total number of results in the database
  • $numPagesTotal = the total number of pages of results

The current page number

The current page number is retrieved from the query string of the URL using the ubiquitous $_REQUEST variable.

//FIGURE OUT WHICH PAGE OF RESULTS TO SHOW
$pageNum = $_REQUEST['page']; //which page to show
//if no page requested, load the first page
if (empty($pageNum)) {
	$pageNum = 1;
}

This code checks for the page number in the query string.  If there is no page number there, it defaults to page 1.

The number of results to show on each page

The $numResultsPerPage variable is just hardcoded to some value:

//PAGINATION SETTINGS
$numRowsPerPage = 10; //how many results to show per page

The index of the first row to show

If the user is viewing page 1, and there are 10 rows per page, then the first row we want to show them is row 1, and the last row is row 10.  If they are on page 2, then the first row we want to show them is row 11, and the last is row 20… because they have already seen 1-10.

The general formula for this is:

//calculate the start index of the rows to show on this page
$startIndex = ($pageNum-1) * $numRowsPerPage;  // get the starting index number of the first item to show on this page

The index of the last row to show

In general, the index of the last row on the page is just the index of the first row, plus the number of rows on the page.

//calculate the end index
$endIndex = $startIndex + $numRowsPerPage;

However, on the very last page, it may be that there are fewer than 10 results.  This means that the index of the last row may not be just the index of the start row plus 10.  So we need to check to make sure that if we are on the last page, the value we calculated for the index of the last row is not greater than the total number of rows in the table.

//make sure on the last page that we don't have an end index that is greater than the total number of rows
if ($endIndex > $numRowsTotal) {
	$endIndex = $numRowsTotal;
}

The total number of rows

The total number of rows is just how many rows there are in the database table.

$numRowsTotal = sizeof(getAnimals()); //get the total number of animals in the database

This function getAnimals() returns an array with the full list of animals, and we use PHP’s built-in sizeof() function to get the number of elements in that array.

The total number of pages

The total number of pages is just the total number of rows, divided by the number of results per page.  That’s the math.

$numPagesTotal = ceil($numRowsTotal / $numRowsPerPage); //get the total number of pages

Display the number of results

Assuming the aforementioned variables have all been set up properly in the PHP in your controller script,  you will probably want to display the indexes of the results shown on any given page, as well as the total number of results somewhere in the XHTML template you are using for the view of your application.

The following bit of XHTML interspersed with “template-style” PHP will display the start and end number of the results on the page, as well as the total number of results in a nicely formatted, “normal” way that is commonly used by search engines:

<span>Displaying <?php echo $startIndex + 1 ?>-<?php echo $endIndex ?> of <?php echo $numRowsTotal ?> results</span>

This will output something like this in the browser,

Displaying 1-10 of 35 results

Display links to other pages

Let’s say you have 10 pages of results.  The first time a user comes to your site, they will see page #1.  So obviously, in your XHTML templates, you will also want to dislpay links to the other pages of results.  To do this, you will make use of some of the other variables we have set up in the PHP controller script.

Here is an example of how to use the PHP variables to output a very ordinary set of links to other pages:

<?php if ($pageNum > 1) : ?>
<a href="index.php?page=<?php echo $prevPage ?>">Prev</a>
<?php endif ?>
<?php if (($pageNum-1) * $numRowsPerPage + $numRowsPerPage < $numRowsTotal) : ?>
<a href="index.php?page=<?php echo $nextPage ?>s">Next</a>
<?php endif ?>
<?php for ($i=1; $i<=$numPagesTotal; $i++) : ?>
<a <?php if ($pageNum == $i) : ?>class="selected"<?php endif ?> href="index.php?page=<?php echo $i ?>"><?php echo $i ?></a>
<?php endfor ?>

Let’s take this code line-by-line.  The first three lines here display the “Prev” link to the previous page:

<?php if ($pageNum > 1) : ?>
<a href="index.php?page=<?php echo $prevPage ?>">Prev</a>
<?php endif ?>

If the user is on the first page, it doesn’t make sense to have a link to the previous page since there is no previous page, so we check to make sure the user is not on the first page before displaying this link.

The next three lines of code display the link to the “Next” page:

<?php if (($pageNum-1) * $numRowsPerPage + $numRowsPerPage < $numRowsTotal) : ?>
<a href="index.php?page=<?php echo $nextPage ?>s">Next</a>
<?php endif ?>

This code is basically just making sure that there is a next page before outputting the link.  It doesn’t make sense to show a link to the next page if we’re already on the last page.

And these last three lines display the page numbers of all the pages as links:

<?php for ($i=1; $i<=$numPagesTotal; $i++) : ?>
<a <?php if ($pageNum == $i) : ?>class="selected"<?php endif ?> href="index.php?page=<?php echo $i ?>"><?php echo $i ?></a>
<?php endfor ?>

The for loop here loops through from the first to the last page, and displays each page number as a dynamic link to that page of results.  It highlights the current page by adding a CSS class called “selected” to the link tag.


Class 12 – Common Command-Line Subversion Commands

Posted: May 2nd, 2009 | Author: | Filed under: version control | Tags: , , | No Comments »

Here are some of the most common Subversion commands that you may one day run on the server, which uses a Linux operating system.  Most of the time, your SVN client, like TortoiseSVN or SvnX, will handle these things for you.  But here are the commands for your reference anyway, in case you one day consider yourself something of a server administrator.

Note 1: These examples use “$SVNREP_DIR” in place of the full path from the server root to the folder on the server where your repository is stored.  And “myproject” is in place of your project’s name.

Note 2: Also, these examples show how to run the commands on the same server that stores the repository.  If for some reason, you are running them on a different computer than the one that stores the repository, you can replace the file:///somefolder/ urls with an absolutely http://someurl.com/somefolder/ path.

//command to create a repository
svnadmin create $SVNREP_DIR/myproject

//command to create the layout of the repository
svn mkdir -m "layout creation" file:///$SVNREP_DIR/myproject/trunk file:///$SVNREP_DIR/myproject/tags file:///$SVNREP_DIR/myproject/branches

//do an initial import of your project into the repository
cd ~/myproject
svn import -m "initial import" . file:///$SVNREP_DIR/myproject/trunk

//check to make sure all files were imported
svn ls file:///$SVNREP_DIR/myproject/trunk/

//move your original project folder to a backup folder
mv myproject myproject.origin

//checkout the project from the repository
svn co file:///$SVNREP_DIR/myproject/trunk myproject

//make sure the files checked out successfully
ls myproject

Class 12 – Version Control Using Subversion

Posted: May 2nd, 2009 | Author: | Filed under: version control | Tags: , , | No Comments »

When projects require several developers (i.e. designers, developers, html coders, etc), they may run into situations where multiple people try to edit the same file at the same time. This could be problematic, since one developer could easily overwrite another’s work.

To prevent one person from overwriting another’s work, most teams of coders use some form of version control. Version control software allows multiple people to simultaneously work on the same files while not overwriting each other’s work.

Subversion is probably the most popular version control software:
http://subversion.tigris.org/

You can find the Subversion online manual here.

The Repository

Subversion uses a type of client/server architecture.  The permanent location of all files is on the server, in what is called a Repository.

The Repository has the current version of all files, as well as all the older versions of the files stored in a nicely organized way. It keeps a permanent record of any changes to the project.

Clients can always get the latest copy of the files by copying them from the Repository.  So developers generally make changes to files on their local machines, and then upload them to the Repository so that other developers can get the lastest version of the files.

Internally, the Repository in Subversion has three subfolders: “trunk”, “branches”, and “tags”.

The “trunk” is the main folder that always has the latest copy of the main version of the code.

The “branches” folder is used if you decide to fork your code and simultaneously work on two or more different variations (or branches) of your code.

The “tags” folder allows you to give nicknames to prior versions of the code for easier access later on: for example, you may decide to call revision #12 the “gold_release” with a tag so that it is easier to remember than a number.

In general, for a simple project, you will work with code in the “trunk”, unless you are an advanced user.

The Client

Your local computer runs a Subversion client that automates the common tasks that you will need to do when using version control. The recommended client for Windows is TortoiseSVN. For Mac, you should check out SvnX.

Any client will easily allow you to run the basic Subversion commands:

  • checkout - this command is used to get all the files from the repository at once and download them to your client.  You usually run it once at the beginning of development to set up your working copies of the files
  • import - this command is used to import files as a batch into the Repository.  It is usually used once at the beginning of a project that has already begun without version control.
  • add - this command adds a file to the repository.
  • update - this command updates all the files on the client machine with the latest version from the Repository
  • delete - this obviously deletes files
  • commit - permanently store any changes in the Repository.  Note: you need to commit after every modification, merge, add, or delete in order for those changes to be permanently stored in the Repository.

Working Copy

When a developer downloads the latest copy of the files from the Repository, they are placed in what is called a Working Copy. This is exactly what it sounds like – a copy of the original files that a developer can use to work on her client machine.

Once a developer has finished modifying the files, she has to then upload them back to the Repository so other developers can see the changes.

Checkout

When a developer starts working on a project that is using version control with Subversion, the first thing they will need to do is get a copy of the latest version of all the files in the project.  This is called the initial checkout.

Any Subversion client, like TortoiseSVN or SvnX, will have an option to checkout the files in a repository.

This command is usually only run once to do the initial setup of the Working Copy on the client machine.

Update

To get the latest copy of files from the Repository, a developer runs a command called Update. This automatically downloads the latest files from the Repository.

Developers should run an update immediately before editing any file, and again immediately before Committing any changes to the Repository.

Commit

When a developer has finished making changes to her Working Copy of the files, she first runs the Update command to make sure that nobody else has modified the same files she is about to upload, and then runs a Commit command that automatically uploads the changed files to the Repository.

Conflicts

When two developers have modified the same files at the same time, and then they both Commit their changes, the second developer to commit his changes will experience a Conflict. Usually the work the developers have done is in different parts of the file, so Subversion can automatically merge the two versions together.

However, in some cases Subversion cannot figure out how to merge the two versions of the file, so it asks one of the developers to manually merge the two. Subversion will show you the difference between the two files, and you can usually easily see what has to be done to get the two versions to merge correctly.

Once you are finished merging, you must of course commit any changes in order for them to be permanently stored in the repository.

Workflow

To sum it up, here are the usual steps you will take when working on code that is held in version control:

  1. update to the latest version of code
  2. modify the code
  3. update again to check for any conflicts from other people’s changes
  4. merge any conflicts that have arisen from other people’s modifications
  5. commit your changes

Class 11 – Intro to Security on the Web

Posted: May 1st, 2009 | Author: | Filed under: security | Tags: | No Comments »

Security risks on the web fall into 3 general categories:

  1. Server-side risks
  2. Client-side risks
  3. Network eavesdropping

Server-side risks

Every web server is a security risk.  When you publish a website, you are letting anyone in the world connect to your server and access your files, run scripts, upload files, run queries on and store data in your database. The more complicated your setup, both in terms of the server setup as well as your code setup, the more likely you are to have bugs, which in turn makes it more likely you have holes in your security. This is true not only of the code you write, but also of all the products you use to help make your web site work.  Common risks include the theft of confidential information and the installation of malicious scripts onto your servers.

A common example of something hackers will do once they compromise your server is a distributed denial of service attack (DDOS). Hackers will gain access to many insecure servers and install scripts that do nothing but make requests to a particular web server. With thousands of these scripts running concurrently on many compromised servers, a setup known as a botnet, hackers can easily create so much traffic for a website that it brings the web server to its knees and is not able to respond to all the requests. This happens all the time to the most popular sites. Usually web servers have software that detects attempted DDOS attacks and has mechanisms for blocking requests from any server that seems to be compromised in this way.

Another common attack is the SQL injection attack. Hackers will try to gain access to your database this way, and can easily steal private information, for example credit card numbers, if you are not careful. This is the primary reason why you should ALWAYS sanitize user input before using it in queries to the database. Make sure what the user has submitted does not contain any weird code in it, and that it is of the type that you expected (e.g. if it’s a phone number you expect, make sure it’s a phone number the user entered).

Client-side risks

Attackers may also target the client in a variety of ways. Each web browser runs as an application on your local client machine. This means the browser software has access to your file system and everything on it. Since the information that the browser uses to display content from the web is usually coming from servers on the web, there’s a chance that a hacker will be able to use a server to send instructions to your browser that may install malicious software, or force the client to do things like upload personal information to the hacker’s server.

Multiple layers of anti-virus software is a must on both PC and Mac for preventing malware from running your computer. Given that the web is a high-risk environment, most web browsers and email clients are thoroughly tested and can be considered secure. However, all of the major web browsers and email clients do issue security updates from time-to-time to fix security problems they find in their software.

Certain types of web applications, such as Java, ActiveX, Silverlight, Flash, Adobe PDF are not natively supported by most web browsers. This means that they must run as separate applications from the web browser (even though they show up in the web browser window), and so these technologies have their own security risks that their developers must constantly mitigate. Like browsers, these technologies are so commonly used that security risks are usually discovered quickly, and updates are sent out that patch the bugs. But bugs do exist, and hackers are always trying to find new ones. Do a search for “flash vulnerabilities” on Google, and you will see examples of exploits that hackers have created using Flash.

Phishing scams are another major client-side risk that you should be aware of. Scammers could create a website, for example, that looks exactly like Amazon.com’s checkout page, but is actually created by a hackers in Nigeria. If for some reason you find yourself on this site thinking it is Amazon.com, you may enter your credit card information, which is then used by the hackers to buy gifts for themselves (or other more nefarious things). Phishing scams are also commonly used for identity theft – the phishing sites trick users into revealing personal information which is then used to apply for credit cards, issue passports, buy weapons, etc.

Most web browsers and email clients (e.g. Microsoft Outlook, Mozilla Thunderbird, Mac Mail, etc.), and client security programs (e.g. Norton Antivirus) have ways they try to identify phishing scams. But hackers are constantly figuring out new ways of bypassing or compromising every new tool that developers create, so most software should be updated regularly to keep it secure.

Network eavesdropping

Any time a client communicates with a server, the data is physically transmitted either via electric current in a wire or via radio waves in the air. There are ways hackers can intercept either of these means of communication.

Wireless communication is notoriously insecure. Anyone with a wifi card in their laptop can easily intercept unencrypted data being passed between the wireless router and other laptops. So some people encrypt the data that is passed between the two. The thinking goes that even if someone does intercept the signal, they won’t be able to understand it since it’s encrypted. However, WEP, the most commonly used encryption protocol available on wireless routers is known to be very weak encryption. WPA2 is supposedly a bit more secure, if it is available on your router. Another way to secure your wireless network is to set up your wireless router to only accept connections from computers with particular MAC addresses. Each computer has a unique MAC address that never changes.  Most new routers will have all of these options.

Wired communication, via ethernet cable, or other types of wires, can also be intercepted by someone who plugs into the same network as either the client or the server. Since all communication between client and server shares wires that also are used by other clients and other servers, it’s not crazy to imagine that someone could find a way to intercept and listen in on your conversation.

Like wireless communication, there are methods of encrypting communication over the wires so that even if someone does intercept communications, they won’t be able to easily decipher them.

Many web servers, especially for e-commerce sites, are called “secure servers”. Secure servers use the HTTPS protocol instead of the regular HTTP, so the URL will look like https://something.com, for example. Often, the checkout pages of online stores, or any page that asks the user to enter confidential information will be hosted on a secure server.

HTTPS encrypts the communication between the client and the server using the SSL encryption protocol. So the “secure server” is actually just encrypting the network communication between client and server, not securing the server itself against server attacks. The server and the client still have the same security risks as any other client or server. As with all encryption methods, SSL (and thereby HTTPS) can be hacked – a common exploit being the man-in-the-middle attack.

Further reading

http://www.w3.org/Security/Faq/
http://www.securityfocus.com/infocus/1864
http://www.windowsecurity.com/articles/Common_Attacks.html
http://www.icir.org/vern/cs294-28/scribe/WebClientAttacks.pdf
http://www.icir.org/vern/cs294-28/syllabus.html