phpHoo, Part I | 10 | WebReference

phpHoo, Part I | 10

phpHoo, Part I

Meta Characters and MySQL

Meta Characters are characters that have special meaning to both PHP and MySQL. Meta Characters include the following:

    Dollar Sign  : $
    Back Slash   : \
    Single Quote : '
    Double Quote : "
    Asterisk     : *
    Question Mark: ?
    Period       : .
    Caret        : ^
    Plus Sign    : +
    Brackets     : [ ]
    Parenthesis  : ( )

If any of these characters appear within the form data, they will need to be "escaped", or MySQL and PHP will barf up a lung on them. Knowing this, the PHP developers come to the rescue yet again.

Magic Quotes

Magic quotes is the PHP option to automatically escape meta characters in all operations that can be adversely affected by them, specifically SQL and Database operations. If Magic Quotes is enabled, PHP handles the escaping of Meta Characters automatically in the background every time you call a MySQL function. phpHoo assumes that Magic Quotes are ENABLED. If Magic Quotes is -not- enabled on your server, you'll have two options available to you.

    Option 1: change lines 2 and 3 of phpHoo.phtml to read as follows:
        02 if(!get_magic_quotes_runtime()) { set_magic_quotes_runtime(1); }
        03 if(!get_magic_quotes_gpc()) { set_magic_quotes_runtime(1); }
    Option 2: Use addslashes() on lines 253 through 258 of MySQL.php3
        253  $CatID = addslashes($postData["CatID"]);
        254  $Url = addslashes($postData["Url"]);
        255  $Description = addslashes($postData["Description"]);
        256  $LinkName = addslashes($postData["LinkName"]);
        257  $SubmitName = addslashes($postData["SubmitName"]);
        258  $SubmitEmail = addslashes($postData["SubmitEmail"]);

Option 1 may or may not work for you. It depends on how your PHP server was built. Option 2 is guaranteed to work regardless of how PHP was built. Warning - if magic_quotes is enabled, then using addslashes() will -cause- problems, not solve them.

Regardless of which option you choose, you'll notice that throughout the phpHoo.phtml source code, all MySQL data is sent through the stripslashes() function? This is because, although PHP will automatically -add- slashes for you, it will not automatically strip them from returned data. You have to do this on your own.

phpHoo is almost complete now. In it's form now, it is equivalent to Part II of the perlHoo tutorial and functionally identical to the first installment of Yahasp. It's time to push the envelope now.

Searching your database

Line 147 of phpHoo looks for the $KeyWords variable. If it is set, we're going to search the database. We make this look deceptively easy on line 150 with the following MySQL method call:

150 $hits = $db->search($KeyWords);

The search method in the MySQL class begins on line 200. It is the largest method in our class. Our keywords are going to arrive as a list of space separated strings.

Since browsers handle spaces and meta characters differently, we hedge our bets by passing the keywords list through the urldecode() function on line 206. The urldecode() function will convert "percent encoded" values to their ASCII equivalents. For example:

    Space Character: %20  =  " "
    'At' Symbol    : %40  =  "@"
    Ampersand      : %26  =  "&"
    Dollar Sign    : %24  =  "$"

To finish cleaning the data, we also pass it through the trim() function, which will remove any leading or trailing whitespace characters, like spaces and tabs. We also need to make sure there is only -1- space separating each keyword. The regular expression on line 207 takes care of this for us by converting multiple spaces into one single space.

Once we've finished cleaning the data, we can take a look at it. The regular expression and associated logic on lines 209 through 215 first check to see if there is more than one keyword. If there is only 1 keyword, it creates an array with only 1 element. If there is more than 1 keyword, it creates an array where each element in the array consists of a single keyword.

Now we can start our SQL query on line 217. DISTINCT is used in this SQL query for a very good reason. If there are 5 keywords, one link in the database may match all 5 keywords. MySQL doesn't care about this and will return the same link 5 times, once for each keyword. By using the DISTINCT directive, we're telling MySQL to only return a link once and then ignore it if the link matches more keywords.

On line 218 we start assembling our search logic by counting the number of keywords in our search array. If there is only 1 keyword, the SQL query gets completed. We're searching the Url, LinkName, and Description fields for words that are LIKE our keyword. LIKE is a comparison operator used specifically for string data types. The actual syntax we'll use is as follows:

(LinkName LIKE '%$keyword%')

The percent sign in MySQL is interpreted as a wild card which means "anything or nothing". The percent sign in MySQL is the same as the period in Perl regular expressions. By placing a percent sign at the beginning and end of our keyword, we're telling MySQL to look for exact matches, or sub-sets. So if our search was "LIKE '%house%'", the following would all return as hits:

House, houses, housed, house-of-ill-repute, Warehouse, household

Notice that the search is case-insensitive?

Case insensitivity is based on our MySQL data type. TEXT data types in MySQL are always compared case-insensitively. BLOB data types in MySQL hold the same type of string data as TEXT data types, however, MySQL comparisons on BLOB data types are always done in a case-sensitive fashion.

If there is more than one keyword in our search, we fall into the "else" condition beginning on line 224 . I decided to use a logical OR for all our operations. This means that if there are 5 keywords, any link that matches 1 of the 5 will return as a hit. If you wanted to change this to where a link has to match ALL keywords, just change the logical OR to an AND. Since we can have a variable number of keywords, we have to set up each logical condition within it's own set of parenthesis and keep track of where we are within these parenthesis, so a counter called '$ticker' is created to keep track of where we are in the array by comparing it to the $count we created on line 218. So long as our $ticker does NOT equal $count, we've got more keywords to go, so a logical OR is placed at the end of the SQL segment on line 233.

The moment $ticker and $count are equal, we know we're at the last keyword and won't need another OR at the end of the SQL string.

We fall out of the loop at line 239 and finish off our SQL statement with an 'ORDER BY LinkName' directive. This will order our hits by their LinkName in alphabetical order.

Line 204 and line 243 work together. If you'd like to "see" the completed MySQL query, set $DEBUG on line 204 to equal a carriage return "\n". Line 243 will catch this and will actually print out the completed MySQL statement to the browser for you to look at. Obviously you only want to do this while you're testing the program. As an example, here's a completed SQL query using 3 keywords: ( php webreference yahoo)

SELECT DISTINCT LinkID,CatID,Url,LinkName,Description FROM Links WHERE (Approved != 0) AND ( 
  ( (Description LIKE '%php%') OR (LinkName LIKE '%php%') OR (Url LIKE '%php%') ) OR 
  ( (Description LIKE '%webreference%') OR (LinkName LIKE '%webreference%') OR (Url LIKE '%webreference%') ) OR 
  ( (Description LIKE '%yahoo%') OR (LinkName LIKE '%yahoo%') OR (Url LIKE '%yahoo%') ) 
  ) ORDER BY LinkName

And here's the same query with only 1 keyword, (php)

SELECT DISTINCT LinkID,CatID,Url,LinkName,Description FROM Links WHERE (Approved != 0) AND (
  (Description LIKE '%php%') OR (LinkName LIKE '%php%') OR (Url LIKE '%php%') ) ORDER BY LinkName 

To show you the type of speed advantage MySQL can grant, enter those two queries directly into your MySQL interface:

    Five Keywords :  6 rows in set (0.01 sec)
    Single Keyword:  5 rows in set (0.00 sec)

One one-hundredth of a second to search your entire database is about average. Now, please stop drooling on the keyboard, we have a little more work to do yet. :-)

The entire query, once completed, is sent to our select() method and the results are immediately returned to the phpHoo program on line 150 as the $hits array. If there are no hits, lines 151 through 156 will start the Search Results page with a "No Matches" message. Notice that we haven't sent the start_browse() function yet here.

If there are hits, lines 157 through 180 will start the page (start_page()) and walk through the returned results printing the associated HTML to the browser. We fall out of the loop at line 181, print a page break, and call start_browse() to finish off the page with the "Top" level output.

The last 5 lines of phpHoo are here strictly as a "last resort". If all our logic fails and phpHoo doesn't have any data to work with, it will act as if the user has just loaded phpHoo and call the start_page and start_browse functions. In theory, phpHoo will never execute the commands between lines 184 and 189. It's still a good idea to have a "what do I do if everything else goes horribly wrong" condition in your programs. Line 190 completes phpHoo.

Produced by Jonathan Eisenzopf and
Created: July 20, 1999
Revised: July 20, 1999