Since I knew that the MySQL Query Cache used the literal queries as keys, it made sense that MySQL did not cache queries with certain SQL functions in them, such as this one:
1 | $sql = "select event_id from events where event_dt >= curdate()"; |
Because MySQL knows that this query run today is not the same query when it is run tomorrow. There are other SQL functions such as rand() and unix_timestamp() that will bypass the query cache. These are listed here.
So I avoid these functions when possible by calculating the value in PHP. For example, I’d rewrite the above query as:
1 2 | $date = date('Y-m-d'); $sql = "select event_id from events where event_dt >= '$date'"; |
At work, every project has an .htaccess file containing at the least some mod_rewrite rules. This way, all I need to do to run a project is check it out of version control. I don’t need to modify my local Apache configuration.
But turning this option on and allowing .htaccess files may be a performance hit. More specifically, enabling the AllowOverride option in Apache is a performance hit. The Apache docs sums up the problem best:
“Wherever in your URL-space you allow overrides (typically
.htaccessfiles) Apache will attempt to open.htaccessfor each filename component. For example,
1 2 3 4 DocumentRoot /www/htdocs <Directory /> AllowOverride all </Directory>and a request is made for the URI
/index.html. Then Apache will attempt to open/.htaccess,/www/.htaccess, and/www/htdocs/.htaccess.”
So I disabled all .htaccess files in production, and inserted each file’s individual mod_rewrite rules into the main Apache config file. After a quick Apache Bench run, one project looked around 3% faster. Note that there are a few other useful optimizations on that page.
In MySQL (and MySQL only AFAIK), INSERT has a clause called ON DUPLICATE KEY UDPATE. When ON DUPLICATE KEY UPDATE is used with INSERT, the insert will update the record if a value for a unique or primary key already exists, or else create a record if the value does not exist. So now when a form can either create a new something or edit an existing something, you can use one query to do it, and not have to query to see if the something exists already.
So instead of this:
$num = $db->getone("select count(*) from events where event_id = 15"); if ($num > 0) { $sql = "update events set name = 'New name' where event_id = 15"; } else { $sql = "insert into events (event_id, name) values (null, 'New name')"; }
you can do something like this:
$id = (empty($_POST['id'])) ? null : $_POST['id']; $sql = "insert into events (event_id, name) values ($id, 'New name') on duplicate key update name = 'New name'";
Neat, eh?
In case you’re wondering what the difference is between ON DUPLICATE KEY UPDATE and a REPLACE query, a REPLACE fires a DELETE followed by an INSERT query, as opposed to a real UPDATE.
If you find yourself wanting to echo or print something to the screen, go ahead and do it, but be sure to add a debug-level logging call for the info too. Chances are, you or someone else will want to see the same info sometime in the future.
When looking for something in an array of values, it is very tempting to use in_array(). After all, that’s what the name says. However, searching through an array, even with best-case search algorithms, will never be faster than a single index lookup, which is where isset() comes in. With isset(), you can use one operation to see if a value exists, provided those values exist as keys. I don’t know if it’s truly random access, but it’s pretty darn close.
So, instead of something like this:
1 2 3 4 5 6 7 8 9 | $exclude = array(1, 4, 6, 8); for ($i = 0, $size = count($data); $i < $size; $i++) { if (in_array($data[$i]['id'], $exclude) { // do something } } |
do something like:
1 2 3 4 5 6 7 8 9 10 11 12 | $exclude[1] = true; $exclude[4] = true; $exclude[6] = true; $exclude[8] = true; for ($i, $size = count($data); $i < $size; $i++) { if (isset($exclude[$data[$i]['id']])) { // do something } } |
So does this make a difference? Let’s write a little benchmark script.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | #!/usr/bin/php <?php $haystack = array(); for ($i = 0; $i < 1000; $i++) { $haystack[] = rand(0, 1000); } $needles = array(); for ($i = 0; $i < 1000; $i++) { $needles[] = rand(0, 1000); } for ($i = 0; $i < 1000; $i++) { foreach ($needles as $needle) { if (in_array($needle, $haystack)); } } |
We fill two arrays with 1000 random integers. One is the haystack – what we will search through. The other is the list of needles – we want to search for each one. For each needle, we look for it in the haystack. Then, we repeat this 1000 times.
Executing this, the script takes around 37 seconds:
% time ./bench.php real 0m37.400s user 0m37.282s sys 0m0.068s
Now, let’s change the last for() loop to this:
15 16 17 18 19 20 21 22 | for ($i = 0; $i < 1000; $i++) { $tmp = array_flip($haystack); foreach ($needles as $needle) { if (isset($tmp[$needle])); } } |
The new output:
% time ./bench.php real 0m0.778s user 0m0.764s sys 0m0.008s
Execution time drops from around 37 seconds to 0.7 seconds.