Optimizing PHP code

Original image with source codes and execution times

Yesterday Frédéric Bisson posted an image on Twitter comparing the execution times of the same algorithm (a Scrabble helper) implemented in Haskell, Python and PHP. The reported times were 0.6s, 0.3s and 5s respectively. I know that PHP can be slow, but 5s was sluggish even for PHP. I had to do something about it, so I spent some time optimizing the code and checking possible solutions.

I contacted the author of the image and, apart from the input data file for the script, I got some information: he was using PHP 5.3 with XDebug enabled. Those two things alone showed that the initial time measurement was way off. The author disabled XDebug, and the time went down to 1.7s. Still quite high. I tested the script on a stock installation of Ubuntu’s PHP 5.5 (all extensions disabled), and it ran in about 0.86s. Still the slowest of the three, but nowhere near that “off the chart” initial number.

But I wanted to know whether something in the code was terribly slow. I used XDebug in profiling mode to generate cachegrind output. Yes, there were many function calls, but no individual call was exceptionally slow.

[Screenshot: 2014-08-08_15-18-26]

As you can see above, the most frequently called function was substr. The function itself is not slow, but it was called over 1.5M times. I knew that function calls are generally quite expensive in PHP, so I wanted to reduce their number however I could. The first fix was very simple. I replaced

substr($word, 0, 1)

with array-style access to the first character of the string:

$word[0]

That alone reduced the number of substr calls to less than 1M, and gave about a 20% speed boost (current execution time: ~0.69s). I also removed temporary variables that were used only once, but the gain was almost unnoticeable. (code of the current version)
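The difference between the two forms can be checked with a small micro-benchmark. This is only a sketch: the word and the iteration count are made up for illustration, it is not the Scrabble helper itself, and the measured times will vary with the PHP version and machine.

```php
<?php
// Hypothetical input; any non-empty string works.
$word = 'scrabble';
$iterations = 1000000;

// Variant 1: first character via a function call.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $first = substr($word, 0, 1);
}
$substrTime = microtime(true) - $start;

// Variant 2: first character via string offset access, no function call.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $first = $word[0];
}
$offsetTime = microtime(true) - $start;

printf("substr(): %.3fs, offset access: %.3fs\n", $substrTime, $offsetTime);
```

Note that the two forms are interchangeable only for non-empty strings: on an empty string, `$word[0]` triggers an "uninitialized string offset" notice or warning, while substr does not.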

[Screenshot: 2014-08-08_15-19-03]

But that was still not enough. Knowing that function calls are expensive, you might think about rewriting the recursive algorithm as an iterative one. That might not be fair when the other languages use a different algorithm, but hey, you have to know your language’s strengths and weaknesses. That change was made by the author of the initial code.

This change lowered the execution time to ~0.55s, which is a 19% gain over the previous iteration and a 36% gain in total.
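To illustrate the kind of rewrite involved, here is a minimal sketch of the same lookup written recursively and iteratively. It is not the author's algorithm: the trie structure, the `'$'` end-of-word marker and all function names are assumptions made up for this example.

```php
<?php
// Build a nested-array trie; the key '$' marks the end of a word.
function buildTrie(array $words)
{
    $trie = array();
    foreach ($words as $word) {
        $node = &$trie;
        for ($i = 0, $n = strlen($word); $i < $n; $i++) {
            $letter = $word[$i];
            if (!isset($node[$letter])) {
                $node[$letter] = array();
            }
            $node = &$node[$letter];
        }
        $node['$'] = true;
        unset($node); // drop the reference before the next word
    }
    return $trie;
}

// Recursive lookup: one function call per letter.
function containsRecursive(array $node, $word)
{
    if ($word === '') {
        return isset($node['$']);
    }
    $letter = $word[0];
    if (!isset($node[$letter])) {
        return false;
    }
    // The cast keeps the result a string on PHP versions where
    // substr() returns false at the end of the string.
    return containsRecursive($node[$letter], (string) substr($word, 1));
}

// Iterative lookup: the same walk as a loop, with a single call in total.
function containsIterative(array $trie, $word)
{
    $node = $trie;
    for ($i = 0, $n = strlen($word); $i < $n; $i++) {
        $letter = $word[$i];
        if (!isset($node[$letter])) {
            return false;
        }
        $node = $node[$letter];
    }
    return isset($node['$']);
}

$trie = buildTrie(array('cat', 'car', 'dog'));
var_dump(containsRecursive($trie, 'car'), containsIterative($trie, 'car'));
```

Both functions return the same answers; the iterative one simply avoids the per-letter call overhead that the profiler pointed at.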

It was also possible to rewrite the closure as a generator. That part was provided by Filip Górny on the PHPers group on Facebook (PHPers is a network of Polish PHP meetups).

This code allowed me to go below 0.5s – the lowest run time I achieved was about 0.49s. Not a big difference from the previous version, but noticeable nevertheless. At this point I stopped toying with the code. I was thinking about looking at the opcode level, but I decided that would be too much.
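The original closure is not shown here, so the following is only a sketch of the general idea, with made-up function names: a producer that calls a closure for every item, next to a generator (available since PHP 5.5) that yields the same items to a plain foreach loop.

```php
<?php
// Callback style: the producer invokes a closure for every prefix.
function eachPrefix($word, $callback)
{
    for ($len = 1, $n = strlen($word); $len <= $n; $len++) {
        $callback(substr($word, 0, $len));
    }
}

// Generator style: the producer yields every prefix instead.
function prefixes($word)
{
    for ($len = 1, $n = strlen($word); $len <= $n; $len++) {
        yield substr($word, 0, $len);
    }
}

// Consuming the callback version needs a closure and a by-reference capture.
$viaClosure = array();
eachPrefix('php', function ($prefix) use (&$viaClosure) {
    $viaClosure[] = $prefix;
});

// Consuming the generator is a plain loop; no closure is called per item.
$viaGenerator = array();
foreach (prefixes('php') as $prefix) {
    $viaGenerator[] = $prefix;
}

var_dump($viaClosure === $viaGenerator);
```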

I had finished modifying the code, but I was curious whether the newest developments in the PHP core could be of any help. I downloaded and compiled the so-called php-ng, a branch of PHP with experimental core modifications focused on performance improvements. As was to be expected, it didn’t disappoint. The last version of the code ran in about 0.235s under php-ng. Now that was a number that satisfied me.

But I also wanted to be fair and check the other new player in the PHP world – HHVM from Facebook. HHVM takes a different approach than php-ng – it is a virtual machine with a Just-In-Time compiler. Whatever it is, it’s also known to run faster than the vanilla PHP interpreter, and in my tests it lived up to its reputation. The code ran even faster than under php-ng, with some run times even below 0.2s.

And that concluded my tests. Below you can find a table with run times (in seconds) for all three runtimes, with 20 consecutive runs for each one.

run     php-5.5  php-ng  hhvm
t1      0.633    0.248   0.262
t2      0.621    0.241   0.249
t3      0.530    0.233   0.203
t4      0.592    0.324   0.204
t5      0.557    0.240   0.202
t6      0.612    0.229   0.294
t7      0.539    0.219   0.202
t8      0.611    0.337   0.203
t9      0.558    0.220   0.204
t10     0.538    0.218   0.235
t11     0.531    0.220   0.228
t12     0.516    0.248   0.206
t13     0.540    0.217   0.202
t14     0.542    0.225   0.205
t15     0.564    0.219   0.237
t16     0.537    0.312   0.204
t17     0.553    0.218   0.207
t18     0.568    0.221   0.223
t19     0.548    0.221   0.258
t20     0.658    0.312   0.201
min     0.516    0.217   0.201
max     0.658    0.337   0.294
avg     0.567    0.246   0.221
median  0.555    0.227   0.206
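The summary rows can be recomputed from the individual runs. The helper below is a sketch with a made-up name (`summarize`), fed with the php-5.5 column from the table; it assumes a non-empty list of times.

```php
<?php
// Compute min, max, average and median of a non-empty list of run times.
function summarize(array $times)
{
    sort($times); // sorts the local copy only
    $count = count($times);
    $middle = (int) floor($count / 2);

    // Even count: mean of the two middle values; odd count: the middle value.
    $median = ($count % 2 === 0)
        ? ($times[$middle - 1] + $times[$middle]) / 2
        : $times[$middle];

    return array(
        'min'    => $times[0],
        'max'    => $times[$count - 1],
        'avg'    => round(array_sum($times) / $count, 3),
        'median' => round($median, 3),
    );
}

// The 20 php-5.5 run times from the table, in seconds.
$php55 = array(
    0.633, 0.621, 0.530, 0.592, 0.557, 0.612, 0.539, 0.611, 0.558, 0.538,
    0.531, 0.516, 0.540, 0.542, 0.564, 0.537, 0.553, 0.568, 0.548, 0.658,
);

$summary = summarize($php55);
print_r($summary);
```

For this column the result matches the min, max, avg and median rows of the table.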

Now it’s time for some conclusions.

The first one is obvious: PHP can be as fast as other languages (or, better, as the interpreters/virtual machines of other languages).

Second, know thy language. It might not be a very original observation, but languages differ. Syntax is the most obvious difference, but languages also excel in different fields. Some languages prefer one structure over another. Here we saw that it’s better to avoid too many function calls, so iterative algorithms are favored over recursive ones. You also have to know the ecosystem well enough to know that there are alternative interpreters and virtual machines.

The third and last conclusion is that benchmark comparisons of this kind are pointless (only synthetic tests are worse). It’s very easy to overlook some small issue or language intricacy, and the results will differ greatly. PHP is not a language that will be chosen for large-scale data analysis, and for a reason – there are better languages for that task. Is this a problem? Definitely not. No language excels in each and every application, and there’s no need for one to. We have a lot of different languages to choose from, so let’s wisely choose the appropriate language for a given application.

[PHP] Zend Server, PEAR, and PHAR error

Some time ago I was trying to install PEAR with a default installation of Zend Server. To do that, you have to run the following command:

"C:\Program Files (x86)\Zend\ZendServer\bin\go-pear.bat"

That resulted in an error about the PHAR archive:

phar "C:\Program Files (x86)\Zend\ZendServer\bin\PEAR\go-pear.phar" does not have a signature
PHP Warning:  require_once(phar://go-pear.phar/index.php): failed to open stream: phar error: invalid url or non-existent phar "phar://go-pear.phar/index.php" in C:\Program Files (x86)\Zend\ZendServer\bin\PEAR\go-pear.phar on line 1236

To solve that problem, you can either modify the php.ini file, adding the following directive:

phar.require_hash=0

Or you can apply it one time only, setting that configuration option on the command line:

C:\Program Files (x86)\Zend\ZendServer\bin\PEAR> php -d phar.require_hash=0 go-pear.phar

The latter form is preferred, as disabling signature checking globally is considered a security risk.
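To confirm which value is actually in effect, the setting can be read back from a script. This is a small sketch; the function name is made up, and it only reports the current value without changing anything.

```php
<?php
// Report whether the Phar extension currently requires signed archives.
// Returns null when the Phar extension is not loaded at all.
function pharSignatureCheckEnabled()
{
    if (!extension_loaded('phar')) {
        return null;
    }
    // ini_get() returns a string such as "1", "0", "On" or "Off".
    return filter_var(ini_get('phar.require_hash'), FILTER_VALIDATE_BOOLEAN);
}

var_dump(pharSignatureCheckEnabled());
```

Run with `php -d phar.require_hash=0`, the script should report false for that one invocation, while php.ini stays untouched.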

ZendCon 2011 Slides

Here are slides from ZendCon 2011 sessions. I’ll update the list as soon as I find missing slides or they are posted. I’ll try to add UnCon sessions too.

Monday, October 17th

  • PHP Extensions, Why and What?
    Derick Rethans (@derickr)
  • Creating and Using Streams, Filters and Sockets
    Elizabeth Marie Smith (@auroraeosrose)
  • Doctrine 2
    Juozas Kaziukenas (@juokaz)
  • PHP Components From Idea To Maturity
    Stuart Herbert (@stuherbert)
  • Zend PHP 5.3 Certification Boot Camp
    Christian Wenz (@chwenz)
  • Learning CouchDB
    Bradley Holt (@BradleyHolt)
  • Design Patterns in Action
    Stefan Priebsch (@spriebsch)
  • Beware of the Dark Side, Luke!
    Arne Blankerts (@arneblankerts)

Tuesday, October 18th

Wednesday, October 19th

Thursday, October 20th

UnCon

Last updated: 31 Oct

Zend PHP Certification

For some time I’ve been thinking about taking the Zend PHP exam, and the chance to attend the Zend PHP 5.3 Certification tutorial session during the recent PHP/Zend Conference 2010 was the thing that pushed me to finally do it. I enrolled for the conference too late to get a free voucher for the exam, so I bought one myself after coming back home, and on Dec 13th I finally got the ZCE title (yay for me!).

What does it test?

At ZendCon it was said that the PHP 5.3 update of the exam was a huge leap forward in terms of quality and thoroughness of testing. Before this version, passing the exam required more or less memorizing the manual: function names, arguments and return values. As I didn’t take any previous version I can’t comment on this statement, but I know that my set of questions didn’t include many that could be labeled as “manual questions”. I remember one “what is the output of the following code” question that was somewhat related to knowledge of function arguments, but it was enough to know the capabilities of the function rather than “is it ‘f($haystack, $needle)’ or ‘f($needle, $haystack)’”, and another about the name of a configuration option, but a very popular and important one.

I found the exam very interesting, as the questions were not only about “dry” PHP code, but also about broadly defined web-related technologies: databases, web security, web services, the HTTP protocol, etc. Zend’s training department strongly stated that the exam is not only about knowing function names, but about assessing whether the person has what it takes to be a good programmer. And I can confirm that: without proper experience in web development it is very difficult to pass the exam. I mean, you can memorize all the definitions of XSS, CSRF, SQL joins et al., but the questions are not about definitions but about a proper understanding of the concepts, like “would X solve the problem of CSRF”.

Is it difficult?

It depends. Even if you work with PHP applications on a daily basis, it’s still necessary to check the preparation guides. Why? For instance, I’ve been using PHP since PHP3, but I had never needed to use web services. And from the study guides and the tutorial I mentioned before I knew that even if web services are not the most important part of the exam, I needed to know at least something about every part of the test.

Most of the questions are simply “either you know the answer or you don’t”, and that way you can call the test easy. There are also some analytical questions, where you have to force yourself to do some thinking — but if you know how PHP works, it’s only a matter of time to get the proper answer.

The test consists of 70 questions, for which you have 90 minutes – that’s a bit more than a minute per question. It’s not much, but as I’ve said before, for most of the questions either you know the answer or you don’t. That way you can go through the easy questions while marking the others for review (the examination software allows that), without wasting time on things you can’t answer off the top of your head.

Watch out for trick questions. This is especially important with “what does this code print” questions, where one single character can totally change the meaning of the code. If everything seems obvious, it usually isn’t. Look at return values, variable scope, function calls etc.
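As an illustration of how much a single character can matter (a made-up example, not an exam question), the two functions below differ only by the `&` in the parameter list:

```php
<?php
// The array is passed by value: changes stay inside the function.
function addBonusByValue(array $scores)
{
    foreach ($scores as $key => $score) {
        $scores[$key] = $score + 10;
    }
}

// The array is passed by reference: changes reach the caller's variable.
function addBonusByReference(array &$scores)
{
    foreach ($scores as $key => $score) {
        $scores[$key] = $score + 10;
    }
}

$first = array(1, 2);
addBonusByValue($first);      // $first is still array(1, 2)

$second = array(1, 2);
addBonusByReference($second); // $second is now array(11, 12)

print_r($first);
print_r($second);
```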

My strategy for this test was:

  • First pass: easy questions, test questions (generally those which took me just a few seconds)
  • Second pass: difficult, analytical questions
  • Third pass: questions I didn’t know the answer to, but where some educated extrapolation/guessing was possible (questions I could answer only by pure guessing fell into the first category).

With this method it took me about one hour to fill in all the answers.

How to prepare

Zend offers courses preparing for the exam, but in my opinion that’s overkill. If you only need to refresh and organize your knowledge, $1000 for the course (well, minus $195 for the included exam voucher) is a bit too much. If you need more training (e.g. you don’t have enough experience with PHP), this course won’t help, as it is not designed to teach you PHP.

Everything you need to know about PHP to pass the exam can be found in the manual, which is great, but its content is organized in a way that is not very helpful when studying for the exam. After buying the voucher for the exam you should receive a PDF version of the Zend Certification Study Guide, which can help you plan your learning. Generally, I’d advise reading the whole basics section of the manual (variables, data types, control structures, OOP, new features of PHP 5.3 such as LSB and namespaces), and skimming through the index of some elementary functions (string, array) to know what is possible with PHP. Check the list of topics covered by the exam (to be found on the Zend page) and make sure that you know at least the basics of each entry.

One more word about the Study Guide provided by Zend – it’s poor. It’s poor mainly because it’s not finished, and in some places you can find placeholders instead of real information. It’s nice to get anything (especially as there are sample questions, and it’s very important to read them as they give you an idea of what the test looks like), but a year after the test was updated I’d expect something better.

What to focus on?

According to the rules of Zend Certification I can’t leak any questions, but I can give some suggestions. Aside from the obvious, I’d especially recommend reviewing:

  • references
  • streams, contexts etc.
  • XML processing
  • OOP features like inheritance, static methods, LSB
  • PDO

SQL-related questions are quite easy, so if you know how to do select, insert and how inner join works, you’re good.
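Late static binding (LSB), one of the items listed above, is easy to check for yourself. The sketch below uses made-up class names and is not taken from the exam; it only shows the difference between `self` and `static` inside a static method.

```php
<?php
class Model
{
    // self refers to the class where the method is defined.
    public static function create()
    {
        return new self();
    }

    // static refers to the class the method was called on (LSB, PHP 5.3+).
    public static function createLate()
    {
        return new static();
    }
}

class User extends Model
{
}

echo get_class(User::create()), "\n";     // Model
echo get_class(User::createLate()), "\n"; // User
```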

Any benefits?

The value of the certificate itself is debatable. It’s hard to tell whether a potential employer will take that paper into consideration or not. Still, I think it wouldn’t do any harm. It’s proof that you meet a certain level of PHP (and web development in general) skills. Just as an English language certificate will be verified at the first interview, your PHP skills will be verified sooner or later, but if you don’t have the proper entries in your résumé, you might not even be invited to the interview.

A few days ago there was a discussion about that on Twitter, and one of the guys said that having the ZCE title means that you are not creative and that you are wasting time which could otherwise be spent on open source projects. I think it’s a very radical opinion. Having the ZCE doesn’t exclude being involved in OSS projects. Also, OSS projects are a long-term involvement that can’t be compared to one evening spent on a short recap. My opinion is: don’t put all your eggs in one basket. You can’t rely solely on the certificate, just like you can’t rely solely on open source projects. Working on public projects can be beneficial to your skills (but, looking at some of the high-profile projects, it doesn’t have to be), and is a nice point on your CV, but just like with anything, an employer might not give a crap about OSS (and saying “those companies that don’t take open source into consideration are evil” is childish; recruitment procedures in big companies might be far away from the nearest person that knows anything about computers).

Preparing for the exam can be valuable in itself. It’s a motivation to look into subjects one didn’t need before (for me it was PDO – I was using abstraction layers, so I didn’t need it), and to review features and changes one possibly didn’t know about.

I decided to get the certification because I wanted to have some written proof of my skills, as even a list of prior projects does not say anything about the quality of those projects.

Should I re-test if I have PHP4/PHP5 cert?

Well, it’s up to you. If the certificate is for your self-esteem – go ahead. If it’s for improving your position on the market, it’s the same as with the value of the certificate in general: an employer might value you more if you have the newest version of the document, but he doesn’t have to; he might not know the difference between PHP5 and PHP5.3 ;)

Conclusion

In my opinion, if you have some time and two hundred bucks to spare – go ahead :)

[Linux] PHP not working in userdir (public_html)

Today I wanted to give my users the possibility to test their PHP scripts, but without all the fuss of creating virtual hosts for each one of them. My first and obvious choice was userdir – the user creates a public_html directory in his home dir, puts files there, and those files are accessible via the http://servername/~username/ URL. To enable this behavior you only have to enable the userdir module (a2enmod userdir), and remember to set correct permissions on the home directory (chmod +x $HOME) and on public_html (chmod 755 $HOME/public_html). I did this, and everything was working fine, except PHP scripts – the browser wanted to download them instead of displaying the processed content. It turned out that Apache in Debian has PHP disabled for userdirs by default. To enable scripting in this directory, open the file /etc/apache2/mods-enabled/php5.conf, find this piece of configuration:

    <IfModule mod_userdir.c>
        <Directory /home/*/public_html>
            php_admin_value engine Off
        </Directory>
    </IfModule>

and disable it, either by deleting it or by commenting it out (precede each line with a # sign). You can also change the php_admin_value engine setting to On, but if you do that, you will be unable to turn off the PHP engine in .htaccess files. Reload Apache afterwards for the change to take effect.

Defending PHP (or not)

Today I read the article “Defending PHP” by Jim R. Wilson. He begins by saying “Ugh. I am so tired of defending PHP.” And I’m saying “I am so tired of people defending PHP”. Why? First of all, if everything is OK, the language defends itself, and if a lot of people complain about it, maybe something really is wrong with PHP?