Unmounting Samba shares after network disconnect

Recently I had some issues with my network switch, and because of that my home server which has some shares from NAS mounted started to work funky. I could not use the mounted shares, the load skyrocketed because of the IO wait, and I couldn’t unmount the share. After a moment of tinkering it appeared that I needed to add -l flag to mount command. So, to unmount the network share when it’s not accessible you can use this command:

umount -l /path/to/share

This flag means that umount will work in lazy mode, when the filesystem is detached immediately, but all the cleanup will be done when possible.

You can also use the following command to unmount all CIFS/Samba shares.

umount -a -t cifs -l

“Unknown job” errors when installing vmware tools in Ubuntu

Recently I’ve tried to install new vmware tools in Ubuntu 14.04.1 after upgrading VMware Workstation to version 10, and the build process crashed with errors like Unknown job: vmware-tools or Unknown job: vmware-tools-thinprint

After some debugging, I’ve found out that the problem was due to the fact that I was running the build process using sudo -s -E command, which gave me administrator rights, but kept some environment variables that messed initscripts-related commands (service, status and other from upstart package), on which build process relies. The solution is to build vmware-tools under clean environment for root user. To do that, gain superuser rights with sudo su - command, and then build vmware-tools as usual.

The tip above applies to general “unknown job” problem with starting/restarting services, not only to the vmware-tools build process.

Optimizing PHP code

Source codes
Original image with source codes and execution times

Yesterday Frédéric Bisson posted on Twitter an image with comparison of execution time of the same algorithm (Scrabble helper) implemented in Haskell, Python and PHP. The indicated time was respectively 0.6s, 0.3s and 5s. I know that PHP can be slow, but 5s was sluggish even for PHP. I had to do something about it, so I spent some time optimizing the code and checking possible solutions.

I’ve contacted the author of the image, and apart from receiving input data file for the script, I got some information: first of all, he was using PHP 5.3, with XDebug enabled. Those two things alone showed that the initial time measurement is off scale. The author disabled XDebug, and the number went down to 1.7s. Still quite high. I’ve tested it on a stock installation of Ubuntu’s PHP 5.5 (all extensions disabled), and it was about 0.86s. Slowest of those three in comparison, but it was not that “off chart” initial number.

But I wanted to know if maybe there was something terribly slow in the code. I used XDebug in profiling mode to generate cachegrind output. Yes, there were many function calls, but no individual call was exceptionally slow.

2014-08-08_15-18-26

As you can see above, most called function was substr. The function itself is not slow, but it was called over 1.5M times. I knew that function calls are generally quite expensive in PHP, so I wanted to reduce it however I can. First fix was very simple. I’ve replaced

substr($word, 0, 1)

with array access:

$word[0]

That itself reduced substr function call count to less than 1M, and gave about 20% speed boost (current execution time: ~0.69s). I also removed temporary variables used only once, but that gave almost unnoticeable gain. (code of the current version)

2014-08-08_15-19-03

But that was still not enough. Knowing that function calls are expensive, you might thing about rewriting recursive algorithm to an iterative version. That might not be fair while using other algorithm in other languages, but hey, you have to know your language’s strengths and weaknesses. That change was done by the author of the initial code.

This change lowered execution time to ~0.55s, which is 19% gain from the previous iteration, and 36% total gain.

Also, it was possible to rewrite closure to a form of a generator. That part was provided by Filip Górny on PHPers group on Facebook (PHPers is a network of Polish PHP meetups).

This code allowed me to go below 0.5s – lowest run time that I achieved was about 0.49s. Not a big difference from previous version, but nevertheless noticeable. At this point I’ve concluded by toying with the code. I was thinking about looking at opcode level, but I decided that it was too much.

I finished modifying the code, but I was curious if the newest achievements in PHP core could be of any help. I downloaded and compiled the so-called php-ng, a branch of PHP with experimental core modifications focused on performance improvements. As it was to be expected, I didn’t disappoint. The last version of the code under php-ng was running in about 0.235s. Now that was the number that satisfied me.

But I also wanted to be fair and check also the other new player in PHP world – HHVM from Facebook. HHVM has a different paradigm than php-ng – it is a virtual machine with Just-In-Time compiler. But whatever it is, it’s also known to work faster than the vanilla PHP interpreter, and after tests it proved worthy of its opinion. The code was running even faster than php-ng, with some run times even below 0.2s

And that concluded my tests. Below you can find a table with run times for all executors, with 20 consecutive runs for each one.

php-5.5 php-ng hhvm
t1 0.633 0.248 0.262
t2 0.621 0.241 0.249
t3 0.530 0.233 0.203
t4 0.592 0.324 0.204
t5 0.557 0.240 0.202
t6 0.612 0.229 0.294
t7 0.539 0.219 0.202
t8 0.611 0.337 0.203
t9 0.558 0.220 0.204
t10 0.538 0.218 0.235
t11 0.531 0.220 0.228
t12 0.516 0.248 0.206
t13 0.540 0.217 0.202
t14 0.542 0.225 0.205
t15 0.564 0.219 0.237
t16 0.537 0.312 0.204
t17 0.553 0.218 0.207
t18 0.568 0.221 0.223
t19 0.548 0.221 0.258
t20 0.658 0.312 0.201
min 0.516 0.217 0.201
max 0.658 0.337 0.294
avg 0.567 0.246 0.221
median 0.555 0.227 0.206

Now it’s time for some conclusions.

First one is obvious: PHP can be as fast as other languages (or, better, interpreters/virtual machines of the other languages).

Second, know thy language. I know that it might not be very ingenious, but languages differ. Syntax is most obvious difference, but also some languages excel in different fields than the others. Some languages prefer one structure over another. Here it was seen that it’s better to avoid too many function calls, so iterative algorithms are favored over recursive ones. Also, you have to know the ecosystem to know that there are alternative interpreters or virtual machines.

Third and last conclusion is that this kind of benchmark-type comparisons are pointless (only worse are synthetic tests). It’s very easy to omit some small issue and language intricacies, and the results will differ greatly. PHP is not a language that will be chosen for a large scale data analysis, and it won’t be chosen for a reason – there are better languages for that task. Is this a problem? Definitely not. There are no languages that excel in each and every application, and there’s no need for them to. We have a lot of different languages to choose, so let’s choose wisely appropriate language for a given application.

Apache sites not working after upgrade to Ubuntu 13.10

With new Ubuntu 13.10 Apache has new naming scheme for sites configuration files. Because of this your virtual servers might stop working, showing default site, which is “It works!” page, or whatever you have configured for your web server.

To fix the problem, you have to delete all old links in /etc/apache2/sites-enabled (maybe keeping 000-default.conf if you like), rename all site configuration files, residing in /etc/apache2/sites-available, to have .conf postfix, and then enable it again by manually creating a symbolic link to sites-enabled, or using a2ensite tool.

You can also use this simple shell script:

for i in `ls -1 /etc/apache2/sites-available | grep -v -e '.dpkg' -e '.conf$'`
do
   rm /etc/apache2/sites-enabled/$i
   mv /etc/apache2/sites-available/$i /etc/apache2/sites-available/$i.conf
   a2ensite $i
done

That should do the trick. Note that a2ensite accepts only the site name, without .conf postfix.

[ruby] “remove_entry_secure does not work” for /tmp

If you get that kind of error:

ArgumentError (parent directory is world writable, FileUtils#remove_entry_secure does not work; abort: "/tmp/gitosis20121120-26282-1q9qa73" (parent directory mode 40777)):

Check for permissions for /tmp directory. remove_entry_secure accepts world-writable directory in a path only if it’s /tmp, but only if that directory has 1777 permissions.

[PHP] Zend Server, PEAR, and PHAR error

Some time ago I was trying to install PEAR with default installation of Zend Server. To do that, you have to run a following command:

C:\Program Files (x86)\Zend\ZendServer\bin\go-pear.bat

That resulted in a problem with PHAR archive:

phar "C:\Program Files (x86)\Zend\ZendServer\bin\PEAR\go-pear.phar" does not have a signature
PHP Warning:  require_once(phar://go-pear.phar/index.php): failed to open stream: phar error: invalid url or non-existent phar "phar://go-pear.phar/index.php" in C:\Program Files (x86)\Zend\ZendServer\bin\PEAR\go-pear.phar on line 1236

To solve that problem, you have to either modify php.ini file, adding following directive:

phar.require_hash=0

Or, you can use it one-time only, setting that configuration option in command line:

C:\Program Files (x86)\Zend\ZendServer\bin\PEAR> php -d phar.require_hash=0 go-pear.phar

The latter form is preferred, as disabling checking signatures is considered a security flaw.

[Facebook] Obtaining your User ID

Your own Facebook User ID is required in many places, like setting admin ID on your page’s OpenGraph tags. Most of advice available on the web regarding getting your own Facebook ID suggest looking at links on your profile page, but that kind of advice is usually outdated and unusable if you have vanity URL (your username replaced user ID in URLs). You can look into the profile page source and bash through tons of code, and yes, you can find it looking for some strings inside the source code, like it’s pictured on a screenshot. But it’s a tedious job, and there are more elegant ways.

Specifically, you can use OpenGraph API Explorer, available here. Input field is already prepopulated with me value, that obviously tells the Explorer to show data for currently logged in user. But before this could work, you have to authorize API Explorer to access your data by clicking Get access token button, choosing minimal permissions (user_about_me is definitely enough – see this screenshot), and authorizing the application in a standard way, like any other with Facebook.

After granting permissions, Access Token field is populated automatically, and you can clickSubmit button to get information about your user. User ID will be the first field.

When you have your user ID, you can check if it’s a proper one on the same page – just enter your ID instead of me and hit Submit – you should see your basic data.

ZendCon 2011 Slides

Here are slides from ZendCon 2011 sessions. I’ll update the list as soon as I find missing slides or they are posted. I’ll try to add UnCon sessions too.

Monday, October 17th

  • PHP Extensions, Why and What?
    Derick Rethans[twitter]derickr[/twitter]
  • Creating and Using Streams, Filters and Sockets
    Elizabeth Marie Smith[twitter]auroraeosrose[/twitter]
  • Doctrine 2
    Juozas Kaziukenas[twitter]juokaz[/twitter]
  • PHP Components From Idea To Maturity
    Stuart Herbert[twitter]stuherbert[/twitter]
  • Zend PHP 5.3 Certification Boot Camp
    Christian Wenz[twitter]chwenz[/twitter]
  • Learning CouchDB
    Bradley Holt[twitter]BradleyHolt[/twitter]
  • Design Patterns in Action
    Stefan Priebsch[twitter]spriebsch[/twitter]
  • Beware of the Dark Side, Luke!
    Arne Blankerts[twitter]arneblankerts[/twitter]

Tuesday, October 18th

Wednesday, October 19th

Thursday, October 20th

UnCon

Last updated: 31 Oct

[VMWare] Device uses a controller that is not supported

Today I tried to deploy a Virtual Appliance from Bitnami on my ESXi server using ovftool:

ovftool.exe bitnami-xxx.vmx vi://192.168.0.100

But each time I had an error:

Device ‘VirtualDisk’ uses a controller that is not supported. This is a general limitation of the virtual machine’s virtual hardware version on the selected host.

What was funny, it was perfectly fine if I converted the virtual machine to OVA format and started it using… VirtualBox.

I googled a bit and found out that some people had that similar problems and what they did was to upgrade virtual hardware from version 4 to 7. It wasn’t a solution for me because I already had version 7 – so maybe the problem was that I had “too new” virtual hardware? I opened bitnami-xxx.vmx file with a text editor and there I had:

virtualHW.version = "4"

I changed it to “7” and voila – it worked!

[python] libxml2 xpath on child node

Let’s say you have XML like:

<clients>
   <client>
     <name>foo</name>
     <address>...</address>
     <email>...</email>
     <orders>
       <order>
          <id>id1</id>
          <items>...</items>
       </order>
       <order>
          <id>id2</id>
          <items>...</items>
       </order>
   </client>
   <client>
   ...
   </client>
</clients>

And now you’d like to get pairs order_id-client_name. And you’d like to make it in an elegant way, using xPath, not using DOM navigation, or worse, SAX parser. Getting all “client” nodes is easy:

import libxml2

doc = libxml2.parseFile('clients.xml')
ctxt = doc.xpathNewContext()
clients = ctxt.xpathEval('/clients/client')

# clean up nicely
doc.freeDoc()
ctxt.xpathFreeContext()

But now, how to run an xPath query on every node you found to get client name and orders? You have to tell the context object to change the scope of context, so the next query would be relative to the node you chose:

for client in clients:
    ctxt.setContextNode(client)
    client_name = ctxt.xpathEval('name')[0].getContent()
	
    orders = ctxt.xpathEval('orders/order')
    for order in orders:
        ctxt.setContextNode(order)
        orderId = ctxt.xpathEval('id')[0].getContent()
        print orderId+" "+client_name

And that’s it. I’m writing it because documentation for libxml2’s python bindings is scarce, and it took me a while to get to know about setContextNode method.

Complete script:

import libxml2

doc = libxml2.parseFile('clients.xml')
ctxt = doc.xpathNewContext()
clients = ctxt.xpathEval('//client')

for client in clients:
    ctxt.setContextNode(client)
    client_name = ctxt.xpathEval('name')[0].getContent()
	
    orders = ctxt.xpathEval('orders/order')
    for order in orders:
        ctxt.setContextNode(order)
        orderId = ctxt.xpathEval('id')[0].getContent()
        print orderId+" "+client_name

# clean up nicely
doc.freeDoc()
ctxt.xpathFreeContext()