Memory Exhaust Issue PHP and APM

A while back I wrote a small data aggregator that does some background processing and populates a database with its calculated results. The main reason to implement this was that running the queries in real time on the production server was causing memory exhaustion. Once the new feature was done and deployed, we quickly noticed we were running into the same kind of issue again. The script would halt almost every time it was executed.

Fatal error: Allowed memory size of xxxxxxxxx bytes exhausted (tried to allocate xxxxx bytes) in

After some hacking and optimization tricks, the script’s memory consumption seemed stable enough to deploy it to production. However, each time we deployed it we were hit by memory exhaustion almost instantly. Time to do some debugging and code tracing.

We were able to narrow it down to a single mysql_* call (don’t get me started on the mysql_* functions, so let’s ignore that for now), namely mysql_fetch_field. Why on earth would this exhaust the memory on the server, you might think. And that’s exactly what we were thinking.

The problem was that the offset passed in as the field offset was incorrect, which resulted in a notice. These notices, however, were suppressed with the @ symbol and therefore never noticeable during development. But that’s no reason for memory exhaustion… right?

@mysql_fetch_field($result, $offset);

Right! Further inspection showed that in production PHP’s APM extension was loaded and active, and this was the real problem behind the memory exhaustion. Thousands of queries were executed, and almost all of them were throwing an error notice for the fetch field function. The @ symbol hid those notices from us, but APM was still logging them and apparently had a very hard time keeping up.

It looks like a bug in APM, but that’s another story. The issue was fixed by passing in the correct offset, of course. Another nice example of logging bringing down a production server.
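Why a silenced notice can still cost something is easy to show in userland. The sketch below is not how APM works internally (that is an assumption I can't verify here). It only uses a plain error handler as a hypothetical stand-in for a monitoring extension, and shows that errors silenced with @ still reach such a handler.

```php
<?php
// Hypothetical stand-in for a monitoring extension: count every error
// PHP raises, including the ones silenced with @.
$seen = 0;
set_error_handler(function ($errno, $errstr) use (&$seen) {
    $seen++;      // still invoked for @-suppressed errors
    return true;  // tell PHP the error has been handled
});

for ($i = 0; $i < 1000; $i++) {
    // Silenced, so nothing is displayed, but the error is still raised
    @trigger_error('bad field offset', E_USER_NOTICE);
}

restore_error_handler();
echo $seen, " suppressed notices still reached the handler\n";
```

So the @ operator only hides the symptom. Anything that hooks into error handling still has to deal with every single notice.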

PHP 5.5.0 Released

PHP reached a new milestone 3 days ago, and as always it’s quite exciting news. Some of the new features are already outlined in Evert Pot’s post, which you can find here. And all of this can of course be found on the wiki and in the change log.

I’ll just outline some of the new and exciting features, deprecation notices and removed functions.

Simplified password hashing API

Hashing passwords with md5() and sha1() just isn’t enough anymore, so a new, secure and easy way to hash passwords has been added. Besides hashing, the API includes some other useful functions.

password_hash($password, $algorithm, $options)

$password The password string to hash

$algorithm The hashing algorithm to use, for which two constants are available at the moment

PASSWORD_DEFAULT which currently uses bcrypt
PASSWORD_BCRYPT which always uses bcrypt (the Blowfish-based crypt() algorithm)

$options makes it possible to add a salt or set the cost for the hashing algorithm

Hash password with default algorithm: password_hash()
$password = 'test-password';
$hash = password_hash($password, PASSWORD_DEFAULT);
var_dump($hash);
string(60) "$2y$10$qGv1q5nT4F7HCtKSPPME2usrdJRcRpk9lEUMQsE8mqyDIy3fbJ4I."

Hash password with BLOWFISH algorithm: password_hash()
$hash = password_hash($password, PASSWORD_BCRYPT);
var_dump($hash);
string(60) "$2y$10$XtpNO/tFjtkq4u3ghcpqXeSwbHZxDQDTXRHfWBnZsmowUVl/MQys2"

Hash password with BLOWFISH algorithm and options: password_hash()
// A custom salt is optional; when it is omitted password_hash() generates one itself
$salt = mcrypt_create_iv(22, MCRYPT_DEV_URANDOM);
$hash = password_hash($password, PASSWORD_BCRYPT, array("cost" => 14, "salt" => $salt));
var_dump($hash);
string(60) "$2y$14$6ZtnYJ0CyqCUx.vJu3MZEuUGgIN.ryxMa0Yh8BnCrbBDVnd3Me30i"

password_verify() checks whether a password matches a hash. It returns true if they match and false if they don’t.

Verify hash: password_verify()
$hash = password_hash($password, PASSWORD_DEFAULT);

if (password_verify($password, $hash)) {
  echo 'Password is correct';
} else {
  echo 'Password is incorrect';
}
Password is correct

Get information about a valid hash created with password_hash(). The function returns an array with the used algorithm and options.

Retrieve $hash information: password_get_info()
$info = password_get_info($hash);
var_dump($info);
array(3) {
  ["algo"]=>
  int(1)
  ["algoName"]=>
  string(6) "bcrypt"
  ["options"]=>
  array(1) {
    ["cost"]=>
    int(10)
  }
}

Check if the supplied hash was generated with the provided algorithm and options. This comes in handy when the hash needs to be updated, for example after changing the cost.

Check if a $hash needs to be rehashed: password_needs_rehash()
if (password_needs_rehash($hash, PASSWORD_BCRYPT, array('cost' => 8))) {
  // Update the password hash
}
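
The verify and rehash functions are usually combined at login time, because that is the only moment the plain-text password is available. A minimal sketch of that flow follows. The checkLogin() helper and its return shape are made up for this example, and the low cost values are only there to keep it fast.

```php
<?php
// Hypothetical helper: verify a login and report whether the stored hash
// should be replaced. Returns array(bool $ok, string|null $newHash).
function checkLogin($password, $storedHash, array $options)
{
    if (!password_verify($password, $storedHash)) {
        return array(false, null);
    }

    $newHash = null;
    if (password_needs_rehash($storedHash, PASSWORD_BCRYPT, $options)) {
        // The options changed since the hash was stored, so rehash now
        $newHash = password_hash($password, PASSWORD_BCRYPT, $options);
    }

    return array(true, $newHash);
}

// Stored with cost 4, current policy says cost 5
$stored = password_hash('test-password', PASSWORD_BCRYPT, array('cost' => 4));
list($ok, $newHash) = checkLogin('test-password', $stored, array('cost' => 5));
var_dump($ok, $newHash !== null);
```

When $newHash is not null the caller would store it in place of the old hash.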

Support for constant array/string dereferencing

If you work with objects you might have worked with object dereferencing. This is used to chain method calls, as in so-called fluent interfaces.

Dereference object call
$obj->method()->returnObjMethod();

Now the same is possible for array and string literals

Dereference strings / array
echo array(1, 2, 3)[0]; // output 1
echo "foobar"[2]; // output o
echo "foobar"["foo"][0]; // output f ("foo" is coerced to offset 0, which raises an illegal string offset warning in 5.5)

echo [1,3,4][2]; // output 4

Class Name Resolution As Scalar Via “class” Keyword

An easy way to get the fully qualified class name

Class name resolution via ::class
// HashBuilder.php
namespace Vodka\Crypt;

class HashBuilder {}

// In another file
use Vodka\Crypt\HashBuilder;

var_dump(HashBuilder::class);
string(23) "Vodka\Crypt\HashBuilder"

Support for using empty() on the result of function calls and other expressions

Normally empty() and isset() could only be used on variables. In 5.5 empty() also accepts arbitrary expressions, such as the return value of a function call. isset() still only works on variables.

5.3

Call empty() with closure
var_dump( empty(function() {}) );

PHP Parse error: syntax error, unexpected T_FUNCTION in

Call empty() with function return value
        
function foo() {}
var_dump(empty(foo()));

PHP Fatal error: Can’t use function return value in write context in

5.5

Call empty() with function return value
function foo($val) { 
  return $val; 
}

var_dump( empty(foo([])) );
var_dump( empty(foo(true)) );
bool(true)
bool(false)

Support for list in foreach

list() is now supported in foreach loops, which is great for eliminating temporary variables. Note that list() assigns by numeric index, so the nested arrays have to be numerically indexed.

Support for list in foreach
$messages = array(
  array(1, 'test-1', 12),
  array(2, 'test-2', 12),
  array(3, 'test-3', 10)
);

// Before
foreach ($messages as $message) {
  list($id, $body) = $message;
}

// After
foreach ($messages as list($id, $body)) {}

Zend Opcache extension and enable building it by default

My short post about Zend Optimizer+ in February this year kind of slipped my mind, and I was somehow under the impression APC would be integrated. But this of course has to be Zend Optimizer+. Finally an opcode cache is available by default, and it’s configurable from php.ini.

[opcache]
; Determines if Zend OPCache is enabled
opcache.enable=0
opcache.enable_cli=0

; The OPcache shared memory storage size.
opcache.memory_consumption=64

; The amount of memory for interned strings in Mbytes.
opcache.interned_strings_buffer=4

; Max files in OPCode cache, use a number between 200 and 100000.
opcache.max_accelerated_files=2000

; The maximum percentage of “wasted” memory until a restart is scheduled.
opcache.max_wasted_percentage=5

; Append current working dir to script name
opcache.use_cwd=1

; How often (in seconds) a file should be validated
opcache.revalidate_freq=2

; Enables or disables file search in include_path optimization
opcache.revalidate_path=0

; If disabled, all PHPDoc comments are dropped from the code
opcache.save_comments=1
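
After changing these settings it helps to check whether the cache is actually active for the current SAPI. A minimal sketch is shown below. opcacheSummary() is a hypothetical helper name, and the array keys are the ones I expect opcache_get_status() to return, so treat them as an assumption for your version.

```php
<?php
// Hypothetical helper: describe the OPcache state in one line.
function opcacheSummary()
{
    if (!function_exists('opcache_get_status')) {
        return 'OPcache extension is not loaded';
    }

    // Pass false to skip the per-script details
    $status = opcache_get_status(false);
    if (!is_array($status) || empty($status['opcache_enabled'])) {
        return 'OPcache is loaded but not enabled for this SAPI';
    }

    $scripts = isset($status['opcache_statistics']['num_cached_scripts'])
        ? $status['opcache_statistics']['num_cached_scripts']
        : 0;

    return sprintf('OPcache is enabled, %d scripts cached', $scripts);
}

echo opcacheSummary(), "\n";
```

On the command line this will usually report that the cache is not enabled, unless opcache.enable_cli is switched on.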

array_column function which returns a column in a multidimensional array

Fetching a column from a multi-dimensional array is now possible with a single function call.

Fetch a column from a multi-dimensional array: array_column()
$nestedArray = array(
  array('id' => 1, 'body' => 'test-1', 'code' => 12),
  array('id' => 2, 'body' => 'test-2', 'code' => 12),
  array('id' => 3, 'body' => 'test-3', 'code' => 10)
);
$columns = array_column($nestedArray, 'code');
print_r($columns);
Array
(
  [0] => 12
  [1] => 12
  [2] => 10
)

Or fetch the code column indexed by id

Fetch a column from a multi-dimensional array indexed by another field: array_column()
$columns = array_column($nestedArray, 'code', 'id');
print_r($columns);
Array
(
  [1] => 12
  [2] => 12
  [3] => 10
)
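
To see what array_column() replaces, here is a rough userland equivalent of the kind you would write before 5.5. arrayColumnFallback() is a made-up name, and the sketch only covers the two simple cases shown above, not every edge case of the real function.

```php
<?php
// Hypothetical fallback for PHP < 5.5; covers only the simple cases.
function arrayColumnFallback(array $rows, $columnKey, $indexKey = null)
{
    $result = array();
    foreach ($rows as $row) {
        if (!array_key_exists($columnKey, $row)) {
            continue; // rows without the requested column are skipped
        }
        if ($indexKey !== null && array_key_exists($indexKey, $row)) {
            $result[$row[$indexKey]] = $row[$columnKey];
        } else {
            $result[] = $row[$columnKey];
        }
    }
    return $result;
}

$nestedArray = array(
    array('id' => 1, 'body' => 'test-1', 'code' => 12),
    array('id' => 2, 'body' => 'test-2', 'code' => 12),
    array('id' => 3, 'body' => 'test-3', 'code' => 10)
);

print_r(arrayColumnFallback($nestedArray, 'code'));
print_r(arrayColumnFallback($nestedArray, 'code', 'id'));
```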

deprecated

The following mcrypt functions have been deprecated and will now throw E_DEPRECATED: mcrypt_ecb(), mcrypt_cbc(), mcrypt_cfb() and mcrypt_ofb().

The mysql extension has finally been deprecated. Deprecation warnings will be generated when connections are established to databases via mysql_connect() or mysql_pconnect(). Use the MySQLi or PDO_MySQL extensions instead.

removed

The following (not so useful) functions have been removed from the core: php_logo_guid(), php_egg_logo_guid(), php_real_logo_guid() and zend_logo_guid(). And support for the ancient operating systems Windows XP and 2003 has been dropped!

Install 5.5 on Ubuntu (experimental)

If you want to experience the new version first hand and you work on Ubuntu, you can add the experimental PPA and give it a shot.

sudo add-apt-repository ppa:ondrej/php5-experimental
sudo apt-get update
sudo apt-get install php5

Casting Weirdness With PHP

When my coworker asked today if he could cast an array to an object, I couldn’t really answer the question. Don’t think I’ve ever done that. So let’s try, right?

Cast array to object
$arr = array('foo' => 'baz');
$obj = (object) $arr;

var_dump($obj);
object(stdClass)#1 (1) {
  ["foo"]=>
  string(3) "baz"
}

Ha that’s cool. It actually works. But wait. What happens when we use a numeric index?

Cast array to object
$arr = array(0 => 'bar', 'foo' => 'baz');
$obj = (object) $arr;

var_dump($obj);
object(stdClass)#1 (2) {
  [0]=>
  string(3) "bar"
  ["foo"]=>
  string(3) "baz"
}

WTF? We just created $obj->0, which should not be allowed in PHP as far as I know. So let’s make sure I am not mistaken.

Assign value to numeric class property
$foo = new stdClass();
$foo->0 = 'bar';

PHP Parse error: syntax error, unexpected ‘0’ (T_LNUMBER), expecting identifier (T_STRING) or variable (T_VARIABLE) or ‘{’ or ‘$’ in foo.php on line 5

But casting the array didn’t complain about a thing. Can we access this property? Well, not by calling it directly. At least not that I know of. Looping over the object’s properties does give us the key, but reading the property through that key still fails.

Loop object properties
// Parse error
// var_dump($obj->0);

foreach ($obj as $key => $val) {
  var_dump($key);
  var_dump($obj->$key);
}
int(0)
PHP Notice:  Undefined property: stdClass::$0 in foo.php on line 14
NULL
string(3) "foo"
string(3) "baz"

It’s not that I was planning on ever using this, or advising other people to use it. On the contrary. But I guess it’s not completely useless, as long as you take care when doing so.
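
If the value behind the numeric key is needed after all, casting the object back to an array is one way to reach it. This is only a small sketch of that round trip. As far as I know, newer PHP versions (7.2 and later) also normalise the keys during the cast so that $obj->{'0'} works, but the array cast below does not depend on that.

```php
<?php
$obj = (object) array(0 => 'bar', 'foo' => 'baz');

// Casting back to an array makes the numeric key reachable again
$back = (array) $obj;

var_dump($back[0]);     // string(3) "bar"
var_dump($back['foo']); // string(3) "baz"
```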

Zend Optimizer+ Integrated Into 5.5

It looks like PHP is going to get another built-in performance boost. The plan is to integrate and open source the Zend Optimizer+ component, the fastest opcode cache out there. Zend has provided the source, which is already available on GitHub. So we get the opcode cache to boost our applications even more, and the community can extend and build upon the available source code. Win! Win!

PHP 5+ has come a long way performance-wise without the Optimizer component. But being able to apply an opcode cache out of the box is a great addition. And so far the benchmarks look promising. The Optimizer+ component beats APC hands down in all performed tests.

Read more about Opcode caching and the integration process here.

PHPness Serious WTF

It’s been a while since I last blogged, and I don’t really have anything new to offer. But I would like to comment on a current event. There has been some commotion about a free T-shirt that was distributed at the SunshinePHP conference this year.

The first time I saw the shirt I could not make out what was so offensive that people would start blogging about it. Somebody had to point it out for me. So what’s the deal here? Apparently the shirt’s slogan was intentionally created with a slight sexual undertone. The idea comes from the “enlarge your penis” spam we all know!

Some people have perceived this as being sexist. And even go so far as to personally attack the people behind the shirt.

Personally I don’t think the shirt is sexist or offensive at all, and I don’t really understand all the fuss. Sex has never been an issue for me in the PHP community. I learn just as much from female developers as I do from the male ones. We are all equal. People that feel the need to defend the opposite sex are probably the ones that see a difference. Even so! Why make such a big deal of something so small? I have never seen anybody blog about the “programming with attitude” poster.

I think the PHP community has just lost a bit of its shine and coolness. I’ve always been under the impression the PHP community consisted of free-thinking people that do what they love: develop stuff. And that’s exactly what attracted me to it 12 years ago in the first place.

The community has grown quite a lot in that period of time. The thing with every community is that it brings forth vocal people, which is a great thing in general. But it’s not so good when people forget their responsibility towards the rest of the community. And that’s exactly what happened the last few days. Personal issues have become public. To me it seems many of you have missed the boat completely. You don’t have to like the shirt. But there is absolutely no reason to bash the initiative in public.

Here are some of the articles that spawned over the last couple of days:

PHPness Gate – raising interesting issues

On Public Outrage And Bad Actors

Sexism in the PHP Community

On Sexism/Racism/Any-other-ism and the PHP Community

Sexism and PHP

Hacker news - Sexism and PHP (calevans.com)

Web & PHP sexist tshirt debate – my 2 cents

On Equality, Sexism and an Even Hand

So let’s all just respect one another. And keep the PHP community a friendly place. This kind of nonsense has no positive impact on anybody.

Publish Google Reader Shared Items

A lot of people are using the shared items feature in Google Reader to publish what they are reading on a blog. It’s like having a live blogroll widget on your website. And gives your readers a good impression of what you are actually interested in.

I never bothered adding this to my blog. But I do like to share. So last night I was trying to figure out how to integrate this with my current blog. I was hoping for an easy implementation, but the information is scarce, so I had to do some digging. I did find a couple of WordPress plugins, but most of them had not been updated for at least a year. And I am not putting code like that live on my website.

After some Googling I came across a couple of feed URLs that seem to expose shared items in feed form, using the Google Reader URL and a user id. So the URL for my shared items would look like the one below. Getting your user id is easy, by the way. Go to your Google Reader page, click All items and the UID will show up in the address bar.

http://www.google.com/reader/shared/16525759780220726764

The problem with this, however, is that none of my shared posts show up on this page. And I have not figured out a way to populate it just yet. So I came up with another path to get this data on my server.

In the Google Reader pages it’s possible to add sharing functionality. This is done by going to the Send To tab.

From here it’s possible to select a service to share data with. None of these services are any good for what I am trying to do. But the great thing about this page is that you can provide custom URLs for sharing data. Just incorporate the specified parameters in the URL and you’re done.

Once configured and saved, a custom URL is visible in the Send To panel.

That’s it for this part. Actually sharing items is quite easy now. Back in Google Reader there is a Send to link under each post. If you click this link the newly created custom URL will show up, and you can share the post.

The only thing left to do is write some code to process the incoming data. To get you started, some of the code I used while testing this is posted below.

GoogleReaderShare class
class GoogleReaderShare
{
  const GOOGLE_REMOTE_IP = '95.97.54.3';

  const GOOGLE_REFERER = 'http://www.google.nl/reader/view/';

  protected static $_whitelist = array('source', 'title', 'url', 'short');

  // Only accept requests that come from the expected address and referer
  public static function CheckSource($remoteAddr, $referer)
  {
    return ($remoteAddr == self::GOOGLE_REMOTE_IP)
        && ($referer == self::GOOGLE_REFERER);
  }

  // Reject empty input and any parameter that is not whitelisted
  protected static function _CheckIncomingData($data)
  {
    if (!is_array($data) || empty($data)) {
      return false;
    }

    foreach ($data as $key => $value) {
      if (!in_array($key, self::$_whitelist)) {
        return false;
      }
    }
    return true;
  }

  public static function Process($data)
  {
    if (!self::_CheckIncomingData($data)) {
      throw new Exception("Unrecognized or no incoming data");
    }

    // process the data
  }
}

// HTTP_REFERER is not always sent, so fall back to an empty string
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
if (GoogleReaderShare::CheckSource($_SERVER['REMOTE_ADDR'], $referer)) {
  GoogleReaderShare::Process($_GET);
}
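
The "process the data" step is left open above. One possible first step, sketched here under my own assumptions, is to drop everything that is not whitelisted instead of rejecting the whole request, and to trim what remains. filterShareData() is a hypothetical helper and not part of the class above.

```php
<?php
// Hypothetical helper: keep only whitelisted string values and trim them.
function filterShareData(array $data, array $whitelist)
{
    // Keep the entries whose key appears in the whitelist
    $clean = array_intersect_key($data, array_flip($whitelist));

    // Ignore nested arrays such as ?title[]=x, then trim the rest
    $clean = array_filter($clean, 'is_string');

    return array_map('trim', $clean);
}

$incoming = array(
    'title' => ' Some post ',
    'url'   => 'http://example.com/post',
    'evil'  => 'not on the list'
);

print_r(filterShareData($incoming, array('source', 'title', 'url', 'short')));
```

Whether to drop unknown parameters or refuse the request entirely is a design choice. The class above takes the stricter route.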

Long Running PHP Script and MySQL Server Gone Away

At the moment I am writing some workers that interact with a beanstalk server. When data gets pushed into the beanstalk server, my workers are triggered to process this data. So the workers sit idle most of the time and just wait for some data to process. No rocket science there. But during testing I kept running into MySQL issues. The script seemed to lose its connection to the MySQL server when it was idle for longer than a minute, and would respond with a

Warning: mysql_query(): MySQL server has gone away in /path/to/some/mysql4/class.php on line xxx

So what’s going on here? Well, actually it’s quite simple. When my script initializes the database connection it doesn’t use it right away. It just sits there waiting for incoming data. Once received, it will process the data and try to store it in the database. But when the waiting period exceeds the time the MySQL server keeps an idle connection open, the next query fails with the warning mentioned above. In older versions of MySQL this would not be a big issue, as the client would just initiate a reconnect when the connection was lost. But from MySQL 5.0.3 and up this functionality has been disabled by default. For MySQLi this is no problem at all, since it can be switched back on in php.ini.

MySQLi (change from Off to On):

mysqli.reconnect = On

Unfortunately I am working with PHP’s core mysql_* functions (don’t get me started) and there doesn’t seem to be an easy way to resolve this. According to the MySQL documentation, the following C API call

mysql_options(&mysql, MYSQL_OPT_RECONNECT, &reconnect);

should do the trick. But passing MYSQL_OPT_RECONNECT in any way to mysql_connect() didn’t give me the result I was looking for. So what now? Porting the database code to the newer and better PDO or MySQLi is not an option, as it would consume way too much time. Fortunately the mysql_* core functions come with mysql_ping(). I never had to use it before, but in this case it comes in quite handy.

From the PHP manual:

bool mysql_ping ([ resource $link_identifier = NULL ] )

Checks whether or not the connection to the server is working. If it has gone down, an automatic reconnection is attempted. This function can be used by scripts that remain idle for a long while, to check whether or not the server has closed the connection and reconnect if necessary.

Note: Automatic reconnection is disabled by default in versions of MySQL >= 5.0.3.

Adding the mysql_ping() call was rather straightforward and didn’t require all that much work. I extended the database class with a ping method that simply throws an exception when it fails to reconnect.

MySQL ping function
public function ping()
{
  if (!mysql_ping($this->db_connect_id)) {
    throw new Mollie_Database_Exception("Connection was lost");
  }        
  return true;
}

After that I started poking around the userland implementation. The worker runs inside a while (true) loop. A first test with ->ping() being called inside this loop proved to resolve the issue at hand. But running the ping function that often is overkill, and who knows, it might actually result in a DoS of the database server. So I decided to ping the server once a minute or so.

Keep the connection alive
$start = time();
while (true)
{
  $this->_keepConnectionAlive($start);
  // ... wait for and process incoming data here
}

And the _keepConnectionAlive method looks something like this

_keepConnectionAlive method
protected function _keepConnectionAlive(&$start)
{
  $passed = (time() - $start);
  if ($passed > 60)
  {
    $start = time();
    $this->_db->ping();
  }
}
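
The same interval check can be tried out without a database. The sketch below is not the worker's actual code. KeepAlive is a hypothetical class that takes the ping as a callback and the clock value as an argument, so the once-a-minute behaviour can be verified in isolation.

```php
<?php
// Hypothetical helper: calls $ping at most once per $interval seconds.
class KeepAlive
{
    private $ping;
    private $interval;
    private $last;

    public function __construct($ping, $interval, $now)
    {
        $this->ping = $ping;
        $this->interval = $interval;
        $this->last = $now;
    }

    // Returns true when a ping was sent, false when it was too soon
    public function tick($now)
    {
        if ($now - $this->last <= $this->interval) {
            return false;
        }
        $this->last = $now;
        call_user_func($this->ping);
        return true;
    }
}

$pings = 0;
$keepAlive = new KeepAlive(function () use (&$pings) { $pings++; }, 60, 0);

$keepAlive->tick(30);  // 30 seconds passed, no ping
$keepAlive->tick(61);  // more than 60 seconds passed, ping
$keepAlive->tick(90);  // only 29 seconds since the last ping, no ping
$keepAlive->tick(122); // 61 seconds since the last ping, ping

echo $pings, "\n"; // 2
```

In the real worker the callback would wrap the ping() method shown earlier and tick() would receive time().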

I’m not a big fan of this solution and would rather be using the MySQLi functions. But this works well and will do for now.

Git: Rebase on Pull by Default

I was a bit surprised today when git presented me with a merge message window after I did a pull. This should not happen, as the normal behavior here should be to rebase the changes. But apparently that didn’t happen this time. When I asked one of the other devs what the issue could be, we quickly figured out I was just missing a config entry in .git/config. This probably happened some time ago when I did a fresh checkout.

So to make sure rebasing is done by default, you can run a simple git command or modify the .git/config file manually.

In git >= 1.7.9

$ git config --global pull.rebase true

In git < 1.7.9 (this only applies to branches created after setting it)

$ git config branch.autosetuprebase always

or do

$ vi .git/config

Make sure [branch "master"] has rebase set to true. It should look like the snippet below.

[branch "master"]
remote = origin
merge = refs/heads/master
rebase = true

Starting to really like git. A couple of more quirks and things will be running fine.

Workaround Ubuntu 12.04 Mysterious System Freezes

I have two machines running Ubuntu 12.04. One of them is very stable and hardly ever gets rebooted. The other machine displays some odd behavior every now and then. And by odd behavior I mean it just completely freezes up. The only thing functioning at that moment is the mouse.

I use this machine daily, and I can’t afford to lose work due to system crashes. I could spend numerous hours trying to figure out what’s going on, but that’s probably better left to the Ubuntu devs themselves. Besides that, there are plenty of bug reports floating around that describe this behavior. And one of those posts seemed to resolve the issue for me.

Apparently the bug that causes these crashes is fixed in 12.10, but the changes will only be backported when 12.10 is released. So that leaves me in quite a pickle. According to the thread, upgrading the kernel should do the trick. So that’s exactly what I did. The kernel packages I used can be found here: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-precise/

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-precise/linux-headers-3.4.0-030400_3.4.0-030400.201205210521_all.deb
$ sudo dpkg -i linux-headers-3.4.0-030400_3.4.0-030400.201205210521_all.deb

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-precise/linux-headers-3.4.0-030400-generic_3.4.0-030400.201205210521_amd64.deb
$ sudo dpkg -i linux-headers-3.4.0-030400-generic_3.4.0-030400.201205210521_amd64.deb

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-precise/linux-image-3.4.0-030400-generic_3.4.0-030400.201205210521_amd64.deb
$ sudo dpkg -i linux-image-3.4.0-030400-generic_3.4.0-030400.201205210521_amd64.deb

This comes with a downside, of course. All modules compiled for the current kernel need to be recompiled. I haven’t figured out how to rebuild all of them at once, so I just ran the command below for VirtualBox and the NVidia drivers.

$ sudo dpkg-reconfigure package-name

It’s probably a better idea to keep the stable kernel for now. But if system crashes are really bugging you, then this might resolve the issue. Just be careful.

Fixing Character Replacement in Wordpress

This has been bugging me for a while, but not enough to actually look into it. And Google searches for “WordPress display wrong characters” result in a whole forest of threads about UTF-8 encoding. This has nothing to do with that. So what’s the issue here?

WordPress is replacing characters in my posts. The most annoying being the replacement of -- with –, which makes no sense at all, especially when creating code samples or command-line parameters. My thought is: why on earth would you do that? But it probably has something to do with typography…

Anyways. Searching for it today made me stumble onto this post from 2007 by Paul Betts. And believe it or not, WordPress is still doing that. The code has changed slightly though. So I created a quick patch as a temporary solution, and I need to figure out if this can be circumvented by a plugin or something. If that doesn’t exist already :)

The patch will replace the following lines

Original replacement arrays
$static_characters = array_merge( array( '---', ' -- ', '--', ' - ', 'xn&#8211;', '...', '``', '\'\'', ' (tm)' ), $cockney );
$static_replacements = array_merge( array( $em_dash, ' ' . $em_dash . ' ', $en_dash, ' ' . $en_dash . ' ', 'xn--', '&#8230;', $opening_quote,
  $closing_quote, ' &#8482;' ), $cockneyreplace
);

With

Updated replacement arrays
$static_characters = $cockney;
$static_replacements = $cockneyreplace;
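
To see why this matters for code samples, the replacement step can be imitated outside WordPress. The sketch below is a simplified stand-in that assumes the static arrays are applied with a plain str_replace(). The variable names and the reduced arrays are mine, not WordPress code.

```php
<?php
// Simplified stand-in for the static replacement step
$staticCharacters = array('---', ' -- ', '--');
$staticReplacements = array(
    "\xE2\x80\x94",   // --- becomes an em dash
    " \xE2\x80\x94 ", // a spaced -- becomes a spaced em dash
    "\xE2\x80\x93"    // a plain -- becomes an en dash
);

$command = 'git config --global pull.rebase true';
$mangled = str_replace($staticCharacters, $staticReplacements, $command);

var_dump($mangled === $command);            // bool(false)
var_dump(strpos($mangled, '--') === false); // bool(true), the double hyphen is gone
```

The result still looks like a command but can no longer be pasted into a shell. In a plugin, something like remove_filter('the_content', 'wptexturize') is the commonly suggested route instead of patching core, but treat that as an assumption to verify against your WordPress version.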
Copyright © 2015 - Thijs Lensselink - Powered by Octopress