PHP5 Filesize Limit on 32-bit System

So we have a PHP based importer script that does some heavy duty media processing at the office. And i had to import some new media today. But for some reason a couple of files weren’t picked up without a message. So i cleaned up the upload folder. The only files left were the files not being processed. And when i started the importer. The result was.

Importer found (0) files to import!

Hmmm. That’s not right. So i had a look at the code behind the importer. Which basically is a loop using a DirectoryIterator object. And some var_dump calls revealed the issue. For some reason ->isFile() was returning (false) for regular files. WTF! Let’s test that on the command line.

$ php -r “var_dump(is_file(‘/some/file.ext’));”; bool(false)

Ok so we have an issue here. How big are these files really. A inspection revealed they are all over 2GB. Maybe some 32 bit issue? As the platform the code is running on is a 32 bit server. So i asked my colleagues, Googled a bit and read through php.net. To find out that there is an issue with PHP and files larger then 2GB.

https://bugs.php.net/bug.php?id=27792 https://bugs.php.net/bug.php?id=48886 http://nl.php.net/manual/en/function.filesize.php

Those however all seem related to filesize. The filesize function manual page even has a note about it. Maybe it’s related?

Note: Because PHP’s integer type is signed and many platforms use 32bit integers, filesize() ** may return unexpected results for files which are larger than **2GB. For files between 2GB and 4GB in size this can usually be overcome by using sprintf(“%u”, filesize($file)).

But i can’t apply that patch on a production server. So i came up with a simple solution for now. I extended the DirectoryIterator class and have overwritten the isFile method. Which works for now (don’t think this will work on windows).

Class MyDirectoryIterator extends DirectoryIterator {
  public function isFile() {
    return (integer) exec("[ -f {$this->getPathname()} ] && echo 1 || echo 0");
  }
}

Convinced it was a 32 bit issue. I came home later that day. And wanted to try it out on my own desktop. That is a 32 bit system and runs Ubuntu 11.04. To my surprise the result was different then i expected.

  • $ php -r “var_dump(is_file(‘/some/file.ext’));”;
  • bool(true)

I used the same files as before. And tested some more big files. But the result was the same. Weird. Let’s try some other 32 bit machines.

Ubuntu 11.04: bool(true)

CentOS release 5.6 (Final): bool(false) Debian 6.0.2 (squeeze): bool(false)

Only my desktop at home seems to have a good result. Ubuntu must have some patch somewhere to fix this issue? To confirm i compiled PHP 5.3.8 from source. And did the same test again on Ubuntu 11.04. And this time it was (false).

  • $ php -r “var_dump(is_file(‘/some/file.ext’));”;
  • bool(false)

I am not really in the mood to search the Ubuntu changelog. And for now the work around will do. But i really would like to know what patch is applied to resolve the issue.

[ update ]

While applying the patch for the is_file issue. I was confronted with the fact that way more function calls cause issues. So while waiting for PHP to get patched i had to create some workarounds for the time being.

Getting the filesize
(integer) exec("stat -c%s {$file->getFilename()}");
Calculate a MD5 checksum
$md5 = exec("md5sum {$file->getFilename()}");
$expl = explode('\t', $md5);
return (string) $expl[0];
Calculate the CRC32 checksum
$hash = exec("cksum {$this->path}");
$expl = explode(' ', $hash);
return $expl[0];
Get the modified time
$stat = explode('.', exec("stat -c%y {$this->path}"));
$timestamp = strtotime($stat[0]);
return $timestamp;

Hopefully that will do for now. On a side note the issue is solvable by setting certain CFLAGS when compiling PHP. I have no idea what the impact of that will be on the PHP binary. But it does seem to solve the issue. Not sure how one would apply that when PHP is installed from the distro’s repository though.

CFLAGS=”-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64” ./configure

comments powered by Disqus