Parsing XML CDATA With SimpleXML

I was quite amazed today. When loading a SOAP response into a SimpleXMLElement, I noticed some fields were left blank. I should have checked the SOAP response first, but instead I told our Delphi guy that the response was not filled correctly. This was not the case :)

When we both saw that the SOAP response was in perfect shape, we started to poke around on the PHP side. The strange thing was that all tag names were taken from the response correctly; it was just the data that was missing. Then we noticed that the missing data was inside CDATA tags.

At first glance, the PHP manual didn’t make it clear what was going on, so I did some googling and found a good post by David Coallier that solved the problem. His example showed how to pass an extra LIBXML option to the simplexml_load_string function. Although David provided the solution, I still wanted to write this post. Maybe it will help somebody.

// parsing with CDATA tags using the *_load_string method
$xml = simplexml_load_string($string, 'SimpleXMLElement', LIBXML_NOCDATA);

// parsing with CDATA tags using the OO way
$xml = new SimpleXMLElement($string, LIBXML_NOCDATA);

The LIBXML options that can be passed to the simplexml_load_* functions and the SimpleXMLElement constructor can be found in the PHP documentation.
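To see the effect for yourself, here is a minimal sketch that loads the same document with and without the flag. The XML payload and its element names (response, name, city) are made up for illustration and are not from our actual SOAP response. As far as I can tell, the difference shows up mainly when the element is dumped or converted (print_r, var_dump, json_encode), so that is what the sketch compares.

```php
<?php
// Hypothetical sample payload; the element names are invented for this example.
$string = '<response><name><![CDATA[Jane & John]]></name><city>Utrecht</city></response>';

// Default parsing: CDATA sections stay separate CDATA nodes in the tree.
$plain = simplexml_load_string($string);

// LIBXML_NOCDATA: CDATA sections are merged into ordinary text nodes.
$merged = simplexml_load_string($string, 'SimpleXMLElement', LIBXML_NOCDATA);

// Reading a single field by casting the element to a string.
$nameFromMerged = (string) $merged->name;

// Converting the whole element to an array via JSON, a common shortcut.
$plainArray  = json_decode(json_encode($plain), true);
$mergedArray = json_decode(json_encode($merged), true);

// Compare the two dumps side by side.
print_r($plainArray);
print_r($mergedArray);
```

In the second dump the name field holds the CDATA text as a normal string. If only the first dump looks empty for you, the data was in the document all along and it is the CDATA handling that hides it.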

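It’s pretty damn weird though: I want to parse the CDATA tags inside my XML, and can only do so by providing the NOCDATA option. The name makes a bit more sense once you know that the flag tells libxml to merge CDATA sections into plain text nodes, so no separate CDATA nodes are left in the tree.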
