PHP JSON and XML Processing
Introduction to JSON and XML Processing
JSON (JavaScript Object Notation) and XML (eXtensible Markup Language) are the two most common data interchange formats in web development. PHP ships with support for both: the JSON extension (json_encode() and json_decode(), always enabled since PHP 8.0) and the libxml-based SimpleXML and DOM extensions for XML.
Understanding how to properly process JSON and XML is essential for modern web development, as these formats are used for API communication, configuration files, data storage, and integration with external services.
Why JSON and XML Matter
API Communication: Most modern web APIs use JSON for data exchange due to its lightweight nature and easy parsing.
Configuration Management: Both formats are commonly used for application configuration files and settings storage.
Data Integration: JSON and XML enable seamless data exchange between different systems and programming languages.
Web Services: SOAP services use XML, while REST APIs typically use JSON for request and response payloads.
Data Storage: NoSQL databases often store data in JSON format, while XML is used for document storage and markup.
JSON vs XML
JSON Advantages: More compact, easier to read, maps directly onto JavaScript objects and PHP arrays, and is typically faster to parse.
XML Advantages: Better validation support through schemas, namespace support, more metadata capabilities, comment support.
Use Cases: JSON for APIs and modern web applications, XML for document markup, legacy systems, and complex data structures requiring validation.
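To make the difference in verbosity concrete, the sketch below serializes one record both ways with json_encode() and SimpleXMLElement. The $record data and the element names are illustrative assumptions, not part of any real API.

```php
<?php
// Hypothetical sample record used only for this comparison
$record = ['id' => 7, 'name' => 'Ada', 'active' => true];

// JSON: a single call, and scalar types (int, bool) survive as-is
$json = json_encode($record);

// XML: the tree is built element by element, and every value becomes text
$xml = new SimpleXMLElement('<record/>');
foreach ($record as $key => $value) {
    $text = is_bool($value) ? ($value ? 'true' : 'false') : (string) $value;
    $xml->addChild($key, $text);
}
$xmlString = $xml->asXML();

echo $json, "\n";
echo $xmlString, "\n";
echo strlen($json), ' bytes as JSON, ', strlen($xmlString), " bytes as XML\n";
```

The XML form repeats every field name in a closing tag and carries no type information, which is why schema validation matters more on the XML side.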
JSON Processing
Basic JSON Operations
<?php
/**
* PHP JSON Processing Fundamentals
*
* JSON encoding and decoding with error handling
* and various data type conversions.
*/
// Basic JSON encoding
$data = [
'name' => 'John Doe',
'email' => 'john.doe@example.com',
'age' => 30,
'active' => true,
'roles' => ['user', 'admin'],
'profile' => [
'bio' => 'Software developer',
'location' => 'New York'
]
];
$json = json_encode($data);
echo "Encoded JSON:\n$json\n\n";
// JSON encoding with options
$prettyJson = json_encode($data, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
echo "Pretty JSON:\n$prettyJson\n\n";
// Basic JSON decoding
$decodedArray = json_decode($json, true); // As associative array
$decodedObject = json_decode($json); // As object
echo "Decoded as array:\n";
print_r($decodedArray);
echo "Decoded as object:\n";
var_dump($decodedObject);
// Error handling
$invalidJson = '{"name": "John", "age": 30,}'; // Invalid trailing comma
$result = json_decode($invalidJson);
if (json_last_error() !== JSON_ERROR_NONE) {
echo "JSON Error: " . json_last_error_msg() . "\n";
}
?>
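Since PHP 7.3, passing JSON_THROW_ON_ERROR makes json_decode() and json_encode() throw a JsonException instead of requiring a json_last_error() check after every call. The following is a minimal sketch of the same trailing-comma case; the helper name decodeOrNull is hypothetical.

```php
<?php
/**
 * Hypothetical helper: decode JSON to an array, or return null and
 * report the parser message through $error when the input is invalid.
 */
function decodeOrNull(string $json, ?string &$error = null): ?array
{
    $error = null;
    try {
        $decoded = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
        // Scalars such as "42" are valid JSON but are not arrays
        return is_array($decoded) ? $decoded : null;
    } catch (JsonException $e) {
        $error = $e->getMessage();
        return null;
    }
}

$valid = decodeOrNull('{"name": "John", "age": 30}');
$invalid = decodeOrNull('{"name": "John", "age": 30,}', $message);

var_dump($valid['age']);      // int(30)
var_dump($invalid === null);  // bool(true)
echo "Decoder said: $message\n";
```

With this flag the global error state is left untouched, so it is best not to mix exception handling and json_last_error() checks for the same call.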
Advanced JSON Handling
<?php
/**
* Advanced JSON processing with validation and transformation
*/
class JsonProcessor
{
/**
* Safely encode data to JSON with comprehensive error handling
*/
public static function encode($data, int $flags = 0, int $depth = 512): string
{
$json = json_encode($data, $flags, $depth);
if (json_last_error() !== JSON_ERROR_NONE) {
throw new JsonException('JSON encoding error: ' . json_last_error_msg());
}
return $json;
}
/**
* Safely decode JSON with validation
*/
public static function decode(string $json, bool $associative = true, int $depth = 512, int $flags = 0)
{
$data = json_decode($json, $associative, $depth, $flags);
if (json_last_error() !== JSON_ERROR_NONE) {
throw new JsonException('JSON decoding error: ' . json_last_error_msg());
}
return $data;
}
/**
* Validate JSON structure against schema
*/
public static function validate(string $json, array $schema): array
{
$errors = [];
try {
$data = self::decode($json, true);
} catch (JsonException $e) {
return ['Invalid JSON format: ' . $e->getMessage()];
}
$errors = array_merge($errors, self::validateSchema($data, $schema, ''));
return $errors;
}
/**
* Recursive schema validation
*/
private static function validateSchema($data, array $schema, string $path): array
{
$errors = [];
foreach ($schema as $field => $rules) {
$fieldPath = $path ? "$path.$field" : $field;
if ($rules['required'] ?? false) {
if (!isset($data[$field])) {
$errors[] = "Required field missing: $fieldPath";
continue;
}
}
if (!isset($data[$field])) {
continue;
}
$value = $data[$field];
// Type validation
if (isset($rules['type'])) {
if (!self::validateType($value, $rules['type'])) {
$errors[] = "Invalid type for $fieldPath: expected {$rules['type']}";
}
}
// Length validation for strings and arrays (strlen() on an array throws a TypeError in PHP 8)
$length = is_string($value) ? strlen($value) : (is_array($value) ? count($value) : null);
if ($length !== null && isset($rules['min_length']) && $length < $rules['min_length']) {
$errors[] = "Field $fieldPath too short (minimum {$rules['min_length']})";
}
if ($length !== null && isset($rules['max_length']) && $length > $rules['max_length']) {
$errors[] = "Field $fieldPath too long (maximum {$rules['max_length']})";
}
// Nested object validation
if (isset($rules['properties']) && is_array($value)) {
$errors = array_merge($errors, self::validateSchema($value, $rules['properties'], $fieldPath));
}
}
return $errors;
}
/**
* Validate data type
*/
private static function validateType($value, string $expectedType): bool
{
return match($expectedType) {
'string' => is_string($value),
'integer' => is_int($value),
'number' => is_int($value) || is_float($value), // is_numeric() would also accept strings such as "12"
'boolean' => is_bool($value),
'array' => is_array($value),
'object' => is_object($value) || (is_array($value) && array_keys($value) !== range(0, count($value) - 1)),
default => true
};
}
/**
* Transform JSON data with mapping
*/
public static function transform(string $json, array $mapping): string
{
$data = self::decode($json, true);
$transformed = self::applyMapping($data, $mapping);
return self::encode($transformed, JSON_PRETTY_PRINT);
}
/**
* Apply transformation mapping
*/
private static function applyMapping($data, array $mapping)
{
if (!is_array($data)) {
return $data;
}
$result = [];
foreach ($mapping as $newKey => $rule) {
if (is_string($rule)) {
// Simple field mapping
if (isset($data[$rule])) {
$result[$newKey] = $data[$rule];
}
} elseif (is_array($rule)) {
if (isset($rule['source']) && isset($data[$rule['source']])) {
$value = $data[$rule['source']];
// Apply transformation
if (isset($rule['transform'])) {
$value = $rule['transform']($value);
}
$result[$newKey] = $value;
}
}
}
return $result;
}
}
// Usage examples
try {
$userData = [
'first_name' => 'John',
'last_name' => 'Doe',
'email_address' => 'john.doe@example.com',
'birth_date' => '1990-01-15'
];
// Schema for validation
$schema = [
'first_name' => ['required' => true, 'type' => 'string', 'min_length' => 1],
'last_name' => ['required' => true, 'type' => 'string'],
'email_address' => ['required' => true, 'type' => 'string'],
'birth_date' => ['type' => 'string']
];
$json = JsonProcessor::encode($userData, JSON_PRETTY_PRINT);
echo "Encoded data:\n$json\n";
// Validate
$errors = JsonProcessor::validate($json, $schema);
if (empty($errors)) {
echo "Validation passed!\n";
} else {
echo "Validation errors:\n";
foreach ($errors as $error) {
echo "- $error\n";
}
}
// Transform
$mapping = [
'name' => [
'source' => 'first_name',
'transform' => fn($value) => $userData['first_name'] . ' ' . $userData['last_name']
],
'email' => 'email_address',
'age' => [
'source' => 'birth_date',
// Count completed years so the age is correct before this year's birthday
'transform' => fn($date) => (new DateTime($date))->diff(new DateTime())->y
]
];
$transformed = JsonProcessor::transform($json, $mapping);
echo "Transformed data:\n$transformed\n";
} catch (JsonException $e) {
echo "JSON processing error: " . $e->getMessage() . "\n";
}
?>
XML Processing
Basic XML Operations
<?php
/**
* PHP XML Processing with SimpleXML and DOMDocument
*/
// Creating XML with SimpleXML
$xml = new SimpleXMLElement('<users></users>');
$user1 = $xml->addChild('user');
$user1->addAttribute('id', '1');
$user1->addChild('name', 'John Doe');
$user1->addChild('email', 'john.doe@example.com');
$user2 = $xml->addChild('user');
$user2->addAttribute('id', '2');
$user2->addChild('name', 'Jane Smith');
$user2->addChild('email', 'jane.smith@example.com');
echo "Generated XML:\n";
echo $xml->asXML();
// Reading XML
$xmlString = '<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book id="1">
<title>PHP Guide</title>
<author>John Author</author>
<price currency="USD">29.99</price>
</book>
<book id="2">
<title>XML Processing</title>
<author>Jane Writer</author>
<price currency="USD">34.99</price>
</book>
</catalog>';
$catalog = new SimpleXMLElement($xmlString);
echo "\nReading XML:\n";
foreach ($catalog->book as $book) {
echo "Book ID: " . $book['id'] . "\n";
echo "Title: " . $book->title . "\n";
echo "Author: " . $book->author . "\n";
echo "Price: " . $book->price . " " . $book->price['currency'] . "\n\n";
}
// Converting XML to array
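function xmlToArray(SimpleXMLElement $xml): array
{
$array = [];
foreach ($xml->children() as $element) {
$name = $element->getName();
// Elements with children recurse; leaf elements become strings
$value = $element->count() > 0 ? xmlToArray($element) : (string) $element;
// Keep attributes with their own element so siblings cannot overwrite them
foreach ($element->attributes() as $attrName => $attrValue) {
if (!is_array($value)) {
$value = ['@value' => $value];
}
$value['@' . $attrName] = (string) $attrValue;
}
if (!array_key_exists($name, $array)) {
$array[$name] = $value;
} else {
// Repeated element: turn the entry into a list on the second occurrence
if (!is_array($array[$name]) || !isset($array[$name][0])) {
$array[$name] = [$array[$name]];
}
$array[$name][] = $value;
}
}
return $array;
}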
$arrayData = xmlToArray($catalog);
echo "XML as array:\n";
print_r($arrayData);
?>
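Iterating with foreach works for simple reads, but SimpleXML also exposes xpath() for selecting nodes by condition. The sketch below assumes a trimmed-down copy of the catalog above and picks out titles by price and an attribute by position.

```php
<?php
// Sample catalog mirroring the structure used above
$catalog = new SimpleXMLElement(
    '<catalog>
        <book id="1"><title>PHP Guide</title><price currency="USD">29.99</price></book>
        <book id="2"><title>XML Processing</title><price currency="USD">34.99</price></book>
    </catalog>'
);

// xpath() returns an array of SimpleXMLElement objects (or false/null on error)
$expensive = $catalog->xpath('//book[price > 30]/title');

$titles = [];
foreach ($expensive ?: [] as $title) {
    // Cast to string to get the text content rather than the element object
    $titles[] = (string) $title;
}
print_r($titles);

// Attributes are selected with @name; here the id of the first book
$firstId = (string) $catalog->xpath('/catalog/book[1]/@id')[0];
echo "First book id: $firstId\n";
```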
Advanced XML Processing
<?php
/**
* Advanced XML handling with DOMDocument and validation
*/
class XmlProcessor
{
private DOMDocument $dom;
public function __construct()
{
$this->dom = new DOMDocument('1.0', 'UTF-8');
$this->dom->formatOutput = true;
$this->dom->preserveWhiteSpace = false;
}
/**
* Load XML from string with error handling
*/
public function loadXML(string $xml): bool
{
libxml_use_internal_errors(true);
$result = $this->dom->loadXML($xml);
if (!$result) {
$errors = libxml_get_errors();
$errorMessages = [];
foreach ($errors as $error) {
$errorMessages[] = trim($error->message);
}
throw new InvalidArgumentException('XML parsing error: ' . implode(', ', $errorMessages));
}
return true;
}
/**
* Validate XML against XSD schema
*/
public function validateAgainstSchema(string $xsdFile): array
{
libxml_use_internal_errors(true);
$isValid = $this->dom->schemaValidate($xsdFile);
$errors = [];
if (!$isValid) {
$xmlErrors = libxml_get_errors();
foreach ($xmlErrors as $error) {
$errors[] = [
'line' => $error->line,
'column' => $error->column,
'message' => trim($error->message),
'level' => $error->level
];
}
}
libxml_clear_errors();
return $errors;
}
/**
* Transform XML using XSLT
*/
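public function transform(string $xslFile): string
{
$xsl = new DOMDocument();
if (!$xsl->load($xslFile)) {
throw new InvalidArgumentException("Unable to load stylesheet: $xslFile");
}
$processor = new XSLTProcessor();
$processor->importStylesheet($xsl);
$result = $processor->transformToXML($this->dom);
// transformToXML() does not return a string on failure, which would violate the return type
if (!is_string($result)) {
throw new RuntimeException('XSLT transformation failed');
}
return $result;
}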
/**
* Add namespace support
*/
public function addNamespace(string $prefix, string $uri): void
{
$this->dom->documentElement->setAttributeNS(
'http://www.w3.org/2000/xmlns/',
"xmlns:$prefix",
$uri
);
}
/**
* Query XML with XPath
*/
public function query(string $xpath): array
{
$xpathObj = new DOMXPath($this->dom);
$nodeList = $xpathObj->query($xpath);
$results = [];
foreach ($nodeList as $node) {
$results[] = $node->nodeValue;
}
return $results;
}
/**
* Convert to array with attributes
*/
public function toArray(): array
{
return $this->domNodeToArray($this->dom->documentElement);
}
/**
* Recursive DOM to array conversion
*/
private function domNodeToArray(DOMNode $node): array|string // text nodes and empty elements yield strings
{
$output = [];
switch ($node->nodeType) {
case XML_CDATA_SECTION_NODE:
case XML_TEXT_NODE:
$output = trim($node->textContent);
break;
case XML_ELEMENT_NODE:
// Handle attributes
if ($node->hasAttributes()) {
foreach ($node->attributes as $attr) {
$output['@' . $attr->nodeName] = $attr->nodeValue;
}
}
// Handle child nodes
if ($node->hasChildNodes()) {
for ($i = 0; $i < $node->childNodes->length; $i++) {
$child = $node->childNodes->item($i);
$childArray = $this->domNodeToArray($child);
if (isset($childArray)) {
if (isset($output[$child->nodeName])) {
if (!is_array($output[$child->nodeName]) ||
!isset($output[$child->nodeName][0])) {
$output[$child->nodeName] = [$output[$child->nodeName]];
}
$output[$child->nodeName][] = $childArray;
} else {
$output[$child->nodeName] = $childArray;
}
}
}
}
// Handle empty elements
if (is_array($output) && count($output) === 0) {
$output = '';
}
break;
}
return $output;
}
/**
* Get formatted XML string
*/
public function getXML(): string
{
return $this->dom->saveXML();
}
}
// Usage examples
$xmlData = '<?xml version="1.0" encoding="UTF-8"?>
<products xmlns:product="http://example.com/product">
<product:item id="1" category="electronics">
<name>Laptop</name>
<price currency="USD">999.99</price>
<description><![CDATA[High-performance laptop for professionals]]></description>
</product:item>
<product:item id="2" category="books">
<name>PHP Manual</name>
<price currency="USD">49.99</price>
<description>Comprehensive PHP reference</description>
</product:item>
</products>';
try {
$processor = new XmlProcessor();
$processor->loadXML($xmlData);
echo "Loaded XML successfully\n";
// Query with XPath
$names = $processor->query('//name');
echo "Product names: " . implode(', ', $names) . "\n";
$prices = $processor->query('//price[@currency="USD"]');
echo "USD prices: " . implode(', ', $prices) . "\n";
// Convert to array
$array = $processor->toArray();
echo "\nXML as array:\n";
print_r($array);
} catch (Exception $e) {
echo "XML processing error: " . $e->getMessage() . "\n";
}
?>
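The usage example queries the un-namespaced name and price elements. To select the namespaced product:item elements themselves, the XPath query needs a prefix bound to the namespace URI. A minimal sketch using DOMXPath directly, with the prefix p chosen arbitrarily for illustration:

```php
<?php
$xml = '<products xmlns:product="http://example.com/product">
    <product:item id="1" category="electronics"><name>Laptop</name></product:item>
    <product:item id="2" category="books"><name>PHP Manual</name></product:item>
</products>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
// Bind a prefix of our choosing to the namespace URI; the prefix used in
// the query does not have to match the one used in the document.
$xpath->registerNamespace('p', 'http://example.com/product');

// Names of all items in the "books" category
$bookNames = [];
foreach ($xpath->query('//p:item[@category="books"]/name') as $node) {
    $bookNames[] = $node->textContent;
}

// The id attribute of every item
$ids = [];
foreach ($xpath->query('//p:item/@id') as $attr) {
    $ids[] = $attr->nodeValue;
}

echo 'Books: ', implode(', ', $bookNames), "\n";
echo 'Item ids: ', implode(', ', $ids), "\n";
```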
Data Transformation and APIs
JSON-XML Conversion
<?php
/**
* Converting between JSON and XML formats
*/
class DataConverter
{
/**
* Convert JSON to XML
*/
public static function jsonToXml(string $json, string $rootElement = 'root'): string
{
$data = json_decode($json, true);
if (json_last_error() !== JSON_ERROR_NONE) {
throw new InvalidArgumentException('Invalid JSON: ' . json_last_error_msg());
}
$xml = new SimpleXMLElement("<?xml version=\"1.0\"?><$rootElement></$rootElement>");
self::arrayToXml($data, $xml);
return $xml->asXML();
}
/**
* Convert XML to JSON
*/
public static function xmlToJson(string $xml): string
{
$xmlObj = simplexml_load_string($xml);
if ($xmlObj === false) {
throw new InvalidArgumentException('Invalid XML');
}
return json_encode($xmlObj, JSON_PRETTY_PRINT);
}
/**
* Recursively convert array to XML
*/
private static function arrayToXml(array $data, SimpleXMLElement $xml): void
{
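foreach ($data as $key => $value) {
// XML element names cannot be numeric, so every list entry becomes <item>
if (is_numeric($key)) {
$key = 'item';
}
if (is_array($value)) {
$subnode = $xml->addChild($key);
self::arrayToXml($value, $subnode);
} else {
// addChild() does not escape "&", so escape the text first
$xml->addChild($key, htmlspecialchars((string) $value, ENT_XML1));
}
}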
}
/**
* Convert CSV to JSON
*/
public static function csvToJson(string $csvData, bool $hasHeader = true): string
{
$lines = explode("\n", trim($csvData));
$result = [];
if ($hasHeader) {
$headers = str_getcsv(array_shift($lines));
}
foreach ($lines as $line) {
if (empty(trim($line))) continue;
$data = str_getcsv($line);
if ($hasHeader) {
$result[] = array_combine($headers, $data);
} else {
$result[] = $data;
}
}
return json_encode($result, JSON_PRETTY_PRINT);
}
}
// API Response Handler
class ApiResponseHandler
{
/**
* Format API response based on requested format
*/
public static function formatResponse(array $data, string $format = 'json'): string
{
switch (strtolower($format)) {
case 'json':
header('Content-Type: application/json');
return json_encode($data, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
case 'xml':
header('Content-Type: application/xml');
return DataConverter::jsonToXml(json_encode($data), 'response');
case 'csv':
header('Content-Type: text/csv');
return self::arrayToCsv($data);
default:
throw new InvalidArgumentException("Unsupported format: $format");
}
}
/**
* Convert array to CSV
*/
private static function arrayToCsv(array $data): string
{
if (empty($data)) {
return '';
}
$output = fopen('php://temp', 'r+');
// Write header
fputcsv($output, array_keys($data[0]));
// Write data
foreach ($data as $row) {
fputcsv($output, $row);
}
rewind($output);
$csv = stream_get_contents($output);
fclose($output);
return $csv;
}
/**
* Parse incoming API data
*/
public static function parseApiRequest(): array
{
$contentType = $_SERVER['CONTENT_TYPE'] ?? '';
$input = file_get_contents('php://input');
if (strpos($contentType, 'application/json') !== false) {
$data = json_decode($input, true);
if (json_last_error() !== JSON_ERROR_NONE) {
throw new InvalidArgumentException('Invalid JSON input');
}
return $data;
}
if (strpos($contentType, 'application/xml') !== false) {
$xml = simplexml_load_string($input);
if ($xml === false) {
throw new InvalidArgumentException('Invalid XML input');
}
return json_decode(json_encode($xml), true);
}
// Default to form data
return $_POST;
}
}
// Usage examples
$sampleData = [
['id' => 1, 'name' => 'John', 'email' => 'john@example.com'],
['id' => 2, 'name' => 'Jane', 'email' => 'jane@example.com']
];
$jsonData = json_encode($sampleData);
echo "JSON Data:\n$jsonData\n\n";
$xmlData = DataConverter::jsonToXml($jsonData, 'users');
echo "Converted to XML:\n$xmlData\n\n";
$backToJson = DataConverter::xmlToJson($xmlData);
echo "Back to JSON:\n$backToJson\n\n";
// API response example
try {
$response = ApiResponseHandler::formatResponse($sampleData, 'json');
echo "API JSON Response:\n$response\n";
} catch (Exception $e) {
echo "Error: " . $e->getMessage() . "\n";
}
?>
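One caveat with the json_encode(SimpleXMLElement) shortcut used in xmlToJson() and parseApiRequest() is how attributes are handled. The sketch below uses a made-up user element to show it: attributes on an element with children end up under an "@attributes" key, while attributes on a text-only element are not carried over.

```php
<?php
// Hypothetical input: one attribute on the root, one on a text-only child
$xml = '<user id="1"><name>John</name><price currency="USD">29.99</price></user>';

$element = simplexml_load_string($xml);
$decoded = json_decode(json_encode($element), true);

print_r($decoded);

// The root attribute is available under "@attributes"
$id = $decoded['@attributes']['id'];

// The currency attribute is missing from $decoded, so read it from the
// SimpleXMLElement directly when it matters
$currency = (string) $element->price['currency'];

echo "id=$id currency=$currency\n";
```

If attributes on leaf elements must survive the round trip, a hand-written converter such as XmlProcessor::toArray() above is the safer choice.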
Security Considerations
Secure JSON/XML Processing
<?php
/**
* Security-focused JSON/XML processing
*/
class SecureDataProcessor
{
private const MAX_JSON_SIZE = 1024 * 1024; // 1MB
private const MAX_XML_DEPTH = 100;
/**
* Safely process JSON with size and depth limits
*/
public static function processJson(string $json, int $maxSize = self::MAX_JSON_SIZE): array
{
// Check size
if (strlen($json) > $maxSize) {
throw new InvalidArgumentException("JSON too large (max: $maxSize bytes)");
}
// Decode with depth limit
$data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
// Validate structure
self::validateJsonStructure($data);
return $data;
}
/**
* Safely process XML with entity and size limits
*/
public static function processXml(string $xml, int $maxSize = self::MAX_JSON_SIZE): DOMDocument
{
// Check size
if (strlen($xml) > $maxSize) {
throw new InvalidArgumentException("XML too large (max: $maxSize bytes)");
}
// XXE protection: never pass LIBXML_NOENT or LIBXML_DTDLOAD for untrusted input,
// because those flags turn entity substitution and DTD loading ON.
// libxml 2.9+ does not load external entities by default, and
// libxml_disable_entity_loader() is deprecated as of PHP 8.0.
$previousErrorSetting = libxml_use_internal_errors(true);
try {
$dom = new DOMDocument();
$dom->resolveExternals = false;
$dom->substituteEntities = false;
// LIBXML_NONET blocks network access while parsing
if (!$dom->loadXML($xml, LIBXML_NONET)) {
$errors = libxml_get_errors();
$errorMessage = '';
foreach ($errors as $error) {
$errorMessage .= trim($error->message) . ' ';
}
throw new InvalidArgumentException('XML parsing error: ' . trim($errorMessage));
}
// Check depth
self::validateXmlDepth($dom->documentElement, 0);
return $dom;
} finally {
libxml_clear_errors();
libxml_use_internal_errors($previousErrorSetting);
}
}
/**
* Validate JSON structure for potential attacks
*/
private static function validateJsonStructure($data, int $depth = 0): void
{
if ($depth > 100) {
throw new InvalidArgumentException('JSON structure too deep');
}
if (is_array($data)) {
if (count($data) > 10000) {
throw new InvalidArgumentException('Too many array elements');
}
foreach ($data as $value) {
self::validateJsonStructure($value, $depth + 1);
}
}
}
/**
* Validate XML depth
*/
private static function validateXmlDepth(DOMNode $node, int $depth): void
{
if ($depth > self::MAX_XML_DEPTH) {
throw new InvalidArgumentException('XML structure too deep');
}
foreach ($node->childNodes as $child) {
if ($child->nodeType === XML_ELEMENT_NODE) {
self::validateXmlDepth($child, $depth + 1);
}
}
}
/**
* Sanitize data for safe output
*/
public static function sanitizeForOutput(mixed $data): mixed // strings and scalars are returned as-is, not as arrays
{
if (is_array($data)) {
return array_map([self::class, 'sanitizeForOutput'], $data);
}
if (is_string($data)) {
return htmlspecialchars($data, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}
return $data;
}
/**
* Validate against whitelist of allowed fields
*/
public static function validateFields(array $data, array $allowedFields): array
{
$filtered = [];
foreach ($allowedFields as $field) {
if (isset($data[$field])) {
$filtered[$field] = $data[$field];
}
}
return $filtered;
}
}
// Usage examples with security
try {
$jsonInput = '{"name": "John", "email": "john@example.com", "role": "user"}';
$safeData = SecureDataProcessor::processJson($jsonInput);
// Validate allowed fields
$allowedFields = ['name', 'email'];
$filteredData = SecureDataProcessor::validateFields($safeData, $allowedFields);
// Sanitize for output
$sanitizedData = SecureDataProcessor::sanitizeForOutput($filteredData);
echo "Processed data safely:\n";
print_r($sanitizedData);
} catch (Exception $e) {
echo "Security error: " . $e->getMessage() . "\n";
}
// XML security example
$xmlInput = '<?xml version="1.0"?>
<user>
<name>John Doe</name>
<email>john.doe@example.com</email>
</user>';
try {
$dom = SecureDataProcessor::processXml($xmlInput);
echo "XML processed safely\n";
} catch (Exception $e) {
echo "XML security error: " . $e->getMessage() . "\n";
}
?>
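Entity definitions live in the DTD, so a stricter policy for untrusted input is to refuse any document that declares a DOCTYPE at all. The following is a sketch of that idea rather than part of the class above; the function name loadXmlWithoutDoctype is hypothetical, and the check runs after parsing, so it complements the size limit instead of replacing it.

```php
<?php
/**
 * Hypothetical helper: parse untrusted XML and reject documents that
 * contain a DOCTYPE declaration.
 */
function loadXmlWithoutDoctype(string $xml): DOMDocument
{
    $previous = libxml_use_internal_errors(true);
    try {
        $dom = new DOMDocument();
        // LIBXML_NONET forbids network access during parsing;
        // the empty-string check avoids the ValueError PHP 8 raises for it
        if ($xml === '' || !$dom->loadXML($xml, LIBXML_NONET)) {
            throw new InvalidArgumentException('XML parsing error');
        }
        // $dom->doctype is null unless the document declared a DOCTYPE
        if ($dom->doctype !== null) {
            throw new InvalidArgumentException('DOCTYPE declarations are not allowed');
        }
        return $dom;
    } finally {
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
}

$accepted = loadXmlWithoutDoctype('<user><name>John Doe</name></user>');
echo 'Accepted root element: ', $accepted->documentElement->nodeName, "\n";

$rejected = false;
try {
    loadXmlWithoutDoctype('<!DOCTYPE user [<!ENTITY a "aaa">]><user>&a;</user>');
} catch (InvalidArgumentException $e) {
    $rejected = true;
    echo 'Rejected: ', $e->getMessage(), "\n";
}
```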
Related Topics
For more PHP data processing topics:
- PHP Input Validation - Validating JSON/XML input
- PHP File Operations - Reading/writing data files
- PHP Regular Expressions - Pattern matching in data
- PHP Security Best Practices - Secure data handling
- PHP Web APIs - Building JSON/XML APIs
Summary
Effective JSON and XML processing is essential for modern web development:
JSON Processing: Master encoding, decoding, validation, and transformation with proper error handling and security measures.
XML Processing: Use SimpleXML for basic operations and DOMDocument for advanced processing, validation, and transformation.
Security: Always validate input size, structure, and content; disable external entities in XML; sanitize output data.
Data Transformation: Convert between formats efficiently while preserving data integrity and structure.
API Integration: Handle various data formats in APIs with proper content-type negotiation and error responses.
Best Practices: Use appropriate tools for each task, implement comprehensive error handling, and always consider security implications.
Mastering JSON and XML processing enables robust data exchange, API development, and integration with external services while maintaining security and performance.