1. php
  2. /basics
  3. /file-operations

PHP File Operations

Introduction to File Operations

File operations are fundamental to many PHP applications, from configuration management and data storage to file uploads and content management systems. Understanding how to safely and efficiently work with files is essential for building robust web applications.

PHP provides extensive file handling capabilities that allow you to read, write, manipulate, and manage files and directories. However, file operations come with important security, performance, and reliability considerations that must be carefully addressed.

Why File Operations Matter

Data Persistence: Files provide a way to store data permanently, whether it's configuration settings, user uploads, cache data, or application logs. Unlike variables that exist only during script execution, files allow your application to maintain state between requests.

Content Management: Many applications need to handle user-uploaded files like images, documents, or media files, requiring secure upload and storage mechanisms. This includes validating file types, managing storage locations, and serving files back to users.

Integration: File operations enable integration with external systems through import/export functionality, CSV processing, and data exchange formats. Many business applications rely on file-based data exchange for compatibility with other systems.

Performance: Proper file handling can significantly impact application performance, especially when dealing with large files or high-volume file operations. Understanding techniques like streaming and chunked reading is crucial for scalable applications.

Security: File operations present numerous security risks including path traversal attacks, code injection, and unauthorized access that must be mitigated. Every file operation should be treated as a potential security risk point.

Types of File Operations

Reading Files: Retrieving content from existing files for processing, configuration loading, or data analysis. This includes simple text files, structured data formats like JSON or XML, and binary files.

Writing Files: Creating new files or updating existing ones with application data, logs, or generated content. Writing operations must consider file locking, atomic operations, and permission management.

File Uploads: Handling user-submitted files through web forms with proper validation and security checks. This is one of the most security-sensitive operations in web applications.

Directory Operations: Creating, listing, and managing directories and their contents. Directory traversal and recursive operations require special attention to prevent infinite loops and resource exhaustion.

File Manipulation: Moving, copying, renaming, and deleting files as part of application workflows. These operations often require transaction-like behavior to ensure consistency.

Stream Operations: Working with large files or network resources using PHP's stream system for memory-efficient processing. Streams allow you to process files that are larger than available memory.

Security Considerations

Before diving into code examples, it's crucial to understand the security implications of file operations:

Path Traversal: Malicious users might attempt to access files outside intended directories using relative paths like ../../../etc/passwd. Always validate and sanitize file paths, using functions like basename() and realpath() to ensure files stay within allowed directories.

File Type Validation: Uploaded files must be validated to prevent malicious code execution through disguised file types. Never rely solely on file extensions – check MIME types and use file content analysis when possible.

Permission Management: Proper file and directory permissions prevent unauthorized access and modification. Follow the principle of least privilege, granting only the minimum necessary permissions.

Input Sanitization: All file paths and names must be sanitized to prevent injection attacks and ensure safe file operations. Remove or encode special characters that could be interpreted as path separators or command injections.

Resource Limits: File operations can consume significant system resources and must be properly limited and monitored. Set reasonable limits on file sizes, implement timeouts, and monitor disk space usage.

Reading Files

Reading files is one of the most common operations in PHP applications. Whether you're loading configuration files, processing user data, or analyzing logs, understanding the various methods for reading files and their appropriate use cases is essential.

Basic File Reading Operations

PHP offers several methods for reading files, each with its own advantages and use cases. Let's explore these methods and understand when to use each one:

<?php
/**
 * PHP File Reading Operations
 * 
 * Reading files is one of the most common file operations, used for
 * configuration loading, data processing, and content management.
 */

/**
 * Simple file reading methods
 * 
 * Different methods are appropriate for different use cases based on
 * file size, memory constraints, and processing requirements.
 */
function demonstrateBasicFileReading(): void
{
    $filename = 'data/config.txt';
    
    // Method 1: Read entire file into string
    // This is the simplest approach but loads the entire file into memory
    // Best for: Small files (< 10MB), configuration files, templates
    if (file_exists($filename)) {
        $content = file_get_contents($filename);
        
        if ($content !== false) {
            echo "File content length: " . strlen($content) . " bytes\n";
            echo "First 100 characters: " . substr($content, 0, 100) . "\n";
        } else {
            echo "Failed to read file\n";
        }
    } else {
        echo "File does not exist\n";
    }
}

The `file_get_contents()` function is the simplest way to read a file in PHP. It reads the entire file into a string with a single function call. This method is perfect for small files but can cause memory issues with large files since it loads everything into memory at once.

Key considerations when using `file_get_contents()`:
- **Memory usage**: The entire file is loaded into memory, so a 100MB file will use at least 100MB of RAM
- **Error handling**: Returns `false` on failure, so always check the return value
- **Performance**: Fast for small files but can be slow and resource-intensive for large files
- **Network resources**: Can also read from URLs if `allow_url_fopen` is enabled

```php
    // Method 2: Read file into array (line by line)
    // Each line becomes an array element - useful for line-based processing
    // Best for: Log files, CSV files, configuration files with line-based structure
    if (file_exists($filename)) {
        $lines = file($filename, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
        
        echo "File has " . count($lines) . " non-empty lines\n";
        
        foreach ($lines as $lineNumber => $line) {
            echo "Line " . ($lineNumber + 1) . ": " . trim($line) . "\n";
        }
    }

The file() function reads a file into an array, with each line as a separate element. This is convenient for files where you need to process data line by line. The function accepts several useful flags:

  • FILE_IGNORE_NEW_LINES: Removes newline characters from the end of each line
  • FILE_SKIP_EMPTY_LINES: Skips empty lines in the file
  • FILE_USE_INCLUDE_PATH: Searches for the file in the include path

This method still loads the entire file into memory but organizes it into a more manageable structure for line-based processing.

    // Method 3: Read file with file pointer (for large files)
    // Uses a file handle to read line by line without loading entire file
    // Best for: Large files, streaming data, memory-constrained environments
    $handle = fopen($filename, 'r');
    
    if ($handle) {
        $lineCount = 0;
        
        while (($line = fgets($handle)) !== false) {
            $lineCount++;
            echo "Processing line $lineCount: " . trim($line) . "\n";
            
            // You can process each line individually here
            // This approach uses minimal memory regardless of file size
        }
        
        fclose($handle);
        echo "Processed $lineCount lines total\n";
    } else {
        echo "Could not open file for reading\n";
    }
}

Using fopen() with fgets() is the most memory-efficient approach for reading large files. This method:

  • Streams the file: Only reads one line at a time into memory
  • Handles large files: Can process files larger than available memory
  • Provides control: Allows you to stop reading at any point
  • Requires cleanup: Always close the file handle with fclose()

The file handle approach is essential when working with files that could be gigabytes in size, as it maintains constant memory usage regardless of file size.

Safe File Reading with Error Handling

In production environments, file reading operations must be wrapped with proper error handling and security checks. Here's a comprehensive example of safe file reading:

/**
 * Safe file reading with error handling
 * 
 * Production code should always include comprehensive error handling
 * and validation to prevent security issues and provide user feedback.
 */
function safeFileReading(string $filename): ?string
{
    // Validate filename
    if (empty($filename)) {
        throw new InvalidArgumentException('Filename cannot be empty');
    }
    
    // Prevent path traversal attacks
    // basename() removes any directory components from the filename
    $filename = basename($filename);
    $safePath = 'data/' . $filename;

The security measures implemented here are critical:

  1. Input validation: Never trust user input - validate that a filename was provided
  2. Path traversal prevention: Using basename() strips any directory components, preventing attacks like ../../../etc/passwd
  3. Controlled directory: Files are restricted to a specific directory (data/) rather than allowing arbitrary file access
    // Check if file exists and is readable
    if (!file_exists($safePath)) {
        throw new RuntimeException("File not found: $filename");
    }
    
    if (!is_readable($safePath)) {
        throw new RuntimeException("File not readable: $filename");
    }
    
    // Check file size to prevent memory issues
    $fileSize = filesize($safePath);
    $maxSize = 10 * 1024 * 1024; // 10MB limit
    
    if ($fileSize > $maxSize) {
        throw new RuntimeException("File too large: $fileSize bytes (limit: $maxSize bytes)");
    }

These additional checks ensure:

  • File existence: Provides clear error messages when files don't exist
  • Read permissions: Verifies the PHP process has permission to read the file
  • Size limits: Prevents memory exhaustion from unexpectedly large files

The size check is particularly important when using file_get_contents() since it loads the entire file into memory.

    // Read file with error handling
    $content = file_get_contents($safePath);
    
    if ($content === false) {
        throw new RuntimeException("Failed to read file: $filename");
    }
    
    return $content;
}

Reading Specific File Formats

Different file formats require different parsing approaches. PHP provides built-in functions for common formats:

/**
 * Reading specific file formats
 */
function readSpecificFormats(): void
{
    // Read CSV file
    $csvFile = 'data/users.csv';
    
    if (file_exists($csvFile)) {
        $handle = fopen($csvFile, 'r');
        
        if ($handle) {
            // Read header row
            $headers = fgetcsv($handle);
            echo "CSV Headers: " . implode(', ', $headers) . "\n";
            
            // Read data rows
            $rowCount = 0;
            while (($data = fgetcsv($handle)) !== false) {
                $rowCount++;
                $row = array_combine($headers, $data);
                
                echo "Row $rowCount: ";
                echo "Name: {$row['name']}, Email: {$row['email']}\n";
            }
            
            fclose($handle);
        }
    }

The fgetcsv() function is specifically designed for parsing CSV files. It handles:

  • Quoted fields: Properly parses fields containing commas or quotes
  • Different delimiters: Can handle tabs, semicolons, or other delimiters
  • Escape characters: Correctly interprets escaped special characters

Best practices for CSV reading:

  • Always read the header row first to understand the data structure
  • Use array_combine() to create associative arrays for easier data access
  • Handle cases where rows might have different numbers of columns
  • Consider using specialized libraries like league/csv for complex CSV operations
    // Read JSON file
    $jsonFile = 'data/config.json';
    
    if (file_exists($jsonFile)) {
        $jsonContent = file_get_contents($jsonFile);
        $config = json_decode($jsonContent, true);
        
        if (json_last_error() === JSON_ERROR_NONE) {
            echo "Configuration loaded successfully\n";
            print_r($config);
        } else {
            echo "Invalid JSON in config file: " . json_last_error_msg() . "\n";
        }
    }

JSON file reading considerations:

  • Memory usage: json_decode() needs the entire JSON string in memory
  • Error checking: Always verify JSON parsing succeeded with json_last_error()
  • Data types: The second parameter of json_decode() determines if you get arrays (true) or objects (false)
  • Large JSON files: Consider using streaming JSON parsers for very large files
    // Read XML file
    $xmlFile = 'data/settings.xml';
    
    if (file_exists($xmlFile)) {
        libxml_use_internal_errors(true);
        $xml = simplexml_load_file($xmlFile);
        
        if ($xml !== false) {
            echo "XML loaded successfully\n";
            foreach ($xml->setting as $setting) {
                echo "Setting: {$setting['name']} = {$setting['value']}\n";
            }
        } else {
            echo "Failed to parse XML file\n";
            foreach (libxml_get_errors() as $error) {
                echo "XML Error: {$error->message}";
            }
        }
    }
}

XML parsing in PHP:

  • SimpleXML: Best for reading simple XML structures
  • DOM: More powerful but complex, better for modifying XML
  • XMLReader: Streaming parser for large XML files
  • Error handling: Use libxml_use_internal_errors() to capture parsing errors

// Example usage try { demonstrateBasicFileReading();

$content = safeFileReading('example.txt'); echo "Safely read content: " . strlen($content) . " bytes\n";

readSpecificFormats();

} catch (Exception $e) { echo "Error: " . $e->getMessage() . "\n"; } ?>


### Advanced File Reading with Streams

When working with large files or when memory efficiency is critical, PHP's stream-based approach provides powerful tools for file handling. Streams allow you to process files of any size without loading them entirely into memory, making them essential for production applications that handle user-generated content, log analysis, or data processing.

Understanding when to use streams versus simple file reading methods is crucial:
- **Use streams when**: Files might be large (>10MB), memory is limited, you need to process data incrementally, or you're working with network resources
- **Use simple methods when**: Files are small, you need all data at once, or simplicity is more important than memory efficiency

```php
<?php
/**
 * Advanced file reading using PHP streams
 * 
 * Streams provide memory-efficient ways to handle large files
 * and enable processing of files that don't fit in memory.
 */

class FileReader
{
    private $handle;
    private string $filename;
    private array $metadata;
    
    public function __construct(string $filename)
    {
        $this->filename = $filename;
        $this->validateFile();
        $this->openFile();
    }
    
    public function __destruct()
    {
        $this->close();
    }

This FileReader class demonstrates object-oriented file handling with several key benefits:

Automatic Resource Management: The destructor ensures file handles are always closed, even if an exception occurs. This prevents resource leaks that can accumulate in long-running applications.

Encapsulation: All file operations are wrapped in a clean, reusable interface. This makes it easy to add features like logging, caching, or monitoring without changing the calling code.

Consistent Error Handling: By using exceptions throughout, the class provides predictable error handling that integrates well with modern PHP applications.

State Management: The class maintains the file handle and metadata, eliminating the need to pass these around or use global variables.

    /**
     * Validate file before opening
     */
    private function validateFile(): void
    {
        if (!file_exists($this->filename)) {
            throw new RuntimeException("File not found: {$this->filename}");
        }
        
        if (!is_readable($this->filename)) {
            throw new RuntimeException("File not readable: {$this->filename}");
        }
        
        // Get file metadata
        $this->metadata = [
            'size' => filesize($this->filename),
            'modified' => filemtime($this->filename),
            'type' => mime_content_type($this->filename),
            'permissions' => substr(sprintf('%o', fileperms($this->filename)), -4)
        ];
    }

The validation method performs several crucial pre-flight checks:

Existence Check: Prevents attempting to read non-existent files, which would result in cryptic errors later. By checking early, we can provide clear error messages.

Permission Check: Verifies the PHP process has read access. This is especially important in shared hosting environments or when dealing with files created by other processes.

Metadata Collection: Gathering file information upfront serves multiple purposes:

  • Size: Helps decide whether to use streaming or load entire file
  • Modified time: Useful for caching decisions and audit trails
  • MIME type: Important for security validation and content handling
  • Permissions: Helps diagnose access issues and security audits

The permission format (sprintf('%o', fileperms())) converts the numeric permission to octal format (like 0644), making it human-readable.

    /**
     * Open file for reading
     */
    private function openFile(): void
    {
        $this->handle = fopen($this->filename, 'rb');
        
        if (!$this->handle) {
            throw new RuntimeException("Cannot open file: {$this->filename}");
        }
    }

The 'rb' mode is used for maximum compatibility:

  • 'r': Opens for reading only
  • 'b': Binary mode - prevents PHP from translating line endings

Binary mode is crucial because:

  • It preserves exact file contents on all platforms
  • It's required for non-text files (images, PDFs, etc.)
  • It prevents corruption when processing files with mixed line endings
    /**
     * Read file in chunks for memory efficiency
     */
    public function readInChunks(int $chunkSize = 8192): \Generator
    {
        if (!$this->handle) {
            throw new RuntimeException('File not open');
        }
        
        rewind($this->handle);
        
        while (!feof($this->handle)) {
            $chunk = fread($this->handle, $chunkSize);
            
            if ($chunk === false) {
                throw new RuntimeException('Error reading file chunk');
            }
            
            if (strlen($chunk) > 0) {
                yield $chunk;
            }
        }
    }

This method showcases PHP generators, a powerful feature for memory-efficient processing:

Generator Benefits:

  • Lazy Evaluation: Chunks are read only when requested by the consuming code
  • Memory Efficiency: Only one chunk is in memory at a time, regardless of file size
  • Flow Control: The caller controls the pace of reading, allowing for backpressure handling
  • Early Termination: Processing can stop at any time without reading the entire file

Chunk Size Considerations:

  • 8192 bytes (8KB): Default matches typical filesystem block size for optimal I/O
  • Smaller chunks: Use for real-time processing or limited memory (e.g., 1024 bytes)
  • Larger chunks: Better for sequential processing of large files (e.g., 1MB)
  • Network streams: Smaller chunks prevent timeout issues

The rewind() call ensures we start from the beginning, making the method reusable.

    /**
     * Read file line by line
     */
    public function readLines(): \Generator
    {
        if (!$this->handle) {
            throw new RuntimeException('File not open');
        }
        
        rewind($this->handle);
        $lineNumber = 0;
        
        while (($line = fgets($this->handle)) !== false) {
            $lineNumber++;
            yield $lineNumber => rtrim($line, "\r\n");
        }
    }

Line-by-line reading is ideal for text file processing:

Use Cases:

  • Log file analysis
  • CSV data processing
  • Configuration file parsing
  • Large text file searching

The rtrim() call removes line endings:

  • Handles both Unix (\n) and Windows (\r\n) line endings
  • Provides consistent data regardless of file origin
  • Prevents double-spacing when outputting lines

Generator with Keys: Yielding $lineNumber => $line allows callers to know the line number, useful for error reporting or progress tracking.

    /**
     * Read specific number of bytes from position
     */
    public function readBytes(int $length, int $offset = null): string
    {
        if (!$this->handle) {
            throw new RuntimeException('File not open');
        }
        
        if ($offset !== null) {
            if (fseek($this->handle, $offset) !== 0) {
                throw new RuntimeException("Cannot seek to position $offset");
            }
        }
        
        $data = fread($this->handle, $length);
        
        if ($data === false) {
            throw new RuntimeException('Error reading file data');
        }
        
        return $data;
    }

Random access reading enables powerful file processing capabilities:

Common Use Cases:

  • Reading file headers (e.g., first few bytes of an image file)
  • Implementing pagination for large files
  • Binary file format parsing
  • Resumable file processing

Seek Operations:

  • SEEK_SET: Position relative to beginning (default)
  • SEEK_CUR: Position relative to current position
  • SEEK_END: Position relative to end of file

This method is essential for working with binary file formats where specific data is at known offsets.

    /**
     * Search for text in file
     */
    public function searchText(string $searchTerm, bool $caseSensitive = true): array
    {
        $results = [];
        $lineNumber = 0;
        
        foreach ($this->readLines() as $line) {
            $lineNumber++;
            $searchIn = $caseSensitive ? $line : strtolower($line);
            $searchFor = $caseSensitive ? $searchTerm : strtolower($searchTerm);
            
            if (strpos($searchIn, $searchFor) !== false) {
                $results[] = [
                    'line' => $lineNumber,
                    'content' => $line,
                    'position' => strpos($searchIn, $searchFor)
                ];
            }
        }
        
        return $results;
    }

This search implementation demonstrates several important concepts:

Memory Efficiency: By using the generator-based readLines() method, even gigabyte-sized files can be searched without memory issues.

Flexible Searching:

  • Case-sensitive or case-insensitive options
  • Returns comprehensive results including line numbers and positions
  • Could be extended to support regular expressions or fuzzy matching

Performance Considerations:

  • For frequent searches, consider building an index
  • For very large files, implement parallel searching
  • Case-insensitive searches are slower due to string conversion
    /**
     * Calculate file hash
     */
    public function calculateHash(string $algorithm = 'sha256'): string
    {
        $context = hash_init($algorithm);
        
        foreach ($this->readInChunks(8192) as $chunk) {
            hash_update($context, $chunk);
        }
        
        return hash_final($context);
    }

Incremental hashing is essential for large files:

Why Incremental Hashing:

  • Can hash files larger than available memory
  • Provides progress tracking capability
  • Allows for hash calculation to be interrupted and resumed

Algorithm Choices:

  • MD5: Fast but cryptographically broken - use only for checksums
  • SHA256: Good balance of speed and security
  • SHA512: More secure but slower
  • xxHash: Very fast, good for non-cryptographic uses

This method is perfect for file integrity verification, duplicate detection, and cache key generation.

// Usage examples
try {
    $reader = new FileReader('large_data.txt');
    
    // Read file in chunks (memory efficient for large files)
    echo "Reading file in chunks:\n";
    $totalBytes = 0;
    
    foreach ($reader->readInChunks(1024) as $chunk) {
        $totalBytes += strlen($chunk);
        // Process chunk here
    }
    
    echo "Processed $totalBytes bytes total\n";
    
    // Read line by line
    echo "\nReading line by line:\n";
    $lineCount = 0;
    
    foreach ($reader->readLines() as $lineNum => $line) {
        $lineCount++;
        if ($lineCount <= 5) { // Show first 5 lines
            echo "Line $lineNum: " . substr($line, 0, 50) . "\n";
        }
    }
    
    // Search for text
    echo "\nSearching for 'error':\n";
    $searchResults = $reader->searchText('error', false);
    
    foreach ($searchResults as $result) {
        echo "Found on line {$result['line']}: {$result['content']}\n";
    }
    
    // Calculate file hash
    $hash = $reader->calculateHash('md5');
    echo "\nFile MD5 hash: $hash\n";
    
    // Show metadata
    echo "\nFile metadata:\n";
    print_r($reader->getMetadata());
    
} catch (Exception $e) {
    echo "Error: " . $e->getMessage() . "\n";
}
?>

The usage examples demonstrate practical applications:

Progress Tracking: The chunk reading example shows how to track progress, essential for user interfaces or logging.

Partial Processing: Reading only the first 5 lines demonstrates how generators allow for early termination, saving resources.

Practical Search: The search example shows how to find and report errors in log files, a common administrative task.

File Verification: Hash calculation provides a way to verify file integrity or detect duplicates.

Writing Files

Writing files is a fundamental operation in PHP applications, used for data persistence, logging, cache management, and content generation. Understanding the various methods and their trade-offs is crucial for building reliable applications.

File Writing and Creation

PHP offers multiple approaches to writing files, each suited to different scenarios. The choice of method depends on factors like file size, write frequency, concurrency requirements, and data integrity needs.

<?php
/**
 * PHP File Writing Operations
 * 
 * Writing files safely and efficiently is crucial for data persistence,
 * logging, cache management, and content generation.
 */

/**
 * Basic file writing operations
 */
function demonstrateFileWriting(): void
{
    $filename = 'output/test.txt';
    $data = "This is test data\nWritten at " . date('Y-m-d H:i:s') . "\n";
    
    // Method 1: Write entire content at once
    $bytesWritten = file_put_contents($filename, $data);
    
    if ($bytesWritten !== false) {
        echo "Wrote $bytesWritten bytes to $filename\n";
    } else {
        echo "Failed to write to $filename\n";
    }

The file_put_contents() function is the simplest way to write data to a file, offering several advantages:

Simplicity: Single function call replaces the traditional fopen/fwrite/fclose sequence, reducing code complexity and potential for errors.

Atomic Operation: The entire write happens as a single operation, reducing the risk of partial writes that can occur with multiple fwrite calls.

Automatic File Creation: Creates the file if it doesn't exist, eliminating the need for explicit file creation logic.

Return Value: Returns the number of bytes written, allowing for verification that all data was written successfully.

However, there are important limitations to consider:

  • Memory Usage: The entire content must fit in memory
  • No Streaming: Cannot write data incrementally
  • Overwrites by Default: Existing content is lost unless using append mode
    // Method 2: Append to existing file
    $additionalData = "This line is appended to the file.\n";
    $bytesAppended = file_put_contents($filename, $additionalData, FILE_APPEND | LOCK_EX);
    
    if ($bytesAppended !== false) {
        echo "Appended $bytesAppended bytes to $filename\n";
    }

Appending data to files is common for logging and data accumulation. The flags used here serve specific purposes:

FILE_APPEND: This flag changes the behavior from overwriting to appending. Essential for:

  • Log files where you want to preserve history
  • Data collection where new entries are added over time
  • Audit trails that must maintain chronological order

LOCK_EX (Exclusive Lock): This flag is critical for data integrity:

  • Prevents multiple processes from writing simultaneously
  • Ensures complete writes before other processes can access the file
  • Prevents data corruption in multi-process or multi-threaded environments
  • Essential for web applications where multiple requests might write to the same file

The combination of these flags makes this method ideal for logging systems where multiple processes might write to the same log file.

    // Method 3: Using file handle for more control
    $handle = fopen($filename, 'a');
    
    if ($handle) {
        fwrite($handle, "Another appended line\n");
        fwrite($handle, "Final line with timestamp: " . microtime(true) . "\n");
        fclose($handle);
        echo "Successfully wrote using file handle\n";
    }
}

Using file handles provides fine-grained control over file operations:

Benefits of File Handles:

  • Multiple Operations: Write data in chunks without reopening the file
  • Mixed Operations: Can read and write in the same session with appropriate modes
  • Resource Efficiency: Keep file open for multiple operations
  • Advanced Features: Access to stream filters, contexts, and metadata

File Opening Modes Explained:

  • 'w': Write mode - creates new file or truncates existing (data loss risk!)
  • 'a': Append mode - adds to end of file, safe for logs
  • 'x': Exclusive creation - fails if file exists (prevents accidental overwrites)
  • 'c': Open for writing without truncation, pointer at beginning
  • 'r+': Read/write mode, pointer at beginning
  • 'w+': Read/write mode, truncates file
  • 'a+': Read/write mode, appends data

Always remember to close file handles to:

  • Free system resources
  • Ensure all buffered data is written
  • Release file locks
  • Allow other processes to access the file

Safe File Writing with Atomic Operations

In production environments, data integrity is paramount. Atomic file operations ensure that files are either completely written or not written at all, preventing corruption:

/**
 * Safe file writing with atomic operations
 * 
 * Atomic file writing prevents data corruption if the process
 * is interrupted during writing.
 */
function safeFileWrite(string $filename, string $data): bool
{
    // Validate inputs
    if (empty($filename)) {
        throw new InvalidArgumentException('Filename cannot be empty');
    }
    
    // Ensure directory exists
    $directory = dirname($filename);
    if (!is_dir($directory)) {
        if (!mkdir($directory, 0755, true)) {
            throw new RuntimeException("Cannot create directory: $directory");
        }
    }

This atomic writing pattern addresses several critical scenarios:

Process Interruption: If PHP crashes, the server restarts, or the script is killed during writing, the original file remains untouched. This is crucial for configuration files where a corrupted file could prevent the application from starting.

Concurrent Access: In web environments, multiple requests might try to update the same file simultaneously. Without atomic operations, you might get interleaved writes or partial updates.

Power Failures: Even hardware failures won't leave you with a partially written file.

The directory creation logic includes important considerations:

  • Recursive Creation: The true parameter creates parent directories as needed
  • Permissions: 0755 gives owner full access, others read/execute (standard for web directories)
  • Error Handling: Always check if directory creation succeeded
    // Check if directory is writable
    if (!is_writable($directory)) {
        throw new RuntimeException("Directory not writable: $directory");
    }
    
    // Use temporary file for atomic writing
    $tempFile = $filename . '.tmp.' . uniqid();
    
    try {
        // Write to temporary file first
        $bytesWritten = file_put_contents($tempFile, $data, LOCK_EX);
        
        if ($bytesWritten === false) {
            throw new RuntimeException('Failed to write to temporary file');
        }
        
        // Atomically move temporary file to final location
        if (!rename($tempFile, $filename)) {
            throw new RuntimeException('Failed to move temporary file to final location');
        }
        
        return true;

The atomic write pattern implementation details:

Temporary File Strategy:

  • Uses uniqid() to ensure unique temporary filenames
  • Prevents conflicts when multiple processes write simultaneously
  • Temporary file is in the same directory to ensure atomic rename

Why rename() is Atomic:

  • On most filesystems, rename within the same partition is atomic
  • Either completes fully or doesn't happen at all
  • Much safer than copy-and-delete operations

LOCK_EX on Temporary File:

  • Still important even for temporary files
  • Prevents issues if the same process is called multiple times
  • Ensures complete write before rename
    } catch (Exception $e) {
        // Clean up temporary file if it exists
        if (file_exists($tempFile)) {
            unlink($tempFile);
        }
        throw $e;
    }
}

The cleanup logic is essential for preventing temporary file accumulation:

  • Always remove temporary files on failure
  • Re-throw the exception to allow caller to handle it
  • Prevents disk space issues from accumulated temporary files

This pattern should be used for:

  • Configuration file updates
  • User data files
  • Any file where corruption would cause serious issues
  • Files that are read frequently by other processes

Writing Structured Data Formats

Modern applications often need to persist data in standard formats for interoperability:

/**
 * Writing structured data formats
 */
function writeStructuredData(): void
{
    // Write JSON data
    $data = [
        'users' => [
            ['id' => 1, 'name' => 'John Doe', 'email' => '[email protected]'],
            ['id' => 2, 'name' => 'Jane Smith', 'email' => '[email protected]']
        ],
        'created_at' => date('c'),
        'version' => '1.0'
    ];
    
    $jsonContent = json_encode($data, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
    safeFileWrite('output/users.json', $jsonContent);
    echo "JSON data written successfully\n";

JSON is the most common data exchange format in modern web applications. Important considerations:

JSON Encoding Flags:

  • JSON_PRETTY_PRINT: Adds whitespace for human readability (increases file size)
  • JSON_UNESCAPED_SLASHES: Keeps URLs readable by not escaping forward slashes
  • JSON_UNESCAPED_UNICODE: Preserves Unicode characters instead of escaping them
  • JSON_THROW_ON_ERROR: (PHP 7.3+) Throws exception on encoding errors

Best Practices for JSON Files:

  • Always check json_last_error() after encoding
  • Use atomic writes for configuration files
  • Consider compression for large JSON files
  • Validate data structure before encoding

Performance Considerations:

  • Pretty printing increases file size by 20-30%
  • Large arrays/objects can cause memory issues
  • Consider streaming JSON libraries for very large datasets
    // Write CSV data
    $csvFile = 'output/users.csv';
    $handle = fopen($csvFile, 'w');
    
    if ($handle) {
        // Write header
        fputcsv($handle, ['ID', 'Name', 'Email', 'Created']);
        
        // Write data rows
        foreach ($data['users'] as $user) {
            fputcsv($handle, [
                $user['id'],
                $user['name'],
                $user['email'],
                date('Y-m-d H:i:s')
            ]);
        }
        
        fclose($handle);
        echo "CSV data written successfully\n";
    }

CSV remains popular for data exchange with spreadsheet applications and legacy systems:

CSV Writing Best Practices:

  • Always Include Headers: Makes files self-documenting
  • Consistent Formatting: Use same date/number formats throughout
  • Handle Special Characters: fputcsv() automatically escapes quotes and commas
  • Character Encoding: Be explicit about UTF-8 encoding for international characters

Common CSV Issues and Solutions:

  • Excel Compatibility: Consider using tab-delimited for better Excel support
  • Large Numbers: Excel may convert large numbers to scientific notation
  • Line Endings: Different systems expect different line endings (CRLF vs LF)
  • BOM for UTF-8: Some applications require byte order mark for UTF-8 files
    // Write XML data
    $xml = new SimpleXMLElement('<users></users>');
    
    foreach ($data['users'] as $userData) {
        $user = $xml->addChild('user');
        $user->addAttribute('id', $userData['id']);
        $user->addChild('name', htmlspecialchars($userData['name']));
        $user->addChild('email', htmlspecialchars($userData['email']));
    }
    
    $xmlContent = $xml->asXML();
    safeFileWrite('output/users.xml', $xmlContent);
    echo "XML data written successfully\n";
}

XML writing requires special attention to encoding and structure:

Security Considerations:

  • Always use htmlspecialchars() for user data to prevent XML injection
  • Validate data before adding to XML structure
  • Consider using CDATA sections for complex text content

XML Best Practices:

  • Use meaningful element and attribute names
  • Include XML declaration with encoding
  • Validate against schema when possible
  • Consider namespaces for complex documents

Performance Notes:

  • SimpleXML loads entire document in memory
  • For large XML files, use XMLWriter for streaming output
  • XML files are typically larger than equivalent JSON

Advanced File Writing with the FileWriter Class

For more complex file writing scenarios, an object-oriented approach provides better structure and reusability:

class FileWriter
{
    private $handle;
    private string $filename;
    private string $mode;
    
    public function __construct(string $filename, string $mode = 'w')
    {
        $this->filename = $filename;
        $this->mode = $mode;
        $this->openFile();
    }
    
    public function __destruct()
    {
        $this->close();
    }

This FileWriter class encapsulates file writing operations with several advantages:

Automatic Resource Management: The destructor ensures files are closed even if exceptions occur, preventing resource leaks.

Flexible Writing Modes: Supports all PHP file modes while defaulting to safe overwrite mode.

State Management: Maintains file handle and metadata throughout the object's lifetime.

Error Consistency: All errors are thrown as exceptions for consistent handling.

    /**
     * Open file for writing
     */
    private function openFile(): void
    {
        // Ensure directory exists
        $directory = dirname($this->filename);
        if (!is_dir($directory)) {
            mkdir($directory, 0755, true);
        }
        
        $this->handle = fopen($this->filename, $this->mode);
        
        if (!$this->handle) {
            throw new RuntimeException("Cannot open file for writing: {$this->filename}");
        }
    }
    
    /**
     * Write data with optional formatting
     */
    public function write(string $data, bool $addNewline = false): int
    {
        if (!$this->handle) {
            throw new RuntimeException('File not open for writing');
        }
        
        if ($addNewline) {
            $data .= "\n";
        }
        
        $bytesWritten = fwrite($this->handle, $data);
        
        if ($bytesWritten === false) {
            throw new RuntimeException('Failed to write data to file');
        }
        
        return $bytesWritten;
    }
    
    /**
     * Write formatted line
     */
    public function writeLine(string $format, ...$args): int
    {
        $line = sprintf($format, ...$args) . "\n";
        return $this->write($line);
    }
    
    /**
     * Write array as CSV row
     */
    public function writeCsvRow(array $data): int
    {
        if (!$this->handle) {
            throw new RuntimeException('File not open for writing');
        }
        
        $bytesWritten = fputcsv($this->handle, $data);
        
        if ($bytesWritten === false) {
            throw new RuntimeException('Failed to write CSV row');
        }
        
        return $bytesWritten;
    }
    
    /**
     * Flush data to disk
     */
    public function flush(): bool
    {
        if (!$this->handle) {
            return false;
        }
        
        return fflush($this->handle);
    }
    
    /**
     * Get current file position
     */
    public function getPosition(): int
    {
        if (!$this->handle) {
            throw new RuntimeException('File not open');
        }
        
        return ftell($this->handle);
    }
    
    /**
     * Close file
     */
    public function close(): void
    {
        if ($this->handle) {
            fclose($this->handle);
            $this->handle = null;
        }
    }
}

// Usage examples
try {
    demonstrateFileWriting();
    
    // Safe writing example
    $content = "Important data that must be written atomically\n";
    safeFileWrite('output/important.txt', $content);
    echo "Important data written safely\n";
    
    // Structured data writing
    writeStructuredData();
    
    // Using FileWriter class
    $writer = new FileWriter('output/log.txt', 'a');
    $writer->writeLine('[%s] %s: %s', date('Y-m-d H:i:s'), 'INFO', 'Application started');
    $writer->writeLine('[%s] %s: %s', date('Y-m-d H:i:s'), 'DEBUG', 'Debug information');
    $writer->flush();
    echo "Log entries written successfully\n";
    
} catch (Exception $e) {
    echo "Error: " . $e->getMessage() . "\n";
}
?>

File Uploads

Secure File Upload Handling

<?php
/**
 * Secure File Upload Handling
 * 
 * File uploads present significant security risks and must be handled
 * with comprehensive validation, sanitization, and security measures.
 */

class SecureFileUpload
{
    private array $config;
    private array $errors = [];
    
    public function __construct(array $config = [])
    {
        $this->config = array_merge([
            'upload_dir' => 'uploads/',
            'max_file_size' => 5 * 1024 * 1024, // 5MB
            'allowed_types' => ['jpg', 'jpeg', 'png', 'gif', 'pdf', 'txt'],
            'allowed_mime_types' => [
                'image/jpeg',
                'image/png', 
                'image/gif',
                'application/pdf',
                'text/plain'
            ],
            'require_image_validation' => true,
            'generate_unique_names' => true,
            'quarantine_suspicious' => true
        ], $config);
        
        // Ensure upload directory exists and is secure
        $this->setupUploadDirectory();
    }
    
    /**
     * Setup secure upload directory
     */
    private function setupUploadDirectory(): void
    {
        $uploadDir = rtrim($this->config['upload_dir'], '/') . '/';
        
        // Create directory if it doesn't exist
        if (!is_dir($uploadDir)) {
            if (!mkdir($uploadDir, 0755, true)) {
                throw new RuntimeException("Cannot create upload directory: $uploadDir");
            }
        }
        
        // Create .htaccess to prevent execution of uploaded files
        $htaccessFile = $uploadDir . '.htaccess';
        if (!file_exists($htaccessFile)) {
            $htaccessContent = "# Prevent execution of uploaded files\n";
            $htaccessContent .= "Options -ExecCGI\n";
            $htaccessContent .= "AddHandler cgi-script .php .pl .py .jsp .asp .sh .cgi\n";
            $htaccessContent .= "Options -Indexes\n";
            
            file_put_contents($htaccessFile, $htaccessContent);
        }
        
        // Create index.php to prevent directory listing
        $indexFile = $uploadDir . 'index.php';
        if (!file_exists($indexFile)) {
            file_put_contents($indexFile, '<?php http_response_code(403); ?>');
        }
    }
    
    /**
     * Handle file upload with comprehensive validation
     */
    public function handleUpload(string $fieldName): ?array
    {
        $this->errors = [];
        
        // Check if file was uploaded
        if (!isset($_FILES[$fieldName])) {
            $this->errors[] = 'No file uploaded';
            return null;
        }
        
        $file = $_FILES[$fieldName];
        
        // Validate upload
        if (!$this->validateUpload($file)) {
            return null;
        }
        
        // Additional security checks
        if (!$this->performSecurityChecks($file)) {
            return null;
        }
        
        // Move file to secure location
        return $this->moveUploadedFile($file);
    }
    
    /**
     * Validate uploaded file
     */
    private function validateUpload(array $file): bool
    {
        // Check for upload errors
        if ($file['error'] !== UPLOAD_ERR_OK) {
            switch ($file['error']) {
                case UPLOAD_ERR_INI_SIZE:
                    $this->errors[] = 'File exceeds upload_max_filesize directive';
                    break;
                case UPLOAD_ERR_FORM_SIZE:
                    $this->errors[] = 'File exceeds MAX_FILE_SIZE directive';
                    break;
                case UPLOAD_ERR_PARTIAL:
                    $this->errors[] = 'File was only partially uploaded';
                    break;
                case UPLOAD_ERR_NO_FILE:
                    $this->errors[] = 'No file was uploaded';
                    break;
                case UPLOAD_ERR_NO_TMP_DIR:
                    $this->errors[] = 'Missing temporary folder';
                    break;
                case UPLOAD_ERR_CANT_WRITE:
                    $this->errors[] = 'Failed to write file to disk';
                    break;
                case UPLOAD_ERR_EXTENSION:
                    $this->errors[] = 'Upload stopped by extension';
                    break;
                default:
                    $this->errors[] = 'Unknown upload error';
            }
            return false;
        }
        
        // Check file size
        if ($file['size'] > $this->config['max_file_size']) {
            $maxSizeMB = round($this->config['max_file_size'] / (1024 * 1024), 2);
            $this->errors[] = "File size exceeds limit ({$maxSizeMB}MB)";
            return false;
        }
        
        // Validate file extension
        $extension = strtolower(pathinfo($file['name'], PATHINFO_EXTENSION));
        if (!in_array($extension, $this->config['allowed_types'])) {
            $this->errors[] = 'File type not allowed: ' . $extension;
            return false;
        }
        
        // Validate MIME type
        $mimeType = mime_content_type($file['tmp_name']);
        if (!in_array($mimeType, $this->config['allowed_mime_types'])) {
            $this->errors[] = 'MIME type not allowed: ' . $mimeType;
            return false;
        }
        
        return true;
    }
    
    /**
     * Perform additional security checks
     */
    private function performSecurityChecks(array $file): bool
    {
        // Check if file is actually uploaded
        if (!is_uploaded_file($file['tmp_name'])) {
            $this->errors[] = 'File is not a valid upload';
            return false;
        }
        
        // Validate image files
        if ($this->config['require_image_validation'] && $this->isImageFile($file)) {
            if (!$this->validateImageFile($file['tmp_name'])) {
                return false;
            }
        }
        
        // Scan for malicious content
        if (!$this->scanForMaliciousContent($file['tmp_name'])) {
            return false;
        }
        
        return true;
    }
    
    /**
     * Check if file is an image
     */
    private function isImageFile(array $file): bool
    {
        $imageMimeTypes = ['image/jpeg', 'image/png', 'image/gif'];
        return in_array(mime_content_type($file['tmp_name']), $imageMimeTypes);
    }
    
    /**
     * Validate image file integrity
     */
    private function validateImageFile(string $tmpName): bool
    {
        // Try to get image information
        $imageInfo = getimagesize($tmpName);
        
        if ($imageInfo === false) {
            $this->errors[] = 'Invalid image file';
            return false;
        }
        
        // Check image dimensions
        [$width, $height] = $imageInfo;
        
        if ($width > 5000 || $height > 5000) {
            $this->errors[] = 'Image dimensions too large';
            return false;
        }
        
        if ($width < 1 || $height < 1) {
            $this->errors[] = 'Invalid image dimensions';
            return false;
        }
        
        // Additional image validation
        $allowedImageTypes = [IMAGETYPE_JPEG, IMAGETYPE_PNG, IMAGETYPE_GIF];
        if (!in_array($imageInfo[2], $allowedImageTypes)) {
            $this->errors[] = 'Image type not allowed';
            return false;
        }
        
        return true;
    }
    
    /**
     * Scan file for malicious content
     */
    private function scanForMaliciousContent(string $tmpName): bool
    {
        // Read first few KB to check for suspicious patterns
        $handle = fopen($tmpName, 'rb');
        if (!$handle) {
            $this->errors[] = 'Cannot read uploaded file';
            return false;
        }
        
        $chunk = fread($handle, 8192); // Read first 8KB
        fclose($handle);
        
        // Check for PHP code patterns
        $suspiciousPatterns = [
            '/<\?php/i',
            '/<\?=/i',
            '/<script/i',
            '/eval\s*\(/i',
            '/base64_decode/i',
            '/shell_exec/i',
            '/system\s*\(/i',
            '/exec\s*\(/i'
        ];
        
        foreach ($suspiciousPatterns as $pattern) {
            if (preg_match($pattern, $chunk)) {
                if ($this->config['quarantine_suspicious']) {
                    $this->quarantineFile($tmpName);
                }
                $this->errors[] = 'Suspicious content detected';
                return false;
            }
        }
        
        return true;
    }
    
    /**
     * Move suspicious files to quarantine
     */
    private function quarantineFile(string $tmpName): void
    {
        $quarantineDir = 'quarantine/';
        if (!is_dir($quarantineDir)) {
            mkdir($quarantineDir, 0700, true);
        }
        
        $quarantineFile = $quarantineDir . 'suspicious_' . uniqid() . '.bin';
        move_uploaded_file($tmpName, $quarantineFile);
        
        // Log the incident
        error_log("Suspicious file quarantined: $quarantineFile from IP: " . ($_SERVER['REMOTE_ADDR'] ?? 'unknown'));
    }
    
    /**
     * Move uploaded file to final location
     */
    private function moveUploadedFile(array $file): array
    {
        $uploadDir = rtrim($this->config['upload_dir'], '/') . '/';
        
        // Generate secure filename
        if ($this->config['generate_unique_names']) {
            $extension = strtolower(pathinfo($file['name'], PATHINFO_EXTENSION));
            $filename = uniqid('upload_') . '.' . $extension;
        } else {
            // Sanitize original filename
            $filename = $this->sanitizeFilename($file['name']);
        }
        
        $destination = $uploadDir . $filename;
        
        // Ensure file doesn't already exist
        $counter = 1;
        $originalDestination = $destination;
        while (file_exists($destination)) {
            $pathInfo = pathinfo($originalDestination);
            $destination = $pathInfo['dirname'] . '/' . $pathInfo['filename'] . '_' . $counter . '.' . $pathInfo['extension'];
            $counter++;
        }
        
        // Move file
        if (!move_uploaded_file($file['tmp_name'], $destination)) {
            $this->errors[] = 'Failed to move uploaded file';
            return null;
        }
        
        // Set secure permissions
        chmod($destination, 0644);
        
        return [
            'original_name' => $file['name'],
            'filename' => basename($destination),
            'path' => $destination,
            'size' => $file['size'],
            'mime_type' => mime_content_type($destination),
            'uploaded_at' => date('Y-m-d H:i:s')
        ];
    }
    
    /**
     * Sanitize filename
     */
    private function sanitizeFilename(string $filename): string
    {
        // Remove path information
        $filename = basename($filename);
        
        // Remove dangerous characters
        $filename = preg_replace('/[^a-zA-Z0-9._-]/', '_', $filename);
        
        // Limit length
        if (strlen($filename) > 100) {
            $pathInfo = pathinfo($filename);
            $name = substr($pathInfo['filename'], 0, 90);
            $filename = $name . '.' . $pathInfo['extension'];
        }
        
        return $filename;
    }
    
    /**
     * Get upload errors
     */
    public function getErrors(): array
    {
        return $this->errors;
    }
    
    /**
     * Handle multiple file uploads
     */
    public function handleMultipleUploads(string $fieldName): array
    {
        $results = [];
        
        if (!isset($_FILES[$fieldName])) {
            return $results;
        }
        
        $files = $_FILES[$fieldName];
        
        // Normalize array structure for multiple files
        if (is_array($files['name'])) {
            $fileCount = count($files['name']);
            
            for ($i = 0; $i < $fileCount; $i++) {
                $file = [
                    'name' => $files['name'][$i],
                    'type' => $files['type'][$i],
                    'tmp_name' => $files['tmp_name'][$i],
                    'error' => $files['error'][$i],
                    'size' => $files['size'][$i]
                ];
                
                // Temporarily override $_FILES for single file processing
                $_FILES['temp_single'] = $file;
                $result = $this->handleUpload('temp_single');
                unset($_FILES['temp_single']);
                
                if ($result) {
                    $results[] = $result;
                }
            }
        }
        
        return $results;
    }
}

// Usage example
if ($_SERVER['REQUEST_METHOD'] === 'POST' && isset($_FILES['upload'])) {
    try {
        $uploader = new SecureFileUpload([
            'upload_dir' => 'uploads/',
            'max_file_size' => 2 * 1024 * 1024, // 2MB
            'allowed_types' => ['jpg', 'jpeg', 'png', 'pdf'],
            'generate_unique_names' => true
        ]);
        
        $result = $uploader->handleUpload('upload');
        
        if ($result) {
            echo "File uploaded successfully:\n";
            print_r($result);
        } else {
            echo "Upload failed:\n";
            foreach ($uploader->getErrors() as $error) {
                echo "- $error\n";
            }
        }
        
    } catch (Exception $e) {
        echo "Upload error: " . $e->getMessage() . "\n";
    }
}
?>

For more PHP file and security topics:

Summary

Effective file operations are essential for robust PHP applications:

Reading Files: Use appropriate methods based on file size and memory constraints, with proper error handling and validation.

Writing Files: Implement atomic writing operations to prevent data corruption and ensure data integrity.

File Uploads: Apply comprehensive security measures including validation, sanitization, and malicious content detection.

Security First: Always validate file paths, sanitize filenames, check permissions, and implement proper access controls.

Performance Considerations: Use streams for large files, implement proper buffering, and consider memory usage in file operations.

Best Practices: Validate all inputs, handle errors gracefully, use secure file permissions, and log security incidents.

Mastering file operations enables you to build applications that safely handle user data, manage content effectively, and maintain security while providing excellent functionality and user experience.