PHP Web Site Optimization

Greg Weir

PHP Web Site Optimization Training Materials

What is Optimization? ⌘

  • Reducing resource utilization
  • Improving response time
    • Sometimes there is a tradeoff between the two: utilization is increased (e.g. memory spent on a cache) to reduce response time
  • In this course, general techniques for any PHP application are covered
    • For some specialized applications (e.g. WordPress), additional subject matter training is recommended

The Optimization Process ⌘

  1. Create a well-instrumented test
    1. Time metrics are absolutely required
    2. Some resource utilization metrics are needed; others may not be
    3. Re-using existing monitoring metrics is a good practice, but in some cases they may not be enough
  2. Run the test (establish a baseline)
  3. Make a change
    1. Identify possible sources of delay
    2. Make *one* change (e.g. optimize a SQL query, tweak a script, add a cache, add hardware), so that any difference in the results can be attributed to it
  4. Repeat test and record results
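
The baseline / one change / re-test loop above can be sketched in a few lines of PHP. This is only an illustration, not a replacement for a JMeter test: the measure() helper and both workloads are hypothetical, and hrtime() assumes PHP 7.3 or later.

```php
<?php
// Minimal timing harness: run a callable several times and report the
// median wall-clock time plus peak memory. All names are illustrative.
function measure(callable $task, int $runs = 5): array
{
    $timesMs = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true);                 // monotonic clock, nanoseconds
        $task();
        $timesMs[] = (hrtime(true) - $start) / 1e6;
    }
    sort($timesMs);
    return [
        // Middle element (the upper median when $runs is even).
        'median_ms'   => $timesMs[intdiv($runs, 2)],
        'peak_mem_kb' => memory_get_peak_usage(true) / 1024,
    ];
}

$haystack = range(1, 2000);

// Baseline: linear search with in_array() inside a loop.
$baseline = measure(function () use ($haystack) {
    $hits = 0;
    for ($i = 0; $i < 2000; $i++) {
        if (in_array($i, $haystack)) {
            $hits++;
        }
    }
    return $hits;
});

// One change: look values up by key in a flipped array instead.
$index = array_flip($haystack);
$changed = measure(function () use ($index) {
    $hits = 0;
    for ($i = 0; $i < 2000; $i++) {
        if (isset($index[$i])) {
            $hits++;
        }
    }
    return $hits;
});

printf("baseline: %.3f ms, after change: %.3f ms\n",
    $baseline['median_ms'], $changed['median_ms']);
```

Recording the median rather than a single run reduces the influence of one-off delays, which matters when the change being tested is small.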

Recommended Tools ⌘

  • Apache JMeter
    • Load testing tool, used to create and run tests
  • xdebug
    • Debugging extension for PHP, with a built-in profiler for performance data
  • Cachegrind GUI
    • Works with performance data generated by xdebug
    • Versions available for Linux, Windows, and macOS

Opcode Caching ⌘

  • OPcache has been bundled with PHP since 5.5; depending on the build, it may still need to be loaded and enabled in php.ini
    • Same sort of optimization Zend previously offered in its Zend Optimizer+ product
  • Parameters for the built-in caching are set through the OPcache extension's opcache.* directives in php.ini
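
Before tuning anything, it helps to confirm that the opcode cache is actually active for the SAPI being tested. The sketch below uses opcache_get_status() and ini_get(); the opcacheSummary() helper is a hypothetical name, and the directives listed are only a sample of the opcache.* settings.

```php
<?php
// Report whether OPcache is loaded and enabled for the current SAPI.
function opcacheSummary(): array
{
    if (!function_exists('opcache_get_status')) {
        return ['loaded' => false, 'enabled' => false, 'settings' => []];
    }

    // false = skip per-script details. The call can return false when the
    // cache is disabled or API access is restricted, so normalise to an array.
    $status = @opcache_get_status(false);
    $status = is_array($status) ? $status : [];

    $settings = [];
    foreach (['opcache.enable', 'opcache.enable_cli',
              'opcache.memory_consumption', 'opcache.validate_timestamps'] as $name) {
        $settings[$name] = ini_get($name);
    }

    return [
        'loaded'      => true,
        'enabled'     => !empty($status['opcache_enabled']),
        'used_memory' => $status['memory_usage']['used_memory'] ?? null,
        'hit_rate'    => $status['opcache_statistics']['opcache_hit_rate'] ?? null,
        'settings'    => $settings,
    ];
}

print_r(opcacheSummary());
```

Run from the command line, this will usually report the cache as disabled, because opcache.enable_cli is off by default; the web SAPI is the one that matters for most sites.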

Interprocess Communication ⌘

  • IPC is any means of storing variables or sending signals between processes
    • Legacy forms of IPC in PHP applications:
      • Shared Memory (shmop)
      • Flat file
      • Temp tables in databases
    • Best practice forms of IPC:
      • memcached
      • Redis
      • ZeroMQ, RabbitMQ, beanstalkd
      • Gearman
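
One common way memcached or Redis ends up being used is the cache-aside pattern: look a value up in the shared store and compute it only on a miss. The sketch below is a minimal version under stated assumptions: remember() and ArrayStore are hypothetical names, and ArrayStore is an in-process stand-in so the code runs without a server (it is not shared between processes, which is exactly what a real memcached instance adds).

```php
<?php
// Cache-aside helper. $store is any object exposing Memcached-style
// get($key) and set($key, $value, $ttl) methods.
function remember(object $store, string $key, int $ttl, callable $compute)
{
    $value = $store->get($key);
    if ($value !== false) {          // Memcached::get() returns false on a miss
        return $value;
    }
    $value = $compute();             // expensive work happens only on a miss
    $store->set($key, $value, $ttl);
    return $value;
}

// Hypothetical stand-in for a shared store; lives in this process only.
final class ArrayStore
{
    private array $items = [];

    public function get(string $key)
    {
        return $this->items[$key] ?? false;
    }

    public function set(string $key, $value, int $ttl = 0): bool
    {
        $this->items[$key] = $value; // TTL is ignored in this stand-in
        return true;
    }
}

$store = new ArrayStore();
// With the memcached extension and a server on localhost, one could use:
//   $store = new Memcached();
//   $store->addServer('127.0.0.1', 11211);

$calls = 0;
$load = function () use (&$calls) {
    $calls++;                        // counts how often the slow path runs
    return ['id' => 42, 'name' => 'example'];
};

$first  = remember($store, 'user:42', 60, $load);
$second = remember($store, 'user:42', 60, $load);
echo "compute calls: $calls\n";      // prints "compute calls: 1"
```

Because a stored value of false is indistinguishable from a miss here, production code against Memcached would normally check getResultCode() as well.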

Parallel Processing ⌘

  • The problem: PHP is synchronous in nature and is not designed for multithreading
  • Ways of doing parallel processing:
    • curl_multi or sockets - OK for lower numbers of parallel operations
    • popen() or exec() - requires custom code but is relatively easy
    • Forking (pcntl_fork) - this is an advanced technique
    • Task queueing - essential for large scale, easier to diagnose problems
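
As an illustration of the popen() approach, the sketch below starts three child PHP processes and only then reads their output, so the children run side by side. It assumes the PHP CLI on a POSIX-like shell; the job body is a placeholder for real work.

```php
<?php
// Run three slow jobs as child processes via popen().
$job = 'usleep(300000); echo getmypid();';   // hypothetical 0.3 s task

$start = hrtime(true);

// Start every child first; popen() returns as soon as the process is spawned.
$handles = [];
for ($i = 0; $i < 3; $i++) {
    $cmd = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($job);
    $handle = popen($cmd, 'r');
    if ($handle === false) {
        throw new RuntimeException("could not start: $cmd");
    }
    $handles[$i] = $handle;
}

// Then collect the results; each read blocks until that child exits.
$outputs = [];
foreach ($handles as $i => $handle) {
    $outputs[$i] = stream_get_contents($handle);
    pclose($handle);
}

$elapsedMs = (hrtime(true) - $start) / 1e6;
printf("%d children finished in %.0f ms\n", count($outputs), $elapsedMs);
```

Run one after another, the same three jobs would need at least 900 ms; started together they should finish in roughly the time of one job plus process startup. Error handling and limits on the number of children are left out, and that missing bookkeeping is much of what a task queue provides.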

Database Optimization ⌘

  • Know how to turn on query logging and, where the database offers it, query caching (MySQL removed its built-in query cache in version 8.0)
  • Database basics: keys, joins, table scans, functions
  • Optimizing a known query
    • Look at each table involved (MySQL: DESCRIBE tbl_name, Postgres: \d+ tbl_name)
    • EXPLAIN - Available in MySQL and PostgreSQL
    • Can the query use existing keys better?
    • If there is a subquery, can we reduce or reuse it?
    • Are we retrieving columns we don't need?
    • Can we create a key to help improve the query?
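
The checklist above can be tried end to end from PHP. This sketch makes several loud assumptions: it uses an in-memory SQLite database through PDO purely so that it runs without a server, so the statement is SQLite's EXPLAIN QUERY PLAN rather than the EXPLAIN of MySQL or PostgreSQL, and the orders table, its columns, and the plan() helper are invented for the example.

```php
<?php
// Compare the query plan before and after adding a key.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec(
    'CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        total REAL,
        notes TEXT
    )'
);

// Return the human-readable plan for a query as one string.
function plan(PDO $pdo, string $sql): string
{
    $rows = $pdo->query('EXPLAIN QUERY PLAN ' . $sql)->fetchAll(PDO::FETCH_ASSOC);
    return implode('; ', array_column($rows, 'detail'));
}

// Retrieve only the columns that are needed rather than SELECT *.
$sql = 'SELECT id, total FROM orders WHERE customer_id = 7';

$before = plan($pdo, $sql);          // no usable key yet: a table scan
$pdo->exec('CREATE INDEX idx_orders_customer ON orders (customer_id)');
$after = plan($pdo, $sql);           // the planner can now use the new key

echo "before: $before\n";
echo "after:  $after\n";
```

The exact wording of the plan differs between database engines and versions. What to look for is the change from a scan of the whole table to a lookup through the key.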