Redis for Administrators and Developers (PHP)

title: Redis for Administrators and PHP Developers
author: Alexander Patrakov

Redis Overview ⌘

Fast in-memory NoSQL database
- "REmote DIctionary Server"
- More than 1 000 000 GETs/sec on a typical developer's laptop, even more on servers
- Unique feature: performance guarantees for each command in Big-O notation
Different from memcached
- Offers disk-based persistence
- Has more than just strings to offer
Different from MongoDB
- Redis data types are closely related to fundamental data structures, not JSON
- No rich queries, just simple commands

Redis Releases ⌘

Download source from http://download.redis.io/releases/
- Redis 4.0.2 released on 21-Sep-2017
- No https, no hash sums! Are they serious about security?
- There are also tarballs available from https://github.com/antirez/redis/releases, with a different hash sum!

Major.Minor.Micro versioning
- Excellent backwards compatibility even between major versions
- Security fixes are provided (?) in the form of micro releases

Support policy: two latest minor versions
- But actually there were no 3.0.x releases after 3.2.0
- Security implications: always use the latest version?
- Or rely on a Linux distribution to backport the security patches
  - Verify that they do backport security patches!
- The reality may not be that sad: version 3.2.10 was released after 4.0.1, with a number of fixes

Redis Installation ⌘

Use packages? Sometimes they are good enough!
- Debian/Ubuntu: apt install redis-server # from the main repository?
- CentOS: yum install redis # from EPEL
- Arch: pacman -S redis # from the community repository
- SuSE: zypper install redis # from the server:database repository
Redis developers would like you to compile Redis from sources
- So that you have the latest version

Installation on Windows ⌘

There was a fork by Microsoft Open Technologies
- Redis-32 (very outdated) and Redis-64 NuGet packages available
- MSI releases also available
- Latest release in 2016, based on redis-3.0 codebase

What's new ⌘

Redis 4.0.x: 14-Jul-2017 - Present

- Redis Cluster is now compatible with NAT
- New replication engine
- Memory optimizations, including online defragmentation
- Background deletes
- Redis modules system

Redis 3.2.x: 06-May-2016 - Present

- Redis Sentinel is now compatible with NAT (3.2.2)
- New command for efficient bit-field manipulation
- New set of commands for geo indexing
- Memory optimizations
  - May or may not fully apply to versions in Linux distributions - bundled vs system jemalloc issue

Redis 3.0.x: 01-Apr-2015 - 28-Jan-2016
Redis 2.8.x: 22-Nov-2013 - 18-Dec-2015

Typical Package contents ⌘

Binaries:
- /usr/bin/redis-server
- /usr/bin/redis-cli
- /usr/bin/redis-sentinel
  - Symlink to redis-server
- /usr/bin/redis-trib
- /usr/bin/redis-benchmark
- /usr/bin/redis-check-aof
- /usr/bin/redis-check-rdb
Distributions add files such as:
- /usr/lib/systemd/system/redis.service
- /etc/logrotate.d/redis
- /etc/redis.conf

What does what ⌘

redis-server: the server
- Can start in standalone mode, in sentinel mode, or in cluster mode
redis-cli: a command-line client
- Used by sysadmins for debugging, monitoring and reconfiguration
- Also used in training courses to make them programming-language-neutral
redis-benchmark: the official way of benchmarking Redis
redis-check-aof, redis-check-rdb: check and possibly fix the structure of Redis database
redis-trib: a Ruby script that reconfigures a cluster
- Redis 3.0+
- Missing in Arch Linux because of unpackaged Ruby Redis client library
  - The missing library can be installed via gem, and the script will then work

Configuration file ⌘

Usually located at /etc/redis.conf or /etc/redis/redis.conf
Extensive documentation in the form of comments
The name of the configuration file is passed to redis-server as a parameter
Redis can be asked to rewrite its own configuration after dynamic changes
- Here is how: redis-cli CONFIG REWRITE
  - Fails in Arch: (error) ERR Rewriting config file: Read-only file system
- This feature is used by Redis Sentinel and Redis Cluster
- For standalone Redis servers, a read-only configuration file is better

Security notes ⌘

Redis is insecure
- Allows full access (including CONFIG SET and CONFIG REWRITE) to everyone who can connect and knows the password
Security features:
- The "bind" directive (specifies on which IP addresses to accept connections; the port is set by the separate "port" directive)
  - Defaults to 127.0.0.1 in Debian, missing elsewhere
- The "requirepass" directive
  - Sets global password (no username)
Protected mode
- If activated, and there is no "bind" and no "requirepass", don't accept commands on non-local connections

Security notes, continued ⌘

There is no SSL
- Run Redis on a trusted network
- Use stunnel/spiped/IPSec if absolutely needed
Don't run Redis as root
- Or your root's authorized_keys will be overwritten from a hacked webapp
  - Demo readily available
~~Don't let HTTP clients send queries to Redis~~
- There was a browser-based demo that steals data
- Does not work anymore - Redis developers have added protections against cross-protocol scripting
Set up a firewall
Don't ignore protections (apparmor, systemd ProtectSystem) set up by distributions!

How Redis works ⌘

At startup, loads the existing RDB or AOF file
- Populates internal data structures in memory
Listens on sockets
Handles all clients in one thread using epoll
- Attempts to read commands from clients
- Attempts to write any buffered output to clients
  - Beware of clients and slaves behind slow networks - memory bloat issue
- Executes one command at a time, puts any output into the buffer
  - Beware of slow commands - latency issue for other clients
If configured, periodically forks and saves its state into an RDB file

Versions in distributions ⌘

Data current as of 12-Nov-2017
Debian Stable: 2:3.2.6-1
- No significant patches
- Cross-protocol scripting protection is missing!
Debian Testing & Backports: 4:4.0.2-5 and 4:4.0.2-6~bpo9+1
- Latest version, the Debian version churn is about dropping non-deterministic tests
Ubuntu 16.04 LTS: 2:3.0.6-1
- Never updated since Ubuntu release
- Is in Universe, so no bugfixes and no security support
Ubuntu 17.10: 4:4.0.1-7

Versions in distributions, continued ⌘

Arch always has the latest Redis version
Fedora 26: Redis 3.2.8
CentOS: install from EPEL
- CentOS 6: Redis 2.4.10
- CentOS 7: Redis 3.2.3
OpenSuSE 42.3: Redis 3.2.9-84.11
- No significant patches over vanilla 3.2.9

Installation from source ⌘

No dependencies except gcc and related packages
make -j4
- Takes 18 seconds on a Lenovo Yoga 2 Pro laptop
- Possible tweaks: OPTIMIZATION=..., MALLOC=...
- Default: OPTIMIZATION=-O2, MALLOC=jemalloc
Or: make 32bit

make test
- Needs tcl installed
- 2 minutes here, everything passes
- Debian maintainers complain that some tests are sensitive to timing or otherwise unreliable

sudo make install
- Installs in /usr/local/bin
- Override: PREFIX=...

Situation with memory allocators ⌘

Default on Linux: MALLOC=jemalloc
- Jemalloc-4.0.3 is bundled with redis-4.0.2
- There was an upgrade attempt to 4.4.0, which failed due to deadlocks
- Possible values: libc, jemalloc, tcmalloc, tcmalloc_minimal
Distributions often undo the bundling of jemalloc
- So, in Debian, redis uses the system jemalloc 3.6.0

Quick test without installation ⌘

./src/redis-server ./redis.conf
./src/redis-cli
- See example session below
- List of all commands (this training only covers a subset)
- There is also a help command that provides the same help!
kill `pidof redis-server`
- Look, there is a dump.rdb file!

127.0.0.1:6379> set foo bar
OK
127.0.0.1:6379> get foo
"bar"
127.0.0.1:6379> get boom
(nil)
127.0.0.1:6379> quit

Connecting to other servers/databases ⌘

redis-cli -h host -p port
- TCP connection
redis-cli -s /run/redis/redis.sock
- UNIX socket, will be faster, but can be used for local connections only
redis-cli -n database
- By default, there are 16 databases, numbered from 0 (default) to 15
redis-cli -a password
- The password is per-server
- There is no username
There is no SSL - use stunnel or spiped
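
The same choices (TCP or UNIX socket, database number, password) exist in PHP clients. A minimal sketch, assuming Predis-style connection parameters (scheme, host, port, path, database, password); the helper name redisParams and its option names are made up for this illustration:

```php
<?php

// Hypothetical helper: builds a Predis-style connection parameter array.
function redisParams(array $opts = []): array
{
    if (isset($opts['socket'])) {
        // UNIX socket, the equivalent of "redis-cli -s"
        $params = ['scheme' => 'unix', 'path' => $opts['socket']];
    } else {
        // TCP connection, the equivalent of "redis-cli -h host -p port"
        $params = [
            'scheme' => 'tcp',
            'host' => $opts['host'] ?? '127.0.0.1',
            'port' => $opts['port'] ?? 6379,
        ];
    }
    if (isset($opts['database'])) {
        $params['database'] = (int) $opts['database'];  // database number, 0 to 15 by default
    }
    if (isset($opts['password'])) {
        $params['password'] = $opts['password'];        // per-server password, no username
    }
    return $params;
}
```

The result would then be passed to the client constructor, e.g. new Predis\Client(redisParams(['socket' => '/run/redis/redis.sock'])).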

How to start and stop redis ⌘

Start: redis-server /etc/redis.conf
- Will start listening on 127.0.0.1:6379 by default
Stop: two ways to do it
- redis-cli shutdown # specify the port if you changed it
- Or just kill the main process with SIGTERM
- Both ways can fail in different cases
Linux distributions typically provide an init script or a systemd unit that does it for you

Integration with systemd ⌘

Should work out of the box if you installed from package
Otherwise, copy-paste the unit from Arch Linux or from Debian
- Needs systemd >= 225 (released in August 2015)
- In Debian Unstable, they switched to killing with SIGTERM
- Also added support for running multiple instances of redis

Recommended sysctls ⌘

vm.overcommit_memory=1
- The default is 0, can prevent a big (> RAMSIZE/2) Redis from forking
- Redis needs to fork for saving its RDB file, for rewriting AOF, and for the initial synchronization of slaves
net.core.somaxconn=512
- TCP backlog size
Disable transparent hugepages
- They cause latency spikes when Redis forks
- Jemalloc also doesn't like them
- Don't do this using /etc/rc.local - it would be too late!
- The preferred method is the transparent_hugepage=never kernel argument
- Another good way: two lines in /etc/tmpfiles.d/no-thp.conf:
  - w /sys/kernel/mm/transparent_hugepage/enabled - - - - never
  - w /sys/kernel/mm/transparent_hugepage/defrag - - - - never

What else is needed ⌘

Your application that wants to connect to Redis!
Libraries exist for many programming languages
- C: hiredis
- Python: redis
  - Can use hiredis for speedup
- PHP: phpredis or Predis
- C#: StackExchange.Redis
- Node.js: ioredis or node_redis

Accessing Redis from PHP ⌘

phpredis
- A PHP extension
- Written in C
- More or less, exposes raw commands + serializer
- Also exposes a session backend - but without locking
predis
- Pure PHP
  - So 1.5x slower
- Exposes some useful abstractions, e.g.:
  - replication
  - client-side sharding
  - easier access to scripts
Both libraries have their own limitations
Both libraries are available in Debian and Ubuntu via apt-get
- But it may be preferable to install predis using composer - easier to autoload, guaranteed to be up to date

Simple example: counter ⌘

Let's demonstrate the INCR command

$ redis-cli
127.0.0.1:6379> incr counter:foo
(integer) 1
127.0.0.1:6379> incr counter:foo
(integer) 2
127.0.0.1:6379> incr counter:foo
(integer) 3

Naming Redis keys ⌘

Convention: use multi-part names with ":" (or sometimes "|") as a namespace separator
- "object-type:id" is a good idea
  - Various admin tools (like phpRedisAdmin) enforce it
- Think about colons in untrusted user data that becomes part of the key name
All bytes are OK
- But "*", "#" and "->" have a special meaning for SORT, so may be a problem
- "{" and "}" have a special meaning for Redis Cluster
Redis doesn't care about UTF-8
- Again, except in the SORT command, but that's about values
The empty string is also a valid key
Too-long names (>1000 bytes) are slow
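
One possible way to keep untrusted data from breaking the naming convention is to escape the separator before building the key. A minimal sketch; the helper redisKey and the percent-encoding scheme are assumptions for illustration, not a Redis convention:

```php
<?php

// Hypothetical helper: joins key parts with ":" and percent-encodes any
// "%" or ":" inside a part, so untrusted data cannot add extra namespaces.
function redisKey(string ...$parts): string
{
    $escaped = array_map(function (string $part): string {
        // "%" is escaped first, so the "%3A" produced for ":" stays unambiguous
        return str_replace(['%', ':'], ['%25', '%3A'], $part);
    }, $parts);
    return implode(':', $escaped);
}
```

For example, redisKey('session', $userSuppliedName) always produces a key with exactly one separator, whatever the user typed.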

Simple example: counter (with phpredis) ⌘

<?php

$redis = new \Redis();
$redis->connect("127.0.0.1");

$counter = $redis->incr("counter:foo");
echo "The counter is {$counter}\n";

Warning: error handling is omitted for illustrative purposes.

Simple example: counter (with predis) ⌘

<?php

// This works with PEAR, but better use a project-wide autoloader
// Prepend a base path if Predis is not available in your "include_path".
require 'Predis/Autoloader.php';
// on Ubuntu 17.10: require 'php-nrk-predis/Autoloader.php';

Predis\Autoloader::register();

$redis = new Predis\Client(['host' => '127.0.0.1']);
$counter = $redis->incr("counter:foo");
echo "The counter is: {$counter}\n";

Warning: error handling is omitted for illustrative purposes.

The only exercise for today ⌘

Counting votes
- You can't do it all now. After I explain each data structure, think about how it can help you make progress.
Users can vote (+1 or -1 or neutral) on articles
- Each user and each article are identified by a numeric ID
- Articles also have publication dates
Track the score that each article has
Prevent users from voting twice on the same article
- Or if you want to complicate things: allow them to change their mind!
Find top 10 articles for a time period between 7 days ago and now
Make sure that votes are lost (and voting is impossible) when an article is deleted
It should be possible to delete all votes of a particular user, too

Setting up a project ⌘

We'll use Predis and PHPUnit
- We'll install them from distribution packages
- You can use Composer instead if you prefer
We'll write some unit tests and then code that makes them pass
- E.g.:
  - The number of votes for an article should initially be 0
  - The number of votes if one user voted positively should be 1
  - The number of votes if one user voted positively for the same article twice should still be 1

Vote.php (no Composer) ⌘

<?php

namespace Vote;

class Vote
{
    // Redis client (e.g. Predis\Client), injected by the caller
    private $redis;

    public function __construct($redis)
    {
        $this->redis = $redis;
    }

    public function incCounter()
    {
        $result = $this->redis->incr("counter:foo");
        return $result;
    }
}

VoteTest.php (no Composer) ⌘

<?php

require 'PHPUnit/Autoload.php';
require 'php-nrk-predis/Autoloader.php';
require 'Vote.php';

use \PHPUnit\Framework\TestCase;


\Predis\Autoloader::register();

class VoteTest extends TestCase
{
    private $vote;

    // Note: PHPUnit 8 and later require "protected function setUp(): void"
    protected function setUp()
    {
        $redis = new \Predis\Client(['host' => '127.0.0.1']);
        $redis->flushall();  // tests should be independent, so we delete everything from the db
        $this->vote = new \Vote\Vote($redis);
    }

    public function testIncr()
    {
        $this->vote->incCounter();
        $this->vote->incCounter();
        $result = $this->vote->incCounter();
        $this->assertEquals(3, $result);
    }
}

And then run: phpunit VoteTest.php

Redis data types ⌘

Strings
Lists
Sets
Hashes
Sorted sets
Bitmaps (in fact, special operations on strings)
HyperLogLogs (2.8.9+)
Geoindexes (3.2.0+)

Redis data types, continued ⌘

The real question: which type is right for my use case?
- Answer: most likely, a combination of several types
  - Yes, redundant data - but that's also how SQL databases keep indexes
- Consider all possible requests that make sense now from a business standpoint
- Store data separately for each use case
- Common mistake: the data is stored, but there is no way to find it without the slow KEYS command

Strings ⌘

Are binary-safe
- Commonly used to represent a number, a string, or just a blob of data
Use these commands:
- GET key
  - Returns (nil) if the key does not exist
  - (nil) is different from an empty string
- SET key value
- APPEND key value
  - Non-existing keys are treated as holding an empty string

Redis strings in PHP bindings ⌘

$redis->get($key), $redis->set($key, $value) work as expected for strings
Both phpredis and Predis just use lowercased command names everywhere
Redis protocol indicates the data type for each reply, and Predis obeys that
- Which means that $redis->get($key) returns a string or null
phpredis can also (optionally) serialize objects and arrays automatically
- $redis->setOption(Redis::OPT_SERIALIZER, Redis::SERIALIZER_PHP);

Strings as numbers ⌘

Use these commands:
- GET key, SET key value
- INCR key, INCRBY key by, INCRBYFLOAT key by
- DECR key, DECRBY key by
  - There is no DECRBYFLOAT: use INCRBYFLOAT with a negative value
  - When incrementing or decrementing a non-existing key, it is assumed to hold 0
Both phpredis and Predis support these commands in a straightforward way
- Gotcha: $redis->get($key) still returns a string or a null, not an integer, you have to cast!

127.0.0.1:6379> get counter:foo
"3"
127.0.0.1:6379> incr counter:foo
(integer) 4
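
The cast can be hidden in a small helper. A sketch, assuming a connected client object (Predis or phpredis) passed in as $redis; the function name readCounter is hypothetical:

```php
<?php

// Returns the counter as an integer. A missing key counts as 0,
// which matches what INCR assumes for non-existing keys.
function readCounter($redis, string $key): int
{
    $value = $redis->get($key);   // a string such as "3" if the key exists
    // Predis reports a missing key as null, phpredis as false
    if ($value === null || $value === false) {
        return 0;
    }
    return (int) $value;
}
```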

Some more string operations ⌘

Get string length: STRLEN key
- Returns 0 if the key doesn't exist
Substrings can be stored and retrieved: SETRANGE key offset value, GETRANGE key start end
- In SETRANGE, the assumed length is the length of the value
  - The reply is the length of the resulting string
- Good for arrays of fixed-size objects with O(1) access time
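
To illustrate the fixed-size objects idea: if every record occupies the same number of bytes, the offsets for SETRANGE and GETRANGE follow directly from the record index. A sketch with 4-byte records; the constant and the function names are invented for this example:

```php
<?php

// Size of one record in bytes: here a 32-bit unsigned big-endian integer.
const RECORD_SIZE = 4;

// Byte offset for "SETRANGE key offset value".
function recordOffset(int $index): int
{
    return $index * RECORD_SIZE;
}

// Inclusive [start, end] pair for "GETRANGE key start end".
function recordRange(int $index): array
{
    $start = recordOffset($index);
    return [$start, $start + RECORD_SIZE - 1];
}

// Encoding and decoding of a single record.
function encodeRecord(int $value): string
{
    return pack('N', $value);
}

function decodeRecord(string $bytes): int
{
    return unpack('N', $bytes)[1];
}
```

With a client object this could be used as $redis->setrange($key, recordOffset(5), encodeRecord(1234)) and decodeRecord($redis->getrange($key, ...recordRange(5))).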

Deleting, testing and renaming keys ⌘

DEL key1 key2 key3 ...: deletes the named keys
EXISTS key1 key2 ...: returns the number of listed keys that exist
RENAME old new: renames the key
- Effectively deletes the new key if both keys exist
DEL and RENAME commands work with all data types, not only with strings

Hashes ⌘

Contain a whole mapping of field-value pairs
- Something like Redis inside Redis, under its own name
  - Design decision: objects as hashes vs json in strings
- There is no Redis inside Redis inside Redis
Important commands: HSET, HGET, HSTRLEN, HEXISTS, HDEL, HINCRBY, HINCRBYFLOAT
- You already know them - they work just like their cousins without H, but take the hash key as the first argument and the field as the second
- HSET key field value
There is no HRENAME, HSETRANGE, HINCR, HDECR or HDECRBY
- To decrement, pass a negative increment to HINCRBY or HINCRBYFLOAT
HKEYS returns all fields, HVALS gets all values, HGETALL gets both, HLEN counts fields
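
The "objects as hashes" option can look like this in PHP. A sketch only, assuming a connected Predis\Client (or a compatible object) passed in as $redis; the key layout "article:<id>", the field names and both function names are illustrative:

```php
<?php

function saveArticle($redis, int $id, string $title, int $published): void
{
    // One HMSET stores several fields of the same hash in a single command.
    $redis->hmset("article:{$id}", [
        'title' => $title,
        'published' => $published,
    ]);
}

function loadArticle($redis, int $id): ?array
{
    $fields = $redis->hgetall("article:{$id}");   // empty array if the key is missing
    if (!$fields) {
        return null;
    }
    // Redis returns every value as a string, so numeric fields need a cast.
    $fields['published'] = (int) $fields['published'];
    return $fields;
}
```

Compared with JSON in a string, single fields stay addressable (HGET, HINCRBY), but nested structures are not possible.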

Lists ⌘

Ordered collections of strings
- Similar to linked lists - not C++ deque!
Fast Operations: O(1)
- LPUSH/RPUSH key value: prepend/append a value
- LPOP/RPOP key: remove and get the value from either end
  - Return (nil) when applied to an empty list
- LLEN key: length of a list

Lists, continued ⌘

Slow operations: O(n) where n is the number of elements to traverse
- LINSERT mylist AFTER Hello There: finds "Hello", inserts "There" after it
- LINDEX key index: reads value by index
  - index works like in Python: 0 = leftmost, -1 = rightmost (and these cases are fast)
  - Use sorted sets instead if fast access by index is needed
- LRANGE key start stop: reads multiple values (stop is inclusive, unlike in Python)
- LSET key index value: sets a particular value
- LREM key count value: remove elements equal to value
  - count = 0 means all, count > 0 means some first elements, count < 0 traverses from the right
  - In the worst case, traverses all elements

LTRIM ⌘

Tricky operation: O(n), where n is the number of values to remove
- LTRIM key start stop: remove everything outside the specified range
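- So O(1) if it removes at most one element (use case: capped collection)
  - RPUSH log:general "INFO: nothing has happened"
  - LTRIM log:general -1000 -1
    - Keeps the 1000 newest entries; after RPUSH, a range of 0 999 would keep the 1000 oldest ones instead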
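
A sketch of the capped collection in PHP, assuming a connected Predis\Client or \Redis object passed in as $redis; the function name appendCapped is hypothetical. Because RPUSH puts the newest entry on the right, the range to keep is expressed with negative indexes (the last $max elements):

```php
<?php

// Appends a message and keeps only the newest $max entries.
function appendCapped($redis, string $key, string $message, int $max = 1000): void
{
    $redis->rpush($key, $message);
    // Negative indexes count from the right, so this keeps the last $max
    // elements and drops at most one element from the left: O(1).
    $redis->ltrim($key, -$max, -1);
}
```

The mirrored variant, LPUSH followed by LTRIM key 0 $max-1, works equally well; only the direction must be consistent.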

Sets ⌘

Like hashes, but without values
- No intrinsic ordering
Important commands:
- SADD key member1 member2 ... : adds members to the set
  - O(n) where n is the number of members to be added
- SREM key member1 member2 ... : removes from the set
- SCARD key : how many members are there?
- SISMEMBER key candidate : does the candidate exist in the set?
- SMEMBERS key : return all of them
- SPOP key: remove and return some random member

Sets, continued ⌘

Sets support bulk operations
- But they are slow: O(N + M + ...)
- SINTER set1 set2 ...: returns all members of the intersection
- SUNION set1 set2 ...: union
- SDIFF a b c d ...: elements of a that don't occur in b, c, d or ...
Server-side store available
- SINTERSTORE dest set1 set2 ...
- SUNIONSTORE dest set1 set2 ...
- SDIFFSTORE dest a b c d ...

Sorted sets ⌘

Like sets, but include a score (a floating-point number) with each member
- Members are intrinsically ordered by score (ascending)
- Members with the same score are ordered lexicographically as arrays of bytes
Insertion of an element: ZADD key score1 member1 score2 member2 ...
- O(log N) where N is the set size
Can look at ranges:
- ZRANGE key start stop // by index
- ZRANGEBYLEX key min max // by dictionary order
  - works correctly only if all members in the set have the same score
- ZRANGEBYSCORE key min max // by score
The following commands also work:
- ZSCORE key member, ZRANK key member, ZCARD key, ZREM key member

Bit arrays ⌘

Not really a new data type - just a view on strings
Commands: GETBIT, SETBIT, BITOP [AND|OR|NOT|XOR], BITCOUNT, BITPOS
Since Redis 3.2.0, it is possible to use a string as a collection of fields representing many small integers
- Use the BITFIELD command
- Wrapping, saturating or overflow-checked, signed or unsigned arithmetic
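
Because bit arrays are only a view on strings, the bit numbering can be tried out on an ordinary PHP string. The sketch below emulates GETBIT and SETBIT locally (no server involved); the function names are made up. The point to notice is that offset 0 is the most significant bit of the first byte:

```php
<?php

// Local emulation of GETBIT on a PHP string.
function getBit(string $value, int $offset): int
{
    $byte = intdiv($offset, 8);
    if ($byte >= strlen($value)) {
        return 0;               // bits past the end of the string read as 0
    }
    return (ord($value[$byte]) >> (7 - $offset % 8)) & 1;
}

// Local emulation of SETBIT: pads the string with zero bytes when needed.
function setBit(string $value, int $offset, int $bit): string
{
    $byte = intdiv($offset, 8);
    $value = str_pad($value, $byte + 1, "\0");
    $mask = 1 << (7 - $offset % 8);
    $old = ord($value[$byte]);
    $value[$byte] = chr($bit ? ($old | $mask) : ($old & ~$mask));
    return $value;
}
```

So after SETBIT key 7 1 on an empty key, GET key returns the single byte "\x01".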

HyperLogLogs ⌘

Task: count distinct values in incoming stream of strings
Constraint: memory
- Cannot store them all
Solution: probabilistic algorithm
Uses about 12 KB per estimator
- Actually a string - no separate data type
Commands: PFADD key value, PFCOUNT key
Also: PFMERGE dst src src ...

Geoindexes ⌘

Available in Redis 3.2.0+
- No separate data structure - just a special case of sorted set
Available in predis and phpredis
Store locations (with names and coordinates): GEOADD key lon lat name
- Locations too close (<5 degrees latitude) to the poles cannot be added
- No GEODEL, use ZREM instead
- Get them back: GEOPOS key name
Calculate distances: GEODIST key name1 name2
Find locations close to a given point: GEORADIUS key lon lat radius m|km|mi|ft

Modules ⌘

Available in Redis 4.0.0+
- "Shared objects" that can be loaded by Redis
  - Loading from config file: loadmodule /path/to/mymodule.so
  - MODULE LOAD command also exists
- Typically written in C
- Can provide arbitrary commands and data structures
- Redis modules hub

Modules (PHP) ⌘

Modules can define arbitrary commands with arbitrary arguments
- So, ability to execute raw commands is needed
- phpredis: $result = $redis->rawcommand("SOME", "COMMAND");
  - Very simple
- predis: $result = $redis->executeRaw(["SOME", "COMMAND"]);
  - It is also possible to create classes for new commands

class PrependCommand extends Predis\Command\Command
{
    // The PREPEND command comes from https://github.com/RedisLabsModules/redex
    public function getId()
    {
        return 'PREPEND';
    }
    
}

$redis->getProfile()->defineCommand('prepend', 'PrependCommand');
$redis->prepend("key", "something");

Pub/Sub ⌘

To subscribe to channels: SUBSCRIBE ch1 ch2 ...
- No further commands allowed other than SUBSCRIBE, PSUBSCRIBE, UNSUBSCRIBE, PUNSUBSCRIBE, PING and QUIT
- PSUBSCRIBE and PUNSUBSCRIBE work on channel name patterns
To publish to a channel: PUBLISH ch message
- Returns the number of clients that receive it
PUBSUB is a debugging command to list channels or count subscribers
It is possible for Redis to auto-publish certain events in keyspace
- Look for "notify-keyspace-events" in the config file
- Off by default

Pub/Sub unreliability ⌘

Clients sometimes disconnect and reconnect
Messages sent during that time are lost for them
Solution (if lost messages are a problem):
- Keep real messages in lists
- Publish only notifications that the list has changed
- On reconnection, assume that the list has changed
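
The list-plus-notification pattern can be sketched as follows, assuming a connected Predis\Client or \Redis object passed in as $redis. The function names, and the idea of tracking progress as "number of messages already seen", are assumptions made for this illustration:

```php
<?php

function publishReliably($redis, string $listKey, string $channel, string $message): void
{
    $redis->rpush($listKey, $message);      // the real message is kept in a list
    $redis->publish($channel, $listKey);    // the notification only says "the list has changed"
}

// Called on every notification and also right after (re)connecting.
// Returns all messages with index >= $seen, i.e. those not processed yet.
function fetchNewMessages($redis, string $listKey, int $seen): array
{
    return $redis->lrange($listKey, $seen, -1);
}
```

A subscribed connection cannot run LRANGE, so the subscriber needs a second connection for fetching.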

Pub/Sub from predis ⌘

Need to disable timeout on the socket
A subscriber loop abstraction is available
- Provides an iterable that gives out the messages

$pubsub = $redis->pubSubLoop();
$pubsub->subscribe('channel1', 'channel2');
foreach ($pubsub as $message) {
    switch ($message->kind) {
        case 'subscribe':
            // look at $message->channel
            break;
        case 'unsubscribe':
            // look at $message->channel
            break;
        case 'message':
            // look at $message->channel and $message->payload
            if ($finished) {
                $pubsub->unsubscribe();
            }
            break;
    }
}

- See a complete working example in the sources

Pub/Sub from phpredis ⌘

Also provides an abstraction for pubsub, but very differently

$redis->subscribe(
    ['channel1', 'channel2'],
    function ($redis, $channel, $message) {
        echo "Message arrived\n";
        // do whatever is needed
        if ($finished) {
            $redis->unsubscribe();
        }
    }
);

Blocking list operations ⌘

Useful for stacks and queues
BLPOP, BRPOP key1 key2 ... timeout
Try to pop a value from one of the specified lists
- Wait until something appears or until the timeout expires
- Lists are checked in the order given
- Timeout is in seconds, integer, 0 = block forever
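
A worker that consumes a queue could wrap BLPOP like this. A sketch, assuming a connected Predis\Client passed in as $redis; the function name nextJob and the queue names in the usage note are hypothetical:

```php
<?php

// Waits up to $timeout seconds for a job on any of the given lists.
// Returns [listName, job], or null if the timeout expired.
function nextJob($redis, array $queues, int $timeout = 5): ?array
{
    // Lists are checked in the order given, so high-priority queues go first.
    $reply = $redis->blpop($queues, $timeout);
    if (!$reply) {
        return null;    // nothing arrived before the timeout
    }
    return [$reply[0], $reply[1]];
}
```

Typical use is a loop around nextJob($redis, ['queue:high', 'queue:low']). The socket read timeout of the client has to be longer than the BLPOP timeout, otherwise the client gives up first.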

Transactions ⌘

A way to group operations so that nothing else happens in between
- It makes sense to combine transactions with pipelining
WATCH key1 key2 ...
GET ... or other operations
MULTI
SET ...
- QUEUED
SET ...
- QUEUED
EXEC
- <actual replies for commands in the transaction>
- Or a (nil) if at least one of the watched keys has changed

Transactions (phpredis) ⌘

More-or-less raw commands

$redis->watch(...);
$result = $redis->multi()
    ->hset(...)
    ->zadd(...)
    ->whatever_else(...)
    ->exec();

- Returns either an array of results or a FALSE value
- It is the responsibility of the user to retry

Transactions (predis) ⌘

Straightforward

$redis->watch(...);
$result = $redis->transaction()
    ->hset(...)
    ->zadd(...)
    ->whatever_else(...)
    ->execute();

- Returns an array of results (and never returns FALSE)
- If a WATCHed key gets changed, throws an exception; automatic retries need the "retry" option and a callable block
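
When a value has to be read before MULTI, Predis 1.x offers a callable-based form with check-and-set options. The sketch below follows the pattern of the CAS example shipped with Predis and assumes a connected Predis\Client passed in as $client; the function name doubleValue is hypothetical:

```php
<?php

// Doubles the integer stored at $key, retrying up to 3 times if another
// client changes the key between WATCH and EXEC.
function doubleValue($client, string $key)
{
    $options = [
        'cas' => true,      // allow reads between WATCH and MULTI
        'watch' => $key,    // WATCH this key
        'retry' => 3,       // re-run the callable if EXEC is aborted
    ];
    return $client->transaction($options, function ($tx) use ($key) {
        $current = (int) $tx->get($key);   // executed immediately
        $tx->multi();                      // commands below are queued
        $tx->set($key, $current * 2);
    });
}
```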

Special cases of the SET command ⌘

In some cases, you don't need transactions
SETNX key value
- Sets the key to a given value only if it doesn't exist
- Alternative form: SET key value NX
There is also a SET key value XX: sets only if the key already exists
GETSET key value: reads the value and replaces it with the new one

Lua scripts ⌘

An alternative to transactions
In transactions, you are unable to make decisions (or use results from GET) between MULTI and EXEC
SCRIPT LOAD script
- persists until server restart, or SCRIPT FLUSH
EVAL script numkeys key1 key2 ... arg1 arg2 ...
EVALSHA sha1 numkeys key1 key2 ... arg1 arg2 ...
- fails if the script does not exist
Lua tutorial
Lua string library tutorial

Lua scripts, continued ⌘

Sandboxed
Run atomically
- Until natural termination or SCRIPT KILL or SHUTDOWN NOSAVE
- SCRIPT KILL works only if there were no writes by the script
Can call commands: redis.call('GET', 'foo');
Can access keys and arguments as KEYS[] and ARGV[]
- What's a key and what's an argument is important in the context of Redis Cluster
- Please don't access other keys
  - It works for a standalone Redis, but will be a problem when you need to migrate to Cluster

Lua scripts in Predis ⌘

It is still possible to run low-level EVAL, SCRIPT LOAD and EVALSHA commands by hand
A higher-level abstraction is recommended
Example from the documentation:

class ListPushRandomValue extends Predis\Command\ScriptCommand
{
    public function getKeysCount()
    {
        return 1;
    }

    public function getScript()
    {
        return <<<LUA
math.randomseed(ARGV[1])
local rnd = tostring(math.random())
redis.call('lpush', KEYS[1], rnd)
return rnd
LUA;
    }
}

// Inject the script command in the current profile:
$client = new Predis\Client();
$client->getProfile()->defineCommand('lpushrand', 'ListPushRandomValue');

$response = $client->lpushrand('random_values', $seed = mt_rand());

Performance issues ⌘

"Redis is sometimes slow to respond"
- Ask devs about pipelining
- Check for slow commands
- Call LATENCY DOCTOR
"Redis is fat"
- Tune it, or change data structures
  - E.g. hashes are more memory-efficient than collections of "unrelated" strings
- Or maybe there are some keys that you forget to delete?
"Network is too slow"
- Often seen on WAN links
- Use VPN with compression
"Cannot assign requested address"
- Reuse connections

Benchmarking redis ⌘

There is an official benchmark tool: redis-benchmark
Try simulating the real workload:
- type of commands
- number of clients
- client IP address or the usage of UNIX socket
- pipeline depth
Can also repeat a custom command N times with variations

Monitoring Redis ⌘

INFO
- Look at memory statistics
- mem_fragmentation_ratio should be between 1 and 1.5 (except on empty server)
Latency problems?
- 200 µs is just expected on a gigabit LAN
- redis-cli --latency
- redis-cli --latency-dist
- Maybe it's not Redis? redis-cli --intrinsic-latency 100
  - Expect up to 75 µs spikes on commodity hardware
  - Does your KVM run with -realtime mlock=on?
    - 3-4 ms spikes happen on AliYun
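
The memory statistics check can be automated. A sketch that parses the raw text of an INFO reply (for example, the output of redis-cli info memory) and applies the fragmentation range mentioned above; both function names are made up:

```php
<?php

// Parses the raw text of an INFO reply ("name:value" lines and
// "# Section" comment lines) into a flat associative array.
function parseInfo(string $raw): array
{
    $result = [];
    foreach (preg_split('/\r?\n/', $raw) as $line) {
        if ($line === '' || $line[0] === '#') {
            continue;   // skip blank lines and section headers
        }
        list($name, $value) = explode(':', $line, 2) + [null, ''];
        $result[$name] = $value;
    }
    return $result;
}

// True if the fragmentation ratio is between 1 and 1.5.
function fragmentationLooksHealthy(array $info): bool
{
    $ratio = (float) $info['mem_fragmentation_ratio'];
    return $ratio >= 1.0 && $ratio <= 1.5;
}
```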

What do other clients do? ⌘

redis-cli monitor

OK
1477053420.062978 [0 127.0.0.1:57626] "COMMAND"
1477053420.066968 [0 127.0.0.1:57626] "ping"
1477053442.487071 [0 127.0.0.1:57632] "COMMAND"
1477053442.490143 [0 127.0.0.1:57632] "sadd" "foo" "bar"

Monitoring slow commands ⌘

SLOWLOG GET 10
- Prints last 10 slow commands
- ID, timestamp, time in µs, the command itself
In redis.conf: slowlog-log-slower-than 100
- The unit is µs
Change threshold at runtime: CONFIG SET slowlog-log-slower-than 50

Avoiding network latency ⌘

In the previous "Counter" example, we waited for the command to finish
- The execution thread was blocked
Sometimes we don't need the result
- CLIENT REPLY SKIP
  - Unfortunately not supported by PHP Redis clients
  - Supported by Jedipus Java client, but it is not active
Sometimes it is possible to do something else while waiting for the result
A useful trick is to send another command without waiting for the first one to finish
- It's a good idea to write the two commands to the network in one syscall
- This technique is called "Pipelining"
- This is not related to transactions

Pipelining (PHP) ⌘

predis

$responses = $client->pipeline()->set('foo', 'bar')->get('foo')->execute();

phpredis

$responses = $redis->multi(Redis::PIPELINE)->set('foo', 'bar')->get('foo')->exec();

Special cases ⌘

Modern versions of Redis support getting/setting multiple keys at once
MGET, MSET, HMGET, HMSET
Support is available in both phpredis and predis

Is pipelining effective? ⌘

How does a sysadmin know whether developers are using pipelining?
- Read their source code
- Run Wireshark on the client side
- Look for extra commands being sent between a command and its reply
- Look for multiple Redis commands in one TCP packet

Commands to avoid ⌘

KEYS pattern
- Scans all keys, returns those that match the pattern
SMEMBERS bigset
Problem: scanning of a big dataset
- Redis is single-threaded, so cannot do other work during this

What to do instead ⌘

SCAN, SSCAN, HSCAN, ZSCAN
All of them accept and return cursors
- The initial value of the cursor should be 0
- They all return a new value of the cursor and some (or maybe no) results
- If the new cursor is 0, the scan is complete
- predis wraps that in the HashKey and similar classes
Guarantees:
- Will return all elements that lived through the whole scan, at least once
- Will not mention any elements that were never there during the scan
May return some elements twice
May return empty slices
- phpredis has an option to retry automatically if this happens
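
The cursor protocol can also be driven by hand. A sketch, assuming a connected Predis\Client passed in as $redis (Predis's scan() takes the cursor and an options array, and returns the new cursor plus a batch of keys); the function name scanKeys is hypothetical:

```php
<?php

// Yields every key matching $pattern, one small batch per round trip.
function scanKeys($redis, string $pattern, int $count = 100): Generator
{
    $seen = [];
    $cursor = 0;
    do {
        // Each call returns the next cursor and a (possibly empty) batch.
        list($cursor, $keys) = $redis->scan($cursor, ['MATCH' => $pattern, 'COUNT' => $count]);
        foreach ($keys as $key) {
            if (!isset($seen[$key])) {     // SCAN may report a key twice
                $seen[$key] = true;
                yield $key;
            }
        }
    } while ((string) $cursor !== '0');    // cursor 0 means the scan is complete
}
```

The $seen array costs memory proportional to the number of matching keys; drop it if the caller can tolerate duplicates.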

How Redis stores data ⌘

It chooses representations based on data sizes
Has efficient storage for:
- Strings that represent integers
- Small HyperLogLogs
- Short lists and sets
- Sets consisting of integers
- Compressible data inside lists and sets
Debug: OBJECT ENCODING key

Memory optimization ⌘

Migrate to Redis 3.2 or later
Use efficient data types when such an alternative exists
- E.g. hashes vs "just collections of strings"
- E.g. bitfields
Use short field names in hashes
- Or even migrate to lists + convention regarding what is kept at what index
Tune thresholds that Redis uses to switch representations

Maybe it's a client problem? ⌘

Sometimes clients store keys and forget about them
OBJECT IDLETIME key

Memory buffer safeguards ⌘

Problem: client pipelined too much
- Replies accumulate in a buffer
Problem: fast client, slow slave
- Replicated commands accumulate in a buffer
Problem: slow Pub/Sub subscriber
- Messages accumulate in a buffer
Solution: client-output-buffer-limit
- Separate limits for normal clients, slaves and Pub/Sub

Other safeguards ⌘

Hard limit on memory usage: maxmemory
- Comes with a policy that defines what to do (evict keys, which ones, or give errors)
Limit on client idle time: timeout

Persistence ⌘

Two ways: RDB and AOF
- You can enable both
RDB persistence is enabled by the "save" directive
- Redis forks and saves its contents to a file
  - Uses atomic replacement so you can copy /var/lib/redis/dump.rdb, and you have a good backup!
- save 300 10 means save RDB to disk if there were at least 10 changes within 300 seconds
AOF stands for Append-Only File
- Essentially a log of all commands
- Sometimes rewritten to save space
- Enabled with "appendonly yes"
  - You also have to specify appendfsync always|everysec|no
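
Several "save" lines can be combined, and a snapshot is triggered as soon as any one of them is satisfied. The sketch below models that rule in Python as a simplification, not as the server's actual code. The name `snapshot_due` and the three example rules are assumptions made for illustration.

```python
# (seconds, changes) pairs, one per "save" line; these three are illustrative.
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def snapshot_due(rules, seconds_since_save, changes_since_save):
    """True if any rule has both its time window and its change count satisfied."""
    return any(
        seconds_since_save >= seconds and changes_since_save >= changes
        for seconds, changes in rules
    )


busy = snapshot_due(SAVE_RULES, 60, 10000)   # many changes in a short time
idle = snapshot_due(SAVE_RULES, 3600, 0)     # nothing changed, nothing to save
```

With these rules a busy server snapshots often, a quiet one rarely, and an idle one never.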

Forcing Redis to persist ⌘

SAVE
BGSAVE
BGREWRITEAOF

Restoring a backup ⌘

If you enabled only RDB:
- Stop Redis
- Copy dump.rdb to /var/lib/redis
- Start Redis
If you also enabled AOF:
- Stop Redis
- Delete AOF and temporarily disable it in the config
- Copy dump.rdb to /var/lib/redis
- Start Redis
- CONFIG SET appendonly yes # this will block for a while

Inspecting RDB files ⌘

redis RDB tools (a third-party project)
- Export to JSON
- Compare dumps
- Convert to Redis protocol
- Estimate memory usage

More than one Redis: the big picture ⌘

Just a bunch of Redis nodes
- E.g. one Redis for cache, one for sessions, etc.
- Distributed locking (Redlock)
Partitioning/sharding
- Always done based on the key
- Old solution: twemproxy
- Now supported in Redis Cluster
Replication
- No special tools needed
- Master-master replication is not available
Failover
- Redis Sentinel
- Now also Redis Cluster

Distributed locking (or rather, leases) ⌘

Objectives:
- Make sure that two application server nodes don't concurrently access some shared resource
- Do that using some Redis servers
  - With one Redis, that's easy: SETNX
- Tolerate failure of a minority of the Redis servers
- Tolerate network partitions
- Tolerate crashing clients (i.e. auto-release the lock after a predefined timeout)
Answer: Redlock
- Note: there were major criticisms
- Libraries are available for PHP
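
The single-server case can be modelled in a few lines. The sketch below is an in-memory Python model, not a Redis client and not Redlock: `LeaseTable` is a hypothetical class that imitates one server granting a lease only if nobody holds a live one, expiring it after a timeout, and releasing it only for the holder's token.

```python
import uuid


class LeaseTable:
    """In-memory model of one server holding leases (illustration only)."""

    def __init__(self):
        self._leases = {}  # lease name -> (token, expiry time)

    def acquire(self, name, ttl, now):
        """Return a fresh token, or None if a live lease already exists."""
        holder = self._leases.get(name)
        if holder is not None and holder[1] > now:
            return None
        token = uuid.uuid4().hex          # random value identifying this holder
        self._leases[name] = (token, now + ttl)
        return token

    def release(self, name, token):
        """Release only if the caller still owns the lease."""
        holder = self._leases.get(name)
        if holder is not None and holder[0] == token:
            del self._leases[name]
            return True
        return False


table = LeaseTable()
first = table.acquire("report", ttl=10, now=0)    # granted
second = table.acquire("report", ttl=10, now=5)   # refused: lease still live
third = table.acquire("report", ttl=10, now=11)   # granted: first lease expired
stale = table.release("report", first)            # refused: token no longer owns it
```

The token check on release matters: a client whose lease has already expired must not delete the lease that another client acquired afterwards.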

Redis Replication ⌘

Master/slave
Asynchronous
Disk-based and diskless modes
Effectively, the slave asks the master to dump an RDB, loads it, then listens for new commands
- While RDB is being dumped and downloaded, commands are buffered
- They consume memory
- Safeguard: CONFIG SET client-output-buffer-limit slave 256mb 64mb 60
  - Disconnect the slave if we buffered 256 MB, or couldn't go below 64 MB for 60 sec
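
The three numbers are a hard limit, a soft limit and a soft-limit duration. The sketch below models the decision in Python under simplifying assumptions: `OutputBufferLimit` is a hypothetical class, time is passed in explicitly, and the server's real bookkeeping may differ in detail.

```python
MB = 1024 * 1024


class OutputBufferLimit:
    """Toy model of a hard limit plus a soft limit that must persist for a while."""

    def __init__(self, hard_bytes, soft_bytes, soft_seconds):
        self.hard_bytes = hard_bytes
        self.soft_bytes = soft_bytes
        self.soft_seconds = soft_seconds
        self.soft_since = None           # when the buffer first exceeded the soft limit

    def should_disconnect(self, buffered_bytes, now):
        if buffered_bytes >= self.hard_bytes:
            return True                  # hard limit: disconnect immediately
        if buffered_bytes >= self.soft_bytes:
            if self.soft_since is None:
                self.soft_since = now    # start the soft-limit timer
            return now - self.soft_since > self.soft_seconds
        self.soft_since = None           # dropped below the soft limit: reset the timer
        return False


limit = OutputBufferLimit(256 * MB, 64 * MB, 60)
checks = [
    limit.should_disconnect(100 * MB, now=0),    # above soft limit, timer starts
    limit.should_disconnect(100 * MB, now=30),   # still within 60 seconds
    limit.should_disconnect(10 * MB, now=40),    # recovered, timer resets
    limit.should_disconnect(100 * MB, now=50),   # timer restarts at 50
    limit.should_disconnect(100 * MB, now=111),  # above soft limit for 61 seconds
]
```

A slave that briefly falls behind survives, while one that stays behind for more than a minute, or falls far behind at once, is dropped and has to resynchronize.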

Redis Replication setup ⌘

In config file: slaveof 1.2.3.4 6379
Dynamically: SLAVEOF 1.2.3.4 6379
Promote slave to master: SLAVEOF NO ONE
Other relevant configuration parameters:
- slave-serve-stale-data yes
- masterauth <password>
- repl-backlog-size 1mb # on master
  - Quick replication after quick restart

Redis Sentinel ⌘

High availability solution
Monitors Redis servers
Reconfigures master-slave chain if master fails
Also functions as a way to locate masters and slaves
- SENTINEL get-master-addr-by-name mymaster
- SENTINEL master mymaster
- SENTINEL slaves mymaster

Redis Sentinel in practice ⌘

You will need an odd number of sentinels
- Place them on your application servers
Also some Redis servers
Configure the master-slave replication manually first
Declare the master like this in redis-sentinel.conf:
- sentinel monitor mymaster 1.2.3.4 6379 2
- 2 = the number of sentinels that must agree that the master is down in order to fail it over
- It will discover slaves by asking the master
- It will announce itself and autodiscover other sentinels through PUB/SUB
Manual failover: redis-cli -p 26379 SENTINEL failover mymaster

Redis Sentinel bugs and limitations ⌘

Bug in postinst script in Debian backports: it hangs
- Systemd expects Redis Sentinel to daemonize and write the PID file
- Redis Sentinel is not configured to do so
  - daemonize yes
  - pidfile /var/run/redis/redis-sentinel.pid
  - logfile /var/log/redis/redis-sentinel.log
Authentication between sentinels is unsupported
- So disable protected mode

Redis Cluster ⌘

Enabled in redis.conf by setting "cluster-enabled yes"
Needs at least 3 masters
Will shard keys among them
- There are 16384 slots that can be assigned to masters in arbitrary proportions
Sharding is based on key hash
- Except when there is a {...} in the key name
- The first {...} is non-empty: hash it
- The first {} is empty: hash the entire key
Will also manage slaves
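
The key-to-slot mapping can be reproduced on the client side. The sketch below is Python, assuming the CRC16 variant used by Redis Cluster (XMODEM: polynomial 0x1021, initial value 0) taken modulo 16384. The function names are chosen for illustration and do not belong to any client library.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16, polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hashed_part(key: bytes) -> bytes:
    """Return the part of the key that is hashed, honouring the first {...}."""
    start = key.find(b"{")
    if start == -1:
        return key                       # no "{": hash the entire key
    end = key.find(b"}", start + 1)
    if end == -1 or end == start + 1:
        return key                       # no "}" or empty "{}": hash the entire key
    return key[start + 1:end]            # non-empty tag: hash only its contents


def key_slot(key: bytes) -> int:
    return crc16_xmodem(hashed_part(key)) % 16384


same_slot = key_slot(b"{user1000}.following") == key_slot(b"{user1000}.followers")
```

Keys that share a tag always land in the same slot, which is what makes multi-key commands on them possible in a cluster.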

Redis Cluster networking requirements ⌘

As of Redis 3.2.x, not compatible with NAT
- The first stable version of Redis to support that is 4.0 (RC1 was released on the 15th of October 2016)
- Fixing this required updating the protocol version
Needs an extra port (base_port + 10000, not configurable) for the cluster bus

Redis Cluster: client side ⌘

Clients must handle the ASK and MOVED errors (they happen if the server is not responsible for the key slot)
- The error contains the correct IP and port to connect to
There is also a CLUSTER SLOTS command
- Returns the full map of the cluster
- The client
Clients must not issue cross-slot commands
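
A redirection error has the form "MOVED 3999 127.0.0.1:6381" (or the same with ASK): the slot number followed by the address to use. The sketch below parses it in Python. `Redirect` and `parse_redirect` are hypothetical names, and real client libraries already do this internally.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str    # "MOVED" or "ASK"
    slot: int
    host: str
    port: int


def parse_redirect(error: str) -> Optional[Redirect]:
    """Parse a MOVED/ASK error message; return None for any other error."""
    parts = error.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    host, _, port = parts[2].rpartition(":")   # split on the last colon
    return Redirect(parts[0], int(parts[1]), host, int(port))


moved = parse_redirect("MOVED 3999 127.0.0.1:6381")
other = parse_redirect("ERR unknown command")
```

After MOVED the client should update its slot map, because the slot has changed owner permanently. After ASK it should retry only that one query on the given node, preceded by the ASKING command, and leave its slot map unchanged.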

Redis Cluster: operations ⌘

There is a CLUSTER command
- Tedious to use - e.g. can move only one slot at a time
- Redis comes with a redis-trib script
  - Needs a "redis" Ruby gem
  - "gem install redis", if your distribution hasn't packaged it