Fixed
Details
Priority
Minor
Assignee
Alex
Reporter
Alex
Developer
Alex
Change Log Message
Fixes server load exponential raise on cache reset
Patch Instructions
Patches must be submitted through Phabricator.
To submit patch via Command Line use Patches Workflow (via Arcanist) tutorial.
To submit patch via Web Interface use Patches Workflow (via Web Interface) tutorial.
External issue ID
1123
External issue URL
Story Points
3
Fix versions
Affects versions
Created September 19, 2011 at 8:41 PM
Updated December 29, 2024 at 11:09 PM
Resolved July 25, 2012 at 10:33 AM
[b]Some background info:[/b]
To rebuild a cache, [b]user_a[/b] first deletes it. The first user to request the missing cache (not necessarily [b]user_a[/b], who deleted it) then builds it.
All seems fine until [b]user_b[/b] visits the website while [b]user_a[/b] is still building the cache. By the logic described above, [b]user_b[/b] also starts building the cache, in PARALLEL with [b]user_a[/b]. The same happens for every new user visiting the website until [b]user_a[/b] completes the cache building process. But [b]user_a[/b] won't be able to build the cache as quickly as planned, since the other users building the same cache in PARALLEL slow him down.
As a result of these parallel calculations, server load rises exponentially. For a dedicated server this might not be a big problem, but on shared hosting it could lead to a whole server shutdown.
Here are the preconditions that can cause the exponential server load explained above:
In-Commerce module installed
Unit Config Cache build time - 5 seconds
5+ RPS (requests per second to the site)
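To get a feel for these preconditions, here is a rough model (not a measurement). It assumes one new request every 1/RPS seconds, that every request starts its own build, and that the server splits one unit of CPU time equally among all running builds. The function name and the CPU-sharing model are assumptions made for illustration only.

```python
def active_builders(rps, build_seconds, duration_seconds):
    """Count cache builds still running after `duration_seconds`.

    Assumed model: one request arrives every 1/rps seconds, each request
    starts its own build, and the CPU is shared equally among all builds.
    """
    step = 1.0 / rps
    remaining = []  # CPU seconds each running build still needs
    for _ in range(int(duration_seconds * rps)):
        remaining.append(float(build_seconds))  # new visitor starts a build
        share = step / len(remaining)           # CPU time each build gets
        remaining = [r - share for r in remaining if r - share > 1e-9]
    return len(remaining)


# 5 RPS and a 5-second build: after 10 seconds no build has finished yet.
print(active_builders(rps=5, build_seconds=5, duration_seconds=10))
```

Under this model every one of the 50 requests that arrived in the first 10 seconds is still building, because each new build takes CPU time away from the ones already running.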
[b]Concept of fixing:[/b]
Each cache can be in one of 2 states:
we have a cache, but it's outdated and needs to be rebuilt
we don't have any cache and need to build it from scratch
[b]Here is what I propose:[/b]
when we have an outdated cache, let [b]user_a[/b] rebuild it while other users keep using the outdated cache version
when we don't have a cache, let [b]user_a[/b] build it while other users wait (a predefined number of seconds) for him to finish and then use the cache once it's ready
To implement the proposed idea we [b]always[/b] need a way to get the outdated cache version to return to other users while [b]user_a[/b] is building.
This is not always possible with the current automatic cache key expiration scheme. For example, cache key "[b]sample_key[%LangSerial%][/b]" (which automatically expires when the LangSerial cache key changes) is stored in the cache under the name "[b]sample_key[%LangSerial:1%][/b]" ("[b]:1[/b]" added). When the LangSerial cache key changes, the key name in the cache becomes different, and the cache stored under the previous name sort-of expires (since nobody knows how to access it). This works well, but it leaves no way to get the old cache key name, so the old data can't be returned to all users except the one building the new cache.
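The current naming scheme can be sketched as follows. This is only an illustration in Python: the function name `stored_key_name` and the `serials` dictionary are hypothetical and do not come from the actual code base.

```python
import re

# Matches serial placeholders such as [%LangSerial%]
SERIAL_PATTERN = re.compile(r"\[%(\w+)%\]")


def stored_key_name(key, serials):
    """Replace each [%Name%] placeholder with [%Name:<current value>%]."""
    return SERIAL_PATTERN.sub(
        lambda m: "[%{}:{}%]".format(m.group(1), serials[m.group(1)]), key
    )


cache = {}  # stands in for the real cache backend
serials = {"LangSerial": 1}

cache[stored_key_name("sample_key[%LangSerial%]", serials)] = "some cached data"

serials["LangSerial"] = 2  # the serial changes
new_name = stored_key_name("sample_key[%LangSerial%]", serials)
print(new_name in cache)  # the data is still stored, but under the old name
```

After the serial changes, the lookup uses a name that was never written, so the old data is unreachable even though it still sits in the cache.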
To solve this issue I propose to store an additional cache key alongside each stored cache key, and to stop replacing serial cache keys (the ones between "[%" and "%]") within the cache key name. The additional cache key holds the variable part of the cache key. This way the original cache key name always stays the same, and expiration can be detected by comparing the cached value of the additional cache key with its current value.
For example:
before:
key: "[b]sample_key[%LangSerial%][/b]" (actually stored key is: "[b]sample_key[%LangSerial:1%][/b]"), value: "[b]some cached data[/b]"
after:
key: "[b]sample_key[%LangSerial%][/b]" (actually stored key is: "[b]sample_key[%LangSerial%][/b]"), value: "[b]some cached data[/b]"
key: "[b]sample_key[%LangSerial%]_serials[/b]" ("[b]_serials[/b]" added to original cache key name), value: "[b]sample_key[%LangSerial:1%][/b]"
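A minimal sketch of the "after" layout, again in Python for illustration only. The helpers `store` and `fetch` and the plain dictionary used as a cache are assumptions; only the "_serials" suffix and the key format come from the example above.

```python
import re

SERIAL_PATTERN = re.compile(r"\[%(\w+)%\]")


def serials_signature(key, serials):
    """Build the variable part, e.g. sample_key[%LangSerial:1%]."""
    return SERIAL_PATTERN.sub(
        lambda m: "[%{}:{}%]".format(m.group(1), serials[m.group(1)]), key
    )


def store(cache, key, value, serials):
    """Store the value under the unchanged key plus a companion _serials key."""
    cache[key] = value
    cache[key + "_serials"] = serials_signature(key, serials)


def fetch(cache, key, serials):
    """Return (value, is_fresh); value is None when nothing was ever stored."""
    if key not in cache:
        return None, False
    is_fresh = cache.get(key + "_serials") == serials_signature(key, serials)
    return cache[key], is_fresh


cache = {}
store(cache, "sample_key[%LangSerial%]", "some cached data", {"LangSerial": 1})
fresh_result = fetch(cache, "sample_key[%LangSerial%]", {"LangSerial": 1})
stale_result = fetch(cache, "sample_key[%LangSerial%]", {"LangSerial": 2})
```

With this layout a changed serial no longer hides the data: `fetch` still returns the old value and only reports that it is outdated, which is what the other users need while one user rebuilds.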
To implement the described scheme we need to:
make the "[b]getCache[/b]" method wait for the cache (if it's totally missing) or return the outdated cache (when the cache is being built by another user)
make the "[b]setCache[/b]" method reset any cache building indicators (set by the rebuildCache method, see below)
create a "[b]rebuildCache[/b]" method that allows indicating that:
the cache will be rebuilt right away (e.g. set the "<cache_key>_rebuilding" cache key, so other users know that somebody is rebuilding the cache)
the cache must be rebuilt on the next user visit (e.g. set the "<cache_key>_rebuild" cache key, so the next user knows that the cache must be rebuilt)
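The three methods could interact roughly as sketched below. This is a Python illustration under several assumptions, not the actual implementation: the class name, the mode constants, the `max_wait` and `poll` parameters, and the convention that `get_cache` returns `None` when the caller should build the cache are all hypothetical. A plain dictionary stands in for the shared cache, so the check-and-set steps are not atomic as they would need to be on a real backend, and the serials comparison is left out.

```python
import time

REBUILD_NOW = "now"      # hypothetical mode: caller rebuilds right away
REBUILD_LATER = "later"  # hypothetical mode: next visitor rebuilds


class SketchCache:
    def __init__(self):
        self.store = {}  # stands in for a shared cache backend

    def rebuild_cache(self, key, mode=REBUILD_NOW):
        """Set the indicator matching the requested rebuild mode."""
        suffix = "_rebuilding" if mode == REBUILD_NOW else "_rebuild"
        self.store[key + suffix] = True

    def set_cache(self, key, value):
        """Store the value and reset both building indicators."""
        self.store[key] = value
        self.store.pop(key + "_rebuilding", None)
        self.store.pop(key + "_rebuild", None)

    def get_cache(self, key, max_wait=0.0, poll=0.01):
        """Return the cached value, or None if the caller should build it."""
        if self.store.pop(key + "_rebuild", None):
            # A rebuild was requested: this visitor becomes the builder.
            self.store[key + "_rebuilding"] = True
            return None
        if key in self.store:
            # Possibly outdated, but usable while somebody else rebuilds.
            return self.store[key]
        # Nothing cached: wait while somebody else is building it.
        deadline = time.monotonic() + max_wait
        while key + "_rebuilding" in self.store and time.monotonic() < deadline:
            time.sleep(poll)
        if key in self.store:
            return self.store[key]
        # Nobody is building, or the wait timed out: build it ourselves.
        self.store[key + "_rebuilding"] = True
        return None


cache = SketchCache()
if cache.get_cache("unit_config") is None:   # first visitor becomes the builder
    cache.set_cache("unit_config", "v1")     # clears the indicators
cache.rebuild_cache("unit_config", REBUILD_LATER)
builder_view = cache.get_cache("unit_config")  # next visitor must rebuild
other_view = cache.get_cache("unit_config")    # everyone else gets old data
```

Falling back to building the cache after the wait times out is one possible choice; the proposal above only says that other users wait a predefined number of seconds.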