GeoWebCache can serve a staggering number of tiles to many simultaneous clients when the operating system and servlet container are properly configured. In some cases, however, it makes sense to run multiple caches, either for redundancy or to improve scalability.
The issue with tiling clients is that their requests are strongly correlated: any tile request is a strong predictor that adjacent tiles will also be requested. If a conventional load balancer is used, these requests will go to different GeoWebCache instances. Due to metatiling, every instance will then end up storing all tiles, which wastes storage, or they will all trigger separate requests to the backend, which may then become overloaded.
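The effect can be illustrated with a small simulation. This is only a sketch: the 4x4 metatile size, the round-robin balancer, and the names `metatile_key` and `simulate_round_robin` are assumptions made for illustration and are not part of GeoWebCache.

```python
from itertools import cycle

META = 4  # assumed metatile size: META x META tiles


def metatile_key(x, y, z, meta=META):
    """Return the key of the metatile that contains tile (x, y, z)."""
    return (x // meta, y // meta, z)


def simulate_round_robin(tiles, instance_count):
    """Spread tile requests round-robin over independent caches.

    Each cache renders (and stores) a whole metatile on its first miss,
    so the function returns the number of backend renders together with
    the set of metatiles held by each cache.
    """
    cached = [set() for _ in range(instance_count)]
    backend_requests = 0
    for (x, y, z), instance in zip(tiles, cycle(range(instance_count))):
        key = metatile_key(x, y, z)
        if key not in cached[instance]:
            cached[instance].add(key)
            backend_requests += 1
    return backend_requests, cached


# A 4x4 viewport at zoom 5 that falls entirely inside one metatile.
viewport = [(x, y, 5) for y in range(8, 12) for x in range(8, 12)]
backend_requests, cached = simulate_round_robin(viewport, 3)
```

With three instances, the same metatile is rendered three times and stored three times, although a single shared cache would have rendered it once.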
GeoWebCache could be extended to support clustering. When an instance receives a request, it first checks whether it already has the tile. If it does not, it sends a broadcast to its peers asking for the same tile. If a response is received, the tile is requested from that peer. If not, the cache forwards the request to the backend. While waiting for the backend, the cache will respond positively and block any requests for tiles on the same metatile, thus avoiding redundant requests to the backend.
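The lookup sequence described above might look roughly like the following in-process sketch. Everything here is hypothetical: `CacheNode`, `Backend`, `has_tile`, and `serve_peer` are illustrative names, peers are plain Python objects rather than networked instances, the metatile size is assumed to be 4x4, and "respond positively" is read as answering peer broadcasts with yes while a backend fetch is pending.

```python
import threading

META = 4  # assumed metatile size: META x META tiles


def metatile_of(tile):
    """Return the key of the metatile containing tile (x, y, z)."""
    x, y, z = tile
    return (x // META, y // META, z)


class Backend:
    """Hypothetical stand-in for the rendering backend; counts renders."""

    def __init__(self):
        self.renders = 0

    def render(self, key):
        self.renders += 1
        mx, my, z = key
        tiles = [(mx * META + dx, my * META + dy, z)
                 for dy in range(META) for dx in range(META)]
        return {t: "{}/{}/{}".format(*t).encode() for t in tiles}


class CacheNode:
    def __init__(self, backend):
        self.backend = backend
        self.peers = []
        self.store = {}        # tile -> bytes
        self.pending = set()   # metatiles currently being fetched from the backend
        self.locks = {}        # metatile -> lock
        self.guard = threading.Lock()

    def _lock_for(self, key):
        with self.guard:
            return self.locks.setdefault(key, threading.Lock())

    def has_tile(self, tile):
        """Answer a peer broadcast: yes if stored or already on its way."""
        with self.guard:
            return tile in self.store or metatile_of(tile) in self.pending

    def serve_peer(self, tile):
        """Hand a tile to a peer, waiting for any pending backend fetch."""
        with self._lock_for(metatile_of(tile)):
            return self.store.get(tile)

    def get(self, tile):
        key = metatile_of(tile)
        with self._lock_for(key):  # blocks other requests on this metatile
            if tile in self.store:  # 1. local hit
                return self.store[tile]
            for peer in self.peers:  # 2. "broadcast" to peers
                if peer.has_tile(tile):
                    data = peer.serve_peer(tile)
                    if data is not None:
                        with self.guard:
                            self.store[tile] = data
                        return data
            with self.guard:  # 3. nobody has it: go to the backend
                self.pending.add(key)
            try:
                rendered = self.backend.render(key)
                with self.guard:
                    self.store.update(rendered)
            finally:
                with self.guard:
                    self.pending.discard(key)
            return self.store[tile]


backend = Backend()
a, b, c = CacheNode(backend), CacheNode(backend), CacheNode(backend)
for node in (a, b, c):
    node.peers = [p for p in (a, b, c) if p is not node]

a.get((8, 8, 5))  # miss everywhere, so a renders the whole metatile
b.get((9, 8, 5))  # adjacent tile: b fetches it from a
c.get((9, 8, 5))  # c fetches it from a peer as well
```

In this run the backend renders the metatile once, and nodes `b` and `c` store only the single tile they were asked for. The sketch does not close every race: two nodes that broadcast at the same instant can both miss and both reach the backend, which a real implementation would need to address.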
To reduce the overhead of maintaining a cluster, the RESTful configuration interface can be expanded so that only one node in the cluster needs to be configured, and all other nodes can download the configuration on startup. The main job is to implement the broadcast and locking system, as well as an interface for distributing tiles among peers. Note that the design is such that the caches can also be geographically spread out.
Funding
This roadmap item is currently unfunded.
Add your support for this item by contacting us for a quote and discussion of the particular features you need.
Other Roadmap Items
Several improvements could be made to GeoWebCache to better support clustering for reliability and scalability. While traditional clustering techniques work, several aspects of geospatial cache clustering are unique and can be optimized for directly in the GeoWebCache codebase.
Seeding is the process of pre-populating the cache with tiles. Currently, a seed request can only take a bounding box. One improvement would be to let the seeder take a polygon, which can delimit the area of interest more precisely and so save significant time and bandwidth. Another would be depth-first seeding, for cases where the center of the seed area is of the most interest.
Funded: GeoWebCache does a great job caching static datasets and these improvements help make it a more active cache.
