In an interview with Susan Smith of GISCafe in 2009, Arne Kepp provides a detailed overview of the history and functionality of GeoWebCache. The entirety of the interview is provided below.
Through Google Summer of Code, Chris Whitney was able to spend a summer creating what became known as jTileCache. It had very basic functionality, but also original ideas like using the Java Caching System to store image objects and compress them on the fly. Over the next nine months, that code was then reworked by OpenGeo into what is today known as GeoWebCache. Through external funding we were able to add native interfaces for Virtual Earth and Google Maps, making it easy to use those clients against data served by WMS servers.
Last summer the project benefited from another generous Summer of Code grant, allowing Marius Suta to contribute code that enabled XML configuration using XStream and a RESTful configuration interface. Google's Open Source Office has also funded OpenGeo to add streaming Google Earth support to GeoServer, and GeoWebCache benefited from this, gaining the ability to tile and cache KML placemarks and vectors.
Looking forward, we are continuing to break down some of the limitations commonly associated with tile caches.
GeoServer is sort of like an older sibling. GeoWebCache has benefited tremendously from the collective experience of the GeoServer developer community and the feedback from GeoServer users. They have had a strong influence on the design and helped identify bugs. Originally, GeoWebCache was intended to be a library for GeoServer, but since the WMS standard provides a very stable interface it turned out to be just as easy to develop a separate servlet.
GeoWebCache was first included as a plugin in GeoServer 1.7.1. In the 1.7.x series it basically has the same functionality as a standalone version of GeoWebCache, but the plugin is automatically configured and has a reduced footprint since it shares many libraries with GeoServer.
In the 2.x series of GeoServer, which is currently at the alpha stage, we have started going beyond what can be achieved using standard HTTP requests. If a layer is added or reconfigured, the tile cache will be informed immediately through a callback interface and reevaluate any existing tiles. However, GeoWebCache also has a RESTful configuration interface that allows other servers to achieve the same effect.
The main improvement in version 1.1.0 is the support for "modifiable parameters". Other tile caches have associated one layer name with one set of tiles, meaning they only consider the bounding box and the layer name of the request; the rest is defined by the configuration of the cache. GeoWebCache has for a long time supported a separate set for each combination of spatial reference system and output format.
The new release takes this one step further, allowing you to configure filters and specify what other parameters constitute a set of tiles. For example, you can now serve the same layer with multiple styles, you can apply CQL filters, or you can use the time and elevation parameters introduced in WMS 1.3.0. One of the filter types uses regular expressions, which are extremely flexible, and the other is written for matching floating point numbers.
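As a rough illustration of the idea, the sketch below shows how filtered request parameters could become part of the key that identifies a set of tiles. It is not GeoWebCache's code or configuration format: the class names `RegexFilter` and `FloatFilter`, the function `tile_set_key`, the tolerance-based snapping, and the key layout are all assumptions made for this example.

```python
import re


class RegexFilter:
    """Hypothetical filter: accepts a value only if it fully matches a pattern."""

    def __init__(self, key, pattern, default):
        self.key = key
        self.pattern = re.compile(pattern)
        self.default = default

    def apply(self, value):
        if value is None:
            return self.default
        if self.pattern.fullmatch(value):
            return value
        raise ValueError(f"{self.key}={value!r} is not allowed")


class FloatFilter:
    """Hypothetical filter: snaps a number to the nearest allowed value."""

    def __init__(self, key, allowed, tolerance, default):
        self.key = key
        self.allowed = allowed
        self.tolerance = tolerance
        self.default = default

    def apply(self, value):
        if value is None:
            return self.default
        number = float(value)
        nearest = min(self.allowed, key=lambda a: abs(a - number))
        if abs(nearest - number) > self.tolerance:
            raise ValueError(f"{self.key}={value!r} is not allowed")
        return str(nearest)


def tile_set_key(layer, srs, fmt, request_params, filters):
    """Build one key per tile set: layer, SRS, format, then filtered parameters."""
    parts = [layer, srs, fmt]
    # Sort by parameter name so the key does not depend on filter order.
    for f in sorted(filters, key=lambda f: f.key):
        parts.append(f"{f.key}={f.apply(request_params.get(f.key))}")
    return "|".join(parts)
```

With a `RegexFilter` on `STYLES` and a `FloatFilter` on `ELEVATION`, two requests that differ only in style map to different keys, and so to different sets of tiles, while a request with a value that no filter accepts is rejected rather than cached.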
One existing feature that many users appreciate is the automatic configuration of GeoWebCache from a WMS GetCapabilities document. The drawback with using this method has been that you could not specify additional projections or output formats. In 1.1.0 this problem has been reduced: if the configuration file and the GetCapabilities document have overlapping layer names, the two configurations will simply be merged. In the long run we still hope to provide an AJAX interface to make this easy.
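The interview does not spell out how the two configurations are combined. The sketch below shows one plausible reading, assuming each layer is a dictionary of lists (for example projections and output formats) and that merging means taking the union per layer name. The function name `merge_layer_configs` and the field names are hypothetical.

```python
def merge_layer_configs(from_capabilities, from_config_file):
    """Union the per-layer lists of two configurations, keyed by layer name.

    Assumed data shape: {layer_name: {field_name: [values, ...]}}.
    """
    # Copy the auto-discovered layers so the inputs are left untouched.
    merged = {
        name: {field: list(values) for field, values in layer.items()}
        for name, layer in from_capabilities.items()
    }
    for name, extra in from_config_file.items():
        target = merged.setdefault(name, {})
        for field, values in extra.items():
            existing = target.setdefault(field, [])
            for value in values:
                if value not in existing:
                    existing.append(value)
    return merged
```

Under this assumption, a layer found in the GetCapabilities document with only `EPSG:4326` would gain any extra projections or formats listed for the same layer name in the configuration file.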
Very basic WFS caching is included in 1.1.0. The motivation behind this is that GeoServer's WFS supports zipped shape-files as an output format. These can be several hundred megabytes and very expensive to compute, so any public server should cache them. Again, you can limit what queries are allowed by using a regular expression, but this is definitely a feature that will be improved over time.
But most importantly, the 1.1.0 release lays a lot of the groundwork for future development. Key to this is the pluggable H2 database which stores meta information about tiles, so that it will now be possible to remove tiles that have not been accessed in a certain time period, or find tiles that have been outdated by a recent change.
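To make the role of tile metadata concrete, here is a minimal sketch of the two queries mentioned above. It uses Python's built-in `sqlite3` module as a stand-in for H2, and the table layout, column names, and function names are assumptions for illustration, not GeoWebCache's actual schema.

```python
import sqlite3


def open_metadata_store():
    """Create an in-memory table with one row of metadata per cached tile."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tiles ("
        "tile_key TEXT PRIMARY KEY, "
        "created REAL NOT NULL, "
        "last_access REAL NOT NULL)"
    )
    return db


def record_access(db, tile_key, now):
    """Update the last access time, inserting the row on first sight."""
    cur = db.execute(
        "UPDATE tiles SET last_access = ? WHERE tile_key = ?", (now, tile_key)
    )
    if cur.rowcount == 0:
        db.execute(
            "INSERT INTO tiles (tile_key, created, last_access) VALUES (?, ?, ?)",
            (tile_key, now, now),
        )


def stale_tiles(db, now, max_idle_seconds):
    """Tiles that have not been accessed within the given time period."""
    rows = db.execute(
        "SELECT tile_key FROM tiles WHERE last_access < ? ORDER BY tile_key",
        (now - max_idle_seconds,),
    )
    return [row[0] for row in rows]


def outdated_tiles(db, changed_at):
    """Tiles created before a change to the underlying data."""
    rows = db.execute(
        "SELECT tile_key FROM tiles WHERE created < ? ORDER BY tile_key",
        (changed_at,),
    )
    return [row[0] for row in rows]
```

The point of keeping this information in a database rather than in the tile files themselves is that both questions become simple indexed queries instead of a walk over the whole cache directory.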
GeoWebCache works great with any WMS-compliant server, including MapServer and deegree. But there are also a number of people who use it in front of ESRI products, Ionic and even custom WMS servers. On the client side it currently works with any software that can use the OSGeo WMS-C recommendations, including OpenLayers and uDig. There are also custom clients that use the Google Maps API.
The user base is anything from professional developers to home users. Based on the activity on the mailing list, my impression is that developers are actually the minority. Most questions appear to come from end users who own an existing WMS solution and wish to improve its performance or reduce costs.
While some understanding of WMS makes life easier, there are also those that do not want to deal with OGC services and use GeoWebCache so that they can access their data in Google Earth or use the APIs that Google Maps and Virtual Earth provide.
GeoWebCache acts like a proxy between clients and one or more WMS servers. When the client makes a request, GeoWebCache first checks to see whether it already has the corresponding tile. If not, the request is forwarded to the appropriate WMS server. When the response comes back, GeoWebCache first saves a copy (caches) and then forwards it to the client. The entire process adds only a few milliseconds to the time it takes to do the WMS request. Subsequent requests for the same tile are then answered in milliseconds using the copy, with the added benefit that this requires no resources on the WMS backend.
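The request flow described above can be sketched in a few lines. This is a simplified illustration rather than GeoWebCache's implementation: the class `TileCacheProxy`, the in-memory dictionary used as the tile store, and the `fetch_from_wms` callable standing in for the backend request are all assumptions.

```python
class TileCacheProxy:
    """Answers tile requests from a local store, asking the WMS only on a miss."""

    def __init__(self, fetch_from_wms):
        # fetch_from_wms is a hypothetical callable:
        # (layer, srs, fmt, bbox) -> bytes of the rendered tile.
        self.fetch_from_wms = fetch_from_wms
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get_tile(self, layer, srs, fmt, bbox):
        key = (layer, srs, fmt, tuple(bbox))
        tile = self.store.get(key)
        if tile is not None:
            # Cache hit: no work is done on the WMS backend.
            self.hits += 1
            return tile
        # Cache miss: forward the request, save a copy, then respond.
        self.misses += 1
        tile = self.fetch_from_wms(layer, srs, fmt, bbox)
        self.store[key] = tile
        return tile
```

Requesting the same tile twice through such a proxy calls the backend once. Seeding the cache in advance amounts to calling `get_tile` for every tile of interest before real clients arrive.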
This improves the user experience and also opens up a number of new possibilities. For example, the response time becomes less important, so the WMS server can use more complex rules and render tiles that look better. You can also seed the cache in advance, using the built-in web interface, so that some or all tiles are cached before the instance is used in production.
GeoWebCache has been designed for speed and scalability. Even a laptop can serve tiles at several hundred megabits per second. I have come across blogs and emails where people assume that their instances would be limited to the throughput or seek times of their hard drives, since this is where the tiles are persisted. But this is generally not the case: most modern operating systems have something called disk block caches. This effectively moves the most requested tiles into memory, so they can be accessed at much higher speeds. OpenGeo hopes that this increase in capacity will also allow data providers to make their data available to a wider audience.
In addition to Summer of Code, Google has contributed to both GeoServer and GeoWebCache by funding the development of three special output formats. Two of them are more closely related to Google Earth, namely raster super-overlays and regionated vectors. The first one works like the regular Google Earth background, improving the resolution of images as you zoom in. GeoWebCache can be used with any WMS server to achieve this effect.
The vector format uses OGC KML, and the key is that we gradually show more features as you zoom in. Developing code that automatically selects what items to show at what zoom level was a major undertaking. Both of these types of hierarchies can be cached using GeoWebCache. Google Earth requests a large number of tiles while you are zooming in or spinning the globe, so caching is crucial if you want to serve more than a few simultaneous clients.
The third format is what we call “geosearch” and most of the work is actually done in GeoServer. It is basically an XML sitemap, similar to those used for normal websites, and KML files representing each feature or row in the underlying database. The KML is automatically generated from any backend that provides vector data. Googlebot reads the sitemaps and then fetches all the KML placemarks. It analyzes the description of each feature and its location. After a period of about two weeks your data then becomes visible as a user-contributed placemark on maps.google.com. GeoWebCache's role here is to provide fast access in case the person searching wants to view the entire dataset or download the data as a shape-file.
The goal is to make tile caching as unobtrusive and easy to use as possible. I hope that the WMTS standard that OGC is working on will make it easier to develop clients and share tiles across applications. On the server side I want to make the cache more dynamic, to automatically expire tiles that are no longer accurate. This is particularly important to OpenGeo, since one of our primary goals is to create software that lets end users contribute and edit geospatial information through their web browsers. The term we use for this is “wikiable maps”. It is based on WFS with transactions, but maintains multiple versions of the same data.
On the enterprise side of things we are actively looking for clients to fund features that are particularly important for large users. These include clustering, with lateral cache synchronization, for increased scalability and reliability. We would also like to develop tools that make it easier to maintain a cache and gather detailed statistics about usage. To get there I have made a list of menu items that we hope clients will fund. That said, GeoWebCache is an exciting platform with a lot of possibilities, and the things we have listed represent only a small subset of what can be done.
Other White Papers
Since 2001, the Open Geospatial Consortium (OGC) has been engaged in developing a set of standards for web-enabling sensors and sensor observations. Version 1.0 of the Sensor Web Enablement (SWE) standards were approved and released in 2007. Versions 2.0 of these standards have either been approved, or will be approved by Fall 2011.
This paper outlines how the OpenGeo Suite Enterprise Edition augments the innovation of open source software communities with the testing, certification, and maintenance necessary to create and maintain reliable, long-term enterprise production web services.
The OpenGeo Suite is built from several open source projects (OpenLayers, GeoWebCache, GeoServer, PostGIS) that each provide distinct functionality. This paper explains what each component does and how they interact with other components.
GeoWebCache is gaining popularity as enterprises look to accelerate their online maps. In this interview, Arne Kepp, the project founder and OpenGeo team member, provides historical background and technical details.
The SDI model of distributed service providers can fall apart when services or connectivity are unreliable. National infrastructure providers can increase SDI reliability by providing a maintained caching infrastructure on top of distributed services.
GeoServer in a production environment can be evaluated according to three criteria: reliability, availability, and performance. This paper discusses methods for implementing production grade GeoServer deployments.
This is the third paper in a series of three describing OpenGeo’s vision for a distributed versioning system. This paper describes our proposed work path toward a fully realized infrastructure of distributed versioning tools for geospatial.
This paper is the second in a series of three which looks into the technology necessary to apply distributed versioning systems for source code control to geospatial information.
This is the first paper in a series of three that propose a new approach to working with spatial data, recommending a shift from treating spatial data simply as data to considering it as programmers do source code.
This white paper compares the relative strengths and weaknesses of closed source geospatial web services software, open source (unsupported) alternatives, and supported open source — namely the OpenGeo Suite.