Internet maps appear magical: portals into infinitely large, infinitely deep pools of data. But they aren't magical. They are built from a few standard pieces of technology, and the pieces can be rearranged and sourced from different places. Anyone can build an internet map.
The OpenGeo architecture is a way of categorizing the technology pieces that make an internet map work, and building on existing information infrastructures. The key to the architecture is breaking internet mapping into functional layers.
- Storage: Raw data needs to be managed in a consistent read/write data store, a relational database. OpenGeo uses the PostGIS spatial database. Other options include SQL Server, Oracle Spatial and DB2.
- Application server: The raw data needs to be accessed using web services, and rendered into cartographic products. OpenGeo uses the GeoServer map/feature server. Other options include ArcGIS Server, MapGuide, and MapServer.
- Application cache: Performance requires the caching of intermediate results, such as map tiles. OpenGeo uses the GeoWebCache tile cache. Other options include TileCache, ArcGIS Server and MapGuide.
- User interface framework: Targeted vertical applications serve one operational need and serve it well. OpenGeo uses GeoExt/ExtJS as a platform independent user interface toolkit. Other options include FLEX and Silverlight.
- User interface map component: Mapping applications need a map component that understands spatial features and map layers. OpenGeo uses OpenLayers. Other options include Google Maps API and Bing Maps API.
The OpenGeo Suite bundles all the layers together into a convenient one-click installer, and using all the layers together in the OpenGeo Suite is one way to build an internet map. However, the real power of the architecture is in the freedom to use the components individually to build new applications in combination with existing infrastructure.
- Organizations with existing GIS systems can build web applications by keeping their current storage layer (ArcSDE or Shape files) and putting the OpenGeo application and web interface layers on top.
- Organizations with no GIS systems can integrate mapping into existing enterprise applications by embedding the OpenGeo web components in existing pages and serving maps from the OpenGeo application layers.
- Organizations using Google map tools can deploy all the OpenGeo storage and application layers to integrate custom spatial information into the consumer-facing Google framework.
An effective architecture provides a complete solution, but in a modular manner, and that's how we have designed the OpenGeo architecture and OpenGeo Suite.
Putting maps on the web used to be very difficult. It required specialized software and, more importantly, specialized knowledge about the kinds of data and processes used to create cartographic products.
The difficulties arose in the gap between the general public understanding of "what is a map" and the geographic specialist understanding of "what is a map".
Specialists understand a map to be made from a number of "layers": topography, transportation, hydrography, land cover, human construction and so on. The manipulation of these layers, and the building of algorithms to analyze them, is a field unto itself: "geographic information systems" or "GIS".
Because the specialists were the first market for web mapping tools, their tools tended to embed the specialist understanding of what comprised a good mapping solution: it should expose multiple layers, the combinations of layers should be quite flexible for the end user, and the end user should provide the data to make the map. Specialists have access to lots of data, and they like to be able to turn their layers on and off.
Members of the general public usually have very simple mapping problems. They have one piece of data (a single "layer", in the specialist terminology) and they want to see it on a map. Using the specialist web mapping software to achieve their goals is tough, because in order to see their data "on the map", they first have to build "the map" – the collection of all the things that aren't their data, but that provide the locational context within which their data resides, generally called a "base map".
Building a "base map" involves finding all the relevant source data (topography, roads, water, place names, etc) for the working area, and establishing rendering rules (colors, line widths, labeling) for every scale of display. Even for specialists, tracking down data and establishing attractive multi-scale rendering rules can take several days.
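Consumer mapping services, such as those from Google and Microsoft, removed this obstacle by supplying a ready-made base map. Once freed from the awkward initial step of building their own base map, non-specialists rapidly colonized the online mapping space. As they did, two things happened: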
- the clients of the specialists saw the new consumer tools, and wondered why the tools their specialists were providing were so clunky; and,
- the non-specialists' demands for functionality soon outstripped what Google and Microsoft were willing to provide.
As time goes on, the demands for functionality on the part of non-specialists are moving upwards and converging with the demands of simplicity from organizations formerly served exclusively by specialist GIS staff.
In this fertile middle ground, between complex solutions for experts and restricted solutions for the consumer, we can define a web architecture that plugs into both the specialist and non-specialist use cases. The OpenGeo web mapping architecture can:
- serve data from specialist databases and files up into the consumer mapping portals;
- store and manipulate data for non-specialists using algorithms formerly only available via expensive GIS software; and,
- build desktop-like applications (including embedded maps and data capture features) that can be accessed via any web browser.
In all the ways that count, our web architecture is a "GIS", but it is one that subordinates the "G" to the "IS". There is nothing extra-special about a "geographic" information system that distinguishes it from an "information system".
The architecture of a generic web application usually has three tiers.
There is a database or other data storage system at the bottom, some application logic in the middle, and a user interface layer at the top. The database and application layers interact via SQL over a network protocol (specific to the database vendor but usually abstracted away with JDBC or ODBC). The application and the interface layers interact via encoded documents (usually XML or JSON) transferred over HTTP.
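As a minimal sketch of the pattern just described, the following Python snippet plays the role of the application layer: it runs SQL against a data store and encodes the result as a JSON document of the kind that would travel to the interface layer over HTTP. The in-memory SQLite database, the `places` table and the `places_as_json` function are illustrative assumptions, not part of any OpenGeo component.

```python
import json
import sqlite3


def places_as_json(conn, min_population):
    """Application logic: query the storage layer with SQL, return a JSON document."""
    rows = conn.execute(
        "SELECT name, population FROM places WHERE population >= ? ORDER BY name",
        (min_population,),
    ).fetchall()
    return json.dumps([{"name": n, "population": p} for n, p in rows])


# Storage layer: an in-memory database standing in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?)",
    [("Alder", 1200), ("Birch", 56000), ("Cedar", 340000)],
)

# Interface layer: would receive this document in the body of an HTTP response.
document = places_as_json(conn, 50000)
```

In a real deployment the database connection would go through a network driver and the document would be returned by a web server; only the division of responsibilities is the point here.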
Scratch any modern web application, and underneath you'll find an architecture like this one.
So, what does the OpenGeo web mapping architecture look like? It makes use of a set of five open source components, each fulfilling a particular functional role:
- Storage: PostGIS / PostgreSQL spatial database
- Application server: GeoServer map/feature server
- Application cache: GeoWebCache tile cache
- User interface framework: GeoExt / ExtJS
- User interface map component: OpenLayers
At the bottom of the OpenGeo architecture there is a database (PostGIS) or file-based storage system, there are application servers in the middle (GeoServer and GeoWebCache), and there's a user interface layer on the top (OpenLayers and GeoExt).
The database and application servers interact via SQL (with Open Geospatial Consortium standard spatial extensions). The application servers and user interface layers interact via standard web encodings (XML, JSON, images) over an HTTP transport.
The web mapping architecture is distinguished from a standard application architecture, not in the arrangement or classification of the parts, but in what the parts do.
- The PostGIS database can answer spatial queries as well as standard attribute queries.
- The GeoServer map/feature server can provide standardized web access to underlying GIS data sources.
- The GeoWebCache tile server can intelligently store and serve map tiles using standard web protocols for requests and responses.
- The GeoExt/ExtJS interface framework includes standard UI components and also specific bindings for spatial features.
- The OpenLayers map component can consume maps from multiple sources and provides tools for data editing and capture.
A key feature of the OpenGeo architecture is that, through the use of standards, any component of the architecture can be replaced with other products. For organizations with existing software infrastructure, this feature is a necessity.
- For storage, PostGIS can be swapped with Oracle Spatial, SQL Server Spatial, DB2 Spatial or ArcSDE.
- For web map access, GeoServer can be swapped with MapServer, ArcGIS Server, MapGuide or any other WMS-capable map renderer.
- For web feature access, GeoServer can be swapped with Ionic Red Spider, CubeWerx or any other fully featured Web Feature Server (WFS).
- For caching, GeoWebCache can be swapped with TileCache.
- For user map components, OpenLayers can be swapped with Google Maps, Bing Maps, and other components.
4.1. OpenGeo PostGIS
OpenGeo supports the PostGIS/PostgreSQL spatial database as the foundation of a pure open source architecture. If you are building an application, and want to use open source from top to bottom, PostGIS/PostgreSQL is the database we recommend. It is certified as compliant with the OGC "Simple Features for SQL" specification.
PostGIS is a spatial extension to PostgreSQL, and inherits all the enterprise features of the underlying database:
- 100% SQL92 standards support.
- Transactional integrity and disaster recovery. Pull the plug on a PostgreSQL server and it comes back up automatically, without data file corruption.
- Triggers, views, foreign key constraints, user-defined functions, and procedural programming languages in the back-end. Build, maintain and guarantee complex data models.
- Role-based security and authentication infrastructure.
- Hot-backups and replication.
- Partitioning and accelerated queries on horizontally partitioned databases.
- Standard external access via protocols (JDBC, ODBC) and language bindings (C, C++, Python, Perl, C#, Java, PHP, ASP, etc).
In addition, PostGIS adds types, functions and indexes to support the storage, management, and analysis of geospatial objects: points, linestrings, polygons, multipoints, multilinestrings, multipolygons and geometry collections.
As a spatial database, PostGIS can store very large contiguous areas of spatial data, and provide read/write random access to that data. This is an improvement over older file-based management structures, which were restricted by file-size limitations and the need to lock whole files during write operations.
The spatial SQL functions available in PostGIS make analyses possible that were previously the domain of workstation GIS systems:
- Join two layers based on spatial containment rules (e.g. "what is the census profile of our customers"):
  "SELECT avg(census.income) FROM census JOIN customers ON ST_Contains(census.geom, customers.geom)"
- Summarize layers based on spatial aggregates (e.g. "generate the polygon of all census tracts with annual income more than $50K"):
  "SELECT ST_Union(census.geom) FROM census WHERE census.income > 50000"
- Create new layers through spatial operations (e.g. "buffer all the roads by 100 meters and union the result into an output polygon", assuming the road geometries are stored in a coordinate system measured in meters):
  "SELECT ST_Union(ST_Buffer(roads.geom, 100)) FROM roads"
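To make the containment join concrete, here is a small pure-Python sketch of what the first query computes: for every pair where a census polygon contains a customer point, collect the polygon's income, then average. It is an illustration only. The ray-casting test, the function names and the sample data are assumptions made for this example, and the brute-force pairing stands in for the spatial index a database would use.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point (x, y) lies inside polygon [(x, y), ...]."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray cast to the right of the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def average_income_of_customers(tracts, customers):
    """Average tract income over every (tract, customer) pair where the tract contains the customer."""
    incomes = [
        tract["income"]
        for tract in tracts
        for customer in customers
        if point_in_polygon(customer, tract["geom"])
    ]
    return sum(incomes) / len(incomes) if incomes else None


# Two adjacent square tracts and four customers; the last customer is outside both.
tracts = [
    {"income": 40000, "geom": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    {"income": 70000, "geom": [(10, 0), (20, 0), (20, 10), (10, 10)]},
]
customers = [(2, 3), (5, 5), (15, 5), (30, 30)]

result = average_income_of_customers(tracts, customers)
```

Two customers fall in the first tract and one in the second, so the average is weighted by customer count, just as the SQL join produces one row per matching pair.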
4.2. Other Databases
Other spatial databases offer many of the same features as PostGIS, and the OpenGeo application layer (GeoServer) can integrate with them directly.
- Oracle Spatial
- DB2 Spatial
- ESRI ArcSDE
- Microsoft SQL Server Spatial
- MySQL Spatial
For organizations with existing database systems, deploying OpenGeo's GeoServer on top of their current database is a quick way to get on the geospatial web without re-tooling their infrastructure.
4.3. Other Data Sources
While databases offer the strongest combination of data integrity, integrated analysis and support for write operations, many organizations use GIS file formats to hold their data.
OpenGeo's GeoServer can read from and write to (with some limitations on concurrency) GIS files:
- ESRI Shape Files
- Image File Formats
In addition, GeoServer can read data directly from standard internet services:
- OGC Web Map Service (WMS) for access to imagery and rendered maps.
- OGC Web Feature Service (WFS) for access to vector features.
4.4. Architecture Examples
Using the OpenGeo suite with different data source architectures opens up a large number of deployment possibilities. The basic architectural pattern of using a database as a "single point of truth" allows a great deal of flexibility, both in supporting heterogeneous environments, and in evolving systems over time.
4.4.1. A Data Entry Application on ArcSDE
A very common use case for web services is gathering "red-lining" information: given a base map, mark in a polygon or a line, and annotate it. Red-lining is a way of capturing informal knowledge for future use. The term refers to drawing on paper maps with a red pen.
Suppose an existing data management system uses ArcSDE as the data container and ArcMap to create project data and paper cartography. How can the OpenGeo suite provide a red-lining service for external clients?
The core data are all in ArcSDE, so all that is needed is a way to expose project views of that data to clients. Simply add two tables to ArcSDE: a polygon table that contains the extents of project areas, client names and project titles (we will use this to drive the user interface to project locations); and a redline table, to hold the data you are collecting.
On top of ArcSDE, deploy OpenGeo's GeoServer. Configure styling rules for the core data, to create an attractive cartographic product. Configure a WFS service for the project areas and the redline data.
On top of GeoServer, deploy our GeoExt framework with OpenLayers. The application can be as simple as a map panel, with edit tools enabled, a table widget displaying the project area names, and a table widget displaying the redline annotations.
Because this design deploys on top of the existing ArcSDE database, the ArcMap operators can directly access the redline data to add to their paper maps, and can directly edit the project areas layer, to add new project areas to the web interface, without any special programming or administrative controls for the web application.
4.4.2. A WMS/KML Service on PostGIS
Providing data for other users to integrate can be hard, if the data are constantly changing. Dumping a file out to an FTP site works fine, but what if the data are changing every 15 minutes as new information streams in automatically? What about people who use Google Earth for visualization?
Here's an architecture for a simple asset tracking system that allows multiple tools for visualization and maintains historical information.
At the bottom of this architecture is OpenGeo's PostGIS spatial database. The data inputs are piped into PostGIS by converting the data stream to SQL "update" statements, that update an asset location table in the database. To maintain a history of locations, a simple "on update" trigger copies the location as an "insert" to an asset history table. (More advanced systems will also partition the history table, to manage the size issues created by continuous data streams.)
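The update-plus-history pattern can be sketched in a few lines. The example below uses Python's built-in SQLite module so that it runs anywhere. This is a substitution made for illustration: a PostGIS deployment would use a geometry column and a PostgreSQL trigger function, with different syntax. The table names, column names and the choice to copy the new (rather than the old) location into the history table are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE asset_location (
        asset_id INTEGER PRIMARY KEY,
        x REAL,
        y REAL,
        updated_at TEXT
    );
    CREATE TABLE asset_history (
        asset_id INTEGER,
        x REAL,
        y REAL,
        updated_at TEXT
    );
    -- On every update, copy the new location into the history table.
    CREATE TRIGGER asset_location_history
    AFTER UPDATE ON asset_location
    BEGIN
        INSERT INTO asset_history (asset_id, x, y, updated_at)
        VALUES (NEW.asset_id, NEW.x, NEW.y, NEW.updated_at);
    END;
    """
)

conn.execute("INSERT INTO asset_location VALUES (1, -122.68, 45.52, '2012-01-01T10:00')")

# The incoming data stream, already converted to "update" statements.
conn.execute(
    "UPDATE asset_location SET x = -122.67, y = 45.53, "
    "updated_at = '2012-01-01T10:15' WHERE asset_id = 1"
)
conn.execute(
    "UPDATE asset_location SET x = -122.66, y = 45.54, "
    "updated_at = '2012-01-01T10:30' WHERE asset_id = 1"
)

current = conn.execute("SELECT x, y FROM asset_location WHERE asset_id = 1").fetchone()
history = conn.execute("SELECT x, y FROM asset_history ORDER BY updated_at").fetchall()
```

The location table always holds exactly one row per asset, which keeps the map layer simple, while the history table grows with every update.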
On top of PostGIS, we add GeoServer. Configure the asset location table as a data source, and add any styling we want for the location points (symbols, circles, pixmaps, etc).
That is it for configuration! Everything else is done by end-users.
End-users can view the asset locations in:
- ArcMap, by adding the GeoServer WMS service to their application as a "WMS layer";
- Google Earth, by adding the GeoServer WMS service to their application using a KML output type;
- Google Maps, by entering the URL of the GeoServer KML service into the Google Maps search field; or,
- Any other client software (there are hundreds) that supports consuming WMS services as an input data source.
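Behind each of these clients is an ordinary HTTP request. As a hedged sketch, the function below assembles a WMS 1.1.1 GetMap URL from the standard request parameters. The server address and layer name are hypothetical, and the KML MIME type passed as the output format is an assumption about how the server names that format; check the capabilities document of the actual service for the formats it advertises.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width, height,
                   fmt="image/png", srs="EPSG:4326"):
    """Build a WMS 1.1.1 GetMap URL for a single layer."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value asks for the layer's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical server address and layer name.
BASE = "http://example.com/geoserver/wms"
BOUNDS = (-123.0, 45.0, -122.0, 46.0)

png_url = wms_getmap_url(BASE, "assets:asset_location", BOUNDS, 512, 512)
kml_url = wms_getmap_url(
    BASE, "assets:asset_location", BOUNDS, 512, 512,
    fmt="application/vnd.google-earth.kml+xml",  # assumed KML output format name
)
```

The two URLs differ only in the FORMAT parameter, which is why a single configured layer can feed both image-based and KML-based clients.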
The application server layer is responsible for mediating between the data layer and the user interface layer (UI). It acts as a protocol gateway, turning standard web requests from the UI into the specific calls necessary to talk to databases or read from GIS files. It also is a repository for custom logic, like processing services, and other application-specific routines.
OpenGeo supports the GeoServer spatial application server as the best building block for spatial web services, and the GeoWebCache tile server as an accelerator for custom map services.
5.1. OpenGeo GeoServer
GeoServer can read from multiple data sources, generate multiple output formats, and communicate using multiple standard protocols. As such, it fits easily into existing infrastructures, providing a communication path between old and new software components.
GeoServer presents spatial data (tables in a database, files on a hard drive) as feature collections, and allows HTTP clients to perform operations on those collections.
- Render them to an image, as an attractive cartography product.
- Apply a logical filter to them and retrieve a subset, or a summary.
- Retrieve them in multiple formats (KML, GML, GeoJSON).
Without GeoServer, when building a spatial web application, the developer would be required to write all the code between the web server and the database/files. With GeoServer, the developer can use a few standard access patterns to retrieve maps and information.
The access standards GeoServer implements include:
- OGC Web Map Service (WMS) for retrieving cartographic images;
- OGC Web Feature Service (WFS) for querying and retrieving vector feature collections;
- OGC Styled Layer Descriptor (SLD) for encoding cartographic styling rules;
- OGC Filter specification for encoding subset queries on feature collections;
- OGC KML for encoding feature collections for visualization in Google Earth; and,
- OGC Geography Markup Language (GML) for encoding feature collections for general purpose re-use.
All these standards are internationally recognized and approved.
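As an illustration of the "few standard access patterns" mentioned above, this sketch builds a WFS 1.1.0 GetFeature URL that asks for a filtered subset of a feature collection. The server address and layer name are hypothetical. The `cql_filter` parameter and the `application/json` output format are treated here as GeoServer conveniences rather than part of the WFS standard itself, whose own filter encoding is XML; treat both as assumptions to verify against the server in use.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(base_url, type_name, cql_filter=None,
                       output_format="application/json", max_features=None):
    """Build a WFS 1.1.0 GetFeature URL using key-value-pair encoding."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        "outputFormat": output_format,
    }
    if cql_filter is not None:
        # Vendor parameter (assumption): a text filter instead of OGC Filter XML.
        params["cql_filter"] = cql_filter
    if max_features is not None:
        params["maxFeatures"] = max_features
    return base_url + "?" + urlencode(params)


# Hypothetical server address and layer name.
url = wfs_getfeature_url(
    "http://example.com/geoserver/wfs",
    "demo:census",
    cql_filter="income > 50000",
    max_features=100,
)
```

The response to such a request is a feature collection in the requested encoding, which a client such as OpenLayers can parse directly.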
5.2. OpenGeo GeoWebCache
Like GeoServer, GeoWebCache is a protocol gateway. GeoWebCache sits between tiled mapping components (like OpenLayers, Google Maps and Bing Maps) and the rendering engine in GeoServer.
Tiled map components generate a large number of parallel requests for map tiles, and the tiles always have the same bounds, so they are prime candidates for caching. GeoWebCache receives tile requests, checks its internal cache to see if it already has a copy of the response, returns it if it does, or delegates to the rendering engine (GeoServer) if it does not.
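The check-then-delegate logic described above can be sketched in a few lines of Python. This is not GeoWebCache's implementation, only the general caching pattern: the `TileCache` class, the in-memory dictionary and the `fake_render` stand-in for the rendering engine are all assumptions made for illustration.

```python
class TileCache:
    """Serve tiles from memory, rendering each distinct tile at most once."""

    def __init__(self, render):
        self._render = render  # callable(layer, z, x, y) -> bytes
        self._tiles = {}
        self.hits = 0
        self.misses = 0

    def get_tile(self, layer, z, x, y):
        key = (layer, z, x, y)
        if key in self._tiles:
            self.hits += 1
        else:
            # Not cached yet: delegate to the rendering engine and keep the result.
            self.misses += 1
            self._tiles[key] = self._render(layer, z, x, y)
        return self._tiles[key]


render_calls = []


def fake_render(layer, z, x, y):
    """Stand-in for a slow request to the rendering engine."""
    render_calls.append((layer, z, x, y))
    return f"{layer}/{z}/{x}/{y}".encode()


cache = TileCache(fake_render)
first = cache.get_tile("roads", 3, 2, 1)
second = cache.get_tile("roads", 3, 2, 1)  # same tile: served from the cache
other = cache.get_tile("roads", 3, 2, 2)   # different tile: rendered
```

Because tile bounds are fixed, the tuple of layer and tile coordinates is a sufficient cache key. A production cache would also need persistent storage and a way to expire tiles when the underlying data change.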
When GeoWebCache delegates rendering requests to the rendering engine, it uses standard WMS requests. As a result, it can be used with any engine that supports WMS (ArcGIS Server, MapServer, MapGuide, etc). GeoWebCache supports tile requests from all the common tiled map components, so it can be used equally well under applications using OpenLayers, Google Maps, Bing Maps or Google Earth.
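Turning a tile request into a WMS request is mostly a matter of computing the tile's bounding box. The sketch below does this for one common convention, assumed here for illustration: tiles addressed by zoom, column and row with the origin at the top-left corner, in the Web Mercator projection used by the consumer map services. Other tile grids and row-numbering schemes exist, and the function name is hypothetical.

```python
# Half the width of the world in Web Mercator meters.
WEB_MERCATOR_EXTENT = 20037508.342789244


def tile_bbox(z, x, y):
    """Return (minx, miny, maxx, maxy) for tile (z, x, y), origin at top-left."""
    tiles_per_side = 2 ** z
    size = 2 * WEB_MERCATOR_EXTENT / tiles_per_side  # tile width and height in meters
    minx = -WEB_MERCATOR_EXTENT + x * size
    maxy = WEB_MERCATOR_EXTENT - y * size
    return (minx, maxy - size, minx + size, maxy)


world = tile_bbox(0, 0, 0)       # the single tile at zoom 0 covers the whole extent
north_east = tile_bbox(1, 1, 0)  # top-right quadrant at zoom 1
```

The resulting four numbers would populate the bounding box parameter of the WMS request sent to the rendering engine, with the image width and height set to the tile size in pixels.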
5.3. Other Application Servers
Because the OpenGeo suite is configured to use standard protocols for data access and delivery, it is possible to substitute alternate components, or create architectures that use multiple components.
Map rendering from PostGIS data sources to a WMS interface can be done by GeoServer, but it can also be done by MapServer, or by ArcGIS Server (9.2 or greater). Using the OpenGeo suite, architectures can mix and match different application servers depending on their relevant strengths.
5.4. Architecture Examples
5.4.1. A Transit Information Portal
This architecture is used by the Portland regional transit authority, TriMet, to back their trip planner and route mapping pages.
At the bottom of the architecture, the relevant spatial data are loaded into PostGIS and updated on a regular schedule. This is primarily to provide all other components of the architecture with a consistent interface (SQL) to the data, as well as a single point of truth.
On top of PostGIS, GeoServer provides rendering to cartographic map output, which is exposed via the standard WMS interface. The WMS interface is public, which means that users other than TriMet can make direct use of the maps. Log tracking shows that there are a number of clients using TriMet WMS services directly.
On top of GeoServer, GeoWebCache turns tile requests from OpenLayers into WMS requests to GeoServer and caches the results for performance.
On top of GeoWebCache, OpenLayers is the mapping component, embedded within a GeoExt framework providing user interface components like expandable trays, fold-out panels and more standard user interface elements.
Because each element of the interface is standardized, it is possible for external organizations to include components of the TriMet system in their own applications. It is also easier for TriMet to roll out new applications, using the same architectural base of services.
5.4.2. Caching ArcGIS Server
Organizations do not need to decommission their back-end rendering engines in order to get access to a modern tiled mapping interface.
Suppose an organization has a legacy web mapping site, consisting of ArcSDE and ArcGIS Server on Microsoft SQL Server, with an ASP front-end interface. The interface can be modernized, and performance improved, by using the top layers of the OpenGeo suite.
On top of ArcGIS Server, GeoWebCache provides high-speed tile-based access to the ArcGIS map rendering engine. On top of GeoWebCache, the GeoExt framework allows a completely client-side GUI application to be built, using the OpenLayers map component to pull the tiles from GeoWebCache.
The resulting application can continue to leverage the server software the organization is used to maintaining, while providing a faster, more modern user experience.
The advantage of web applications is ease of deployment and update. Updating a desktop application can take weeks of installs on many computers. Updating a web application requires deploying the code to one server. Now that the finish quality of web applications equals that of desktop applications, there is no reason not to use the web.
6.1. OpenGeo OpenLayers
OpenLayers is a generic mapping component, designed to consume spatial data and maps from numerous sources and display that data in a web browser. Unlike Google Maps or Bing Maps, OpenLayers is not tied to a particular map source. It can display maps from Google, Microsoft or Yahoo!, and also display custom maps generated by rendering engines like GeoServer, MapGuide, MapServer or ArcGIS Server.
In addition to map display, OpenLayers offers tools for working with spatial data directly in the browser: code components for reading features from GeoJSON, GML, and text formats, and tools for manipulating those features on the screen (digitizing, altering, and moving features).
6.1. OpenGeo GeoExt
GeoExt adds extensions to ExtJS that bind basic ExtJS components to the spatial features of OpenLayers. For example, a GeoExt "selection manager" which looks like a table in the user interface is bound to the set of selected features in OpenLayers.
The GeoExt set of components allows OpenGeo to rapidly develop custom spatial applications, because we don't have to constantly re-write the bindings between the map component and the other user interface components – we just use (or enhance) the existing GeoExt work.
6.2. Other User Interface Layers
The map components from Google, Microsoft, Yahoo and others can all be used in place of OpenLayers in the OpenGeo suite. Because GeoServer and GeoWebCache speak standard web protocols and formats, it is easy to build a site that uses Google Maps for visualization, but GeoServer for map rendering and query processing.
6.3. Architecture Examples
6.3.1. A Multi-Layer Data Viewer
It is common for an organization to want to validate older vector data against newer ground truth. The public availability of image data at Google Maps and Bing Maps has made the prospect even more tantalizing – if only the organization's data could be overlaid with the corporate imagery to check it!
Our suggested architecture makes use of GeoServer to render the organization's GIS data, from files, or existing databases, or PostGIS, whichever makes the most sense.
OpenLayers is then configured with three (or more) layers: base map layers from Microsoft and Google, and an overlay layer provided by GeoServer.
Additionally, because OpenLayers reads from OGC standard WMS sources, any WMS-capable rendering engine (MapServer, MapGuide, ArcGIS Server) could be used in place of GeoServer.
6.3.2. An Embedded Map in a Business Application
One unfortunate characteristic of legacy web mapping sites is that they tend to "lead with the map". The majority of the browser frame is devoted to the map area, and a small bar down the side is carved out for "extra information".
GIS specialists built whole software frameworks around this idea, and then imposed them on other business applications, to poor practical effect – "to see a map, press this button and the 'map application' will pop up with your feature highlighted".
OpenLayers takes the approach of "object embedding". An OpenLayers map is added to any web page with a named <div> tag.
The architecture leaves the business application architecture intact. It could be anything: any database, any scripting language. As an example here, we show an application built with Oracle and J2EE.
On the page where we want the map, we embed OpenLayers by adding a named <div> tag to the existing code. OpenLayers pulls a base map from Google, and renders an overlay map of business features via GeoServer, which in turn pulls data directly from Oracle.
© 2012 OpenGeo.
Redistributable under the Creative Commons Attribution-Share Alike license.