An overview of recent remote sensing and GIS based research in ecological informatics

This article provides an overview of some of the recent research in ecological informatics involving remote sensing and GIS. Attention focuses on a selected range of topics, including the nature of remote sensing data sets, accuracy and uncertainty, data visualization and sharing activities, and developments in ecological modelling research. It is shown that considerable advances have been made over recent years and that foundations for future research have been established. © 2010 Elsevier B.V. All rights reserved.


Background
Geotechnology has been couched as one of the three "megatechnologies" of the 21st century, along with nanotechnology and biotechnology (Gewin, 2004). On the cusp of the second decade of the 21st century it is evident that recent developments in the geotechnologies of Geographic Information Science (GIS) and remote sensing have had a substantial impact on ecological research, providing spatial data and associated information to enable the further understanding of ecological systems (Rundell et al., 2009). In a digital society there is sometimes an expectation that information is just a click away and this may apply to spatial information in the domain of ecological informatics. Indeed, it has been in the last five years that developments in and relating to the fields of GIS and remote sensing, such as those seen in internet technologies (e.g., Web 2.0; high-performance communication networks; Brunt et al., 2007), sensing technology (e.g., quantum well infrared photodetectors; Krabach, 2000), data standards and interoperability (e.g., OGC® KML 2.2) and spatially explicit modelling (e.g., Osborne et al., 2007a), have resulted in the revelation of previously unobservable phenomena and the posing of what might be termed a second generation of ecological questions. During this time we have also seen more support for the open exchange of data and software and for infrastructure for location based services. This paper serves as a review of some of the recent developments in GIS and remote sensing that have been, or promise to be, of benefit to ecological informatics. The subjects addressed are necessarily selective and limited in scope, drawing on topics of interest to us and, we hope, to you.

Remote sensing: overcoming limitations in spatial resolution
Since the turn of the twenty-first century over a hundred satellite platforms carrying Earth observation sensors have been launched, in addition to the many airborne and terrestrial sensors deployed (Boyd, 2009). Data acquired by these sensors and derived products (e.g., MERIS Terrestrial Chlorophyll Index product available from the UK's NERC Earth Observation Data Centre (NEODC); Curran et al., 2007), together with websites focused solely on publishing images and articles and engendering participation (e.g., NASA's Earth Observatory), are now commonplace. Nevertheless, despite this seeming plethora of Earth observation data, the derivation of ecologically relevant variables still demands innovative image processing algorithms and approaches to ensure data are fit for purpose. For instance, studies are sometimes limited by the spatial resolution of the remotely sensed imagery available. Often, for example, the spatial resolution, expressed typically by the image's pixel size, is coarser than desired, leaving small targets unresolved. Commonly, the desire is to have imagery with a spatial resolution finer than the size of the features of interest (Woodcock and Strahler, 1987). Given that the features of interest may be relatively small (e.g., an individual tree in a biodiversity study or a patch of woodland in a global mapping study) the spatial resolution can, therefore, limit the detail of a study, with many targets of interest beneath the minimum map unit. Spatial resolution is also linked to one of the largest sources of error and uncertainty in studies of land cover and land cover change. This latter issue is perhaps most commonly noted in relation to the mixed pixel problem. The proportion of mixed pixels in an image is a function of the sensor's spatial resolution and is a major problem in accurate mapping, as a mixed pixel cannot be appropriately represented in the standard (hard) image classification analyses used to map land cover from remotely sensed data (Fisher, 1997; Foody, 2002; Weng and Lu, 2009). The fundamental concern is that the pixel no longer belongs to a single class, violating a basic underlying assumption of many analyses in remote sensing; any hard class allocation which labels a pixel as being fully a member of a single class must be at least partially erroneous for a mixed pixel. In general terms, the proportion of mixed pixels will tend to increase with pixel size but mixing problems may still occur at fine spatial resolutions (e.g. for a crop imaged at a sub-metre resolution there could be intraclass mixing of reflectance from sunlit and shaded leaves as well as background soil). Mixed pixels may occur for a variety of reasons (Fisher, 1997) but are often evident at the geographical boundaries between classes or when the landscape mosaic is highly fragmented. In such situations standard hard image analysis techniques will not allow accurate estimation of basic features such as boundaries and landscape patterns, or the characterisation of objects that are of sub-pixel size. Unfortunately, this may often be the case, especially if using medium and coarse spatial resolution imagery, as the proportion of mixed pixels may be very high (Foody et al., 1996).
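The tendency for the proportion of mixed pixels to rise with pixel size can be illustrated with a minimal sketch. The 8 x 8 two-class landscape, the position of the class boundary and the function names below are all invented for illustration; real landscapes and sensor point spread functions are considerably more complex.

```python
# Toy illustration: a fine-resolution two-class map is aggregated into
# coarser pixels and the share of coarse pixels containing both classes
# (mixed pixels) is counted. The landscape is hypothetical.

def make_landscape(size=8, boundary=3):
    """Class 0 occupies the columns left of `boundary`, class 1 the rest."""
    return [[0 if col < boundary else 1 for col in range(size)]
            for _ in range(size)]


def mixed_pixel_fraction(fine_map, block):
    """Fraction of block x block coarse pixels that contain more than one class."""
    size = len(fine_map)
    mixed = 0
    total = 0
    for r0 in range(0, size, block):
        for c0 in range(0, size, block):
            classes = {fine_map[r][c]
                       for r in range(r0, min(r0 + block, size))
                       for c in range(c0, min(c0 + block, size))}
            total += 1
            if len(classes) > 1:
                mixed += 1
    return mixed / total


landscape = make_landscape()
for block in (1, 2, 4, 8):
    # Coarser pixels (larger blocks) straddle the class boundary more often.
    print(block, mixed_pixel_fraction(landscape, block))
```

In this toy case the mixed fraction rises from 0 at the native resolution to 1 when a single pixel covers the whole scene. The exact values depend entirely on the assumed landscape and should not be read as representative of any sensor.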
The mixed pixel problem may be reduced in a variety of ways. One obvious approach is to make use of imagery acquired at as fine a spatial resolution as possible. Recently many fine spatial resolution systems have been launched (Toutin, 2009). These systems allow ecosystems to be characterised over a range of scales, including the very local (Wulder et al., 2004), and allow questions that were previously impractical to study from space or on the ground to be addressed. IKONOS, for example, acquires imagery with a spatial resolution of 1 m (panchromatic) and 4 m (multispectral) that enables the study of local scale features from space (Read et al., 2003). IKONOS data have, for example, been used to study forested environments at the scale of individual tree crowns (Hurtt et al., 2003; Clark et al., 2004). Thus, there has been a trend for research to develop away from the mapping of broad land cover classes towards a focus on specific classes, often detailed classes such as tree species. Indeed, the latter application has become a focus of considerable attention and has made use of data acquired by a range of remote sensing systems (Franklin, 2000; Nagendra, 2001; Brandtberg, 2002; Haara and Haarala, 2002; Brandtberg et al., 2003; Holmgren and Persson, 2003; Sanchez-Azofelfa et al., 2003; Turner et al., 2003; Carleer and Wolff, 2004; Wang et al., 2004; Goodwin et al., 2005; Boschetti et al., 2007; van Aardt and Wynne, 2007). The derived information can aid both the assessment of biodiversity and its conservation (Landenberger et al., 2003; Wilson et al., 2004), especially as the spatial distribution of a species influences its ability to reproduce, compete and disperse as well as its exposure to damage or death. In terms of accuracy, it may now be possible to map some tree species from satellite sensor imagery to a level of accuracy that is comparable to that derived from the use of aerial sensor data (Carleer and Wolff, 2004; Wang et al., 2004). Fine spatial resolution imagery is, however, still often acquired by airborne sensors and there have been major developments in platforms (e.g. unmanned aerial vehicles) and imaging technologies that present new opportunities for image acquisition. Indeed the ties to major airborne campaigns have been loosened and researchers can increasingly focus on specific areas and targets, perhaps using basic imagery acquired by basic camera systems from an airborne platform. This may be of value to a range of applications including studies with a very focused interest, such as those concerned with introduced and invasive species (Ramsey et al., 2002; Underwood et al., 2003; Ellis and Wang, 2006; D'Iorio et al., 2007; Kakembo et al., 2007; Tsai et al., 2007; Auda et al., 2008; Singh and Glenn, 2009). Such studies may also benefit from a more tailored approach to mapping, in which effort and resources are focused on the thematic information of interest rather than the entire land cover mosaic of the region (Sanchez-Hernandez et al., 2007a).
The use of fine spatial resolution imagery brings its own set of problems. It is likely, for example, that the image data costs would be very high, that there would be a need for considerable preprocessing of the imagery to ensure appropriate radiometric and geometric properties, and that problems with class mixing would still occur. Additionally, since there are inter-relationships between key sensor properties (e.g. spatial and spectral resolution), it must be noted that a fine spatial resolution is often achieved by reducing another sensor property. The use of fine spatial resolution imagery may, therefore, not always be practical, especially for large area studies. An alternative to the use of fine spatial resolution imagery as a means of addressing the mixed pixel problem is the adoption of some form of spectral unmixing analysis. For example, a soft or fuzzy classification technique which allows multiple and partial class membership could be used to indicate sub-pixel class composition (Foody, 1996). These approaches have been used widely to provide representations of land cover that are typically more accurate and appropriate than those derived from the use of a conventional hard classification. They may, for example, help provide more meaningful representations of environmental continua (Rocchini and Ricotta, 2007; Heiskanen, 2008; Weng and Lu, 2009). Sub-pixel scale analyses have proved popular in ecological research, notably with the provision of products such as vegetation continuous fields (Hansen et al., 2002; Heiskanen, 2008). There are, however, also concerns with this type of analysis that can greatly limit its value (Foody and Doan, 2007; Ling et al., in press; Ngigi et al., 2009).
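As a minimal sketch of how sub-pixel class composition might be estimated, the example below applies a linear mixing model with only two endmembers. It is an illustration under stated assumptions and not the method of any study cited above: the endmember reflectances, the band choice and the function name are hypothetical, and operational unmixing typically involves more classes, more bands and explicit treatment of noise.

```python
def unmix_two_endmembers(pixel, endmember_a, endmember_b):
    """Return (fraction_a, fraction_b) for one pixel under a linear mixing model.

    Assumes pixel = f * endmember_a + (1 - f) * endmember_b + noise.
    The least-squares estimate of f is clipped to [0, 1], so the two
    fractions are non-negative and sum to one.
    """
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("endmembers must differ in at least one band")
    f = sum((p - b) * d for p, b, d in zip(pixel, endmember_b, diff)) / denom
    f = min(1.0, max(0.0, f))
    return f, 1.0 - f


# Hypothetical reflectances in (red, near-infrared) bands.
forest = [0.05, 0.45]
bare_soil = [0.25, 0.30]
mixed_pixel = [0.11, 0.405]  # constructed as 70 % forest + 30 % bare soil

fraction_forest, fraction_soil = unmix_two_endmembers(mixed_pixel, forest, bare_soil)
print(round(fraction_forest, 3), round(fraction_soil, 3))
```

A hard classifier would label this pixel as forest and discard the bare soil component, whereas the soft output retains both proportions. It still says nothing about where within the pixel each class lies.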
Although soft classifications provide valuable sub-pixel scale information and indicate sub-pixel class composition, they do not indicate the spatial arrangement of the sub-pixel scale class components. The latter may be of considerable interest to, for example, landscape ecologists. To address this problem some have sought to investigate ways of effectively increasing the spatial resolution of the imagery. This has often been achieved through use of a 'sharpening' image following some form of image fusion analysis (Ling et al., 2008a,b; Cetin and Musaoglu, 2009; Jing and Cheng, 2009) or the adoption of a super-resolution mapping technique (Ling et al., in press). The use of sharpening imagery has been popular with imagery acquired by sensors that operate at more than one spatial resolution (e.g. Landsat ETM+ has a fine spatial resolution panchromatic band that can be used to sharpen the multispectral imagery acquired at a coarser spatial resolution). However, there are problems with such correlative techniques and they are most suitable for use with a limited number of sensors. Super-resolution techniques have also attracted considerable interest. A variety of super-resolution analyses may be undertaken. In particular, super-resolution restitution seeks to form a finer spatial resolution image which may then form the focus of interest (Farsiu et al., 2006; Ling et al., in press), while super-resolution mapping seeks to map at a sub-pixel scale (Foody et al., 2005). Super-resolution techniques have been shown to increase the accuracy and realism of key features such as class boundaries (Foody et al., 2005; Ling et al., 2008a,b) and to provide useful information for ecological research. As with all techniques there are, however, challenges to address (Atkinson, 2009). For example, one key issue with super-resolution mapping based upon the outputs of a soft classification is that the error and uncertainty in the classification can have a major negative impact on the derived products (Foody and Doan, 2007).

Remote sensing: targeted mapping
Remote sensing has been widely used as a source of environmental information for ecological research. For example, in relation to biodiversity, studies have often sought to derive information on variables such as species richness and tried to facilitate biodiversity monitoring activities (Gillespie et al., 2008; Coops et al., 2009). The latter may be derived using a direct relationship between a measure of biodiversity and the remotely sensed response, often in the form of a vegetation index (e.g., Feeley et al., 2005; Oindo and Skidmore, 2002; Seto et al., 2004; Gillespie, 2005; Lassau and Hochuli, 2007; Bino et al., 2008; Rocchini et al., 2009a,b). Recent research has based biodiversity assessment on a variety of measures such as spectral diversity (Rocchini, 2007; Rocchini et al., 2007) and biochemical diversity (Carlson et al., 2007). However, many studies have sought to infer biodiversity information indirectly from remote sensing, often through information on the land cover mosaic represented in a land cover map derived from the imagery (e.g. Gould, 2000; Griffiths et al., 2000; Oindo and Skidmore, 2002; Kerr et al., 2001; Rocchini et al., 2006). Indeed many studies have used land cover data, often as a surrogate for data on habitat type, and frequently exploited the temporal dimension of remote sensing to monitor land cover dynamics (Kral and Pavlis, 2006; Duveiller et al., 2008; Munoz-Villers and Lopez-Blanco, 2008; Kerr et al., 2001; Luoto et al., 2002, 2004; Cohen and Goward, 2004; Bergen et al., 2007; Fuller et al., 2007; Lassau and Hochuli, 2007).
Traditionally remote sensing has been used to derive standard land cover maps. The latter typically show a variety of land cover classes, with each unit mapped assumed to belong to one of the set of mutually exclusive classes contained in the map legend. Such maps have been used widely in ecological work. For example, studies of biodiversity related issues have often focused on land cover change as this is one of the greatest threats to biodiversity. Hence land cover has also been a central variable in studies of biodiversity conservation (Duro et al., 2007; Gillespie et al., 2008; Jones et al., 2009; Haines-Young, 2009). Remote sensing and GIS have the potential to make important contributions to biodiversity conservation. They may, for example, aid the prioritization of candidate locations for new reserves (Mang et al., 2007; Schulman et al., 2007; Vogiatzakis et al., 2006; Cayuela et al., 2006; Tchouto et al., 2006; Friedlander et al., 2007; Wood and Dragicevic, 2007; Beech et al., 2008), especially as sometimes only relatively coarse biological information may be required (Shi et al., 2005; Harris et al., 2005; Knudby et al., 2007). Moreover, remote sensing and GIS may be used to help monitor the effectiveness of reserves, allowing, for example, evaluation of changes inside and outside of reserve boundaries (Southworth et al., 2006; Wright et al., 2007; Joseph et al., 2009).
There are, however, many problems in the remote sensing of land cover. Classes can be difficult to define (Comber et al., 2005), which can be a source of error in studies of change (See and Fritz, 2006). Accuracy is also typically a function of the thematic resolution of the map. Because of these various sources of error, map users are often keen to have information on map accuracy. The need for accuracy assessment is now well-established (Cihlar, 2000; Strahler et al., 2006). One problem, however, is that accuracy assessments may suggest that land cover products derived by remote sensing are of insufficient accuracy (Townshend, 1992; Wilkinson, 1996; Gallego, 2004; Lu et al., 2008). This problem may be particularly apparent in change detection based on post-classification comparison, where the amount of error in the individual maps could act to obscure or exaggerate change (Verbyla and Boles, 2000; Pontius and Lippitt, 2006). Change is also often mis-estimated as a result of locational errors causing data sets to be imperfectly co-registered. This type of problem may be especially important in studies focused on class edges or in patchy landscape mosaics, which may have considerable impacts on species and biodiversity (Harris and Reed, 2002; Lindell et al., 2007; Weber et al., 2008).
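The way in which error in the individual maps propagates into a post-classification comparison can be sketched with simple probability. The example assumes, purely for illustration, that classification errors in the two maps are independent, so the chance that a pixel is correctly labelled at both dates is the product of the two map accuracies. Real errors are often spatially and temporally correlated, in which case this product is only a rough guide. The function name is hypothetical.

```python
def expected_change_map_accuracy(accuracy_t1, accuracy_t2):
    """Probability that a pixel is correctly labelled at both dates.

    Assumes classification errors in the two maps are independent, which is
    a simplification made for this illustration only.
    """
    for accuracy in (accuracy_t1, accuracy_t2):
        if not 0.0 <= accuracy <= 1.0:
            raise ValueError("accuracies must be proportions between 0 and 1")
    return accuracy_t1 * accuracy_t2


# Two maps that each meet a hypothetical 85 % accuracy level.
print(round(expected_change_map_accuracy(0.85, 0.85), 4))  # 0.7225
```

Under this assumption, two maps that are each 85% accurate yield from-to labels that are expected to be correct for only about 72% of pixels, which may be enough to obscure or exaggerate modest amounts of real change.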
The standard, general purpose, land cover map is not always required for ecological research. Recently there has been a growing interest in more tailored or targeted mapping. For example, a researcher might be interested in mapping an invasive species or a rare habitat. The other classes that occur in the study area may be of little or no interest, simply forming a background in which the class(es) of interest are located. In such scenarios it may be wasteful of scarce resources to produce a general purpose map, and attention may instead be focused on just the class(es) of interest. This may be especially apparent if the image classifier used seeks to maximise overall classification accuracy, an undesirable feature if the class(es) of interest are rare and embedded in a mosaic of many classes. More targeted studies may help optimise resource use (Weiers et al., 2004) and the savings that can be achieved by the adoption of a specific mapping approach can be considerable (Foody et al., 2006; Mathur and Foody, 2008). Rather than including all classes in the analysis, necessitating amongst other things a large amount of ground data on each class to ensure underlying assumptions of the classification analysis are satisfied (Foody, 2004a,b), effort may be focused closely on the class(es) of interest (Boyd et al., 2006; Sanchez-Hernandez et al., 2007a,b).
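Why maximising overall accuracy can be undesirable for a rare class of interest may be seen from a small worked example. The confusion matrix below is invented for illustration (1000 validation pixels, of which 20 belong to a hypothetical invasive species class) and the function name is an assumption; the per-class measures follow the usual definitions of producer's and user's accuracy.

```python
def accuracy_measures(confusion, target):
    """Overall accuracy plus producer's and user's accuracy for one class.

    `confusion[reference][predicted]` holds pixel counts.
    """
    classes = list(confusion)
    total = sum(confusion[r][p] for r in classes for p in classes)
    correct = sum(confusion[c][c] for c in classes)
    reference_total = sum(confusion[target][p] for p in classes)
    predicted_total = sum(confusion[r][target] for r in classes)
    hits = confusion[target][target]
    producers = hits / reference_total if reference_total else float("nan")
    users = hits / predicted_total if predicted_total else float("nan")
    return correct / total, producers, users


# Hypothetical validation sample: the class of interest is rare.
confusion = {
    "invasive": {"invasive": 2, "background": 18},
    "background": {"invasive": 5, "background": 975},
}

overall, producers, users = accuracy_measures(confusion, "invasive")
print(round(overall, 3), round(producers, 3), round(users, 3))
```

In this constructed case the map is almost 98% accurate overall, yet only 10% of the reference pixels of the class of interest are found, so a summary based on overall accuracy alone would conceal a map of little value for the targeted application.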
Efficiency savings have been possible through the recent proliferation of airborne lidar (laser scanning), which has shown that it is possible to acquire information on the three dimensions (x, y and z) of a forest canopy over much larger areas than is possible through ground survey (Maltamo et al., 2005; Goetz et al., 2007). Airborne laser scanning (ALS) is a rapidly growing technology and many technical improvements have featured in recent years. These include improvements in sampling frequency, positioning accuracy and the number of recorded echoes (Hyyppä et al., 2009). These improvements, coupled with the many advantages of ALS over traditional imaging techniques (i.e., that they are active systems, that surveys can be performed at any time of the day or night in any season and that the interpretation of data captured is not hampered by shadows caused by clouds or neighboring objects), have led to their adoption operationally and semi-operationally for forest inventory purposes (Naesset et al., 2004). There are disadvantages to using ALS compared to traditional photogrammetric techniques and these include the lower spatial resolution and the lack of spectral information. However, new ALS systems are capable of recording the intensity and sometimes the full waveform of the target (Wagner et al., 2006; Reitberger et al., 2008; Chauve et al., 2009), and furthermore synergy with high spatial resolution images improves the semi-automatic classification of features from laser data (Kaartinen and Hyyppä, 2008).
The standard distribution- or area-based methods, where forest properties such as tree height, basal area and volume were inferred from the laser-derived surface models and canopy height point clouds, are making way for the extraction of individual tree crowns. In either case, the ecologist is provided with information such as forest growth (e.g., Yu et al., 2006); valuable structural information where the vertical distribution of the canopy, for example, can be the most important variable for the accurate prediction of bird species richness (Goetz et al., 2007); habitat assessment variables (Hill et al., 2004; Clawges et al., 2008; Martinuzzi et al., 2009) and biomass assessment for input into initiatives such as the UN-REDD programme (Tollefson, 2009). In addition to work using ALS, both spaceborne lidar systems (e.g., ICESat/GLAS) and terrestrial lidar systems (TLS) have demonstrated potential for provision of ecological information. Lefsky et al. (2005) illustrate how ICESat/GLAS could map global canopy height, with Rosette et al. (2009) noting that GLAS footprints can provide estimates of mixed vegetation canopy height that are comparable to those obtained from relatively high density airborne lidar data. Recent research has explored the potential of commercially available TLS for measuring vegetation canopy structure, although most of these studies have been concerned with the measurement of forest stand variables, including tree height, stem taper, diameter at breast height and planting density (e.g., Hopkinson et al., 2004; Henning and Radtke, 2006a). In addition, a small number of innovative studies have attempted to use TLS for characterising vegetation canopy structure (Henning and Radtke, 2006b; Danson et al., 2007).
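The area-based use of laser-derived surface models can be sketched as follows. The example assumes that a digital surface model (first-return elevations) and a digital terrain model (ground elevations) are already available on the same grid; canopy height is then taken as their difference and summarised per plot. The elevation values, the 2 m cover threshold and the function names are hypothetical, and operational workflows involve point cloud filtering and interpolation steps that are not shown.

```python
def canopy_height_model(surface, terrain):
    """Cell-by-cell canopy height: surface elevation minus terrain elevation.

    Negative differences (e.g. from interpolation artefacts) are set to zero.
    """
    return [[max(0.0, s - t) for s, t in zip(surface_row, terrain_row)]
            for surface_row, terrain_row in zip(surface, terrain)]


def plot_metrics(chm, cover_threshold=2.0):
    """Simple area-based metrics: mean height, maximum height and canopy cover."""
    heights = [h for row in chm for h in row]
    cover = sum(1 for h in heights if h > cover_threshold) / len(heights)
    return {
        "mean_height": sum(heights) / len(heights),
        "max_height": max(heights),
        "canopy_cover": cover,
    }


# Hypothetical elevations in metres above datum for a 2 x 3 cell plot.
dsm = [[112.0, 118.5, 121.0],
       [110.5, 125.0, 119.0]]
dtm = [[110.0, 110.5, 111.0],
       [110.5, 111.0, 111.0]]

chm = canopy_height_model(dsm, dtm)
print(chm)
print(plot_metrics(chm))
```

Metrics of this kind are the sort of plot-level predictors from which stand properties may be inferred in an area-based analysis, whereas individual tree crown methods operate on the canopy height surface itself.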

Remote sensing: capturing the temporal dimension
There has been a recent upsurge in the provision of a temporal dimension, notably as a result of advances in data integration methods and attention to continuity issues (Steven et al., 2003; Wulder et al., 2008). Many studies have exploited the temporal dimension of remote sensing, a valuable attribute for studies of ecological systems, notably as a source of information on variables such as the timing and monitoring of vegetative phenological events (Zhang et al., 2009). Satellite remote sensing has become increasingly important in studies of phenological change at landscape to global scales (Studer et al., 2007; Motohka et al., 2009; Julien and Sobrino, 2009), particularly now the satellite record is over thirty years old, providing a means to help study impacts of environmental change. Vegetative phenological events are, for example, useful indicators of the impact of climate change in the terrestrial biosphere (Brügger et al., 2003). Changes in the timing of phenological events may signal year-to-year climatic variations or longer term change (Reed et al., 1994). Phenological events have implications for competition between plant species and interactions with heterotrophic organisms as well as ecosystem services to humans (Badeck et al., 2004). Much work to date has focused on constructing multi-temporal records of satellite data, extracting key phenological information from these via a number of different methods and relating these to climatic variations. The use of vegetation indices that are related to green leaf area and total green biomass, such as the NDVI calculated from a time-series of NOAA AVHRR data, has been the most prominent approach used (e.g., Piao et al., 2006; Heumann et al., 2007); however, the new generation of sensors launched over the past decade greatly enhances the potential to identify vegetation dynamics of this sort. These sensors include VEGETATION, POLDER, SEAWIFS, ATSR, MODIS and MISR (Steven et al., 2003). In particular, MODIS offers enhanced geometric, atmospheric and radiometric properties (Zhang et al., 2003) and synoptic coverage at spatial resolutions of 250 m, 500 m and 1 km globally. The EVI generated from MODIS data has several advantages over the NDVI for vegetation studies, having been used with some success in a number of phenological studies (e.g., Zhang et al., 2004, 2006). Another satellite sensor launched post-Millennium is the Envisat MERIS, which has many virtues for remote sensing of ecosystem status and change. MERIS is, for example, a highly radiometrically accurate imaging spectrometer (Curran and Steele, 2004) and, unlike many spaceborne sensors, it has a good spectral sampling at visible and near infrared (NIR) wavelengths coupled with narrow bands that should theoretically facilitate accurate vegetation monitoring. The MERIS sensor also benefits from a moderate spatial resolution (300 m) and three-day repeat cycle (Verstraete et al., 1999). Two vegetation indices have been included in the official processing chain of the MERIS: the MERIS terrestrial chlorophyll index (MTCI; Dash and Curran, 2004) and the MERIS global vegetation index (MGVI; Gobron et al., 1999). Since 2003, global and regional composite (Level 3) products of MERIS Level 2 geophysical data have been generated by the UK Multi-Mission Product Archive Facility (UK-MM-PAF). Recently the MTCI has been shown to be a valuable data set for monitoring vegetation phenology, principally due to its sensitivity to canopy chlorophyll content, a vegetation property that is a useful proxy for the canopy physical and chemical alterations associated with phenological change (Jeganathan et al., 2010; Boyd et al., in press).
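The vegetation indices named above can be written compactly. The sketch below uses the commonly published forms of the NDVI, the EVI (with the coefficients usually quoted for MODIS) and the MTCI (as a ratio of reflectance differences in the MERIS bands centred near 753.75, 708.75 and 681.25 nm). The reflectance values and the time series are invented, and the half-amplitude threshold used to flag green-up is only one of the many phenology extraction methods referred to in the text, chosen here as an assumption for illustration.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, canopy_l=1.0):
    """Enhanced vegetation index with the coefficients commonly used for MODIS."""
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + canopy_l)


def mtci(r754, r709, r681):
    """MERIS terrestrial chlorophyll index from three narrow-band reflectances."""
    return (r754 - r709) / (r709 - r681)


def green_up_index(series, fraction=0.5):
    """Index of the first observation reaching a fraction of the seasonal amplitude.

    A simple threshold method; the 0.5 default is an illustrative assumption.
    """
    low, high = min(series), max(series)
    threshold = low + fraction * (high - low)
    for i, value in enumerate(series):
        if value >= threshold:
            return i
    return None


# Hypothetical surface reflectances for a vegetated pixel.
print(round(ndvi(0.5, 0.1), 4))
print(round(evi(0.5, 0.1, 0.05), 4))
print(round(mtci(0.40, 0.20, 0.05), 4))

# Hypothetical index time series (e.g. successive composites through a season).
season = [0.20, 0.22, 0.30, 0.50, 0.70, 0.75, 0.60, 0.30]
print(green_up_index(season))
```

No guard is included for zero denominators, which would need handling for water, shadow or missing data in a real processing chain.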

GIS: the internet revolution
According to Fotheringham and Wilson (2007), recent important developments in GIS include the rapid growth in the number and variety of geographical data sets, new ways to store, process and transmit these data sets, and new forms of visualization and statistical/mathematical modelling. It is now evident that rapid developments in internet/intranet technology have seen GIS move from a static, closed, often single application environment to one that reaps the benefits of the networked environment, in particular its global and real-time accessibility. There has been considerable development in WebGIS and consequently the field has moved from its origins of serving maps to one that is interactive, standardized (under-pinned by OGC standards) and sophisticated with regard to its visualization and geospatial analysis functionality (Peng and Tsou, 2003). Web-based GIS is now a mainstream application with many attractive features and, consequently, it should facilitate the exchange of current, analytical and multi-source ecological informatics data and GIS functions that are useful to the ecologist.
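To give a flavour of what standards-based map serving involves, the sketch below assembles a GetMap request following the OGC Web Map Service (WMS) 1.3.0 key-value conventions. The server address and layer name are hypothetical placeholders, not services described in this review, and the function name is an assumption; the example only builds the request URL and does not contact any server.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width, height,
                     crs="CRS:84", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL.

    `bbox` is (min_x, min_y, max_x, max_y); with CRS:84 this means
    longitude/latitude order.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",          # empty value requests the server's default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical server and layer, used only to show the request structure.
url = build_getmap_url(
    "https://example.org/wms",
    "species_occurrence",
    (-10.0, 35.0, 5.0, 45.0),
    800,
    600,
)
print(url)
```

Because the request is a plain URL with standard parameter names, any client that understands the specification can retrieve the same map, which is the kind of interoperability that allows data from multiple providers to be combined.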
There are a number of case studies that illustrate the power of WebGIS. Graham et al. (2007) required a custom internet GIS solution to enable users to display maps of invasive species served by the Global Organism Detection and Monitoring (GODM) system, which provides real-time data from a range of users on the distribution and abundance of non-native species, including habitat attributes for predictive spatial modelling of current and future locations. The use of WebGIS here provided a level of flexibility in database access, query and display not previously encountered. Another example is the Web-based bird avoidance models (BAMs). This system uses interactive GIS-enabled environments to provide fine-resolution and frequent predictions of bird densities for the Netherlands and the continental United States and Alaska (Shamoun-Baranes et al., 2008). The Global Biodiversity Information Facility's (GBIF) Mapping and Analysis Portal Application (MAPA) is another example of how WebGIS has facilitated the effective analysis and visualization of a legacy biodiversity data set which was not being employed optimally (Flemons et al., 2007). By building this application a number of challenges were met, including assuring fast speed of access to the vast amounts of data available through these distributed biodiversity databases; developing open standards based access to suitable environmental data layers for analyzing biodiversity distribution; building suitably flexible and intuitive map interfaces for refining the scope and criteria of an analysis; and building appropriate web-services based analysis tools that are of primary importance to the ecological community. Building on successes such as these, some of the problems of internet GIS are now being addressed (e.g., interoperability) and we are now seeing examples of "Distributed GIS (DGIS)", which has the benefit of linking and accessing many systems as a single virtual system, using the standards and software of the Internet (Tait, 2005; Chang and Park, 2006). Zhang and Tsou (2009) refer to a geospatial cyberinfrastructure which integrates distributed geographic information processing (DGIP) technology, high-performance computing resources, interoperable Web services, and sharable geographic knowledge to facilitate the advancement of geographic information science (GIScience) research, geospatial technology, and geographic education. Further development in GRID and CLOUD computing (e.g., Chen et al., 2009) should see dividends for ecological informatics.
In addition to developments in the wired technologies, the past few years have seen remarkable maturation in wireless technologies (e.g., Wi-Fi and Bluetooth). This, coupled with a maturation in Global Positioning System (GPS) technology, mobile operating systems and device platforms such as smartphones, Pocket PCs/PDAs, laptops, Tablet PCs and rugged handheld mobile devices, has meant that mobile GIS is now a reality with advantages such as reduction in task redundancy and increase in data currency. There has also been a decrease in the cost, size and weight, and an increase in the reliability, of environmental sensing software and hardware (Rundell et al., 2009). As a result, there are now many examples of wireless sensors and associated sensor webs/networks whereby sensors (both remote and in situ) observe ecological and associated variables and distribute these in near-real time to web-accessible databases. Rundell et al. (2009) provide a review of current sensor networks and stress that new sensor network designs allow near-real-time observation of systems, based on data from local sources as well as nested or adjacent networks, and from remote sensing data streams. Current examples include the National Ecological Observatory Network (NEON), an integrated network of 20 regional observatories designed to gather long term data on ecological responses of the biosphere to changes in land use and climate, and on feedbacks with the geosphere, hydrosphere and atmosphere (Keller et al., 2008); the Global Lake Ecological Observatory Network (GLEON) (www.gleon.org); and Sensor Asia, which integrates fieldservers and Web GIS to realise easy and low cost installation and operation of ubiquitous field sensor networks (Honda et al., 2009).

Free and open source GIS software and data
Another spin-off from internet technology has been the recent surge of interest in Free and Open Source Software (FOSS) within the domain of GIS. This also arises in response to the rapidly growing realisation that sharing knowledge and helping others is a cornerstone of a social and ethical society (Stallman, 1999) as well as being an aid to technological and scientific advancement (Steiniger and Hay, 2009). As well as new projects and products coming on stream (for example The GIS Weasel, a freely available, open-source software package built on top of ArcInfo Workstations for creating maps and parameters of geographic features used in environmental simulation models (Viger, 2008)), it is apparent that existing products are entering a phase of rapid refinement and enhancement (Ramsey, 2007). Steiniger and Bocher (2009) document four indicators that demonstrate the growing adoption of FOSS in GIS, including download rates of free desktop GIS software such as SAGA GIS (www.saga-gis.org) and increasing financial support by governmental organizations for FOS GIS projects. The ethos that underpins the FOSS vision conforms to that of Ecoinformatics.org, which is an open, voluntary collaboration of developers and researchers that aims to produce software, systems, publications, and services beneficial to the ecological and environmental sciences. The organization states that technologists may use the resources provided by ecoinformatics.org to leverage tools being developed in an open-source, collaborative, standards-aware environment. Given the now readily available FOS GIS and the positive recognition of open-source principles by the ecological domain, we should see FOS GIS used across ecology as a whole. Li et al. (2007) describe a spatial forest information system based on Web services using an open-source software approach, stating that the use of open-source software in this way can greatly reduce the cost while providing high performance and sharing of spatial forest information. Indeed, Steiniger and Hay (2009) point to the great potential of FOS desktop GIS in landscape ecology research and encourage a joint push towards a common, customisable and free research platform for the landscape ecology community. Other recent publications of note include GOBLET, an open-source geographic overlaying database and query module for spatial targeting in agricultural systems (Quiros et al., 2009); GODM, a global organism detection and monitoring system for non-native species (Graham et al., 2007); and AquaMaps, for generating model-based, large-scale predictions of the currently known natural occurrence of marine species based on estimates of the environmental tolerance of a given species with respect to depth, salinity, temperature, primary productivity, and its association with sea ice or coastal areas (Kaschner et al., 2008). It should be noted, however, that some have raised cautionary concerns, particularly with regard to public provision of spatial data enabled by Web 2.0 (Elwood, 2009).

Geovisualization
Spatial data often lend themselves to visualization because the data are geocoded and can therefore be represented easily on maps and map-like objects (Fotheringham and Wilson, 2007). Geovisualization integrates approaches from scientific visualization, (exploratory) cartography, image analysis, information visualization, exploratory data analysis (EDA) and GIS to provide theory, methods and tools for the visual exploration, analysis, synthesis and presentation of geospatial data (McEachren and Kraak, 2001). The past five years have seen greater interaction between those working in the geoinformation domain and those involved in domains such as scientific visualization, information visualization and knowledge visualization. This further cements the power of visualization to handle information for knowledge discovery purposes (Jiang and Li, 2005). Today's geovisualization technologies provide many advanced options, with users able to explore spatial data using multiple views and techniques and to compile maps with full control over graphic variables and display (Jones et al., 2009). One example is the animation visualization tool developed by Rieker and Labadie (2006) for the analysis of impacts of river operations on habitats of endangered species. Future visualizations could include the functionality of Augmented Reality (AR) systems in real space and time, building on the already achievable photo-realistic visualization method which combines off-line AR techniques with a GIS (Ghadirian and Bishop, 2008).
It has been noted, however, that many innovative geovisualization research tools are not easy to use and require significant user training. They are also difficult to integrate with other systems and the scientific activities they are meant to inform are often unclear; the number of active geovisualization users remains lower than expected (Gahegan, 2005). However, the launch in 2005 of Google Earth (A 3D Interface to the Planet: earth.google.com) and other virtual globe/geobrowsing software, for example NASA World Wind (worldwind.arc.nasa.gov) and ESRI's ArcGIS Explorer (www.esri.com), has altered that trend, with researchers using these 3D interfaces as a geographic visualization medium (Gonzalesa et al., 2009). Scientists and environmental professionals from many fields are beginning to utilize the functionality and advantages of virtual globes (Butler, 2006), and since ecology benefits from a multidisciplinary perspective using data covering wide spatial and/or temporal extents (Carpenter, 2008) this can be beneficial. Michael Goodchild, reported in Butler (2006), suggests that once scientists experience this easy visualization, they will be drawn into deeper forms of analysis using the powerful techniques that GIS professionals have developed over decades. Butler (2006) sounds a note of caution regarding the dangers associated with the production of visually appealing, even statistically sound, results that do not reveal anything useful about either pattern or process. Nevertheless, useful progress has been made, such as the SFMN GeoSearch, a data sharing and visualization application built for the Sustainable Forest Management Network (SFMN). This is an integrated system built on Google's Keyhole Markup Language (KML) in which the tools needed for database exploration, data visualization, and communication between data set authors and potential second-hand users are tightly interconnected, easy to use and accessible from the same online application (Gonzalesa et al., 2009). In addition to this progress in moving visualization into the third dimension, there has also been progress in the temporal component of data visualization (the fourth dimension), with initiatives such as GISTSOM (www.gistsom.com) and CommonGIS (Andrienko et al., 2003). Mobile environmental visualization is becoming a reality (Danado et al., 2005), as is the idea of the "Geovisualization Mashup", in which visual analysis is useful for the preliminary investigation of large structured, multifaceted spatiotemporal data sets (Wood et al., 2007).

Species distribution modelling
The number of ecological publications using GIS has grown very rapidly (Anderson, 2008) and this is noticeable in applications such as species distribution modelling. Researchers have become increasingly aware of the potential of GIS and of the data and tools available for modelling applications. Indeed, GIS has been a component of a diverse array of studies, with one popular theme being the linking of landscape patterns to a range of ecological variables. The latter has included assessments of the effect of road networks on accessible habitats and the effects of humans on habitat quality (Kameyama et al., 2007; Eigenbrod et al., 2008). However, a major topic of recent research has been the potential impacts of environmental changes on species distributions. GIS has, for example, been used to predict species distributions and risks to biodiversity (Spens et al., 2007), to aid the visualization, exploration and modelling of data on species distributions (Lopez-Lopez et al., 2006; Vogiatzakis et al., 2006; Zhang et al., 2007) and to study the effect of major variables such as disturbance events (Pennington, 2007). Indeed, GIS now offers unprecedented flexibility to analysts, especially in relation to how data are used and what analytical criteria are employed in studies (Graham et al., 2007). There are, of course, still many basic concerns to be addressed, such as the impacts of missing data or variations in data quality (Bailey et al., 2006; Wolman, 2007).
The modelling of species distributions has been an important issue in ecology for a long time, not least in helping to characterise ecological niches. In recent years species distribution modelling has become even more popular, especially given its role in predicting the impacts of variables such as climate change on species and biodiversity. Species distribution models have, however, been used in a variety of applications, including the selection of sites for species re-introduction (Pearce and Lindenmayer, 1998), the design of field surveys (Engler et al., 2004), the design of reserves (Li et al., 1999) and the assessment of impacts of climate change (Nativi et al., 2009). Nonetheless, the latter application has been a focus of considerable recent attention in the literature. Studies have sought to determine the potential impacts of climate change on the spatial distribution of species and biodiversity and to use this information to facilitate conservation activities (Araujo et al., 2005; Akcakaya et al., 2006). Many approaches may be used to model the impacts of climate change on species distributions (e.g. Hamann and Wang, 2006; Austin, 2007; Botkin et al., 2007) but considerable use has been made of bioclimate envelope models (e.g. Berry et al., 2002; Garzon et al., 2007). The latter are based on the correlation between observed species distributions and climatic variables, an analysis which may be readily undertaken in a basic GIS (Pearson and Dawson, 2003; Luoto et al., 2005). The relationships established between the variables may then be used to project the future distribution of the species under a set of climate change scenarios (Pearson and Dawson, 2003; Luoto et al., 2007). Such studies can provide a valuable initial assessment of likely climate change impacts, especially if used at coarse spatial scales where macro-climate variation has most impact on species distributions (Pearson and Dawson, 2003; Luoto et al., 2005; Heikkinen et al., 2006). There are, of course, limitations to such modelling (Pearson and Dawson, 2003; Beaumont et al., 2007; Brooker et al., 2007; Osborne et al., 2007a). For example, there is often a need to accommodate the negative impacts of spatial autocorrelation (Hampe, 2004; Dorman, 2007). Some recent research using spatially explicit methods such as local statistics may represent one way forward (Osborne et al., 2007a,b,c; Echeverria et al., 2008; Foody, 2008a), and these and other techniques which can greatly aid ecological modelling activities in support of conservation efforts are becoming increasingly available to the ecological community (Santos et al., 2006).
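The logic of a bioclimate envelope model can be illustrated with a minimal Python sketch. It assumes the simplest possible rectilinear envelope, defined by the minimum and maximum of each climatic variable over sites where the species is present; published models are considerably more sophisticated. The variable names, site values and climate change scenario are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Envelope = Dict[str, Tuple[float, float]]


def fit_envelope(presence_sites: List[Dict[str, float]]) -> Envelope:
    """Record the observed (min, max) of each climate variable over presence sites."""
    variables = list(presence_sites[0].keys())
    return {
        v: (min(s[v] for s in presence_sites), max(s[v] for s in presence_sites))
        for v in variables
    }


def inside_envelope(envelope: Envelope, site: Dict[str, float]) -> bool:
    """A site is treated as climatically suitable if every variable lies within range."""
    return all(low <= site[v] <= high for v, (low, high) in envelope.items())


# Hypothetical climate values at sites where the species has been recorded.
presences = [
    {"mean_temp_c": 8.0, "annual_precip_mm": 900.0},
    {"mean_temp_c": 10.5, "annual_precip_mm": 750.0},
    {"mean_temp_c": 9.2, "annual_precip_mm": 1100.0},
]
envelope = fit_envelope(presences)

# One grid cell under current climate and under a hypothetical future scenario
# (3 degrees C warmer, 10% drier).
current_cell = {"mean_temp_c": 9.0, "annual_precip_mm": 800.0}
future_cell = {"mean_temp_c": 12.0, "annual_precip_mm": 720.0}

suitable_now = inside_envelope(envelope, current_cell)
suitable_future = inside_envelope(envelope, future_cell)
print(suitable_now, suitable_future)
```

Applied to every cell of a climate grid, the same test yields current and projected suitability maps. The sketch ignores dispersal, biotic interactions and spatial autocorrelation, which are among the limitations noted above.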
Species distribution modelling has also benefited from the increased provision of data arising from the opening-up of archive resources and data sharing activities as well as the availability of a suite of modelling tools (Guisan et al., 2006; Austin, 2007; Graham et al., 2008). Additionally, much modelling is based upon presence or presence-absence data, which are relatively easy to acquire and less sensitive to variations in surveyor expertise than other data sets, such as those relating to abundance or cover (Ringvall et al., 2005). Nonetheless, many challenges and issues remain to be addressed. For example, further work to help accommodate the effects of interactions between variables and a greater incorporation of theoretical knowledge may be required (Guisan et al., 2006; Austin, 2007). Additionally, there are many factors that may influence a modelling study. These include issues connected with the accuracy of the data sets and methods used.

Accuracy and comparison
Modelling activities are inevitably limited by the quantity and quality of the data sets used (Lobo, 2008; Wisz et al., 2008). Unfortunately, ecological data are often characterised by a large degree of inherent uncertainty and error. However, as there is typically little meta-data on data quality, the issue, although well known, may often be ignored or assumed unimportant. This may be a major concern with species distribution modelling activities as even basic presence-absence data sets may contain substantial error and uncertainty. Many problems have been reported in the literature. For example, there may be only partial information available (Conlisk et al., 2009) or substantial error arising from locational uncertainty (Freeman and Moisen, 2008; Johnson and Gillingham, 2008; Graham et al., 2008; Osborne and Leitao, 2009). Indeed, a common problem is associated with data on the absence of species (Lobo, 2008; Jimenez-Valverde and Lobo, 2007; Graham et al., 2008). Fundamentally, it is normally impossible to be confident that a recorded absence is not actually an undetected presence (MacKenzie, 2005; Cronin and Vickers, 2008; Franklin et al., 2009). False absence cases may arise for a variety of reasons and may occur especially for cryptic species that are difficult to detect. These cases can be a source of substantial error and bias in modelling studies (Hartel et al., 2009). Problems such as this may be greatest for rare species, having negative impacts on conservation activities that seek to protect them or detect changes in their occurrence (Ringvall et al., 2005). Thus, while the acquisition of presence-absence data is less sensitive to measurement and judgement errors than some other data (Ringvall et al., 2005), there are still many concerns that can result in error and uncertainty in the data used in modelling studies.
In addition to concerns linked to the quality of the data, there are important issues associated with the quantity of data used. As well as the size of the data set (Strayer, 1999; Stockwell and Peterson, 2002; Wisz et al., 2008), the partitioning of cases between presence and absence is important. For example, the use of data sets that are greatly imbalanced in terms of the proportion of presences and of absences can be problematic (Real et al., 2006). The latter concern is also linked to problems connected with the selection of a threshold to convert the continuous probabilistic outputs from a model into a binary classification for mapping purposes (Jimenez-Valverde and Lobo, 2007; Freeman and Moisen, 2008). For example, if the data set is greatly imbalanced in favour of absences the model outputs may sometimes be biased towards 0, depending greatly on the algorithms and thresholds used (Jimenez-Valverde and Lobo, 2007). However, it should be noted that it may still sometimes be possible to derive useful information from imperfect modelling analyses, provided that the dangers to interpretation are recognised (MacKenzie, 2005; Graham et al., 2008; Osborne and Leitao, 2009).
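The interaction between class imbalance and threshold selection can be illustrated with a minimal Python sketch. The observed labels and model probabilities are hypothetical, and the use of the observed prevalence as an alternative threshold is an illustrative assumption, one of several options discussed in the literature rather than a recommended rule.

```python
def classify(probabilities, threshold):
    """Convert continuous model outputs into presence (1) / absence (0) labels."""
    return [1 if p >= threshold else 0 for p in probabilities]


def sensitivity_specificity(observed, predicted):
    """Proportion of observed presences and of observed absences correctly labelled."""
    tp = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    fn = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 0)
    tn = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)
    fp = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 2 presences among 10 sites, with model outputs pulled
# towards 0 as may happen when absences dominate the training data.
observed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predicted_prob = [0.45, 0.30, 0.25, 0.10, 0.05, 0.15, 0.08, 0.12, 0.22, 0.03]

prevalence = sum(observed) / len(observed)

# A fixed 0.5 threshold labels every site as an absence.
fixed = sensitivity_specificity(observed, classify(predicted_prob, 0.5))

# Using the observed prevalence as the threshold recovers both presences
# at the cost of some false presences.
adjusted = sensitivity_specificity(observed, classify(predicted_prob, prevalence))

print(fixed, adjusted)
```

With these invented values the fixed threshold gives perfect specificity but detects no presences, whereas the prevalence-based threshold trades some specificity for sensitivity. Which trade-off is acceptable depends on the application.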
The accuracy of the outputs derived from analyses, whether from a classification of remotely sensed data or a species distribution model, has a major impact on their value for later work. Accuracy assessment or validation is thus a fundamental issue in ecological informatics. It typically involves the comparison of the derived product with reality and the calculation of summary quality measures. Unfortunately, reality or the 'truth' about the feature under study is rarely known unless a simulated data set is used (Austin et al., 2006; Carlotto, 2009; Foody, 2009a; Franklin et al., 2009). For example, errors in ground data sets used in remote sensing of land cover may be large (Foody, 2009a). Moreover, there are often considerable differences between maps of, apparently at least, the same phenomenon derived by remote sensing (Herold et al., 2006; See and Fritz, 2006; Potere et al., 2009), which leaves map users uncertain over which, if any, to adopt (Herold et al., 2008; Shao and Wu, 2008). Unfortunately, this situation is worsened by the poor attention sometimes paid to accuracy assessment, with many maps either not evaluated rigorously or evaluated only to a limited extent (Herold et al., 2006; Brannstrom et al., 2008).
As noted above, there are many sources of error in ecological data and it is unlikely that error-free data on even basic issues such as species distribution can be collected in practice (Turner, 2006). Accuracy assessment is, therefore, often not undertaken relative to a true gold-standard reference but relative to an imperfect reference. The use of an imperfect reference can, however, be a source of considerable error. For example, even small errors in the ground data used in remote sensing studies may introduce very large bias into the assessment of classification accuracy and the estimation of basic variables such as class extent (Foody, 2009a, 2010). The magnitude and direction of the biases introduced through the use of an imperfect reference vary as a function of the quality of the reference data and its relationship to the data evaluated (Valenstein, 1990). It is, therefore, important to base an accuracy assessment on high quality reference data (Farber and Kadmon, 2003).
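A simple worked example, sketched here in Python, shows how an imperfect reference can distort an accuracy estimate. It assumes a binary (presence-absence) classification and, critically, that the errors in the map and in the reference are independent; the accuracy values are hypothetical. Under these assumptions the two data sets agree when both are correct or both are wrong.

```python
def apparent_accuracy(map_accuracy: float, reference_accuracy: float) -> float:
    """Expected agreement between a binary map and an imperfect binary reference,
    assuming the two sets of errors are statistically independent."""
    both_correct = map_accuracy * reference_accuracy
    both_wrong = (1.0 - map_accuracy) * (1.0 - reference_accuracy)
    return both_correct + both_wrong


# Hypothetical case: a map that is truly 90% accurate is assessed against
# ground data that are themselves 95% accurate.
true_accuracy = 0.90
measured = apparent_accuracy(true_accuracy, reference_accuracy=0.95)
bias = measured - true_accuracy
print(round(measured, 3), round(bias, 3))
```

In this invented case the map appears to be about 86% accurate rather than 90%, so the assessment understates its quality. If the errors in the two data sets were correlated, for example because both tend to miss the same cryptic species, agreement could instead be inflated, which is consistent with the observation that both the magnitude and the direction of the bias depend on the relationship between the reference and the data evaluated.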
A variety of measures of accuracy have been discussed in the ecological literature (Fielding and Bell, 1997; Liu et al., 2009). Many of the popular approaches are based on a binary confusion matrix which summarises the allocations made in the two classifications. Cases that lie on the main diagonal of the matrix are those for which the two classifications agree on labeling. The off-diagonal elements of the confusion matrix, however, highlight the two types of error that may occur: omission (false absence) and commission (false presence). The magnitude of these two types of error clearly impacts on the accuracy of the classification, although the relative importance of the errors of omission and commission may vary between studies (Fielding and Bell, 1997; Jimenez-Valverde and Lobo, 2007). Similarly, the types of error may impact differently on the indices of accuracy that may be derived from a confusion matrix.
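The structure of the binary confusion matrix can be made concrete with a minimal Python sketch that tallies its four cells from paired observed and predicted labels. The labels are hypothetical, the function and key names are illustrative choices, and the two error rates are expressed here relative to the observed presences and observed absences respectively, which is one of several conventions in use.

```python
from collections import Counter


def binary_confusion_matrix(observed, predicted):
    """Tally the four cells of a presence (1) / absence (0) confusion matrix."""
    counts = Counter(zip(observed, predicted))
    return {
        "true_presence": counts[(1, 1)],   # main diagonal
        "false_absence": counts[(1, 0)],   # omission error
        "false_presence": counts[(0, 1)],  # commission error
        "true_absence": counts[(0, 0)],    # main diagonal
    }


# Hypothetical survey results and model predictions for eight sites.
observed = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1, 1]

matrix = binary_confusion_matrix(observed, predicted)

# Share of observed presences that the model missed.
omission_rate = matrix["false_absence"] / (
    matrix["true_presence"] + matrix["false_absence"]
)
# Share of observed absences that the model labelled as presences.
commission_rate = matrix["false_presence"] / (
    matrix["false_presence"] + matrix["true_absence"]
)
print(matrix, omission_rate, commission_rate)
```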
Widely used measures of accuracy in ecology include sensitivity, specificity, the true skill statistic (TSS), overall accuracy and the kappa coefficient of agreement (Fielding and Bell, 1997; McPherson et al., 2004; Allouche et al., 2006; Freeman and Moisen, 2008). Another popular measure in ecology which is not directly derived from the confusion matrix, but which is based upon sensitivity and specificity, is the area under the receiver operating characteristics curve (AUC). These various measures reflect different aspects of accuracy and their usefulness may vary depending on the objectives of a study. One feature often stressed in the ecological literature is that a useful measure of accuracy should be independent of prevalence (Manel et al., 2001). Thus, the use of some of the popular measures which are prevalence dependent, such as the overall accuracy and positive predictive value, is often discouraged in ecological applications (Fielding and Bell, 1997; Manel et al., 2001; Farber and Kadmon, 2003; Freeman and Moisen, 2008). It should be noted that the popular kappa coefficient of agreement is also prevalence dependent (Manel et al., 2001; McPherson et al., 2004; Freeman and Moisen, 2008) and prevalence correction may be unsuitable (Hoehler, 2000). As a result of this and other concerns with the use and interpretation of the kappa coefficient, its use should be questioned (Foody, 2008b). Similarly, the impacts arising from the use of an imperfect reference need to be recognised (Foody, 2009a, 2010). The choice of accuracy measure should also be based closely on project needs. For example, it may be desirable in some studies to weight omission and commission errors differently and hence the standard formulation of the TSS may be inappropriate (Allouche et al., 2006). Similarly, while the AUC has the attractive feature of being based upon the entire spectrum of sensitivity and specificity, it may sometimes be necessary to consider the shape as well as the area of the curve, to weight sensitivity and specificity differentially and to base calculations upon only their meaningful range for the analysis task in hand (Kazmierczak, 1999; Williams and Peterson, 2009). It should also be noted that the calculated AUC may vary with the extent of the study area. Some have, therefore, suggested that more than one measure of accuracy be provided to indicate the quality of the product evaluated (Lobo et al., 2008).
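The prevalence dependence of some of these measures can be demonstrated with a short Python sketch using the standard formulations of overall accuracy, TSS (sensitivity + specificity - 1) and Cohen's kappa. The two confusion matrices are hypothetical: both have a sensitivity of 0.8 and a specificity of 0.9, and they differ only in the prevalence of the species.

```python
def accuracy_measures(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Common accuracy measures derived from a binary confusion matrix."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    overall = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (overall - expected) / (1.0 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "overall_accuracy": overall,
        "tss": sensitivity + specificity - 1.0,
        "kappa": kappa,
    }


# Hypothetical matrices with identical sensitivity (0.8) and specificity (0.9).
common_species = accuracy_measures(tp=80, fn=20, fp=10, tn=90)    # prevalence 0.5
rare_species = accuracy_measures(tp=80, fn=20, fp=90, tn=810)     # prevalence 0.1

for name, result in (("common", common_species), ("rare", rare_species)):
    print(name, {k: round(v, 3) for k, v in result.items()})
```

In this invented example the TSS is the same for both species, whereas overall accuracy rises and kappa falls as prevalence drops, even though the model discriminates presences from absences equally well in both cases.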
A further key issue in accuracy assessment is the sample size used, which is positively related to the precision of the estimate derived. Sample size is also an important issue in comparative studies such as those seeking to determine changes in species distribution over time or to evaluate land cover change. A key issue to note is that the use of a sample size that is too small or too large can be problematic (Foody, 2009b). Fundamental to this situation are the two types of error that can arise in popular statistical hypothesis test-based analyses. A Type I error occurs when the null hypothesis, which is normally one of no difference, is incorrectly rejected and a difference is declared to exist when in reality it does not. This could cause major problems in conservation studies, perhaps leading to a conclusion that a species was declining in a region where the population was actually stable (Strayer, 1999). A Type II error, however, occurs when the null hypothesis is incorrectly upheld and hence the existence of a meaningful difference goes undetected. This type of error can also be a major problem in ecological studies, perhaps leading to the conclusion that a population was stable when actually there are important changes occurring (Strayer, 1999). The interpretation of non-significant test results can be a major problem, especially if a small sample size was used (Hoenig and Heisey, 2001; Trout et al., 2007). The probability of committing both Type I and II errors should be considered in the design and interpretation of a statistical hypothesis test. The probability of making each type of error is, however, a function of sample size; a study with a small sample may fail to detect a meaningful difference that does exist while one using a large sample might ascribe statistical significance to a trivial difference. Thus, sample size should be set to meet the study objectives and this should also recognise the impacts arising through the use of an imperfect reference (Messam et al., 2008). Finally, it should be noted that comparative studies need not only focus on the basic difference between values. A range of other scenarios exist, such as testing for equivalence and non-inferiority, and the null hypothesis is not constrained to be one of no difference (Foody, 2009c).
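The link between sample size and the two error types can be illustrated with a minimal Python sketch. It uses a standard normal-approximation formula for comparing two independent proportions with a two-sided test and equal group sizes; this is one textbook approach and not necessarily the method used in the studies cited. The accuracy values, significance level and power are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p1: float, p2: float,
                                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate cases needed per group to detect a difference between two
    independent proportions (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # controls Type I error
    z_beta = NormalDist().inv_cdf(power)               # power = 1 - P(Type II error)
    p_bar = (p1 + p2) / 2.0
    numerator = (
        z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical comparison of two maps with accuracies of 80% and 85%.
n_small_difference = sample_size_two_proportions(0.80, 0.85)

# A larger difference (80% versus 90%) needs far fewer cases to detect.
n_large_difference = sample_size_two_proportions(0.80, 0.90)

print(n_small_difference, n_large_difference)
```

Setting the smallest difference that is meaningful for the application, together with the acceptable Type I and Type II error rates, before data collection helps to avoid both underpowered studies and samples so large that trivial differences are declared significant.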

Prospects
As highlighted by this review, both incremental and non-incremental developments in remote sensing and GIS have ensured their continuing input and significance in the field of ecological informatics. Much progress has been made in the observation and analysis of real-world complexity, and this promises to further the realisation of the inherent capabilities afforded by remote sensing and GIS for improving ecosystem theory and decision support. Looking to the future, one might anticipate further consolidation of support via the launch of sensors with improved resolutions, novel sensor approaches such as "interactive remote sensing" (Gail, 2007) and "intelligent multitasking" (Schmidt, 2009), advanced spatial analysis tools (e.g. S+) and visualizations that are highly interactive with multisensory inputs and outputs (Clarke, 2010). From here on, it is expected that developments in sensor technology and data availability may play a major role in influencing future work. An important concern for some is the maintenance of data continuity (Steven et al., 2003; Leimgruber et al., 2005). For example, there is a strong desire to extend the archive of Landsat sensor products into the future as such data have been widely used for over three decades (Cohen and Goward, 2004; Boyd and Danson, 2005; Wulder et al., 2008). Continuity issues, therefore, need consideration in the development of new sensors (Janetos and Justice, 2000). This should not, however, limit developments which could follow the acquisition of data at enhanced spatial, spectral and radiometric resolutions, possibly over a range of angular viewing geometries, from recently launched and proposed remote sensing systems. Similarly, advances may be expected to arise from an increased use of remotely sensed data in combination with other environmental data, field observations and models as well as through the combined use of multiple analytical methods (e.g. Campbell et al., 2005; Richards, 2005; Scopelitis et al., 2007). Developments such as these could help to move the idea of a megascience infrastructure capable of discovering and integrating enormous volumes of multidisciplinary data, as envisioned by communities such as those behind the Global Earth Observation System of Systems (Nativi et al., 2009), toward reality.