We were a bit surprised to learn that no single source lists all the lakes of the world and their key attributes – much less in an easily accessible repository. The number of lakes is finite and not that large (~350 million; Downing et al. 2006), and it probably does not change much. We decided to create a repository, LakeBase, that represents the lakes of the world and their associated data of interest to lake scientists. We have a long way to go, but we’re off to a good start. LakeBase is changing rapidly as we load data sets and add functionality.

LakeBase has two main interfaces – a map (Fig. 1) and a free-text search (Fig. 2).

Figure 1. LakeBase has ~250,000 lakes from around the world. The inset shows a subset of lakes from the Northern Highland Lake District of northern Wisconsin. LakeBase has nearly 200,000 data records. Through this map, users will be able to show the geographic distribution of lakes with specific characteristics, for example high phosphorus concentration.

Figure 2. LakeBase has a ‘Google-like’ interface for searching. We were tired of filling out forms to peruse data, so we developed a free-text search of the metadata that returns results with data previews, plus a shopping cart to accumulate data sets for download in handy formats.

Our intentions for LakeBase are as follows:

  • GLEON does not ‘own’ data from sites; rather, we provide data services. This is true for both Vega (sensor data) and LakeBase, so you will see full attribution of the data source in LakeBase when you search for data. Data owners have the right to add or remove their data.
  • LakeBase is meant to hold typical structured data coming from field programs, very much what NTL LTER collects and what most GLEON sites collect. Through the map interface, users will be able to display the geographic distribution of lakes with certain characteristics, such as high phosphorus. Naturally, the underlying data will be downloadable.
  • LakeBase is meant to provide the unique identifier for each lake (it also stores multiple other lake identifiers that help trace back to data sources) so that our Computer Science colleagues can search the Internet for, e.g., deposition, LU/LC, economic, demographic, and other social science information, and link those data back to actual lakes. To see this feature in action, go to a US lake (sorry, but we had to start somewhere!) on the map, e.g., Mendota, click on the pin for that lake, and at the bottom you’ll see a link to ‘CS Lake’.
  • Some field programs, including the Hanson Lab (UW CFL), will use this as an information management system.
  • We have developed an easy mechanism to upload data from spreadsheets so that any site can upload data and get credit for it. The user identifies the lake, selects the variables to upload, and a spreadsheet template with all the proper metadata/controlled vocabulary is generated. The user then pastes data into the spreadsheet and uploads it. This will be made public when we straighten out accounts/user information in the coming months.
  • As data consumers, we were tired of having to fill out forms to get data from other repositories, so for LakeBase we developed a Google-like search interface with a ‘shopping cart’ checkout of data. The search mechanism is free-text search of the metadata, so exact matches and similar matches are returned. This will see refinement in the coming months.
  • By 2011, we will also provide output formats for search results that can be easily loaded into R (e.g., data frames), Matlab (e.g., iddata objects), and numerical simulations (e.g., NetCDF).
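
To make the free-text search idea above a bit more concrete, here is a minimal sketch in Python of how a search over metadata might return both exact and similar matches and feed a ‘shopping cart’. This is not LakeBase’s actual implementation: the records, field names, the `search` and `score` functions, and the 0.8 similarity threshold are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical metadata records; field names and values are illustrative only.
DATASETS = [
    {"id": 1, "lake": "Mendota", "variable": "total phosphorus", "source": "NTL LTER"},
    {"id": 2, "lake": "Trout Lake", "variable": "water temperature", "source": "NTL LTER"},
    {"id": 3, "lake": "Mendota", "variable": "chlorophyll", "source": "Hanson Lab"},
]


def tokens(text):
    """Split free text into lowercase words."""
    return text.lower().split()


def score(query, record, threshold=0.8):
    """Sum, over query words, the best similarity to any metadata word.

    An exact match contributes 1.0; a near match (e.g. a misspelling)
    contributes its similarity ratio if it reaches the assumed threshold.
    """
    words = tokens(" ".join(str(v) for k, v in record.items() if k != "id"))
    total = 0.0
    for q in tokens(query):
        best = max(SequenceMatcher(None, q, w).ratio() for w in words)
        if best >= threshold:
            total += best
    return total


def search(query, records=DATASETS):
    """Return the records that match the query, best matches first."""
    scored = [(score(query, r), r) for r in records]
    hits = [(s, r) for s, r in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in hits]


# A 'shopping cart' is just a list of data sets accumulated for download.
cart = []
cart.extend(search("mendota phosphorus"))
for record in cart:
    print(record["id"], record["lake"], record["variable"], record["source"])
```

Because the score is based on string similarity rather than exact equality, a misspelled query such as "mendotta" would still find the Mendota records. Each result carries its source field, in keeping with the goal of full attribution.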