FLUXNET2015 Dataset

The FLUXNET2015 Dataset includes data collected at sites from multiple regional flux networks. The preparation of this FLUXNET Dataset has been possible only thanks to the efforts of many scientists and technicians around the world and the coordination among teams from regional networks. The previous FLUXNET Dataset releases are the FLUXNET Marconi Dataset (2000) and the FLUXNET LaThuile Dataset (2007). The FLUXNET2015 Dataset includes several improvements to the data quality control protocols and the data processing pipeline. Examples include close interaction with tower teams to improve data quality, new methods for uncertainty quantification, and the use of reanalysis data to fill long gaps in micrometeorological variable records (see the data processing pipeline page for details). Refer to the Data Policy page for data usage and acknowledgement requirements.

Download FLUXNET2015 Dataset 

(Dataset updated on February 6, 2020 — see changes)

 

Note: Regional databases contain newer data products 

New documentation

Reference paper. Detailed descriptions of the FLUXNET2015 dataset and the ONEFlux processing pipeline are available in the dataset reference paper, co-authored by data teams and members of all site teams contributing to FLUXNET2015 Tier One sites:

Pastorello, G., Trotta, C., Canfora, E. et al. The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data. Sci Data 7, 225 (2020). https://doi.org/10.1038/s41597-020-0534-3

Quick start guide. The quick start guide shows how to begin selecting variables from the FLUXNET2015 dataset.

Data products

The complete output from the new data processing pipeline includes over 200 variables, covering measured and derived data, quality flags, uncertainty quantification variables, and results from intermediate data processing steps. Many data users are expected to require access to only a subset of these variables. To serve these users with an easier-to-use data product, we created two data products with different selections of variables for data distribution. Auxiliary data products provide extra information on specific parameters of the data processing pipeline. The groups of products are:

  • FULLSET: Includes all variables generated by the processing. This includes all generated uncertainty quantification variables, all variants of the data products, all quality information flags, and many variables generated by intermediate processing steps to allow in-depth understanding of individual processing steps and their effect on the final data products. Please refer to the FULLSET Data Product documentation for details.
  • SUBSET: Includes a subset of the data product. The variables for this data product were selected based on the expected usage for most users. The selected data products are accompanied by a minimal set of quality flags and uncertainty quantification variables. Please refer to the SUBSET Data Product documentation for details.
  • AUXMETEO: Auxiliary data product containing results from the downscaling of micrometeorological variables using the ERA-Interim reanalysis data product. Variables in these files relate to the linear regression and error/correlation estimates for each data variable used in the downscaling. Please refer to the FULLSET Data Product documentation for details and the data processing pipeline for information on generating and using these variables.
  • AUXNEE: Auxiliary data product with variables resulting from the processing of NEE (mainly related to USTAR filtering) and generation of RECO and GPP. Variables in this product include success/failure of execution of USTAR filtering methods, USTAR thresholds applied to different versions of variables, and percentile/threshold pairs with best model efficiency results. Please refer to the FULLSET Data Product documentation for details and the data processing pipeline for information on generating and using these variables.
  • ERAI: Auxiliary data product containing the full record (1989-2014) of downscaled micrometeorological variables (as related to the site’s measured variables) using the ERA-Interim reanalysis data product (details in the document about the data processing pipeline).
  • PATCH1: [Resolved in the February 6, 2020 update, no longer needed] Complete versions of the LE_F_MDS_QC and H_F_MDS_QC variables; for details, please see known issues

 

Data policies

The FLUXNET2015 Dataset is distributed under the two Tiers of the FLUXNET2015 Data Policy. Data distributed in both Tiers can be accessed from the data download page for the FLUXNET2015 Dataset. Tiers are assigned to individual site-years of data (a collection of data records collected in a calendar year at one site). Tower teams choose Tiers, following the policies of their regional networks. Tier 1 has a more open data policy and includes a subset of the site-years of data in the FLUXNET2015 Dataset. Tier 2 has a more restrictive data policy, but includes all the site-years of data in the FLUXNET2015 Dataset (including site-years of data under Tier 1 policy). To see a list of site-years of data available for each site, please refer to the list of sites and data availability.

 

Temporal aggregation resolutions

The FLUXNET2015 Dataset provides data in five standard temporal aggregations. Tower teams generate either half-hourly (HH) or hourly (HR) data sets, depending on the conditions at the site. Half-hourly and hourly data are treated as equivalent in this documentation: any reference to half-hourly data also applies to hourly data, unless otherwise stated. Half-hourly data are the basis of all the processing done for this dataset and are the finest-grained temporal resolution provided. Coarser aggregations are generated uniformly from half-hourly data within the data processing pipeline. The other standard temporal aggregations are: daily (DD), weekly (WW), monthly (MM), and yearly (YY).

 

Data files

The FLUXNET2015 Dataset is distributed in files separated by sites, by temporal aggregation resolutions (e.g., half-hourly or yearly), by data products (e.g., SUBSET or FULLSET), and by Tiers. Version information is also assigned to the file to document changes required for a site. The file naming convention below details these options for each file. Multiple files with different temporal aggregation resolution (same site, same data product, and same Tier) are available for download as a single ZIP file archive. Site information metadata are also available at the time of download and within the file download manifest (generated at each download operation).

 

Template:
[PUBLISHER]_[SITEID]_[PROCESSING-PIPELINE]_[GROUPING]_[RESOLUTION]_[FIRST-LAST-YEARS]_[SITEVERSION-CODEVERSION].[EXT]

Examples:
FLX_US-Ha1_FLUXNET2015_SUBSET_HH_1992-2012_1-1.csv
FLX_FI-Hyy_FLUXNET2015_FULLSET_1996-2014_1-1.zip

Field definitions:

  • PUBLISHER: 3-character code for publisher. Possible options:
    • FLX: FLUXNET data product
    • other codes can be used by publishers (e.g. regional networks) when distributing data from their databases
  • SITEID: FLUXNET site ID in the format CC-SSS (CC is two-letter country code, SSS is three-character site-level identifier)
  • PROCESSING-PIPELINE:  Version of the processing pipeline (e.g., FLUXNET2015 is the current version)
  • GROUPING: Grouping of variables from release included in file. Possible values:
    • SUBSET: Core set of variables with minimal quality and uncertainty information
    • FULLSET: All variables, including all quality and uncertainty information, and key variables from intermediate processing steps
    • AUXMETEO: Auxiliary variables related to micrometeorological downscaling
    • AUXNEE: Auxiliary variables related to NEE, RECO, and GPP processing
    • ERAI: Full record (1989-2014) of ERA-Interim downscaled micrometeorological variables for site
    • PATCH1: [Resolved in the February 6, 2020 update, no longer needed] Complete versions of the LE_F_MDS_QC and H_F_MDS_QC variables; for details, please see known issues
  • RESOLUTION: Temporal resolution of data product (NOTE: only applicable to DATA files, not ARCHIVE, AUX, or ZIP files). Possible values:
    • HH: Half-Hourly time steps
    • HR: Hourly time steps (NOTE: documentation for HH also applies to HR)
    • DD: Daily time steps
    • WW: Weekly time steps
    • MM: Monthly time steps
    • YY: Yearly time steps
  • FIRST-LAST-YEARS: First and last years of eddy covariance fluxes data in the format YYYY-YYYY
  • SITEVERSION-CODEVERSION: Version string with two integer components separated by a dash (#-#). First integer indicates version of data set for the site within the scope of the release; second integer indicates version of the code of the data processing pipeline within the scope of the release used to process the data set for the site
  • EXT: File extension. Possible values:
    • csv: Comma-separated values in a text file (ASCII)
    • zip: Archive file with all temporal resolutions for same site and data product
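
As an illustration of the naming convention above, the following is a minimal sketch of a file name parser. The function name `parse_fluxnet_filename`, the regular expression, and the keys of the returned dictionary are assumptions made for this example and are not part of the dataset tooling. RESOLUTION is treated as optional because it only applies to data files.

```python
import re

# Regular expression following the file name template in this section.
# The RESOLUTION field is optional: it is absent from ZIP archives and
# from auxiliary files. The group names are illustrative choices.
FILENAME_RE = re.compile(
    r"^(?P<publisher>[A-Za-z0-9]{3})_"
    r"(?P<site_id>[A-Za-z]{2}-[A-Za-z0-9]{3})_"
    r"(?P<pipeline>[A-Za-z0-9]+)_"
    r"(?P<grouping>SUBSET|FULLSET|AUXMETEO|AUXNEE|ERAI|PATCH1)_"
    r"(?:(?P<resolution>HH|HR|DD|WW|MM|YY)_)?"
    r"(?P<first_year>\d{4})-(?P<last_year>\d{4})_"
    r"(?P<site_version>\d+)-(?P<code_version>\d+)"
    r"\.(?P<ext>csv|zip)$"
)


def parse_fluxnet_filename(name):
    """Split a file name into its fields; raise ValueError if it does not match."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognized file name: {name!r}")
    fields = match.groupdict()
    # Years and version components are integers in the convention.
    for key in ("first_year", "last_year", "site_version", "code_version"):
        fields[key] = int(fields[key])
    return fields


# The two example file names given above.
data_file = parse_fluxnet_filename("FLX_US-Ha1_FLUXNET2015_SUBSET_HH_1992-2012_1-1.csv")
archive = parse_fluxnet_filename("FLX_FI-Hyy_FLUXNET2015_FULLSET_1996-2014_1-1.zip")
print(data_file["site_id"], data_file["resolution"])
print(archive["site_id"], archive["resolution"])
```

For the archive example, the `resolution` entry is `None` because the ZIP file bundles all temporal resolutions.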

 

Timestamps

Timestamps used in this release follow the format YYYYMMDDHHMM, truncated at the appropriate resolution (e.g., YYYYMMDD for a date or YYYYMM for a month). Two formats of time associated with a record are used here: (a) a single timestamp, and (b) a pair of timestamps.

In cases in which the temporal resolution of the period represented matches the temporal resolution of the timestamp being used, there is no ambiguity. For instance: to represent a daily aggregate, a temporal resolution up to the day is sufficient for a timestamp to unambiguously identify the period represented, e.g., 20150728. However, in situations in which the temporal resolution of the period represented differs from the timestamp, it is necessary to clarify what is being represented by a given timestamp. For instance, using a timestamp with resolution up to the minute — e.g., 201507281730 — to identify a single half-hour period can be interpreted in different ways — e.g., 201507281730 could refer to the periods of 5:00pm to 5:30pm (end of averaging period convention), or 5:30pm to 6:00pm (beginning of the averaging period convention). Different tower teams, and even different networks, use different conventions. This requires tracking of the convention adopted, often leading to data sets being shifted in time because of confusion on the conventions used. To address this issue, two variables explicitly referring to start and end of a given period are adopted (TIMESTAMP_START and TIMESTAMP_END), eliminating ambiguity.

Data files in half-hourly, hourly, and weekly resolutions use start and end timestamps. Data files using daily, monthly, and yearly resolutions use a single timestamp. Below are examples of resolutions that will use a single TIMESTAMP variable for timekeeping, and resolutions requiring the use of both TIMESTAMP_START and TIMESTAMP_END (blank spaces added for legibility).

sample half-hourly data file (both timestamps):

    TIMESTAMP_START, TIMESTAMP_END,  CO2,   ...
    201507281700,    201507281730,   391.1, ...
    201507281730,    201507281800,   391.8, ...
    ...

sample hourly data file (both timestamps):

    TIMESTAMP_START, TIMESTAMP_END,  CO2,   ...
    201507281700,    201507281800,   391.1, ...
    201507281800,    201507281900,   391.8, ...
    ...

sample daily data file (single timestamp):

    TIMESTAMP, CO2,   ...
    20150728,  391.1, ...
    20150729,  392.8, ...
    ...

sample weekly data file (both timestamps):

    TIMESTAMP_START, TIMESTAMP_END, CO2,   ...
    20150701,        20150707,      391.1, ...
    20150708,        20150714,      391.8, ...
    20150715,        20150721,      390.9, ...
    20150722,        20150728,      392.0, ...
    ...

sample monthly data file (single timestamp):

    TIMESTAMP, CO2,   ...
    201507,    391.1, ...
    201508,    392.8, ...
    ...

sample yearly data file (single timestamp):

    TIMESTAMP, CO2,   ...
    2014,      388.1, ...
    2015,      392.8, ...
    ...
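
The truncated timestamp formats shown in the samples can be told apart by their length. The sketch below parses them with the Python standard library; the helper name `parse_timestamp` is an assumption for illustration, and truncated timestamps are mapped to the first instant of the period they name (a choice made here, not something the dataset prescribes).

```python
from datetime import datetime, timedelta

# Map the number of digits to the matching strptime format:
# YYYY, YYYYMM, YYYYMMDD, YYYYMMDDHHMM.
_FORMATS = {4: "%Y", 6: "%Y%m", 8: "%Y%m%d", 12: "%Y%m%d%H%M"}


def parse_timestamp(value):
    """Parse a timestamp at any of the truncated resolutions."""
    text = str(value).strip()
    if len(text) not in _FORMATS:
        raise ValueError(f"unexpected timestamp length: {text!r}")
    return datetime.strptime(text, _FORMATS[len(text)])


# With explicit start and end timestamps, the averaging period is unambiguous.
start = parse_timestamp("201507281700")
end = parse_timestamp("201507281730")
print(end - start)

# Single timestamps identify a whole day, month, or year.
print(parse_timestamp("20150728"), parse_timestamp("201507"), parse_timestamp("2015"))
```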

Time zone convention

Time is reported in local standard time (i.e., without “Daylight Saving Time”). The timezone information (with respect to UTC time) is reported in the site metadata.
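
Because local standard time has a fixed offset from UTC all year, converting a timestamp to UTC only needs that offset. The sketch below assumes the offset has already been read from the site metadata as a number of hours; the function name `local_standard_to_utc` and the offset of -5 hours used in the example are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def local_standard_to_utc(timestamp, utc_offset_hours):
    """Convert a YYYYMMDDHHMM local standard time string to an aware UTC datetime."""
    naive = datetime.strptime(timestamp, "%Y%m%d%H%M")
    # A fixed offset is sufficient: no daylight saving shift is applied.
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return naive.replace(tzinfo=local_zone).astimezone(timezone.utc)


# Example with an assumed offset of UTC-5 (taken from site metadata in practice).
utc_time = local_standard_to_utc("201507281730", -5)
print(utc_time.isoformat())
```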

 

Column ordering

For text file data representations (i.e., CSV formatted), the variable/column order is NOT guaranteed to be the same across files (e.g., for different sites), even though it will be similar in many cases. This means that any data processing routines should rely on the variable label (which is always consistent) and not on the position of that variable in the file. Timestamps are the only exception and will always be the first variable(s)/column(s) of the data file.
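
One way to follow this recommendation is to look columns up by their header label rather than by position. The sketch below uses the standard `csv` module; the helper name `select_columns` and the sample values are made up for illustration.

```python
import csv
import io


def select_columns(csv_text, wanted):
    """Return the requested columns from CSV text, looked up by header label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    absent = [name for name in wanted if name not in header]
    if absent:
        raise KeyError(f"columns not found: {absent}")
    return [[row[name] for name in wanted] for row in reader]


# Two made-up files holding the same variables in a different column order.
site_a = "TIMESTAMP,CO2,LE_F_MDS_QC\n20150728,391.1,0.9\n"
site_b = "TIMESTAMP,LE_F_MDS_QC,CO2\n20150728,0.9,391.1\n"

rows_a = select_columns(site_a, ["TIMESTAMP", "CO2"])
rows_b = select_columns(site_b, ["TIMESTAMP", "CO2"])
print(rows_a == rows_b)
```

Both calls return the same rows even though CO2 sits in a different position in each file.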

 

Missing data

Missing data values are indicated with -9999 (without decimal points) as a replacement value, independent of the cause for the missing value.
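
Since -9999 is a placeholder rather than a measurement, it should be masked before any arithmetic. The following is a minimal sketch using only the standard library; the helper names are assumptions, and the simple mean is shown only to illustrate the masking, not to reproduce the aggregation done by the processing pipeline.

```python
import math

MISSING_VALUE = -9999  # replacement value used for all missing data


def to_float_or_nan(raw):
    """Convert a CSV field to float, mapping the missing-data marker to NaN."""
    value = float(raw)
    return math.nan if value == MISSING_VALUE else value


def mean_ignoring_missing(values):
    """Average the valid entries; return NaN if every entry is missing."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


# Made-up CO2 fields, with the middle record missing.
fields = ["391.1", "-9999", "391.8"]
co2 = [to_float_or_nan(field) for field in fields]
print(mean_ignoring_missing(co2))
```

Without the masking step, the placeholder would be averaged in as if it were a real value.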

 

Known Issues

A list of known issues and limitations relevant to the FLUXNET2015 Dataset will be maintained. Types of issues covered will include data-related issues, processing-related issues, or notes tower teams might want to make available about their data.

Releases of the FLUXNET2015 Dataset

The FLUXNET2015 Dataset had two incremental releases, in July 2016 and November 2016, following the original December 2015 release (see the change log for information on changes). This approach allowed sites that hadn’t been processed yet or had pending data quality issues to be included as they became ready. Another release was made available in February 2020, mainly with updates to the data policy and metadata available for FLUXNET2015; no new sites or new data were included (more details in the change log).

FAQ

A collection of frequently asked questions will be compiled and maintained as new questions are raised.

 

Contact Us

Please send questions, comments, and feedback to [email protected]. We’d love to hear from you!

 

Please use the downloads page to get access to the FLUXNET2015 Dataset.

