NOAA Global Forecast System (GFS) Dataset for Machine Learning

Install DagsHub:

pip install dagshub

To stream this data directly from DagsHub:

from dagshub.streaming import DagsHubFilesystem

fs = DagsHubFilesystem(".", repo_url="https://test.dagshub.com/DagsHub-Datasets/noaa-gfs-bdp-pds-dataset")

fs.listdir("s3://noaa-gfs-bdp-pds")

Description

NOTE: The NCEP Global Forecast System was upgraded to v16.3.0, effective November 29, 2022.

The Global Forecast System (GFS) is a weather forecast model produced by the National Centers for Environmental Prediction (NCEP). Dozens of atmospheric and land-soil variables are available through this dataset, from temperatures, winds, and precipitation to soil moisture and atmospheric ozone concentration. The GFS covers the entire globe at a base horizontal resolution of 18 miles (28 kilometers) between grid points, which is used by the operational forecasters who predict weather out to 16 days in the future. Horizontal resolution drops to 44 miles (70 kilometers) between grid points for forecasts between one week and two weeks.

The NOAA Global Forecast System (GFS) Warm Start Initial Conditions are produced by the National Centers for Environmental Prediction (NCEP) to run operational deterministic medium-range numerical weather predictions. The GFS is built with the GFDL Finite-Volume Cubed-Sphere Dynamical Core (FV3) and the Grid-Point Statistical Interpolation (GSI) data assimilation system. Please visit the links in the Documentation section to find more details about the model and the data assimilation systems. The current operational GFS is run with 64 vertical layers extending from the surface to the upper stratosphere, on six cubed-sphere tiles at C768 (13-km) horizontal resolution. A new version of the GFS that has 127 layers extending to the mesopause will be implemented for operation on February 3, 2021. These initial conditions are made available four times per day for running forecasts at the 00Z, 06Z, 12Z and 18Z cycles. For each cycle, the dataset contains two parts. The first guess of the atmospheric state is found in the directory ./gdas.yyyymmdd/hh-6/RESTART and consists of 6-hour GDAS forecasts from the previous cycle. The atmospheric analysis increments and surface analysis for the current cycle are found in the directory ./gfs.yyyymmdd/hh and are produced by the data assimilation systems.
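
The directory layout described above can be sketched as a small helper that maps a cycle time to the two relative paths. This is an illustrative sketch, not part of the dataset's tooling: the function name warm_start_dirs is hypothetical, and it assumes that "hh-6" means the previous cycle, so the date in the gdas path also rolls back when the current cycle is 00Z.

```python
from datetime import datetime, timedelta
from typing import Tuple

# GFS cycles run four times per day, at 00Z, 06Z, 12Z and 18Z.
VALID_CYCLE_HOURS = (0, 6, 12, 18)


def warm_start_dirs(cycle: datetime) -> Tuple[str, str]:
    """Return (first_guess_dir, analysis_dir) for a GFS cycle time.

    Hypothetical helper. Assumes the gdas directory is keyed by the
    date and hour of the previous cycle (6 hours earlier).
    """
    if cycle.hour not in VALID_CYCLE_HOURS:
        raise ValueError("cycle hour must be one of 0, 6, 12, 18")

    previous = cycle - timedelta(hours=6)

    # First guess: 6-hour GDAS forecast from the previous cycle.
    first_guess_dir = f"./gdas.{previous:%Y%m%d}/{previous:%H}/RESTART"
    # Analysis increments and surface analysis for the current cycle.
    analysis_dir = f"./gfs.{cycle:%Y%m%d}/{cycle:%H}"
    return first_guess_dir, analysis_dir


if __name__ == "__main__":
    guess, analysis = warm_start_dirs(datetime(2021, 3, 1, 0))
    print(guess)
    print(analysis)
```

The paths are relative to the dataset root, so they would need to be joined with whatever prefix is used to access the bucket.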

Additional information

Update frequency

4 times a day, every 6 hours starting at midnight UTC
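
Given that schedule, the most recent nominal cycle for any UTC time can be found by rounding the hour down to a multiple of six. The sketch below is a minimal illustration with a hypothetical function name, latest_cycle. It returns the nominal cycle time only; data for a cycle is published some time after the cycle hour, and that delay is not modeled here.

```python
from datetime import datetime, timezone


def latest_cycle(now: datetime) -> datetime:
    """Return the most recent nominal cycle (00Z, 06Z, 12Z or 18Z) at or before `now`.

    Hypothetical helper. `now` must be timezone-aware so the conversion
    to UTC is unambiguous.
    """
    if now.tzinfo is None:
        raise ValueError("now must be a timezone-aware datetime")

    now_utc = now.astimezone(timezone.utc)
    # Round the hour down to the nearest multiple of 6.
    cycle_hour = (now_utc.hour // 6) * 6
    return now_utc.replace(hour=cycle_hour, minute=0, second=0, microsecond=0)


if __name__ == "__main__":
    print(latest_cycle(datetime.now(timezone.utc)))
```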

License

Open Data. There are no restrictions on the use of this data.

Related datasets

Atmospheric Models from Météo-France

CAFE60 reanalysis

Coupled Model Intercomparison Project Phase 5 (CMIP5) University of Wisconsin-Madison Probabilistic Downscaling Dataset

Earth Radio Occultation
