API Reference

This page provides an auto-generated summary of PyISD’s public API.

The pyisd package

exception pyisd.DataDownloadError

Raised when station data cannot be fetched or parsed.

class pyisd.IsdLite(crs=4326, verbose=0, metadata_retries=None, metadata_timeout=None, metadata_retry_delay=None)

A client for accessing NOAA’s ISD-Lite (Integrated Surface Data Lite) weather dataset.

ISD-Lite provides global hourly observations of temperature, dew point, pressure, wind, sky coverage, and precipitation from weather stations worldwide. This client handles downloading and processing the data, with support for spatial and temporal filtering.

The data fields available are:
  • temp: Air temperature (Celsius)

  • dewtemp: Dew point temperature (Celsius)

  • pressure: Sea level pressure (hectopascals)

  • winddirection: Wind direction (degrees)

  • windspeed: Wind speed (meters/second)

  • skycoverage: Sky coverage/ceiling (code)

  • precipitation-1h: One-hour precipitation (mm)

  • precipitation-6h: Six-hour precipitation (mm)

Parameters:
  • crs (int or str, optional) – Coordinate reference system for spatial operations. Defaults to 4326 (WGS 84).

  • verbose (int, optional) – Verbosity level for progress reporting. 0 for silent, 1 for progress bars. Defaults to 0.

  • metadata_retries (int, optional) – Number of metadata download attempts before failing. Defaults to 3.

  • metadata_timeout (int or float, optional) – Metadata request timeout in seconds. Defaults to 2.

  • metadata_retry_delay (int or float, optional) – Base delay between metadata retries. Retries use exponential backoff. Defaults to 1.
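The retry parameters can be read as a delay schedule. The sketch below assumes the common doubling form of exponential backoff (delay = base × 2^n). The exact formula PyISD uses is not stated on this page, and backoff_delays is a hypothetical helper, not part of the package.

```python
def backoff_delays(retries=3, base_delay=1.0):
    """Return the waits (in seconds) between consecutive attempts.

    Assumes delay = base_delay * 2**n. This is an illustrative
    convention, not necessarily PyISD's exact formula.
    """
    # With `retries` attempts in total there are `retries - 1` waits in between.
    return [base_delay * 2 ** n for n in range(max(retries - 1, 0))]


# Documented defaults: 3 attempts, base delay of 1 second.
print(backoff_delays(retries=3, base_delay=1.0))
```

Under this assumption, raising metadata_retries by one adds a single wait that is twice as long as the previous one.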

Examples

# Initialize the client
from pyisd import IsdLite

isd = IsdLite()

# Get data for all US stations for January 2020
data = isd.get_data(
    start='2020-01-01',
    end='2020-01-31',
    countries='US'
)

# Get data within a specific region, organized by weather variable
texas_data = isd.get_data(
    start='2020-01-01',
    geometry=(-106.6, 25.8, -93.5, 36.5),  # Texas bounding box
    organize_by='field'
)

# Use with geopandas for spatial filtering
import geopandas as gpd
city = gpd.read_file('city_boundary.geojson')
city_data = isd.get_data(
    start='2020-01-01',
    geometry=city
)

# Access specific weather variables
temperatures = texas_data['temp']  # When organize_by='field'
# Or
station_data = data['724940']  # When organize_by='location'
station_temp = station_data['temp']

get_data(start, end=None, station_id=None, countries=None, geometry=None, organize_by='location', n_jobs=6)

Fetches weather data from the ISD-Lite dataset for the specified time range and location.

Parameters:
  • start (datetime) – The start date for the data retrieval.

  • end (datetime, optional) – The end date for the data retrieval. If not provided, defaults to the start date.

  • station_id (str, optional) – A specific weather station ID in the format ‘USAF-WBAN’ to retrieve data for. If provided, overrides any spatial or country filters. If None, data for all stations will be retrieved.

  • countries (str or iterable of str, optional) – Country code(s) to filter stations by. Must be valid codes from the ISD-Lite metadata (found in raw_metadata[‘CTRY’]). Can be either a single country code as string or multiple codes as an iterable. If None, stations from all countries will be considered. When used together with geometry, both filters are applied.

  • geometry (GeoSeries or tuple, optional) – A spatial filter for the stations. Can be either:
      - A GeoSeries or geometry object to filter stations by spatial location
      - A tuple of (xmin, ymin, xmax, ymax) defining a bounding box
    If None, data for all stations will be retrieved. Defaults to None. When used together with countries, only stations matching both filters are considered.

  • organize_by (str, optional) – Determines how the resulting data is organized. Options are:
      - ‘location’: Organize data by weather station.
      - ‘field’: Organize data by weather variable.
    Defaults to ‘location’.

  • n_jobs (int, optional) – The number of threads to use for parallel data downloads. Defaults to 6.
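The way countries and geometry combine can be illustrated without the library. This is a minimal sketch that assumes a bounding box includes its edges and that a station must pass every filter that was supplied. The station records, in_bbox, and select_stations are hypothetical and do not reflect PyISD's internals.

```python
# Hypothetical station records; real metadata comes from isd.raw_metadata.
stations = [
    {"id": "A", "ctry": "US", "x": -97.7, "y": 30.3},
    {"id": "B", "ctry": "US", "x": -74.0, "y": 40.7},
    {"id": "C", "ctry": "MX", "x": -99.1, "y": 28.0},
]


def in_bbox(x, y, bbox):
    """True if the point lies inside (xmin, ymin, xmax, ymax), edges included."""
    xmin, ymin, xmax, ymax = bbox
    return xmin <= x <= xmax and ymin <= y <= ymax


def select_stations(stations, countries=None, bbox=None):
    """Return the IDs of stations that satisfy every filter that was supplied."""
    if isinstance(countries, str):
        countries = [countries]  # accept a single code or an iterable of codes
    selected = []
    for s in stations:
        if countries is not None and s["ctry"] not in countries:
            continue
        if bbox is not None and not in_bbox(s["x"], s["y"], bbox):
            continue
        selected.append(s["id"])
    return selected


texas_bbox = (-106.6, 25.8, -93.5, 36.5)
print(select_stations(stations, countries="US", bbox=texas_bbox))
```

Only station A passes both filters here: B is outside the box, and C is inside the box but in a different country.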

Returns:

A dictionary containing the weather data. The structure of the dictionary depends on the organize_by parameter:

  • If ‘location’: Keys are station IDs, and values are DataFrames with weather data.

  • If ‘field’: Keys are weather variables, and values are DataFrames with stations as columns. If no data is available, each field maps to an empty DataFrame indexed by the requested time range.
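The two layouts hold the same observations with the nesting swapped. The sketch below shows the pivot with plain dictionaries standing in for DataFrames. The station IDs and readings are made up, and to_field_layout is a hypothetical helper rather than a PyISD function.

```python
from collections import defaultdict

# Toy stand-in for organize_by='location': station -> field -> {time: value}.
by_location = {
    "station_1": {"temp": {"00:00": 5.0, "01:00": 4.5}, "windspeed": {"00:00": 3.1}},
    "station_2": {"temp": {"00:00": 7.2}, "windspeed": {"00:00": 1.0, "01:00": 2.0}},
}


def to_field_layout(by_location):
    """Pivot station -> field -> series into field -> station -> series."""
    by_field = defaultdict(dict)
    for station, fields in by_location.items():
        for field, series in fields.items():
            by_field[field][station] = series
    return dict(by_field)


by_field = to_field_layout(by_location)
print(sorted(by_field))               # the weather variables become the keys
print(sorted(by_field["temp"]))       # the stations become the inner keys
```

The ‘field’ layout is convenient when comparing one variable across many stations; the ‘location’ layout suits per-station analysis.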

Return type:

dict

Raises:
  • ValueError – If organize_by is not one of the allowed options.

  • DataDownloadError – If station data cannot be downloaded or parsed.

Examples

# Get data for a single country
data = isd.get_data(start='2020-01-01', end='2020-12-31', countries='US')

# Get data for multiple countries
data = isd.get_data(start='2020-01-01', countries=['US', 'CA', 'MX'])

# Get data within a bounding box
data = isd.get_data(start='2020-01-01', geometry=(-100, 30, -90, 40))

property raw_metadata

Weather station metadata, loaded lazily on first access.

refresh_metadata()

Force a metadata refresh from NOAA, replacing the in-memory cache on success.
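The interplay between lazy loading and an explicit refresh can be sketched as a small cache. This is an illustration of the pattern the two descriptions suggest, not PyISD's actual implementation. LazyMetadata and its loader argument are hypothetical.

```python
class LazyMetadata:
    """Toy illustration of lazy loading with an explicit refresh."""

    def __init__(self, loader):
        self._loader = loader  # callable that fetches the metadata
        self._cache = None     # nothing is loaded until first access

    @property
    def raw_metadata(self):
        # Load once, on first access; later reads reuse the cached value.
        if self._cache is None:
            self._cache = self._loader()
        return self._cache

    def refresh_metadata(self):
        # Fetch first, assign second: if the loader raises, the old cache survives.
        fresh = self._loader()
        self._cache = fresh
        return fresh


calls = []


def fake_loader():
    """Stand-in for a network download; counts how often it is invoked."""
    calls.append(1)
    return {"version": len(calls)}


meta = LazyMetadata(fake_loader)
meta.raw_metadata          # triggers the first (and only automatic) load
meta.raw_metadata          # served from the cache
meta.refresh_metadata()    # forces a second load
print(len(calls), meta.raw_metadata)
```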

exception pyisd.MetadataDownloadError

Raised when station metadata cannot be fetched from NOAA.

The pyisd.misc module

pyisd.misc.daterange(date_start, date_end=None, freq='h') → DatetimeIndex

Creates a date range with a given frequency between date_start and date_end.

Parameters:
  • date_start (int or str) – The start date as an integer in “yyyymmdd” format or as a string.

  • date_end (int or str or None, optional) – The end date as an integer in “yyyymmdd” format or as a string. If None, the end date will equal the start date. Default: None.

  • freq (str, optional) – The frequency of the dates in the range. For example, ‘h’ for hours, ‘D’ for days, ‘M’ for months, etc. Default: ‘h’.
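The “yyyymmdd” convention and the date_end default can be illustrated with the standard library alone. This sketch handles only the “yyyymmdd” form (integer or string) and returns plain datetime objects rather than a DatetimeIndex. normalize_bounds is a hypothetical helper, not part of pyisd.misc.

```python
from datetime import datetime


def parse_yyyymmdd(value):
    """Parse 20220306 or '20220306' into a datetime at midnight."""
    return datetime.strptime(str(value), "%Y%m%d")


def normalize_bounds(date_start, date_end=None):
    """Return (start, end), letting a missing end fall back to the start."""
    start = parse_yyyymmdd(date_start)
    end = start if date_end is None else parse_yyyymmdd(date_end)
    return start, end


start, end = normalize_bounds(20220306, "20220307")
print(start.isoformat(), end.isoformat())
```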

Returns:

A DatetimeIndex object containing the dates in the specified range with the given frequency.

Return type:

pd.DatetimeIndex

Examples

>>> daterange(20220306, 20220307, freq='D')
DatetimeIndex(['2022-03-06', '2022-03-07'], dtype='datetime64[ns]', freq='D')
>>> daterange(20220306)
DatetimeIndex(['2022-03-06 00:00:00', '2022-03-06 01:00:00', ...],
              dtype='datetime64[ns]', freq='h')

pyisd.misc.proj(x: float | int | Iterable[float], y: float | int | Iterable[float], proj_in: str | int | CRS, proj_out: str | int | CRS) → Tuple[Iterable[float], Iterable[float]]

Projects coordinates from one coordinate system to another.

Parameters:
  • x (Union[float, int, Iterable[float]]) – x-coordinates to project.

  • y (Union[float, int, Iterable[float]]) – y-coordinates to project.

  • proj_in (Union[str, int, pyproj.CRS]) – Input coordinate system.

  • proj_out (Union[str, int, pyproj.CRS]) – Output coordinate system.

Returns:

Projected coordinates (x, y).

Return type:

Tuple[Iterable[float], Iterable[float]]
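To make concrete what projecting coordinates involves, here is a hand-written forward transform for one specific pair of systems: longitude/latitude in degrees (EPSG:4326) to spherical Web Mercator in metres (EPSG:3857). It is a swapped-in illustration, not how proj is implemented. The pyproj types in the signature suggest the function relies on pyproj, which supports arbitrary CRS pairs. lonlat_to_web_mercator is a hypothetical name.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, the sphere radius used by EPSG:3857


def lonlat_to_web_mercator(lon, lat):
    """Forward spherical Mercator: degrees (EPSG:4326) -> metres (EPSG:3857)."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# A point inside the Texas bounding box used in the examples above.
x, y = lonlat_to_web_mercator(-97.74, 30.27)
print(round(x), round(y))
```

The formula is undefined at the poles, which is one reason general-purpose code should delegate to a projection library rather than hard-code a single transform.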

pyisd.misc.to_crs(proj: str | int | CRS | Proj | None) → CRS | None

Converts a coordinate system into a pyproj.CRS object.

Parameters:

proj (Union[str, int, pyproj.CRS, pyproj.Proj, None]) – The coordinate system to convert.

Returns:

The pyproj.CRS object corresponding to the specified coordinate system.

Return type:

Optional[pyproj.CRS]

Example

>>> to_crs('EPSG:4326')
<pyproj.CRS ...>

>>> to_crs(27572)
<pyproj.CRS ...>