Bulk Data Download

!pip install aiohttp requests s5cmd "xarray[io]"

import xarray as xr

This notebook shows how to perform bulk downloads with an S3 command-line tool. This is useful if you want local access to a large subset of the data, or even to download the whole archive.

We can download data in bulk using any command-line tool that supports the S3 protocol. We recommend s5cmd, which can be installed by running:

pip install s5cmd

Now we can download data using the cp command.

In this example, we transfer the Thomson scattering data for shot 30420 to the local machine.

We need to set the endpoint where the bucket is hosted (for now: https://s3.echo.stfc.ac.uk) with --endpoint-url, and pass --no-sign-request for anonymous access.

%%capture --no-display
%%bash
s5cmd --no-sign-request --endpoint-url https://s3.echo.stfc.ac.uk cp s3://mast/level2/shots/30420.zarr/thomson_scattering/* ./30420.zarr/thomson_scattering;
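To fetch other shots or diagnostic groups, the same command can be assembled from Python. The following is a minimal sketch, not part of the original notebook: the helper names `build_cp_command` and `download_group` are hypothetical, and it assumes every shot follows the `s3://mast/level2/shots/<shot>.zarr/<group>` layout seen in the example above.

```python
import shlex
import subprocess

ENDPOINT_URL = "https://s3.echo.stfc.ac.uk"  # endpoint given in the text above
BUCKET_PREFIX = "s3://mast/level2/shots"      # assumed layout, taken from the example


def build_cp_command(shot, group, dest_root="."):
    """Return the s5cmd argument list that copies one group of one shot."""
    source = f"{BUCKET_PREFIX}/{shot}.zarr/{group}/*"
    destination = f"{dest_root}/{shot}.zarr/{group}"
    return [
        "s5cmd",
        "--no-sign-request",             # anonymous access
        "--endpoint-url", ENDPOINT_URL,  # where the bucket is hosted
        "cp", source, destination,
    ]


def download_group(shot, group, dest_root="."):
    """Run the copy; raises CalledProcessError if s5cmd exits non-zero."""
    subprocess.run(build_cp_command(shot, group, dest_root), check=True)


# Print the command for inspection without running it.
command = build_cp_command(30420, "thomson_scattering")
print(shlex.join(command))
```

Passing the arguments as a list keeps the shell out of the picture, so the `*` wildcard reaches s5cmd unchanged. Calling `download_group(30420, "thomson_scattering")` would then perform the same transfer as the cell above.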

Finally, we can open the downloaded Zarr store locally:

xr.open_zarr('30420.zarr/thomson_scattering', consolidated=False)
<xarray.Dataset> Size: 257kB
Dimensions:       (time: 88, major_radius: 120)
Coordinates:
  * major_radius  (major_radius) float64 960B 0.3 0.31 0.32 ... 1.47 1.48 1.49
  * time          (time) float64 704B -0.0568 -0.0518 -0.0468 ... 0.3732 0.3782
Data variables:
    n_e_core      (time) float64 704B ...
    p_e           (major_radius, time) float64 84kB ...
    n_e           (major_radius, time) float64 84kB ...
    t_e_core      (time) float64 704B ...
    t_e           (major_radius, time) float64 84kB ...
Attributes:
    description:  Thomson scattering diagnostic
    label:        core temperature
    name:         thomson_scattering
    units:        eV
    imas:         thomson_scattering
    license:      {'name': 'Creative Commons 4.0 BY-SA', 'url': 'https://crea...
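If the transfer was interrupted, opening the store can fail with an error that is hard to interpret. As a rough sanity check, one can look for the group metadata file before calling `xr.open_zarr`. This is a sketch under stated assumptions: the helper name `looks_like_zarr_group` is hypothetical, and it relies only on the Zarr convention that a v2 group has a `.zgroup` file and a v3 group a `zarr.json` file at its root. It does not verify that every chunk arrived.

```python
from pathlib import Path


def looks_like_zarr_group(path):
    """Return True if `path` is a directory holding Zarr group metadata.

    Assumption: Zarr v2 groups carry a `.zgroup` file and Zarr v3 groups
    a `zarr.json` file at their root.
    """
    root = Path(path)
    return root.is_dir() and any(
        (root / marker).is_file() for marker in (".zgroup", "zarr.json")
    )


store = "30420.zarr/thomson_scattering"  # local path used in the example above
if looks_like_zarr_group(store):
    print(f"{store} looks like a Zarr group")
else:
    print(f"{store} has no group metadata; the download may be incomplete")
```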