.. _interp:

Interpolating data
==================

.. ipython:: python
   :suppress:

    import numpy as np
    import pandas as pd
    import xarray as xr
    np.random.seed(123456)

xarray offers flexible interpolation routines, which have a similar interface
to our :ref:`indexing <indexing>`.

.. note::

  ``interp`` requires `scipy` installed.


Scalar and 1-dimensional interpolation
--------------------------------------

Interpolating a :py:class:`~xarray.DataArray` works mostly like labeled
indexing of a :py:class:`~xarray.DataArray`,

.. ipython:: python

    da = xr.DataArray(np.sin(0.3 * np.arange(12).reshape(4, 3)),
                      [('time', np.arange(4)),
                       ('space', [0.1, 0.2, 0.3])])
    # label lookup
    da.sel(time=3)

    # interpolation
    da.interp(time=2.5)


Similar to the indexing, :py:meth:`~xarray.DataArray.interp` also accepts an
array-like, which gives the interpolated result as an array.

.. ipython:: python

    # label lookup
    da.sel(time=[2, 3])

    # interpolation
    da.interp(time=[2.5, 3.5])

To interpolate data with a :py:func:`numpy.datetime64` coordinate you can pass a string.

.. ipython:: python

    da_dt64 = xr.DataArray([1, 3],
                           [('time', pd.date_range('1/1/2000', '1/3/2000', periods=2))])
    da_dt64.interp(time='2000-01-02')

The interpolated data can be merged into the original :py:class:`~xarray.DataArray`
by specifying the time periods required.

.. ipython:: python

    da_dt64.interp(time=pd.date_range('1/1/2000', '1/3/2000', periods=3))

Interpolation of data indexed by a :py:class:`~xarray.CFTimeIndex` is also
allowed.  See :ref:`CFTimeIndex` for examples.
    
.. note::

  Currently, our interpolation only works for regular grids.
  Therefore, similarly to :py:meth:`~xarray.DataArray.sel`,
  only 1D coordinates along a dimension can be used as the
  original coordinate to be interpolated.


Multi-dimensional Interpolation
-------------------------------

Like :py:meth:`~xarray.DataArray.sel`, :py:meth:`~xarray.DataArray.interp`
accepts multiple coordinates. In this case, multidimensional interpolation
is carried out.

.. ipython:: python

    # label lookup
    da.sel(time=2, space=0.1)

    # interpolation
    da.interp(time=2.5, space=0.15)

Array-like coordinates are also accepted:

.. ipython:: python

    # label lookup
    da.sel(time=[2, 3], space=[0.1, 0.2])

    # interpolation
    da.interp(time=[1.5, 2.5], space=[0.15, 0.25])


:py:meth:`~xarray.DataArray.interp_like` method is a useful shortcut. This
method interpolates an xarray object onto the coordinates of another xarray
object. For example, if we want to compute the difference between
two :py:class:`~xarray.DataArray` s (``da`` and ``other``) staying on slightly
different coordinates,

.. ipython:: python

  other = xr.DataArray(np.sin(0.4 * np.arange(9).reshape(3, 3)),
                       [('time', [0.9, 1.9, 2.9]),
                       ('space', [0.15, 0.25, 0.35])])

it might be a good idea to first interpolate ``da`` so that it will stay on the
same coordinates of ``other``, and then subtract it.
:py:meth:`~xarray.DataArray.interp_like` can be used for such a case,

.. ipython:: python

  # interpolate da along other's coordinates
  interpolated = da.interp_like(other)
  interpolated

It is now possible to safely compute the difference ``other - interpolated``.


Interpolation methods
---------------------

We use :py:func:`scipy.interpolate.interp1d` for 1-dimensional interpolation and
:py:func:`scipy.interpolate.interpn` for multi-dimensional interpolation.

The interpolation method can be specified by the optional ``method`` argument.

.. ipython:: python

    da = xr.DataArray(np.sin(np.linspace(0, 2 * np.pi, 10)), dims='x',
                      coords={'x': np.linspace(0, 1, 10)})

    da.plot.line('o', label='original')
    da.interp(x=np.linspace(0, 1, 100)).plot.line(label='linear (default)')
    da.interp(x=np.linspace(0, 1, 100), method='cubic').plot.line(label='cubic')
    @savefig interpolation_sample1.png width=4in
    plt.legend()

Additional keyword arguments can be passed to scipy's functions.

.. ipython:: python

    # fill 0 for the outside of the original coordinates.
    da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={'fill_value': 0.0})
    # extrapolation
    da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={'fill_value': 'extrapolate'})


Advanced Interpolation
----------------------

:py:meth:`~xarray.DataArray.interp` accepts :py:class:`~xarray.DataArray`
as similar to :py:meth:`~xarray.DataArray.sel`, which enables us more advanced interpolation.
Based on the dimension of the new coordinate passed to :py:meth:`~xarray.DataArray.interp`, the dimension of the result are determined.

For example, if you want to interpolate a two dimensional array along a particular dimension, as illustrated below,
you can pass two 1-dimensional :py:class:`~xarray.DataArray` s with
a common dimension as new coordinate.

.. image:: _static/advanced_selection_interpolation.svg
    :height: 200px
    :width: 400 px
    :alt: advanced indexing and interpolation
    :align: center

For example:

.. ipython:: python

    da = xr.DataArray(np.sin(0.3 * np.arange(20).reshape(5, 4)),
                      [('x', np.arange(5)),
                       ('y', [0.1, 0.2, 0.3, 0.4])])
    # advanced indexing
    x = xr.DataArray([0, 2, 4], dims='z')
    y = xr.DataArray([0.1, 0.2, 0.3], dims='z')
    da.sel(x=x, y=y)

    # advanced interpolation
    x = xr.DataArray([0.5, 1.5, 2.5], dims='z')
    y = xr.DataArray([0.15, 0.25, 0.35], dims='z')
    da.interp(x=x, y=y)

where values on the original coordinates
``(x, y) = ((0.5, 0.15), (1.5, 0.25), (2.5, 0.35))`` are obtained by the
2-dimensional interpolation and mapped along a new dimension ``z``.

If you want to add a coordinate to the new dimension ``z``, you can supply
:py:class:`~xarray.DataArray` s with a coordinate,

.. ipython:: python

    x = xr.DataArray([0.5, 1.5, 2.5], dims='z', coords={'z': ['a', 'b','c']})
    y = xr.DataArray([0.15, 0.25, 0.35], dims='z',
                     coords={'z': ['a', 'b','c']})
    da.interp(x=x, y=y)

For the details of the advanced indexing,
see :ref:`more advanced indexing <more_advanced_indexing>`.


Interpolating arrays with NaN
-----------------------------

Our :py:meth:`~xarray.DataArray.interp` works with arrays with NaN
the same way that
`scipy.interpolate.interp1d <https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html>`_ and
`scipy.interpolate.interpn <https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interpn.html>`_ do.
``linear`` and ``nearest`` methods return arrays including NaN,
while other methods such as ``cubic`` or ``quadratic`` return all NaN arrays.

.. ipython:: python

    da = xr.DataArray([0, 2, np.nan, 3, 3.25], dims='x',
                      coords={'x': range(5)})
    da.interp(x=[0.5, 1.5, 2.5])
    da.interp(x=[0.5, 1.5, 2.5], method='cubic')

To avoid this, you can drop NaN by :py:meth:`~xarray.DataArray.dropna`, and
then make the interpolation

.. ipython:: python

    dropped = da.dropna('x')
    dropped
    dropped.interp(x=[0.5, 1.5, 2.5], method='cubic')

If NaNs are distributed randomly in your multidimensional array,
dropping all the columns containing more than one NaNs by
:py:meth:`~xarray.DataArray.dropna` may lose a significant amount of information.
In such a case, you can fill NaN by :py:meth:`~xarray.DataArray.interpolate_na`,
which is similar to :py:meth:`pandas.Series.interpolate`.

.. ipython:: python

    filled = da.interpolate_na(dim='x')
    filled

This fills NaN by interpolating along the specified dimension.
After filling NaNs, you can interpolate:

.. ipython:: python

    filled.interp(x=[0.5, 1.5, 2.5], method='cubic')

For the details of :py:meth:`~xarray.DataArray.interpolate_na`,
see :ref:`Missing values <missing_values>`.


Example
-------

Let's see how :py:meth:`~xarray.DataArray.interp` works on real data.

.. ipython:: python

    # Raw data
    ds = xr.tutorial.open_dataset('air_temperature').isel(time=0)
    fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
    ds.air.plot(ax=axes[0])
    axes[0].set_title('Raw data')

    # Interpolated data
    new_lon = np.linspace(ds.lon[0], ds.lon[-1], ds.dims['lon'] * 4)
    new_lat = np.linspace(ds.lat[0], ds.lat[-1], ds.dims['lat'] * 4)
    dsi = ds.interp(lat=new_lat, lon=new_lon)
    dsi.air.plot(ax=axes[1])
    @savefig interpolation_sample3.png width=8in
    axes[1].set_title('Interpolated data')

Our advanced interpolation can be used to remap the data to the new coordinate.
Consider the new coordinates x and z on the two dimensional plane.
The remapping can be done as follows

.. ipython:: python

    # new coordinate
    x = np.linspace(240, 300, 100)
    z = np.linspace(20, 70, 100)
    # relation between new and original coordinates
    lat = xr.DataArray(z, dims=['z'], coords={'z': z})
    lon = xr.DataArray((x[:, np.newaxis]-270)/np.cos(z*np.pi/180)+270,
                       dims=['x', 'z'], coords={'x': x, 'z': z})

    fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
    ds.air.plot(ax=axes[0])
    # draw the new coordinate on the original coordinates.
    for idx in [0, 33, 66, 99]:
        axes[0].plot(lon.isel(x=idx), lat, '--k')
    for idx in [0, 33, 66, 99]:
        axes[0].plot(*xr.broadcast(lon.isel(z=idx), lat.isel(z=idx)), '--k')
    axes[0].set_title('Raw data')

    dsi = ds.interp(lon=lon, lat=lat)
    dsi.air.plot(ax=axes[1])
    @savefig interpolation_sample4.png width=8in
    axes[1].set_title('Remapped data')
