How to use the verde.datasets module in verde

To help you get started, we've selected a few verde examples based on popular ways the library is used in public projects.


From fatiando/verde: tutorials/overview.py (view on GitHub)
# inherit from :class:`verde.base.BaseGridder`). Since most gridders in Verde are linear
# models, we based our gridder interface on the `scikit-learn
# <https://scikit-learn.org>`__ estimator interface: they all implement a
# :meth:`~verde.base.BaseGridder.fit` method that estimates the model parameters based
# on data and a :meth:`~verde.base.BaseGridder.predict` method that calculates new data
# based on the estimated parameters.
#
# Unlike scikit-learn, our data model is not a feature matrix and a target vector (e.g.,
# ``est.fit(X, y)``) but a tuple of coordinate arrays and a data vector (e.g.,
# ``grd.fit((easting, northing), data)``). This makes more sense for spatial data and is
# common to all classes and functions in Verde.
#
# As an example, let's generate some synthetic data using
# :class:`verde.datasets.CheckerBoard`:

import verde as vd

data = vd.datasets.CheckerBoard().scatter(size=500, random_state=0)
print(data.head())


########################################################################################
# The data are random points taken from a checkerboard function and returned to us in a
# :class:`pandas.DataFrame`:

import matplotlib.pyplot as plt

plt.figure()
plt.scatter(data.easting, data.northing, c=data.scalars, cmap="RdBu_r")
plt.colorbar()
plt.show()

########################################################################################
# Now we can use the bi-harmonic spline method [Sandwell1987]_ to fit this data.
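# As a minimal sketch (the original tutorial continues beyond this excerpt), we fit
# :class:`verde.Spline`, which implements this method, and predict back at the data
# points using the fit/predict interface described above:

import numpy as np

spline = vd.Spline()
spline.fit((data.easting, data.northing), data.scalars)
predicted = spline.predict((data.easting, data.northing))
print("Maximum absolute residual:", np.abs(predicted - data.scalars).max())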
From fatiando/verde: tutorials/model_evaluation.py (view on GitHub)
a quantified measure of the quality of a given fitted gridder, we can use it to tune the
gridder's parameters, like ``damping`` for a :class:`~verde.Spline` (see
:ref:`model_selection`).
"""
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import pyproj
import verde as vd

########################################################################################
# Verde provides adaptations of common scikit-learn tools to work better with spatial
# data. Let's use these tools to evaluate the performance of a :class:`~verde.Spline` on
# our sample bathymetry data.

data = vd.datasets.fetch_baja_bathymetry()

# Use Mercator projection because Spline is a Cartesian gridder
projection = pyproj.Proj(proj="merc", lat_ts=data.latitude.mean())

########################################################################################
# Before gridding, we need to decimate the data to avoid aliasing because of the
# oversampling along the ship tracks. We'll use a blocked median with 10 arc-minute
# blocks since that is our desired grid spacing.
spacing = 10 / 60
reducer = vd.BlockReduce(reduction=np.median, spacing=spacing)
coordinates, bathymetry = reducer.filter(
    (data.longitude, data.latitude), data.bathymetry_m
)
proj_coords = projection(*coordinates)

########################################################################################
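# To complete the picture (a minimal sketch; the rest of the tutorial is not shown in
# this excerpt): split the decimated data into train and test sets, fit a
# :class:`verde.Spline` on one, and score it on the other. This score is the quantified
# quality measure we would use to tune parameters like ``damping``.

train, test = vd.train_test_split(proj_coords, bathymetry, random_state=0)
spline = vd.Spline()
spline.fit(*train)
print("R² score on the test set:", spline.score(*test))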
From fatiando/verde: tutorials/projections.py (view on GitHub)
# Mask out grid points more than 30 km from the nearest data point ("grid_geo" is the
# grid produced earlier in the original script; this excerpt starts mid-example).
grid_geo = vd.distance_mask(
    (data.longitude, data.latitude), maxdist=30e3, grid=grid_geo, projection=projection
)

########################################################################################
# Now we can use the Cartopy library to plot our geographic grid.

plt.figure(figsize=(7, 6))
ax = plt.axes(projection=ccrs.Mercator())
ax.set_title("Geographic grid of bathymetry")
pc = grid_geo.bathymetry.plot.pcolormesh(
    ax=ax, transform=ccrs.PlateCarree(), vmax=0, zorder=-1, add_colorbar=False
)
plt.colorbar(pc).set_label("meters")
vd.datasets.setup_baja_bathymetry_map(ax, land=None)
plt.tight_layout()
plt.show()
From fatiando/verde: dev/_downloads/2972b1be06ca3a52f0dfaa7e16ba9152/checkerboard.py (view on GitHub)
"""
Checkerboard function
=====================

The :class:`verde.datasets.CheckerBoard` class generates synthetic data in a
checkerboard pattern. It has the same data generation methods that most
gridders have: predict, grid, scatter, and profile.
"""
import matplotlib.pyplot as plt
import verde as vd

# Instantiate the data generator class and fit it to set the data region.
synth = vd.datasets.CheckerBoard()

# Default values for the wavelengths of the function are determined from the region.
print("wavelengths (east, north):", synth.w_east_, synth.w_north_)

# The CheckerBoard class behaves like any gridder class
print(
    "Checkerboard value at (easting=2000, northing=-2500):",
    synth.predict((2000, -2500)),
)

# Generating a grid results in an xarray.Dataset
grid = synth.grid(shape=(150, 100))
print("\nData grid:\n", grid)

# while a random scatter generates a pandas.DataFrame
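# (a minimal sketch completing this truncated excerpt; the arguments are assumptions):
table = synth.scatter(size=300, random_state=0)
print("\nScattered points:\n", table.head())

# The profile method samples the function along a line between two points, also
# returning a pandas.DataFrame.
profile = synth.profile(point1=(0, 0), point2=(2500, -2500), size=100)
print("\nProfile:\n", profile.head())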
From fatiando/verde: examples/chain.py (view on GitHub)
:class:`verde.BlockReduce` and :class:`verde.BlockMean` change the input data values and
the coordinates as well but don't impact the predictions directly.

For example, say we want to decimate some data, calculate a polynomial trend, and fit a
gridder on the residual values (not the trend), but then make a grid of the full data
values (trend included). This is useful because decimation is common and many gridders
can't handle trends in data.
"""
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import pyproj
import verde as vd

# Load the Rio de Janeiro total field magnetic anomaly data
data = vd.datasets.fetch_rio_magnetic()
region = vd.get_region((data.longitude, data.latitude))

# Create a projection for the data using pyproj so that we can use it as input for the
# gridder. We'll set the latitude of true scale to the mean latitude of the data.
projection = pyproj.Proj(proj="merc", lat_ts=data.latitude.mean())

# Create a chain that fits a 2nd degree trend, decimates the residuals using a blocked
# mean to avoid aliasing, and then fits a standard gridder to the residuals. The spacing
# for the blocked mean will be 0.5 arc-minutes (approximately converted to meters).
spacing = 0.5 / 60
chain = vd.Chain(
    [
        ("trend", vd.Trend(degree=2)),
        ("reduce", vd.BlockReduce(np.mean, spacing * 111e3)),
        ("spline", vd.Spline(damping=1e-8, mindist=50e3)),
    ]
)
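
# A minimal sketch of how the chain would be used (the rest of the original example is
# not shown in this excerpt): project the coordinates, fit the chain, and grid the full
# data values (trend included). The grid spacing here is an assumption for illustration.
proj_coords = projection(data.longitude.values, data.latitude.values)
chain.fit(proj_coords, data.total_field_anomaly_nt)
grid = chain.grid(spacing=spacing * 111e3, data_names=["total_field_anomaly"])
print(grid)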
From fatiando/verde: data/examples/rio_magnetic.py (view on GitHub)
# Note: this excerpt starts mid-script; "ax" (a Cartopy map axes) and "crs" (the data's
# geographic coordinate system) are defined earlier in the original example.
# maxabs gives the maximum absolute value of the data, used to center the colormap.
maxabs = vd.maxabs(data.total_field_anomaly_nt)
# Cartopy requires setting the projection of the original data through the transform
# argument. Use PlateCarree for geographic data.
plt.scatter(
    data.longitude,
    data.latitude,
    c=data.total_field_anomaly_nt,
    s=1,
    cmap="seismic",
    vmin=-maxabs,
    vmax=maxabs,
    transform=crs,
)
plt.colorbar(pad=0.01).set_label("nT")
# Set the proper ticks for a Cartopy map
vd.datasets.setup_rio_magnetic_map(ax)
plt.tight_layout()
plt.show()