geoarrow-pandas#

Contains pandas integration for the geoarrow Python bindings. Importing this package will register pyarrow extension types and register the geoarrow accessor on pandas.Series objects.

Examples#

>>> import geoarrow.pandas as _
class GeoArrowAccessor(pandas_obj)#

GeoArrow series accessor

The GeoArrow series accessor adds a convenient way to apply the type introspection and coordinate shuffling capabilities of geoarrow-pyarrow to columns in a pandas data frame. The accessor can be applied to text columns (interpreted as WKT), binary columns (interpreted as WKB), GeoArrowExtensionDtype columns, or a geopandas.GeoSeries.

>>> import geoarrow.pandas as _
>>> import pandas as pd
>>> series = pd.Series(["POINT (0 1)", "POINT (2 3)"])
>>> x, y = series.geoarrow.point_coords()
>>> x
0    0.0
1    2.0
dtype: float64
>>> y
0    1.0
1    3.0
dtype: float64
as_geoarrow(type=None, coord_type=None)#

See geoarrow.pyarrow.as_geoarrow()

as_wkb()#

See geoarrow.pyarrow.as_wkb()

as_wkt()#

See geoarrow.pyarrow.as_wkt()

bounds()#

See geoarrow.pyarrow.box()

format_wkb()#

See geoarrow.pyarrow.as_wkb()

format_wkt(precision=None, max_element_size_bytes=None)#

See geoarrow.pyarrow.format_wkt()

parse_all()#

See geoarrow.pyarrow.parse_all()

point_coords(dimensions=None)#

See geoarrow.pyarrow.point_coords()

to_geopandas()#

See geoarrow.pyarrow.to_geopandas()

total_bounds()#

See geoarrow.pyarrow.box_agg()

with_coord_type(coord_type)#

See geoarrow.pyarrow.with_coord_type()

with_crs(crs)#

See geoarrow.pyarrow.with_crs()

with_dimensions(dimensions)#

See geoarrow.pyarrow.with_dimensions()

with_edge_type(edge_type)#

See geoarrow.pyarrow.with_edge_type()

with_geometry_type(geometry_type)#

See geoarrow.pyarrow.with_geometry_type()

class GeoArrowExtensionDtype(parent)#

ExtensionDtype implementation wrapping a geoarrow type

The dtype object for geoarrow-encoded arrays that are converted to pandas. Use the pyarrow_dtype property to return the underlying geoarrow.pyarrow.GeometryExtensionType (e.g., to query the crs or dimensions).

classmethod construct_array_type()#

Return the array type associated with this dtype.

Returns#

type

classmethod construct_from_string(string)#

Construct this type from a string.

This is useful mainly for data types that accept parameters. For example, a period dtype accepts a frequency parameter that can be set as period[h] (where h means hourly frequency).

By default, in the abstract class, just the name of the type is expected. But subclasses can overwrite this method to accept parameters.

Parameters#

stringstr

The name of the type, for example category.

Returns#

ExtensionDtype

Instance of the dtype.

Raises#

TypeError

If a class cannot be constructed from this ‘string’.

Examples#

For extension dtypes with arguments the following may be an adequate implementation.

>>> import re
>>> @classmethod
... def construct_from_string(cls, string):
...     pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")
...     match = pattern.match(string)
...     if match:
...         return cls(**match.groupdict())
...     else:
...         raise TypeError(
...             f"Cannot construct a '{cls.__name__}' from '{string}'"
...         )
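
The method above is shown outside any class. As a self-contained sketch, it can be placed on a hypothetical MyType class (an illustrative stand-in, not a real pandas dtype) to exercise both the matching and the failing path:

```python
import re


class MyType:
    """Hypothetical parameterized type used only for illustration."""

    _pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")

    def __init__(self, arg_name):
        self.arg_name = arg_name

    @classmethod
    def construct_from_string(cls, string):
        match = cls._pattern.match(string)
        if match:
            # The named group becomes the keyword argument of __init__.
            return cls(**match.groupdict())
        raise TypeError(f"Cannot construct a '{cls.__name__}' from '{string}'")


parsed = MyType.construct_from_string("my_type[h]")

try:
    MyType.construct_from_string("category")
except TypeError as exc:
    error_message = str(exc)
```
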
property na_value#

Default NA value to use for this type.

This is used in e.g. ExtensionArray.take. This should be the user-facing “boxed” version of the NA value, not the physical NA value for storage. e.g. for JSONArray, this is an empty dictionary.

property name#

A string identifying the data type.

Will be used for display in, e.g. Series.dtype

property type#

The scalar type for the array, e.g. int

It’s expected ExtensionArray[item] returns an instance of ExtensionDtype.type for scalar item, assuming that value is valid (not NA). NA values do not need to be instances of type.

class GeoArrowExtensionArray(obj, type=None)#

ExtensionArray implementation wrapping a geoarrow Array

This ExtensionArray implementation currently wraps a pyarrow.Array or pyarrow.ChunkedArray with an extension type. Most users will not instantiate this class directly.

copy()#

Return a copy of the array.

Returns#

ExtensionArray

Examples#

>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.copy()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
property dtype#

An instance of ExtensionDtype.

Examples#

>>> pd.array([1, 2, 3]).dtype
Int64Dtype()
isna()#

A 1-D array indicating if each value is missing.

Returns#

numpy.ndarray or pandas.api.extensions.ExtensionArray

In most cases, this should return a NumPy ndarray. For exceptional cases like SparseArray, where returning an ndarray would be expensive, an ExtensionArray may be returned.

Notes#

If returning an ExtensionArray, then

  • na_values._is_boolean should be True

  • na_values should implement ExtensionArray._reduce()

  • na_values.any and na_values.all should be implemented

Examples#

>>> arr = pd.array([1, 2, np.nan, np.nan])
>>> arr.isna()
array([False, False,  True,  True])
property nbytes#

The number of bytes needed to store this object in memory.

Examples#

>>> pd.array([1, 2, 3]).nbytes
27
take(indices, allow_fill=False, fill_value=None)#

Take elements from an array.

Parameters#

indicessequence of int or one-dimensional np.ndarray of int

Indices to be taken.

allow_fillbool, default False

How to handle negative values in indices.

  • False: negative values in indices indicate positional indices from the right (the default). This is similar to numpy.take().

  • True: negative values in indices indicate missing values. These values are set to fill_value. Any other negative values raise a ValueError.

fill_valueany, optional

Fill value to use for NA-indices when allow_fill is True. This may be None, in which case the default NA value for the type, self.dtype.na_value, is used.

For many ExtensionArrays, there will be two representations of fill_value: a user-facing “boxed” scalar, and a low-level physical NA value. fill_value should be the user-facing version, and the implementation should handle translating that to the physical version for processing the take if necessary.

Returns#

ExtensionArray

Raises#

IndexError

When the indices are out of bounds for the array.

ValueError

When indices contains negative values other than -1 and allow_fill is True.

See Also#

numpy.take : Take elements from an array along an axis.

api.extensions.take : Take elements from an array.

Notes#

ExtensionArray.take is called by Series.__getitem__, .loc, and .iloc when indices is a sequence of values. Additionally, it’s called by Series.reindex(), or any other method that causes realignment, with a fill_value.

Examples#

Here’s an example implementation, which relies on casting the extension array to object dtype. This uses the helper method pandas.api.extensions.take().

def take(self, indices, allow_fill=False, fill_value=None):
    from pandas.api.extensions import take

    # If the ExtensionArray is backed by an ndarray, then
    # just pass that here instead of coercing to object.
    data = self.astype(object)

    if allow_fill and fill_value is None:
        fill_value = self.dtype.na_value

    # fill value should always be translated from the scalar
    # type for the array, to the physical storage type for
    # the data, before passing to take.

    result = take(data, indices, fill_value=fill_value,
                  allow_fill=allow_fill)
    return self._from_sequence(result, dtype=self.dtype)
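
To compare the two allow_fill modes side by side, the following sketch calls the public helper pandas.api.extensions.take() on a plain NumPy array rather than on a GeoArrowExtensionArray. It therefore illustrates only the index semantics described under Parameters, and the sample values are arbitrary.

```python
import numpy as np
from pandas.api.extensions import take

data = np.array([10, 20, 30])

# allow_fill=False: -1 is a position counted from the right.
from_right = take(data, [0, 0, -1])

# allow_fill=True: -1 marks a missing value, filled with NaN by default
# (which upcasts the integer data to float).
filled = take(data, [0, 0, -1], allow_fill=True)

# An explicit fill_value replaces the missing entries instead.
custom = take(data, [0, 0, -1], allow_fill=True, fill_value=-10)

# Negative indices other than -1 are rejected when allow_fill=True.
try:
    take(data, [0, -2], allow_fill=True)
except ValueError:
    rejected = True
else:
    rejected = False
```
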
to_numpy(dtype=None, copy=False, na_value=None)#

Convert to a NumPy ndarray.

This is similar to numpy.asarray(), but may provide additional control over how the conversion is done.

Parameters#

dtypestr or numpy.dtype, optional

The dtype to pass to numpy.asarray().

copybool, default False

Whether to ensure that the returned value is not a view on another array. Note that copy=False does not ensure that to_numpy() is no-copy. Rather, copy=True ensures that a copy is made, even if not strictly necessary.

na_valueAny, optional

The value to use for missing values. The default value depends on dtype and the type of the array.

Returns#

numpy.ndarray
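
Examples#

The effect of dtype and na_value can be sketched with a pandas nullable integer array. This is an assumption-laden stand-in: it shows the general ExtensionArray.to_numpy() contract, not output specific to GeoArrowExtensionArray.

```python
import numpy as np
import pandas as pd

arr = pd.array([1, 2, None], dtype="Int64")

# With no arguments, this array type falls back to an object ndarray
# that keeps pd.NA as the missing-value marker.
as_object = arr.to_numpy()

# Requesting a float dtype together with na_value replaces the missing
# entry with NaN so the result is an ordinary float64 ndarray.
as_float = arr.to_numpy(dtype="float64", na_value=np.nan)
```
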

class GeoArrowExtensionScalar(obj, index=None)#

Scalar type for GeoArrowExtensionArray

This is a generic Scalar implementation for a “Geometry”. It is currently implemented as an immutable subclass of bytes whose value is the well-known binary representation of the geometry.
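
The design choice of storing a geometry as an immutable bytes subclass can be illustrated with the standard library alone. The WkbPointSketch class below is hypothetical and handles only 2D points. It is not the library's implementation, just a sketch of how a bytes subclass can carry well-known binary and derive other representations from it.

```python
import struct


class WkbPointSketch(bytes):
    """Hypothetical bytes subclass whose value is the WKB of a 2D point."""

    def __new__(cls, x, y):
        # Little-endian WKB: byte-order flag 1, geometry type 1 (Point),
        # followed by x and y as 64-bit floats (21 bytes in total).
        return super().__new__(cls, struct.pack("<BIdd", 1, 1, x, y))

    @property
    def wkb(self):
        # The raw bytes are already the well-known binary representation.
        return bytes(self)

    @property
    def wkt(self):
        _, _, x, y = struct.unpack("<BIdd", self)
        return f"POINT ({x:g} {y:g})"


point = WkbPointSketch(0.0, 1.0)

# Like any bytes object, the scalar cannot be modified in place.
try:
    point[0] = 0
except TypeError:
    is_immutable = True
else:
    is_immutable = False
```
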

to_shapely()#

The shapely representation of this feature.

property wkb#

The well-known binary representation of this feature.

property wkt#

The well-known text representation of this feature.