geoarrow-pandas#
Contains pandas integration for the geoarrow Python bindings. Importing this package will register pyarrow extension types and register the geoarrow accessor on pandas.Series objects.
Examples#
>>> import geoarrow.pandas as _
- class GeoArrowAccessor(pandas_obj)#
GeoArrow series accessor
The GeoArrow series accessor adds a convenient way to apply the type introspection and coordinate shuffling capabilities of geoarrow-pyarrow to columns in a pandas data frame. The accessor can be applied to text columns (interpreted as WKT), binary columns (interpreted as WKB), GeoArrowExtensionDtype columns, or a geopandas.GeoSeries.
>>> import geoarrow.pandas as _
>>> import pandas as pd
>>> series = pd.Series(["POINT (0 1)", "POINT (2 3)"])
>>> x, y = series.geoarrow.point_coords()
>>> x
0    0.0
1    2.0
dtype: float64
>>> y
0    1.0
1    3.0
dtype: float64
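The registration mechanism behind a series accessor can be sketched with the public pandas hook `pandas.api.extensions.register_series_accessor`. This is only an illustration of the accessor pattern, not the geoarrow-pandas implementation: the accessor name `wkt_demo`, the class `WktDemoAccessor`, and the regex-based parsing are all hypothetical, and the sketch handles only 2D WKT points.

```python
import re

import pandas as pd


# "wkt_demo" is a made-up accessor name used only for this sketch.
@pd.api.extensions.register_series_accessor("wkt_demo")
class WktDemoAccessor:
    # Matches strings such as "POINT (0 1)"; nothing else is supported here.
    _POINT = re.compile(r"^POINT \((?P<x>[^\s)]+) (?P<y>[^\s)]+)\)$")

    def __init__(self, pandas_obj):
        # pandas passes the Series that the accessor was called on.
        self._obj = pandas_obj

    def point_coords(self):
        xs, ys = [], []
        for value in self._obj:
            match = self._POINT.match(value)
            if match is None:
                raise ValueError(f"Not a 2D WKT point: {value!r}")
            xs.append(float(match["x"]))
            ys.append(float(match["y"]))
        # Keep the original index so the results align with the input.
        index = self._obj.index
        return pd.Series(xs, index=index), pd.Series(ys, index=index)


series = pd.Series(["POINT (0 1)", "POINT (2 3)"])
x, y = series.wkt_demo.point_coords()
```

Once the decorated class has been defined (or, for a package, imported), every Series exposes the accessor under the registered name, which is why a bare import is enough to make it available.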
- as_geoarrow(type=None, coord_type=None)#
- as_wkb()#
- as_wkt()#
- bounds()#
- format_wkb()#
- format_wkt(precision=None, max_element_size_bytes=None)#
- parse_all()#
- point_coords(dimensions=None)#
- to_geopandas()#
- total_bounds()#
- with_coord_type(coord_type)#
- with_crs(crs)#
- with_dimensions(dimensions)#
- with_edge_type(edge_type)#
- with_geometry_type(geometry_type)#
- class GeoArrowExtensionDtype(parent)#
ExtensionDtype implementation wrapping a geoarrow type
The dtype object for geoarrow-encoded arrays that are converted to pandas. Use the pyarrow_dtype property to return the underlying geoarrow.pyarrow.GeometryExtensionType (e.g., to query the crs or dimensions).
- classmethod construct_from_string(string)#
Construct this type from a string.
This is useful mainly for data types that accept parameters. For example, a period dtype accepts a frequency parameter that can be set as period[h] (where h means hourly frequency).
By default, in the abstract class, just the name of the type is expected. Subclasses can override this method to accept parameters.
Parameters#
- string : str
The name of the type, for example category.
Returns#
- ExtensionDtype
Instance of the dtype.
Raises#
- TypeError
If a class cannot be constructed from this ‘string’.
Examples#
For extension dtypes with arguments the following may be an adequate implementation.
>>> import re
>>> @classmethod
... def construct_from_string(cls, string):
...     pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")
...     match = pattern.match(string)
...     if match:
...         return cls(**match.groupdict())
...     else:
...         raise TypeError(
...             f"Cannot construct a '{cls.__name__}' from '{string}'"
...         )
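To see the pattern in context, the following is a minimal sketch of a parameterized dtype built on `pandas.api.extensions.ExtensionDtype`. The class `MyType`, its `arg_name` parameter, and the `my_type[...]` spelling are hypothetical and exist only for illustration; this is not how GeoArrowExtensionDtype parses its strings.

```python
import re

from pandas.api.extensions import ExtensionDtype


class MyType(ExtensionDtype):
    """Hypothetical parameterized dtype spelled ``my_type[<arg_name>]``."""

    type = str  # scalar type of the elements; an arbitrary choice for this sketch
    _metadata = ("arg_name",)  # attributes pandas uses for equality and hashing
    _pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")

    def __init__(self, arg_name):
        self.arg_name = arg_name

    @property
    def name(self):
        # The name mirrors the string accepted by construct_from_string.
        return f"my_type[{self.arg_name}]"

    @classmethod
    def construct_from_string(cls, string):
        if not isinstance(string, str):
            raise TypeError(f"Expected a string, got {string!r}")
        match = cls._pattern.match(string)
        if match:
            return cls(**match.groupdict())
        raise TypeError(f"Cannot construct a '{cls.__name__}' from '{string}'")


dtype = MyType.construct_from_string("my_type[abc]")

# Strings that do not match the pattern are rejected with TypeError.
try:
    MyType.construct_from_string("category")
    rejected = False
except TypeError:
    rejected = True
```

Raising TypeError for unrecognized strings matters because pandas tries registered dtypes in turn when resolving a string and relies on that exception to move on to the next candidate.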
- property na_value#
Default NA value to use for this type.
This is used in, e.g., ExtensionArray.take. This should be the user-facing “boxed” version of the NA value, not the physical NA value for storage. For example, for JSONArray, this is an empty dictionary.
- property name#
A string identifying the data type.
Will be used for display in, e.g., Series.dtype.
- property type#
The scalar type for the array, e.g. int.
It’s expected that ExtensionArray[item] returns an instance of ExtensionDtype.type for scalar item, assuming that value is valid (not NA). NA values do not need to be instances of type.
- class GeoArrowExtensionArray(obj, type=None)#
ExtensionArray implementation wrapping a geoarrow Array
This ExtensionArray implementation currently wraps a pyarrow.Array or pyarrow.ChunkedArray with an extension type. Most users will not instantiate this class directly.
- copy()#
Return a copy of the array.
Returns#
ExtensionArray
Examples#
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.copy()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- isna()#
A 1-D array indicating if each value is missing.
Returns#
- numpy.ndarray or pandas.api.extensions.ExtensionArray
In most cases, this should return a NumPy ndarray. For exceptional cases like SparseArray, where returning an ndarray would be expensive, an ExtensionArray may be returned.
Notes#
If returning an ExtensionArray, then
- na_values._is_boolean should be True
- na_values should implement ExtensionArray._reduce()
- na_values.any and na_values.all should be implemented
Examples#
>>> arr = pd.array([1, 2, np.nan, np.nan])
>>> arr.isna()
array([False, False,  True,  True])
- property nbytes#
The number of bytes needed to store this object in memory.
Examples#
>>> pd.array([1, 2, 3]).nbytes
27
- take(indices, allow_fill=False, fill_value=None)#
Take elements from an array.
Parameters#
- indices : sequence of int or one-dimensional np.ndarray of int
Indices to be taken.
- allow_fill : bool, default False
How to handle negative values in indices.
False: negative values in indices indicate positional indices from the right (the default). This is similar to numpy.take().
True: -1 in indices indicates a missing value, which is set to fill_value. Any other negative value raises a ValueError.
- fill_value : any, optional
Fill value to use for NA-indices when allow_fill is True. This may be None, in which case the default NA value for the type, self.dtype.na_value, is used.
For many ExtensionArrays, there will be two representations of fill_value: a user-facing “boxed” scalar, and a low-level physical NA value. fill_value should be the user-facing version, and the implementation should handle translating that to the physical version for processing the take if necessary.
Returns#
ExtensionArray
Raises#
- IndexError
When the indices are out of bounds for the array.
- ValueError
When indices contains negative values other than -1 and allow_fill is True.
See Also#
numpy.take : Take elements from an array along an axis.
api.extensions.take : Take elements from an array.
Notes#
ExtensionArray.take is called by Series.__getitem__, .loc, and .iloc when indices is a sequence of values. Additionally, it’s called by Series.reindex(), or any other method that causes realignment, with a fill_value.
Examples#
Here’s an example implementation, which relies on casting the extension array to object dtype. This uses the helper method pandas.api.extensions.take().
def take(self, indices, allow_fill=False, fill_value=None):
    from pandas.api.extensions import take

    # If the ExtensionArray is backed by an ndarray, then
    # just pass that here instead of coercing to object.
    data = self.astype(object)

    if allow_fill and fill_value is None:
        fill_value = self.dtype.na_value

    # fill value should always be translated from the scalar
    # type for the array, to the physical storage type for
    # the data, before passing to take.
    result = take(data, indices, fill_value=fill_value, allow_fill=allow_fill)
    return self._from_sequence(result, dtype=self.dtype)
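The two interpretations of negative indices can be seen by calling the `pandas.api.extensions.take` helper directly. The sketch below uses a plain NumPy array rather than a GeoArrowExtensionArray, so it only illustrates the indexing rules described in the parameters, not geoarrow-specific behavior.

```python
import numpy as np
from pandas.api.extensions import take

data = np.array([10, 20, 30])

# allow_fill=False (the default): -1 counts from the right, as in numpy.take().
from_right = take(data, [0, 0, -1])

# allow_fill=True: -1 marks a missing value; with no fill_value given,
# the integer data is upcast so that NaN can be stored.
with_missing = take(data, [0, 0, -1], allow_fill=True)

# allow_fill=True with an explicit fill_value.
with_fill = take(data, [0, 0, -1], allow_fill=True, fill_value=-10)

# allow_fill=True rejects negative indices other than -1.
error = None
try:
    take(data, [0, -2], allow_fill=True)
except ValueError as exc:
    error = exc
```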
- to_numpy(dtype=None, copy=False, na_value=None)#
Convert to a NumPy ndarray.
This is similar to numpy.asarray(), but may provide additional control over how the conversion is done.
Parameters#
- dtype : str or numpy.dtype, optional
The dtype to pass to numpy.asarray().
- copy : bool, default False
Whether to ensure that the returned value is not a view on another array. Note that copy=False does not ensure that to_numpy() is no-copy. Rather, copy=True ensures that a copy is made, even if not strictly necessary.
- na_value : Any, optional
The value to use for missing values. The default value depends on dtype and the type of the array.
Returns#
numpy.ndarray
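As a rough illustration of how dtype, na_value, and copy interact, the sketch below calls to_numpy() on a nullable integer array created with `pd.array`. It assumes nothing about GeoArrowExtensionArray beyond the shared method signature, and the values are arbitrary.

```python
import numpy as np
import pandas as pd

arr = pd.array([1, 2, None])  # nullable Int64 array with one missing value

# Convert to float64 and represent the missing entry as NaN.
as_float = arr.to_numpy(dtype="float64", na_value=np.nan)

# copy=True guarantees an independent buffer, so writing to the result
# cannot affect the original array.
copied = arr.to_numpy(dtype="float64", na_value=np.nan, copy=True)
copied[0] = 99.0
first_original = arr[0]
```

Requesting copy=True is the safe choice whenever the result will be mutated, since copy=False makes no promise either way.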
- class GeoArrowExtensionScalar(obj, index=None)#
Scalar type for GeoArrowExtensionArray
This is a generic Scalar implementation for a “Geometry”. It is currently implemented as an immutable subclass of bytes whose value is the well-known binary representation of the geometry.
- to_shapely()#
The shapely representation of this feature.
- property wkb#
The well-known binary representation of this feature.
- property wkt#
The well-known text representation of this feature.
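The idea of a scalar that is an immutable bytes subclass holding well-known binary can be sketched with the standard library alone. Everything in this block is hypothetical: `WkbPoint`, `from_xy`, and the restriction to little-endian 2D points are assumptions made for illustration and do not reflect how GeoArrowExtensionScalar is implemented.

```python
import struct


class WkbPoint(bytes):
    """Hypothetical scalar: immutable bytes holding a little-endian 2D WKB point."""

    # Byte-order flag (1 byte), geometry type (uint32), then x and y as doubles.
    _LAYOUT = "<BIdd"

    @classmethod
    def from_xy(cls, x, y):
        # 1 = little-endian byte order; 1 = WKB geometry type code for Point.
        return cls(struct.pack(cls._LAYOUT, 1, 1, float(x), float(y)))

    @property
    def wkb(self):
        # The value of the object is already the WKB payload.
        return bytes(self)

    @property
    def wkt(self):
        _, _, x, y = struct.unpack(self._LAYOUT, self)
        return f"POINT ({x:g} {y:g})"


point = WkbPoint.from_xy(0, 1)
```

Because the object is itself a bytes value, it is hashable and can be passed to anything that accepts raw WKB, while the properties offer alternative views of the same data.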