momepy.describe_reached_agg
- momepy.describe_reached_agg(y, graph_index, graph, q=None, statistics=None)
Describe the distribution of values reached on a neighbourhood graph.
Given a neighbourhood graph or a grouping, computes the descriptive statistics of the values reached. Optionally, the values can be limited to a certain percentile range before computing the statistics.
The statistics calculated are count, mean, median, std, min, max, sum, nunique, and mode. The desired statistics to compute can be passed to the statistics parameter.
The neighbourhood is defined in graph. If graph is None, the function will assume topological distance 0 (element itself) and result_index is required in order to arrange the results. If graph is given, the results are arranged according to the spatial weights ordering.
Adapted from [Hermosilla et al., 2012] and [Feliciotti, 2018].
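
The idea of "values reached" can be illustrated with a minimal plain-Python sketch. It is not momepy's implementation: the helper `describe_reached`, the dictionary `street_neighbours` standing in for the graph, and the choice to pool the focal group together with its neighbours are all assumptions made for illustration.

```python
from statistics import mean, median


def describe_reached(values, group_ids, neighbours):
    """Pool the values of each focal group and its neighbours, then summarise.

    Hypothetical helper: `neighbours` maps a group ID to the IDs of adjacent
    groups and stands in for a libpysal Graph.
    """
    # Collect the values that belong to each group (e.g. buildings per street).
    by_group = {}
    for value, gid in zip(values, group_ids):
        by_group.setdefault(gid, []).append(value)

    result = {}
    for focal, adjacent in neighbours.items():
        # Assumption: the focal group itself counts as reached.
        reached = []
        for gid in (focal, *adjacent):
            reached.extend(by_group.get(gid, []))
        if reached:
            result[focal] = {
                "count": len(reached),
                "sum": sum(reached),
                "mean": mean(reached),
                "median": median(reached),
            }
    return result


# Four building areas, each assigned to its nearest street (0, 1 or 2).
areas = [100.0, 200.0, 300.0, 50.0]
street_of_building = [0, 0, 1, 2]
street_neighbours = {0: [1], 1: [0, 2], 2: [1]}

stats = describe_reached(areas, street_of_building, street_neighbours)
```

Passing an empty neighbour list for every group reduces the sketch to a plain per-group summary, which corresponds to the topological distance 0 case described above.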
- Parameters:
- y : Series | numpy.array
A Series or numpy.array containing values to analyse.
- graph_index : Series | numpy.array
The unique ID that specifies the aggregation of y objects to graph groups.
- graph : libpysal.graph.Graph (default None)
A spatial weights matrix of the elements that y is grouped into.
- q : tuple[float, float] | None, optional
Tuple of percentages for the percentiles to compute. Values must be between 0 and 100 inclusive. When set, values below and above the percentiles are discarded before the statistics are computed. The percentiles are computed for each neighbourhood. By default None.
- statistics : list[str]
A list of stats functions to pass to groupby.agg.
- Returns:
- DataFrame
Notes
The numba package is used extensively in this function to accelerate the computation of statistics. Without numba, these computations may become slow on large data.
Examples
>>> import geopandas
>>> import momepy
>>> from libpysal import graph
>>> path = momepy.datasets.get_path("bubenec")
>>> buildings = geopandas.read_file(path, layer="buildings")
>>> streets = geopandas.read_file(path, layer="streets")
>>> buildings["street_index"] = momepy.get_nearest_street(buildings, streets)
>>> buildings.head()
   uID                                           geometry  street_index
0    1  POLYGON ((1603599.221 6464369.816, 1603602.984...           0.0
1    2  POLYGON ((1603042.88 6464261.498, 1603038.961 ...          33.0
2    3  POLYGON ((1603044.65 6464178.035, 1603049.192 ...          10.0
3    4  POLYGON ((1603036.557 6464141.467, 1603036.969...           8.0
4    5  POLYGON ((1603082.387 6464142.022, 1603081.574...           8.0
>>> queen_contig = graph.Graph.build_contiguity(streets, rook=False)
>>> queen_contig
<Graph of 35 nodes and 148 nonzero edges indexed by [0, 1, 2, 3, 4, ...]>
>>> momepy.describe_reached_agg(
...     buildings.area,
...     buildings["street_index"],
...     queen_contig,
... ).head()
   count        mean      median         std         min          max           sum  nunique        mode
0   43.0  643.595418  633.692589  412.563790   53.851509  2127.752228  27674.602973     43.0   53.851509
1   41.0  735.058515  662.921280  381.827737   51.246377  2127.752228  30137.399128     41.0   51.246377
2   50.0  636.304006  625.190488  450.182157   53.851509  2127.752228  31815.200298     50.0   53.851509
3    6.0  405.782514  370.352071  334.848563   57.138700   863.828420   2434.695086      6.0   57.138700
4    1.0  683.514930  683.514930         NaN  683.514930   683.514930    683.514930      1.0  683.514930
The result can be directly assigned as columns of the streets GeoDataFrame.
To eliminate the effect of outliers, you can take into account only values within a specified percentile range (q). At the same time, you can specify only a subset of statistics to compute:
>>> momepy.describe_reached_agg(
...     buildings.area,
...     buildings["street_index"],
...     queen_contig,
...     q=(10, 90),
...     statistics=["mean", "std"],
... ).head()
         mean         std
0  619.104840  250.369496
1  721.441808  216.516469
2  597.379925  297.213321
3  378.431992  274.631290
4  683.514930         NaN
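
To show why limiting values to a percentile range dampens outliers, here is a small standalone sketch of the trimming step applied to a single neighbourhood. The helpers `percentile` and `limit_to_range` are hypothetical, and both the linear interpolation and the inclusive bounds are assumptions; momepy's internal behaviour may differ in these details.

```python
import math
from statistics import mean


def percentile(sorted_values, p):
    """Percentile p (0-100) using linear interpolation between closest ranks."""
    position = (len(sorted_values) - 1) * p / 100
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def limit_to_range(values, q):
    """Keep only values between the q[0]-th and q[1]-th percentiles (inclusive)."""
    ordered = sorted(values)
    low = percentile(ordered, q[0])
    high = percentile(ordered, q[1])
    return [v for v in values if low <= v <= high]


# Values reached in one neighbourhood, with a single extreme outlier.
reached = [10.0, 20.0, 30.0, 40.0, 1000.0]
trimmed = limit_to_range(reached, (10, 90))

full_mean = mean(reached)     # pulled upwards by the outlier
trimmed_mean = mean(trimmed)  # computed on the values kept after trimming
```

In this toy neighbourhood both the smallest value and the outlier fall outside the 10th to 90th percentile range, so only the three middle values contribute to the trimmed mean.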