pymove.models.pattern_mining package

Submodules

pymove.models.pattern_mining.clustering module

Clustering operations.

elbow_method, gap_statistic, dbscan_clustering

pymove.models.pattern_mining.clustering.dbscan_clustering(move_data: DataFrame, cluster_by: str, meters: int = 10, min_sample: float = 840.0, earth_radius: float = 6371, metric: str | Callable = 'euclidean', inplace: bool = False) → DataFrame | None

Performs density-based clustering on the move_data dataframe according to cluster_by.

Parameters:
  • move_data (dataframe) – the input trajectory
  • cluster_by (str) – the column to cluster by
  • meters (int, optional) – distance, in meters, to use in the clustering, by default 10
  • min_sample (float, optional) – the minimum number of samples required to form a cluster, by default 1680/2 (840)
  • earth_radius (float, optional) – radius of the Earth, by default EARTH_RADIUS (6371)
  • metric (str or callable, optional) – the metric to use when calculating the distance between instances in a feature array, by default 'euclidean'
  • inplace (bool, optional) – whether to modify move_data in place; if False, a new DataFrame is returned, by default False
Returns:

The clustered dataframe, or None when the operation is done in place

Return type:

DataFrame or None
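The meters and earth_radius parameters suggest that the clustering distance is converted from meters into an angle before it reaches the clustering algorithm. The sketch below shows one way such a conversion can work, using the arc-length relation (angle = distance / radius). The helper name meters_to_radians is hypothetical, and whether pymove performs exactly this conversion is an assumption, not something this page states.

```python
# Hypothetical helper, not part of the pymove API documented here.
EARTH_RADIUS_KM = 6371  # same value as the earth_radius default in the signature


def meters_to_radians(meters: float, earth_radius_km: float = EARTH_RADIUS_KM) -> float:
    """Convert a ground distance in meters to the angle it subtends, in radians."""
    # Arc length = radius * angle, so angle = arc length / radius.
    # The radius is given in kilometers, hence the factor of 1000.
    return meters / (earth_radius_km * 1000)


# Angular distance corresponding to the default clustering distance of 10 meters.
eps = meters_to_radians(10)
print(eps)
```

An angular distance like this only makes sense with a metric that works on angles; with a different metric or unit, the conversion would need to change accordingly.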

pymove.models.pattern_mining.clustering.elbow_method(move_data: DataFrame, k_initial: int = 1, max_clusters: int = 15, k_iteration: int = 1, random_state: int | None = None) → dict

Determines the optimal number of clusters in the range set by the user, using the elbow method.

Parameters:
  • move_data (dataframe) – The input trajectory data.
  • k_initial (int, optional) – The initial value used in the iteration of the elbow method. Represents the minimum number of clusters to test for, by default 1
  • max_clusters (int, optional) – The maximum value used in the iteration of the elbow method. Maximum number of clusters to test for, by default 15
  • k_iteration (int, optional) – Increment value of the sequence used by the elbow method, by default 1
  • random_state (int or RandomState instance, optional) – Determines random number generation for centroid initialization. Use an int to make the randomness deterministic, by default None
Returns:

The inertia values for the different numbers of clusters

Return type:

dict

Example

>>> clustering.elbow_method(move_data=move_df, k_iteration=3)
{
    1: 55084.15957839036,
    4: 245.68365592382938,
    7: 92.31472644640075,
    10: 62.618599956870355,
    13: 45.59653757292055,
}
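The returned dictionary still has to be read to find the elbow. As a minimal sketch, one common heuristic picks the cluster count where the inertia curve bends most sharply, measured by the discrete second difference. The function name pick_elbow is hypothetical and not part of pymove, and the heuristic assumes evenly spaced cluster counts, which a fixed k_iteration provides.

```python
from __future__ import annotations


def pick_elbow(inertias: dict[int, float]) -> int:
    """Return the cluster count with the largest second difference of inertia.

    Hypothetical helper; assumes the keys are evenly spaced cluster counts.
    """
    ks = sorted(inertias)
    if len(ks) < 3:
        raise ValueError("need at least three cluster counts to locate an elbow")

    best_k, best_curvature = ks[1], float("-inf")
    # Only interior points have both a left and a right neighbour.
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        curvature = inertias[prev_k] - 2 * inertias[k] + inertias[next_k]
        if curvature > best_curvature:
            best_k, best_curvature = k, curvature
    return best_k


# Inertia values copied from the example output above.
inertias = {
    1: 55084.15957839036,
    4: 245.68365592382938,
    7: 92.31472644640075,
    10: 62.618599956870355,
    13: 45.59653757292055,
}
print(pick_elbow(inertias))  # prints 4
```

Because the sweep used a step of 3, the result only says the elbow lies near 4 clusters; a second run with k_iteration=1 over a narrower range would refine it.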

pymove.models.pattern_mining.clustering.gap_statistic(move_data: DataFrame, nrefs: int = 3, k_initial: int = 1, max_clusters: int = 15, k_iteration: int = 1, random_state: int | None = None) → dict

Calculates the optimal number of clusters using the gap statistic.

From Tibshirani, Walther, Hastie.

Parameters:
  • move_data (ndarray of shape (n_samples, n_features)) – The input trajectory data.
  • nrefs (int, optional) – number of sample reference datasets to create, by default 3
  • k_initial (int, optional) – The initial value used in the iteration over cluster counts. Represents the minimum number of clusters to test for, by default 1
  • max_clusters (int, optional) – Maximum number of clusters to test for, by default 15
  • k_iteration (int, optional) – Increment value of the sequence of cluster counts, by default 1
  • random_state (int or RandomState instance, optional) – Determines random number generation for centroid initialization. Use an int to make the randomness deterministic, by default None
Returns:

The error value for each cluster number

Return type:

dict

Notes

https://anaconda.org/milesgranger/gap-statistic/notebook
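For orientation, the gap statistic of Tibshirani, Walther and Hastie compares the log of the within-cluster dispersion on the real data with its average over reference datasets (nrefs of them here). The sketch below computes that difference from already-known dispersions. The names gap_value, observed and references are hypothetical, the numbers are made-up illustrative inputs rather than results, and how pymove computes the dispersions is not shown on this page.

```python
import math


def gap_value(dispersion: float, reference_dispersions: list) -> float:
    """Gap(k) = mean(log(W_kb*)) - log(W_k) for one cluster count k.

    Hypothetical helper: `dispersion` is W_k on the real data and
    `reference_dispersions` holds W_kb* for each reference dataset.
    """
    if not reference_dispersions:
        raise ValueError("at least one reference dispersion is required")
    expected_log = sum(math.log(w) for w in reference_dispersions) / len(reference_dispersions)
    return expected_log - math.log(dispersion)


# Illustrative inputs only (not real results): dispersions for k = 1, 2, 3,
# each with three reference datasets, matching the default nrefs=3.
observed = {1: 100.0, 2: 20.0, 3: 16.0}
references = {
    1: [110.0, 105.0, 108.0],
    2: [80.0, 85.0, 78.0],
    3: [60.0, 62.0, 58.0],
}

gaps = {k: gap_value(observed[k], references[k]) for k in observed}
# Simplification: take the k with the largest gap. The original paper instead
# uses a rule based on the standard error of the reference dispersions.
best_k = max(gaps, key=gaps.get)
print(best_k, gaps)
```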

pymove.models.pattern_mining.freq_seq_patterns module

Not implemented.

pymove.models.pattern_mining.moving_together_patterns module

Not implemented.

pymove.models.pattern_mining.periodic_patterns module

Not implemented.

Module contents

Contains models to detect patterns on trajectories.

clustering, freq_seq_patterns, moving_together_patterns, periodic_patterns