pymove.models.pattern_mining package
Submodules
pymove.models.pattern_mining.clustering module
Clustering operations.
elbow_method, gap_statistic, dbscan_clustering
pymove.models.pattern_mining.clustering.dbscan_clustering(move_data: DataFrame, cluster_by: str, meters: int = 10, min_sample: float = 840.0, earth_radius: float = 6371, metric: str | Callable = 'euclidean', inplace: bool = False) → DataFrame | None
Performs density-based clustering on move_data according to cluster_by.
Parameters: - move_data (dataframe) – The input trajectory data
- cluster_by (str) – The column to cluster by
- meters (int, optional) – Distance, in meters, to use in the clustering, by default 10
- min_sample (float, optional) – The minimum number of samples required to form a cluster, by default 1680/2 (840.0)
- earth_radius (float, optional) – Radius of the Earth, by default EARTH_RADIUS (6371, which corresponds to kilometers)
- metric (string or callable, optional) – The metric to use when calculating the distance between instances in a feature array, by default ‘euclidean’
- inplace (bool, optional) – Whether to modify move_data in place instead of returning a new DataFrame, by default False
Returns: The clustered dataframe, or None if inplace is True
Return type: DataFrame or None
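The relationship between meters and earth_radius can be illustrated with a small sketch. It assumes the common practice of converting a ground distance to an angle in radians when clustering with a haversine metric; whether dbscan_clustering performs exactly this conversion internally is not stated in this reference. The helper names and the two sample points are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371  # same value as the earth_radius default above


def meters_to_radians(meters, earth_radius_km=EARTH_RADIUS_KM):
    """Convert a ground distance in meters to an angle in radians."""
    return (meters / 1000) / earth_radius_km


def haversine_radians(lat1, lon1, lat2, lon2):
    """Great-circle angle in radians between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * math.asin(math.sqrt(min(1.0, a)))


# Neighborhood radius corresponding to the default meters=10 (assumed conversion)
eps = meters_to_radians(10)

# Two hypothetical points that differ by 0.00005 degrees of latitude
p = (-3.74, -38.52)
q = (-3.74005, -38.52)
angle = haversine_radians(*p, *q)

# True when q falls inside the neighborhood of p
within_eps = angle <= eps
```

With the default 'euclidean' metric no such conversion applies, so distances are measured directly in the units of the clustered column.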
pymove.models.pattern_mining.clustering.elbow_method(move_data: DataFrame, k_initial: int = 1, max_clusters: int = 15, k_iteration: int = 1, random_state: int | None = None) → dict
Determines the optimal number of clusters in the range set by the user, using the elbow method.
Parameters: - move_data (dataframe) – The input trajectory data.
- k_initial (int, optional) – The initial value used in the iteration of the elbow method. Represents the minimum number of clusters, by default 1
- max_clusters (int, optional) – The final value used in the iteration of the elbow method. Represents the maximum number of clusters to test for, by default 15
- k_iteration (int, optional) – Increment value of the sequence used by the elbow method, by default 1
- random_state (int, RandomState instance) – Determines random number generation for centroid initialization. Use an int to make the randomness deterministic, by default None
Returns: The inertia values for the different numbers of clusters
Return type: dict
Example
>>> clustering.elbow_method(move_data=move_df, k_iteration=3)
{
    1: 55084.15957839036,
    4: 245.68365592382938,
    7: 92.31472644640075,
    10: 62.618599956870355,
    13: 45.59653757292055,
}
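The returned dictionary still has to be read to find the elbow. A minimal sketch of one way to do that is shown here, using the inertia values from the example: it picks the cluster count where the drop in inertia slows down the most (the largest second difference). The function pick_elbow is hypothetical and not part of pymove, and this heuristic assumes evenly spaced cluster counts.

```python
def pick_elbow(inertia):
    """Return the k with the sharpest bend in the inertia curve.

    The bend at k is the inertia drop leading into k minus the drop
    leading out of k; the first and last k cannot be selected.
    """
    ks = sorted(inertia)
    if len(ks) < 3:
        raise ValueError("need at least three cluster counts")
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        bend = (inertia[prev_k] - inertia[k]) - (inertia[k] - inertia[next_k])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Inertia values copied from the example output
inertia = {
    1: 55084.15957839036,
    4: 245.68365592382938,
    7: 92.31472644640075,
    10: 62.618599956870355,
    13: 45.59653757292055,
}

elbow_k = pick_elbow(inertia)  # 4 for these values
```

Because the example uses k_iteration=3, the elbow is only located to within that step; rerunning with a smaller increment around the selected value gives a finer estimate.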
pymove.models.pattern_mining.clustering.gap_statistic(move_data: DataFrame, nrefs: int = 3, k_initial: int = 1, max_clusters: int = 15, k_iteration: int = 1, random_state: int | None = None) → dict
Calculates the optimal number of clusters using the Gap Statistic, from Tibshirani, Walther and Hastie.
Parameters: - move_data (ndarray of shape (n_samples, n_features)) – The input trajectory data.
- nrefs (int, optional) – number of sample reference datasets to create, by default 3
- k_initial (int, optional) – The initial value used in the iteration over cluster counts. Represents the minimum number of clusters, by default 1
- max_clusters (int, optional) – Maximum number of clusters to test for, by default 15
- k_iteration (int, optional) – Increment value of the sequence of cluster counts, by default 1
- random_state (int, RandomState instance) – Determines random number generation for centroid initialization. Use an int to make the randomness deterministic, by default None
Returns: The error value for each cluster number
Return type: dict
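For orientation, the gap statistic of Tibshirani, Walther and Hastie compares the log of the within-cluster dispersion on the observed data with the average log dispersion on reference datasets drawn without cluster structure. The sketch below shows only that final comparison for a single cluster count; the helper gap_value and the dispersion numbers are hypothetical, and how pymove computes the dispersions is not described in this reference.

```python
import math


def gap_value(observed_dispersion, reference_dispersions):
    """Gap for one cluster count: mean(log W_ref) - log W_observed."""
    ref_log_mean = sum(math.log(w) for w in reference_dispersions) / len(reference_dispersions)
    return ref_log_mean - math.log(observed_dispersion)


# Illustrative numbers only: three reference datasets, as with nrefs=3
observed = 100.0
references = [200.0, 200.0, 200.0]

gap = gap_value(observed, references)
```

A larger gap means the observed data clusters more tightly than structureless reference data would for that cluster count.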
Notes
pymove.models.pattern_mining.freq_seq_patterns module
Not implemented.
pymove.models.pattern_mining.moving_together_patterns module
Not implemented.
pymove.models.pattern_mining.periodic_patterns module
Not implemented.
Module contents
Contains models to detect patterns on trajectories.
clustering, freq_seq_patterns, moving_together_patterns, periodic_patterns