pymove.utils package
Submodules
pymove.utils.constants module
PyMove constants.
pymove.utils.conversions module
Unit conversion operations.
lat_meters, meters_to_eps, list_to_str, list_to_csv_str, list_to_svm_line, lon_to_x_spherical, lat_to_y_spherical, x_to_lon_spherical, y_to_lat_spherical, geometry_points_to_lat_and_lon, lat_and_lon_decimal_degrees_to_decimal, ms_to_kmh, kmh_to_ms, meters_to_kilometers, kilometers_to_meters, seconds_to_minutes, minute_to_seconds, minute_to_hours, hours_to_minute, seconds_to_hours, hours_to_seconds
-
pymove.utils.conversions.
geometry_points_to_lat_and_lon
(move_data: pandas.core.frame.DataFrame, geometry_label: str = 'geometry', drop_geometry: bool = False, inplace: bool = False) → pandas.core.frame.DataFrame
Creates lat and lon columns from the Points in the geometry column.
Removes geometries that are not of the Point type.
Parameters: - move_data (DataFrame) – Input trajectory data.
- geometry_label (str, optional) – Represents column name of the geometry column, by default GEOMETRY
- drop_geometry (bool, optional) – Option to drop the geometry column, by default False
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import geometry_points_to_lat_and_lon
>>> geom_points_df
   id                    geometry
0   1  POINT (116.36184 39.77529)
1   2  POINT (116.36298 39.77564)
2   3  POINT (116.33767 39.83148)
>>> geometry_points_to_lat_and_lon(geom_points_df)
   id                    geometry        lon       lat
0   1  POINT (116.36184 39.77529)  116.36184  39.77529
1   2  POINT (116.36298 39.77564)  116.36298  39.77564
2   3  POINT (116.33767 39.83148)  116.33767  39.83148
-
pymove.utils.conversions.
hours_to_minute
(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None
Convert values, in hours, in the label_time column to minutes.
Parameters: - move_data (DataFrame) – Input trajectory data.
- label_time (str, optional) – Represents column name of time, by default TIME_TO_PREV
- new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import hours_to_minute
>>> geo_life
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153      0.000278      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788      0.001389       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083      0.001389       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671      0.001389       0.577934
>>> hours_to_minute(geo_life, inplace=False)
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153      0.016667      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788      0.083333       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083      0.083333       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671      0.083333       0.577934
-
pymove.utils.conversions.
hours_to_seconds
(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None
Convert values, in hours, in the label_time column to seconds.
Parameters: - move_data (DataFrame) – Input trajectory data.
- label_time (str, optional) – Represents column name of time, by default TIME_TO_PREV
- new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import hours_to_seconds
>>> geo_life
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153      0.000278      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788      0.001389       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083      0.001389       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671      0.001389       0.577934
>>> hours_to_seconds(geo_life, inplace=False)
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153           1.0      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788           5.0       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083           5.0       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671           5.0       0.577934
-
pymove.utils.conversions.
kilometers_to_meters
(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_distance: str = 'dist_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None
Convert values, in kilometers, in the label_distance column to meters.
Parameters: - move_data (DataFrame) – Input trajectory data.
- label_distance (str, optional) – Represents column name of distance, by default DIST_TO_PREV
- new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import kilometers_to_meters
>>> geo_life
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06      0.013690           1.0      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      0.007404           5.0       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      0.001821           5.0       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      0.002890           5.0       0.577934
>>> kilometers_to_meters(geo_life, inplace=False)
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153           1.0      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788           5.0       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083           5.0       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671           5.0       0.577934
-
pymove.utils.conversions.
kmh_to_ms
(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_speed: str = 'speed_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None
Convert values, in km/h, in the label_speed column to m/s.
Parameters: - move_data (DataFrame) – Input trajectory data.
- label_speed (str, optional) – Represents column name of speed, by default SPEED_TO_PREV
- new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import kmh_to_ms
>>> geo_life
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153           1.0      49.284551
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788           5.0       5.330727
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083           5.0       1.311180
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671           5.0       2.080563
>>> kmh_to_ms(geo_life, inplace=False)
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153           1.0      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788           5.0       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083           5.0       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671           5.0       0.577934
-
pymove.utils.conversions.
lat_and_lon_decimal_degrees_to_decimal
(move_data: pandas.core.frame.DataFrame, latitude: str = 'lat', longitude: str = 'lon') → pandas.core.frame.DataFrame
Converts latitude and longitude from decimal degrees with a hemisphere suffix (e.g. 28.0N, 94.8W) to signed decimal values.
Parameters: - move_data (DataFrame) – Input trajectory data.
- latitude (str, optional) – Represents column name of the latitude column, by default LATITUDE
- longitude (str, optional) – Represents column name of the longitude column, by default LONGITUDE
Returns: A new dataframe with the converted feature
Return type: DataFrame
Example
>>> from pymove.utils.conversions import lat_and_lon_decimal_degrees_to_decimal
>>> lat_and_lon_df
   id    lat    lon
0   0  28.0N  94.8W
1   1  41.3N  50.4W
2   1  40.8N  47.5W
>>> lat_and_lon_decimal_degrees_to_decimal(lat_and_lon_df)
   id   lat    lon
0   0  28.0  -94.8
1   1  41.3  -50.4
2   1  40.8  -47.5
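The conversion shown in the example can be sketched for a single value in plain Python. This is only an illustration of the idea, not PyMove's implementation; the helper name `hemisphere_to_signed` is hypothetical, and the sketch assumes each value is a number followed by one of N, S, E or W.

```python
def hemisphere_to_signed(value: str) -> float:
    """Convert a hemisphere-suffixed coordinate such as '94.8W' to a signed float.

    Hypothetical helper for illustration; not part of pymove.
    """
    value = value.strip()
    magnitude, hemisphere = float(value[:-1]), value[-1].upper()
    if hemisphere not in "NSEW":
        raise ValueError(f"unknown hemisphere suffix: {hemisphere!r}")
    # South and West are negative by convention.
    return -magnitude if hemisphere in "SW" else magnitude


rows = [("28.0N", "94.8W"), ("41.3N", "50.4W"), ("40.8N", "47.5W")]
converted = [(hemisphere_to_signed(lat), hemisphere_to_signed(lon)) for lat, lon in rows]
print(converted)  # [(28.0, -94.8), (41.3, -50.4), (40.8, -47.5)]
```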
-
pymove.utils.conversions.
lat_meters
(lat: float) → float
Transform latitude degree to meters.
Parameters: lat (float) – Represents the latitude value.
Returns: Represents the corresponding latitude value in meters.
Return type: float
Examples
Latitude in Fortaleza: -3.71839
>>> from pymove.utils.conversions import lat_meters
>>> lat_meters(-3.71839)
110832.75545918777
-
pymove.utils.conversions.
lat_to_y_spherical
(lat: float | ndarray) → float | ndarray
Convert latitude to Y EPSG:3857 WGS 84/Pseudo-Mercator.
Parameters: lat (float) – Represents latitude.
Returns: Y offset from your original position in meters.
Return type: float
Examples
>>> from pymove.utils.conversions import lat_to_y_spherical
>>> lat_fortaleza = -3.71839
>>> y_for = lat_to_y_spherical(lat_fortaleza)
>>> print(y_for, type(y_for))
-414220.15015607665 <class 'numpy.float64'>
-
pymove.utils.conversions.
list_to_csv_str
(input_list: list) → str
Concatenates the elements of the list, joining them by “,”.
Parameters: input_list (list) – List with elements to be joined.
Returns: Returns a string, resulting from concatenation of list elements, separated by “,”.
Return type: str
Example
>>> from pymove.utils.conversions import list_to_csv_str
>>> values = [1, 2, 3, 4, 5]
>>> print(list_to_csv_str(values), type(list_to_csv_str(values)))
1,2,3,4,5 <class 'str'>
-
pymove.utils.conversions.
list_to_str
(input_list: list, delimiter: str = ', ') → str
Concatenates the elements of a list, joining them by the given delimiter.
Parameters: - input_list (list) – List with elements to be joined.
- delimiter (str, optional) – The separator used between elements, by default ', '.
Returns: Returns a string, resulting from concatenation of list elements, separated by the delimiter.
Return type: str
Example
>>> from pymove.utils.conversions import list_to_str
>>> values = [1, 2, 3, 4, 5]
>>> print(list_to_str(values, 'x'), type(list_to_str(values)))
1x2x3x4x5 <class 'str'>
-
pymove.utils.conversions.
list_to_svm_line
(original_list: list) → str
Concatenates list elements in consecutive element pairs.
Parameters: original_list (list) – The elements to be joined.
Returns: Returns a string, resulting from concatenation of list elements in consecutive element pairs, separated by “ ”.
Return type: str
Example
>>> from pymove.utils.conversions import list_to_svm_line
>>> values = [1, 2, 3, 4, 5]
>>> print(list_to_svm_line(values), type(list_to_svm_line(values)))
1 1:2 2:3 3:4 4:5 <class 'str'>
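The three list helpers above can be approximated in plain Python. The sketch below is not PyMove's code. The names `join_elements` and `to_svm_line` are hypothetical, and the SVM-line reading (first element is the label, the remaining elements become 1-based `index:value` pairs) is an assumption that happens to reproduce the documented output for `[1, 2, 3, 4, 5]`.

```python
def join_elements(items, delimiter=","):
    """Join the string form of each element with the delimiter (hypothetical helper)."""
    return delimiter.join(str(item) for item in items)


def to_svm_line(items):
    """Assumed reading: first element is the label, the rest are index:value pairs."""
    label, features = items[0], items[1:]
    pairs = [f"{index}:{value}" for index, value in enumerate(features, start=1)]
    return " ".join([str(label), *pairs])


print(join_elements([1, 2, 3, 4, 5]))        # 1,2,3,4,5
print(join_elements([1, 2, 3, 4, 5], "x"))   # 1x2x3x4x5
print(to_svm_line([1, 2, 3, 4, 5]))          # 1 1:2 2:3 3:4 4:5
print(to_svm_line([0, 0.5, 0.25]))           # 0 1:0.5 2:0.25
```

The last call uses values that differ from their positions, which makes the index-versus-value distinction visible.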
-
pymove.utils.conversions.
lon_to_x_spherical
(lon: float | ndarray) → float | ndarray
Convert longitude to X EPSG:3857 WGS 84/Pseudo-Mercator.
Parameters: lon (float) – Represents longitude.
Returns: X offset from your original position in meters.
Return type: float
Examples
>>> from pymove.utils.conversions import lon_to_x_spherical
>>> lon_fortaleza = -38.5434
>>> x_for = lon_to_x_spherical(lon_fortaleza)
>>> print(x_for, type(x_for))
-4290631.66144146 <class 'numpy.float64'>
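For reference, the standard spherical (Pseudo-Mercator) forward projection can be written with the `math` module alone. This is a minimal sketch of the textbook EPSG:3857 formulas, not PyMove's source; the function names and the constant `EARTH_RADIUS_M` are chosen here for illustration, and scalar inputs are assumed.

```python
import math

# WGS 84 semi-major axis in meters, the sphere radius used by EPSG:3857.
EARTH_RADIUS_M = 6378137.0


def lon_to_x(lon_deg: float) -> float:
    """Longitude in degrees to X in meters: x = R * lon (in radians)."""
    return EARTH_RADIUS_M * math.radians(lon_deg)


def lat_to_y(lat_deg: float) -> float:
    """Latitude in degrees to Y in meters: y = R * ln(tan(pi/4 + lat/2))."""
    return EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))


# Fortaleza coordinates taken from the examples in this section.
x_for = lon_to_x(-38.5434)
y_for = lat_to_y(-3.71839)
print(x_for, y_for)
```

The formula for Y diverges as the latitude approaches ±90 degrees, so inputs should stay within the usual Web Mercator latitude range.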
-
pymove.utils.conversions.
meters_to_eps
(radius_meters: float, earth_radius: float = 6371) → float
Converts radius in meters to eps.
Parameters: - radius_meters (float) – radius in meters
- earth_radius (float, optional) – radius of the earth in the location, by default EARTH_RADIUS
Returns: radius in eps
Return type: float
Example
>>> from pymove.utils.conversions import meters_to_eps
>>> earth_radius = 6371000
>>> meters_to_eps(earth_radius)
1000.0
-
pymove.utils.conversions.
meters_to_kilometers
(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_distance: str = 'dist_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None
Convert values, in meters, in the label_distance column to kilometers.
Parameters: - move_data (DataFrame) – Input trajectory data.
- label_distance (str, optional) – Represents column name of distance, by default DIST_TO_PREV
- new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import meters_to_kilometers
>>> geo_life
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153           1.0      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788           5.0       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083           5.0       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671           5.0       0.577934
>>> meters_to_kilometers(geo_life, inplace=False)
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06      0.013690           1.0      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      0.007404           5.0       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      0.001821           5.0       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      0.002890           5.0       0.577934
-
pymove.utils.conversions.
minute_to_hours
(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None
Convert values, in minutes, in the label_time column to hours.
Parameters: - move_data (DataFrame) – Input trajectory data.
- label_time (str, optional) – Represents column name of time, by default TIME_TO_PREV
- new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import minute_to_hours, seconds_to_minutes
>>> geo_life
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153           1.0      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788           5.0       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083           5.0       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671           5.0       0.577934
>>> seconds_to_minutes(geo_life, inplace=True)
>>> minute_to_hours(geo_life, inplace=False)
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153      0.000278      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788      0.001389       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083      0.001389       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671      0.001389       0.577934
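The chained conversion in this example (seconds to minutes, then minutes to hours) is plain division by 60 at each step. The sketch below reproduces the arithmetic on the two distinct `time_to_prev` values from the example; the scalar helper names are hypothetical and are not the PyMove functions, which operate on dataframe columns.

```python
SECONDS_PER_MINUTE = 60
MINUTES_PER_HOUR = 60


def seconds_to_minutes_value(seconds: float) -> float:
    """Scalar stand-in for the column conversion: seconds / 60."""
    return seconds / SECONDS_PER_MINUTE


def minutes_to_hours_value(minutes: float) -> float:
    """Scalar stand-in for the column conversion: minutes / 60."""
    return minutes / MINUTES_PER_HOUR


time_to_prev_seconds = [1.0, 5.0]
in_minutes = [seconds_to_minutes_value(v) for v in time_to_prev_seconds]
in_hours = [minutes_to_hours_value(v) for v in in_minutes]

print([round(v, 6) for v in in_minutes])  # [0.016667, 0.083333]
print([round(v, 6) for v in in_hours])    # [0.000278, 0.001389]
```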
-
pymove.utils.conversions.
minute_to_seconds
(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None
Convert values, in minutes, in the label_time column to seconds.
Parameters: - move_data (DataFrame) – Input trajectory data.
- label_time (str, optional) – Represents column name of time, by default TIME_TO_PREV
- new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import minute_to_seconds
>>> geo_life
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153      0.016667      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788      0.083333       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083      0.083333       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671      0.083333       0.577934
>>> minute_to_seconds(geo_life, inplace=False)
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153           1.0      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788           5.0       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083           5.0       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671           5.0       0.577934
-
pymove.utils.conversions.
ms_to_kmh
(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_speed: str = 'speed_to_prev', new_label: str = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None
Convert values, in m/s, in the label_speed column to km/h.
Parameters: - move_data (DataFrame) – Input trajectory data.
- label_speed (str, optional) – Represents column name of speed, by default SPEED_TO_PREV
- new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import ms_to_kmh
>>> geo_life
         lat         lon            datetime  id
0  39.984094  116.319236 2008-10-23 05:53:05   1
1  39.984198  116.319322 2008-10-23 05:53:06   1
2  39.984224  116.319402 2008-10-23 05:53:11   1
3  39.984211  116.319389 2008-10-23 05:53:16   1
4  39.984217  116.319422 2008-10-23 05:53:21   1
>>> geo_life.generate_dist_time_speed_features(inplace=True)
>>> geo_life
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153           1.0      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788           5.0       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083           5.0       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671           5.0       0.577934
>>> ms_to_kmh(geo_life, inplace=False)
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153           1.0      49.284551
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788           5.0       5.330727
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083           5.0       1.311180
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671           5.0       2.080563
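The column conversions in this module share the same `label`, `new_label` and `inplace` parameters. The sketch below mimics that interface on a plain dict of lists instead of a MoveDataFrame. It is an illustration under assumptions, not PyMove's implementation: `convert_column` is a hypothetical helper, and it assumes that `new_label=None` overwrites the source column and that `inplace=True` returns None, as the examples and return descriptions suggest.

```python
import math

MS_TO_KMH = 3.6  # 1 m/s = 3600 m per hour = 3.6 km/h


def convert_column(table, label, factor, new_label=None, inplace=False):
    """Scale one column of a dict-of-lists table (hypothetical helper)."""
    # Work on the caller's table only when inplace=True; otherwise copy it.
    target = table if inplace else {name: list(col) for name, col in table.items()}
    # Overwrite the source column unless a new label is requested.
    target[new_label or label] = [value * factor for value in table[label]]
    return None if inplace else target


geo_life = {"speed_to_prev": [math.nan, 13.690153, 1.480758]}
converted = convert_column(geo_life, "speed_to_prev", MS_TO_KMH)
print(converted["speed_to_prev"])
```

NaN entries stay NaN because any arithmetic with NaN yields NaN. Results computed from the rounded values printed in the tables can differ from the documented output in the last digits, since the library works on unrounded values.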
-
pymove.utils.conversions.
seconds_to_hours
(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None
Convert values, in seconds, in the label_time column to hours.
Parameters: - move_data (DataFrame) – Input trajectory data.
- label_time (str, optional) – Represents column name of time, by default TIME_TO_PREV
- new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import minute_to_seconds, seconds_to_hours
>>> geo_life
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153      0.016667      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788      0.083333       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083      0.083333       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671      0.083333       0.577934
>>> minute_to_seconds(geo_life, inplace=True)
>>> seconds_to_hours(geo_life, inplace=False)
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153      0.000278      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788      0.001389       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083      0.001389       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671      0.001389       0.577934
-
pymove.utils.conversions.
seconds_to_minutes
(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None
Convert values, in seconds, in the label_time column to minutes.
Parameters: - move_data (DataFrame) – Input trajectory data.
- label_time (str, optional) – Represents column name of time, by default TIME_TO_PREV
- new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
- inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: A new dataframe with the converted feature or None
Return type: DataFrame
Example
>>> from pymove.utils.conversions import seconds_to_minutes
>>> geo_life
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153           1.0      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788           5.0       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083           5.0       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671           5.0       0.577934
>>> seconds_to_minutes(geo_life, inplace=False)
   id        lat         lon            datetime  dist_to_prev  time_to_prev  speed_to_prev
0   1  39.984094  116.319236 2008-10-23 05:53:05           NaN           NaN            NaN
1   1  39.984198  116.319322 2008-10-23 05:53:06     13.690153      0.016667      13.690153
2   1  39.984224  116.319402 2008-10-23 05:53:11      7.403788      0.083333       1.480758
3   1  39.984211  116.319389 2008-10-23 05:53:16      1.821083      0.083333       0.364217
4   1  39.984217  116.319422 2008-10-23 05:53:21      2.889671      0.083333       0.577934
-
pymove.utils.conversions.
x_to_lon_spherical
(x: float | ndarray) → float | ndarray
Convert X EPSG:3857 WGS 84 / Pseudo-Mercator to longitude.
Parameters: x (float) – X offset from your original position in meters.
Returns: Represents longitude.
Return type: float
Examples
>>> from pymove.utils.conversions import x_to_lon_spherical
>>> for_x = -4290631.66144146
>>> print(x_to_lon_spherical(for_x), type(x_to_lon_spherical(for_x)))
-38.5434 <class 'numpy.float64'>
-
pymove.utils.conversions.
y_to_lat_spherical
(y: float | ndarray) → float | ndarray
Convert Y EPSG:3857 WGS 84 / Pseudo-Mercator to latitude.
Parameters: y (float) – Y offset from your original position in meters.
Returns: Represents latitude.
Return type: float
Examples
>>> from pymove.utils.conversions import y_to_lat_spherical
>>> y_for = -414220.15015607665
>>> print(y_to_lat_spherical(y_for), type(y_to_lat_spherical(y_for)))
-3.7183900000000096 <class 'numpy.float64'>
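The inverse Pseudo-Mercator mapping can likewise be sketched with the `math` module. These are the standard inverse formulas for a sphere of radius 6378137 m, written here for scalar inputs; the function names and the constant are illustrative and are not taken from PyMove.

```python
import math

# WGS 84 semi-major axis in meters, the sphere radius used by EPSG:3857.
EARTH_RADIUS_M = 6378137.0


def x_to_lon(x_m: float) -> float:
    """X in meters to longitude in degrees: lon = x / R (converted to degrees)."""
    return math.degrees(x_m / EARTH_RADIUS_M)


def y_to_lat(y_m: float) -> float:
    """Y in meters to latitude in degrees: lat = 2 * atan(exp(y / R)) - pi/2."""
    return math.degrees(2 * math.atan(math.exp(y_m / EARTH_RADIUS_M)) - math.pi / 2)


# Projected Fortaleza coordinates taken from the examples in this section.
print(x_to_lon(-4290631.66144146))
print(y_to_lat(-414220.15015607665))
```

Applied to the projected values from the forward examples, the two functions should recover the original longitude and latitude up to floating-point error.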
pymove.utils.data_augmentation module
Data augmentation operations.
append_row, generate_trajectories_df, generate_start_feature, generate_destiny_feature, split_crossover, augmentation_trajectories_df, insert_points_in_df, instance_crossover_augmentation
-
pymove.utils.data_augmentation.
append_row
(data: DataFrame, row: Series | None = None, columns: dict | None = None)
Inserts a new row in the dataframe with the information passed as parameters.
Parameters: - data (DataFrame) – The input trajectories data.
- row (Series, optional) – The row of a dataframe, by default None
- columns (dict, optional) – Dictionary containing the values to be added, by default None
-
pymove.utils.data_augmentation.
augmentation_trajectories_df
(data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', restriction: str = 'destination only', label_trajectory: str = 'trajectory', insert_at_df: bool = False, frac: float = 0.5) → DataFrame
Generates new data from unobserved trajectories, given a specific restriction.
By default, the algorithm uses the same route destination constraint.
Parameters: - data (DataFrame) – The input trajectories data.
- restriction (str, optional) – Constraint used to generate new data, by default ‘destination only’
- label_trajectory (str, optional) – Label of the points sequences, by default TRAJECTORY
- insert_at_df (boolean, optional) – Whether to return a new DataFrame, by default False. If True, the value of copy is ignored.
- frac (number, optional) – Represents the percentage to be exchanged, by default 0.5
Returns: Dataframe with the new data generated
Return type: DataFrame
-
pymove.utils.data_augmentation.
generate_destiny_feature
(data: pandas.core.frame.DataFrame, label_trajectory: str = 'trajectory')
Removes the last point from the trajectory and adds it in a new column ‘destiny’.
Parameters: - data (DataFrame) – The input trajectory data.
- label_trajectory (str, optional) – Label of the points sequences, by default ‘trajectory’
-
pymove.utils.data_augmentation.
generate_start_feature
(data: pandas.core.frame.DataFrame, label_trajectory: str = 'trajectory')
Removes the first point from the trajectory and adds it in a new column ‘start’.
Parameters: - data (DataFrame) – The input trajectory data.
- label_trajectory (str, optional) – Label of the points sequences, by default TRAJECTORY
-
pymove.utils.data_augmentation.
generate_trajectories_df
(data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame') → DataFrame
Generates a dataframe with the sequence of location points of a trajectory.
Parameters: data (DataFrame) – The input trajectory data.
Returns: DataFrame of the trajectories
Return type: DataFrame
-
pymove.utils.data_augmentation.
insert_points_in_df
(data: pandas.core.frame.DataFrame, aug_df: pandas.core.frame.DataFrame)
Inserts the points of the generated trajectories into the original data set.
Parameters: - data (DataFrame) – The input trajectories data
- aug_df (DataFrame) – The data of unobserved trajectories
-
pymove.utils.data_augmentation.
instance_crossover_augmentation
(data: pandas.core.frame.DataFrame, restriction: str = 'destination only', label_trajectory: str = 'trajectory', frac: float = 0.5)
Generates new data from unobserved trajectories, with a specific restriction.
By default, the algorithm uses the same destination constraint as the route and inserts the points on the original dataframe.
Parameters: - data (DataFrame) – The input trajectories data
- restriction (str, optional) – Constraint used to generate new data, by default ‘destination only’
- label_trajectory (str, optional) – Label of the points sequences, by default ‘trajectory’
- frac (number, optional) – Represents the percentage to be exchanged, by default 0.5
-
pymove.utils.data_augmentation.
split_crossover
(sequence_a: list, sequence_b: list, frac: float = 0.5) → tuple[list, list]
Divides two arrays at the indicated ratio and exchanges their halves.
Parameters: - sequence_a (list or ndarray) – Any array
- sequence_b (list or ndarray) – Any array
- frac (float, optional) – Represents the percentage to be exchanged, by default 0.5
Returns: Arrays with the halves exchanged.
Return type: Tuple[List, List]
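One way to read "divide at the indicated ratio and exchange the halves" is sketched below. This is an assumption about the behaviour, not PyMove's code: each sequence is cut at `int(len * frac)`, and the tails are swapped. The name `split_crossover_sketch` is hypothetical.

```python
def split_crossover_sketch(sequence_a, sequence_b, frac=0.5):
    """Cut both sequences at the given fraction and swap their tails."""
    cut_a = int(len(sequence_a) * frac)
    cut_b = int(len(sequence_b) * frac)
    # Head of one sequence is joined with the tail of the other.
    new_a = list(sequence_a[:cut_a]) + list(sequence_b[cut_b:])
    new_b = list(sequence_b[:cut_b]) + list(sequence_a[cut_a:])
    return new_a, new_b


a, b = split_crossover_sketch([1, 2, 3, 4], [5, 6, 7, 8])
print(a, b)  # [1, 2, 7, 8] [5, 6, 3, 4]

c, d = split_crossover_sketch([1, 2, 3, 4], [5, 6, 7, 8], frac=0.25)
print(c, d)  # [1, 6, 7, 8] [5, 2, 3, 4]
```

With sequences of different lengths the outputs also differ in length, because each cut point is computed from its own sequence.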
pymove.utils.datetime module
Datetime operations.
date_to_str, str_to_datetime, datetime_to_str, datetime_to_min, min_to_datetime, to_day_of_week_int, working_day, now_str, deltatime_str, timestamp_to_millis, millis_to_timestamp, time_to_str, str_to_time, elapsed_time_dt, diff_time, create_time_slot_in_minute, generate_time_statistics, threshold_time_statistics
-
pymove.utils.datetime.
create_time_slot_in_minute
(data: DataFrame, slot_interval: int = 15, initial_slot: int = 0, label_datetime: str = 'datetime', label_time_slot: str = 'time_slot', inplace: bool = False) → DataFrame | None
Partitions the time in slot windows.
Parameters: - data (DataFrame) – dataframe with datetime column
- slot_interval (int, optional) – size of the slot window in minutes, by default 15
- initial_slot (int, optional) – initial window time, by default 0
- label_datetime (str, optional) – name of the datetime column, by default DATETIME
- label_time_slot (str, optional) – name of the time slot column, by default TIME_SLOT
- inplace (boolean, optional) – whether the operation will be done in the original dataframe, by default False
Returns: data with converted time slots or None
Return type: DataFrame
Examples
>>> from pymove.utils.datetime import create_time_slot_in_minute
>>> data
         lat         lon            datetime  id
0  39.984094  116.319236 2008-10-23 05:44:05   1
1  39.984198  116.319322 2008-10-23 05:56:06   1
2  39.984224  116.319402 2008-10-23 05:56:11   1
3  39.984224  116.319402 2008-10-23 06:10:15   1
>>> create_time_slot_in_minute(data, inplace=False)
         lat         lon            datetime  id  time_slot
0  39.984094  116.319236 2008-10-23 05:44:05   1         22
1  39.984198  116.319322 2008-10-23 05:56:06   1         23
2  39.984224  116.319402 2008-10-23 05:56:11   1         23
3  39.984224  116.319402 2008-10-23 06:10:15   1         24
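The slot numbers in the example are consistent with integer division of the minute of the day by the slot size. The sketch below shows that arithmetic for single timestamps. It is not PyMove's implementation; `time_slot` is a hypothetical helper, and treating `initial_slot` as an offset added to the slot index is an assumption.

```python
from datetime import datetime


def time_slot(dt: datetime, slot_interval: int = 15, initial_slot: int = 0) -> int:
    """Index of the slot_interval-minute window that contains dt."""
    minutes_of_day = dt.hour * 60 + dt.minute
    # initial_slot is assumed to shift the numbering of the first window.
    return minutes_of_day // slot_interval + initial_slot


stamps = [
    datetime(2008, 10, 23, 5, 44, 5),
    datetime(2008, 10, 23, 5, 56, 6),
    datetime(2008, 10, 23, 5, 56, 11),
    datetime(2008, 10, 23, 6, 10, 15),
]
print([time_slot(stamp) for stamp in stamps])  # [22, 23, 23, 24]
```

For instance, 05:44 is minute 344 of the day, and 344 // 15 gives slot 22.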
-
pymove.utils.datetime.
date_to_str
(dt: datetime.datetime) → str
Get date, in string format, from timestamp.
Parameters: dt (datetime) – Represents a date
Returns: Represents the date in string format
Return type: str
Example
>>> from datetime import datetime
>>> from pymove.utils.datetime import date_to_str
>>> time_now = datetime.now()
>>> print(time_now)
2021-04-29 11:01:29.909340
>>> print(type(time_now))
<class 'datetime.datetime'>
>>> print(date_to_str(time_now), type(date_to_str(time_now)))
2021-04-29 <class 'str'>
-
pymove.utils.datetime.
datetime_to_min
(dt: datetime.datetime) → int
Converts a datetime to an int representation in minutes.
To do the reverse use: min_to_datetime.
Parameters: dt (datetime) – Represents a datetime in datetime format
Returns: Represents minutes from
Return type: int
Example
>>> from pymove.utils.datetime import datetime_to_min
>>> from datetime import datetime
>>> time_now = datetime.now()
>>> print(type(datetime_to_min(time_now)))
<class 'int'>
>>> datetime_to_min(time_now)
26996497
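The magnitude of the value in the example (about 27 million) is what one gets when counting whole minutes since 1970-01-01. Under that assumption, which the text above does not state explicitly, the conversion and its inverse can be sketched as follows. The names `to_minutes` and `from_minutes` are hypothetical and operate on naive datetimes.

```python
from datetime import datetime, timedelta

# Assumed reference point: the Unix epoch as a naive datetime.
EPOCH = datetime(1970, 1, 1)


def to_minutes(dt: datetime) -> int:
    """Whole minutes elapsed between the epoch and dt."""
    return int((dt - EPOCH).total_seconds() // 60)


def from_minutes(minutes: int) -> datetime:
    """Inverse of to_minutes; seconds are lost in the round trip."""
    return EPOCH + timedelta(minutes=minutes)


moment = datetime(2021, 4, 29, 11, 1, 29)
minutes = to_minutes(moment)
print(minutes, from_minutes(minutes))
```

The round trip truncates to the start of the minute, so `from_minutes(to_minutes(dt))` equals `dt` only when `dt` has zero seconds and microseconds.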
-
pymove.utils.datetime.
datetime_to_str
(dt: datetime.datetime) → str
Converts a date in datetime format to string format.
Parameters: dt (datetime) – Represents a datetime in datetime format.
Returns: Represents a datetime in string format “%Y-%m-%d %H:%M:%S”.
Return type: str
Example
>>> from pymove.utils.datetime import datetime_to_str
>>> from datetime import datetime
>>> time_now = datetime.now()
>>> print(time_now)
2021-04-29 14:15:29.708113
>>> print(type(time_now))
<class 'datetime.datetime'>
>>> print(datetime_to_str(time_now), type(datetime_to_str(time_now)))
2021-04-29 14:15:29 <class 'str'>
-
pymove.utils.datetime.
deltatime_str
(deltatime_seconds: float) → str[source]¶ Converts an elapsed time, in seconds, to a formatted time string.
Parameters: deltatime_seconds (float) – Represents the elapsed time in seconds
Returns: time_str – Represents time in a format hh:mm:ss
Return type: str
Examples
>>> from pymove.utils.datetime import deltatime_str
>>> deltatime_str(1082.7180936336517)
'18m:02.718s'
Notes
Output example if more than 24 hours: 25:33:57
https://stackoverflow.com/questions/3620943/measuring-elapsed-time-with-the-time-module
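As an illustration of this kind of formatting, the sketch below splits a number of seconds into hours, minutes and seconds with `divmod`. The function name `format_elapsed` and the exact output layout are assumptions modelled on the `'18m:02.718s'` example; PyMove's own formatting for durations with an hour component may differ.

```python
def format_elapsed(seconds: float) -> str:
    """Format elapsed seconds, dropping leading units that are zero."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{int(hours):02d}h:{int(minutes):02d}m:{secs:06.3f}s"
    if minutes:
        return f"{int(minutes):02d}m:{secs:06.3f}s"
    return f"{secs:06.3f}s"


print(format_elapsed(1082.7180936336517))  # 18m:02.718s
print(format_elapsed(3723.5))              # 01h:02m:03.500s
```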
-
pymove.utils.datetime.
diff_time
(start_time: datetime.datetime, end_time: datetime.datetime) → int[source]¶ Computes the elapsed time from the start time to the end time specified by the user.
Parameters: - start_time (datetime) – Specifies the start time of the time range to be computed
- end_time (datetime) – Specifies the end time of the time range to be computed
Returns: Represents the time elapsed from the start time to the end time.
Return type: int
Examples
>>> from datetime import datetime
>>> from pymove.utils.datetime import diff_time, str_to_datetime
>>> time_now = datetime.now()
>>> start_time_1 = datetime(2020, 6, 29, 0, 0)
>>> start_time_2 = str_to_datetime('2020-06-29 12:45:59')
>>> print(diff_time(start_time_1, time_now))
26411808665
>>> print(diff_time(start_time_2, time_now))
26365849665
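The two outputs in the example differ by 26411808665 − 26365849665 = 45959000, which equals the number of milliseconds between the two start times (12 h 45 min 59 s). This suggests the result is expressed in milliseconds. A minimal sketch under that assumption, using only the standard library (the name `elapsed_millis` is hypothetical):

```python
from datetime import datetime


def elapsed_millis(start: datetime, end: datetime) -> int:
    """Elapsed time from start to end, in whole milliseconds (assumed unit)."""
    return int((end - start).total_seconds() * 1000)


start_1 = datetime(2020, 6, 29, 0, 0)
start_2 = datetime(2020, 6, 29, 12, 45, 59)
gap = elapsed_millis(start_1, start_2)
print(gap)  # 45959000
```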
-
pymove.utils.datetime.
elapsed_time_dt
(start_time: datetime.datetime) → int[source]¶ Computes the elapsed time from a specific start time.
Parameters: start_time (datetime) – Specifies the start time of the time range to be computed
Returns: Represents the time elapsed from the start time to the current time (when the function was called).
Return type: int
Examples
>>> from datetime import datetime
>>> from pymove.utils.datetime import elapsed_time_dt, str_to_datetime
>>> start_time_1 = datetime(2020, 6, 29, 0, 0)
>>> start_time_2 = str_to_datetime('2020-06-29 12:45:59')
>>> print(elapsed_time_dt(start_time_1))
26411808666
>>> print(elapsed_time_dt(start_time_2))
26365849667
-
pymove.utils.datetime.
generate_time_statistics
(data: pandas.core.frame.DataFrame, local_label: str = 'local_label')[source]¶ Calculates time statistics of the pairwise local labels.
Computes the average, standard deviation, minimum, maximum, sum and count of the pairwise local labels of a symbolic trajectory.
Parameters: - data (DataFrame) – The input trajectories data.
- local_label (str, optional) – The name of the feature with local id, by default LOCAL_LABEL
Returns: Statistics information of the pairwise local labels
Return type: DataFrame
Example
>>> from pymove.utils.datetime import generate_time_statistics
>>> df
  local_label prev_local  time_to_prev  id
0       house        NaN           NaN   1
1      market      house         720.0   1
2      market     market           5.0   1
3      market     market           1.0   1
4      school     market         844.0   1
>>> generate_time_statistics(df)
  local_label prev_local   mean       std    min    max    sum  count
0       house     market  844.0  0.000000  844.0  844.0  844.0      1
1      market      house  720.0  0.000000  720.0  720.0  720.0      1
2      market     market    3.0  2.828427    1.0    5.0    6.0      2
-
pymove.utils.datetime.
millis_to_timestamp
(milliseconds: float) → pandas._libs.tslibs.timestamps.Timestamp[source]¶ Converts milliseconds to timestamp.
Parameters: milliseconds (int) – Represents milliseconds.
Returns: Represents the corresponding date.
Return type: Timestamp
Examples
>>> from pymove.utils.datetime import millis_to_timestamp
>>> millis_to_timestamp(1449907200123)
'2015-12-12 08:00:00.123000'
-
pymove.utils.datetime.
min_to_datetime
(minutes: int) → datetime.datetime[source]¶ Converts an int representation in minutes to a datetime.
To do the reverse use: datetime_to_min.
Parameters: minutes (int) – Represents minutes
Returns: Represents minutes in datetime format
Return type: datetime
Example
>>> from pymove.utils.datetime import min_to_datetime
>>> print(min_to_datetime(26996497), type(min_to_datetime(26996497)))
2021-04-30 13:37:00 <class 'datetime.datetime'>
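The value in the example can be reproduced by treating the integer as whole minutes since the Unix epoch: 26996497 × 60 seconds corresponds to 2021-04-30 13:37:00 UTC. The round trip below is a standard-library sketch under that assumption; `to_minutes` and `from_minutes` are hypothetical names, and the use of UTC is an assumption (PyMove may interpret naive datetimes in local time).

```python
from datetime import datetime, timezone


def to_minutes(dt: datetime) -> int:
    """Whole minutes since the Unix epoch for a timezone-aware datetime."""
    return int(dt.timestamp() // 60)


def from_minutes(minutes: int) -> datetime:
    """Inverse of to_minutes, returning a UTC datetime."""
    return datetime.fromtimestamp(minutes * 60, tz=timezone.utc)


restored = from_minutes(26996497)
print(restored.strftime("%Y-%m-%d %H:%M:%S"))  # 2021-04-30 13:37:00
print(to_minutes(restored))                    # 26996497
```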
-
pymove.utils.datetime.
now_str
() → str[source]¶ Get datetime of now.
Returns: Represents a date
Return type: str
Examples
>>> from pymove.utils.datetime import now_str
>>> now_str()
'2019-09-02 13:54:16'
-
pymove.utils.datetime.
str_to_datetime
(dt_str: str) → datetime.datetime[source]¶ Converts a datetime in string format to datetime format.
Parameters: dt_str (str) – Represents a datetime in string format, “%Y-%m-%d” or “%Y-%m-%d %H:%M:%S”
Returns: Represents a datetime in datetime format
Return type: datetime
Example
>>> from pymove.utils.datetime import str_to_datetime
>>> time_1 = '2020-06-29'
>>> time_2 = '2020-06-29 12:45:59'
>>> print(type(time_1), type(time_2))
<class 'str'> <class 'str'>
>>> print(str_to_datetime(time_1), type(str_to_datetime(time_1)))
2020-06-29 00:00:00 <class 'datetime.datetime'>
>>> print(str_to_datetime(time_2), type(str_to_datetime(time_2)))
2020-06-29 12:45:59 <class 'datetime.datetime'>
-
pymove.utils.datetime.
str_to_time
(dt_str: str) → datetime.datetime[source]¶ Converts a time in string format “%H:%M:%S” to datetime format.
Parameters: dt_str (str) – Represents a time in string format
Returns: Represents a time in datetime format
Return type: datetime
Examples
>>> from pymove.utils.datetime import str_to_time
>>> str_to_time("08:00:00")
datetime(1900, 1, 1, 8, 0)
-
pymove.utils.datetime.
threshold_time_statistics
(df_statistics: DataFrame, mean_coef: float = 1.0, std_coef: float = 1.0, inplace: bool = False) → DataFrame | None[source]¶ Calculates and creates the threshold column.
The values are based on the time statistics dataframe for each segment.
Parameters: - df_statistics (DataFrame) – Time Statistics of the pairwise local labels.
- mean_coef (float) – Multiplication coefficient of the mean time for the segment, by default 1.0
- std_coef (float) – Multiplication coefficient of the std time for the segment, by default 1.0
- inplace (boolean, optional) – Whether the operation will be done in the original dataframe, by default False
Returns: DataFrame of time statistics with the additional feature: threshold, which indicates the time limit of the trajectory segment, or None
Return type: DataFrame
Example
>>> from pymove.utils.datetime import generate_time_statistics, threshold_time_statistics
>>> df
  local_label prev_local  time_to_prev  id
0       house        NaN           NaN   1
1      market      house         720.0   1
2      market     market           5.0   1
3      market     market           1.0   1
4      school     market         844.0   1
>>> statistics = generate_time_statistics(df)
>>> statistics
  local_label prev_local   mean       std    min    max    sum  count
0       house     market  844.0  0.000000  844.0  844.0  844.0      1
1      market      house  720.0  0.000000  720.0  720.0  720.0      1
2      market     market    3.0  2.828427    1.0    5.0    6.0      2
>>> threshold_time_statistics(statistics)
  local_label prev_local   mean       std    min    max    sum  count  threshold
0       house     market  844.0  0.000000  844.0  844.0  844.0      1      844.0
1      market      house  720.0  0.000000  720.0  720.0  720.0      1      720.0
2      market     market    3.0  2.828427    1.0    5.0    6.0      2        5.8
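The threshold values in the example are consistent with `mean_coef * mean + std_coef * std` (for instance 3.0 + 2.828427 ≈ 5.8). The sketch below reproduces that arithmetic with the standard library; the formula is inferred from the example output rather than confirmed against the PyMove source, and `time_thresholds` is a hypothetical helper that groups the input rows by (local_label, prev_local).

```python
from collections import defaultdict
from statistics import mean, stdev

# (local_label, prev_local, time_to_prev) rows from the input dataframe,
# leaving out the first row, which has no previous label.
rows = [
    ("market", "house", 720.0),
    ("market", "market", 5.0),
    ("market", "market", 1.0),
    ("school", "market", 844.0),
]


def time_thresholds(rows, mean_coef=1.0, std_coef=1.0):
    """Map each (local, prev) pair to mean_coef * mean + std_coef * std."""
    groups = defaultdict(list)
    for local, prev, elapsed in rows:
        groups[(local, prev)].append(elapsed)
    thresholds = {}
    for key, times in groups.items():
        # A single observation has no sample deviation; treat it as 0.
        std = stdev(times) if len(times) > 1 else 0.0
        thresholds[key] = mean_coef * mean(times) + std_coef * std
    return thresholds


thresholds = time_thresholds(rows)
print(round(thresholds[("market", "market")], 1))  # 5.8
```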
-
pymove.utils.datetime.
time_to_str
(time: pandas._libs.tslibs.timestamps.Timestamp) → str[source]¶ Get time, in string format, from timestamp.
Parameters: time (Timestamp) – Represents a time
Returns: Represents the time in string format
Return type: str
Examples
>>> from pymove.utils.datetime import time_to_str
>>> time_to_str("2015-12-12 08:00:00.123000")
'08:00:00'
-
pymove.utils.datetime.
timestamp_to_millis
(timestamp: str) → int[source]¶ Converts a local datetime to a POSIX timestamp in milliseconds (like in Java).
Parameters: timestamp (str) – Represents a date
Returns: Represents millisecond results
Return type: int
Examples
>>> from pymove.utils.datetime import timestamp_to_millis
>>> timestamp_to_millis('2015-12-12 08:00:00.123000')
1449907200123 (UTC)
-
pymove.utils.datetime.
to_day_of_week_int
(dt: datetime.datetime) → int[source]¶ Get day of week of a date. Monday == 0…Sunday == 6.
Parameters: dt (datetime) – Represents a datetime in datetime format.
Returns: Represents day of week.
Return type: int
Example
>>> from pymove.utils.datetime import str_to_datetime, to_day_of_week_int
>>> monday = str_to_datetime('2021-05-3 12:00:01')
>>> friday = str_to_datetime('2021-05-7 12:00:01')
>>> print(to_day_of_week_int(monday), type(to_day_of_week_int(monday)))
0 <class 'int'>
>>> print(to_day_of_week_int(friday), type(to_day_of_week_int(friday)))
4 <class 'int'>
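The Monday == 0 … Sunday == 6 convention matches `datetime.weekday()` in the Python standard library. The short check below uses only the standard library to confirm the two dates from the example; it illustrates the numbering and is not a claim about how PyMove implements the function.

```python
from datetime import datetime

monday = datetime(2021, 5, 3, 12, 0, 1)
friday = datetime(2021, 5, 7, 12, 0, 1)

# weekday() numbers Monday as 0; isoweekday() numbers Monday as 1.
print(monday.weekday(), friday.weekday())        # 0 4
print(monday.isoweekday(), friday.isoweekday())  # 1 5
```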
-
pymove.utils.datetime.
working_day
(dt: str | datetime, country: str = 'BR', state: str | None = None) → bool[source]¶ Indicates whether a day specified by the user is a working day.
Parameters: - dt (str or datetime) – Specifies the day the user wants to know if it is a business day.
- country (str) – Indicates country to check for holidays, by default ‘BR’
- state (str) – Indicates state to check for holidays, by default None
Returns: True if the day informed by the user is a working day, False otherwise.
Return type: boolean
Examples
>>> from pymove.utils.datetime import str_to_datetime, working_day
>>> independence_day = str_to_datetime('2021-09-7 12:00:01') # Holiday in Brazil
>>> next_day = str_to_datetime('2021-09-8 12:00:01') # Not a Holiday in Brazil
>>> print(working_day(independence_day, 'BR'))
False
>>> print(type(working_day(independence_day, 'BR')))
<class 'bool'>
>>> print(working_day(next_day, 'BR'))
True
>>> print(type(working_day(next_day, 'BR')))
<class 'bool'>
References
Countries and States names available in https://pypi.org/project/holidays/
pymove.utils.distances module¶
Distances operations.
haversine, euclidean_distance_in_meters, nearest_points, medp, medt
-
pymove.utils.distances.
euclidean_distance_in_meters
(lat1: float | ndarray, lon1: float | ndarray, lat2: float | ndarray, lon2: float | ndarray) → float | ndarray[source]¶ Calculate the euclidean distance in meters between two points.
Parameters: - lat1 (float or array) – latitude of point 1
- lon1 (float or array) – longitude of point 1
- lat2 (float or array) – latitude of point 2
- lon2 (float or array) – longitude of point 2
Returns: euclidean distance in meters between the two points.
Return type: float or ndarray
Example
>>> from pymove.utils.distances import euclidean_distance_in_meters
>>> lat_fortaleza, lon_fortaleza = [-3.71839, -38.5434]
>>> lat_quixada, lon_quixada = [-4.979224744401671, -39.056434302570665]
>>> euclidean_distance_in_meters(
...     lat_fortaleza, lon_fortaleza, lat_quixada, lon_quixada
... )
151907.9670136588
-
pymove.utils.distances.
haversine
(lat1: float | ndarray, lon1: float | ndarray, lat2: float | ndarray, lon2: float | ndarray, to_radians: bool = True, earth_radius: float = 6371) → float | ndarray[source]¶ Calculates the great circle distance between two points on the earth.
Specified in decimal degrees or in radians. All (lat, lon) coordinates must have numeric dtypes and be of equal length. Result in meters. Use 3956 in earth radius for miles.
Parameters: - lat1 (float or array) – latitude of point 1
- lon1 (float or array) – longitude of point 1
- lat2 (float or array) – latitude of point 2
- lon2 (float or array) – longitude of point 2
- to_radians (boolean) – Whether to convert the values to radians, by default True
- earth_radius (int) – Radius of sphere, by default EARTH_RADIUS
Returns: Represents distance between points in meters
Return type: float or ndarray
Example
>>> from pymove.utils.distances import haversine
>>> lat_fortaleza, lon_fortaleza = [-3.71839, -38.5434]
>>> lat_quixada, lon_quixada = [-4.979224744401671, -39.056434302570665]
>>> haversine(lat_fortaleza, lon_fortaleza, lat_quixada, lon_quixada)
151298.02548428564
References
- Vectorized haversine function:
- https://stackoverflow.com/questions/43577086/pandas-calculate-haversine-distance-within-each-group-of-rows
- About distance between two points:
- https://janakiev.com/blog/gps-points-distance-python/
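For reference, the haversine formula can be written with the standard library alone. The sketch below is a scalar version, assuming the default radius of 6371 is in kilometres and the result is converted to meters, as the example output suggests; `haversine_m` is a hypothetical name, and unlike the documented function it does not accept arrays.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, earth_radius_km=6371.0):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a)) * 1000


distance = haversine_m(-3.71839, -38.5434,
                       -4.979224744401671, -39.056434302570665)
print(round(distance))  # 151298
```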
-
pymove.utils.distances.
medp
(traj1: pandas.core.frame.DataFrame, traj2: pandas.core.frame.DataFrame, latitude: str = 'lat', longitude: str = 'lon') → float[source]¶ Returns the Mean Euclidean Distance Predictive between two trajectories.
Considers only the spatial dimension for the similarity measure.
Parameters: - traj1 (dataframe) – The input of one trajectory.
- traj2 (dataframe) – The input of another trajectory.
- latitude (str, optional) – Label of the trajectories dataframe referring to the latitude, by default LATITUDE
- longitude (str, optional) – Label of the trajectories dataframe referring to the longitude, by default LONGITUDE
Returns: total distance
Return type: float
Example
>>> from pymove.utils.distances import medp
>>> traj_1
        lat         lon            datetime  id
0  39.98471  116.319865 2008-10-23 05:53:23   1
>>> traj_2
         lat        lon            datetime  id
0  39.984674  116.31981 2008-10-23 05:53:28   1
>>> medp(traj_1, traj_2)
6.573431370981577e-05
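The value in the example equals the plain Euclidean distance, in degrees, between the two points: sqrt((3.6e-5)² + (5.5e-5)²) ≈ 6.5734e-05. One plausible reading, sketched below, is the mean over the points of the first trajectory of the distance to the nearest point of the second. This interpretation and the name `mean_nearest_distance` are assumptions; the actual PyMove definition may differ for longer trajectories.

```python
import math


def mean_nearest_distance(traj1, traj2):
    """Mean distance (in degrees) from each point of traj1 to its
    nearest point in traj2. Points are (lat, lon) tuples."""
    total = 0.0
    for lat1, lon1 in traj1:
        total += min(
            math.hypot(lat1 - lat2, lon1 - lon2) for lat2, lon2 in traj2
        )
    return total / len(traj1)


traj_1 = [(39.98471, 116.319865)]
traj_2 = [(39.984674, 116.31981)]
spatial = mean_nearest_distance(traj_1, traj_2)
print(f"{spatial:.4e}")  # 6.5734e-05
```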
-
pymove.utils.distances.
medt
(traj1: pandas.core.frame.DataFrame, traj2: pandas.core.frame.DataFrame, latitude: str = 'lat', longitude: str = 'lon', datetime: str = 'datetime') → float[source]¶ Returns the Mean Euclidean Distance Trajectory between two trajectories.
Considers the spatial dimension and the temporal dimension when measuring similarity.
Parameters: - traj1 (dataframe) – The input of one trajectory.
- traj2 (dataframe) – The input of another trajectory.
- latitude (str, optional) – Label of the trajectories dataframe referring to the latitude, by default LATITUDE
- longitude (str, optional) – Label of the trajectories dataframe referring to the longitude, by default LONGITUDE
- datetime (str, optional) – Label of the trajectories dataframe referring to the timestamp, by default DATETIME
Returns: total distance
Return type: float
Example
>>> from pymove.utils.distances import medt
>>> traj_1
        lat         lon            datetime  id
0  39.98471  116.319865 2008-10-23 05:53:23   1
>>> traj_2
         lat        lon            datetime  id
0  39.984674  116.31981 2008-10-23 05:53:28   1
>>> medt(traj_1, traj_2)
6.592419887747872e-05
-
pymove.utils.distances.
nearest_points
(traj1: pandas.core.frame.DataFrame, traj2: pandas.core.frame.DataFrame, latitude: str = 'lat', longitude: str = 'lon') → pandas.core.frame.DataFrame[source]¶ Returns the point closest to another trajectory based on the Euclidean distance.
Parameters: - traj1 (dataframe) – The input of one trajectory.
- traj2 (dataframe) – The input of another trajectory.
- latitude (str, optional) – Label of the trajectories dataframe referring to the latitude, by default LATITUDE
- longitude (str, optional) – Label of the trajectories dataframe referring to the longitude, by default LONGITUDE
Returns: dataframe with closest points
Return type: DataFrame
Example
>>> from pymove.utils.distances import nearest_points
>>> df_a
         lat         lon            datetime  id
0  39.984198  116.319322 2008-10-23 05:53:06   1
1  39.984224  116.319402 2008-10-23 05:53:11   1
>>> df_b
         lat         lon            datetime  id
0  39.984211  116.319389 2008-10-23 05:53:16   1
1  39.984217  116.319422 2008-10-23 05:53:21   1
>>> nearest_points(df_a, df_b)
         lat         lon            datetime  id
0  39.984211  116.319389 2008-10-23 05:53:16   1
1  39.984211  116.319389 2008-10-23 05:53:16   1
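The example result can be reproduced by picking, for every point of the first trajectory, the point of the second trajectory with the smallest Euclidean distance in coordinate space. The following is a minimal sketch over plain (lat, lon) tuples; `nearest_points_sketch` is a hypothetical helper, not the library function.

```python
import math


def nearest_points_sketch(traj1, traj2):
    """For each (lat, lon) in traj1, return the closest (lat, lon) in traj2."""
    return [
        min(traj2, key=lambda q: math.hypot(p[0] - q[0], p[1] - q[1]))
        for p in traj1
    ]


df_a = [(39.984198, 116.319322), (39.984224, 116.319402)]
df_b = [(39.984211, 116.319389), (39.984217, 116.319422)]
closest = nearest_points_sketch(df_a, df_b)
print(closest)  # both entries are (39.984211, 116.319389)
```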
pymove.utils.geoutils module¶
Geo operations.
v_color, create_geohash_df, create_bin_geohash_df, decode_geohash_to_latlon
-
pymove.utils.geoutils.
create_bin_geohash_df
(data: pandas.core.frame.DataFrame, precision: float = 15)[source]¶ Create trajectory geohash binaries and integrate with df.
Parameters: - data (dataframe) – The input trajectories data
- precision (float, optional) – Number of characters in resulting geohash, by default 15
Returns: A DataFrame with the additional column ‘bin_geohash’
Example
>>> from pymove.utils.geoutils import create_bin_geohash_df
>>> geoLife_df
         lat         lon
0  39.984094  116.319236
1  39.984198  116.319322
2  39.984224  116.319402
3  39.984211  116.319389
4  39.984217  116.319422
>>> print(type(create_bin_geohash_df(geoLife_df)))
<class 'NoneType'>
>>> geoLife_df
         lat         lon                                        bin_geohash
0  39.984094  116.319236  [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ...
1  39.984198  116.319322  [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ...
2  39.984224  116.319402  [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ...
3  39.984211  116.319389  [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ...
4  39.984217  116.319422  [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ...
-
pymove.utils.geoutils.
create_geohash_df
(data: pandas.core.frame.DataFrame, precision: float = 15)[source]¶ Create geohash from geographic coordinates and integrate with df.
Parameters: - data (dataframe) – The input trajectories data
- precision (float, optional) – Number of characters in resulting geohash, by default 15
Returns: A DataFrame with the additional column ‘geohash’
Example
>>> from pymove.utils.geoutils import create_geohash_df
>>> geoLife_df
         lat         lon
0  39.984094  116.319236
1  39.984198  116.319322
2  39.984224  116.319402
3  39.984211  116.319389
4  39.984217  116.319422
>>> print(type(create_geohash_df(geoLife_df)))
<class 'NoneType'>
>>> geoLife_df
         lat         lon          geohash
0  39.984094  116.319236  wx4eqyvh4xkg0xs
1  39.984198  116.319322  wx4eqyvhudszsev
2  39.984224  116.319402  wx4eqyvhyx8d9wc
3  39.984211  116.319389  wx4eqyvhyjnv5m7
4  39.984217  116.319422  wx4eqyvhyyr2yy8
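To show what the geohash column encodes, here is a pure-Python sketch of the standard geohash algorithm: longitude and latitude ranges are bisected alternately, and every five bits become one base-32 character. The function `geohash_encode` is a hypothetical stand-in for whatever encoder PyMove uses internally.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 15) -> str:
    """Encode a coordinate as a geohash with `precision` characters."""
    lat_rng = [-90.0, 90.0]
    lon_rng = [-180.0, 180.0]
    chars = []
    bits, n_bits = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_rng, lon) if use_lon else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        n_bits += 1
        if n_bits == 5:
            chars.append(BASE32[bits])
            bits, n_bits = 0, 0
    return "".join(chars)


code = geohash_encode(39.984094, 116.319236)
print(code[:5])  # wx4eq
```

Nearby points share a common prefix, which is why all five rows in the example start with `wx4eqyvh`.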
-
pymove.utils.geoutils.
decode_geohash_to_latlon
(data: pandas.core.frame.DataFrame, label_geohash: str = 'geohash', reset_index: bool = True)[source]¶ Decode feature with hash of trajectories back to geographic coordinates.
Parameters: - data (dataframe) – The input trajectories data
- label_geohash (str, optional) – The name of the feature with hashed trajectories, by default GEOHASH
- reset_index (boolean, optional) – Condition to reset the df index, by default True
Returns: A DataFrame with the additional columns ‘lat_decode’ and ‘lon_decode’
Example
>>> from pymove.utils.geoutils import decode_geohash_to_latlon
>>> geoLife_df
         lat         lon          geohash
0  39.984094  116.319236  wx4eqyvh4xkg0xs
1  39.984198  116.319322  wx4eqyvhudszsev
2  39.984224  116.319402  wx4eqyvhyx8d9wc
3  39.984211  116.319389  wx4eqyvhyjnv5m7
4  39.984217  116.319422  wx4eqyvhyyr2yy8
>>> print(type(decode_geohash_to_latlon(geoLife_df)))
<class 'NoneType'>
>>> geoLife_df
         lat         lon          geohash  lat_decode  lon_decode
0  39.984094  116.319236  wx4eqyvh4xkg0xs   39.984094  116.319236
1  39.984198  116.319322  wx4eqyvhudszsev   39.984198  116.319322
2  39.984224  116.319402  wx4eqyvhyx8d9wc   39.984224  116.319402
3  39.984211  116.319389  wx4eqyvhyjnv5m7   39.984211  116.319389
4  39.984217  116.319422  wx4eqyvhyyr2yy8   39.984217  116.319422
-
pymove.utils.geoutils.
v_color
(ob: shapely.geometry.base.BaseGeometry) → str[source]¶ Returns ‘#ffcc33’ if the object crosses itself; otherwise returns ‘#6699cc’.
Parameters: ob (geometry object) – Any geometric object
Returns: Geometric object color
Return type: str
Example
>>> from pymove.utils.geoutils import v_color
>>> from shapely.geometry import LineString
>>> case_1 = LineString([(3,3),(4,4), (3,4)])
>>> case_2 = LineString([(3,3),(4,4), (4,3)])
>>> case_3 = LineString([(3,3),(4,4), (3,4), (4,3)])
>>> print(v_color(case_1), type(v_color(case_1)))
#6699cc <class 'str'>
>>> print(v_color(case_2), type(v_color(case_2)))
#6699cc <class 'str'>
>>> print(v_color(case_3), type(v_color(case_3)))
#ffcc33 <class 'str'>
pymove.utils.integration module¶
Integration operations.
union_poi_bank, union_poi_bus_station, union_poi_bar_restaurant, union_poi_parks, union_poi_police, join_collective_areas, join_with_pois, join_with_pois_by_category, join_with_events, join_with_event_by_dist_and_time, join_with_home_by_id, merge_home_with_poi
-
pymove.utils.integration.
join_collective_areas
(data: DataFrame, areas: DataFrame, label_geometry: str = 'geometry', inplace: bool = False) → DataFrame | None[source]¶ Performs the integration between trajectories and collective areas.
Generates a new column that indicates whether each point of the trajectory lies inside a collective area.
Parameters: - data (geopandas.GeoDataFrame) – The input trajectory data
- areas (geopandas.GeoDataFrame) – The input collective areas data
- label_geometry (str, optional) – Label referring to the Point of Interest category, by default GEOMETRY
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns: data with joined geometries or None
Return type: DataFrame
Examples
>>> from pymove.utils.integration import join_collective_areas
>>> gdf
         lat         lon            datetime  id                    geometry
0  39.984094  116.319236 2008-10-23 05:53:05   1  POINT (116.31924 39.98409)
1  39.984198  116.319322 2008-10-23 05:53:06   1  POINT (116.31932 39.98420)
2  39.984224  116.319402 2008-10-23 05:53:11   1  POINT (116.31940 39.98422)
3  39.984211  116.319389 2008-10-23 05:53:16   1  POINT (116.31939 39.98421)
4  39.984217  116.319422 2008-10-23 05:53:21   1  POINT (116.31942 39.98422)
>>> area_c
         lat         lon            datetime  id                      geometry
0  39.984094  116.319236 2008-10-23 05:53:05   1  POINT (116.319236 39.984094)
1  40.006436  116.317701 2008-10-23 10:53:31   1  POINT (116.317701 40.006436)
2  40.014125  116.306159 2008-10-23 23:43:56   1  POINT (116.306159 40.014125)
3  39.984211  116.319389 2008-10-23 05:53:16   1  POINT (116.319389 39.984211)
                                                    POINT (116.32687 39.97901)
>>> join_collective_areas(gdf, area_c)
>>> gdf.head()
         lat         lon            datetime  id                      geometry  violating
0  39.984094  116.319236 2008-10-23 05:53:05   1  POINT (116.319236 39.984094)       True
1  39.984198  116.319322 2008-10-23 05:53:06   1  POINT (116.319322 39.984198)      False
2  39.984224  116.319402 2008-10-23 05:53:11   1  POINT (116.319402 39.984224)      False
3  39.984211  116.319389 2008-10-23 05:53:16   1  POINT (116.319389 39.984211)       True
4  39.984217  116.319422 2008-10-23 05:53:21   1  POINT (116.319422 39.984217)      False
-
pymove.utils.integration.
join_with_event_by_dist_and_time
(data: pandas.core.frame.DataFrame, df_events: pandas.core.frame.DataFrame, label_date: str = 'datetime', label_event_id: str = 'event_id', label_event_type: str = 'event_type', time_window: float = 3600, radius: float = 1000, inplace: bool = False)[source]¶ Performs the integration between trajectories and events on windows.
Generates new columns with the category of each event's point of interest and the distance between the user's location and the POI, matching on both the distance and the time of each trajectory point.
Parameters: - data (DataFrame) – The input trajectory data.
- df_events (DataFrame) – The input events points of interest data.
- label_date (str, optional) – Label of data referring to the datetime of the input trajectory data, by default DATETIME
- label_event_id (str, optional) – Label of df_events referring to the id of the event, by default EVENT_ID
- label_event_type (str, optional) – Label of df_events referring to the type of the event, by default EVENT_TYPE
- time_window (float, optional) – tolerable length of time range in seconds for assigning the event’s point of interest to the trajectory point, by default 3600
- radius (float, optional) – maximum radius of pois in meters, by default 1000
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Examples
>>> from pymove.utils.integration import join_with_event_by_dist_and_time
>>> move_df
         lat         lon            datetime  id
0  39.984094  116.319236 2008-10-23 05:53:05   1
1  39.984559  116.326696 2008-10-23 10:37:26   1
2  39.993527  116.326483 2008-10-24 00:02:14   2
3  39.978575  116.326975 2008-10-24 00:22:01   3
4  39.981668  116.310769 2008-10-24 01:57:57   3
>>> events
         lat         lon  id            datetime type_poi           name_poi
0  39.984094  116.319236   1 2008-10-23 05:53:05     show   forro_tropykalia
1  39.991013  116.326384   2 2008-10-23 10:27:26  corrida   racha_de_jumento
2  39.990013  116.316384   2 2008-10-23 10:37:26     show   dia_do_municipio
3  40.010000  116.312615   3 2008-10-24 01:57:57    feira  adocao_de_animais
>>> join_with_event_by_dist_and_time(move_df, events)
>>> move_df
         lat         lon            datetime  id         type_poi          dist_event                              name_poi
0  39.984094  116.319236 2008-10-23 05:53:05   1           [show]               [0.0]                    [forro_tropykalia]
1  39.984559  116.326696 2008-10-23 10:37:26   1  [corrida, show]  [718.144, 1067.53]  [racha_de_jumento, dia_do_municipio]
2  39.993527  116.326483 2008-10-24 00:02:14   2             None                None                                  None
3  39.978575  116.326975 2008-10-24 00:22:01   3             None                None                                  None
4  39.981668  116.310769 2008-10-24 01:57:57   3             None                None                                  None
Raises: ValueError
– If feature generation fails
-
pymove.utils.integration.
join_with_events
(data: pandas.core.frame.DataFrame, df_events: pandas.core.frame.DataFrame, label_date: str = 'datetime', time_window: int = 900, label_event_id: str = 'event_id', label_event_type: str = 'event_type', inplace: bool = False)[source]¶ Performs the integration between trajectories and the closest event in time window.
Generating new columns referring to the category of the point of interest, the distance from the nearest point of interest based on time of each point of the trajectories.
Parameters: - data (DataFrame) – The input trajectory data.
- df_events (DataFrame) – The input events points of interest data.
- label_date (str, optional) – Label of data referring to the datetime of the input trajectory data, by default DATETIME
- time_window (float, optional) – tolerable length of time range in seconds for assigning the event’s point of interest to the trajectory point, by default 900
- label_event_id (str, optional) – Label of df_events referring to the id of the event, by default EVENT_ID
- label_event_type (str, optional) – Label of df_events referring to the type of the event, by default EVENT_TYPE
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Examples
>>> from pymove.utils.integration import join_with_events
>>> move_df
         lat         lon            datetime  id
0  39.984094  116.319236 2008-10-23 05:53:05   1
1  39.984559  116.326696 2008-10-23 10:37:26   1
2  39.993527  116.326483 2008-10-24 00:02:14   2
3  39.978575  116.326975 2008-10-24 00:22:01   3
4  39.981668  116.310769 2008-10-24 01:57:57   3
>>> events
         lat         lon  id            datetime event_type           event_id
0  39.984094  116.319236   1 2008-10-23 05:53:05       show   forro_tropykalia
1  39.991013  116.326384   2 2008-10-23 10:37:26       show   dia_do_municipio
2  40.010000  116.312615   3 2008-10-24 01:57:57      feira  adocao_de_animais
>>> join_with_events(move_df, events)
         lat         lon            datetime  id event_type   dist_event           event_id
0  39.984094  116.319236 2008-10-23 05:53:05   1       show     0.000000   forro_tropykalia
1  39.984559  116.326696 2008-10-23 10:37:26   1       show   718.144152   dia_do_municipio
2  39.993527  116.326483 2008-10-24 00:02:14   2                     inf
3  39.978575  116.326975 2008-10-24 00:22:01   3                     inf
4  39.981668  116.310769 2008-10-24 01:57:57   3      feira  3154.296880  adocao_de_animais
Raises: ValueError
– If feature generation fails
-
pymove.utils.integration.
join_with_home_by_id
(data: pandas.core.frame.DataFrame, df_home: pandas.core.frame.DataFrame, label_id: str = 'id', label_address: str = 'formatted_address', label_city: str = 'city', drop_id_without_home: bool = False, inplace: bool = False)[source]¶ Performs the integration between trajectories and home points.
Generating new columns referring to the distance of the nearest home point, address and city of each trajectory point.
Parameters: - data (DataFrame) – The input trajectory data.
- df_home (DataFrame) – The input home points data.
- label_id (str, optional) – Label of df_home referring to the home point id, by default TRAJ_ID
- label_address (str, optional) – Label of df_home referring to the home point address, by default ADDRESS
- label_city (str, optional) – Label of df_home referring to the point city, by default CITY
- drop_id_without_home (bool, optional) – flag as an option to drop id’s that don’t have houses, by default False
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Examples
>>> from pymove.utils.integration import join_with_home_by_id
>>> move_df
         lat         lon            datetime  id
0  39.984094  116.319236 2008-10-23 05:53:05   1
1  39.984559  116.326696 2008-10-23 10:37:26   1
2  40.002899  116.321520 2008-10-23 10:50:16   1
3  40.016238  116.307691 2008-10-23 11:03:06   1
4  40.013814  116.306525 2008-10-23 11:58:33   2
5  40.009735  116.315069 2008-10-23 23:50:45   2
>>> home_df
         lat         lon  id formatted_address          city
0  39.984094  116.319236   1        rua da mae     quixiling
1  40.013821  116.306531   2    rua da familia  quixeramoling
>>> join_with_home_by_id(move_df, home_df)
>>> move_df
   id        lat         lon            datetime    dist_home            home           city
0   1  39.984094  116.319236 2008-10-23 05:53:05     0.000000      rua da mae      quixiling
1   1  39.984559  116.326696 2008-10-23 10:37:26   637.690216      rua da mae      quixiling
2   1  40.002899  116.321520 2008-10-23 10:50:16  2100.053501      rua da mae      quixiling
3   1  40.016238  116.307691 2008-10-23 11:03:06  3707.066732      rua da mae      quixiling
4   2  40.013814  116.306525 2008-10-23 11:58:33     0.931101  rua da familia  quixeramoling
5   2  40.009735  116.315069 2008-10-23 23:50:45   857.417540  rua da familia  quixeramoling
-
pymove.utils.integration.
join_with_pois
(data: pandas.core.frame.DataFrame, df_pois: pandas.core.frame.DataFrame, label_id: str = 'id', label_poi_name: str = 'name_poi', reset_index: bool = True, inplace: bool = False)[source]¶ Performs the integration between trajectories and the closest point of interest.
Generating two new columns referring to the name and the distance from the point of interest closest to each point of the trajectory.
Parameters: - data (DataFrame) – The input trajectory data.
- df_pois (DataFrame) – The input point of interest data.
- label_id (str, optional) – Label of df_pois referring to the Point of Interest id, by default TRAJ_ID
- label_poi_name (str, optional) – Label of df_pois referring to the Point of Interest name, by default NAME_POI
- reset_index (bool, optional) – Flag for reset index of the df_pois and data dataframes before the join, by default True
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Examples
>>> from pymove.utils.integration import join_with_pois
>>> move_df
         lat         lon            datetime  id
0  39.984094  116.319236 2008-10-23 05:53:05   1
1  39.984559  116.326696 2008-10-23 10:37:26   1
2  40.002899  116.321520 2008-10-23 10:50:16   1
3  40.016238  116.307691 2008-10-23 11:03:06   1
4  40.013814  116.306525 2008-10-23 11:58:33   2
5  40.009735  116.315069 2008-10-23 23:50:45   2
>>> pois
         lat         lon  id  type_poi             name_poi
0  39.984094  116.319236   1   policia       distrito_pol_1
1  39.991013  116.326384   2   policia      policia_federal
2  40.010000  116.312615   3  comercio  supermercado_aroldo
>>> join_with_pois(move_df, pois)
         lat         lon            datetime  id  id_poi     dist_poi             name_poi
0  39.984094  116.319236 2008-10-23 05:53:05   1       1     0.000000       distrito_pol_1
1  39.984559  116.326696 2008-10-23 10:37:26   1       1   637.690216       distrito_pol_1
2  40.002899  116.321520 2008-10-23 10:50:16   1       3  1094.860663  supermercado_aroldo
3  40.016238  116.307691 2008-10-23 11:03:06   1       3   810.542998  supermercado_aroldo
4  40.013814  116.306525 2008-10-23 11:58:33   2       3   669.973155  supermercado_aroldo
5  40.009735  116.315069 2008-10-23 23:50:45   2       3   211.069129  supermercado_aroldo
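The distances in the example agree with great-circle (haversine) distances in meters on a sphere of radius 6371 km. The sketch below finds the closest POI for a single trajectory point using plain tuples instead of dataframes; `haversine_m` and `nearest_poi` are hypothetical helpers, and the choice of haversine is inferred from the example values rather than confirmed against the PyMove source.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, earth_radius_km=6371.0):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a)) * 1000


# (id, name_poi, lat, lon) rows taken from the pois table in the example.
pois = [
    (1, "distrito_pol_1", 39.984094, 116.319236),
    (2, "policia_federal", 39.991013, 116.326384),
    (3, "supermercado_aroldo", 40.010000, 116.312615),
]


def nearest_poi(lat, lon, pois):
    """Return (id_poi, name_poi, dist_poi) for the POI closest to the point."""
    best = min(pois, key=lambda p: haversine_m(lat, lon, p[2], p[3]))
    return best[0], best[1], haversine_m(lat, lon, best[2], best[3])


id_poi, name_poi, dist_poi = nearest_poi(39.984559, 116.326696, pois)
print(id_poi, name_poi, round(dist_poi, 1))  # 1 distrito_pol_1 637.7
```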
-
pymove.utils.integration.
join_with_pois_by_category
(data: pandas.core.frame.DataFrame, df_pois: pandas.core.frame.DataFrame, label_category: str = 'type_poi', label_id: str = 'id', inplace: bool = False)[source]¶ Performs the integration between trajectories and each type of points of interest.
Generating new columns referring to the category and distance from the nearest point of interest that has this category at each point of the trajectory.
Parameters: - data (DataFrame) – The input trajectory data.
- df_pois (DataFrame) – The input point of interest data.
- label_category (str, optional) – Label of df_pois referring to the point of interest category, by default TYPE_POI
- label_id (str, optional) – Label of df_pois referring to the point of interest id, by default TRAJ_ID
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Examples
>>> from pymove.utils.integration import join_with_pois_by_category >>> move_df lat lon datetime id 0 39.984094 116.319236 2008-10-23 05:53:05 1 1 39.984559 116.326696 2008-10-23 10:37:26 1 2 40.002899 116.321520 2008-10-23 10:50:16 1 3 40.016238 116.307691 2008-10-23 11:03:06 1 4 40.013814 116.306525 2008-10-23 11:58:33 2 5 40.009735 116.315069 2008-10-23 23:50:45 2 >>> pois lat lon id type_poi name_poi 0 39.984094 116.319236 1 policia distrito_pol_1 1 39.991013 116.326384 2 policia policia_federal 2 40.010000 116.312615 3 comercio supermercado_aroldo >>> join_with_pois_by_category(move_df, pois) lat lon datetime id id_policia dist_policia id_comercio dist_comercio 0 39.984094 116.319236 2008-10-23 05:53:05 1 1 0.000000 3 2935.310277 1 39.984559 116.326696 2008-10-23 10:37:26 1 1 637.690216 3 3072.696379 2 40.002899 116.321520 2008-10-23 10:50:16 1 2 1385.087181 3 1094.860663 3 40.016238 116.307691 2008-10-23 11:03:06 1 2 3225.288831 3 810.542998 4 40.013814 116.306525 2008-10-23 11:58:33 2 2 3047.838222 3 669.973155 5 40.009735 116.315069 2008-10-23 23:50:45 2 2 2294.075820 3 211.069129
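The per-category nearest-neighbour search described above can be sketched in plain Python. This is only an illustration, not PyMove's implementation: the helper names `haversine_m` and `nearest_poi_by_category` are hypothetical, and the haversine formula with a mean Earth radius of 6,371,000 m is an assumption about how distances are measured.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in meters (mean Earth radius is an assumption)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def nearest_poi_by_category(point, pois):
    """Return {category: (poi_id, distance_m)} for one trajectory point."""
    nearest = {}
    for poi in pois:
        dist = haversine_m(point["lat"], point["lon"], poi["lat"], poi["lon"])
        best = nearest.get(poi["type_poi"])
        # Keep only the closest POI seen so far for this category.
        if best is None or dist < best[1]:
            nearest[poi["type_poi"]] = (poi["id"], dist)
    return nearest


# POIs copied from the example above.
pois = [
    {"lat": 39.984094, "lon": 116.319236, "id": 1, "type_poi": "policia"},
    {"lat": 39.991013, "lon": 116.326384, "id": 2, "type_poi": "policia"},
    {"lat": 40.010000, "lon": 116.312615, "id": 3, "type_poi": "comercio"},
]
point = {"lat": 40.002899, "lon": 116.321520}  # row 2 of move_df
result = nearest_poi_by_category(point, pois)
```

For row 2 of the example, this sketch selects POI 2 for `policia` and POI 3 for `comercio`, matching the `id_policia` and `id_comercio` columns shown above.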
-
pymove.utils.integration.
merge_home_with_poi
(data: pandas.core.frame.DataFrame, label_dist_poi: str = 'dist_poi', label_name_poi: str = 'name_poi', label_id_poi: str = 'id_poi', label_home: str = 'home', label_dist_home: str = 'dist_home', drop_columns: bool = True, inplace: bool = False)[source]¶ Merges the points of interest and the trajectories.
The starting (home) points are treated as additional points of interest, and a new DataFrame is generated.
Parameters: - data (DataFrame) – The input trajectory data, with join_with_pois and join_with_home_by_id applied.
- label_dist_poi (str, optional) – Label of data referring to the distance from the nearest point of interest, by default DIST_POI
- label_name_poi (str, optional) – Label of data referring to the name from the nearest point of interest, by default NAME_POI
- label_id_poi (str, optional) – Label of data referring to the id from the nearest point of interest, by default ID_POI
- label_home (str, optional) – Label of df_home referring to the home point, by default HOME
- label_dist_home (str, optional) – Label of df_home referring to the distance to the home point, by default DIST_HOME
- drop_columns (bool, optional) – Flag that controls the deletion of the columns referring to the id and the distance from the home point, by default True
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Examples
>>> from pymove.utils.integration import ( ... merge_home_with_poi, ... join_with_home_by_id ... ) >>> move_df lat lon datetime id id_poi dist_poi name_poi 0 39.984094 116.319236 2008-10-23 05:53:05 1 1 0.000000 distrito_pol_1 1 39.984559 116.326696 2008-10-23 10:37:26 1 1 637.690216 distrito_pol_1 2 40.002899 116.321520 2008-10-23 10:50:16 1 2 1385.087181 policia_federal 3 40.016238 116.307691 2008-10-23 11:03:06 1 2 3225.288831 policia_federal 4 40.013814 116.306525 2008-10-23 11:58:33 2 2 3047.838222 policia_federal 5 40.009735 116.315069 2008-10-23 23:50:45 2 2 2294.075820 policia_federal >>> home_df lat lon id formatted_address city 0 39.984094 116.319236 1 rua da mae quixiling 1 40.013821 116.306531 2 rua da familia quixeramoling >>> join_with_home_by_id(move_df, home_df, inplace=True) >>> move_df id lat lon datetime id_poi dist_poi name_poi dist_home home city 0 1 39.984094 116.319236 2008-10-23 05:53:05 1 0.000000 distrito_pol_1 0.000000 rua da mae quixiling 1 1 39.984559 116.326696 2008-10-23 10:37:26 1 637.690216 distrito_pol_1 637.690216 rua da mae quixiling 2 1 40.002899 116.321520 2008-10-23 10:50:16 2 1385.087181 policia_federal 2100.053501 rua da mae quixiling 3 1 40.016238 116.307691 2008-10-23 11:03:06 2 3225.288831 policia_federal 3707.066732 rua da mae quixiling 4 2 40.013814 116.306525 2008-10-23 11:58:33 2 3047.838222 policia_federal 0.931101 rua da familia quixeramoling 5 2 40.009735 116.315069 2008-10-23 23:50:45 2 2294.075820 policia_federal 857.417540 rua da familia quixeramoling >>> merge_home_with_poi(move_df) id lat lon datetime id_poi dist_poi name_poi city 0 1 39.984094 116.319236 2008-10-23 05:53:05 rua da mae 0.000000 home quixiling 1 1 39.984559 116.326696 2008-10-23 10:37:26 rua da mae 637.690216 home quixiling 2 1 40.002899 116.321520 2008-10-23 10:50:16 2 1385.087181 policia_federal quixiling 3 1 40.016238 116.307691 2008-10-23 11:03:06 2 3225.288831 policia_federal quixiling 4 2 40.013814 116.306525 2008-10-23 11:58:33 rua da familia 0.931101 home quixeramoling 5 2 40.009735 116.315069 2008-10-23 23:50:45 rua da familia 857.417540 home quixeramoling
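The replacement rule visible in the example output can be sketched as follows. This is a hedged illustration, not the library's code: `merge_home_with_poi_sketch` is a hypothetical helper working on plain dictionaries, and the `dist_home <= dist_poi` comparison is inferred from the example rows (rows 0 and 1 tie and are still labelled as home).

```python
def merge_home_with_poi_sketch(rows):
    """Treat the home point as one more POI for each trajectory point.

    Assumption: home wins whenever it is at least as close as the nearest POI.
    """
    merged = []
    for row in rows:
        row = dict(row)  # work on a copy, like inplace=False
        if row["dist_home"] <= row["dist_poi"]:
            row["id_poi"] = row["home"]
            row["dist_poi"] = row["dist_home"]
            row["name_poi"] = "home"
        # Mirrors drop_columns=True: the home columns are removed afterwards.
        del row["dist_home"], row["home"]
        merged.append(row)
    return merged


rows = [
    {"id": 1, "id_poi": 2, "dist_poi": 1385.087181,
     "name_poi": "policia_federal", "dist_home": 2100.053501,
     "home": "rua da mae"},
    {"id": 2, "id_poi": 2, "dist_poi": 3047.838222,
     "name_poi": "policia_federal", "dist_home": 0.931101,
     "home": "rua da familia"},
]
merged = merge_home_with_poi_sketch(rows)
```

In this sketch the first row keeps its POI because the POI is closer than home, while the second row is relabelled as `home`, as in rows 2 and 4 of the example.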
-
pymove.utils.integration.
union_poi_bank
(data: DataFrame, label_poi: str = 'type_poi', banks: list[str] | None = None, inplace: bool = False) → DataFrame | None[source]¶ Performs the union between the different bank categories.
The matching Points of Interest are merged into a single category named ‘banks’.
Parameters: - data (DataFrame) – Input points of interest data
- label_poi (str, optional) – Label referring to the Point of Interest category, by default TYPE_POI
- banks (list of str, optional) – Names of POI categories referring to banks, by default [‘bancos_filiais’, ‘bancos_agencias’, ‘bancos_postos’, ‘bancos_PAE’, ‘bank’]
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns: data with poi or None
Return type: DataFrame
Examples
>>> from pymove.utils.integration import union_poi_bank >>> pois_df lat lon id type_poi 0 39.984094 116.319236 1 bank 1 39.984198 116.319322 2 randomvalue 2 39.984224 116.319402 3 bancos_postos 3 39.984211 116.319389 4 randomvalue 4 39.984217 116.319422 5 bancos_PAE 5 39.984710 116.319865 6 bancos_postos 6 39.984674 116.319810 7 bancos_agencias 7 39.984623 116.319773 8 bancos_filiais 8 39.984606 116.319732 9 banks 9 39.984555 116.319728 10 banks >>> union_poi_bank(pois_df) lat lon id type_poi 0 39.984094 116.319236 1 banks 1 39.984198 116.319322 2 randomvalue 2 39.984224 116.319402 3 banks 3 39.984211 116.319389 4 randomvalue 4 39.984217 116.319422 5 banks 5 39.984710 116.319865 6 banks 6 39.984674 116.319810 7 banks 7 39.984623 116.319773 8 banks 8 39.984606 116.319732 9 banks 9 39.984555 116.319728 10 banks
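The category union performed by this function and the other `union_poi_*` functions amounts to relabelling every category in a given list with one unified name. A minimal sketch on a plain list follows; `union_categories` is a hypothetical helper, not part of PyMove.

```python
def union_categories(categories, members, unified):
    """Replace every category listed in `members` with the `unified` label."""
    member_set = set(members)  # set lookup keeps the pass linear
    return [unified if c in member_set else c for c in categories]


banks = ["bancos_filiais", "bancos_agencias", "bancos_postos",
         "bancos_PAE", "bank"]
type_poi = ["bank", "randomvalue", "bancos_postos", "bancos_PAE", "banks"]
unified = union_categories(type_poi, banks, "banks")
```

Categories outside the list, such as `randomvalue`, are left unchanged, and values already equal to the unified name stay as they are.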
-
pymove.utils.integration.
union_poi_bar_restaurant
(data: DataFrame, label_poi: str = 'type_poi', bar_restaurant: list[str] | None = None, inplace: bool = False) → DataFrame | None[source]¶ Performs the union between bar and restaurant categories.
The matching Points of Interest are merged into a single category named ‘bar-restaurant’.
Parameters: - data (DataFrame) – Input points of interest data
- label_poi (str, optional) – Label referring to the Point of Interest category, by default TYPE_POI
- bar_restaurant (list of str, optional) – Names of POI categories referring to bars or restaurants, by default [‘restaurant’, ‘bar’]
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns: data with poi or None
Return type: DataFrame
Examples
>>> from pymove.utils.integration import union_poi_bar_restaurant >>> pois_df lat lon id type_poi 0 39.984094 116.319236 1 restaurant 1 39.984198 116.319322 2 restaurant 2 39.984224 116.319402 3 randomvalue 3 39.984211 116.319389 4 bar 4 39.984217 116.319422 5 bar 5 39.984710 116.319865 6 bar-restaurant 6 39.984674 116.319810 7 random123 7 39.984623 116.319773 8 123 >>> union_poi_bar_restaurant(pois_df) lat lon id type_poi 0 39.984094 116.319236 1 bar-restaurant 1 39.984198 116.319322 2 bar-restaurant 2 39.984224 116.319402 3 randomvalue 3 39.984211 116.319389 4 bar-restaurant 4 39.984217 116.319422 5 bar-restaurant 5 39.984710 116.319865 6 bar-restaurant 6 39.984674 116.319810 7 random123 7 39.984623 116.319773 8 123
-
pymove.utils.integration.
union_poi_bus_station
(data: DataFrame, label_poi: str = 'type_poi', bus_stations: list[str] | None = None, inplace: bool = False) → DataFrame | None[source]¶ Performs the union between the different bus station categories.
The matching Points of Interest are merged into a single category named ‘bus_station’.
Parameters: - data (DataFrame) – Input points of interest data
- label_poi (str, optional) – Label referring to the Point of Interest category, by default TYPE_POI
- bus_stations (list of str, optional) – Names of POI categories referring to bus stations, by default [‘transit_station’, ‘pontos_de_onibus’]
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns: data with poi or None
Return type: DataFrame
Examples
>>> from pymove.utils.integration import union_poi_bus_station >>> pois_df lat lon id type_poi 0 39.984094 116.319236 1 transit_station 1 39.984198 116.319322 2 randomvalue 2 39.984224 116.319402 3 transit_station 3 39.984211 116.319389 4 pontos_de_onibus 4 39.984217 116.319422 5 transit_station 5 39.984710 116.319865 6 randomvalue 6 39.984674 116.319810 7 bus_station 7 39.984623 116.319773 8 bus_station >>> union_poi_bus_station(pois_df) lat lon id type_poi 0 39.984094 116.319236 1 bus_station 1 39.984198 116.319322 2 randomvalue 2 39.984224 116.319402 3 bus_station 3 39.984211 116.319389 4 bus_station 4 39.984217 116.319422 5 bus_station 5 39.984710 116.319865 6 randomvalue 6 39.984674 116.319810 7 bus_station 7 39.984623 116.319773 8 bus_station
-
pymove.utils.integration.
union_poi_parks
(data: DataFrame, label_poi: str = 'type_poi', parks: list[str] | None = None, inplace: bool = False) → DataFrame | None[source]¶ Performs the union between park categories.
The matching Points of Interest are merged into a single category named ‘parks’.
Parameters: - data (DataFrame) – Input points of interest data
- label_poi (str, optional) – Label referring to the Point of Interest category, by default TYPE_POI
- parks (list of str, optional) – Names of POI categories referring to parks, by default [‘pracas_e_parques’, ‘park’]
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns: data with poi or None
Return type: DataFrame
Examples
>>> from pymove.utils.integration import union_poi_parks >>> pois_df lat lon id type_poi 0 39.984094 116.319236 1 pracas_e_parques 1 39.984198 116.319322 2 park 2 39.984224 116.319402 3 parks 3 39.984211 116.319389 4 random 4 39.984217 116.319422 5 123 5 39.984710 116.319865 6 park 6 39.984674 116.319810 7 parks 7 39.984623 116.319773 8 pracas_e_parques >>> union_poi_parks(pois_df) lat lon id type_poi 0 39.984094 116.319236 1 parks 1 39.984198 116.319322 2 parks 2 39.984224 116.319402 3 parks 3 39.984211 116.319389 4 random 4 39.984217 116.319422 5 123 5 39.984710 116.319865 6 parks 6 39.984674 116.319810 7 parks 7 39.984623 116.319773 8 parks
-
pymove.utils.integration.
union_poi_police
(data: DataFrame, label_poi: str = 'type_poi', police: list[str] | None = None, inplace: bool = False) → DataFrame | None[source]¶ Performs the union between police categories.
The matching Points of Interest are merged into a single category named ‘police’.
Parameters: - data (DataFrame) – Input points of interest data
- label_poi (str, optional) – Label referring to the Point of Interest category, by default TYPE_POI
- police (list of str, optional) – Names of POI categories referring to police stations, by default [‘distritos_policiais’, ‘delegacia’]
- inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns: data with poi or None
Return type: DataFrame
Examples
>>> from pymove.utils.integration import union_poi_police >>> pois_df lat lon id type_poi 0 39.984094 116.319236 1 distritos_policiais 1 39.984198 116.319322 2 police 2 39.984224 116.319402 3 police 3 39.984211 116.319389 4 distritos_policiais 4 39.984217 116.319422 5 random 5 39.984710 116.319865 6 randomvalue 6 39.984674 116.319810 7 123 7 39.984623 116.319773 8 bus_station >>> union_poi_police(pois_df) lat lon id type_poi 0 39.984094 116.319236 1 police 1 39.984198 116.319322 2 police 2 39.984224 116.319402 3 police 3 39.984211 116.319389 4 police 4 39.984217 116.319422 5 random 5 39.984710 116.319865 6 randomvalue 6 39.984674 116.319810 7 123 7 39.984623 116.319773 8 bus_station
pymove.utils.log module¶
Logging operations.
progress_bar, set_verbosity, timer_decorator
-
pymove.utils.log.
progress_bar
(sequence: Iterable, desc: str | None = None, total: int | None = None, miniters: int | None = None)[source]¶ Make and display a progress bar.
Parameters: - sequence (iterable) – Represents a sequence of elements.
- desc (str, optional) – Represents the description of the operation, by default None.
- total (int, optional) – Represents the total/number elements in sequence, by default None.
- miniters (int, optional) – Represents the steps in which the bar will be updated, by default None.
Example
>>> from pymove.utils.log import progress_bar
>>> for i in progress_bar(range(1, 101), desc='Print 1 to 100'):
...     print(i)
# A bar that shows the progress of the iterations
pymove.utils.math module¶
Math operations.
is_number, std, avg_std, std_sample, avg_std_sample, arrays_avg, array_stats, interpolation
-
pymove.utils.math.
array_stats
(values_array: list[float]) → tuple[float, float, int][source]¶ Computes statistics about the array.
The sum of all the elements in the array, the sum of the square of each element and the number of elements of the array.
Parameters: values_array (array like of numerical values) – Represents the set of values to compute the operation.
Returns: - float – The sum of all the elements in the array.
- float – The sum of the square value of each element in the array.
- int – The number of elements in the array.
Example
>>> from pymove.utils.math import array_stats >>> list = [7.8, 9.7, 6.4, 5.6, 10] >>> print(array_stats(list), type(array_stats(list))) (39.5, 327.25, 5) <class 'tuple'>
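The three statistics can be accumulated in a single pass, and together they are enough to recover the population standard deviation. The sketch below is illustrative only; `array_stats_sketch` and `std_from_stats` are hypothetical names, not the library's functions.

```python
import math


def array_stats_sketch(values):
    """Return (sum, sum of squares, count) in one pass over the values."""
    total = 0.0
    total_sq = 0.0
    count = 0
    for v in values:
        total += v
        total_sq += v * v  # plain multiplication instead of v ** 2
        count += 1
    return total, total_sq, count


def std_from_stats(total, total_sq, count):
    """Population standard deviation from the three accumulated statistics."""
    mean = total / count
    # E[x^2] - E[x]^2, clamped at zero to absorb floating-point round-off.
    return math.sqrt(max(total_sq / count - mean * mean, 0.0))


stats = array_stats_sketch([7.8, 9.7, 6.4, 5.6, 10])
spread = std_from_stats(*stats)
```

Keeping only these three numbers is convenient when values arrive in chunks, since the sums of several chunks can simply be added before the final division.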
-
pymove.utils.math.
arrays_avg
(values_array: list[float], weights_array: list[float] | None = None) → float[source]¶ Computes the mean of the elements of the array.
Parameters: - values_array (array like of numerical values.) – Represents the set of values to compute the operation.
- weights_array (array, optional, default None.) – Used to calculate the weighted average, indicates the weight of each element in the array (values_array).
Returns: The mean of the array elements.
Return type: float
Examples
>>> from pymove.utils.math import arrays_avg >>> list = [7.8, 9.7, 6.4, 5.6, 10] >>> weights = [0.1, 0.3, 0.15, 0.15, 0.3] >>> print('standard average', arrays_avg(list), type(arrays_avg(list))) 'standard average 7.9 <class 'float'>' >>> print( >>> 'weighted average: ', >>> arrays_avg(list, weights), >>> type(arrays_avg(list, weights)) >>> ) 'weighted average: 1.6979999999999997 <class 'float'>'
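The weighted result in the example (about 1.698) is consistent with dividing the weighted sum by the number of elements rather than by the sum of the weights: the weighted sum is 8.49, and 8.49 / 5 = 1.698. The sketch below reproduces that arithmetic under this assumption; it is not taken from the library's source, and both function names are hypothetical.

```python
def arrays_avg_sketch(values, weights=None):
    """Weighted sum divided by the element count (assumed behaviour)."""
    n = len(values)
    if weights is None:
        weights = [1] * n  # unweighted case reduces to the plain mean
    if len(weights) != n:
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / n


def normalized_weighted_avg(values, weights):
    """Conventional weighted mean, shown only for comparison."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


values = [7.8, 9.7, 6.4, 5.6, 10]
weights = [0.1, 0.3, 0.15, 0.15, 0.3]
plain = arrays_avg_sketch(values)
by_count = arrays_avg_sketch(values, weights)
by_weight_sum = normalized_weighted_avg(values, weights)
```

When the weights already sum to 1, as here, the conventional weighted mean is about 8.49, so the two conventions differ by a factor equal to the number of elements.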
-
pymove.utils.math.
avg_std
(values_array: list[float]) → tuple[float, float][source]¶ Compute the average and the standard deviation.
Parameters: values_array (array like of numerical values) – Represents the set of values to compute the operation.
Returns: - float – Represents the value of average.
- float – Represents the value of standard deviation.
Example
>>> from pymove.utils.math import avg_std >>> list = [7.8, 9.7, 6.4, 5.6, 10] >>> print(avg_std(list), type(avg_std(list))) (7.9, 1.7435595774162693) <class 'tuple'>
-
pymove.utils.math.
avg_std_sample
(values_array: list[float]) → tuple[float, float][source]¶ Compute the average and the sample standard deviation.
Parameters: values_array (array like of numerical values) – Represents the set of values to compute the operation.
Returns: - float – Represents the value of average.
- float – Represents the standard deviation of sample.
Example
>>> from pymove.utils.math import avg_std_sample >>> list = [7.8, 9.7, 6.4, 5.6, 10] >>> print(avg_std_sample(list), type(avg_std_sample(list))) (7.9, 1.9493588689617927) <class 'tuple'>
-
pymove.utils.math.
interpolation
(x0: float, y0: float, x1: float, y1: float, x: float) → float[source]¶ Performs linear interpolation.
Parameters: - x0 (float.) – The coordinate of the first point on the x axis.
- y0 (float.) – The coordinate of the first point on the y axis.
- x1 (float.) – The coordinate of the second point on the x axis.
- y1 (float.) – The coordinate of the second point on the y axis.
- x (float.) – A value in the interval (x0, x1).
Returns: The interpolated or extrapolated value.
Return type: float.
Example
>>> from pymove.utils.math import interpolation >>> x0, y0, x1, y1, x = 2, 4, 3, 6, 3.5 >>> print(interpolation(x0,y0,x1,y1,x), type(interpolation(x0,y0,x1,y1,x))) 7.0 <class 'float'>
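The example result is what the standard two-point linear formula gives, y = y0 + (y1 - y0) * (x - x0) / (x1 - x0). A minimal sketch, assuming that formula (the name `linear_interpolation` is hypothetical):

```python
def linear_interpolation(x0, y0, x1, y1, x):
    """Value at x on the straight line through (x0, y0) and (x1, y1)."""
    if x1 == x0:
        raise ValueError("x0 and x1 must differ")
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


inside = linear_interpolation(2, 4, 3, 6, 2.5)   # x between x0 and x1
outside = linear_interpolation(2, 4, 3, 6, 3.5)  # x beyond x1: extrapolation
```

With x = 3.5 lying outside the interval (2, 3), the call extrapolates along the same line, which is why the example above returns 7.0.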
-
pymove.utils.math.
is_number
(value: int | float | str)[source]¶ Returns whether value is numerical or not.
Parameters: value (int, float, str) – The value to be checked.
Returns: True if numerical, otherwise False
Return type: boolean
Examples
>>> from pymove.utils.math import is_number >>> a, b, c, d = 50, 22.5, '11.25', 'house' >>> print(is_number(a), type(is_number(a))) True <class 'bool'> >>> print(is_number(b), type(is_number(b))) True <class 'bool'> >>> print(is_number(c), type(is_number(c))) True <class 'bool'> >>> print(is_number(d), type(is_number(d))) False <class 'bool'>
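One common way to obtain this behaviour is to attempt a conversion to `float` and report whether it succeeds. The sketch below assumes that approach; `is_number_sketch` is a hypothetical name and may differ from the library's implementation.

```python
def is_number_sketch(value):
    """Return True if value can be converted to float, otherwise False."""
    try:
        float(value)
    except (TypeError, ValueError):
        return False
    return True


checks = [is_number_sketch(v) for v in (50, 22.5, "11.25", "house")]
```

Note that under this approach strings such as `'nan'` and `'inf'` also count as numerical, because `float` accepts them.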
-
pymove.utils.math.
std
(values_array: list[float]) → float[source]¶ Compute standard deviation.
Parameters: values_array (array like of numerical values) – Represents the set of values to compute the operation.
Returns: Represents the value of standard deviation.
Return type: float
References
squaring with * is over 3 times as fast as with **2 http://stackoverflow.com/questions/29046346/comparison-of-power-to-multiplication-in-python
Example
>>> from pymove.utils.math import std >>> list = [7.8, 9.7, 6.4, 5.6, 10] >>> print(std(list), type(std(list))) 1.7435595774162693 <class 'float'>
-
pymove.utils.math.
std_sample
(values_array: list[float]) → float[source]¶ Compute the standard deviation of sample.
Parameters: values_array (array like of numerical values) – Represents the set of values to compute the operation.
Returns: Represents the value of standard deviation of sample.
Return type: float
Example
>>> from pymove.utils.math import std_sample >>> list = [7.8, 9.7, 6.4, 5.6, 10] >>> print(std_sample(list), type(std_sample(list))) 1.9493588689617927 <class 'float'>
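The two example results for `std` and `std_sample` differ only in the divisor: n for the population value and n - 1 for the sample value, so one can be obtained from the other by the factor sqrt(n / (n - 1)). A sketch of that relationship, with hypothetical function names:

```python
import math


def std_population(values):
    """Standard deviation with divisor n."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) * (v - mean) for v in values) / n)


def std_sample_sketch(values):
    """Standard deviation with divisor n - 1 (Bessel's correction)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    return std_population(values) * math.sqrt(n / (n - 1))


values = [7.8, 9.7, 6.4, 5.6, 10]
population = std_population(values)
sample = std_sample_sketch(values)
```

For the five values above the correction factor is sqrt(5 / 4), which turns roughly 1.7436 into roughly 1.9494, in line with the two documented outputs.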
pymove.utils.mem module¶
Memory operations.
reduce_mem_usage_automatic, total_size, begin_operation, end_operation, sizeof_fmt, top_mem_vars
-
pymove.utils.mem.
begin_operation
(name: str) → dict[source]¶ Gets the stats for the current operation.
Parameters: name (str) – name of the operation
Returns: dictionary with the operation stats
Return type: dict
Examples
>>> from pymove.utils.mem import begin_operation >>> operation = begin_operation('operation') >>> operation { 'process': psutil.Process( pid=103401, name='python', status='running', started='21:48:11' ), 'init': 293732352, 'start': 1622082973.8825781, 'name': 'operation' }
-
pymove.utils.mem.
end_operation
(operation: dict) → dict[source]¶ Gets the time and memory usage of the operation.
Parameters: operation (dict) – dictionary with the beginning stats of the operation
Returns: dictionary with the operation execution stats
Return type: dict
Examples
>>> import numpy as np >>> import time >>> from pymove.utils.mem import begin_operation, end_operation >>> operation = begin_operation('create_arr') >>> arr = np.arange(100000, dtype=np.float64) >>> time.sleep(1.2) >>> end_operation(operation) {'name': 'create_arr', 'time in seconds': 1.2022554874420166, 'memory': '752.0 KiB'}
-
pymove.utils.mem.
reduce_mem_usage_automatic
(df: pandas.core.frame.DataFrame)[source]¶ Reduces the memory usage of the given dataframe.
Parameters: df (DataFrame) – The input data on which the operation will be performed.
Examples
>>> import numpy as np >>> import pandas as pd >>> from pymove.utils.mem import reduce_mem_usage_automatic >>> df = pd.DataFrame({'col_1': np.arange(10000, dtype=np.float64)}) >>> df.dtypes col_1 float64 dtype: object >>> reduce_mem_usage_automatic(df) 'Memory usage of dataframe is 0.08 MB' 'Memory usage after optimization is: 0.02 MB' 'Decreased by 74.9 %' >>> df.dtypes col_1 float16 dtype: object
-
pymove.utils.mem.
sizeof_fmt
(mem_usage: float, suffix: str = 'B') → str[source]¶ Returns the memory usage calculation of the last function.
Parameters: - mem_usage (float) – memory usage in bytes
- suffix (string, optional) – suffix of the unit, by default ‘B’
Returns: A string of the memory usage in a more readable format
Return type: str
Examples
>>> from pymove.utils.mem import sizeof_fmt >>> sizeof_fmt(1024) 1.0 KiB >>> sizeof_fmt(2e6) 1.9 MiB
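The two outputs above follow from repeatedly dividing by 1024 until the value drops below 1024 and then attaching the matching binary prefix. A minimal sketch of that scheme (`sizeof_fmt_sketch` is a hypothetical name; the one-decimal format is inferred from the example):

```python
def sizeof_fmt_sketch(mem_usage, suffix="B"):
    """Format a byte count with binary prefixes (Ki, Mi, Gi, ...)."""
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(mem_usage) < 1024.0:
            return f"{mem_usage:3.1f} {unit}{suffix}"
        mem_usage /= 1024.0
    # Anything still >= 1024 after eight divisions is reported in Yi.
    return f"{mem_usage:.1f} Yi{suffix}"


small = sizeof_fmt_sketch(512)
kib = sizeof_fmt_sketch(1024)
mib = sizeof_fmt_sketch(2e6)
```

Because the step is 1024 rather than 1000, two million bytes come out as 1.9 MiB instead of 2.0 MB.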
-
pymove.utils.mem.
top_mem_vars
(variables: dict, n: int = 10, hide_private=True) → pandas.core.frame.DataFrame[source]¶ Shows the sizes of the active variables.
Parameters: - variables (locals() or globals()) – Whether to show local or global variables
- n (int, optional) – number of variables to show, by default 10
- hide_private (bool, optional) – Whether to hide private variables, by default True
Returns: dataframe with variables names and sizes
Return type: DataFrame
Examples
>>> import numpy as np >>> from pymove.utils.mem import top_mem_vars >>> arr = np.arange(100000, dtype=np.float64) >>> long_string = 'Hello World!' * 100 >>> top_mem_vars(locals()) var mem 0 arr 781.4 KiB 1 long_string 1.2 KiB 2 local 416.0 B 3 top_mem_vars 136.0 B 4 np 72.0 B
-
pymove.utils.mem.
total_size
(o: object, handlers: Optional[dict] = None, verbose: bool = True) → float[source]¶ Calculates the approximate memory footprint of a given object.
Automatically finds the contents of the following builtin containers and their subclasses: tuple, list, deque, dict, set and frozenset.
Parameters: - o (object) – The object whose memory footprint will be calculated.
- handlers (dict, optional) – To search other containers, add handlers to iterate over their contents, e.g. handlers = {SomeContainerClass: iter, OtherContainerClass: OtherContainerClass.get_elements}, by default None
- verbose (boolean, optional) – If set to True, the following information will be printed for each content of the object, by default True
- the size of the object in bytes
- its type
- the object values
Returns: The memory used by the given object
Return type: float
Examples
>>> import numpy as np >>> from pymove.utils.mem import total_size >>> arr = np.arange(10000, dtype=np.float64) >>> sz = total_size(arr) 'Size in bytes: 80104, Type: <class 'numpy.ndarray'>' >>> sz 432
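The container traversal described above can be sketched with `sys.getsizeof` and a table of per-type handlers. This is an illustrative sketch of the general technique, not the library's code: `total_size_sketch` is a hypothetical name, and the verbose printing is left out.

```python
import sys
from collections import deque


def total_size_sketch(obj, handlers=None):
    """Approximate footprint of obj, following known container types."""
    all_handlers = {
        tuple: iter,
        list: iter,
        deque: iter,
        # For dicts, visit both keys and values.
        dict: lambda d: (item for pair in d.items() for item in pair),
        set: iter,
        frozenset: iter,
    }
    all_handlers.update(handlers or {})  # user-supplied container handlers
    seen = set()  # ids already counted, so shared objects count once

    def sizeof(o):
        if id(o) in seen:
            return 0
        seen.add(id(o))
        size = sys.getsizeof(o)
        for container_type, handler in all_handlers.items():
            if isinstance(o, container_type):
                size += sum(sizeof(item) for item in handler(o))
                break
        return size

    return sizeof(obj)


shared = "x" * 100
pair = [shared, shared]
pair_size = total_size_sketch(pair)
```

Tracking object ids means a value referenced twice, like `shared` here, contributes its size only once to the total.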
pymove.utils.trajectories module¶
Data operations.
read_csv, invert_dict, flatten_dict, flatten_columns, shift, fill_list_with_new_values, object_for_array, column_to_array
-
pymove.utils.trajectories.
column_to_array
(data: pandas.core.frame.DataFrame, column: str) → pandas.core.frame.DataFrame[source]¶ Transforms all values of a column into lists.
Parameters: - data (dataframe) – The input trajectory data
- column (str) – Label of data referring to the column for conversion
Returns: Dataframe with the selected column converted to list
Return type: dataframe
Example
>>> from pymove.utils.trajectories import column_to_array >>> move_df lat lon datetime id list_column 0 39.984094 116.319236 2008-10-23 05:53:05 1 '[1,2]' 1 39.984198 116.319322 2008-10-23 05:53:06 1 '[3,4]' 2 39.984224 116.319402 2008-10-23 05:53:11 1 '[5,6]' 3 39.984211 116.319389 2008-10-23 05:53:16 1 '[7,8]' 4 39.984217 116.319422 2008-10-23 05:53:21 1 '[9,10]' >>> column_to_array(move_df, column='list_column') lat lon datetime id list_column 0 39.984094 116.319236 2008-10-23 05:53:05 1 [1.0,2.0] 1 39.984198 116.319322 2008-10-23 05:53:06 1 [3.0,4.0] 2 39.984224 116.319402 2008-10-23 05:53:11 1 [5.0,6.0] 3 39.984211 116.319389 2008-10-23 05:53:16 1 [7.0,8.0] 4 39.984217 116.319422 2008-10-23 05:53:21 1 [9.0,10.0]
-
pymove.utils.trajectories.
fill_list_with_new_values
(original_list: list, new_list_values: list)[source]¶ Copies elements from one list to another.
The elements will be positioned in the same position in the new list as they were in their original list.
Parameters: - original_list (list.) – The list to which the elements will be copied
- new_list_values (list.) – The list from which elements will be copied
Example
>>> from pymove.utils.trajectories import fill_list_with_new_values >>> lst = [1, 2, 3, 4] >>> fill_list_with_new_values(lst, ['a','b']) >>> print(lst) ['a', 'b', 3, 4]
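The in-place, position-preserving copy can be sketched as a simple indexed assignment. The helper name `fill_list_sketch` is hypothetical, and the sketch assumes the new list is not longer than the original one.

```python
def fill_list_sketch(original_list, new_list_values):
    """Overwrite the leading positions of original_list in place."""
    for index, value in enumerate(new_list_values):
        # Raises IndexError if new_list_values is longer than original_list.
        original_list[index] = value


lst = [1, 2, 3, 4]
fill_list_sketch(lst, ["a", "b"])
```

Nothing is returned: the caller's list is modified directly, and the elements beyond the length of the new list keep their previous values.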
-
pymove.utils.trajectories.
flatten_columns
(data: pandas.core.frame.DataFrame, columns: list) → pandas.core.frame.DataFrame[source]¶ Transforms columns containing dictionaries into individual columns.
Parameters: - data (DataFrame) – Dataframe with columns to be flattened
- columns (list) – List of columns from dataframe containing dictionaries
Returns: Dataframe with the new columns from the flattened dictionary columns
Return type: dataframe
References
https://stackoverflow.com/questions/51698540/import-nested-mongodb-to-pandas
Examples
>>> from pymove.utils.trajectories import flatten_columns >>> move_df lat lon datetime id dict_column 0 39.984094 116.319236 2008-10-23 05:53:05 1 {'a': 1} 1 39.984198 116.319322 2008-10-23 05:53:06 1 {'b': 2} 2 39.984224 116.319402 2008-10-23 05:53:11 1 {'c': 3, 'a': 4} 3 39.984211 116.319389 2008-10-23 05:53:16 1 {'b': 2} 4 39.984217 116.319422 2008-10-23 05:53:21 1 {'a': 3, 'c': 2} >>> flatten_columns(move_df, columns='dict_column') lat lon datetime id dict_column_b dict_column_c dict_column_a 0 39.984094 116.319236 2008-10-23 05:53:05 1 NaN NaN 1.0 1 39.984198 116.319322 2008-10-23 05:53:06 1 2.0 NaN NaN 2 39.984224 116.319402 2008-10-23 05:53:11 1 NaN 3.0 4.0 3 39.984211 116.319389 2008-10-23 05:53:16 1 2.0 NaN NaN 4 39.984217 116.319422 2008-10-23 05:53:21 1 NaN 2.0 3.0
-
pymove.utils.trajectories.
flatten_dict
(d: dict, parent_key: str = '', sep: str = '_') → dict[source]¶ Flattens a nested dictionary.
Parameters: - d (dict) – Dictionary to be flattened
- parent_key (str, optional) – Key of the parent dictionary, by default ‘’
- sep (str, optional) – Separator for the parent and child keys, by default ‘_’
Returns: Flattened dictionary
Return type: dict
References
https://stackoverflow.com/questions/6027558/flatten-nested-dictionaries-compressing-keys
Examples
>>> from pymove.utils.trajectories import flatten_dict >>> d = {'a': 1, 'b': {'c': 2, 'd': 3}} >>> flatten_dict(d) {'a': 1, 'b_c': 2, 'b_d': 3}
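A recursive sketch of the flattening, in which each nested key is prefixed with its parent key and the separator. The name `flatten_dict_sketch` is hypothetical and the code is only an illustration of the technique.

```python
def flatten_dict_sketch(d, parent_key="", sep="_"):
    """Flatten nested dictionaries, joining key paths with sep."""
    items = {}
    for key, value in d.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse, carrying the accumulated key path as the new parent.
            items.update(flatten_dict_sketch(value, new_key, sep))
        else:
            items[new_key] = value
    return items


flat = flatten_dict_sketch({"a": 1, "b": {"c": 2, "d": {"e": 3}}})
```

Deeper levels simply extend the key path, so the value 3 above ends up under `b_d_e`.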
-
pymove.utils.trajectories.
invert_dict
(d: dict) → dict[source]¶ Inverts the key:value relation of a dictionary.
Parameters: d (dict) – dictionary to be inverted
Returns: inverted dict
Return type: dict
Examples
>>> from pymove.utils.trajectories import invert_dict >>> traj_dict = {'a': 1, 'b': 2} >>> invert_dict(traj_dict) {1: 'a', 2: 'b'}
-
pymove.utils.trajectories.
object_for_array
(object_: str) → numpy.ndarray[source]¶ Transforms an object into an array.
Parameters: object_ (str) – object representing a list of integers or strings
Returns: object converted to a list
Return type: array
Examples
>>> from pymove.utils.trajectories import object_for_array >>> list_str = '[1,2,3,4,5]' >>> object_for_array(list_str) array([1., 2., 3., 4., 5.], dtype=float32)
-
pymove.utils.trajectories.
read_csv
(filepath_or_buffer: Union[PathLike[str], str, IO[AnyStr], io.RawIOBase, io.BufferedIOBase, io.TextIOBase, _io.TextIOWrapper, mmap.mmap], latitude: str = 'lat', longitude: str = 'lon', datetime: str = 'datetime', traj_id: str = 'id', type_: str = 'pandas', n_partitions: int = 1, **kwargs)[source]¶ Reads a csv file and structures the data.
Parameters: - filepath_or_buffer (str or path object or file-like object) – Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.csv. If you want to pass in a path object, pandas accepts any os.PathLike. By file-like object, we refer to objects with a read() method, such as a file handle (e.g. via builtin open function) or StringIO.
- latitude (str, optional) – Represents the column name of feature latitude, by default ‘lat’
- longitude (str, optional) – Represents the column name of feature longitude, by default ‘lon’
- datetime (str, optional) – Represents the column name of feature datetime, by default ‘datetime’
- traj_id (str, optional) – Represents the column name of feature id trajectory, by default ‘id’
- type_ (str, optional) – Represents the type of the MoveDataFrame, by default ‘pandas’
- n_partitions (int, optional) – Represents number of partitions for DaskMoveDataFrame, by default 1
- **kwargs (Pandas read_csv arguments) – https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html?highlight=read_csv#pandas.read_csv
Returns: Trajectory data
Return type: MoveDataFrameAbstract subclass
Examples
>>> from pymove.utils.trajectories import read_csv >>> move_df = read_csv('geolife_sample.csv') >>> move_df.head() lat lon datetime id 0 39.984094 116.319236 2008-10-23 05:53:05 1 1 39.984198 116.319322 2008-10-23 05:53:06 1 2 39.984224 116.319402 2008-10-23 05:53:11 1 3 39.984211 116.319389 2008-10-23 05:53:16 1 4 39.984217 116.319422 2008-10-23 05:53:21 1 >>> type(move_df) <class 'pymove.core.pandas.PandasMoveDataFrame'>
-
pymove.utils.trajectories.
shift
(arr: list | Series | ndarray, num: int, fill_value: Any | None = None) → ndarray[source]¶ Shifts the elements of the given array by the number of periods specified.
Parameters: - arr (array) – The array to be shifted
- num (int) – Number of periods to shift. Can be positive or negative. If positive, the elements will be pulled down, and pulled up otherwise.
- fill_value (float, optional) – The scalar value used for newly introduced missing values, by default np.nan
Returns: A new array with the same shape and type as the initial given array, but with the indexes shifted.
Return type: array
Notes
Similar to pandas shift, but faster.
References
https://stackoverflow.com/questions/30399534/shift-elements-in-a-numpy-array
Examples
>>> from pymove.utils.trajectories import shift >>> array = [1, 2, 3, 4, 5, 6, 7] >>> shift(array, 1) [0 1 2 3 4 5 6] >>> shift(array, 0) [1, 2, 3, 4, 5, 6, 7] >>> shift(array, -1) [2 3 4 5 6 7 0]
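The shifting semantics can be illustrated on plain lists. This sketch does not use NumPy and is not the library's implementation; `shift_sketch` is a hypothetical name, and the fill value is passed explicitly instead of being derived from the array type.

```python
def shift_sketch(arr, num, fill_value=None):
    """Return a new list with elements moved by num positions."""
    n = len(arr)
    if num == 0:
        return list(arr)
    if abs(num) >= n:
        return [fill_value] * n  # everything is shifted out
    if num > 0:
        # Positive shift: pad the front, drop the last num elements.
        return [fill_value] * num + list(arr[:-num])
    # Negative shift: drop the first -num elements, pad the end.
    return list(arr[-num:]) + [fill_value] * (-num)


array = [1, 2, 3, 4, 5, 6, 7]
down = shift_sketch(array, 1, fill_value=0)
up = shift_sketch(array, -1, fill_value=0)
```

The length is always preserved: elements pushed past either end are discarded and the vacated positions receive the fill value.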
pymove.utils.visual module¶
Visualization auxiliary operations.
add_map_legend, generate_color, rgb, hex_rgb, cmap_hex_color, get_cmap, save_wkt
-
pymove.utils.visual.
add_map_legend
(m: Map, title: str, items: tuple | Sequence[tuple])[source]¶ Adds a legend for a folium map.
Parameters: - m (Map) – Represents a folium map.
- title (str) – Represents the title of the legend
- items (list of tuple) – Represents the color and name of the legend items
References
https://github.com/python-visualization/folium/issues/528#issuecomment-421445303
Examples
>>> import folium >>> from pymove.utils.visual import add_map_legend >>> df lat lon datetime id 0 39.984094 116.319236 2008-10-23 05:53:05 1 1 39.984198 116.319322 2008-10-23 05:53:06 1 2 39.984224 116.319402 2008-10-23 05:53:11 1 3 39.984211 116.319389 2008-10-23 05:53:16 2 4 39.984217 116.319422 2008-10-23 05:53:21 2 >>> m = folium.Map(location=[df.lat.median(), df.lon.median()]) >>> folium.PolyLine(df[['lat', 'lon']], color='red').add_to(m) >>> add_map_legend(m, 'Color by ID', [(1, 'red')]) >>> m.get_root().to_dict() { "name": "Figure", "id": "1d32230cd6c54b19b35ceaa864e61168", "children": { "map_6f1abc8eacee41e8aa9d163e6bbb295f": { "name": "Map", "id": "6f1abc8eacee41e8aa9d163e6bbb295f", "children": { "openstreetmap": { "name": "TileLayer", "id": "f58c3659fea348cb828775f223e1e6a4", "children": {} }, "poly_line_75023fd7df01475ea5e5606ddd7f4dd2": { "name": "PolyLine", "id": "75023fd7df01475ea5e5606ddd7f4dd2", "children": {} } } }, "map_legend": { # legend element "name": "MacroElement", "id": "72911b4418a94358ba8790aab93573d1", "children": {} } }, "header": { "name": "Element", "id": "e46930fc4152431090b112424b5beb6a", "children": { "meta_http": { "name": "Element", "id": "868e20baf5744e82baf8f13a06849ecc", "children": {} } } }, "html": { "name": "Element", "id": "9c4da9e0aac349f594e2d23298bac171", "children": {} }, "script": { "name": "Element", "id": "d092078607c04076bf58bd4593fa1684", "children": {} } }
-
pymove.utils.visual.
cmap_hex_color
(cmap: matplotlib.colors.ListedColormap, i: int) → str[source]¶ Convert a Colormap to hex color.
Parameters: - cmap (ListedColormap) – Represents the Colormap
- i (int) – List color index
Returns: Represents corresponding hex str
Return type: str
Examples
>>> from pymove.utils.visual import cmap_hex_color
>>> import matplotlib.pyplot as plt
>>> jet = plt.get_cmap('jet')  # This command generates a Linear Segmented Colormap
>>> print(cmap_hex_color(jet, 0))
'#000080'
>>> print(cmap_hex_color(jet, 1))
'#000084'
-
pymove.utils.visual.
generate_color
() → str[source]¶ Generates a random color.
Returns: Random HEX color
Return type: str
Examples
>>> from pymove.utils.visual import generate_color
>>> print(generate_color(), type(generate_color()))
'#E0FFFF' <class 'str'>
>>> print(generate_color(), type(generate_color()))
'#808000' <class 'str'>
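For comparison, a random hex color can be produced with the standard library alone. This is only a sketch: random_hex_color is a hypothetical helper, not PyMove's implementation, and generate_color may well draw from a fixed palette instead (both sample outputs above are named CSS colors).

```python
import random


def random_hex_color(rng=random):
    """Hypothetical helper: return a uniformly random '#RRGGBB' string."""
    # randrange(0x1000000) covers every 24-bit color value, 0x000000..0xFFFFFF.
    return f'#{rng.randrange(0x1000000):06X}'


# A seeded generator makes the choice reproducible across runs.
color = random_hex_color(random.Random(42))
print(color, type(color))
```

Passing a seeded random.Random instance is a design choice for testability; calling random_hex_color() with no argument uses the module-level generator.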
-
pymove.utils.visual.
get_cmap
(cmap: str) → matplotlib.colors.Colormap[source]¶ Returns a matplotlib colormap instance.
Parameters: cmap (str) – Name of the colormap
Returns: matplotlib colormap
Return type: Colormap
Examples
>>> from pymove.utils.visual import get_cmap
>>> print(get_cmap('Greys'))
<matplotlib.colors.LinearSegmentedColormap object at 0x7f743fc04bb0>
-
pymove.utils.visual.
hex_rgb
(rgb_colors: tuple[float, float, float]) → str[source]¶ Return a hex str, as used in Tk plots.
Parameters: rgb_colors (tuple) – Represents a tuple with three positions that correspond to the percentages of the red, green and blue colors
Returns: Represents a color in hexadecimal format
Return type: str
Examples
>>> from pymove.utils.visual import hex_rgb
>>> print(hex_rgb((0.1, 0.2, 0.7)), type(hex_rgb((0.1, 0.2, 0.7))))
'#33B219' <class 'str'>
>>> print(hex_rgb((0.5, 0.4, 0.1)), type(hex_rgb((0.5, 0.4, 0.1))))
'#66197F' <class 'str'>
-
pymove.utils.visual.
randint
(low, high=None, size=None, dtype=int)¶ Return random integers from low (inclusive) to high (exclusive).
Return random integers from the “discrete uniform” distribution of the specified dtype in the “half-open” interval [low, high). If high is None (the default), then results are from [0, low).
Note
New code should use the integers method of a default_rng() instance instead; please see the random-quick-start.
Parameters: - low (int or array-like of ints) – Lowest (signed) integers to be drawn from the distribution (unless high=None, in which case this parameter is one above the highest such integer).
- high (int or array-like of ints, optional) – If provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if high=None). If array-like, must contain integer values.
- size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
- dtype (dtype, optional) – Desired dtype of the result. Byteorder must be native. The default value is int. New in version 1.11.0.
Returns: out – size-shaped array of random integers from the appropriate distribution, or a single such random int if size not provided.
Return type: int or ndarray of ints
See also
random_integers() - similar to randint, only for the closed interval [low, high], and 1 is the lowest value if high is omitted.
Generator.integers() - which should be used for new code.
Examples
>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0]) # random
>>> np.random.randint(1, size=10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Generate a 2 x 4 array of ints between 0 and 4, inclusive:
>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1], # random
       [3, 2, 2, 0]])
Generate a 1 x 3 array with 3 different upper bounds
>>> np.random.randint(1, [3, 5, 10])
array([2, 2, 9]) # random
Generate a 1 by 3 array with 3 different lower bounds
>>> np.random.randint([1, 5, 7], 10)
array([9, 8, 7]) # random
Generate a 2 by 4 array using broadcasting with dtype of uint8
>>> np.random.randint([1, 3, 5, 7], [[10], [20]], dtype=np.uint8)
array([[ 8,  6,  9,  7], # random
       [ 1, 16,  9, 12]], dtype=uint8)
-
pymove.utils.visual.
rgb
(rgb_colors: tuple[float, float, float]) → tuple[int, int, int][source]¶ Return a tuple of integers, as used in AWT/Java plots.
Parameters: rgb_colors (tuple) – Represents a tuple with three positions that correspond to the percentages of the red, green and blue colors.
Returns: Represents a tuple of integers that correspond to the color values.
Return type: tuple
Examples
>>> from pymove.utils.visual import rgb
>>> print(rgb((0.1, 0.2, 0.7)), type(rgb((0.1, 0.2, 0.7))))
(51, 178, 25) <class 'tuple'>
>>> print(rgb((0.5, 0.4, 0.1)), type(rgb((0.5, 0.4, 0.1))))
(102, 25, 127) <class 'tuple'>
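The documented outputs of rgb and hex_rgb can be reproduced with a few lines of plain Python. The sketch below is an assumption-laden reconstruction, not the library source: scale_channels and to_hex are hypothetical names, and the channel ordering (the first input component ends up last in the result) is inferred purely from the example outputs shown in this module.

```python
def scale_channels(rgb_colors):
    """Hypothetical stand-in for rgb(): scale fractions in [0, 1] to 0-255 ints."""
    first, second, third = rgb_colors
    # Ordering inferred from the documented examples, e.g.
    # (0.1, 0.2, 0.7) -> (51, 178, 25): second, third, then first.
    # int() truncates toward zero, so 0.7 * 255 (about 178.5) becomes 178.
    return int(second * 255), int(third * 255), int(first * 255)


def to_hex(rgb_colors):
    """Hypothetical stand-in for hex_rgb(): format the scaled channels as '#RRGGBB'."""
    return '#%02X%02X%02X' % scale_channels(rgb_colors)


print(scale_channels((0.1, 0.2, 0.7)))  # (51, 178, 25)
print(to_hex((0.1, 0.2, 0.7)))          # #33B219
```

If the input is meant to be read as (red, green, blue), the reordering implied by the documented outputs is worth keeping in mind when passing colors to other plotting libraries.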
-
pymove.utils.visual.
save_wkt
(move_data: pandas.core.frame.DataFrame, filename: str, label_id: str = 'id')[source]¶ Save a map visualization to a new .wkt file.
Parameters: - move_data (DataFrame) – Input trajectory data
- filename (str) – Represents the filename
- label_id (str) – Represents column name of trajectory id
Returns: A .wkt file that contains the geometric points that build a map visualization
Return type: File
Examples
>>> from pymove.utils.visual import save_wkt
>>> df.head()
         lat         lon            datetime  id
0  39.984094  116.319236 2008-10-23 05:53:05   1
1  39.984198  116.319322 2008-10-23 05:53:06   1
2  39.984224  116.319402 2008-10-23 05:53:11   1
3  39.984211  116.319389 2008-10-23 05:53:16   2
4  39.984217  116.319422 2008-10-23 05:53:21   2
>>> save_wkt(df, 'test.wkt', 'id')
>>> with open('test.wkt') as f:
...     print(f.read())
'id;linestring'
'1;LINESTRING(116.319236 39.984094,116.319322 39.984198,116.319402 39.984224)'
'2;LINESTRING(116.319389 39.984211,116.319422 39.984217)'
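The file layout shown above (one LINESTRING per trajectory id, with longitude written before latitude) can be sketched without pandas. The helper name trajectories_to_wkt_lines and the list-of-dicts input are assumptions for illustration; this is not how save_wkt is necessarily implemented.

```python
from collections import defaultdict


def trajectories_to_wkt_lines(rows, label_id='id'):
    """Hypothetical helper: group points by id and emit one LINESTRING per id."""
    points = defaultdict(list)
    for row in rows:
        # WKT writes x (longitude) before y (latitude).
        points[row[label_id]].append(f"{row['lon']} {row['lat']}")
    # Header line copied from the documented output.
    lines = ['id;linestring']
    for traj_id, coords in points.items():
        lines.append(f"{traj_id};LINESTRING({','.join(coords)})")
    return lines


rows = [
    {'lat': 39.984094, 'lon': 116.319236, 'id': 1},
    {'lat': 39.984198, 'lon': 116.319322, 'id': 1},
    {'lat': 39.984224, 'lon': 116.319402, 'id': 1},
    {'lat': 39.984211, 'lon': 116.319389, 'id': 2},
    {'lat': 39.984217, 'lon': 116.319422, 'id': 2},
]
wkt_lines = trajectories_to_wkt_lines(rows)
print('\n'.join(wkt_lines))
```

Points keep their input order within each trajectory, so rows are assumed to be sorted by time before grouping.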
Module contents¶
Contains utility functions.
constants, conversions, data_augmentation, datetime, distances, geoutils, integration, log, math, mem, trajectories, visual