pymove.utils package

Submodules

pymove.utils.constants module

PyMove constants.

pymove.utils.conversions module

Unit conversion operations.

lat_meters, meters_to_eps, list_to_str, list_to_csv_str, list_to_svm_line, lon_to_x_spherical, lat_to_y_spherical, x_to_lon_spherical, y_to_lat_spherical, geometry_points_to_lat_and_lon, lat_and_lon_decimal_degrees_to_decimal, ms_to_kmh, kmh_to_ms, meters_to_kilometers, kilometers_to_meters, seconds_to_minutes, minute_to_seconds, minute_to_hours, hours_to_minute, seconds_to_hours, hours_to_seconds

pymove.utils.conversions.geometry_points_to_lat_and_lon(move_data: pandas.core.frame.DataFrame, geometry_label: str = 'geometry', drop_geometry: bool = False, inplace: bool = False) → pandas.core.frame.DataFrame[source]

Creates lat and lon columns from Points in geometry column.

Removes geometries that are not of the Point type.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • geometry_label (str, optional) – Name of the geometry column, by default GEOMETRY
  • drop_geometry (bool, optional) – Option to drop the geometry column, by default False
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import geometry_points_to_lat_and_lon
>>> geom_points_df
    id                     geometry
0    1   POINT (116.36184 39.77529)
1    2   POINT (116.36298 39.77564)
2    3   POINT (116.33767 39.83148)
>>> geometry_points_to_lat_and_lon(geom_points_df)
    id                     geometry        lon       lat
0    1   POINT (116.36184 39.77529)  116.36184  39.77529
1    2   POINT (116.36298 39.77564)  116.36298  39.77564
2    3   POINT (116.33767 39.83148)  116.33767  39.83148
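
For intuition, the same idea can be sketched without PyMove. The snippet below is a minimal illustration, assuming the geometries are available as WKT strings rather than geometry objects; wkt_points_to_lat_lon is a hypothetical helper, not part of the library. It drops rows that are not Points and reads x as longitude and y as latitude, as in the output above.

```python
import re

# Matches WKT such as "POINT (116.36184 39.77529)".
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$"
)


def wkt_points_to_lat_lon(rows):
    """Hypothetical helper: rows is a list of dicts with a 'geometry' WKT string."""
    out = []
    for row in rows:
        match = _POINT_RE.match(row["geometry"])
        if match is None:
            continue  # drop geometries that are not Points
        lon, lat = float(match.group(1)), float(match.group(2))
        out.append({**row, "lon": lon, "lat": lat})
    return out


rows = [
    {"id": 1, "geometry": "POINT (116.36184 39.77529)"},
    {"id": 2, "geometry": "LINESTRING (0 0, 1 1)"},
]
converted = wkt_points_to_lat_lon(rows)
print(converted)
```
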
pymove.utils.conversions.hours_to_minute(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None[source]

Convert values, in hours, in label_time column to minutes.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • label_time (str, optional) – Name of the time column, by default TIME_TO_PREV
  • new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import hours_to_minute
>>> geo_life_df
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153        0.000278        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788        0.001389         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083        0.001389         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671        0.001389         0.577934
>>> hours_to_minute(geo_life_df, inplace=False)
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153        0.016667        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788        0.083333         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083        0.083333         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671        0.083333         0.577934
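
The arithmetic behind the time conversions is a plain scaling of the column. The sketch below is a minimal illustration on Python lists, not the library's implementation; hours_to_minutes and hours_to_seconds are hypothetical stand-ins. It assumes missing values are NaN, which pass through multiplication unchanged, as in the first row of the tables above.

```python
import math

SECONDS_PER_MINUTE = 60
MINUTES_PER_HOUR = 60


def hours_to_minutes(hours):
    """Scale hour values to minutes; NaN entries stay NaN."""
    return [h * MINUTES_PER_HOUR for h in hours]


def hours_to_seconds(hours):
    """Scale hour values to seconds; NaN entries stay NaN."""
    return [h * MINUTES_PER_HOUR * SECONDS_PER_MINUTE for h in hours]


# 1 s and 5 s expressed in hours, preceded by a missing value.
time_to_prev_h = [math.nan, 1 / 3600, 5 / 3600]
minutes = hours_to_minutes(time_to_prev_h)
seconds = hours_to_seconds(time_to_prev_h)
print([round(m, 6) for m in minutes[1:]])
print([round(s, 6) for s in seconds[1:]])
```
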
pymove.utils.conversions.hours_to_seconds(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None[source]

Convert values, in hours, in label_time column to seconds.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • label_time (str, optional) – Name of the time column, by default TIME_TO_PREV
  • new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import hours_to_seconds
>>> geo_life_df
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153        0.000278        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788        0.001389         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083        0.001389         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671        0.001389         0.577934
>>> hours_to_seconds(geo_life_df, inplace=False)
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153             1.0        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788             5.0         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083             5.0         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671             5.0         0.577934
pymove.utils.conversions.kilometers_to_meters(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_distance: str = 'dist_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None[source]

Convert values, in kilometers, in label_distance column to meters.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • label_distance (str, optional) – Name of the distance column, by default DIST_TO_PREV
  • new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import kilometers_to_meters
>>> geo_life_df
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06             0.013690             1.0        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             0.007404             5.0         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             0.001821             5.0         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             0.002890             5.0         0.577934
>>> kilometers_to_meters(geo_life_df, inplace=False)
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153             1.0        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788             5.0         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083             5.0         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671             5.0         0.577934
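
The distance conversions reduce to multiplying or dividing by 1000. The following sketch shows that on plain lists; km_to_m and m_to_km are hypothetical helpers used only for illustration. The input values are the rounded figures displayed in the table above, which is presumably why the library output shows more digits than this sketch would produce.

```python
METERS_PER_KILOMETER = 1000.0


def km_to_m(values_km):
    """Convert kilometer values to meters."""
    return [v * METERS_PER_KILOMETER for v in values_km]


def m_to_km(values_m):
    """Convert meter values to kilometers."""
    return [v / METERS_PER_KILOMETER for v in values_m]


dist_km = [0.013690, 0.007404, 0.001821, 0.002890]
dist_m = km_to_m(dist_km)
round_trip = m_to_km(dist_m)
print(dist_m)
```
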
pymove.utils.conversions.kmh_to_ms(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_speed: str = 'speed_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None[source]

Convert values, in km/h, in label_speed column to m/s.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • label_speed (str, optional) – Represents column name of speed, by default SPEED_TO_PREV
  • new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import kmh_to_ms
>>> geo_life_df
   id         lat          lon              datetime          dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153             1.0        49.284551
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788             5.0         5.330727
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083             5.0         1.311180
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671             5.0         2.080563
>>> kmh_to_ms(geo_life_df, inplace=False)
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153             1.0        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788             5.0         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083             5.0         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671             5.0         0.577934
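
The speed conversion uses the factor 3.6, since 1 m/s equals 3600 m/h, or 3.6 km/h. A minimal sketch on scalar values follows; kmh_to_ms_value and ms_to_kmh_value are hypothetical names chosen to avoid confusion with the library functions, and the inputs are the speed_to_prev values from the table above.

```python
KMH_PER_MS = 3.6  # 1 m/s = 3600 m/h = 3.6 km/h


def kmh_to_ms_value(speed_kmh):
    """Convert a single speed from km/h to m/s."""
    return speed_kmh / KMH_PER_MS


def ms_to_kmh_value(speed_ms):
    """Convert a single speed from m/s to km/h."""
    return speed_ms * KMH_PER_MS


speeds_kmh = [49.284551, 5.330727, 1.311180, 2.080563]
speeds_ms = [kmh_to_ms_value(s) for s in speeds_kmh]
print([round(s, 6) for s in speeds_ms])
```
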
pymove.utils.conversions.lat_and_lon_decimal_degrees_to_decimal(move_data: pandas.core.frame.DataFrame, latitude: str = 'lat', longitude: str = 'lon') → pandas.core.frame.DataFrame[source]

Converts latitude and longitude from hemisphere-suffixed decimal degrees (e.g. 28.0N, 94.8W) to signed decimal values.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • latitude (str, optional) – Represents column name of the latitude column, by default LATITUDE
  • longitude (str, optional) – Represents column name of the longitude column, by default LONGITUDE
Returns:

A new dataframe with the converted feature

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import lat_and_lon_decimal_degrees_to_decimal
>>> lat_and_lon_df
   id     lat     lon
0   0   28.0N   94.8W
1   1   41.3N   50.4W
2   1   40.8N   47.5W
>>> lat_and_lon_decimal_degrees_to_decimal(lat_and_lon_df)
   id    lat      lon
0   0   28.0    -94.8
1   1   41.3    -50.4
2   1   40.8    -47.5
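
The conversion shown above amounts to stripping the hemisphere letter and applying a sign: south and west become negative. The sketch below illustrates this for a single string; hemisphere_to_signed is a hypothetical helper, and the exact input formats accepted by the library are not confirmed here.

```python
def hemisphere_to_signed(value):
    """Hypothetical helper: '94.8W' -> -94.8, '28.0N' -> 28.0."""
    text = value.strip().upper()
    magnitude, hemisphere = float(text[:-1]), text[-1]
    if hemisphere not in "NSEW":
        raise ValueError(f"unknown hemisphere suffix in {value!r}")
    # South latitudes and west longitudes are negative.
    sign = -1.0 if hemisphere in "SW" else 1.0
    return sign * magnitude


lats = [hemisphere_to_signed(v) for v in ["28.0N", "41.3N", "40.8N"]]
lons = [hemisphere_to_signed(v) for v in ["94.8W", "50.4W", "47.5W"]]
print(lats, lons)
```
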
pymove.utils.conversions.lat_meters(lat: float) → float[source]

Transform latitude degree to meters.

Parameters:lat (float) – Latitude value, in degrees.
Returns:Represents the corresponding latitude value in meters.
Return type:float

Examples

Latitude in Fortaleza: -3.71839

>>> from pymove.utils.conversions import lat_meters
>>> lat_meters(-3.71839)
110832.75545918777

pymove.utils.conversions.lat_to_y_spherical(lat: float | ndarray) → float | ndarray[source]

Convert latitude to Y EPSG:3857 WGS 84/Pseudo-Mercator.

Parameters:lat (float) – Represents latitude.
Returns:Y offset from your original position in meters.
Return type:float

Examples

>>> from pymove.utils.conversions import lat_to_y_spherical
>>> lat_fortaleza = -3.71839
>>> y_for = lat_to_y_spherical(lat_fortaleza)
>>> print(y_for, type(y_for))
-414220.15015607665 <class 'numpy.float64'>

References

https://epsg.io/transform
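
For reference, the standard spherical Mercator northing used by EPSG:3857 is y = R · ln(tan(π/4 + φ/2)), with φ the latitude in radians and R = 6378137 m. The sketch below implements that formula with the standard library only; lat_to_y is a hypothetical name, and this is an illustration of the projection rather than a copy of the library code.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used as sphere radius in EPSG:3857


def lat_to_y(lat_deg):
    """Spherical Mercator northing in meters for a latitude in degrees."""
    lat_rad = math.radians(lat_deg)
    return EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat_rad / 2))


y_fortaleza = lat_to_y(-3.71839)
print(y_fortaleza)
```

The result agrees with the value in the example above to well under a millimeter.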

pymove.utils.conversions.list_to_csv_str(input_list: list) → str[source]

Concatenates the elements of the list, joining them by “,”.

Parameters:input_list (list) – List with elements to be joined.
Returns:Returns a string, resulting from concatenation of list elements, separated by “,”.
Return type:str

Example

>>> from pymove.utils.conversions import list_to_csv_str
>>> list = [1,2,3,4,5]
>>> print(list_to_csv_str(list), type(list_to_csv_str(list)))
1,2,3,4,5 <class 'str'>
pymove.utils.conversions.list_to_str(input_list: list, delimiter: str = ', ') → str[source]

Concatenates the elements of a list, joining them with the given delimiter.

Parameters:
  • input_list (list) – List with elements to be joined.
  • delimiter (str, optional) – The separator used between elements, by default ‘,’.
Returns:

Returns a string, resulting from concatenation of list elements, separated by the delimiter.

Return type:

str

Example

>>> from pymove.utils.conversions import list_to_str
>>> list = [1,2,3,4,5]
>>> print(list_to_str(list, 'x'), type(list_to_str(list)))
1x2x3x4x5 <class 'str'>
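
Both list_to_str and list_to_csv_str behave like a string join over the stringified elements. A minimal sketch of that behavior, using a hypothetical join_items helper, is:

```python
def join_items(items, delimiter=","):
    """Join the string form of each item with the given delimiter."""
    return delimiter.join(str(item) for item in items)


values = [1, 2, 3, 4, 5]
csv_line = join_items(values)
x_line = join_items(values, "x")
print(csv_line, x_line)
```
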
pymove.utils.conversions.list_to_svm_line(original_list: list) → str[source]

Concatenates list elements in consecutive element pairs.

Parameters:original_list (list) – The elements to be joined
Returns:Returns a string, resulting from concatenation of list elements in consecutive element pairs, separated by “ ”.
Return type:str

Example

>>> from pymove.utils.conversions import list_to_svm_line
>>> list = [1,2,3,4,5]
>>> print(list_to_svm_line(list), type(list_to_svm_line(list)))
1 1:2 2:3 3:4 4:5 <class 'str'>
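
The output above resembles a LIBSVM-style line: the first element acts as the label and the remaining elements become index:value pairs with 1-based indices. The sketch below reproduces the documented output under that reading; to_svm_line is a hypothetical helper, and the interpretation is inferred from the example rather than from the library source.

```python
def to_svm_line(items):
    """First item is the label; remaining items become 1-based index:value pairs."""
    label, features = items[0], items[1:]
    pairs = [f"{index}:{value}" for index, value in enumerate(features, start=1)]
    return " ".join([str(label), *pairs])


line = to_svm_line([1, 2, 3, 4, 5])
print(line)
```
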
pymove.utils.conversions.lon_to_x_spherical(lon: float | ndarray) → float | ndarray[source]

Convert longitude to X EPSG:3857 WGS 84/Pseudo-Mercator.

Parameters:lon (float) – Represents longitude.
Returns:X offset from your original position in meters.
Return type:float

Examples

>>> from pymove.utils.conversions import lon_to_x_spherical
>>> lon_fortaleza = -38.5434
>>> x_for = lon_to_x_spherical(lon_fortaleza)
>>> print(x_for, type(x_for))
-4290631.66144146 <class 'numpy.float64'>

References

https://epsg.io/transform
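
In spherical Mercator the easting is linear in longitude: x = R · λ, with λ the longitude in radians and R = 6378137 m. A short standard-library sketch follows; lon_to_x is a hypothetical name used for illustration only.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def lon_to_x(lon_deg):
    """Spherical Mercator easting in meters for a longitude in degrees."""
    return EARTH_RADIUS_M * math.radians(lon_deg)


x_fortaleza = lon_to_x(-38.5434)
print(x_fortaleza)
```
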

pymove.utils.conversions.meters_to_eps(radius_meters: float, earth_radius: float = 6371) → float[source]

Converts radius in meters to eps.

Parameters:
  • radius_meters (float) – radius in meters
  • earth_radius (float, optional) – radius of the earth in the location, by default EARTH_RADIUS
Returns:

radius in eps

Return type:

float

Example

>>> from pymove.utils.conversions import meters_to_eps
>>> earth_radius = 6371000
>>> meters_to_eps(earth_radius)
1000.0
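
One reading that reproduces the example above is a simple division of the radius by the earth radius, with the default of 6371 taken from the signature. The sketch below shows only that reading; radius_to_eps is a hypothetical helper, and the library's actual formula and unit conventions are not confirmed here.

```python
EARTH_RADIUS_DEFAULT = 6371.0  # assumed to mirror the default in the signature above


def radius_to_eps(radius_meters, earth_radius=EARTH_RADIUS_DEFAULT):
    """Assumed relation: eps is the radius divided by the earth radius."""
    return radius_meters / earth_radius


eps = radius_to_eps(6371000)
print(eps)
```
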
pymove.utils.conversions.meters_to_kilometers(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_distance: str = 'dist_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None[source]

Convert values, in meters, in label_distance column to kilometers.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • label_distance (str, optional) – Name of the distance column, by default DIST_TO_PREV
  • new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import meters_to_kilometers
>>> geo_life_df
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153             1.0        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788             5.0         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083             5.0         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671             5.0         0.577934
>>> meters_to_kilometers(geo_life_df, inplace=False)
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06             0.013690             1.0        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             0.007404             5.0         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             0.001821             5.0         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             0.002890             5.0         0.577934
pymove.utils.conversions.minute_to_hours(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None[source]

Convert values, in minutes, in label_time column to hours.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • label_time (str, optional) – Name of the time column, by default TIME_TO_PREV
  • new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import minute_to_hours, seconds_to_minutes
>>> geo_life_df
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153             1.0        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788             5.0         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083             5.0         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671             5.0         0.577934
>>> seconds_to_minutes(geo_life_df, inplace=True)
>>> minute_to_hours(geo_life_df, inplace=False)
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153        0.000278        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788        0.001389         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083        0.001389         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671        0.001389         0.577934
pymove.utils.conversions.minute_to_seconds(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None[source]

Convert values, in minutes, in label_time column to seconds.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • label_time (str, optional) – Name of the time column, by default TIME_TO_PREV
  • new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import minute_to_seconds
>>> geo_life_df
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153        0.016667        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788        0.083333         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083        0.083333         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671        0.083333         0.577934
>>> minute_to_seconds(geo_life_df, inplace=False)
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153             1.0        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788             5.0         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083             5.0         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671             5.0         0.577934
pymove.utils.conversions.ms_to_kmh(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_speed: str = 'speed_to_prev', new_label: str = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None[source]

Convert values, in m/s, in label_speed column to km/h.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • label_speed (str, optional) – Represents column name of speed, by default SPEED_TO_PREV
  • new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import ms_to_kmh
>>> geo_life
          lat          lon             datetime     id
0   39.984094   116.319236   2008-10-23 05:53:05     1
1   39.984198   116.319322   2008-10-23 05:53:06     1
2   39.984224   116.319402   2008-10-23 05:53:11     1
3   39.984211   116.319389   2008-10-23 05:53:16     1
4   39.984217   116.319422   2008-10-23 05:53:21     1
>>> geo_life.generate_dist_time_speed_features(inplace=True)
>>> geo_life
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153             1.0        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788             5.0         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083             5.0         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671             5.0         0.577934
>>> ms_to_kmh(geo_life, inplace=False)
   id         lat          lon             datetime          dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153             1.0        49.284551
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788             5.0         5.330727
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083             5.0         1.311180
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671             5.0         2.080563
pymove.utils.conversions.seconds_to_hours(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None[source]

Convert values, in seconds, in label_time column to hours.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • label_time (str, optional) – Name of the time column, by default TIME_TO_PREV
  • new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import minute_to_seconds, seconds_to_hours
>>> geo_life_df
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153        0.016667        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788        0.083333         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083        0.083333         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671        0.083333         0.577934
>>> minute_to_seconds(geo_life_df, inplace=True)
>>> seconds_to_hours(geo_life_df, inplace=False)
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153        0.000278        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788        0.001389         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083        0.001389         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671        0.001389         0.577934
pymove.utils.conversions.seconds_to_minutes(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', label_time: str = 'time_to_prev', new_label: str | None = None, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None[source]

Convert values, in seconds, in label_time column to minutes.

Parameters:
  • move_data (DataFrame) – Input trajectory data.
  • label_time (str, optional) – Name of the time column, by default TIME_TO_PREV
  • new_label (str, optional) – Represents a new column that will contain the conversion result, by default None
  • inplace (bool, optional) – Whether the operation will be done in the original dataframe, by default False
Returns:

A new dataframe with the converted feature or None

Return type:

DataFrame

Example

>>> from pymove.utils.conversions import seconds_to_minutes
>>> geo_life_df
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153             1.0        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788             5.0         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083             5.0         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671             5.0         0.577934
>>> seconds_to_minutes(geo_life_df, inplace=False)
   id         lat          lon              datetime         dist_to_prev    time_to_prev    speed_to_prev
0   1   39.984094   116.319236   2008-10-23 05:53:05                  NaN             NaN              NaN
1   1   39.984198   116.319322   2008-10-23 05:53:06            13.690153        0.016667        13.690153
2   1   39.984224   116.319402   2008-10-23 05:53:11             7.403788        0.083333         1.480758
3   1   39.984211   116.319389   2008-10-23 05:53:16             1.821083        0.083333         0.364217
4   1   39.984217   116.319422   2008-10-23 05:53:21             2.889671        0.083333         0.577934
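
The new_label and inplace parameters shared by these conversion functions can be pictured with a small stand-in that works on a dict of column lists. The sketch below assumes the semantics described in the parameter lists: with inplace=False a converted copy is returned and the input is left untouched, with inplace=True the input is modified and None is returned, and new_label directs the result to a separate column. convert_column is a hypothetical helper, not a PyMove function.

```python
import copy


def convert_column(table, label, factor, new_label=None, inplace=False):
    """Hypothetical helper mimicking the new_label / inplace parameters."""
    target = table if inplace else copy.deepcopy(table)
    # Write to new_label when given, otherwise overwrite the source column.
    target[new_label or label] = [value * factor for value in target[label]]
    return None if inplace else target


table = {"id": [1, 1], "time_to_prev": [60.0, 300.0]}

# Seconds to minutes into a new column, leaving the input untouched.
copied = convert_column(table, "time_to_prev", 1 / 60, new_label="time_min")

# Same conversion applied to the input itself; nothing is returned.
result = convert_column(table, "time_to_prev", 1 / 60, inplace=True)
print(copied, table, result)
```
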
pymove.utils.conversions.x_to_lon_spherical(x: float | ndarray) → float | ndarray[source]

Convert X EPSG:3857 WGS 84 / Pseudo-Mercator to longitude.

Parameters:x (float) – X offset from your original position in meters.
Returns:Represents longitude.
Return type:float

Examples

>>> from pymove.utils.conversions import x_to_lon_spherical
>>> for_x = -4290631.66144146
>>> print(x_to_lon_spherical(for_x), type(x_to_lon_spherical(for_x)))
-38.5434 <class 'numpy.float64'>

References

https://epsg.io/transform

pymove.utils.conversions.y_to_lat_spherical(y: float | ndarray) → float | ndarray[source]

Convert Y EPSG:3857 WGS 84 / Pseudo-Mercator to latitude.

Parameters:y (float) – Y offset from your original position in meters.
Returns:Represents latitude.
Return type:float

Examples

>>> from pymove.utils.conversions import y_to_lat_spherical
>>> for_y = -414220.15015607665
>>> print(y_to_lat_spherical(for_y), type(y_to_lat_spherical(for_y)))
-3.7183900000000096 <class 'numpy.float64'>

References

https://epsg.io/transform
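
The inverse spherical Mercator formulas are λ = x / R and φ = 2 · atan(exp(y / R)) − π/2, both in radians, with R = 6378137 m. The sketch below applies them to the projected Fortaleza coordinates used in the examples above; x_to_lon and y_to_lat are hypothetical names, and the code illustrates the projection rather than the library internals.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def x_to_lon(x_m):
    """Longitude in degrees for a spherical Mercator easting in meters."""
    return math.degrees(x_m / EARTH_RADIUS_M)


def y_to_lat(y_m):
    """Latitude in degrees for a spherical Mercator northing in meters."""
    return math.degrees(2 * math.atan(math.exp(y_m / EARTH_RADIUS_M)) - math.pi / 2)


lon = x_to_lon(-4290631.66144146)
lat = y_to_lat(-414220.15015607665)
print(lon, lat)
```
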

pymove.utils.data_augmentation module

Data augmentation operations.

append_row, generate_trajectories_df, generate_start_feature, generate_destiny_feature, split_crossover, augmentation_trajectories_df, insert_points_in_df, instance_crossover_augmentation

pymove.utils.data_augmentation.append_row(data: DataFrame, row: Series | None = None, columns: dict | None = None)[source]

Inserts a new row into the dataframe using the values passed as parameters.

Parameters:
  • data (DataFrame) – The input trajectories data.
  • row (Series, optional) – The row of a dataframe, by default None
  • columns (dict, optional) – Dictionary containing the values to be added, by default None
pymove.utils.data_augmentation.augmentation_trajectories_df(data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', restriction: str = 'destination only', label_trajectory: str = 'trajectory', insert_at_df: bool = False, frac: float = 0.5) → DataFrame[source]

Generates new data from unobserved trajectories, given a specific restriction.

By default, the algorithm uses the same route destination constraint.

Parameters:
  • data (DataFrame) – The input trajectories data.
  • restriction (str, optional) – Constraint used to generate new data, by default ‘destination only’
  • label_trajectory (str, optional) – Label of the points sequences, by default TRAJECTORY
  • insert_at_df (boolean, optional) – Whether to return a new DataFrame, by default False. If True, the value of copy is ignored.
  • frac (number, optional) – Represents the percentage to be exchanged, by default 0.5
Returns:

Dataframe with the new data generated

Return type:

DataFrame

pymove.utils.data_augmentation.generate_destiny_feature(data: pandas.core.frame.DataFrame, label_trajectory: str = 'trajectory')[source]

Removes the last point from the trajectory and adds it in a new column ‘destiny’.

Parameters:
  • data (DataFrame) – The input trajectory data.
  • label_trajectory (str, optional) – Label of the points sequences, by default ‘trajectory’
pymove.utils.data_augmentation.generate_start_feature(data: pandas.core.frame.DataFrame, label_trajectory: str = 'trajectory')[source]

Removes the first point from the trajectory and adds it in a new column ‘start’.

Parameters:
  • data (DataFrame) – The input trajectory data.
  • label_trajectory (str, optional) – Label of the points sequences, by default TRAJECTORY
pymove.utils.data_augmentation.generate_trajectories_df(data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame') → DataFrame[source]

Generates a dataframe with the sequence of location points of a trajectory.

Parameters:data (DataFrame) – The input trajectory data.
Returns:DataFrame of the trajectories
Return type:DataFrame
pymove.utils.data_augmentation.insert_points_in_df(data: pandas.core.frame.DataFrame, aug_df: pandas.core.frame.DataFrame)[source]

Inserts the points of the generated trajectories to the original data sets.

Parameters:
  • data (DataFrame) – The input trajectories data
  • aug_df (DataFrame) – The data of unobserved trajectories
pymove.utils.data_augmentation.instance_crossover_augmentation(data: pandas.core.frame.DataFrame, restriction: str = 'destination only', label_trajectory: str = 'trajectory', frac: float = 0.5)[source]

Generates new data from unobserved trajectories, with a specific restriction.

By default, the algorithm uses the same destination constraint as the route and inserts the points on the original dataframe.

Parameters:
  • data (DataFrame) – The input trajectories data
  • restriction (str, optional) – Constraint used to generate new data, by default ‘destination only’
  • label_trajectory (str, optional) – Label of the points sequences, by default ‘trajectory’
  • frac (number, optional) – Represents the percentage to be exchanged, by default 0.5
pymove.utils.data_augmentation.split_crossover(sequence_a: list, sequence_b: list, frac: float = 0.5) → tuple[list, list][source]

Divides two arrays in the indicated ratio and exchanges their halves.

Parameters:
  • sequence_a (list or ndarray) – Any array
  • sequence_b (list or ndarray) – Any array
  • frac (float, optional) – Represents the percentage to be exchanged, by default 0.5
Returns:

Arrays with the halves exchanged.

Return type:

Tuple[List, List]
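
As a rough illustration of the exchange described above, the sketch below cuts each sequence at `frac` of its length and swaps the tails. The name `split_crossover_sketch` and the choice to truncate the cut position to an integer are assumptions made for this example; the library's own rounding may differ.

```python
def split_crossover_sketch(sequence_a, sequence_b, frac=0.5):
    """Cut both sequences at frac of their length and swap the tails."""
    # Cut positions; truncating to int is an assumption of this sketch.
    cut_a = int(len(sequence_a) * frac)
    cut_b = int(len(sequence_b) * frac)
    # Head of one sequence followed by the tail of the other.
    new_a = list(sequence_a[:cut_a]) + list(sequence_b[cut_b:])
    new_b = list(sequence_b[:cut_b]) + list(sequence_a[cut_a:])
    return new_a, new_b


crossed_a, crossed_b = split_crossover_sketch([1, 2, 3, 4], [5, 6, 7, 8])
print(crossed_a, crossed_b)
```

With `frac=0.5` each result keeps the first half of one input and takes the second half of the other.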

pymove.utils.datetime module

Datetime operations.

date_to_str, str_to_datetime, datetime_to_str, datetime_to_min, min_to_datetime, to_day_of_week_int, working_day, now_str, deltatime_str, timestamp_to_millis, millis_to_timestamp, time_to_str, str_to_time, elapsed_time_dt, diff_time, create_time_slot_in_minute, generate_time_statistics, threshold_time_statistics

pymove.utils.datetime.create_time_slot_in_minute(data: DataFrame, slot_interval: int = 15, initial_slot: int = 0, label_datetime: str = 'datetime', label_time_slot: str = 'time_slot', inplace: bool = False) → DataFrame | None[source]

Partitions the time in slot windows.

Parameters:
  • data (DataFrame) – dataframe with datetime column
  • slot_interval (int, optional) – size of the slot window in minutes, by default 15
  • initial_slot (int, optional) – initial window time, by default 0
  • label_datetime (str, optional) – name of the datetime column, by default DATETIME
  • label_time_slot (str, optional) – name of the time slot column, by default TIME_SLOT
  • inplace (boolean, optional) – whether the operation will be done in the original dataframe, by default False
Returns:

data with converted time slots or None

Return type:

DataFrame

Examples

>>> from pymove.utils.datetime import create_time_slot_in_minute
>>> from pymove import datetime
>>> data
          lat          lon              datetime  id
0   39.984094   116.319236   2008-10-23 05:44:05   1
1   39.984198   116.319322   2008-10-23 05:56:06   1
2   39.984224   116.319402   2008-10-23 05:56:11   1
3   39.984224   116.319402   2008-10-23 06:10:15   1
>>> datetime.create_time_slot_in_minute(data, inplace=False)
          lat          lon              datetime  id   time_slot
0   39.984094   116.319236   2008-10-23 05:44:05   1          22
1   39.984198   116.319322   2008-10-23 05:56:06   1          23
2   39.984224   116.319402   2008-10-23 05:56:11   1          23
3   39.984224   116.319402   2008-10-23 06:10:15   1          24
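
The slot numbers in the example are consistent with dividing the minute of the day by the slot size. The sketch below shows that arithmetic with the standard library only; `time_slot_sketch` is a hypothetical helper, and treating `initial_slot` as an offset added to the result is an assumption, not a statement about the library's implementation.

```python
from datetime import datetime


def time_slot_sketch(dt, slot_interval=15, initial_slot=0):
    """Return the index of the slot window that contains dt."""
    # Minutes elapsed since midnight of the same day.
    minute_of_day = dt.hour * 60 + dt.minute
    # Integer division picks the window; initial_slot shifts the numbering
    # (assumed behaviour for this sketch).
    return minute_of_day // slot_interval + initial_slot


stamps = [
    datetime(2008, 10, 23, 5, 44, 5),
    datetime(2008, 10, 23, 5, 56, 6),
    datetime(2008, 10, 23, 6, 10, 15),
]
slots = [time_slot_sketch(stamp) for stamp in stamps]
print(slots)
```

For the three distinct times used in the example table this yields slots 22, 23 and 24.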
pymove.utils.datetime.date_to_str(dt: datetime.datetime) → str[source]

Get date, in string format, from timestamp.

Parameters:dt (datetime) – Represents a date
Returns:Represents the date in string format
Return type:str

Example

>>> from datetime import datetime
>>> from pymove.utils.datetime import date_to_str
>>> time_now = datetime.now()
>>> print(time_now)
2021-04-29 11:01:29.909340
>>> print(type(time_now))
<class 'datetime.datetime'>
>>> print(date_to_str(time_now), type(date_to_str(time_now)))
2021-04-29 <class 'str'>
pymove.utils.datetime.datetime_to_min(dt: datetime.datetime) → int[source]

Converts a datetime to an int representation in minutes.

To do the reverse use: min_to_datetime.

Parameters:dt (datetime) – Represents a datetime in datetime format
Returns:Represents minutes since the Unix epoch (1970-01-01 00:00:00)
Return type:int

Example

>>> from pymove.utils.datetime import datetime_to_min
>>> from datetime import datetime
>>> time_now = datetime.now()
>>> print(type(datetime_to_min(time_now)))
<class 'int'>
>>> datetime_to_min(time_now)
26996497
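
The value in the example corresponds to whole minutes counted from 1970-01-01. A minimal standard-library sketch of the conversion and its inverse follows; the helper names are hypothetical, and the sketch assumes naive datetimes (no timezone handling), which may not match the library in every case.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def datetime_to_min_sketch(dt):
    """Whole minutes between the Unix epoch and a naive datetime."""
    return int((dt - EPOCH).total_seconds() // 60)


def min_to_datetime_sketch(minutes):
    """Inverse conversion: minutes since the epoch back to a datetime."""
    return EPOCH + timedelta(minutes=minutes)


minutes = datetime_to_min_sketch(datetime(2021, 4, 30, 13, 37))
print(minutes, min_to_datetime_sketch(minutes))
```

Seconds are discarded by the forward conversion, so a round trip only recovers the datetime to the minute.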
pymove.utils.datetime.datetime_to_str(dt: datetime.datetime) → str[source]

Converts a date in datetime format to string format.

Parameters:dt (datetime) – Represents a datetime in datetime format.
Returns:Represents a datetime in string format “%Y-%m-%d %H:%M:%S”.
Return type:str

Example

>>> from pymove.utils.datetime import datetime_to_str
>>> from datetime import datetime
>>> time_now = datetime.now()
>>> print(time_now)
2021-04-29 14:15:29.708113
>>> print(type(time_now))
<class 'datetime.datetime'>
>>> print(datetime_to_str(time_now), type(datetime_to_str(time_now)))
2021-04-29 14:15:29 <class 'str'>
pymove.utils.datetime.deltatime_str(deltatime_seconds: float) → str[source]

Converts an elapsed time in seconds to a human-readable time string.

Parameters:deltatime_seconds (float) – Represents the elapsed time in seconds
Returns:time_str – Represents time in a format hh:mm:ss
Return type:str

Examples

>>> from pymove.utils.datetime import deltatime_str
>>> deltatime_str(1082.7180936336517)
'18m:02.718s'

Notes

Output example if more than 24 hours: 25:33:57 https://stackoverflow.com/questions/3620943/measuring-elapsed-time-with-the-time-module
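
One way to produce a string shaped like the example output is to split the seconds with `divmod` and drop leading units that are zero. This is only a sketch: `deltatime_str_sketch` is a hypothetical name, and the layout of the hours branch is extrapolated from the single example above rather than taken from the library.

```python
def deltatime_str_sketch(deltatime_seconds):
    """Format a duration in seconds, omitting leading units that are zero."""
    hours, remainder = divmod(deltatime_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        # Assumed layout for durations of one hour or more.
        return f"{int(hours):02d}h:{int(minutes):02d}m:{seconds:06.3f}s"
    if minutes:
        return f"{int(minutes):02d}m:{seconds:06.3f}s"
    return f"{seconds:06.3f}s"


print(deltatime_str_sketch(1082.7180936336517))
```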

pymove.utils.datetime.diff_time(start_time: datetime.datetime, end_time: datetime.datetime) → int[source]

Computes the elapsed time from the start time to the end time specified by the user.

Parameters:
  • start_time (datetime) – Specifies the start time of the time range to be computed
  • end_time (datetime) – Specifies the end time of the time range to be computed
Returns:

Represents the time elapsed, in milliseconds, from the start time to the end time.

Return type:

int

Examples

>>> from datetime import datetime
>>> from pymove.utils.datetime import diff_time, str_to_datetime
>>> time_now = datetime.now()
>>> start_time_1 = datetime(2020, 6, 29, 0, 0)
>>> start_time_2 = str_to_datetime('2020-06-29 12:45:59')
>>> print(diff_time(start_time_1, time_now))
26411808665
>>> print(diff_time(start_time_2, time_now))
26365849665
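
The two start times in the example are 12 h 45 min 59 s apart and the printed results differ by 45,959,000, which is what a result in milliseconds would give. A minimal sketch of that computation, using a hypothetical helper name and rounding to the nearest millisecond as an assumption:

```python
from datetime import datetime


def diff_time_sketch(start_time, end_time):
    """Elapsed time between two datetimes, in whole milliseconds."""
    elapsed_seconds = (end_time - start_time).total_seconds()
    return int(round(elapsed_seconds * 1000))


start = datetime(2020, 6, 29, 0, 0)
end = datetime(2020, 6, 29, 12, 45, 59)
print(diff_time_sketch(start, end))
```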
pymove.utils.datetime.elapsed_time_dt(start_time: datetime.datetime) → int[source]

Computes the elapsed time from a specific start time.

Parameters:start_time (datetime) – Specifies the start time of the time range to be computed
Returns:Represents the time elapsed, in milliseconds, from the start time to the current time (when the function was called).
Return type:int

Examples

>>> from datetime import datetime
>>> from pymove.utils.datetime import elapsed_time_dt, str_to_datetime
>>> start_time_1 = datetime(2020, 6, 29, 0, 0)
>>> start_time_2 = str_to_datetime('2020-06-29 12:45:59')
>>> print(elapsed_time_dt(start_time_1))
26411808666
>>> print(elapsed_time_dt(start_time_2))
26365849667
pymove.utils.datetime.generate_time_statistics(data: pandas.core.frame.DataFrame, local_label: str = 'local_label')[source]

Calculates time statistics of the pairwise local labels.

The statistics are the average, standard deviation, minimum, maximum, sum and count of the time between the pairwise local labels of a symbolic trajectory.

Parameters:
  • data (DataFrame) – The input trajectories data.
  • local_label (str, optional) – The name of the feature with local id, by default LOCAL_LABEL
Returns:

Statistical information about the pairwise local labels

Return type:

DataFrame

Example

>>> from pymove.utils.datetime import generate_time_statistics
>>> df
    local_label   prev_local   time_to_prev   id
0         house          NaN            NaN    1
1        market        house          720.0    1
2        market       market            5.0    1
3        market       market            1.0    1
4        school       market          844.0    1
>>> generate_time_statistics(df)
    local_label   prev_local    mean        std     min     max     sum   count
0         house       market   844.0   0.000000   844.0   844.0   844.0       1
1        market        house   720.0   0.000000   720.0   720.0   720.0       1
2        market       market     3.0   2.828427     1.0     5.0     6.0       2
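
The aggregation can be pictured as grouping the `time_to_prev` values by the pair (`local_label`, `prev_local`) and summarising each group. The sketch below does this with the standard library on plain tuples; `time_statistics_sketch` is a hypothetical helper, and reporting a deviation of 0.0 for single-observation groups is an assumption taken from the example output, not from the library source.

```python
from collections import defaultdict
from statistics import mean, stdev


def time_statistics_sketch(rows):
    """Group times by (local_label, prev_local) and summarise each group."""
    groups = defaultdict(list)
    for local_label, prev_local, time_to_prev in rows:
        # The first point of a trajectory has no predecessor; skip it.
        if prev_local is None or time_to_prev is None:
            continue
        groups[(local_label, prev_local)].append(time_to_prev)

    stats = {}
    for key, times in groups.items():
        stats[key] = {
            "mean": mean(times),
            # A single observation has no sample deviation; use 0.0 instead.
            "std": stdev(times) if len(times) > 1 else 0.0,
            "min": min(times),
            "max": max(times),
            "sum": sum(times),
            "count": len(times),
        }
    return stats


rows = [
    ("house", None, None),
    ("market", "house", 720.0),
    ("market", "market", 5.0),
    ("market", "market", 1.0),
    ("school", "market", 844.0),
]
stats = time_statistics_sketch(rows)
print(stats[("market", "market")])
```

For the pair (market, market) the two observed times 5.0 and 1.0 give a mean of 3.0 and a sample standard deviation of about 2.828, matching the last row of the example.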
pymove.utils.datetime.millis_to_timestamp(milliseconds: float) → pandas._libs.tslibs.timestamps.Timestamp[source]

Converts milliseconds to timestamp.

Parameters:milliseconds (int) – Represents milliseconds.
Returns:Represents the corresponding date.
Return type:Timestamp

Examples

>>> from pymove.utils.datetime import millis_to_timestamp
>>> millis_to_timestamp(1449907200123)
Timestamp('2015-12-12 08:00:00.123000')
pymove.utils.datetime.min_to_datetime(minutes: int) → datetime.datetime[source]

Converts an int representation in minutes to a datetime.

To do the reverse use: datetime_to_min.

Parameters:minutes (int) – Represents minutes
Returns:Represents minutes in datetime format
Return type:datetime

Example

>>> from pymove.utils.datetime import min_to_datetime
>>> print(min_to_datetime(26996497), type(min_to_datetime(26996497)))
2021-04-30 13:37:00 <class 'datetime.datetime'>
pymove.utils.datetime.now_str() → str[source]

Gets the current datetime as a string.

Returns:Represents a date
Return type:str

Examples

>>> from pymove.utils.datetime import now_str
>>> now_str()
'2019-09-02 13:54:16'
pymove.utils.datetime.str_to_datetime(dt_str: str) → datetime.datetime[source]

Converts a datetime in string format to datetime format.

Parameters:dt_str (str) – Represents a datetime in string format, “%Y-%m-%d” or “%Y-%m-%d %H:%M:%S”
Returns:Represents a datetime in datetime format
Return type:datetime

Example

>>> from pymove.utils.datetime import str_to_datetime
>>> time_1 = '2020-06-29'
>>> time_2 = '2020-06-29 12:45:59'
>>> print(type(time_1), type(time_2))
<class 'str'> <class 'str'>
>>> print(str_to_datetime(time_1), type(str_to_datetime(time_1)))
2020-06-29 00:00:00 <class 'datetime.datetime'>
>>> print(str_to_datetime(time_2), type(str_to_datetime(time_2)))
2020-06-29 12:45:59 <class 'datetime.datetime'>
pymove.utils.datetime.str_to_time(dt_str: str) → datetime.datetime[source]

Converts a time in string format “%H:%M:%S” to datetime format.

Parameters:dt_str (str) – Represents a time in string format
Returns:Represents a time in datetime format
Return type:datetime

Examples

>>> from pymove.utils.datetime import str_to_time
>>> str_to_time("08:00:00")
datetime.datetime(1900, 1, 1, 8, 0)
pymove.utils.datetime.threshold_time_statistics(df_statistics: DataFrame, mean_coef: float = 1.0, std_coef: float = 1.0, inplace: bool = False) → DataFrame | None[source]

Calculates and creates the threshold column.

The values are based on the time statistics dataframe for each segment.

Parameters:
  • df_statistics (DataFrame) – Time Statistics of the pairwise local labels.
  • mean_coef (float) – Multiplication coefficient of the mean time for the segment, by default 1.0
  • std_coef (float) – Multiplication coefficient of the standard deviation of the time for the segment, by default 1.0
  • inplace (boolean, optional) – whether the operation will be done in the original dataframe, by default False
Returns:

DataFrame of time statistics with the additional feature threshold, which indicates the time limit of the trajectory segment, or None

Return type:

DataFrame

Example

>>> from pymove.utils.datetime import generate_time_statistics, threshold_time_statistics
>>> df
    local_label   prev_local   time_to_prev   id
0         house          NaN            NaN    1
1        market        house          720.0    1
2        market       market            5.0    1
3        market       market            1.0    1
4        school       market          844.0    1
>>> statistics = generate_time_statistics(df)
>>> statistics
    local_label   prev_local    mean        std     min     max     sum   count
0         house       market   844.0   0.000000   844.0   844.0   844.0       1
1        market        house   720.0   0.000000   720.0   720.0   720.0       1
2        market       market     3.0   2.828427     1.0     5.0     6.0       2
>>> threshold_time_statistics(statistics)
    local_label   prev_local    mean        std     min     max     sum   count   threshold
0         house       market   844.0   0.000000   844.0   844.0   844.0       1       844.0
1        market        house   720.0   0.000000   720.0   720.0   720.0       1       720.0
2        market       market     3.0   2.828427     1.0     5.0     6.0       2         5.8
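
The threshold values in the example are consistent with a weighted sum of the mean and the standard deviation of each segment. The sketch below states that relation explicitly; the formula is inferred from the example rows and the parameter descriptions, so treat it as an assumption, and `threshold_sketch` is a hypothetical name.

```python
def threshold_sketch(mean_time, std_time, mean_coef=1.0, std_coef=1.0):
    """Time limit of a segment as a weighted sum of its mean and deviation."""
    return mean_time * mean_coef + std_time * std_coef


# Rows (mean, std) taken from the statistics table in the example.
rows = [(844.0, 0.0), (720.0, 0.0), (3.0, 2.828427)]
thresholds = [threshold_sketch(mean_time, std_time) for mean_time, std_time in rows]
print(thresholds)
```

The third row gives roughly 5.83, which appears as 5.8 in the example table.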
pymove.utils.datetime.time_to_str(time: pandas._libs.tslibs.timestamps.Timestamp) → str[source]

Get time, in string format, from timestamp.

Parameters:time (Timestamp) – Represents a time
Returns:Represents the time in string format
Return type:str

Examples

>>> import pandas as pd
>>> from pymove.utils.datetime import time_to_str
>>> time_to_str(pd.Timestamp("2015-12-12 08:00:00.123000"))
'08:00:00'
pymove.utils.datetime.timestamp_to_millis(timestamp: str) → int[source]

Converts a local datetime to a POSIX timestamp in milliseconds (like in Java).

Parameters:timestamp (str) – Represents a date
Returns:Represents the result in milliseconds
Return type:int

Examples

>>> from pymove.utils.datetime import timestamp_to_millis
>>> timestamp_to_millis('2015-12-12 08:00:00.123000')
1449907200123 (UTC)
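
For readers who want to check the number in the example, the conversion can be reproduced with the standard library. The sketch below interprets the input as UTC, following the "(UTC)" annotation in the example; the helper names are hypothetical and the handling of local time zones in the library itself is not shown here.

```python
from datetime import datetime, timedelta, timezone

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis_sketch(dt):
    """Milliseconds since the Unix epoch for a timezone-aware datetime."""
    # Dividing two timedeltas with // yields an exact integer.
    return (dt - EPOCH_UTC) // timedelta(milliseconds=1)


def from_millis_sketch(milliseconds):
    """Inverse conversion: milliseconds since the epoch to a UTC datetime."""
    return EPOCH_UTC + timedelta(milliseconds=milliseconds)


moment = datetime(2015, 12, 12, 8, 0, 0, 123000, tzinfo=timezone.utc)
millis = to_millis_sketch(moment)
print(millis, from_millis_sketch(millis))
```

Working with `timedelta` arithmetic instead of floating-point seconds avoids rounding errors in the millisecond digit.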
pymove.utils.datetime.to_day_of_week_int(dt: datetime.datetime) → int[source]

Get day of week of a date. Monday == 0…Sunday == 6.

Parameters:dt (datetime) – Represents a datetime in datetime format.
Returns:Represents day of week.
Return type:int

Example

>>> from pymove.utils.datetime import str_to_datetime, to_day_of_week_int
>>> monday = str_to_datetime('2021-05-3 12:00:01')
>>> friday = str_to_datetime('2021-05-7 12:00:01')
>>> print(to_day_of_week_int(monday), type(to_day_of_week_int(monday)))
0 <class 'int'>
>>> print(to_day_of_week_int(friday), type(to_day_of_week_int(friday)))
4 <class 'int'>
pymove.utils.datetime.working_day(dt: str | datetime, country: str = 'BR', state: str | None = None) → bool[source]

Indicates whether a day specified by the user is a working day.

Parameters:
  • dt (str or datetime) – Specifies the day the user wants to know if it is a business day.
  • country (str) – Indicates country to check for holidays, by default ‘BR’
  • state (str) – Indicates state to check for holidays, by default None
Returns:

If True, the day informed by the user is a working day; if False, the day is not a working day.

Return type:

boolean

Examples

>>> from pymove.utils.datetime import str_to_datetime, working_day
>>> independence_day = str_to_datetime('2021-09-7 12:00:01') # Holiday in Brazil
>>> next_day = str_to_datetime('2021-09-8 12:00:01') # Not a Holiday in Brazil
>>> print(working_day(independence_day, 'BR'))
False
>>> print(type(working_day(independence_day, 'BR')))
<class 'bool'>
>>> print(working_day(next_day, 'BR'))
True
>>> print(type(working_day(next_day, 'BR')))
<class 'bool'>

References

Countries and States names available in https://pypi.org/project/holidays/

pymove.utils.distances module

Distances operations.

haversine, euclidean_distance_in_meters, nearest_points, medp, medt

pymove.utils.distances.euclidean_distance_in_meters(lat1: float | ndarray, lon1: float | ndarray, lat2: float | ndarray, lon2: float | ndarray) → float | ndarray[source]

Calculate the euclidean distance in meters between two points.

Parameters:
  • lat1 (float or array) – latitude of point 1
  • lon1 (float or array) – longitude of point 1
  • lat2 (float or array) – latitude of point 2
  • lon2 (float or array) – longitude of point 2
Returns:

euclidean distance in meters between the two points.

Return type:

float or ndarray

Example

>>> from pymove.utils.distances import euclidean_distance_in_meters
>>> lat_fortaleza, lon_fortaleza = [-3.71839 ,-38.5434]
>>> lat_quixada, lon_quixada = [-4.979224744401671, -39.056434302570665]
>>> euclidean_distance_in_meters(
...    lat_fortaleza, lon_fortaleza, lat_quixada, lon_quixada
... )
151907.9670136588
pymove.utils.distances.haversine(lat1: float | ndarray, lon1: float | ndarray, lat2: float | ndarray, lon2: float | ndarray, to_radians: bool = True, earth_radius: float = 6371) → float | ndarray[source]

Calculates the great circle distance between two points on the earth.

Coordinates are specified in decimal degrees or in radians. All (lat, lon) coordinates must have numeric dtypes and be of equal length. The result is in meters. Use 3956 as the earth radius for miles.

Parameters:
  • lat1 (float or array) – latitude of point 1
  • lon1 (float or array) – longitude of point 1
  • lat2 (float or array) – latitude of point 2
  • lon2 (float or array) – longitude of point 2
  • to_radians (boolean) – Whether to convert the values to radians, by default True
  • earth_radius (float) – Radius of sphere, by default EARTH_RADIUS
Returns:

Represents distance between points in meters

Return type:

float or ndarray

Example

>>> from pymove.utils.distances import haversine
>>> lat_fortaleza, lon_fortaleza = [-3.71839 ,-38.5434]
>>> lat_quixada, lon_quixada = [-4.979224744401671, -39.056434302570665]
>>> haversine(lat_fortaleza, lon_fortaleza, lat_quixada, lon_quixada)
151298.02548428564

References

Vectorized haversine function:
https://stackoverflow.com/questions/43577086/pandas-calculate-haversine-distance-within-each-group-of-rows
About distance between two points:
https://janakiev.com/blog/gps-points-distance-python/
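
For reference, the haversine formula itself can be written for scalar inputs with the standard `math` module. This is a sketch of the formula only, under the assumption that the radius is given in kilometres and the result is scaled to metres; `haversine_sketch` is a hypothetical name and the function is not vectorized like the library version.

```python
import math


def haversine_sketch(lat1, lon1, lat2, lon2, earth_radius_km=6371.0):
    """Great-circle distance in metres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    # The radius is in kilometres, so scale by 1000 to report metres.
    return 2 * earth_radius_km * math.asin(math.sqrt(a)) * 1000


distance = haversine_sketch(
    -3.71839, -38.5434, -4.979224744401671, -39.056434302570665
)
print(round(distance))
```

For the Fortaleza and Quixadá coordinates used in the example, the sketch gives a distance of roughly 151.3 km.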
pymove.utils.distances.medp(traj1: pandas.core.frame.DataFrame, traj2: pandas.core.frame.DataFrame, latitude: str = 'lat', longitude: str = 'lon') → float[source]

Returns the Mean Euclidean Distance Predictive between two trajectories.

Considers only the spatial dimension for the similarity measure.

Parameters:
  • traj1 (dataframe) – The input of one trajectory.
  • traj2 (dataframe) – The input of another trajectory.
  • latitude (str, optional) – Label of the trajectories dataframe referring to the latitude, by default LATITUDE
  • longitude (str, optional) – Label of the trajectories dataframe referring to the longitude, by default LONGITUDE
Returns:

total distance

Return type:

float

Example

>>> from pymove.utils.distances import medp
>>> traj_1
            lat          lon           datetime     id
0   39.98471   116.319865   2008-10-23 05:53:23      1
>>> traj_2
            lat        lon             datetime     id
0   39.984674   116.31981   2008-10-23 05:53:28      1
>>> medp(traj_1, traj_2)
6.573431370981577e-05
pymove.utils.distances.medt(traj1: pandas.core.frame.DataFrame, traj2: pandas.core.frame.DataFrame, latitude: str = 'lat', longitude: str = 'lon', datetime: str = 'datetime') → float[source]

Returns the Mean Euclidean Distance Trajectory between two trajectories.

Considers the spatial dimension and the temporal dimension when measuring similarity.

Parameters:
  • traj1 (dataframe) – The input of one trajectory.
  • traj2 (dataframe) – The input of another trajectory.
  • latitude (str, optional) – Label of the trajectories dataframe referring to the latitude, by default LATITUDE
  • longitude (str, optional) – Label of the trajectories dataframe referring to the longitude, by default LONGITUDE
  • datetime (str, optional) – Label of the trajectories dataframe referring to the timestamp, by default DATETIME
Returns:

total distance

Return type:

float

Example

>>> from pymove.utils.distances import medt
>>> traj_1
         lat          lon              datetime  id
0   39.98471   116.319865   2008-10-23 05:53:23   1
>>> traj_2
          lat         lon              datetime  id
0   39.984674   116.31981   2008-10-23 05:53:28   1
>>> medt(traj_1, traj_2)
6.592419887747872e-05
pymove.utils.distances.nearest_points(traj1: pandas.core.frame.DataFrame, traj2: pandas.core.frame.DataFrame, latitude: str = 'lat', longitude: str = 'lon') → pandas.core.frame.DataFrame[source]

Returns the point closest to another trajectory based on the Euclidean distance.

Parameters:
  • traj1 (dataframe) – The input of one trajectory.
  • traj2 (dataframe) – The input of another trajectory.
  • latitude (str, optional) – Label of the trajectories dataframe referring to the latitude, by default LATITUDE
  • longitude (str, optional) – Label of the trajectories dataframe referring to the longitude, by default LONGITUDE
Returns:

dataframe with closest points

Return type:

DataFrame

Example

>>> from pymove.utils.distances import nearest_points
>>> df_a
         lat          lon               datetime    id
0   39.984198   116.319322   2008-10-23 05:53:06     1
1   39.984224   116.319402   2008-10-23 05:53:11     1
>>> df_b
          lat          lon              datetime    id
0   39.984211   116.319389   2008-10-23 05:53:16     1
1   39.984217   116.319422   2008-10-23 05:53:21     1
>>> nearest_points(df_a,df_b)
          lat          lon              datetime    id
0   39.984211   116.319389   2008-10-23 05:53:16     1
1   39.984211   116.319389   2008-10-23 05:53:16     1
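
The result in the example can be reproduced by a brute-force search: for each point of the first trajectory, keep the point of the second trajectory with the smallest Euclidean distance in (lat, lon) space. The sketch below works on plain tuples instead of dataframes; `nearest_points_sketch` is a hypothetical helper and says nothing about how the library organises the search.

```python
def nearest_points_sketch(traj1, traj2):
    """For each (lat, lon) in traj1, pick the closest (lat, lon) in traj2."""
    result = []
    for lat, lon in traj1:
        # Squared distance is enough for comparison; no square root needed.
        closest = min(
            traj2,
            key=lambda point: (point[0] - lat) ** 2 + (point[1] - lon) ** 2,
        )
        result.append(closest)
    return result


points_a = [(39.984198, 116.319322), (39.984224, 116.319402)]
points_b = [(39.984211, 116.319389), (39.984217, 116.319422)]
nearest = nearest_points_sketch(points_a, points_b)
print(nearest)
```

Both points of the first trajectory are closest to the first point of the second one, as in the example output.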

pymove.utils.geoutils module

Geo operations.

v_color, create_geohash_df, create_bin_geohash_df, decode_geohash_to_latlon

pymove.utils.geoutils.create_bin_geohash_df(data: pandas.core.frame.DataFrame, precision: float = 15)[source]

Create trajectory geohash binaries and integrate with df.

Parameters:
  • data (dataframe) – The input trajectories data
  • precision (float, optional) – Number of characters in resulting geohash, by default 15
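Returns:

The input DataFrame gains the additional column ‘bin_geohash’; the change is made in place and the call itself returns None, as the example below shows.

Return type:

None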

Example

>>> from pymove.utils.geoutils import create_bin_geohash_df
>>> geoLife_df
         lat          lon
0   39.984094   116.319236
1   39.984198   116.319322
2   39.984224   116.319402
3   39.984211   116.319389
4   39.984217   116.319422
>>> print(type(create_bin_geohash_df(geoLife_df)))
<class 'NoneType'>
>>> geoLife_df
          lat         lon                                         bin_geohash
0   39.984094   116.319236  [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ...
1   39.984198   116.319322  [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ...
2   39.984224   116.319402  [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ...
3   39.984211   116.319389  [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ...
4   39.984217   116.319422  [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, ...
pymove.utils.geoutils.create_geohash_df(data: pandas.core.frame.DataFrame, precision: float = 15)[source]

Create geohash from geographic coordinates and integrate with df.

Parameters:
  • data (dataframe) – The input trajectories data
  • precision (float, optional) – Number of characters in resulting geohash, by default 15
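Returns:

The input DataFrame gains the additional column ‘geohash’; the change is made in place and the call itself returns None, as the example below shows.

Return type:

None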

Example

>>> from pymove.utils.geoutils import create_geohash_df
>>> geoLife_df
          lat          lon
0   39.984094   116.319236
1   39.984198   116.319322
2   39.984224   116.319402
3   39.984211   116.319389
4   39.984217   116.319422
>>> print(type(create_geohash_df(geoLife_df)))
<class 'NoneType'>
>>> geoLife_df
          lat          lon           geohash
0   39.984094   116.319236   wx4eqyvh4xkg0xs
1   39.984198   116.319322   wx4eqyvhudszsev
2   39.984224   116.319402   wx4eqyvhyx8d9wc
3   39.984211   116.319389   wx4eqyvhyjnv5m7
4   39.984217   116.319422   wx4eqyvhyyr2yy8
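
To show what a geohash string encodes, the sketch below implements the standard geohash algorithm for a single coordinate pair: longitude and latitude bits are interleaved by repeated bisection and emitted five bits at a time in base 32. `geohash_encode` is a hypothetical standalone helper written for this illustration; the library may well delegate to a dedicated geohash package instead.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=15):
    """Encode a coordinate pair as a geohash with `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    value = 0
    use_lon = True  # a geohash starts with a longitude bit
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1  # upper half of the interval
            rng[0] = mid
        else:
            value = value << 1  # lower half of the interval
            rng[1] = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            # Every five bits map to one base-32 character.
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


print(geohash_encode(39.984094, 116.319236)[:3])
```

Nearby points share a common prefix, which is why all rows of the example start with the same characters.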
pymove.utils.geoutils.decode_geohash_to_latlon(data: pandas.core.frame.DataFrame, label_geohash: str = 'geohash', reset_index: bool = True)[source]

Decode feature with hash of trajectories back to geographic coordinates.

Parameters:
  • data (dataframe) – The input trajectories data
  • label_geohash (str, optional) – The name of the feature with hashed trajectories, by default GEOHASH
  • reset_index (boolean, optional) – Condition to reset the df index, by default True
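Returns:

The input DataFrame gains the additional columns ‘lat_decode’ and ‘lon_decode’; the change is made in place and the call itself returns None, as the example below shows.

Return type:

None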

Example

>>> from pymove.utils.geoutils import decode_geohash_to_latlon
>>> geoLife_df
          lat          lon           geohash
0   39.984094   116.319236   wx4eqyvh4xkg0xs
1   39.984198   116.319322   wx4eqyvhudszsev
2   39.984224   116.319402   wx4eqyvhyx8d9wc
3   39.984211   116.319389   wx4eqyvhyjnv5m7
4   39.984217   116.319422   wx4eqyvhyyr2yy8
>>> print(type(decode_geohash_to_latlon(geoLife_df)))
<class 'NoneType'>
>>> geoLife_df
          lat          lon           geohash  lat_decode   lon_decode
0   39.984094   116.319236   wx4eqyvh4xkg0xs   39.984094   116.319236
1   39.984198   116.319322   wx4eqyvhudszsev   39.984198   116.319322
2   39.984224   116.319402   wx4eqyvhyx8d9wc   39.984224   116.319402
3   39.984211   116.319389   wx4eqyvhyjnv5m7   39.984211   116.319389
4   39.984217   116.319422   wx4eqyvhyyr2yy8   39.984217   116.319422
pymove.utils.geoutils.v_color(ob: shapely.geometry.base.BaseGeometry) → str[source]

Returns ‘#ffcc33’ if the object crosses itself; otherwise it returns ‘#6699cc’.

Parameters:ob (geometry object) – Any geometric object
Returns:Geometric object color
Return type:str

Example

>>> from pymove.utils.geoutils import v_color
>>> from shapely.geometry import LineString
>>> case_1 = LineString([(3,3),(4,4), (3,4)])
>>> case_2 = LineString([(3,3),(4,4), (4,3)])
>>> case_3 = LineString([(3,3),(4,4), (3,4), (4,3)])
>>> print(v_color(case_1), type(v_color(case_1)))
#6699cc <class 'str'>
>>> print(v_color(case_2), type(v_color(case_2)))
#6699cc <class 'str'>
>>> print(v_color(case_3), type(v_color(case_3)))
#ffcc33 <class 'str'>

pymove.utils.integration module

Integration operations.

union_poi_bank, union_poi_bus_station, union_poi_bar_restaurant, union_poi_parks, union_poi_police, join_collective_areas, join_with_pois, join_with_pois_by_category, join_with_events, join_with_event_by_dist_and_time, join_with_home_by_id, merge_home_with_poi

pymove.utils.integration.join_collective_areas(data: DataFrame, areas: DataFrame, label_geometry: str = 'geometry', inplace: bool = False) → DataFrame | None[source]

Performs the integration between trajectories and collective areas.

Generating a new column that informs if the point of the trajectory is inserted in a collective area.

Parameters:
  • data (geopandas.GeoDataFrame) – The input trajectory data
  • areas (geopandas.GeoDataFrame) – The input collective areas data
  • label_geometry (str, optional) – Label referring to the geometry column, by default GEOMETRY
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

data with joined geometries or None

Return type:

DataFrame

Examples

>>> from pymove.utils.integration import join_collective_areas
>>> gdf
          lat          lon              datetime   id                      geometry
0   39.984094   116.319236   2008-10-23 05:53:05    1    POINT (116.31924 39.98409)
1   39.984198   116.319322   2008-10-23 05:53:06    1    POINT (116.31932 39.98420)
2   39.984224   116.319402   2008-10-23 05:53:11    1    POINT (116.31940 39.98422)
3   39.984211   116.319389   2008-10-23 05:53:16    1    POINT (116.31939 39.98421)
4   39.984217   116.319422   2008-10-23 05:53:21    1    POINT (116.31942 39.98422)
>>> area_c
         lat         lon               datetime  id                         geometry
0  39.984094  116.319236    2008-10-23 05:53:05   1     POINT (116.319236 39.984094)
1  40.006436  116.317701    2008-10-23 10:53:31   1     POINT (116.317701 40.006436)
2  40.014125  116.306159    2008-10-23 23:43:56   1     POINT (116.306159 40.014125)
3  39.984211  116.319389    2008-10-23 05:53:16   1     POINT (116.319389 39.984211)
    POINT (116.32687 39.97901)
>>> join_collective_areas(gdf, area_c, inplace=True)
>>> gdf.head()
         lat         lon                datetime  id                             geometry    violating
0  39.984094  116.319236    2008-10-23 05:53:05   1         POINT (116.319236 39.984094)         True
1  39.984198  116.319322    2008-10-23 05:53:06   1         POINT (116.319322 39.984198)        False
2  39.984224  116.319402    2008-10-23 05:53:11   1         POINT (116.319402 39.984224)        False
3  39.984211  116.319389    2008-10-23 05:53:16   1         POINT (116.319389 39.984211)         True
4  39.984217  116.319422    2008-10-23 05:53:21   1         POINT (116.319422 39.984217)        False
pymove.utils.integration.join_with_event_by_dist_and_time(data: pandas.core.frame.DataFrame, df_events: pandas.core.frame.DataFrame, label_date: str = 'datetime', label_event_id: str = 'event_id', label_event_type: str = 'event_type', time_window: float = 3600, radius: float = 1000, inplace: bool = False)[source]

Performs the integration between trajectories and events on windows.

Generating new columns referring to the category of the point of interest, the distance between the location of the user and location of the poi based on the distance and on time of each point of the trajectories.

Parameters:
  • data (DataFrame) – The input trajectory data.
  • df_events (DataFrame) – The input events points of interest data.
  • label_date (str, optional) – Label of data referring to the datetime of the input trajectory data, by default DATETIME
  • label_event_id (str, optional) – Label of df_events referring to the id of the event, by default EVENT_ID
  • label_event_type (str, optional) – Label of df_events referring to the type of the event, by default EVENT_TYPE
  • time_window (float, optional) – tolerable length of time range in seconds for assigning the event’s point of interest to the trajectory point, by default 3600
  • radius (float, optional) – maximum radius of pois in meters, by default 1000
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False

Examples

>>> from pymove.utils.integration import join_with_event_by_dist_and_time
>>> move_df
         lat         lon            datetime  id
0  39.984094  116.319236 2008-10-23 05:53:05   1
1  39.984559  116.326696 2008-10-23 10:37:26   1
2  39.993527  116.326483 2008-10-24 00:02:14   2
3  39.978575  116.326975 2008-10-24 00:22:01   3
4  39.981668  116.310769 2008-10-24 01:57:57   3
>>> events
         lat         lon  id            datetime type_poi           name_poi
0  39.984094  116.319236   1 2008-10-23 05:53:05     show   forro_tropykalia
1  39.991013  116.326384   2 2008-10-23 10:27:26  corrida   racha_de_jumento
2  39.990013  116.316384   2 2008-10-23 10:37:26     show   dia_do_municipio
3  40.010000  116.312615   3 2008-10-24 01:57:57    feira  adocao_de_animais
>>> join_with_event_by_dist_and_time(move_df, events, inplace=True)
>>> move_df
         lat         lon            datetime  id                type_poi          dist_event                              name_poi
0  39.984094  116.319236 2008-10-23 05:53:05   1                  [show]               [0.0]                    [forro_tropykalia]
1  39.984559  116.326696 2008-10-23 10:37:26   1         [corrida, show]  [718.144, 1067.53]  [racha_de_jumento, dia_do_municipio]
2  39.993527  116.326483 2008-10-24 00:02:14   2                   None                 None                                  None
3  39.978575  116.326975 2008-10-24 00:22:01   3                   None                 None                                  None
4  39.981668  116.310769 2008-10-24 01:57:57   3                   None                 None                                  None
Raises:ValueError – If feature generation fails
pymove.utils.integration.join_with_events(data: pandas.core.frame.DataFrame, df_events: pandas.core.frame.DataFrame, label_date: str = 'datetime', time_window: int = 900, label_event_id: str = 'event_id', label_event_type: str = 'event_type', inplace: bool = False)[source]

Performs the integration between trajectories and the closest event in time window.

Generates new columns with the type, id and distance of the closest event found within the time window of each trajectory point.

Parameters:
  • data (DataFrame) – The input trajectory data.
  • df_events (DataFrame) – The input events points of interest data.
  • label_date (str, optional) – Label of data referring to the datetime of the input trajectory data, by default DATETIME
  • time_window (float, optional) – tolerable length of time range in seconds for assigning the event’s point of interest to the trajectory point, by default 900
  • label_event_id (str, optional) – Label of df_events referring to the id of the event, by default EVENT_ID
  • label_event_type (str, optional) – Label of df_events referring to the type of the event, by default EVENT_TYPE
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False

Examples

>>> from pymove.utils.integration import join_with_events
>>> move_df
         lat         lon            datetime  id
0  39.984094  116.319236 2008-10-23 05:53:05   1
1  39.984559  116.326696 2008-10-23 10:37:26   1
2  39.993527  116.326483 2008-10-24 00:02:14   2
3  39.978575  116.326975 2008-10-24 00:22:01   3
4  39.981668  116.310769 2008-10-24 01:57:57   3
>>> events
         lat         lon  id            datetime  event_type             event_id
0  39.984094  116.319236   1 2008-10-23 05:53:05        show     forro_tropykalia
1  39.991013  116.326384   2 2008-10-23 10:37:26        show     dia_do_municipio
2  40.010000  116.312615   3 2008-10-24 01:57:57        feira   adocao_de_animais
>>> join_with_events(move_df, events)
         lat         lon            datetime  id                event_type   dist_event             event_id
0  39.984094  116.319236 2008-10-23 05:53:05   1                      show     0.000000     forro_tropykalia
1  39.984559  116.326696 2008-10-23 10:37:26   1                      show   718.144152     dia_do_municipio
2  39.993527  116.326483 2008-10-24 00:02:14   2                                    inf
3  39.978575  116.326975 2008-10-24 00:22:01   3                                    inf
4  39.981668  116.310769 2008-10-24 01:57:57   3                     feira  3154.296880    adocao_de_animais
Raises:ValueError – If feature generation fails
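The matching rule described for join_with_events can be sketched with the standard library alone. This is a minimal illustration, not PyMove's implementation: haversine_m and nearest_event_in_window are hypothetical helpers, the Earth radius of 6,371,000 m is an assumption, the time window is assumed to apply symmetrically around each trajectory timestamp, and "closest" is assumed to mean the smallest distance among the events inside that window.

```python
from datetime import datetime, timedelta
from math import asin, cos, inf, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius, in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearest_event_in_window(point, events, time_window=900):
    """Return (event_id, event_type, distance) for the closest event whose
    datetime lies within +/- time_window seconds of the point, or
    ('', '', inf) when no event qualifies."""
    window = timedelta(seconds=time_window)
    best = ("", "", inf)
    for ev in events:
        # Skip events outside the tolerated time range.
        if abs(ev["datetime"] - point["datetime"]) > window:
            continue
        dist = haversine_m(point["lat"], point["lon"], ev["lat"], ev["lon"])
        if dist < best[2]:
            best = (ev["event_id"], ev["event_type"], dist)
    return best


events = [
    {"lat": 39.984094, "lon": 116.319236,
     "datetime": datetime(2008, 10, 23, 5, 53, 5),
     "event_type": "show", "event_id": "forro_tropykalia"},
    {"lat": 39.991013, "lon": 116.326384,
     "datetime": datetime(2008, 10, 23, 10, 37, 26),
     "event_type": "show", "event_id": "dia_do_municipio"},
]
point = {"lat": 39.984559, "lon": 116.326696,
         "datetime": datetime(2008, 10, 23, 10, 37, 26)}
event_id, event_type, dist = nearest_event_in_window(point, events)
```

A point with no event inside the window keeps an infinite distance and empty labels, which mirrors the unmatched rows in the example output.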
pymove.utils.integration.join_with_home_by_id(data: pandas.core.frame.DataFrame, df_home: pandas.core.frame.DataFrame, label_id: str = 'id', label_address: str = 'formatted_address', label_city: str = 'city', drop_id_without_home: bool = False, inplace: bool = False)[source]

Performs the integration between trajectories and home points.

Generating new columns referring to the distance of the nearest home point, address and city of each trajectory point.

Parameters:
  • data (DataFrame) – The input trajectory data.
  • df_home (DataFrame) – The input home points data.
  • label_id (str, optional) – Label of df_home referring to the home point id, by default TRAJ_ID
  • label_address (str, optional) – Label of df_home referring to the home point address, by default ADDRESS
  • label_city (str, optional) – Label of df_home referring to the point city, by default CITY
  • drop_id_without_home (bool, optional) – flag as an option to drop id’s that don’t have houses, by default False
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False

Examples

>>> from pymove.utils.integration import join_with_home_by_id
>>> move_df
          lat          lon              datetime   id
0   39.984094   116.319236   2008-10-23 05:53:05    1
1   39.984559   116.326696   2008-10-23 10:37:26    1
2   40.002899   116.321520   2008-10-23 10:50:16    1
3   40.016238   116.307691   2008-10-23 11:03:06    1
4   40.013814   116.306525   2008-10-23 11:58:33    2
5   40.009735   116.315069   2008-10-23 23:50:45    2
>>> home_df
          lat          lon   id   formatted_address            city
0   39.984094   116.319236    1          rua da mae       quixiling
1   40.013821   116.306531    2      rua da familia   quixeramoling
>>> join_with_home_by_id(move_df, home_df)
    id         lat          lon              datetime     dist_home                   home                city
0    1   39.984094   116.319236   2008-10-23 05:53:05      0.000000             rua da mae           quixiling
1    1   39.984559   116.326696   2008-10-23 10:37:26    637.690216             rua da mae           quixiling
2    1   40.002899   116.321520   2008-10-23 10:50:16   2100.053501             rua da mae           quixiling
3    1   40.016238   116.307691   2008-10-23 11:03:06   3707.066732             rua da mae           quixiling
4    2   40.013814   116.306525   2008-10-23 11:58:33      0.931101         rua da familia       quixeramoling
5    2   40.009735   116.315069   2008-10-23 23:50:45    857.417540         rua da familia       quixeramoling
pymove.utils.integration.join_with_pois(data: pandas.core.frame.DataFrame, df_pois: pandas.core.frame.DataFrame, label_id: str = 'id', label_poi_name: str = 'name_poi', reset_index: bool = True, inplace: bool = False)[source]

Performs the integration between trajectories and the closest point of interest.

Generating two new columns referring to the name and the distance from the point of interest closest to each point of the trajectory.

Parameters:
  • data (DataFrame) – The input trajectory data.
  • df_pois (DataFrame) – The input point of interest data.
  • label_id (str, optional) – Label of df_pois referring to the Point of Interest id, by default TRAJ_ID
  • label_poi_name (str, optional) – Label of df_pois referring to the Point of Interest name, by default NAME_POI
  • reset_index (bool, optional) – Flag for reset index of the df_pois and data dataframes before the join, by default True
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False

Examples

>>> from pymove.utils.integration import join_with_pois
>>> move_df
          lat          lon              datetime   id
0   39.984094   116.319236   2008-10-23 05:53:05    1
1   39.984559   116.326696   2008-10-23 10:37:26    1
2   40.002899   116.321520   2008-10-23 10:50:16    1
3   40.016238   116.307691   2008-10-23 11:03:06    1
4   40.013814   116.306525   2008-10-23 11:58:33    2
5   40.009735   116.315069   2008-10-23 23:50:45    2
>>> pois
          lat          lon   id   type_poi              name_poi
0   39.984094   116.319236    1    policia        distrito_pol_1
1   39.991013   116.326384    2    policia       policia_federal
2   40.010000   116.312615    3   comercio   supermercado_aroldo
>>> join_with_pois(move_df, pois)
          lat          lon              datetime   id   id_poi            dist_poi              name_poi
0   39.984094   116.319236   2008-10-23 05:53:05    1        1            0.000000        distrito_pol_1
1   39.984559   116.326696   2008-10-23 10:37:26    1        1          637.690216        distrito_pol_1
2   40.002899   116.321520   2008-10-23 10:50:16    1        3         1094.860663   supermercado_aroldo
3   40.016238   116.307691   2008-10-23 11:03:06    1        3          810.542998   supermercado_aroldo
4   40.013814   116.306525   2008-10-23 11:58:33    2        3          669.973155   supermercado_aroldo
5   40.009735   116.315069   2008-10-23 23:50:45    2        3          211.069129   supermercado_aroldo
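The nearest-point lookup behind join_with_pois can be illustrated without pandas. The sketch below is an assumption-laden stand-in rather than the library's code: haversine_m and nearest_poi are hypothetical names, and the Earth radius of 6,371,000 m is assumed.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000):
    """Great-circle distance in meters (radius_m is an assumed Earth radius)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * radius_m * asin(sqrt(a))


def nearest_poi(lat, lon, pois):
    """Return (poi_id, poi_name, distance_m) of the POI closest to (lat, lon)."""
    candidates = (
        (p["id"], p["name_poi"], haversine_m(lat, lon, p["lat"], p["lon"]))
        for p in pois
    )
    # Pick the candidate with the smallest distance.
    return min(candidates, key=lambda item: item[2])


pois = [
    {"lat": 39.984094, "lon": 116.319236, "id": 1, "name_poi": "distrito_pol_1"},
    {"lat": 39.991013, "lon": 116.326384, "id": 2, "name_poi": "policia_federal"},
    {"lat": 40.010000, "lon": 116.312615, "id": 3, "name_poi": "supermercado_aroldo"},
]
first = nearest_poi(39.984094, 116.319236, pois)
last = nearest_poi(40.009735, 116.315069, pois)
```

Scanning every point of interest for every trajectory point is quadratic; it is shown here only because it makes the rule easy to read.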
pymove.utils.integration.join_with_pois_by_category(data: pandas.core.frame.DataFrame, df_pois: pandas.core.frame.DataFrame, label_category: str = 'type_poi', label_id: str = 'id', inplace: bool = False)[source]

Performs the integration between trajectories and each type of points of interest.

Generating new columns referring to the category and distance from the nearest point of interest that has this category at each point of the trajectory.

Parameters:
  • data (DataFrame) – The input trajectory data.
  • df_pois (DataFrame) – The input point of interest data.
  • label_category (str, optional) – Label of df_pois referring to the point of interest category, by default TYPE_POI
  • label_id (str, optional) – Label of df_pois referring to the point of interest id, by default TRAJ_ID
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False

Examples

>>> from pymove.utils.integration import join_with_pois_by_category
>>> move_df
          lat          lon              datetime   id
0   39.984094   116.319236   2008-10-23 05:53:05    1
1   39.984559   116.326696   2008-10-23 10:37:26    1
2   40.002899   116.321520   2008-10-23 10:50:16    1
3   40.016238   116.307691   2008-10-23 11:03:06    1
4   40.013814   116.306525   2008-10-23 11:58:33    2
5   40.009735   116.315069   2008-10-23 23:50:45    2
>>> pois
          lat          lon   id   type_poi              name_poi
0   39.984094   116.319236    1    policia        distrito_pol_1
1   39.991013   116.326384    2    policia       policia_federal
2   40.010000   116.312615    3   comercio   supermercado_aroldo
>>> join_with_pois_by_category(move_df, pois)
          lat          lon              datetime   id             id_policia   dist_policia   id_comercio   dist_comercio
0   39.984094   116.319236   2008-10-23 05:53:05    1                      1       0.000000             3     2935.310277
1   39.984559   116.326696   2008-10-23 10:37:26    1                      1     637.690216             3     3072.696379
2   40.002899   116.321520   2008-10-23 10:50:16    1                      2    1385.087181             3     1094.860663
3   40.016238   116.307691   2008-10-23 11:03:06    1                      2    3225.288831             3      810.542998
4   40.013814   116.306525   2008-10-23 11:58:33    2                      2    3047.838222             3      669.973155
5   40.009735   116.315069   2008-10-23 23:50:45    2                      2    2294.075820             3      211.069129
pymove.utils.integration.merge_home_with_poi(data: pandas.core.frame.DataFrame, label_dist_poi: str = 'dist_poi', label_name_poi: str = 'name_poi', label_id_poi: str = 'id_poi', label_home: str = 'home', label_dist_home: str = 'dist_home', drop_columns: bool = True, inplace: bool = False)[source]

Merges the points of interest with the home points of the trajectories.

The home points are treated as additional points of interest, and a new DataFrame is generated.

Parameters:
  • data (DataFrame) – The input trajectory data, with join_with_pois and join_with_home_by_id applied.
  • label_dist_poi (str, optional) – Label of data referring to the distance from the nearest point of interest, by default DIST_POI
  • label_name_poi (str, optional) – Label of data referring to the name from the nearest point of interest, by default NAME_POI
  • label_id_poi (str, optional) – Label of data referring to the id from the nearest point of interest, by default ID_POI
  • label_home (str, optional) – Label of df_home referring to the home point, by default HOME
  • label_dist_home (str, optional) – Label of df_home referring to the distance to the home point, by default DIST_HOME
  • drop_columns (bool, optional) – Flag that controls the deletion of the columns referring to the id and the distance from the home point, by default True
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False

Examples

>>> from pymove.utils.integration import (
...     merge_home_with_poi,
...     join_with_home_by_id
... )
>>> move_df
          lat          lon              datetime   id                    id_poi       dist_poi          name_poi
0   39.984094   116.319236   2008-10-23 05:53:05    1                         1       0.000000    distrito_pol_1
1   39.984559   116.326696   2008-10-23 10:37:26    1                         1     637.690216    distrito_pol_1
2   40.002899   116.321520   2008-10-23 10:50:16    1                         2    1385.087181   policia_federal
3   40.016238   116.307691   2008-10-23 11:03:06    1                         2    3225.288831   policia_federal
4   40.013814   116.306525   2008-10-23 11:58:33    2                         2    3047.838222   policia_federal
5   40.009735   116.315069   2008-10-23 23:50:45    2                         2    2294.075820   policia_federal
>>> home_df
           lat          lon   id   formatted_address            city
0   39.984094   116.319236    1          rua da mae       quixiling
1   40.013821   116.306531    2      rua da familia   quixeramoling
>>> join_with_home_by_id(move_df, home_df, inplace=True)
>>> move_df
    id         lat          lon              datetime   id_poi      dist_poi                name_poi    dist_home         home                city
0    1   39.984094   116.319236   2008-10-23 05:53:05        1      0.000000          distrito_pol_1     0.000000        rua da mae       quixiling
1    1   39.984559   116.326696   2008-10-23 10:37:26        1    637.690216          distrito_pol_1   637.690216        rua da mae       quixiling
2    1   40.002899   116.321520   2008-10-23 10:50:16        2   1385.087181         policia_federal  2100.053501        rua da mae       quixiling
3    1   40.016238   116.307691   2008-10-23 11:03:06        2   3225.288831         policia_federal  3707.066732        rua da mae       quixiling
4    2   40.013814   116.306525   2008-10-23 11:58:33        2   3047.838222         policia_federal     0.931101    rua da familia   quixeramoling
5    2   40.009735   116.315069   2008-10-23 23:50:45        2   2294.075820         policia_federal   857.417540    rua da familia   quixeramoling
>>> merge_home_with_poi(move_df)
    id         lat          lon              datetime           id_poi           dist_poi           name_poi            city
0    1   39.984094   116.319236   2008-10-23 05:53:05       rua da mae           0.000000               home       quixiling
1    1   39.984559   116.326696   2008-10-23 10:37:26       rua da mae         637.690216               home       quixiling
2    1   40.002899   116.321520   2008-10-23 10:50:16                2        1385.087181    policia_federal       quixiling
3    1   40.016238   116.307691   2008-10-23 11:03:06                2        3225.288831    policia_federal       quixiling
4    2   40.013814   116.306525   2008-10-23 11:58:33   rua da familia           0.931101               home   quixeramoling
5    2   40.009735   116.315069   2008-10-23 23:50:45   rua da familia         857.417540               home   quixeramoling
pymove.utils.integration.union_poi_bank(data: DataFrame, label_poi: str = 'type_poi', banks: list[str] | None = None, inplace: bool = False) → DataFrame | None[source]

Performs the union between the different bank categories.

For Points of Interest in a single category named ‘banks’.

Parameters:
  • data (DataFrame) – Input points of interest data
  • label_poi (str, optional) – Label referring to the Point of Interest category, by default TYPE_POI
  • banks (list of str, optional) –
    Names of poi referring to banks, by default
    banks = [
    ‘bancos_filiais’, ‘bancos_agencias’, ‘bancos_postos’, ‘bancos_PAE’, ‘bank’
    ]

  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

data with poi or None

Return type:

DataFrame

Examples

>>> from pymove.utils.integration import union_poi_bank
>>> pois_df
          lat          lon   id          type_poi
0   39.984094   116.319236    1              bank
1   39.984198   116.319322    2       randomvalue
2   39.984224   116.319402    3     bancos_postos
3   39.984211   116.319389    4       randomvalue
4   39.984217   116.319422    5        bancos_PAE
5   39.984710   116.319865    6     bancos_postos
6   39.984674   116.319810    7   bancos_agencias
7   39.984623   116.319773    8    bancos_filiais
8   39.984606   116.319732    9             banks
9   39.984555   116.319728   10             banks
>>> union_poi_bank(pois_df)
          lat          lon   id      type_poi
0   39.984094   116.319236    1         banks
1   39.984198   116.319322    2   randomvalue
2   39.984224   116.319402    3         banks
3   39.984211   116.319389    4   randomvalue
4   39.984217   116.319422    5         banks
5   39.984710   116.319865    6         banks
6   39.984674   116.319810    7         banks
7   39.984623   116.319773    8         banks
8   39.984606   116.319732    9         banks
9   39.984555   116.319728   10         banks
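The relabeling performed by the union_poi_* functions amounts to mapping a set of category labels onto one target label. The sketch below shows that idea on a plain list; union_categories is a hypothetical helper and does not reproduce the DataFrame handling or the inplace behaviour of the library.

```python
def union_categories(categories, members, target):
    """Return a new list where every label found in `members` becomes `target`."""
    members = set(members)  # set lookup keeps each membership test cheap
    return [target if label in members else label for label in categories]


bank_labels = [
    "bancos_filiais", "bancos_agencias", "bancos_postos", "bancos_PAE", "bank",
]
type_poi = ["bank", "randomvalue", "bancos_postos", "banks"]
merged = union_categories(type_poi, bank_labels, "banks")
```

Labels that already carry the target name, such as 'banks' here, pass through unchanged, as do unrelated labels.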
pymove.utils.integration.union_poi_bar_restaurant(data: DataFrame, label_poi: str = 'type_poi', bar_restaurant: list[str] | None = None, inplace: bool = False) → DataFrame | None[source]

Performs the union between bar and restaurant categories.

For Points of Interest in a single category named ‘bar-restaurant’.

Parameters:
  • data (DataFrame) – Input points of interest data
  • label_poi (str, optional) – Label referring to the Point of Interest category, by default TYPE_POI
  • bar_restaurant (list of str, optional) –
    Names of poi referring to bars or restaurants, by default
    bar_restaurant = [
    ‘restaurant’, ‘bar’
    ]

  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

data with poi or None

Return type:

DataFrame

Examples

>>> from pymove.utils.integration import union_poi_bar_restaurant
>>> pois_df
          lat          lon   id         type_poi
0   39.984094   116.319236    1       restaurant
1   39.984198   116.319322    2       restaurant
2   39.984224   116.319402    3      randomvalue
3   39.984211   116.319389    4              bar
4   39.984217   116.319422    5              bar
5   39.984710   116.319865    6   bar-restaurant
6   39.984674   116.319810    7        random123
7   39.984623   116.319773    8              123
>>> union_poi_bar_restaurant(pois_df)
          lat          lon   id         type_poi
0   39.984094   116.319236    1   bar-restaurant
1   39.984198   116.319322    2   bar-restaurant
2   39.984224   116.319402    3      randomvalue
3   39.984211   116.319389    4   bar-restaurant
4   39.984217   116.319422    5   bar-restaurant
5   39.984710   116.319865    6   bar-restaurant
6   39.984674   116.319810    7        random123
7   39.984623   116.319773    8              123
pymove.utils.integration.union_poi_bus_station(data: DataFrame, label_poi: str = 'type_poi', bus_stations: list[str] | None = None, inplace: bool = False) → DataFrame | None[source]

Performs the union between the different bus station categories.

For Points of Interest in a single category named ‘bus_station’.

Parameters:
  • data (DataFrame) – Input points of interest data
  • label_poi (str, optional) – Label referring to the Point of Interest category, by default TYPE_POI
  • bus_stations (list of str, optional) –
    Names of poi referring to bus_stations, by default
    bus_stations = [
    ‘transit_station’, ‘pontos_de_onibus’
    ]

  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

data with poi or None

Return type:

DataFrame

Examples

>>> from pymove.utils.integration import union_poi_bus_station
>>> pois_df
          lat          lon  id           type_poi
0   39.984094   116.319236   1    transit_station
1   39.984198   116.319322   2        randomvalue
2   39.984224   116.319402   3    transit_station
3   39.984211   116.319389   4   pontos_de_onibus
4   39.984217   116.319422   5    transit_station
5   39.984710   116.319865   6        randomvalue
6   39.984674   116.319810   7        bus_station
7   39.984623   116.319773   8        bus_station
>>> union_poi_bus_station(pois_df)
          lat          lon  id           type_poi
0   39.984094   116.319236   1        bus_station
1   39.984198   116.319322   2        randomvalue
2   39.984224   116.319402   3        bus_station
3   39.984211   116.319389   4        bus_station
4   39.984217   116.319422   5        bus_station
5   39.984710   116.319865   6        randomvalue
6   39.984674   116.319810   7        bus_station
7   39.984623   116.319773   8        bus_station
pymove.utils.integration.union_poi_parks(data: DataFrame, label_poi: str = 'type_poi', parks: list[str] | None = None, inplace: bool = False) → DataFrame | None[source]

Performs the union between park categories.

For Points of Interest in a single category named ‘parks’.

Parameters:
  • data (DataFrame) – Input points of interest data
  • label_poi (str, optional) – Label referring to the Point of Interest category, by default TYPE_POI
  • parks (list of str, optional) –
    Names of poi referring to parks, by default
    parks = [
    ‘pracas_e_parques’, ‘park’
    ]

  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

data with poi or None

Return type:

DataFrame

Examples

>>> from pymove.utils.integration import union_poi_parks
>>> pois_df
          lat          lon   id           type_poi
0   39.984094   116.319236    1   pracas_e_parques
1   39.984198   116.319322    2               park
2   39.984224   116.319402    3              parks
3   39.984211   116.319389    4             random
4   39.984217   116.319422    5                123
5   39.984710   116.319865    6               park
6   39.984674   116.319810    7              parks
7   39.984623   116.319773    8   pracas_e_parques
>>> union_poi_parks(pois_df)
          lat          lon   id           type_poi
0   39.984094   116.319236    1              parks
1   39.984198   116.319322    2              parks
2   39.984224   116.319402    3              parks
3   39.984211   116.319389    4             random
4   39.984217   116.319422    5                123
5   39.984710   116.319865    6              parks
6   39.984674   116.319810    7              parks
7   39.984623   116.319773    8              parks
pymove.utils.integration.union_poi_police(data: DataFrame, label_poi: str = 'type_poi', police: list[str] | None = None, inplace: bool = False) → DataFrame | None[source]

Performs the union between police categories.

For Points of Interest in a single category named ‘police’.

Parameters:
  • data (DataFrame) – Input points of interest data
  • label_poi (str, optional) – Label referring to the Point of Interest category, by default TYPE_POI
  • police (list of str, optional) –
    Names of poi referring to police stations, by default
    police = [
    ‘distritos_policiais’, ‘delegacia’
    ]

  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

data with poi or None

Return type:

DataFrame

Examples

>>> from pymove.utils.integration import union_poi_police
>>> pois_df
          lat          lon   id              type_poi
0   39.984094   116.319236    1   distritos_policiais
1   39.984198   116.319322    2                police
2   39.984224   116.319402    3                police
3   39.984211   116.319389    4   distritos_policiais
4   39.984217   116.319422    5                random
5   39.984710   116.319865    6           randomvalue
6   39.984674   116.319810    7                   123
7   39.984623   116.319773    8           bus_station
>>> union_poi_police(pois_df)
          lat          lon   id              type_poi
0   39.984094   116.319236    1                police
1   39.984198   116.319322    2                police
2   39.984224   116.319402    3                police
3   39.984211   116.319389    4                police
4   39.984217   116.319422    5                random
5   39.984710   116.319865    6           randomvalue
6   39.984674   116.319810    7                   123
7   39.984623   116.319773    8           bus_station

pymove.utils.log module

Logging operations.

progress_bar, set_verbosity, timer_decorator

pymove.utils.log.progress_bar(sequence: Iterable, desc: str | None = None, total: int | None = None, miniters: int | None = None)[source]

Make and display a progress bar.

Parameters:
  • sequence (iterable) – Represents a sequence of elements.
  • desc (str, optional) – Represents the description of the operation, by default None.
  • total (int, optional) – Represents the total number of elements in sequence, by default None.
  • miniters (int, optional) – Represents the steps in which the bar will be updated, by default None.
Example

>>> from pymove.utils.log import progress_bar
>>> for i in progress_bar(range(1, 101), desc='Print 1 to 100'):
...     print(i)
# A bar that shows the progress of the iterations

pymove.utils.log.set_verbosity(level)[source]

Change logging level.

pymove.utils.log.timer_decorator(func: Callable) → Callable[source]

A decorator that prints how long a function took to run.
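A timing decorator of this kind can be sketched in a few lines. The version below is illustrative only: timer_sketch is a hypothetical name, and the message format and the use of time.perf_counter are assumptions, not the behaviour of timer_decorator itself.

```python
import functools
import time


def timer_sketch(func):
    """Print the wall-clock time taken by each call to func."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failures are timed as well.
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.6f} s")
    return wrapper


@timer_sketch
def add(a, b):
    return a + b


result = add(2, 3)
```

The wrapper returns whatever the decorated function returns, so adding the decorator does not change the caller's code.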

pymove.utils.math module

Math operations.

is_number, std, avg_std, std_sample, avg_std_sample, arrays_avg, array_stats, interpolation

pymove.utils.math.array_stats(values_array: list[float]) → tuple[float, float, int][source]

Computes statistics about the array.

The sum of all the elements in the array, the sum of the square of each element and the number of elements of the array.

Parameters:values_array (array like of numerical values.) – Represents the set of values to compute the operation.
Returns:
  • float – The sum of all the elements in the array.
  • float – The sum of the square value of each element in the array.
  • int – The number of elements in the array.

Example

>>> from pymove.utils.math import array_stats
>>> list = [7.8, 9.7, 6.4, 5.6, 10]
>>> print(array_stats(list), type(array_stats(list)))
(39.5, 327.25, 5) <class 'tuple'>
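The three statistics can be accumulated in a single pass over the values. The sketch below is a plain-Python illustration under that assumption; array_stats_sketch is a hypothetical name, not the library function.

```python
def array_stats_sketch(values):
    """Return (sum, sum of squares, count) computed in one pass."""
    total = 0.0
    total_sq = 0.0
    count = 0
    for v in values:
        total += v
        total_sq += v * v  # multiply instead of using ** 2
        count += 1
    return total, total_sq, count


values = [7.8, 9.7, 6.4, 5.6, 10]
total, total_sq, count = array_stats_sketch(values)
```

These three numbers are enough to derive the mean and the standard deviation later without revisiting the data.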
pymove.utils.math.arrays_avg(values_array: list[float], weights_array: list[float] | None = None) → float[source]

Computes the mean of the elements of the array.

Parameters:
  • values_array (array like of numerical values.) – Represents the set of values to compute the operation.
  • weights_array (array, optional, default None.) – Used to calculate the weighted average, indicates the weight of each element in the array (values_array).
Returns:

The mean of the array elements.

Return type:

float

Examples

>>> from pymove.utils.math import arrays_avg
>>> list = [7.8, 9.7, 6.4, 5.6, 10]
>>> weights = [0.1, 0.3, 0.15, 0.15, 0.3]
>>> print('standard average', arrays_avg(list), type(arrays_avg(list)))
standard average 7.9 <class 'float'>
>>> print(
...     'weighted average: ',
...     arrays_avg(list, weights),
...     type(arrays_avg(list, weights))
... )
weighted average:  1.6979999999999997 <class 'float'>
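The weighted figure in the example is easier to interpret next to the arithmetic. The sketch below uses plain Python and makes no claim about how arrays_avg is implemented; the variable names are illustrative. The weighted sum of the example values is 8.49, and the documented output of about 1.698 is what results from dividing that sum by the number of elements (5), whereas dividing by the sum of the weights (1.0) gives 8.49.

```python
values = [7.8, 9.7, 6.4, 5.6, 10]
weights = [0.1, 0.3, 0.15, 0.15, 0.3]

# Sum of each value multiplied by its weight.
weighted_sum = sum(v * w for v, w in zip(values, weights))

# Dividing by the element count reproduces the documented figure.
by_count = weighted_sum / len(values)

# Dividing by the total weight is the conventional weighted mean.
by_weight_total = weighted_sum / sum(weights)
```

Readers who need a conventional weighted mean may want to check which of the two normalisations their installed version applies.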
pymove.utils.math.avg_std(values_array: list[float]) → tuple[float, float][source]

Compute the average and the standard deviation.

Parameters:values_array (array like of numerical values.) – Represents the set of values to compute the operation.
Returns:
  • float – Represents the value of average.
  • float – Represents the value of standard deviation.

Example

>>> from pymove.utils.math import avg_std
>>> list = [7.8, 9.7, 6.4, 5.6, 10]
>>> print(avg_std(list), type(avg_std(list)))
(7.9, 1.7435595774162693) <class 'tuple'>
pymove.utils.math.avg_std_sample(values_array: list[float]) → tuple[float, float][source]

Compute the average and the standard deviation of sample.

Parameters:values_array (array like of numerical values.) – Represents the set of values to compute the operation.
Returns:
  • float – Represents the value of average
  • float – Represents the standard deviation of sample.

Example

>>> from pymove.utils.math import avg_std_sample
>>> list = [7.8, 9.7, 6.4, 5.6, 10]
>>> print(avg_std_sample(list), type(avg_std_sample(list)))
(7.9, 1.9493588689617927) <class 'tuple'>
pymove.utils.math.interpolation(x0: float, y0: float, x1: float, y1: float, x: float) → float[source]

Performs linear interpolation.

Parameters:
  • x0 (float.) – The coordinate of the first point on the x axis.
  • y0 (float.) – The coordinate of the first point on the y axis.
  • x1 (float.) – The coordinate of the second point on the x axis.
  • y1 (float.) – The coordinate of the second point on the y axis.
  • x (float.) – A value in the interval (x0, x1).
Returns:

The interpolated or extrapolated value.

Return type:

float.

Example

>>> from pymove.utils.math import interpolation
>>> x0, y0, x1, y1, x = 2, 4, 3, 6, 3.5
>>> print(interpolation(x0,y0,x1,y1,x), type(interpolation(x0,y0,x1,y1,x)))
7.0 <class 'float'>
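The example result is consistent with the straight line through (x0, y0) and (x1, y1), that is y = y0 + (y1 - y0) * (x - x0) / (x1 - x0). The sketch below is a minimal version of that formula; linear_interpolation is a hypothetical name and the explicit error for x0 == x1 is an added assumption.

```python
def linear_interpolation(x0, y0, x1, y1, x):
    """Value at x on the straight line through (x0, y0) and (x1, y1)."""
    if x1 == x0:
        raise ValueError("x0 and x1 must differ")
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


inside = linear_interpolation(2, 4, 3, 6, 2.5)   # x between x0 and x1
outside = linear_interpolation(2, 4, 3, 6, 3.5)  # x beyond x1: extrapolation
```

With x = 3.5 lying outside the interval from 2 to 3, the second call extrapolates along the same line.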
pymove.utils.math.is_number(value: int | float | str)[source]

Returns whether value is numerical.

Parameters:value (int, float, str) –
Returns:True if numerical, otherwise False
Return type:boolean

Examples

>>> from pymove.utils.math import is_number
>>> a, b, c, d = 50, 22.5, '11.25', 'house'
>>> print(is_number(a), type(is_number(a)))
True <class 'bool'>
>>> print(is_number(b), type(is_number(b)))
True <class 'bool'>
>>> print(is_number(c), type(is_number(c)))
True <class 'bool'>
>>> print(is_number(d), type(is_number(d)))
False <class 'bool'>
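A check with the same observable behaviour as the examples above can be written by attempting a float conversion. This is a sketch under that assumption, with the hypothetical name is_number_sketch; it is not taken from the library source.

```python
def is_number_sketch(value):
    """Return True when value can be converted to float, otherwise False."""
    try:
        float(value)
    except (TypeError, ValueError):
        # TypeError covers objects such as None, ValueError covers text like 'house'.
        return False
    return True


checks = [is_number_sketch(v) for v in (50, 22.5, "11.25", "house")]
```

Under this approach strings such as 'nan' or 'inf' also count as numerical, because float accepts them.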
pymove.utils.math.std(values_array: list[float]) → float[source]

Compute standard deviation.

Parameters:values_array (array like of numerical values.) – Represents the set of values to compute the operation.
Returns:Represents the value of standard deviation.
Return type:float

References

Squaring with * is over 3 times as fast as with **2: http://stackoverflow.com/questions/29046346/comparison-of-power-to-multiplication-in-python

Example

>>> from pymove.utils.math import std
>>> list = [7.8, 9.7, 6.4, 5.6, 10]
>>> print(std(list), type(std(list)))
1.7435595774162693 <class 'float'>
pymove.utils.math.std_sample(values_array: list[float]) → float[source]

Compute the standard deviation of sample.

Parameters:values_array (array like of numerical values.) – Represents the set of values to compute the operation.
Returns:Represents the value of standard deviation of sample.
Return type:float

Example

>>> from pymove.utils.math import std_sample
>>> list = [7.8, 9.7, 6.4, 5.6, 10]
>>> print(std_sample(list), type(std_sample(list)))
1.9493588689617927 <class 'float'>
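The two example results for std and std_sample differ only by Bessel's correction: the sample value equals the population value multiplied by sqrt(n / (n - 1)). The sketch below shows this relation from the sum and the sum of squares; the function names are hypothetical and the computation is an assumption about one reasonable approach, not a copy of the library code.

```python
from math import sqrt


def std_population(values):
    """Population standard deviation from the sum and the sum of squares."""
    n = len(values)
    mean = sum(values) / n
    mean_sq = sum(v * v for v in values) / n
    # Clamp at zero to guard against a tiny negative variance from rounding.
    return sqrt(max(mean_sq - mean * mean, 0.0))


def std_sample_from_population(values):
    """Sample standard deviation obtained through Bessel's correction."""
    n = len(values)
    return std_population(values) * sqrt(n / (n - 1))


values = [7.8, 9.7, 6.4, 5.6, 10]
population = std_population(values)
sample = std_sample_from_population(values)
```

The sample version needs at least two values, since n - 1 appears in the denominator.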

pymove.utils.mem module

Memory operations.

reduce_mem_usage_automatic, total_size, begin_operation, end_operation, sizeof_fmt, top_mem_vars

pymove.utils.mem.begin_operation(name: str) → dict[source]

Gets the stats for the current operation.

Parameters:name (str) – name of the operation
Returns:dictionary with the operation stats
Return type:dict

Examples

>>> from pymove.utils.mem import begin_operation
>>> operation = begin_operation('operation')
>>> operation
{
    'process': psutil.Process(
        pid=103401, name='python', status='running', started='21:48:11'
    ),
    'init': 293732352, 'start': 1622082973.8825781, 'name': 'operation'
}
pymove.utils.mem.end_operation(operation: dict) → dict[source]

Gets the time and memory usage of the operation.

Parameters:operation (dict) – dictionary with the beginning stats of the operation
Returns:dictionary with the operation execution stats
Return type:dict

Examples

>>> import numpy as np
>>> import time
>>> from pymove.utils.mem import begin_operation, end_operation
>>> operation = begin_operation('create_arr')
>>> arr = np.arange(100000, dtype=np.float64)
>>> time.sleep(1.2)
>>> end_operation(operation)
{'name': 'create_arr', 'time in seconds': 1.2022554874420166, 'memory': '752.0 KiB'}
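
The begin/end pattern can be sketched with the standard library alone. Note the assumptions: the documented output shows a psutil process handle, whereas this sketch swaps in tracemalloc, which tracks Python allocations rather than process memory. The function names and the 'memory in bytes' key are hypothetical.

```python
import time
import tracemalloc


def begin_operation_sketch(name):
    """Record the starting time and traced memory for an operation."""
    tracemalloc.start()
    current, _peak = tracemalloc.get_traced_memory()
    return {'name': name, 'init': current, 'start': time.time()}


def end_operation_sketch(operation):
    """Return elapsed time and memory growth since begin_operation_sketch."""
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        'name': operation['name'],
        'time in seconds': time.time() - operation['start'],
        'memory in bytes': current - operation['init'],
    }


op = begin_operation_sketch('create_list')
data = list(range(100_000))
stats = end_operation_sketch(op)
print(stats)
```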
pymove.utils.mem.reduce_mem_usage_automatic(df: pandas.core.frame.DataFrame)[source]

Reduces the memory usage of the given dataframe.

Parameters:df (DataFrame) – The input data on which the operation will be performed.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pymove.utils.mem import reduce_mem_usage_automatic
>>> df = pd.DataFrame({'col_1': np.arange(10000, dtype=np.float64)})
>>> df.dtypes
col_1    float64
dtype: object
>>> reduce_mem_usage_automatic(df)
'Memory usage of dataframe is 0.08 MB'
'Memory usage after optimization is: 0.02 MB'
'Decreased by 74.9 %'
>>> df.dtypes
col_1    float16
dtype: object
pymove.utils.mem.sizeof_fmt(mem_usage: float, suffix: str = 'B') → str[source]

Returns the memory usage calculation of the last function.

Parameters:
  • mem_usage (float) – memory usage in bytes
  • suffix (string, optional) – suffix of the unit, by default ‘B’
Returns:

A string of the memory usage in a more readable format

Return type:

str

Examples

>>> from pymove.utils.mem import sizeof_fmt
>>> sizeof_fmt(1024)
'1.0 KiB'
>>> sizeof_fmt(2e6)
'1.9 MiB'
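
A common way to produce this kind of output is to divide by 1024 until the value fits the current binary unit. The following is a sketch of that technique, assuming the unit list shown; sizeof_fmt_sketch is a hypothetical name and the library's own implementation may differ.

```python
def sizeof_fmt_sketch(mem_usage, suffix='B'):
    """Format a byte count using binary prefixes (KiB, MiB, ...)."""
    for unit in ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei', 'Zi']:
        if abs(mem_usage) < 1024.0:
            return f'{mem_usage:3.1f} {unit}{suffix}'
        mem_usage /= 1024.0
    # Anything larger than the listed units falls through to yobibytes
    return f'{mem_usage:.1f} Yi{suffix}'


print(sizeof_fmt_sketch(1024))  # 1.0 KiB
print(sizeof_fmt_sketch(2e6))   # 1.9 MiB
```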
pymove.utils.mem.top_mem_vars(variables: dict, n: int = 10, hide_private=True) → pandas.core.frame.DataFrame[source]

Shows the sizes of the active variables.

Parameters:
  • variables (locals() or globals()) – Whether to show local or global variables
  • n (int, optional) – number of variables to show, by default 10
  • hide_private (bool, optional) – Whether to hide private variables, by default True
Returns:

dataframe with variables names and sizes

Return type:

DataFrame

Examples

>>> import numpy as np
>>> from pymove.utils.mem import top_mem_vars
>>> arr = np.arange(100000, dtype=np.float64)
>>> long_string = 'Hello World!' * 100
>>> top_mem_vars(locals())
            var        mem
0           arr  781.4 KiB
1   long_string    1.2 KiB
2         local    416.0 B
3  top_mem_vars    136.0 B
4            np     72.0 B
pymove.utils.mem.total_size(o: object, handlers: Optional[dict] = None, verbose: bool = True) → float[source]

Calculates the approximate memory footprint of a given object.

Automatically finds the contents of the following builtin containers and their subclasses: tuple, list, deque, dict, set and frozenset.

Parameters:
  • o (object) – The object whose memory footprint will be calculated.
  • handlers (dict, optional) –
    To search other containers, add handlers to iterate over their contents,
    handlers = {SomeContainerClass: iter,
    OtherContainerClass: OtherContainerClass.get_elements}

    by default None

  • verbose (boolean, optional) –

    If set to True, the following information will be printed for each content of the object, by default True

    • the size of the object in bytes
    • its type
    • the object values
Returns:

The memory used by the given object

Return type:

float

Examples

>>> import numpy as np
>>> from pymove.utils.mem import total_size
>>> arr = np.arange(10000, dtype=np.float64)
>>> sz = total_size(arr)
'Size in bytes: 80104, Type: <class 'numpy.ndarray'>'
>>> sz
432
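
The container traversal described above can be sketched as a recursive walk that sums sys.getsizeof over each object and its contents, visiting every object once. This is an illustrative sketch of that general technique; total_size_sketch is a hypothetical name, and the printing controlled by verbose is omitted.

```python
import sys
from collections import deque
from itertools import chain


def total_size_sketch(o, handlers=None):
    """Approximate the memory footprint of o, including container contents."""
    all_handlers = {
        tuple: iter, list: iter, deque: iter,
        dict: lambda d: chain.from_iterable(d.items()),
        set: iter, frozenset: iter,
    }
    all_handlers.update(handlers or {})  # user-supplied container handlers
    seen = set()  # ids of objects already counted

    def sizeof(obj):
        if id(obj) in seen:
            return 0
        seen.add(id(obj))
        size = sys.getsizeof(obj)
        for container_type, handler in all_handlers.items():
            if isinstance(obj, container_type):
                size += sum(map(sizeof, handler(obj)))
                break
        return size

    return sizeof(o)


nested = {'a': [1, 2, 3], 'b': ('x', 'y')}
print(total_size_sketch(nested))
```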

pymove.utils.trajectories module

Data operations.

read_csv, invert_dict, flatten_dict, flatten_columns, shift, fill_list_with_new_values, object_for_array, column_to_array

pymove.utils.trajectories.column_to_array(data: pandas.core.frame.DataFrame, column: str) → pandas.core.frame.DataFrame[source]

Transforms all values of a column to lists.

Parameters:
  • data (dataframe) – The input trajectory data
  • column (str) – Label of data referring to the column for conversion
Returns:

Dataframe with the selected column converted to list

Return type:

dataframe

Example

>>> from pymove.utils.trajectories import column_to_array
>>> move_df
          lat          lon              datetime  id   list_column
0   39.984094   116.319236   2008-10-23 05:53:05   1        '[1,2]'
1   39.984198   116.319322   2008-10-23 05:53:06   1        '[3,4]'
2   39.984224   116.319402   2008-10-23 05:53:11   1        '[5,6]'
3   39.984211   116.319389   2008-10-23 05:53:16   1        '[7,8]'
4   39.984217   116.319422   2008-10-23 05:53:21   1       '[9,10]'
>>> column_to_array(move_df, column='list_column')
          lat          lon              datetime  id    list_column
0   39.984094   116.319236   2008-10-23 05:53:05   1      [1.0,2.0]
1   39.984198   116.319322   2008-10-23 05:53:06   1      [3.0,4.0]
2   39.984224   116.319402   2008-10-23 05:53:11   1      [5.0,6.0]
3   39.984211   116.319389   2008-10-23 05:53:16   1      [7.0,8.0]
4   39.984217   116.319422   2008-10-23 05:53:21   1     [9.0,10.0]
pymove.utils.trajectories.fill_list_with_new_values(original_list: list, new_list_values: list)[source]

Copies elements from one list to another.

The elements will be positioned in the same position in the new list as they were in their original list.

Parameters:
  • original_list (list.) – The list to which the elements will be copied
  • new_list_values (list.) – The list from which elements will be copied

Example

>>> from pymove.utils.trajectories import fill_list_with_new_values
>>> lst = [1, 2, 3, 4]
>>> fill_list_with_new_values(lst, ['a','b'])
>>> print(lst)
['a', 'b', 3, 4]
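
The in-place, position-preserving copy can be sketched in a few lines. The name fill_list_sketch is hypothetical, and the sketch assumes the new list is no longer than the original.

```python
def fill_list_sketch(original_list, new_list_values):
    """Overwrite the leading positions of original_list in place."""
    for index, value in enumerate(new_list_values):
        original_list[index] = value


lst = [1, 2, 3, 4]
fill_list_sketch(lst, ['a', 'b'])
print(lst)  # ['a', 'b', 3, 4]
```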
pymove.utils.trajectories.flatten_columns(data: pandas.core.frame.DataFrame, columns: list) → pandas.core.frame.DataFrame[source]

Transforms columns containing dictionaries into individual columns.

Parameters:
  • data (DataFrame) – Dataframe with columns to be flattened
  • columns (list) – List of columns from dataframe containing dictionaries
Returns:

Dataframe with the new columns from the flattened dictionary columns

Return type:

dataframe

References

https://stackoverflow.com/questions/51698540/import-nested-mongodb-to-pandas

Examples

>>> from pymove.utils.trajectories import flatten_columns
>>> move_df
          lat          lon              datetime  id           dict_column
0   39.984094   116.319236   2008-10-23 05:53:05   1              {'a': 1}
1   39.984198   116.319322   2008-10-23 05:53:06   1              {'b': 2}
2   39.984224   116.319402   2008-10-23 05:53:11   1      {'c': 3, 'a': 4}
3   39.984211   116.319389   2008-10-23 05:53:16   1              {'b': 2}
4   39.984217   116.319422   2008-10-23 05:53:21   1      {'a': 3, 'c': 2}
>>> flatten_columns(move_df, columns=['dict_column'])
          lat            lon               datetime   id     dict_column_b         dict_column_c   dict_column_a
0   39.984094      116.319236   2008-10-23 05:53:05    1               NaN                   NaN             1.0
1   39.984198      116.319322   2008-10-23 05:53:06    1               2.0                   NaN             NaN
2   39.984224      116.319402   2008-10-23 05:53:11    1               NaN                   3.0             4.0
3   39.984211      116.319389   2008-10-23 05:53:16    1               2.0                   NaN             NaN
4   39.984217      116.319422   2008-10-23 05:53:21    1               NaN                   2.0             3.0
pymove.utils.trajectories.flatten_dict(d: dict, parent_key: str = '', sep: str = '_') → dict[source]

Flattens a nested dictionary.

Parameters:
  • d (dict) – Dictionary to be flattened
  • parent_key (str, optional) – Key of the parent dictionary, by default ‘’
  • sep (str, optional) – Separator for the parent and child keys, by default ‘_’
Returns:

Flattened dictionary

Return type:

dict

References

https://stackoverflow.com/questions/6027558/flatten-nested-dictionaries-compressing-keys

Examples

>>> from pymove.utils.trajectories import flatten_dict
>>> d = {'a': 1, 'b': {'c': 2, 'd': 3}}
>>> flatten_dict(d)
{'a': 1, 'b_c': 2, 'b_d': 3}
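
A recursive sketch of the flattening, joining parent and child keys with the separator. The name flatten_dict_sketch is hypothetical and the library's implementation may differ in detail.

```python
def flatten_dict_sketch(d, parent_key='', sep='_'):
    """Flatten nested dictionaries, joining keys with sep."""
    items = {}
    for key, value in d.items():
        new_key = f'{parent_key}{sep}{key}' if parent_key else key
        if isinstance(value, dict):
            # Recurse, passing the accumulated key as the new parent
            items.update(flatten_dict_sketch(value, new_key, sep))
        else:
            items[new_key] = value
    return items


print(flatten_dict_sketch({'a': 1, 'b': {'c': 2, 'd': 3}}))
# {'a': 1, 'b_c': 2, 'b_d': 3}
```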
pymove.utils.trajectories.invert_dict(d: dict) → dict[source]

Inverts the key:value relation of a dictionary.

Parameters:d (dict) – dictionary to be inverted
Returns:inverted dict
Return type:dict

Examples

>>> from pymove.utils.trajectories import invert_dict
>>> traj_dict = {'a': 1, 'b': 2}
>>> invert_dict(traj_dict)
{1: 'a', 2: 'b'}
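
Inverting a dictionary can be sketched with a comprehension. Under this approach, values must be hashable, and if two keys share a value the later key wins. The name invert_dict_sketch is hypothetical.

```python
def invert_dict_sketch(d):
    """Swap keys and values; duplicate values keep the last key seen."""
    return {value: key for key, value in d.items()}


print(invert_dict_sketch({'a': 1, 'b': 2}))  # {1: 'a', 2: 'b'}
```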
pymove.utils.trajectories.object_for_array(object_: str) → numpy.ndarray[source]

Transforms an object into an array.

Parameters:object_ (str) – object representing a list of integers or strings
Returns:object converted to a list
Return type:array

Examples

>>> from pymove.utils.trajectories import object_for_array
>>> list_str = '[1,2,3,4,5]'
>>> object_for_array(list_str)
array([1., 2., 3., 4., 5.], dtype=float32)
pymove.utils.trajectories.read_csv(filepath_or_buffer: Union[PathLike[str], str, IO[AnyStr], io.RawIOBase, io.BufferedIOBase, io.TextIOBase, _io.TextIOWrapper, mmap.mmap], latitude: str = 'lat', longitude: str = 'lon', datetime: str = 'datetime', traj_id: str = 'id', type_: str = 'pandas', n_partitions: int = 1, **kwargs)[source]

Reads a csv file and structures the data.

Parameters:
  • filepath_or_buffer (str or path object or file-like object) – Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.csv. If you want to pass in a path object, pandas accepts any os.PathLike. By file-like object, we refer to objects with a read() method, such as a file handle (e.g. via builtin open function) or StringIO.
  • latitude (str, optional) – Represents the column name of feature latitude, by default ‘lat’
  • longitude (str, optional) – Represents the column name of feature longitude, by default ‘lon’
  • datetime (str, optional) – Represents the column name of feature datetime, by default ‘datetime’
  • traj_id (str, optional) – Represents the column name of feature id trajectory, by default ‘id’
  • type_ (str, optional) – Represents the type of the MoveDataFrame, by default ‘pandas’
  • n_partitions (int, optional) – Represents number of partitions for DaskMoveDataFrame, by default 1
  • **kwargs (Pandas read_csv arguments) – https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html?highlight=read_csv#pandas.read_csv
Returns:

Trajectory data

Return type:

MoveDataFrameAbstract subclass

Examples

>>> from pymove.utils.trajectories import read_csv
>>> move_df = read_csv('geolife_sample.csv')
>>> move_df.head()
          lat          lon              datetime  id
0   39.984094   116.319236   2008-10-23 05:53:05   1
1   39.984198   116.319322   2008-10-23 05:53:06   1
2   39.984224   116.319402   2008-10-23 05:53:11   1
3   39.984211   116.319389   2008-10-23 05:53:16   1
4   39.984217   116.319422   2008-10-23 05:53:21   1
>>> type(move_df)
<class 'pymove.core.pandas.PandasMoveDataFrame'>
pymove.utils.trajectories.shift(arr: list | Series | ndarray, num: int, fill_value: Any | None = None) → ndarray[source]

Shifts the elements of the given array by the number of periods specified.

Parameters:
  • arr (array) – The array to be shifted
  • num (int) – Number of periods to shift. Can be positive or negative. If positive, the elements will be pulled down; otherwise they will be pulled up
  • fill_value (float, optional) – The scalar value used for newly introduced missing values, by default np.nan
Returns:

A new array with the same shape and type as the given array, but with the elements shifted.

Return type:

array

Notes

Similar to pandas shift, but faster.

References

https://stackoverflow.com/questions/30399534/shift-elements-in-a-numpy-array

Examples

>>> from pymove.utils.trajectories import shift
>>> array = [1, 2, 3, 4, 5, 6, 7]
>>> shift(array, 1)
[0 1 2 3 4 5 6]
>>> shift(array, 0)
[1, 2, 3, 4, 5, 6, 7]
>>> shift(array, -1)
[2 3 4 5 6 7 0]
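
The shifting logic can be illustrated with plain lists. This sketch returns a list rather than an ndarray and assumes a fill value of 0, matching the integer example above; shift_sketch is a hypothetical name.

```python
def shift_sketch(values, num, fill_value=0):
    """Shift values by num positions, padding vacated slots with fill_value."""
    n = len(values)
    if num == 0:
        return list(values)
    if abs(num) >= n:
        return [fill_value] * n
    if num > 0:
        # Positive shift: pad at the front, drop from the end
        return [fill_value] * num + list(values[:-num])
    # Negative shift: drop from the front, pad at the end
    return list(values[-num:]) + [fill_value] * (-num)


array = [1, 2, 3, 4, 5, 6, 7]
print(shift_sketch(array, 1))   # [0, 1, 2, 3, 4, 5, 6]
print(shift_sketch(array, -1))  # [2, 3, 4, 5, 6, 7, 0]
```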

pymove.utils.visual module

Visualization auxiliary operations.

add_map_legend, generate_color, rgb, hex_rgb, cmap_hex_color, get_cmap, save_wkt

pymove.utils.visual.add_map_legend(m: Map, title: str, items: tuple | Sequence[tuple])[source]

Adds a legend for a folium map.

Parameters:
  • m (Map) – Represents a folium map.
  • title (str) – Represents the title of the legend
  • items (list of tuple) – Represents the color and name of the legend items

References

https://github.com/python-visualization/folium/issues/528#issuecomment-421445303

Examples

>>> import folium
>>> from pymove.utils.visual import add_map_legend
>>> df
          lat          lon              datetime  id
0   39.984094   116.319236   2008-10-23 05:53:05   1
1   39.984198   116.319322   2008-10-23 05:53:06   1
2   39.984224   116.319402   2008-10-23 05:53:11   1
3   39.984211   116.319389   2008-10-23 05:53:16   2
4   39.984217   116.319422   2008-10-23 05:53:21   2
>>> m = folium.Map(location=[df.lat.median(), df.lon.median()])
>>> folium.PolyLine(df[['lat', 'lon']], color='red').add_to(m)
>>> add_map_legend(m, 'Color by ID', [(1, 'red')])
>>> m.get_root().to_dict()
{
    "name": "Figure",
    "id": "1d32230cd6c54b19b35ceaa864e61168",
    "children": {
        "map_6f1abc8eacee41e8aa9d163e6bbb295f": {
            "name": "Map",
            "id": "6f1abc8eacee41e8aa9d163e6bbb295f",
            "children": {
                "openstreetmap": {
                    "name": "TileLayer",
                    "id": "f58c3659fea348cb828775f223e1e6a4",
                    "children": {}
                },
                "poly_line_75023fd7df01475ea5e5606ddd7f4dd2": {
                    "name": "PolyLine",
                    "id": "75023fd7df01475ea5e5606ddd7f4dd2",
                    "children": {}
                }
            }
        },
        "map_legend": {  # legend element
            "name": "MacroElement",
            "id": "72911b4418a94358ba8790aab93573d1",
            "children": {}
        }
    },
    "header": {
        "name": "Element",
        "id": "e46930fc4152431090b112424b5beb6a",
        "children": {
            "meta_http": {
                "name": "Element",
                "id": "868e20baf5744e82baf8f13a06849ecc",
                "children": {}
            }
        }
    },
    "html": {
        "name": "Element",
        "id": "9c4da9e0aac349f594e2d23298bac171",
        "children": {}
    },
    "script": {
        "name": "Element",
        "id": "d092078607c04076bf58bd4593fa1684",
        "children": {}
    }
}
pymove.utils.visual.cmap_hex_color(cmap: matplotlib.colors.ListedColormap, i: int) → str[source]

Convert a Colormap to hex color.

Parameters:
  • cmap (ListedColormap) – Represents the Colormap
  • i (int) – List color index
Returns:

Represents corresponding hex str

Return type:

str

Examples

>>> from pymove.utils.visual import  cmap_hex_color
>>> import matplotlib.pyplot as plt
>>> jet = plt.get_cmap('jet')  # This command generates a Linear Segmented Colormap
>>> print(cmap_hex_color(jet, 0))
'#000080'
>>> print(cmap_hex_color(jet, 1))
'#000084'
pymove.utils.visual.generate_color() → str[source]

Generates a random color.

Returns:Random HEX color
Return type:str

Examples

>>> from pymove.utils.visual import generate_color
>>> print(generate_color(), type(generate_color()))
'#E0FFFF' <class 'str'>
>>> print(generate_color(), type(generate_color()))
'#808000' <class 'str'>
pymove.utils.visual.get_cmap(cmap: str) → matplotlib.colors.Colormap[source]

Returns a matplotlib colormap instance.

Parameters:cmap (str) – name of the colormap
Returns:matplotlib colormap
Return type:Colormap

Examples

>>> from pymove.utils.visual import  get_cmap
>>> print(get_cmap('Greys'))
<matplotlib.colors.LinearSegmentedColormap object at 0x7f743fc04bb0>
pymove.utils.visual.hex_rgb(rgb_colors: tuple[float, float, float]) → str[source]

Return a hex str, as used in Tk plots.

Parameters:rgb_colors (tuple) – Represents a tuple with three positions that correspond to the percentage red, green and blue colors
Returns:Represents a color in hexadecimal format
Return type:str

Examples

>>> from pymove.utils.visual import hex_rgb
>>> print(hex_rgb((0.1, 0.2, 0.7)), type(hex_rgb((0.1, 0.2, 0.7))))
'#33B219' <class 'str'>
>>> print(hex_rgb((0.5, 0.4, 0.1)), type(hex_rgb((0.5, 0.4, 0.1))))
'#66197F' <class 'str'>
pymove.utils.visual.randint(low, high=None, size=None, dtype=int)

Return random integers from low (inclusive) to high (exclusive).

Return random integers from the “discrete uniform” distribution of the specified dtype in the “half-open” interval [low, high). If high is None (the default), then results are from [0, low).

Note

New code should use the integers method of a default_rng() instance instead; please see the random-quick-start.

Parameters:
  • low (int or array-like of ints) – Lowest (signed) integers to be drawn from the distribution (unless high=None, in which case this parameter is one above the highest such integer).
  • high (int or array-like of ints, optional) – If provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if high=None). If array-like, must contain integer values
  • size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
  • dtype (dtype, optional) –

    Desired dtype of the result. Byteorder must be native. The default value is int.

    New in version 1.11.0.

Returns:

size-shaped array of random integers from the appropriate distribution, or a single such random int if size not provided.

Return type:

int or ndarray of ints

See also

random_integers()
similar to randint, only for the closed interval [low, high], and 1 is the lowest value if high is omitted.
Generator.integers()
which should be used for new code.

Examples

>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0]) # random
>>> np.random.randint(1, size=10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

Generate a 2 x 4 array of ints between 0 and 4, inclusive:

>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1], # random
       [3, 2, 2, 0]])

Generate a 1 x 3 array with 3 different upper bounds

>>> np.random.randint(1, [3, 5, 10])
array([2, 2, 9]) # random

Generate a 1 by 3 array with 3 different lower bounds

>>> np.random.randint([1, 5, 7], 10)
array([9, 8, 7]) # random

Generate a 2 by 4 array using broadcasting with dtype of uint8

>>> np.random.randint([1, 3, 5, 7], [[10], [20]], dtype=np.uint8)
array([[ 8,  6,  9,  7], # random
       [ 1, 16,  9, 12]], dtype=uint8)
pymove.utils.visual.rgb(rgb_colors: tuple[float, float, float]) → tuple[int, int, int][source]

Return a tuple of integers, as used in AWT/Java plots.

Parameters:rgb_colors (tuple) – Represents a tuple with three positions that correspond to the percentage red, green and blue colors.
Returns:Represents a tuple of integers that correspond the colors values.
Return type:tuple

Examples

>>> from pymove.utils.visual import rgb
>>> print(rgb((0.1, 0.2, 0.7)), type(rgb((0.1, 0.2, 0.7))))
(51, 178, 25) <class 'tuple'>
>>> print(rgb((0.5, 0.4, 0.1)), type(rgb((0.5, 0.4, 0.1))))
(102, 25, 127) <class 'tuple'>
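
The documented outputs of rgb and hex_rgb can be reproduced with the sketch below. The index order (second, third, first) and the truncation with int are inferred solely from the example outputs, so treat them as assumptions rather than a statement of the library's implementation. The function names are hypothetical.

```python
def rgb_sketch(rgb_colors):
    """Scale fractions to 0-255 integers, in the order the examples imply."""
    first, second, third = rgb_colors
    # Order inferred from the documented outputs (assumption)
    return int(second * 255), int(third * 255), int(first * 255)


def hex_rgb_sketch(rgb_colors):
    """Format the integer tuple as an uppercase hex colour string."""
    return '#%02X%02X%02X' % rgb_sketch(rgb_colors)


print(rgb_sketch((0.1, 0.2, 0.7)))      # (51, 178, 25)
print(hex_rgb_sketch((0.1, 0.2, 0.7)))  # #33B219
```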
pymove.utils.visual.save_wkt(move_data: pandas.core.frame.DataFrame, filename: str, label_id: str = 'id')[source]

Saves a map visualization to a new .wkt file.

Parameters:
  • move_data (DataFrame) – Input trajectory data
  • filename (str) – Represents the filename
  • label_id (str) – Represents column name of trajectory id
Returns:

A .wkt file containing the geometric points that build a map visualization

Return type:

File

Examples

>>> from pymove.utils.visual import save_wkt
>>> df.head()
          lat          lon              datetime  id
0   39.984094   116.319236   2008-10-23 05:53:05   1
1   39.984198   116.319322   2008-10-23 05:53:06   1
2   39.984224   116.319402   2008-10-23 05:53:11   1
3   39.984211   116.319389   2008-10-23 05:53:16   2
4   39.984217   116.319422   2008-10-23 05:53:21   2
>>> save_wkt(df, 'test.wkt', 'id')
>>> with open('test.wkt') as f:
...     print(f.read())
'id;linestring'
'1;LINESTRING(116.319236 39.984094,116.319322 39.984198,116.319402 39.984224)'
'2;LINESTRING(116.319389 39.984211,116.319422 39.984217)'
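
The file content shown above can be derived by grouping points per trajectory id and writing them in lon lat order. The sketch below builds the lines in memory from plain tuples instead of a DataFrame; wkt_lines_sketch and the tuple layout (lat, lon, id) are assumptions made for illustration.

```python
from collections import defaultdict

# (lat, lon, id) rows taken from the example data above
rows = [
    (39.984094, 116.319236, 1),
    (39.984198, 116.319322, 1),
    (39.984224, 116.319402, 1),
    (39.984211, 116.319389, 2),
    (39.984217, 116.319422, 2),
]


def wkt_lines_sketch(rows):
    """Build 'id;LINESTRING(...)' lines, one per trajectory id."""
    points = defaultdict(list)
    for lat, lon, traj_id in rows:
        points[traj_id].append(f'{lon} {lat}')  # WKT uses x y, i.e. lon lat
    lines = ['id;linestring']
    for traj_id, pts in points.items():
        lines.append(f"{traj_id};LINESTRING({','.join(pts)})")
    return lines


print('\n'.join(wkt_lines_sketch(rows)))
```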

Module contents

Contains utility functions.

constants, conversions, data_augmentation, datetime, distances, geoutils, integration, log, math, mem, trajectories, visual