pywrangler.pandas.wranglers package
Submodules
pywrangler.pandas.wranglers.interval_identifier module
This module contains implementations of the interval identifier wrangler.
class pywrangler.pandas.wranglers.interval_identifier.NaiveIterator(marker_column: str, marker_start: Any, marker_end: Any = <object object>, marker_start_use_first: bool = False, marker_end_use_first: bool = True, orderby_columns: Union[str, Iterable[str], None] = None, groupby_columns: Union[str, Iterable[str], None] = None, ascending: Union[bool, Iterable[bool]] = None, result_type: str = 'enumerated', target_column_name: str = 'iids')
Bases: pywrangler.pandas.wranglers.interval_identifier._BaseIntervalIdentifier
The simplest, sequential implementation: it iterates over the values while tracking the state of the start and end markers.
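To illustrate the idea of iterating while remembering marker state, here is a minimal sketch in plain Python. It is not the library's code: the function name `naive_interval_ids` and its arguments are hypothetical, and the semantics are an assumption (the last start marker before an end marker opens the interval, the first end marker closes it, and rows outside a closed interval get 0).

```python
from typing import Any, List, Optional, Sequence


def naive_interval_ids(values: Sequence[Any], start: Any, end: Any) -> List[int]:
    """Hypothetical helper: label rows inside start..end intervals with 1, 2, ...

    Assumed semantics (not taken from pywrangler's source):
    - a later start marker replaces an earlier, still open one
    - the first end marker after a start closes the interval
    - rows that belong to no closed interval are labelled 0
    """
    ids = [0] * len(values)
    counter = 0
    start_pos: Optional[int] = None  # index of the currently open start marker

    for i, value in enumerate(values):
        if value == start:
            # Remember the most recent start marker.
            start_pos = i
        elif value == end and start_pos is not None:
            # An open interval is closed: number it and label its rows.
            counter += 1
            for j in range(start_pos, i + 1):
                ids[j] = counter
            start_pos = None

    return ids


example = ["x", "start", "a", "end", "b", "start", "start", "c", "end", "start"]
print(naive_interval_ids(example, "start", "end"))
```

In this sketch the trailing start marker never sees an end marker, so its row keeps the label 0.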
class pywrangler.pandas.wranglers.interval_identifier.VectorizedCumSum(marker_column: str, marker_start: Any, marker_end: Any = <object object>, marker_start_use_first: bool = False, marker_end_use_first: bool = True, orderby_columns: Union[str, Iterable[str], None] = None, groupby_columns: Union[str, Iterable[str], None] = None, ascending: Union[bool, Iterable[bool]] = None, result_type: str = 'enumerated', target_column_name: str = 'iids')
Bases: pywrangler.pandas.wranglers.interval_identifier._BaseIntervalIdentifier
A more sophisticated approach built from multiple vectorized operations. A cumulative sum enumerates the intervals, which avoids explicit looping.
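The cumulative-sum idea can be sketched with plain pandas operations. This is a simplified illustration, not the class's actual implementation: the helper name `cumsum_interval_ids` is hypothetical, and it assumes well-formed input in which start and end markers strictly alternate (no repeated starts, no unclosed intervals).

```python
import pandas as pd


def cumsum_interval_ids(markers: pd.Series, start, end) -> pd.Series:
    """Hypothetical helper: enumerate start..end intervals without a Python loop.

    Assumes start and end markers strictly alternate, beginning with a start.
    """
    is_start = markers.eq(start)
    is_end = markers.eq(end)

    # Running count of start markers: the number of the most recent interval.
    start_count = is_start.cumsum()

    # End markers seen strictly before the current row, so that the row
    # carrying the end marker still counts as part of its interval.
    ends_before = is_end.cumsum() - is_end.astype(int)

    # A row is inside an interval while more starts than earlier ends were seen.
    inside = start_count > ends_before

    # Keep the interval number inside intervals, use 0 everywhere else.
    return start_count.where(inside, 0)


example = pd.Series(["x", "start", "a", "end", "b", "start", "c", "end", "d"])
print(cumsum_interval_ids(example, "start", "end").tolist())
```

Handling repeated or unmatched markers would need additional vectorized steps that this sketch leaves out.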