pregex.core.pre
This module contains a single class, namely Pregex, which constitutes the base class for every other class within pregex.
Classes & methods
Below are listed all classes within pregex.core.pre along with any possible methods they may possess.
- class pregex.core.pre.Pregex(pattern: str = '', escape: bool = True)
Wraps the provided pattern within an instance of this class.
- Parameters
pattern (str) – The pattern that is to be wrapped within an instance of this class. Defaults to the empty string ''.
escape (bool) – Determines whether to escape the provided pattern or not. Defaults to True.
- Raises
InvalidArgumentTypeException – Parameter pattern is not a string.
- Note
This class constitutes the base class for every other class within the pregex package.
- at_least(n: int, is_greedy: bool = True) -> Pregex
Applies quantifier {n,} to this instance’s underlying pattern and returns the result as a Pregex instance.
- Parameters
n (int) – The minimum number of times that the pattern is to be matched.
is_greedy (bool) – Determines whether to declare this quantifier as greedy. When declared as such, the regex engine will try to match the expression as many times as possible. Defaults to True.
- Raises
InvalidArgumentTypeException – Parameter n is not an integer.
InvalidArgumentValueException – Parameter n has a value of less than zero.
CannotBeRepeatedException – This instance represents a non-repeatable pattern.
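To make the effect of is_greedy concrete, the following sketch uses Python's standard re module rather than pregex itself, so it only illustrates the {n,} quantifier that at_least is documented to apply. The sample text and variable names are assumptions chosen for illustration.

```python
import re

text = "aaaa b aa"

# Greedy {n,}: consume as many repetitions as possible.
greedy = re.findall(r"a{2,}", text)

# Lazy {n,}?: stop as soon as the minimum of n repetitions is satisfied.
lazy = re.findall(r"a{2,}?", text)

print(greedy)  # ['aaaa', 'aa']
print(lazy)    # ['aa', 'aa', 'aa']
```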
- at_least_at_most(n: int, m: Optional[int], is_greedy: bool = True) -> Pregex
Applies quantifier {n,m} to this instance’s underlying pattern and returns the result as a Pregex instance.
- Parameters
n (int) – The minimum number of times that the pattern is to be matched.
m (int) – The maximum number of times that the pattern is to be matched.
is_greedy (bool) – Determines whether to declare this quantifier as greedy. When declared as such, the regex engine will try to match the expression as many times as possible. Defaults to True.
- Raises
InvalidArgumentTypeException – Parameter n is not an integer, or parameter m is neither an integer nor None.
InvalidArgumentValueException – Either parameter n or m has a value of less than zero, or parameter n has a greater value than that of parameter m.
CannotBeRepeatedException – Parameter m has a value of greater than one, while this instance represents a non-repeatable pattern.
- Note
Parameter is_greedy has no effect in the case that n equals m.
Setting m equal to None indicates that there is no upper limit to the number of times the pattern is to be repeated.
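The two notes above can be checked against the underlying regex quantifiers. This is a minimal sketch using the standard re module, not pregex; it assumes that the {n,m} and {n,} forms behave the same way once pregex has built the pattern.

```python
import re

text = "aaaaa"

# Bounded repetition: greedy takes up to m, lazy stops at n.
bounded_greedy = re.findall(r"a{2,3}", text)
bounded_lazy = re.findall(r"a{2,3}?", text)

# When n equals m there is only one possible length, so laziness changes nothing.
exact_greedy = re.findall(r"a{2,2}", text)
exact_lazy = re.findall(r"a{2,2}?", text)

# Omitting the upper bound corresponds to "no upper limit".
unbounded = re.findall(r"a{2,}", text)

print(bounded_greedy)            # ['aaa', 'aa']
print(bounded_lazy)              # ['aa', 'aa']
print(exact_greedy, exact_lazy)  # ['aa', 'aa'] ['aa', 'aa']
print(unbounded)                 # ['aaaaa']
```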
- at_most(n: Optional[int], is_greedy: bool = True) -> Pregex
Applies quantifier {,n} to this instance’s underlying pattern and returns the result as a Pregex instance.
- Parameters
n (int) – The maximum number of times that the pattern is to be matched.
is_greedy (bool) – Determines whether to declare this quantifier as greedy. When declared as such, the regex engine will try to match the expression as many times as possible. Defaults to True.
- Raises
InvalidArgumentTypeException – Parameter n is neither an integer nor None.
InvalidArgumentValueException – Parameter n has a value of less than zero.
CannotBeRepeatedException – Parameter n has a value of greater than one, while this instance represents a non-repeatable pattern.
- Note
Setting n equal to None indicates that there is no upper limit to the number of times the pattern is to be repeated.
- capture(name: Optional[str] = None) -> Pregex
Creates a capturing group out of this instance’s underlying pattern and returns the result as a Pregex instance.
- Parameters
name (str) – The name that is assigned to the captured group for backreference purposes. A value of None indicates that no name is to be assigned to the group. Defaults to None.
- Raises
InvalidArgumentTypeException – Parameter name is neither a string nor None.
InvalidCapturingGroupNameException – Parameter name is not a valid capturing group name. Such a name must contain word characters only and start with a non-digit character.
- Note
Creating a capturing group out of a capturing group does nothing.
Creating a capturing group out of a non-capturing group converts it into a capturing group, unless flags have been applied to it, in which case the non-capturing group is wrapped within a capturing group as a whole.
Creating a named capturing group out of an unnamed capturing group assigns a name to it.
Creating a named capturing group out of a named capturing group changes the group’s name.
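For readers less familiar with the group kinds mentioned in these notes, here is a short sketch of unnamed, named and non-capturing groups in plain regex syntax. It uses the standard re module, not pregex, and the date-like sample text is an assumption.

```python
import re

text = "2024-05"

# Unnamed capturing groups are retrieved by position.
unnamed = re.search(r"(\d{4})-(\d{2})", text).groups()

# Named capturing groups can also be retrieved by name.
named = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", text).groupdict()

# A non-capturing group takes part in matching but is not reported.
non_capturing = re.search(r"(?:\d{4})-(\d{2})", text).groups()

print(unnamed)        # ('2024', '05')
print(named)          # {'year': '2024', 'month': '05'}
print(non_capturing)  # ('05',)
```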
- compile() -> None
Compiles the underlying RegEx pattern. After invoking this method, any further attempt at matching a string will be making use of the compiled RegEx pattern.
- concat(pre: Union[Pregex, str], on_right: bool = True) -> Pregex
Concatenates the provided pattern to this instance’s underlying pattern and returns the resulting pattern as a Pregex instance.
- Parameters
pre (Pregex | str) – Either a string or a Pregex instance representing the pattern that is to take part in the concatenation.
on_right (bool) – If True, then places the provided pattern on the right side of the concatenation, else on the left. Defaults to True.
- Raises
InvalidArgumentTypeException – Parameter pre is neither a Pregex instance nor a string.
- either(pre: Union[Pregex, str], on_right: bool = True) -> Pregex
Applies the alternation operator | between the provided pattern and this instance’s underlying pattern, and returns the resulting pattern as a Pregex instance.
- Parameters
pre (Pregex | str) – Either a string or a Pregex instance representing the pattern that is to take part in the alternation.
on_right (bool) – If True, then places the provided pattern on the right side of the alternation, else on the left. Defaults to True.
- Raises
InvalidArgumentTypeException – Parameter pre is neither a Pregex instance nor a string.
- enclose(pre: Union[Pregex, str]) -> Pregex
Concatenates the provided pattern to both sides of this instance’s underlying pattern, and returns the resulting pattern as a Pregex instance.
- Parameters
pre (Pregex | str) – Either a string or a Pregex instance representing the “enclosing” pattern.
- Raises
InvalidArgumentTypeException – Parameter pre is neither a Pregex instance nor a string.
- enclosed_by(pre: Union[Pregex, str]) -> Pregex
Applies both positive lookahead assertion (?=<PRE>) and positive lookbehind assertion (?<=<PRE>), where <PRE> corresponds to the provided pattern, to this instance’s underlying pattern and returns the resulting pattern as a Pregex instance.
- Parameters
pre (Pregex | str) – Either a string or a Pregex instance representing the “assertion” pattern.
- Raises
InvalidArgumentTypeException – The provided argument is neither a Pregex instance nor a string.
NonFixedWidthPatternException – A non-fixed-width pattern is provided in place of parameter pre.
- Note
The resulting pattern cannot have a repeating quantifier applied to it.
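The following sketch shows the combined lookbehind/lookahead form described above, written directly with the standard re module rather than pregex. It also shows that Python's re engine rejects variable-width lookbehind patterns, which is presumably the restriction that NonFixedWidthPatternException reflects; that connection is an assumption.

```python
import re

text = "'a' b 'c"

# Match a word character only when a quote sits on both sides of it.
# The quotes are asserted, not consumed, so they are absent from the result.
enclosed = re.findall(r"(?<=')\w(?=')", text)
print(enclosed)  # ['a']

# A lookbehind must have a fixed width in Python's re module.
try:
    re.compile(r"(?<=a+)b")
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False
print(lookbehind_ok)  # False
```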
- exactly(n: int) -> Pregex
Applies quantifier {n} to this instance’s underlying pattern and returns the result as a Pregex instance.
- Parameters
n (int) – The exact number of times that the pattern is to be matched.
- Raises
InvalidArgumentTypeException – Parameter n is not an integer.
InvalidArgumentValueException – Parameter n has a value of less than zero.
CannotBeRepeatedException – Parameter n has a value of greater than one, while this instance represents a non-repeatable pattern.
- followed_by(pre: Union[Pregex, str]) -> Pregex
Applies positive lookahead assertion (?=<PRE>), where <PRE> corresponds to the provided pattern, to this instance’s underlying pattern and returns the resulting pattern as a Pregex instance.
- Parameters
pre (Pregex | str) – Either a string or a Pregex instance representing the “assertion” pattern.
- Raises
InvalidArgumentTypeException – The provided argument is neither a Pregex instance nor a string.
- Note
The resulting pattern cannot have a repeating quantifier applied to it.
- get_captures(source: str, include_empty: bool = True, is_path: bool = False) -> list[tuple[str]]
Returns a list of tuples, one tuple per match, where each tuple contains all of its corresponding match’s captured groups.
- Parameters
source (str) – The text that is to be examined.
include_empty (bool) – Determines whether to include empty captures into the results. Defaults to True.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Note
If the pattern contains an optional capturing group that has not been captured by a match, then that capture’s corresponding value will be None.
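The "one tuple per match" shape, including the None value for an optional group that did not take part in a match, can be reproduced with the standard re module. This is only a sketch of the described result shape and does not call pregex; the pattern and text are assumptions.

```python
import re

# The third group is optional, so it may be absent from a match.
pattern = re.compile(r"(\w+)@(\w+)(\.com)?")
text = "bob@site.com amy@host"

# One tuple of captured groups per match.
captures = [m.groups() for m in pattern.finditer(text)]
print(captures)  # [('bob', 'site', '.com'), ('amy', 'host', None)]
```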
- get_captures_and_pos(source: str, include_empty: bool = True, relative_to_match: bool = False, is_path: bool = False) -> list[list[tuple[str, int, int]]]
Returns a list containing lists of tuples, one list per match, where each tuple contains one of its corresponding match’s captured groups along with its exact position within the text.
- Parameters
source (str) – The text that is to be examined.
include_empty (bool) – Determines whether to include empty captures into the results. Defaults to True.
relative_to_match (bool) – If True, then each group’s position-indices are calculated relative to the group’s corresponding match, not to the whole string. Defaults to False.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Note
If the pattern contains an optional capturing group that has not been captured by a match, then that capture’s corresponding tuple will be (None, -1, -1).
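To clarify what relative_to_match changes, here is a hypothetical helper built on the standard re module that mimics the described return shape. The function name captures_and_pos and the choice of an exclusive end index are assumptions, not pregex's actual implementation.

```python
import re

def captures_and_pos(pattern, text, relative_to_match=False):
    """Hypothetical helper: one list per match of (group, start, end) tuples."""
    results = []
    for m in re.finditer(pattern, text):
        # Shift indices by the match start when positions are match-relative.
        offset = m.start() if relative_to_match else 0
        groups = []
        for i in range(1, m.re.groups + 1):
            if m.group(i) is None:
                groups.append((None, -1, -1))
            else:
                groups.append((m.group(i), m.start(i) - offset, m.end(i) - offset))
        results.append(groups)
    return results

text = "x=1, y=22"
absolute = captures_and_pos(r"(\w)=(\d+)", text)
relative = captures_and_pos(r"(\w)=(\d+)", text, relative_to_match=True)

print(absolute)  # [[('x', 0, 1), ('1', 2, 3)], [('y', 5, 6), ('22', 7, 9)]]
print(relative)  # [[('x', 0, 1), ('1', 2, 3)], [('y', 0, 1), ('22', 2, 4)]]
```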
- get_compiled_pattern(discard_after: bool = True) -> Pattern
Returns this instance’s underlying RegEx pattern as a re.Pattern instance.
- Parameters
discard_after (bool) – Determines whether the compiled pattern is to be discarded after the program has exited from this method, or to be retained so that any further attempt at matching a string will use the compiled pattern instead of the regular one. Defaults to True.
- get_matches(source: str, is_path: bool = False) -> list[str]
Returns a list containing any possible matches found within the provided text.
- Parameters
source (str) – The text that is to be examined.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- get_matches_and_pos(source: str, is_path: bool = False) -> list[tuple[str, int, int]]
Returns a list containing any possible matches found within the provided text along with their exact position.
- Parameters
source (str) – The text that is to be examined.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- get_matches_with_context(source: str, n_left: int = 5, n_right: int = 5, is_path: bool = False) -> list[str]
Returns a list containing any possible matches found within the provided text, along with their surrounding context, the exact length of which can be configured through this method’s parameters.
- Parameters
source (str) – The text that is to be examined.
n_left (int) – The number of characters representing the context on the left side of the match. Defaults to 5.
n_right (int) – The number of characters representing the context on the right side of the match. Defaults to 5.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Raises
InvalidArgumentTypeException – Either parameter n_left or n_right is not an integer.
InvalidArgumentValueException – Either parameter n_left or n_right has a value of less than zero.
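One way to picture the context window is as a slice that extends each match by n_left and n_right characters. The helper below is a hypothetical sketch on top of the standard re module; its name, its use of ValueError, and its clamping at the start of the text are assumptions rather than pregex's own behavior.

```python
import re

def matches_with_context(pattern, text, n_left=5, n_right=5):
    """Hypothetical helper: each match widened by surrounding characters."""
    if n_left < 0 or n_right < 0:
        raise ValueError("n_left and n_right must be non-negative")
    return [
        # Clamp the left edge at 0; slicing already clamps the right edge.
        text[max(m.start() - n_left, 0): m.end() + n_right]
        for m in re.finditer(pattern, text)
    ]

text = "The quick brown fox"
narrow = matches_with_context(r"brown", text, n_left=3, n_right=2)
default = matches_with_context(r"brown", text)

print(narrow)   # ['ck brown f']
print(default)  # ['uick brown fox']
```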
- get_named_captures(source: str, include_empty: bool = True, is_path: bool = False) -> list[dict[str, str]]
Returns a list of dictionaries, one dictionary per match, where each dictionary contains key-value pairs of any named captured groups that belong to its corresponding match, with each key being the name of the captured group and its corresponding value being the actual captured text.
- Parameters
source (str) – The text that is to be examined.
include_empty (bool) – Determines whether to include empty captures into the results. Defaults to True.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Note
If the pattern contains an optional capturing group that has not been captured by a match, then that capture’s corresponding key-value pair will be name --> None.
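The dictionary-per-match shape described here resembles what groupdict() returns in the standard re module. The sketch below uses re only, with an assumed key=value sample, to show the name --> None case for an optional group.

```python
import re

# The "value" group is optional and may not take part in a match.
pattern = re.compile(r"(?P<key>\w+)=(?P<value>\d+)?")
text = "a=1 b="

# One dictionary of named groups per match.
named = [m.groupdict() for m in pattern.finditer(text)]
print(named)  # [{'key': 'a', 'value': '1'}, {'key': 'b', 'value': None}]
```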
- get_named_captures_and_pos(source: str, include_empty: bool = True, relative_to_match: bool = False, is_path: bool = False) -> list[dict[str, tuple[str, int, int]]]
Returns a list of dictionaries, one dictionary per match, where each dictionary contains key-value pairs of any named captured groups that belong to its corresponding match, with each key being the name of the captured group and its corresponding value being a tuple containing the actual captured group along with its exact position within the text.
- Parameters
source (str) – The text that is to be examined.
include_empty (bool) – Determines whether to include empty captures into the results. Defaults to True.
relative_to_match (bool) – If True, then each group’s position-indices are calculated relative to the group’s corresponding match, not to the whole string. Defaults to False.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Note
If the pattern contains an optional capturing group that has not been captured by a match, then that capture’s corresponding key-value pair will be name --> (None, -1, -1).
- get_pattern(include_flags: bool = False) -> str
Returns this instance’s underlying RegEx pattern as a string.
- Parameters
include_flags (bool) – Determines whether to display the used RegEx flags along with the pattern. Defaults to False.
- Note
This method is to be preferred over str() when one needs to display this instance’s underlying RegEx pattern.
- group(is_case_insensitive: bool = False) -> Pregex
Creates a non-capturing group out of this instance’s underlying pattern and returns the result as a Pregex instance.
- Parameters
is_case_insensitive (bool) – If True, then the “case insensitive” flag is applied to the group so that the pattern within it ignores case when it comes to matching. Defaults to False.
- Raises
InvalidArgumentTypeException – Parameter pre is neither a Pregex instance nor a string.
- Note
Creating a non-capturing group out of a non-capturing group does nothing, except for resetting its flags, e.g. is_case_insensitive, if it has any.
Creating a non-capturing group out of a capturing group converts it into a non-capturing group.
- has_match(source: str, is_path: bool = False) -> bool
Returns True if at least one match is found within the provided text.
- Parameters
source (str) – The text that is to be examined.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- indefinite(is_greedy: bool = True) -> Pregex
Applies quantifier * to this instance’s underlying pattern and returns the result as a Pregex instance.
- Parameters
is_greedy (bool) – Determines whether to declare this quantifier as greedy. When declared as such, the regex engine will try to match the expression as many times as possible. Defaults to True.
- Raises
CannotBeRepeatedException – This instance represents a non-repeatable pattern.
- is_exact_match(source: str, is_path: bool = False) -> bool
Returns True only if the provided text matches this pattern exactly.
- Parameters
source (str) – The text that is to be examined.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
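The difference between finding a match somewhere in the text and matching the text exactly is analogous to search versus fullmatch in the standard re module. The sketch below uses re directly; treating has_match and is_exact_match as equivalent to these calls is an assumption.

```python
import re

pattern = re.compile(r"\d{3}")

# A match anywhere in the text is enough for a "has match" style check.
found_somewhere = pattern.search("abc123def") is not None

# An exact match requires the whole text to be covered by the pattern.
exact_on_longer_text = pattern.fullmatch("abc123def") is not None
exact_on_digits_only = pattern.fullmatch("123") is not None

print(found_somewhere)       # True
print(exact_on_longer_text)  # False
print(exact_on_digits_only)  # True
```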
- iterate_captures(source: str, include_empty: bool = True, is_path: bool = False) -> Iterator[tuple[str]]
Generates tuples, one tuple per match, where each tuple contains all of its corresponding match’s captured groups.
- Parameters
source (str) – The text that is to be examined.
include_empty (bool) – Determines whether to include empty captures into the results. Defaults to True.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Note
If the pattern contains an optional capturing group that has not been captured by a match, then that capture’s corresponding value will be None.
- iterate_captures_and_pos(source: str, include_empty: bool = True, relative_to_match: bool = False, is_path: bool = False) -> Iterator[list[tuple[str, int, int]]]
Generates lists of tuples, one list per match, where each tuple contains one of its corresponding match’s captured groups along with its exact position within the text.
- Parameters
source (str) – The text that is to be examined.
include_empty (bool) – Determines whether to include empty captures into the results. Defaults to True.
relative_to_match (bool) – If True, then each group’s position-indices are calculated relative to the group’s corresponding match, not to the whole string. Defaults to False.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Note
If the pattern contains an optional capturing group that has not been captured by a match, then that capture’s corresponding tuple will be (None, -1, -1).
- iterate_matches(source: str, is_path: bool = False) -> Iterator[str]
Generates any possible matches found within the provided text.
- Parameters
source (str) – The text that is to be examined.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
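The iterate_* methods are documented as generators, which means matches are produced one at a time instead of being collected into a list up front. The generator below is a hypothetical stand-in written with the standard re module; it is a sketch of the lazy-iteration idea, not pregex's code.

```python
import re
from typing import Iterator

def iterate_matches(pattern: str, text: str) -> Iterator[str]:
    """Hypothetical stand-in: lazily yield each matched substring."""
    for m in re.finditer(pattern, text):
        yield m.group()

gen = iterate_matches(r"\d+", "1 22 333")

# Only the first match has been computed at this point.
first = next(gen)

# Consuming the generator yields the remaining matches.
rest = list(gen)

print(first)  # 1
print(rest)   # ['22', '333']
```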
- iterate_matches_and_pos(source: str, is_path: bool = False) -> Iterator[tuple[str, int, int]]
Generates any possible matches found within the provided text along with their exact position.
- Parameters
source (str) – The text that is to be examined.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- iterate_matches_with_context(source: str, n_left: int = 5, n_right: int = 5, is_path: bool = False) -> Iterator[str]
Generates any possible matches found within the provided text, along with their surrounding context, the exact length of which can be configured through this method’s parameters.
- Parameters
source (str) – The text that is to be examined.
n_left (int) – The number of characters representing the context on the left side of the match. Defaults to 5.
n_right (int) – The number of characters representing the context on the right side of the match. Defaults to 5.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Raises
InvalidArgumentTypeException – Either parameter n_left or n_right is not an integer.
InvalidArgumentValueException – Either parameter n_left or n_right has a value of less than zero.
- iterate_named_captures(source: str, include_empty: bool = True, is_path: bool = False) -> Iterator[dict[str, str]]
Generates dictionaries, one dictionary per match, where each dictionary contains key-value pairs of any named captured groups that belong to its corresponding match, with each key being the name of the captured group and its corresponding value being the actual captured text.
- Parameters
source (str) – The text that is to be examined.
include_empty (bool) – Determines whether to include empty captures into the results. Defaults to True.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Note
If the pattern contains an optional capturing group that has not been captured by a match, then that capture’s corresponding key-value pair will be name --> None.
- iterate_named_captures_and_pos(source: str, include_empty: bool = True, relative_to_match: bool = False, is_path: bool = False) -> Iterator[dict[str, tuple[str, int, int]]]
Generates dictionaries, one dictionary per match, where each dictionary contains key-value pairs of any named captured groups that belong to its corresponding match, with each key being the name of the captured group and its corresponding value being a tuple containing the actual captured group along with its exact position within the text.
- Parameters
source (str) – The text that is to be examined.
include_empty (bool) – Determines whether to include empty captures into the results. Defaults to True.
relative_to_match (bool) – If True, then each group’s position-indices are calculated relative to the group’s corresponding match, not to the whole string. Defaults to False.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Note
If the pattern contains an optional capturing group that has not been captured by a match, then that capture’s corresponding key-value pair will be name --> (None, -1, -1).
- match_at_end() -> Pregex
Applies assertion \Z to this instance’s underlying pattern so that it only matches if it is found at the end of a string, and returns the resulting pattern as a Pregex instance.
- Note
The resulting pattern cannot have a repeating quantifier applied to it.
- match_at_line_end() -> Pregex
Applies assertion $ to this instance’s underlying pattern so that it only matches if it is found at the end of a line, and returns the resulting pattern as a Pregex instance.
- Note
The resulting pattern cannot have a repeating quantifier applied to it.
Uses meta character $ since the MULTILINE flag is considered on.
- match_at_line_start() -> Pregex
Applies assertion ^ to this instance’s underlying pattern so that it only matches if it is found at the start of a line, and returns the resulting pattern as a Pregex instance.
- Note
The resulting pattern cannot have a repeating quantifier applied to it.
Uses meta character ^ since the MULTILINE flag is considered on.
- match_at_start() -> Pregex
Applies assertion \A to this instance’s underlying pattern so that it only matches if it is found at the start of a string, and returns the resulting pattern as a Pregex instance.
- Note
The resulting pattern cannot have a repeating quantifier applied to it.
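The distinction between the line anchors (^ and $) and the string anchors (\A and \Z) only shows up on multi-line text with the MULTILINE flag on, which these notes say is the case. The sketch below demonstrates that distinction with the standard re module; the three-line sample text is an assumption.

```python
import re

text = "one\ntwo\nthree"

# With MULTILINE, ^ and $ anchor at every line boundary.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)

# \A and \Z anchor at the boundaries of the whole string, regardless of flags.
string_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)
string_end = re.findall(r"\w+\Z", text, flags=re.MULTILINE)

print(line_starts)   # ['one', 'two', 'three']
print(line_ends)     # ['one', 'two', 'three']
print(string_start)  # ['one']
print(string_end)    # ['three']
```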
- not_enclosed_by(pre: Union[Pregex, str]) -> Pregex
Applies both negative lookahead assertion (?!<PRE>) and negative lookbehind assertion (?<!<PRE>), where <PRE> corresponds to the provided pattern, to this instance’s underlying pattern and returns the resulting pattern as a Pregex instance.
- Parameters
pre (Pregex | str) – Either a string or a Pregex instance representing the “assertion” pattern.
- Raises
InvalidArgumentTypeException – The provided argument is neither a Pregex instance nor a string.
EmptyNegativeAssertionException – The provided assertion pattern is the empty-string pattern.
NonFixedWidthPatternException – The provided assertion pattern does not have a fixed width.
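When both negative assertions are applied, a candidate is rejected if the assertion pattern appears on either side of it. The sketch below writes the two assertions by hand with the standard re module, using separate opening and closing parentheses purely for readability; pregex applies a single pre pattern to both sides, so this is an illustration of the lookaround semantics only.

```python
import re

text = "(a) b (c d)"

# Reject a word character preceded by "(" or followed by ")".
not_enclosed = re.findall(r"(?<!\()\w(?!\))", text)

# "a" fails both checks, "c" fails the lookbehind, "d" fails the lookahead.
print(not_enclosed)  # ['b']
```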
- not_followed_by(pre: Union[Pregex, str]) -> Pregex
Applies negative lookahead assertion (?!<PRE>), where <PRE> corresponds to the provided pattern, to this instance’s underlying pattern and returns the resulting pattern as a Pregex instance.
- Parameters
pre (Pregex | str) – Either a string or a Pregex instance representing the “assertion” pattern.
- Raises
InvalidArgumentTypeException – The provided argument is neither a Pregex instance nor a string.
EmptyNegativeAssertionException – The provided assertion pattern is the empty-string pattern.
- not_preceded_by(pre: Union[Pregex, str]) -> Pregex
Applies negative lookbehind assertion (?<!<PRE>), where <PRE> corresponds to the provided pattern, to this instance’s underlying pattern and returns the resulting pattern as a Pregex instance.
- Parameters
pre (Pregex | str) – Either a string or a Pregex instance representing the “assertion” pattern.
- Raises
InvalidArgumentTypeException – The provided argument is neither a Pregex instance nor a string.
EmptyNegativeAssertionException – The provided assertion pattern is the empty-string pattern.
NonFixedWidthPatternException – The provided assertion pattern does not have a fixed width.
- one_or_more(is_greedy: bool = True) -> Pregex
Applies quantifier + to this instance’s underlying pattern and returns the result as a Pregex instance.
- Parameters
is_greedy (bool) – Determines whether to declare this quantifier as greedy. When declared as such, the regex engine will try to match the expression as many times as possible. Defaults to True.
- Raises
CannotBeRepeatedException – This instance represents a non-repeatable pattern.
- optional(is_greedy: bool = True) -> Pregex
Applies quantifier ? to this instance’s underlying pattern and returns the result as a Pregex instance.
- Parameters
is_greedy (bool) – Determines whether to declare this quantifier as greedy. When declared as such, the regex engine will try to match the expression as many times as possible. Defaults to True.
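The greedy and lazy forms of the +, * and ? quantifiers behave differently in ways that are easiest to see side by side. The sketch below uses the standard re module instead of pregex and assumes simple sample strings; it shows the regex-level behavior that is_greedy toggles.

```python
import re

# +: greedy spans to the last ">", lazy stops at the first one.
greedy_plus = re.findall(r"<.+>", "<a><b>")
lazy_plus = re.findall(r"<.+?>", "<a><b>")

# *: greedy takes every trailing "b", lazy takes none.
greedy_star = re.findall(r"ab*", "a ab abb")
lazy_star = re.findall(r"ab*?", "a ab abb")

# ?: greedy takes the optional "b" when present, lazy skips it.
greedy_opt = re.findall(r"ab?", "ab")
lazy_opt = re.findall(r"ab??", "ab")

print(greedy_plus, lazy_plus)  # ['<a><b>'] ['<a>', '<b>']
print(greedy_star, lazy_star)  # ['a', 'ab', 'abb'] ['a', 'a', 'a']
print(greedy_opt, lazy_opt)    # ['ab'] ['a']
```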
- preceded_by(pre: Union[Pregex, str]) -> Pregex
Applies positive lookbehind assertion (?<=<PRE>), where <PRE> corresponds to the provided pattern, to this instance’s underlying pattern and returns the resulting pattern as a Pregex instance.
- Parameters
pre (Pregex | str) – Either a string or a Pregex instance representing the “assertion” pattern.
- Raises
InvalidArgumentTypeException – The provided argument is neither a Pregex instance nor a string.
NonFixedWidthPatternException – A non-fixed-width pattern is provided in place of parameter pre.
- Note
The resulting pattern cannot have a repeating quantifier applied to it.
- print_pattern(include_flags: bool = False) -> None
Prints this instance’s underlying RegEx pattern.
- Parameters
include_flags (bool) – Determines whether to display the used RegEx flags along with the pattern. Defaults to False.
- replace(source: str, repl: str, count: int = 0, is_path: bool = False) -> str
Replaces all or some of the occurring matches with repl and returns the resulting string. If there are no matches, then this method will return the provided text without modifying it.
- Parameters
source (str) – The text that is to be matched and modified.
repl (str) – The string that is to replace any matches.
count (int) – The number of matches that are to be replaced, starting from left to right. A value of 0 indicates that all matches must be replaced. Defaults to 0.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- Raises
InvalidArgumentValueException – Parameter count has a value of less than zero.
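The count convention described here (0 meaning "replace everything", a positive value limiting replacements from left to right) matches that of re.sub in the standard library. The sketch below uses re.sub directly as an illustration; whether pregex delegates to it is an assumption.

```python
import re

text = "a1b22c333"

# count=0 (the default) replaces every match.
all_replaced = re.sub(r"\d+", "#", text)

# A positive count limits replacements, starting from the left.
first_two = re.sub(r"\d+", "#", text, count=2)

# Without any match the text comes back unchanged.
untouched = re.sub(r"\d+", "#", "abc")

print(all_replaced)  # a#b#c#
print(first_two)     # a#b#c333
print(untouched)     # abc
```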
- split_by_capture(source: str, include_empty: bool = True, is_path: bool = False) -> list[str]
Splits the provided text based on any occurring captures and returns the result as a list containing each individual part of the text after the split.
- Parameters
source (str) – The piece of text that is to be matched and split.
include_empty (bool) – Determines whether to include empty groups into the results. Defaults to True.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
- split_by_match(source: str, is_path: bool = False) -> list[str]
Splits the provided text based on any occurring matches and returns the result as a list containing each individual part of the text after the split.
- Parameters
source (str) – The text that is to be matched and split.
is_path (bool) – If set to True, then parameter source is considered to be a local path pointing to the file from which the text is to be read. Defaults to False.
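Splitting on matches can be pictured with re.split from the standard library, where every match acts as a delimiter. This is only a sketch of the general idea; how pregex treats empty parts or edge cases may differ, and the delimiter pattern below is an assumption.

```python
import re

text = "a, b;c"

# Each comma or semicolon, plus any trailing whitespace, acts as a delimiter.
parts = re.split(r"[,;]\s*", text)
print(parts)  # ['a', 'b', 'c']
```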