Source code for pregex.core.groups

__doc__ = """
This module contains all necessary classes that are used to construct both
capturing and non-capturing groups, as well as any other classes which relate
to concepts that are based on groups, such as backreferences and conditionals.

Pattern grouping
-------------------------------------------
In general, one should not have to concern themselves with pattern grouping,
as patterns are automatically wrapped within non-capturing groups whenever this is
deemed necessary. Consider for instance the following code snippet:

.. code-block:: python

   from pregex.core.quantifiers import Optional

   Optional('a').print_pattern() # This prints "a?"
   Optional('aa').print_pattern() # This prints "(?:aa)?"

In the first case, quantifier :class:`~pregex.core.quantifiers.Optional` is applied to
the pattern directly, whereas in the second case the pattern is placed into a non-capturing
group so that "aa" is quantified as a whole. Be that as it may, there exists a separate class,
namely :class:`Group`, through which one is able to explicitly wrap any pattern within
a non-capturing group if they wish to do so:

.. code-block:: python

   from pregex.core.groups import Group
   from pregex.core.quantifiers import Optional

   pre = Group(Optional('a'))

   pre.print_pattern() # This prints "(?:a)?"

This class can also be used so as to apply various RegEx flags, also known \
as *modifiers*, to a pattern. As of yet, the only flag that is supported is
the case-insensitive flag ``i``:

.. code-block:: python

   from pregex.core.groups import Group

   pre = Group('pregex', is_case_insensitive=True)

   # This statement is "True"
   pre.is_exact_match('PRegEx')


Capturing patterns
-------------------------------------------

You'll find however that :class:`Capture` is probably the most important class
of this module, as it is used to create a capturing group out of a pattern,
so that said pattern is captured separately whenever a match occurs.

.. code-block:: python

   from pregex.core.groups import Capture
   from pregex.core.classes import AnyLetter

   pre = AnyLetter() + Capture(2 * AnyLetter())

   text = "abc def"
   print(pre.get_matches(text)) # This prints "['abc', 'def']"
   print(pre.get_captures(text)) # This prints "[('bc'), ('ef')]"

As you can see, capturing is a very useful tool for whenever you are
interested in isolating some specific part of a pattern.

Classes & methods
-------------------------------------------

Below are listed all classes within :py:mod:`pregex.core.groups`
along with any possible methods they may possess.
"""


import re as _re
import pregex.core.pre as _pre
import pregex.core.exceptions as _ex
from typing import Union as _Union
from typing import Optional as _Optional


class __Group(_pre.Pregex):
    '''
    Constitutes the base class for all classes that are part of this module.

    :param Pregex | str pre: A Pregex instance or string representing the pattern \
        that is to be groupped.
    :param (Pregex => str) transform: A `transform` function for the provided pattern.

    :raises InvalidArgumentTypeException: Parameter ``pre`` is neither a \
        ``Pregex`` instance nor a string.
    '''
    def __init__(self, pre: _Union[_pre.Pregex, str], transform) -> _pre.Pregex:
        '''
        Constitutes the base class for all classes that are part of this module.

        :param Pregex | str pre: A Pregex instance or string representing the pattern \
            that is to be groupped.
        :param (Pregex => str) transform: A `transform` function for the provided pattern.

        :raises InvalidArgumentTypeException: Parameter ``pre`` is neither a \
            ``Pregex`` instance nor a string.
        '''
        pattern = transform(__class__._to_pregex(pre))
        super().__init__(str(pattern), escape=False)


[docs]class Capture(__Group): ''' Creates a capturing group out of the provided pattern. :param Pregex | str pre: The pattern out of which the capturing group is created. :param str name: The name that is assigned to the captured group \ for backreference purposes. A value of ``None`` indicates that no name \ is to be assigned to the group. Defaults to ``None``. :raises InvalidArgumentTypeException: - Parameter ``pre`` is neither a ``Pregex`` instance nor a string. - Parameter ``name`` is neither a string nor ``None``. :raises InvalidCapturingGroupNameException: Parameter ``name`` is not a valid \ capturing group name. Such name must contain word characters only and start \ with a non-digit character. :note: - Creating a capturing group out of a capturing group does nothing to it. - Creating a capturing group out of a non-capturing group converts it \ into a capturing group, except if any flags have been applied to it, \ in which case, the non-capturing group is wrapped within a capturing \ group as a whole. - Creating a named capturing group out of an unnamed capturing group, \ assigns a name to it. - Creating a named capturing group out of a named capturing group, \ changes the group's name. ''' def __init__(self, pre: _Union[_pre.Pregex, str], name: _Optional[str] = None): ''' Creates a capturing group out of the provided pattern. :param Pregex | str pre: The pattern that is to be wrapped \ within a capturing group. :param str name: The name that is assigned to the captured group \ for backreference purposes. A value of ``None`` indicates that no name \ is to be assigned to the group. Defaults to ``None``. :raises InvalidArgumentTypeException: - Parameter ``pre`` is neither a ``Pregex`` instance nor a string. - Parameter ``name`` is neither a string nor ``None``. :raises InvalidCapturingGroupNameException: Parameter ``name`` is not a valid \ capturing group name. Such name must contain word characters only and start \ with a non-digit character. :note: - Creating a capturing group out of a capturing group does nothing to. - Creating a capturing group out of a non-capturing group converts it \ into a capturing group, except if any flags have been applied to it, \ in which case, the non-capturing group is wrapped within a capturing \ group as a whole. - Creating a named capturing group out of an unnamed capturing group, \ assigns a name to it. - Creating a named capturing group out of a named capturing group, \ changes the group's name. ''' super().__init__(pre, lambda pre: pre.capture(name))
[docs]class Group(__Group): ''' Creates a non-capturing group out of the provided pattern. :param Pregex | str pre: The pattern that is to be wrapped \ within a non-capturing group. :param bool is_case_insensitive: If ``True``, then the "case insensitive" \ flag is applied to the group so that the pattern within it ignores case \ when it comes to matching. Defaults to ``False``. :raises InvalidArgumentTypeException: Parameter ``pre`` is neither \ a ``Pregex`` instance nor a string. :note: - Creating a non-capturing group out of a non-capturing group does nothing, \ except for remove its flags if it has any. - Creating a non-capturing group out of a capturing group converts it into \ a non-capturing group. ''' def __init__(self, pre: _Union[_pre.Pregex, str], is_case_insensitive: bool = False): ''' Creates a non-capturing group out of the provided pattern. :param Pregex | str pre: The pattern that is to be wrapped \ within a non-capturing group. :param bool is_case_insensitive: If ``True``, then the "case insensitive" \ flag is applied to the group so that the pattern within it ignores case \ when it comes to matching. Defaults to ``False``. :raises InvalidArgumentTypeException: Parameter ``pre`` is neither \ a ``Pregex`` instance nor a string. :note: - Creating a non-capturing group out of a non-capturing group does nothing, \ except for remove its flags if it has any. - Creating a non-capturing group out of a capturing group converts it into \ a non-capturing group. ''' super().__init__(pre, lambda pre: pre.group(is_case_insensitive))
[docs]class Backreference(__Group): ''' Creates a backreference to some previously declared capturing group. :param int | str ref: A reference to some previously declared capturing group. \ This parameter can either be an integer, in which case the capturing group \ is referenced by order, or a string, in which case the capturing group is \ referenced by name. :raises InvalidArgumentTypeException: Parameter ``ref`` is neither an integer \ nor a string. :raises InvalidArgumentValueException: Parameter ``ref`` is an integer but \ has a value of either less than ``1`` or greater than ``10``. :raises InvalidCapturingGroupNameException: Parameter ``ref`` is a string but \ not a valid capturing group name. Such name must contain word characters \ only and start with a non-digit character. ''' def __init__(self, ref: _Union[int, str]): ''' Creates a backreference to some previously declared capturing group. :param int | str ref: A reference to some previously declared capturing group. \ This parameter can either be an integer, in which case the capturing group \ is referenced by order, or a string, in which case the capturing group is \ referenced by name. :raises InvalidArgumentTypeException: Parameter ``ref`` is neither an integer \ nor a string. :raises InvalidArgumentValueException: Parameter ``ref`` is an integer but \ has a value of either less than ``1`` or greater than ``10``. :raises InvalidCapturingGroupNameException: Parameter ``ref`` is a string but \ not a valid capturing group name. Such name must contain word characters \ only and start with a non-digit character. ''' if isinstance(ref, int): if isinstance(ref, bool): message = "Parameter \"ref\" is neither an integer nor a string." raise _ex.InvalidArgumentTypeException(message) if ref < 1 or ref > 99: message = "Parameter \"ref\" cannot be less than 1 or greater than 99." raise _ex.InvalidArgumentValueException(message) transform = lambda s : f"\\{s}" elif isinstance(ref, str): if _re.fullmatch("[A-Za-z_][A-Za-z_0-9]*", ref) is None: raise _ex.InvalidCapturingGroupNameException(ref) transform = lambda s : f"(?P={s})" else: message = "Parameter \"ref\" is neither an integer nor a string." raise _ex.InvalidArgumentTypeException(message) super().__init__(str(ref), transform)
[docs]class Conditional(__Group): ''' Given the name of a capturing group, matches ``pre1`` only if said capturing group has \ been previously matched. Furthermore, if a second pattern ``pre2`` is provided, then \ this pattern is matched in case the referenced capturing group was not, though one \ should be aware that for this to be possible, the referenced capturing group must \ be optional. :param str name: The name of the referenced capturing group. :param Pregex | str pre1: The pattern that is to be matched in case condition is true. :param Pregex | str pre2: The pattern that is to be matched in case condition \ is false. Defaults to ``None``. :raises InvalidArgumentTypeException: - Parameter ``name`` is not a string. - Parameter ``pre1`` is neither a ``Pregex`` instance nor a string. - Parameter ``pre2`` is neither a ``Pregex`` instance nor a string nor ``None``. :raises InvalidCapturingGroupNameException: Parameter ``name`` is not a valid \ capturing group name. Such name must contain word characters only and start \ with a non-digit character. ''' def __init__(self, name: str, pre1: _Union[_pre.Pregex, str], pre2: _Optional[_Union[_pre.Pregex, str]] = None): ''' Given the name of a capturing group, matches ``pre1`` only if said capturing group has \ been previously matched. Furthermore, if a second pattern ``pre2`` is provided, then \ this pattern is matched in case the referenced capturing group was not, though one \ should be aware that for this to be possible, the referenced capturing group must \ be optional. :param str name: The name of the referenced capturing group. :param Pregex | str pre1: The pattern that is to be matched in case condition is true. :param Pregex | str pre2: The pattern that is to be matched in case condition \ is false. Defaults to ``None``. :raises InvalidArgumentTypeException: - Parameter ``name`` is not a string. - Parameter ``pre1`` is neither a ``Pregex`` instance nor a string. - Parameter ``pre2`` is neither a ``Pregex`` instance nor a string nor ``None``. :raises InvalidCapturingGroupNameException: Parameter ``name`` is not a valid \ capturing group name. Such name must contain word characters only and start \ with a non-digit character. ''' if not isinstance(name, str): message = "Provided argument \"name\" is not a string." raise _ex.InvalidArgumentTypeException(message) if _re.fullmatch("[A-Za-z_][\w]*", name) is None: raise _ex.InvalidCapturingGroupNameException(name) super().__init__(name, lambda s: f"(?({s}){pre1}{'|' + str(pre2) if pre2 != None else ''})")