Implementation Details

uwotm8.convert.convert_american_to_british_spelling

convert_american_to_british_spelling(text, strict=False)

Convert American English spelling to British English spelling.

PARAMETER DESCRIPTION
text

The text to convert.

TYPE: str

strict

Whether to re-raise exceptions encountered during conversion. When False, a word whose conversion fails is left unchanged.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
Any

The text with American English spelling converted to British English spelling.

Source code in uwotm8/convert.py
def convert_american_to_british_spelling(  # noqa: C901
    text: str, strict: bool = False
) -> Any:
    """
    Convert American English spelling to British English spelling.

    Args:
        text: The text to convert.
        strict: Whether to raise an exception if a word cannot be converted.

    Returns:
        The text with American English spelling converted to British English spelling.
    """
    if not text.strip():
        return text
    try:

        def should_skip_word(word: str, pre: str, post: str, match_start: int, match_end: int) -> bool:
            """Check if the word should be skipped for conversion."""
            # Skip if within code blocks
            if "`" in pre or "`" in post:
                return True

            # Skip if word is in the ignore_list
            if word.lower() in CONVERSION_IGNORE_LIST:
                return True

            # Check for hyphenated terms (e.g., "3-color", "x-coordinate")
            # If the word is part of a hyphenated term, we should skip it
            if "-" in pre and pre.rstrip().endswith("-"):
                return True

            # Check for URL/URI context
            line_start = text.rfind("\n", 0, match_start)
            if line_start == -1:
                line_start = 0
            else:
                line_start += 1

            line_end = text.find("\n", match_end)
            if line_end == -1:
                line_end = len(text)

            line_context = text[line_start:line_end]

            # Skip if word appears to be in a URL/URI
            return "://" in line_context or "www." in line_context

        def preserve_capitalization(original: str, replacement: str) -> str:
            """Preserve the capitalization from the original word in the replacement."""
            if original.isupper():
                return replacement.upper()
            elif original.istitle():
                return replacement.title()
            return replacement

        def replace_word(match: re.Match) -> Any:
            """
            Replace a word with its British English spelling.

            Args:
                match: The match object.

            Returns:
                The word with its spelling converted to British English.
            """
            # The first group contains any leading punctuation/spaces
            # The second group contains the word
            # The third group contains any trailing punctuation/spaces
            pre, word, post = match.groups()

            if should_skip_word(word, pre, post, match.start(), match.end()):
                return match.group(0)

            if american_spelling_exists(word.lower()):
                try:
                    british = get_british_spelling(word.lower())
                    british = preserve_capitalization(word, british)
                    return pre + british + post
                except Exception:
                    if strict:
                        raise
            return match.group(0)

        # Match any word surrounded by non-letter characters
        # Group 1: Leading non-letters (including empty)
        # Group 2: The word itself (only letters)
        # Group 3: Trailing non-letters (including empty)
        pattern = r"([^a-zA-Z]*?)([a-zA-Z]+)([^a-zA-Z]*?)"
        return re.sub(pattern, replace_word, text)
    except Exception:
        if strict:
            raise
        return text

Word Context Detection

The convert_american_to_british_spelling function skips or adjusts conversion in the following cases:

Hyphenated Terms

Words that are part of hyphenated terms are preserved in their original form. For example:

  • "3-color" remains "3-color" (not converted to "3-colour")
  • "x-coordinate" remains "x-coordinate"
  • "multi-colored" remains "multi-colored" (not converted to "multi-coloured")

This is useful for preserving technical terminology and compound adjectives where conversion might be inappropriate.
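As a minimal sketch of how the hyphen check sees each word, the snippet below reuses the regular expression from the source listing and reports which words would be skipped. The helper name hyphen_skipped_words is hypothetical and exists only for this illustration; it is not part of uwotm8.

```python
import re

# Same pattern as in the source listing: leading non-letters, the word,
# trailing non-letters.
PATTERN = re.compile(r"([^a-zA-Z]*?)([a-zA-Z]+)([^a-zA-Z]*?)")


def hyphen_skipped_words(text: str) -> list[str]:
    """Return the words the hyphen check would leave untouched (illustrative helper)."""
    skipped = []
    for match in PATTERN.finditer(text):
        pre, word, _post = match.groups()
        # Mirrors the check in should_skip_word: the non-letter characters
        # directly before the word end with a hyphen.
        if pre.rstrip().endswith("-"):
            skipped.append(word)
    return skipped


print(hyphen_skipped_words("a 3-color plot of the x-coordinate, multi-colored"))
# ['color', 'coordinate', 'colored']
```

Only the part after the hyphen is flagged, because the hyphen lands in that word's leading group; the part before the hyphen is still looked up as usual.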

Code Blocks

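Words adjacent to backticks are not converted, which preserves inline code such as variable names. The check inspects only the non-letter characters captured around each word, so it is a heuristic rather than a full Markdown parse.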
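The following sketch applies the backtick test from the source listing to each regular-expression match, to show which words it flags. The helper name backtick_skipped_words is hypothetical and used only for illustration.

```python
import re

# Same pattern as in the source listing.
PATTERN = re.compile(r"([^a-zA-Z]*?)([a-zA-Z]+)([^a-zA-Z]*?)")


def backtick_skipped_words(text: str) -> list[str]:
    """Return the words the backtick check would leave untouched (illustrative helper)."""
    skipped = []
    for match in PATTERN.finditer(text):
        pre, word, post = match.groups()
        # Mirrors should_skip_word: a backtick in the captured non-letters
        # around the word means the word is skipped.
        if "`" in pre or "`" in post:
            skipped.append(word)
    return skipped


print(backtick_skipped_words("Pass `color` to set the color."))
# ['color', 'to']
```

In this sketch the word right after the closing backtick ("to") is flagged as well, because the closing backtick falls into that word's leading group. The final "color", which is not next to a backtick, is not flagged.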

URLs and URIs

No word on a line that contains a URL or URI (identified by "://" or "www.") is converted. The check applies to the whole line, so links are never altered, and other words on the same line are left unchanged as well.
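A minimal sketch of the line-level URL check, following the logic of should_skip_word in the source listing. The function name line_has_url and the sample text are assumptions made for this example.

```python
def line_has_url(text: str, match_start: int, match_end: int) -> bool:
    """Return True if the line containing text[match_start:match_end] looks like it holds a URL."""
    # rfind returns -1 when there is no earlier newline; adding 1 then yields 0,
    # which is the start of the text.
    line_start = text.rfind("\n", 0, match_start) + 1
    line_end = text.find("\n", match_end)
    if line_end == -1:
        line_end = len(text)
    line = text[line_start:line_end]
    return "://" in line or "www." in line


text = "See https://example.com/color-guide for the color list.\nMy favorite color is blue."
first = text.index("color list")
second = text.index("color is")
print(line_has_url(text, first, first + 5))    # True
print(line_has_url(text, second, second + 5))  # False
```

The first "color" sits on the same line as the link and would be skipped, even though it is not part of the URL itself.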

Conversion Ignore List

uwotm8 keeps an ignore list, CONVERSION_IGNORE_LIST, of words that are never converted. It includes technical terms whose preferred spelling depends on context:

  • "program" vs "programme" (in computing contexts)
  • "disk" vs "disc" (in computing contexts)
  • "analog" vs "analogue" (in technical contexts)
  • And others
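The lookup order can be sketched as follows. SAMPLE_IGNORE_LIST and SAMPLE_SPELLINGS are hypothetical stand-ins containing only the entries named above plus one ordinary word; the real CONVERSION_IGNORE_LIST and spelling lookup live in uwotm8 and are not reproduced here.

```python
# Hypothetical stand-ins for illustration only.
SAMPLE_IGNORE_LIST = {"program", "disk", "analog"}
SAMPLE_SPELLINGS = {
    "program": "programme",
    "disk": "disc",
    "analog": "analogue",
    "color": "colour",
}


def convert_word(word: str) -> str:
    """Convert one word unless its lowercase form is on the ignore list.

    Capitalisation handling is omitted to keep the sketch short.
    """
    key = word.lower()
    if key in SAMPLE_IGNORE_LIST:
        # Ignored words are returned exactly as written.
        return word
    return SAMPLE_SPELLINGS.get(key, word)


print(convert_word("Program"))  # Program
print(convert_word("color"))    # colour
```

The ignore list is matched on the lowercase form, as in the source listing, so "Program" and "PROGRAM" are skipped in the same way as "program".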

Capitalization Preservation

The function preserves the capitalization pattern of the original word:

  • ALL CAPS words remain ALL CAPS
  • Title Case words remain Title Case
  • lowercase words remain lowercase
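The helper from the source listing can be run on its own to see these rules in action. The copy below is reproduced from the listing with only the elif flattened, and the sample words are chosen for illustration.

```python
def preserve_capitalization(original: str, replacement: str) -> str:
    """Preserve the capitalization from the original word in the replacement."""
    if original.isupper():
        return replacement.upper()
    if original.istitle():
        return replacement.title()
    # Any other casing, including mixed case, gets the replacement as given.
    return replacement


print(preserve_capitalization("COLOR", "colour"))  # COLOUR
print(preserve_capitalization("Color", "colour"))  # Colour
print(preserve_capitalization("color", "colour"))  # colour
print(preserve_capitalization("coLor", "colour"))  # colour
```

Mixed-case input such as "coLor" matches neither test, so the replacement is returned in whatever case it was passed in.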