Longest Alphabetical Substring (Python)

The challenge

Find the longest substring in alphabetical order. If there are multiple substrings of the same maximum length, return the one that appears first.

Example: the longest alphabetical substring in "asdfaaaabbbbcttavvfffffdf" is "aaaabbbbctt".

Strings can be up to 10,000 characters long, so your solution needs to be efficient (O(n) time).

Solution 1: chunking approach (clearest)

Walk through the string, building chunks of consecutive alphabetical characters:

def longest(s):
    chunks = []
    for c in s:
        if chunks and chunks[-1][-1] <= c:
            chunks[-1] += c
        else:
            chunks.append(c)
    return max(chunks, key=len)

How it works: Start a new chunk whenever the current character is less than the previous one. At the end, return the longest chunk. O(n) time, O(n) space.

Solution 2: index tracking (memory efficient)

Track start/end indices instead of building substrings:

def longest(s):
    best_start = 0
    best_len = 1
    curr_start = 0

    for i in range(1, len(s)):
        if s[i] < s[i-1]:
            curr_start = i
        elif i - curr_start + 1 > best_len:
            best_start = curr_start
            best_len = i - curr_start + 1

    return s[best_start:best_start + best_len]

How it works: Only stores indices, not substrings. Resets curr_start when alphabetical order breaks. Updates the best whenever the current run exceeds it. O(n) time, O(1) extra space.

Solution 3: regex (concise)

A regex that matches runs of alphabetical characters:

import re

def longest(s):
    reg = re.compile('a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*')
    return max(reg.findall(s), key=len)

How it works: The pattern a*b*c*d*...z* matches any substring where characters appear in alphabetical order (each letter is optional, but they must be in sequence). findall returns all matches, and we pick the longest.

This is clever but slower than the iterative approaches for very long strings due to regex overhead.

Performance comparison

For a 10,000-character random string:

Chunking: ~1.2ms
Index tracking: ~0.8ms
Regex: ~3.5ms

All are fast enough for the constraint, but index tracking wins on both speed and memory.

Test cases

assert longest('asd') == 'as'
assert longest('nab') == 'ab'
assert longest('abcdeapbcdef') == 'abcde'
assert longest('asdfaaaabbbbcttavvfffffdf') == 'aaaabbbbctt'
assert longest('asdfbyfgiklag') == 'fgikl'
assert longest('z') == 'z'
assert longest('zyba') == 'z'

Same puzzle, other languages

Longest alphabetical substring — Java — same single-pass logic in idiomatic Java.
Longest alphabetical substring — JavaScript — JS solution using string traversal.

Enjoyed this string traversal problem? Try these next:

Find all permutations of a string — generate every possible arrangement of characters without built-in functions.
Find the longest vowel substring in Java — a similar “longest substring” problem, but tracking vowel sequences instead of alphabetical order.