The challenge

Find the longest substring in alphabetical order. If there are multiple substrings of the same maximum length, return the one that appears first.

Example: the longest alphabetical substring in "asdfaaaabbbbcttavvfffffdf" is "aaaabbbbctt".

Strings can be up to 10,000 characters long, so your solution needs to be efficient (O(n) time).

Solution 1: chunking approach (clearest)

Walk through the string, building chunks of consecutive alphabetical characters:

def longest(s):
    chunks = []
    for c in s:
        if chunks and chunks[-1][-1] <= c:
            chunks[-1] += c
        else:
            chunks.append(c)
    return max(chunks, key=len)

How it works: Start a new chunk whenever the current character is less than the previous one. At the end, return the longest chunk. O(n) time, O(n) space.

Solution 2: index tracking (memory efficient)

Track start/end indices instead of building substrings:

def longest(s):
    best_start = 0
    best_len = 1
    curr_start = 0

    for i in range(1, len(s)):
        if s[i] < s[i-1]:
            curr_start = i
        elif i - curr_start + 1 > best_len:
            best_start = curr_start
            best_len = i - curr_start + 1

    return s[best_start:best_start + best_len]

How it works: Only stores indices, not substrings. Resets curr_start when alphabetical order breaks. Updates the best whenever the current run exceeds it. O(n) time, O(1) extra space.

Solution 3: regex (concise)

A regex that matches runs of alphabetical characters:

import re

def longest(s):
    reg = re.compile('a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*')
    return max(reg.findall(s), key=len)

How it works: The pattern a*b*c*d*...z* matches any substring where characters appear in alphabetical order (each letter is optional, but they must be in sequence). findall returns all matches, and we pick the longest.

This is clever but slower than the iterative approaches for very long strings due to regex overhead.

Performance comparison

For a 10,000-character random string:

  • Chunking: ~1.2ms
  • Index tracking: ~0.8ms
  • Regex: ~3.5ms

All are fast enough for the constraint, but index tracking wins on both speed and memory.

Test cases

assert longest('asd') == 'as'
assert longest('nab') == 'ab'
assert longest('abcdeapbcdef') == 'abcde'
assert longest('asdfaaaabbbbcttavvfffffdf') == 'aaaabbbbctt'
assert longest('asdfbyfgiklag') == 'fgikl'
assert longest('z') == 'z'
assert longest('zyba') == 'z'

Enjoyed this string traversal problem? Try these next: