All good things have already been said, alas.
Perhaps a solution using numpy, where you can use arrays to select members of arrays.
import numpy as np

def reverse_vowels(s: str) -> str:
    ss = np.array(list(s))
    vowels = set("aeiouAEIOU")
    vowel_mask = np.vectorize(lambda c: c in vowels)(ss)
    ss[vowel_mask] = ss[vowel_mask][::-1]
    return "".join(ss)
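To make the masking step concrete, here is a minimal sketch on a short string. The names `letters`, `picked` and `result` are only for illustration, and the mask is built with a plain list comprehension rather than `np.vectorize`.

```python
import numpy as np

letters = np.array(list("hello world"))

# Boolean array: True where the character is a vowel.
vowel_mask = np.array([c in "aeiouAEIOU" for c in letters])

# Indexing with a boolean array returns a copy of the selected elements, in order.
picked = letters[vowel_mask]

# Assigning through the same mask writes the values back into the vowel positions.
letters[vowel_mask] = picked[::-1]

result = "".join(letters)
print(result)
```

The consonants never move; only the positions where the mask is True are overwritten.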
Perhaps a solution using iterators, but it's essentially a slightly more general version of Schmuddi's answer.
from itertools import filterfalse

def intersperse_by(key_iterator, iterators):
    for key in key_iterator:
        yield next(iterators[key])

def reverse_vowels(s: str) -> str:
    vowels = set("aeiouAEIOU")
    def is_vowel(c): return c in vowels
    return "".join(
        intersperse_by(
            map(is_vowel, s), [
                filterfalse(is_vowel, s),
                filter(is_vowel, reversed(s)),
            ]
        )
    )
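A small trace may help with `intersperse_by`: each key is a bool, and since `False == 0` and `True == 1`, the key indexes into the pair of iterators. The sketch below repeats the function so it runs on its own; the hand-written `keys`, `consonants` and `vowels_reversed` are illustrative stand-ins for what `map`, `filterfalse` and `filter` would produce for `"hello"`.

```python
def intersperse_by(key_iterator, iterators):
    for key in key_iterator:
        yield next(iterators[key])

# Is each character of "hello" a vowel?
keys = [False, True, False, False, True]

# next() needs real iterators, hence iter(); filter objects already are iterators.
consonants = iter("hll")        # index 0, taken when the key is False
vowels_reversed = iter("oe")    # index 1, taken when the key is True

merged = "".join(intersperse_by(keys, [consonants, vowels_reversed]))
print(merged)
```

Each source iterator is consumed strictly in order, so the consonants keep their positions while the vowels arrive back to front.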
Reviewing my own code, perhaps it could be useful for someone else.
I've looked at runtimes for strings of lengths n=10**k, and both of my approaches are linear in n, which is both expected and good. Below, I'm using the runtimes measured for a random string of length 1_000_000.
no_comment's regexp-based solution takes 75 ms. My numpy-based solution takes 796 ms, the iterator-based one 452 ms. I don't expect to be able to beat the regexp-based answer, but surely there's room for improvement.
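For anyone who wants to repeat this kind of measurement, here is a rough sketch of a timing harness. It is not the setup used for the numbers above: the helpers `random_string` and `time_ms`, the ASCII-letter alphabet, and the stand-in `reverse_vowels` are all assumptions made for illustration, and the lengths are kept small so it finishes quickly.

```python
import random
import string
import timeit

def random_string(n: int, seed: int = 0) -> str:
    # Assumption: random ASCII letters; the original test string is not specified.
    rng = random.Random(seed)
    return "".join(rng.choices(string.ascii_letters, k=n))

def time_ms(func, arg, repeat: int = 3) -> float:
    """Best wall-clock time of func(arg) over `repeat` runs, in milliseconds."""
    return min(timeit.repeat(lambda: func(arg), number=1, repeat=repeat)) * 1000

def reverse_vowels(s: str) -> str:
    # Stand-in candidate; swap in whichever implementation you want to time.
    is_vowel = set("aeiouAEIOU").__contains__
    vowels_reversed = filter(is_vowel, reversed(s))
    return "".join(next(vowels_reversed) if is_vowel(c) else c for c in s)

for k in range(3, 6):
    n = 10**k
    print(f"n=10**{k}: {time_ms(reverse_vowels, random_string(n)):.2f} ms")
```

If the runtime is linear in n, each step in k should multiply the time by roughly ten.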
Defining a function and calling it is costly. Indeed,
vowels = set("aeiouAEIOU")
def is_vowel(c): return c in vowels
could have been
is_vowel = set("aeiouAEIOU").__contains__
and this simple change cuts the runtime of the iterator-based answer from 452 ms to 257 ms.
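The effect can be checked in isolation with a small micro-benchmark. This is only a sketch: the names `is_vowel_def`, `is_vowel_bound`, `count_with` and the test text are made up for the comparison, and the absolute numbers will differ by machine and Python version.

```python
import timeit

vowels = set("aeiouAEIOU")

def is_vowel_def(c):
    # Python-level function: every call sets up a new frame.
    return c in vowels

# Bound method of the set: the membership test is called directly from C.
is_vowel_bound = vowels.__contains__

text = "the quick brown fox jumps over the lazy dog" * 2_000

def count_with(pred):
    return sum(map(pred, text))

t_def = min(timeit.repeat(lambda: count_with(is_vowel_def), number=5, repeat=3))
t_bound = min(timeit.repeat(lambda: count_with(is_vowel_bound), number=5, repeat=3))
print(f"def: {t_def * 1000:.1f} ms, bound method: {t_bound * 1000:.1f} ms")
```

Both predicates return the same answers; only the per-call overhead differs.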
Numpy is great, but not for string processing. Duh. Even the simple act of converting the string to an array and back to the string,
"".join(np.array(list(s)))
takes 520 ms; the somewhat better line,
"".join(np.fromiter(s, dtype='<U1'))
takes 437 ms.
Picking more numpy-friendly datatypes,
"".join(map(chr, np.fromiter(map(ord, s), dtype=np.uint8)))
we can reduce this to 195 ms, which is better, but still won't win us any awards. Nonetheless, using this idea gets the runtime down from 795 ms to 275 ms. (Note that np.uint8 only holds code points below 256, so this assumes ASCII or Latin-1 input.)
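One caveat with the integer round trip: `np.uint8` covers 0 to 255, while `ord` can return values up to 0x10FFFF, so characters outside Latin-1 cannot survive the conversion. A possible variant, sketched here under the assumption that arbitrary Unicode input matters more than speed, uses `np.uint32` instead. The name `reverse_vowels_unicode` is hypothetical, and this version was not part of the timings above.

```python
import numpy as np

def reverse_vowels_unicode(s: str) -> str:
    is_vowel = set("aeiouAEIOU").__contains__
    vowel_mask = np.fromiter(map(is_vowel, s), dtype=np.bool_)
    # uint32 is wide enough for every Unicode code point (max 0x10FFFF).
    codes = np.fromiter(map(ord, s), dtype=np.uint32)
    codes[vowel_mask] = codes[vowel_mask][::-1]
    return "".join(map(chr, codes))

print(reverse_vowels_unicode("ő hello"))
```

Here `ő` has code point 337, which does not fit in `uint8`; it is not in the vowel set, so it stays where it is.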
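The improved versions:
import numpy as np

def reverse_vowels(s: str) -> str:
    # np.bool_ rather than np.bool: the latter is unavailable in some NumPy releases
    vowel_mask = np.fromiter(map(set("aeiouAEIOU").__contains__, s), dtype=np.bool_)
    # uint8 assumes every code point is below 256 (ASCII or Latin-1 input)
    ss = np.fromiter(map(ord, s), dtype=np.uint8)
    ss[vowel_mask] = ss[vowel_mask][::-1]
    return "".join(map(chr, ss))

from itertools import filterfalse

def reverse_vowels(s: str) -> str:
    # relies on intersperse_by as defined earlier
    is_vowel = set("aeiouAEIOU").__contains__
    vowel_mask = map(is_vowel, s)
    consonants = filterfalse(is_vowel, s)
    vowels_reversed = filter(is_vowel, reversed(s))
    return "".join(intersperse_by(vowel_mask, (consonants, vowels_reversed)))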
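When tweaking implementations for speed, it is easy to break them, so a quick randomized comparison against a deliberately simple reference can be reassuring. The sketch below uses a plain two-pointer swap as that reference, which is a different technique from both versions above; `reverse_vowels_reference` and `agrees_with_reference` are hypothetical helper names.

```python
import random
import string

VOWELS = set("aeiouAEIOU")

def reverse_vowels_reference(s: str) -> str:
    # Two-pointer swap: walk inward from both ends, swapping vowels pairwise.
    chars = list(s)
    i, j = 0, len(chars) - 1
    while i < j:
        if chars[i] not in VOWELS:
            i += 1
        elif chars[j] not in VOWELS:
            j -= 1
        else:
            chars[i], chars[j] = chars[j], chars[i]
            i += 1
            j -= 1
    return "".join(chars)

def agrees_with_reference(candidate, trials=200, max_len=30, seed=0):
    # Compare candidate and reference on random short ASCII strings.
    rng = random.Random(seed)
    alphabet = string.ascii_letters + " "
    for _ in range(trials):
        s = "".join(rng.choices(alphabet, k=rng.randrange(max_len)))
        if candidate(s) != reverse_vowels_reference(s):
            return False
    return True

print(reverse_vowels_reference("hello world"))
```

Calling `agrees_with_reference(reverse_vowels)` with either version above should return True for ASCII input.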