Splitting Text and Numbers in a String with Python

Without regular expressions

Amir Ali Hashemi

When working with strings, you often need to separate text from numbers. While surfing through programming guides, I found that achieving this using regular expressions is fairly straightforward.
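For comparison, here is a minimal sketch of what the regular expression route might look like. The function name split_with_regex is made up for this illustration; re.findall and the \d and \D character classes are part of Python's standard re module.

```python
import re

def split_with_regex(string):
    # \d+ matches a run of one or more digits,
    # \D+ matches a run of one or more non-digit characters
    return re.findall(r'\d+|\D+', string)

print(split_with_regex('amir-tech23medium 203ld'))
# ['amir-tech', '23', 'medium ', '203', 'ld']
```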

However, if you are like me, you might find regular expressions unreasonably difficult. Therefore, I came up with a simple algorithm that does the job.

Please feel free to look at the code and take a moment to see if the explanation below makes sense to you.

    def split_text_from_digit(string):
        result = []
        text = ''
        digit = ''
        # Kind of the previous character: "digit", "text", or None at the start
        prev_char = None
        for char in string:
            if char.isdigit():
                if prev_char == "digit" or prev_char is None:
                    digit += char
                else:
                    # A run of text just ended: store it and start a new number
                    if text != '':
                        result.append(text)
                    text = ''
                    digit = char
                prev_char = "digit"
            else:
                if prev_char == "text" or prev_char is None:
                    text += char
                else:
                    # A run of digits just ended: store it and start new text
                    if digit != '':
                        result.append(digit)
                    digit = ''
                    text = char
                prev_char = "text"

        # Store whichever run was still being accumulated when the string ended
        if digit != '':
            result.append(digit)
        if text != '':
            result.append(text)
        return result

    print(split_text_from_digit('amir-tech23medium 203ld'))

output:

['amir-tech', '23', 'medium ', '203', 'ld']

The algorithm iterates through the string one character at a time and checks whether each character is a digit, remembering what kind of character came before it. If the current character is a digit and the previous character was a digit as well, it is simply added to the digit variable. However, if the previous character was not a digit, then whatever has accumulated in the text variable so far is added to the results. Afterward, both the digit and text variables are reset, and the current character is added to the digit variable.

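Similarly, if the current character is not a digit and neither was the previous one, we continue by adding it to the text variable. But if the previous character was a digit, the number is complete: it is added to the results, both variables are reset, and the current character starts a new piece of text. Note that "text" here means anything that is not a digit, so spaces and punctuation are treated as text too. Once the loop ends, whatever is left in either variable is added to the results.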
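The same idea of grouping consecutive characters of the same kind can also be sketched with itertools.groupby from the standard library, still without any regular expressions. This is just a compact alternative, not a replacement for the step-by-step version; the function name split_with_groupby is made up for this illustration.

```python
from itertools import groupby

def split_with_groupby(string):
    # groupby bundles consecutive characters that share the same key,
    # here whether str.isdigit() returns True or False for the character
    return [''.join(run) for _, run in groupby(string, key=str.isdigit)]

print(split_with_groupby('amir-tech23medium 203ld'))
# ['amir-tech', '23', 'medium ', '203', 'ld']
```

Each time the key flips between True and False, groupby starts a new group, which plays the same role as comparing the current character to the previous one in the loop.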

The limitation of this algorithm is that it cannot handle floats: the decimal point is not a digit, so it is treated as text. For example, '3.14' is split into '3', '.', and '14'.
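One possible way around this, as a rough sketch only: treat a dot as part of the number when it comes right after a digit, is followed by a digit, and the number does not contain a dot yet. The function name split_text_from_number and this rule are my own assumptions for the illustration; it does not handle signs, leading dots such as '.5', or exponents.

```python
def split_text_from_number(string):
    result = []
    current = ''               # the run being accumulated
    current_is_number = None   # kind of the current run; None at the start
    for i, char in enumerate(string):
        is_number_char = char.isdigit()
        # Assumption: a dot belongs to the number only if we are inside a
        # number without a dot so far and the next character is a digit
        if char == '.' and current_is_number and '.' not in current:
            if i + 1 < len(string) and string[i + 1].isdigit():
                is_number_char = True
        # The kind of character changed: store the finished run
        if current and is_number_char != current_is_number:
            result.append(current)
            current = ''
        current += char
        current_is_number = is_number_char
    # Store the run that was still open when the string ended
    if current:
        result.append(current)
    return result

print(split_text_from_number('pi3.14x'))
# ['pi', '3.14', 'x']
```

A dot at the end of a sentence, as in 'v2.', stays text because no digit follows it.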

Amir Ali Hashemi

I'm an AI student who attempts to find simple explanations for questions and share them with others