8
\$\begingroup\$

I've been learning python for a few days now, and I programmed this password generator after today's lesson (following the 100 days of code on udemy course). This code works like it should and gives me the desired results, but out of pure curiosity, I am interested in optimizing the code even more.

import random

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
numbers = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
symbols = ['!', '#', '$', '%', '&', '(', ')', '*', '+']

print("Welcome to the PyPassword Generator!")
nr_letters = int(input("How many letters would you like in your password?\n"))
nr_symbols = int(input(f"How many symbols would you like?\n"))
nr_numbers = int(input(f"How many numbers would you like?\n"))

password = []

for req in range(0, nr_letters + 1):
    letter_index = random.randint(0, len(letters))
    password.append(letters[letter_index-1])

for sym in range(1, nr_symbols + 1):
    symbol_index = random.randint(0,len(symbols))
    password.append(symbols[symbol_index-1])

for num in range(1, nr_numbers + 1):
    number_index = random.randint(0,len(numbers))
    password.append(numbers[number_index-1])

random.shuffle(password)
print(*password,sep='')
\$\endgroup\$
3
  • \$\begingroup\$ Tangentially related, but if you relax the constraints on the password, you get stronger passwords, and the code simplifies to "".join(random.choices(f"{string.ascii_letters}0123456789!#$%&()*+", k=int(input('The password length, please!')))) as well. \$\endgroup\$ Commented Jul 7 at 13:10
  • 1
    \$\begingroup\$ Probably too small for a review (only one suggestion, and I like the existing answer very much - almost nothing to add), but a suggestion: besides interactive usage, consider making this a proper CLI utility. Check out typer - it will save you a bit of manual input parsing, while also providing both script arguments and interactive prompts out of the box. (Why care about this - because many people prefer writing ./something.py --digits 4 --letters 8 --symbols 1 instead of talking to interactive prompts or use it in scripts easier) \$\endgroup\$
    – STerliakov
    Commented Jul 8 at 0:56
  • 1
    \$\begingroup\$ And for inspiration, there's a well-known utility with similar functionality - pwgen. You may want to read its manpage for some ideas that might be worth including in your script. \$\endgroup\$
    – STerliakov
    Commented Jul 8 at 0:57

2 Answers 2

13
\$\begingroup\$

Remarks:

  • Bug: you have an extra letter sneaking in with for req in range(0, nr_letters + 1):, making your output longer than it should be. That should be for req in range(0, nr_letters): or for req in range(nr_letters):. The other calls like for sym in range(1, nr_symbols + 1): can be simplified as well. Try to eliminate these little + and - operations. You generally don't need them, and off by one errors are one of the classic subtle programming bugs.
  • In the line above, req is unused, so replace it with a _, by convention.
  • Avoid indexing--prefer random.choices to select random items from a list (random.choice if you just want to select one item).
    • Tiny bug: random.randint(0,len(numbers)) should be random.randint(0, len(numbers) - 1) since the end is inclusive, and don't subtract 1 once you have this index. You're subtracting 1 later, so your range is -1 to len(numbers) - 1. The -1 is a valid index in Python so your code won't raise an IndexError, but it means there'll be a bit of extra weight added to the last item.
  • Separate user interaction (I/O) and algorithm logic.
  • Use a function to make your code reusable, testable and abstract away complexity.
  • Add type hints to your function and check them with pyright.
  • Don't use f-strings unless there's actually interpolation happening.
  • Don't assume input is valid; handle errors gracefully.
  • There is no performance issue with your code, so "optimization" doesn't matter at this point. Focus on coding style and writing maintainable code.
  • Add docstrings.
  • Consider adding unit tests. Since your code uses randomness, you can use a regex or logic to make sure the password has the right stuff in it, or seed the random library to be deterministic. But even writing a simple length test would have caught your primary bug mentioned above. Fuzz testing can also be used to catch edge case crashes and make sure your input sanitization and exception handling is tight. A comprehensive test suite means you can refactor with a lower likelihood of causing regressions.

Here's a rewrite:

from random import choices, shuffle
from string import ascii_letters, digits


def generate_password(nr_letters: int, nr_numbers: int, nr_symbols: int) -> str:
    """Generates a password with n random letters, numbers and symbols"""
    symbols = "!#$%&()*+"
    password = [
        *choices(ascii_letters, k=nr_letters),
        *choices(digits, k=nr_numbers),
        *choices(symbols, k=nr_symbols),
    ]
    shuffle(password)
    return "".join(password)


def read_positive_int(
    prompt: str, invalid: str = "Input must be a positive integer"
) -> int:
    """Reads a positive integer input from stdin, repeating until successful"""
    while True:
        try:
            if (n := int(input(prompt))) > 0:
                return n
        except ValueError:
            pass
        print(invalid)


def main():
    """Interacts with the user to generate a password"""
    print("Welcome to the PyPassword Generator!")
    nr_letters = read_positive_int("How many letters would you like in your password? ")
    nr_symbols = read_positive_int("How many symbols would you like? ")
    nr_numbers = read_positive_int("How many numbers would you like? ")
    password = generate_password(nr_letters, nr_numbers, nr_symbols)
    print(password)


if __name__ == "__main__":
    main()

You could generalize this a step further and accept arbitrary iterables and counts instead of hardcoded letters, numbers and symbols, but that seems a bit premature, so I left the function as-is in that respect.

3 positional args is getting to be a bit much, so you might want to make those kwargs as well, also left as a future improvement.

I'm not a security expert, so I can't remark on whether this should be used for actual passwords. Erring on the side of caution, I'd say don't use it in favor of an expert-vetted algorithm. See Typical password generator in Python for our community thread on the topic.

\$\endgroup\$
8
  • \$\begingroup\$ Thank you so much!! The word I was looking for was not optimization, but I didn’t know how else to describe what I’m trying to achieve. I’ll try grasping the concepts behind your suggestions as for example functions are still a bit foreign to me :) \$\endgroup\$
    – PSapola
    Commented Jul 7 at 0:03
  • \$\begingroup\$ I understand--optimization is a bit of a vague term and usually means "speed of the program" if there are no qualifiers. But you can also say "optimize the code to be maintainable". \$\endgroup\$
    – ggorlen
    Commented Jul 7 at 0:06
  • \$\begingroup\$ And one more thing noting the f-strings; I didn’t even notice them there as the first lines of this code (that includes all the lists for letters, symbols and numbers as well as the user input lines. Logic is coded by me.) are copied from the course I’m following along on udemy. But it’s a mistake on my part to not double check what I’m copying. I just thought the instructor doesn’t make mistakes as simple as these. \$\endgroup\$
    – PSapola
    Commented Jul 7 at 0:12
  • 1
    \$\begingroup\$ Might be more of a personal preference thing, but I always tend to view ... as "I'm going back to put something here later" while pass conveys "this does exactly what it does and we're not changing it further." \$\endgroup\$
    – smitelli
    Commented Jul 7 at 11:51
  • 1
    \$\begingroup\$ @CrisLuengo That was my initial thinking--I didn't see the harm in actually using it, but I'm not a security expert so I erred on the side of caution once I saw that other answer gaining traction. Later, I saw and upvoted your comment, which makes sense. I don't want to commit to making a recommendation here; I'm interested in Python code quality only, and it's incidental to my goals that the application happens to have possible security ramifications. \$\endgroup\$
    – ggorlen
    Commented Jul 8 at 16:36
9
\$\begingroup\$

Security

The documentation for the random module gives the following warning:

Warning: The pseudo-random generators of this module should not be used for security purposes. For security or cryptographic uses, see the secrets module.

The secrets module can be used in a similar way to the random module, but uses the best source of randomness available on your system. However, it doesn't provide an equivalent method to shuffle, so you'll have to rework your algorithm a bit. You can take inspiration from the "Recipes and best practices" section of the secrets documentation.

Index into string

Instead of using lists of single-character strings, you can index directly into strings:

>>> foo = 'abc'
>>> foo[1]
'b'

Taking advantage of this would make the definitions at the beginning of the program easier to write and more readable.

Note that strings are immutable, you can't use string indexing to change characters:

>>> foo[1] = 'd'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

In your case, changing the character sets would likely be a bug, so string immutability will catch these.

Use built-ins when possible

Even better, the string modules provide these constant strings, so the character set definitions can be simplified to:

from string import ascii_letters, digits, punctuation

This would ensure that no character character is omitted or duplicated, and reduce complexity, whether for writing the code or proofreading it.

You could also prefer to use ascii_lowercase and ascii_uppercase for more fine-grained control.

Factor code into functions

This improves reusability, testability and readability of your code. @ggorlen provides good feedback on how to do that.

\$\endgroup\$
7
  • 2
    \$\begingroup\$ Creating a password is not cryptography. It's not like someone is going to be able to recreate your password if they know your program, no matter how poor the source of randomness. OP's program will create stronger passwords than any a person would invent, so it's a win. \$\endgroup\$ Commented Jul 8 at 16:22
  • 1
    \$\begingroup\$ @CrisLuengo as a reviewer, I'd definitely point to that warning in documentation and ask to either justify why random is suitable here or replace it with secrets (or probably random.SystemRandom to keep interface the same). The docs are pretty clear: "In particular, secrets should be used in preference to the default pseudo-random number generator in the random module, which is designed for modelling and simulation, not security or cryptography" - I don't think that password generation is not "security". Since rewriting with SystemRandom would be trivial, there's no harm doing that. \$\endgroup\$
    – STerliakov
    Commented Jul 8 at 16:44
  • 2
    \$\begingroup\$ And it is definitely not less secure to use algorithms designed specifically for the use case. Unless there is some other reason (performance? unlikely), I'd insist on applying standard library solution targeting security-related use. Since random.Random is based on Mersenne Twister algorithm (fully reproducible with a known seed), which is widely accepted to be not applicable for secret generation, I don't believe your claim is reasonable. crypto.stackexchange.com/questions/12426/… \$\endgroup\$
    – STerliakov
    Commented Jul 8 at 16:48
  • \$\begingroup\$ @SUTerliakov A password is usually created by a person, and most of them are guessable. Anything you do to automate the password generation, even if it’s using a shitty pseudo-random generator seeded with the current time, is a big win for safety. This has nothing to do with secret generation in cryptography. —— There’s nothing wrong with pointing out the dangers of insecure RNG, I’m just pointing out that for this particular application you don’t need secure RNG. \$\endgroup\$ Commented Jul 8 at 17:05
  • 1
    \$\begingroup\$ This has nothing to do with secret generation in cryptography is simply wrong. Password generation is absolutely an activity that deserves cryptographic-strength prng, and given the cost (none) vs. benefit (some), we should always prefer the secrets approach. \$\endgroup\$
    – Reinderien
    Commented Jul 10 at 12:23

Not the answer you're looking for? Browse other questions tagged or ask your own question.