Clean Python: Names

What is Clean Code?

Clean Code is a book written in 2008 by Robert C. Martin. It compiles a lot of knowledge to help every developer write Clean Code.

But what is "clean code" exactly? Code is considered "clean" when it is:

  1. easy to read;
  2. easy to understand;
  3. easy to modify.

Why bother about that? Because we read more code than we write. We devote a lot of time to reading code to clearly understand what is going on before making some modifications – tiny ones, most of the time. We are authors that write code for readers. Make your code easy to read if you want to get things done quickly.

Okay, but that doesn't sound so hard to do! Well, it might be a bit harder than you think. Writing Clean Code is a lot like painting a picture. Most of us know when an image is painted well or poorly. But it doesn't mean that most of us know how to paint. Writing clean code requires a "sense of cleanliness"; some of us are born with it, and some of us have to learn it.

This book helps us to learn it.

Robert C. Martin, author of Clean Code

If you remember only two core principles from this book, they should be to keep your code as readable and as simple as possible. And it starts with finding good names.

And while you learn about Clean Code, remember the Boy Scout rule:

Leave the code you are editing a little cleaner than you found it.

This simple rule will help you improve your codebase and make your daily work easier.

Bad Code

Let's start with a few simple examples.

# Elapsed time in days
d = 10

That doesn't look so bad at first sight. What is the problem here? If you came across the variable d in the middle of a large function, you would ask yourself what it means and why it is there. The name reveals nothing about what it contains; only the comment above the declaration does.

Now, what if you named that variable elapsed_time_in_days, days_since_creation, days_since_modification or file_age_in_days? That way, you reveal the intention and can delete the comment. What does it take to rename it? Most modern code editors do it automatically. Finding a good name takes time, so rename as soon as you find a better one.

A good name has to answer the following big questions:

  • Why does it exist?
  • What does it do?
  • How is it used?

A better solution would have been to eliminate that integer and use a more appropriate type.

import datetime
elapsed_time = datetime.timedelta(days=10)

Nowadays, most code editors show the type of a variable when you hover over it, and the type tells you how the object can be used. Don't go straight for primitives when you have to create a variable or a function argument. Stop for a few seconds and ask yourself how to reveal your intention with an appropriate name or type.
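As a minimal sketch of that idea, here is a function whose argument is a timedelta rather than a bare integer. The names MAXIMUM_FILE_AGE and is_stale, and the 30-day threshold, are hypothetical and only serve the illustration.

```python
import datetime

# Hypothetical threshold, chosen only for this illustration.
MAXIMUM_FILE_AGE = datetime.timedelta(days=30)


def is_stale(file_age: datetime.timedelta) -> bool:
    """Return True when the file is older than the allowed maximum."""
    return file_age > MAXIMUM_FILE_AGE


file_age = datetime.timedelta(days=10)
print(is_stale(file_age))                     # False
print(is_stale(datetime.timedelta(weeks=6)))  # True
```

The caller can no longer pass hours where days were expected: the unit travels with the value, and the signature documents it.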

Let's look at another example. What is the purpose of this code?

def get_them(self):
    a_list = []
    for x in self.the_list:
        if x[0] == 4:
            a_list.append(x)
    return a_list

It's a pretty simple method, but taken apart from the rest of the class, it doesn't speak for itself. This code is too implicit.

Now, let's do some simple refactoring.

def get_pending_pull_requests(self):
    pending_pull_requests = []
    for pull_request in self.embarked_pull_requests:
        if pull_request[STATUS_VALUE] == PENDING:
            pending_pull_requests.append(pull_request)
    return pending_pull_requests

We did some simple alterations here. We renamed some variables, found a good name for the method, and extracted some constants. Now the method seems pretty straightforward, right? All these refactorings can be done automatically with modern code editors, so it's pretty trivial.
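For the refactored method to run on its own, the constants and a surrounding class have to exist somewhere. Here is a minimal sketch, assuming each pull request is a tuple whose first item is a numeric status code. The class name MergeQueue, the tuple layout and the constant values (taken from the 0 and 4 of the first version) are assumptions for illustration.

```python
# Assumed layout: each pull request is a (status, number) tuple.
STATUS_VALUE = 0  # index of the status code inside the tuple
PENDING = 4       # status code that means "pending"


class MergeQueue:
    """Hypothetical owner of the embarked pull requests."""

    def __init__(self, embarked_pull_requests):
        self.embarked_pull_requests = embarked_pull_requests

    def get_pending_pull_requests(self):
        pending_pull_requests = []
        for pull_request in self.embarked_pull_requests:
            if pull_request[STATUS_VALUE] == PENDING:
                pending_pull_requests.append(pull_request)
        return pending_pull_requests


queue = MergeQueue([(4, 101), (2, 102), (4, 103)])
print(queue.get_pending_pull_requests())  # [(4, 101), (4, 103)]
```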

We could go further and create a dataclass for pull requests, to make the code even more human-readable.

def get_pending_pull_requests(self):
    pending_pull_requests: list[PullRequest] = []
    for pull_request in self.embarked_pull_requests:
        if pull_request.is_pending:
            pending_pull_requests.append(pull_request)
    return pending_pull_requests

This refactoring is more complex to apply, as self.embarked_pull_requests may be used in a lot of places — but you get the idea.
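What such a dataclass could look like is sketched below. The Status enum, the fields of PullRequest and the free function are assumptions made for the example; the article only tells us that a pull request exposes is_pending.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Hypothetical set of states a pull request can be in."""
    PENDING = "pending"
    MERGED = "merged"


@dataclass
class PullRequest:
    number: int
    status: Status

    @property
    def is_pending(self) -> bool:
        return self.status is Status.PENDING


def get_pending_pull_requests(
    embarked_pull_requests: list[PullRequest],
) -> list[PullRequest]:
    """Keep only the pull requests that are still pending."""
    return [
        pull_request
        for pull_request in embarked_pull_requests
        if pull_request.is_pending
    ]


embarked = [
    PullRequest(101, Status.PENDING),
    PullRequest(102, Status.MERGED),
]
print([pr.number for pr in get_pending_pull_requests(embarked)])  # [101]
```

The magic index and the magic status code are gone: the type carries the meaning that the constants had to carry before.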

Advice on Naming

There are a few traps that you should know to be able to name pieces of code correctly.

  • First, two different names must be meaningfully different. Some developers change a name arbitrarily, just to tell two variables apart and satisfy the interpreter, as in pull_request and the_pull_request, or pull_request_1 and pull_request_2. Each variable exists for a reason, and that reason must be visible in its name. Take the time to work out why both exist and come up with names such as pull_request and pull_request_updated. This is also very useful for code completion.
  • Talking about code completion, avoid similar names (names that vary in a small way) and very long names. Code completion becomes a valuable tool in a large codebase, and names that are too close to each other make it hard to use.
  • Names should be pronounceable and searchable. Pronounceable, so that we don't sound like idiots when we talk about code, which happens more often than you think. Searchable, so that references are quick to find. Avoid single-letter names and numeric constants. GITHUB_MAXIMUM_REVIEW_REQUEST is easier to find than 15, and it says a lot about that number. Nothing is worse than a random number used in a condition in the middle of nowhere: the code becomes untouchable as we slowly forget what the number means. Note that the length of a name should correspond to the size of its scope. A short name is acceptable for a local variable in a small context.
  • Avoid double negatives. For example, replace the condition not is_invalid with is_valid; it makes conditions easier to read and reduces complexity.
  • Also, avoid humor when naming. Use kill() instead of say_your_prayers(), or abort() instead of oops().
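Two of these points, the searchable constant and the positive predicate, fit in a short sketch. The function names and the validation rule are hypothetical, and the value 15 is simply the one used in the text above, not a verified GitHub limit.

```python
# Value taken from the example above; not a verified GitHub limit.
GITHUB_MAXIMUM_REVIEW_REQUEST = 15


def is_valid(requested_reviewers: list[str]) -> bool:
    """Positive predicate: callers write `if is_valid(...)`,
    never `if not is_invalid(...)`."""
    return 0 < len(requested_reviewers) <= GITHUB_MAXIMUM_REVIEW_REQUEST


def describe_review_request(requested_reviewers: list[str]) -> str:
    if is_valid(requested_reviewers):
        return "accepted"
    return "rejected"


print(describe_review_request(["alice", "bob"]))  # accepted
print(describe_review_request([]))                # rejected
```

Searching the codebase for GITHUB_MAXIMUM_REVIEW_REQUEST finds every place the limit matters; searching for 15 would not.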

Conventions

Rely on conventions to find suitable names.

  • Class and object names should be nouns or noun phrases, like PullRequest, MergeAction or BranchUpdater. Avoid terms like Manager, Processor, Data or Info in the name; they are too vague.
  • Method and function names should contain a verb or verb phrase, like add_pull, remove_pull or send_merge_signal. You can prefix accessors with get (or create a @property), mutators with set and predicates with is, can or has. A long descriptive name is better than a short, enigmatic name or a long explanatory comment. Remember Ward Cunningham's principle:
You know you are working on clean code when each routine turns out to be pretty much what you expected.
  • Try to pick one word per concept. Terms like fetch, retrieve and get are confusing when used in the same codebase. Same for controller, manager and driver.
  • Remember that you talk to programmers, so use computer science terms, algorithm names, design pattern names, math terms… Otherwise, use names from the problem domain, especially when you work on core domain logic. If a developer doesn't know a word, they can ask a domain expert to explain it.
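Put together, these conventions could look like the sketch below. The MergeQueue class and its members are hypothetical; only add_pull and remove_pull come from the examples above.

```python
class MergeQueue:
    """Noun phrase for the class, verbs for actions, is/has for predicates."""

    def __init__(self) -> None:
        self._pull_numbers: list[int] = []

    @property
    def size(self) -> int:
        # Accessor exposed as a property instead of a get_size() method.
        return len(self._pull_numbers)

    def is_empty(self) -> bool:
        return not self._pull_numbers

    def has_pull(self, number: int) -> bool:
        return number in self._pull_numbers

    def add_pull(self, number: int) -> None:
        # Ignore pull requests that are already queued.
        if not self.has_pull(number):
            self._pull_numbers.append(number)

    def remove_pull(self, number: int) -> None:
        # Raises ValueError when the pull request is not queued.
        self._pull_numbers.remove(number)


queue = MergeQueue()
queue.add_pull(42)
print(queue.has_pull(42), queue.size)  # True 1
queue.remove_pull(42)
print(queue.is_empty())                # True
```

Each call site reads close to a sentence, which is the point of Ward Cunningham's remark: the routine does what its name leads you to expect.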

Conclusion

Changing names is the simplest thing you can do to improve your code, even if good names are hard to find, so don't be too harsh with yourself or other developers. Take time to wonder where a name comes from and how you could improve it. Take time to read code, too; you'll learn a lot of good practices from other developers.

The second thing you can do to improve your code is to keep things as small and simple as possible. This leads us to our next article about Clean Python because, as you might guess, it's not as simple as you think.

There are only two hard things in Computer Science: cache invalidation and naming things — Phil Karlton