I learned Python in my first computer science course at Berkeley. Through my work experience, I've found that there are some differences between the code that I wrote in college vs. the code that I write in production. This isn't an exhaustive list by any means, but here are some things off the top of my head that stick out to me.
1. Code style: PEP 8 and more
You can (and should) follow a style guide like PEP 8, which is easy enough with an IDE or tools like
autopep8. That's often not sufficient to make your code readable, though. Here are some examples:
What's wrong with the following pseudocode?
def check_xyz(): if cond1: ... # some additional computation if cond2: ... # some additional computation if cond3: return True return False
Answer: In general, you should avoid nested if statements like that. The code should be rewritten like this:
def check_xyz(): if not cond1: return False ... # some additional computation if not cond2: return False ... # some additional computation if not cond3: return False return True
Or if you're really just checking those three conditions, with no additional computation:
def check_xyz(): return cond1 and cond2 and cond3
# Rule: For a long list of arguments, one argument per line for better readability. some_function( first_arg, second_arg, third_arg, fourth_arg ) # becomes some_function( first_arg, second_arg, third_arg, fourth_arg ) # Rule: Avoid backslashes to break lines. def example_of_a_long_fn(count: int, another_arg: float) -> \ str: ... # becomes def example_of_a_long_fn( count: int, another_arg: float ) -> str: ...
Some other things I've noticed:
- Imports should be sorted so you can easily identify whether something has been imported. In my team at Two Sigma, we split imports into three groups: built-in modules, third-party modules, and our own modules. Within each group, we sort the imports by the module name (not by "from" or "import").
- You typically don't use
assertunless it's in a testing framework like pytest. Usually there's a more descriptive exception or error that you can raise, e.g., a ValueError for a bad argument. (There are definitely some valid use cases for assert, though I don't think example #1 is a very good one here: https://stackoverflow.com/a/18980471/657200)
- Static methods are typically not used in Python. You can just move the function out of the class and make it a module-level function.
- A personal preference of mine is to use named arguments for better readability.
2. Type checking in Python: mypy
typing module includes classes like
Dict, etc. You can use these to define the types that your function expects and returns.
def add_numbers(a, b): return a + b # becomes def add_numbers(a: float, b: float) -> float: return a + b
There are tons of advantages. When you run
mypy on your module (download it via pip), you'll easily catch a lot of errors. The type annotations also make it obvious to other developers how to use your function.
If you want to type check a dictionary, you have a few different options:
|Allows default parameters||No||Yes||Yes|
3. Docstrings and comments: Follow a convention
At Two Sigma, I'm using the NumPy docstring guide. It doesn't really matter which one you use as long as you stick to it. PyCharm, the IDE that I use, automatically generates the docstring templates in the correct format.
Not all docstrings are created equal. If a function is user-facing (developer-facing), then it should probably have a detailed docstring complete with parameter descriptions and examples. If it's a helper function or a function that external users won't see or use, then you don't necessarily need to describe the parameters or give examples.
Once you've written your docstrings, you can easily generate Sphinx documentation.
On other types of comments: Comments are typically used if you're doing something unintuitive, to explain the reasoning behind why you did it that way. For example, there might be some error that you want to suppress. You should comment and explain why you're justified in suppressing it, and at what point it should be fixed.
Typically, triple quote comments are only used for docstrings. Even if a comment takes a couple of lines, if it's in the middle of a function, our coding convention is to use a hash symbol.
4. Testing: pytest
This is something that a lot of students are exposed to in school, but I figured I would include it here for completeness just in case. You can read all about using pytest.
5. Printing output: the logging module
Instead of using
print('Example logging message') # becomes logger = logging.getLogger(__name__) logger.debug('Example logging message')
You would set the output level appropriately (debug, info, etc.). It's best practice to get the logger as shown above rather than calling
This way, users can easily view the output if they want with something like
While this isn't a complete list, I hope that this is helpful to some people!