8mo ago

Python Performance: Why 'if not list' is 2x Faster Than Using len()

blog.codingconfessions.com Python Performance: Why 'if not list' is 2x Faster Than Using len()

Discover why 'if not mylist' is twice as fast as 'len(mylist) == 0' by examining CPython's VM instructions and object memory access patterns.

You're viewing a single thread.

146 comments

I don't like it very much, my variable could also be None here
- You'd need to explicitly check for None if using the len() construct as well, so this doesn't change the point of the article.
  
  But None has no len
  undefined
  
  if not foo:
  -> foo could be an empty list or None, it is ambiguous.
  len(foo) will lead to an exception TypeError if foo is None, I can cleanly catch that.
  It suggests I deal with a boolean when that is not the case. Explicit is better than implicit, and if not foo to check for an empty list may be pythonic, but it's still implicit af
  
  Apart from the quote from the zen of python, does this really make your code better though? You will end up writing 4-5 lines with an extra level of indentation. The code does the same, but has worse performance and communicates the intent poorly (compared to the “pythonic” version).
  
  I am not saying it's better, just that I don't like the proposed way :) I would argue that being "pythonic" has even less value than the Zen, which I quoted because it's true, not because it is some strict rule (which it isn't anyway).
  You could argue I also need to write that extra code for the if not case, as I explicitly have to check if it is None if my program somewhere further down expects only lists.
  Hunting for those sweet milliseconds is a popular game in the Python community ;) if this mechanism is that important for your program, you should definitely use it, I would do as well!
  
  I think pythonic is more important than performance and I would still choose that version over a try-catch block, were it slower. Being pythonic means it represents a commonly understood pattern in Python code, therefore it is more efficient in communicating intent.
  
  Exactly. The point of following a code style is to make obvious patterns easy to spot and deviations stand out. That's why code style guidelines say your priorities should be:
  follow whatever style the code around it uses
  follow project style guidelines
  do the technically optimal option
  3 should only be prioritized if the win is big enough, and there should probably be a comment right there explaining why the deviation was made.
  
  it's still implicit
  I don't see it that way. If you're doing if len(foo) == 0, you're implying that foo is expected to not be None, and expecting an exception should not be the default assumption, because exceptions should be... exceptional.
  Here's what I assume:
  if foo is not None - empty values are explicitly acceptable
  if not foo - the difference between an empty and None value isn't important
  if len(foo) == 0 - implicit assumption that foo is not None (I frequently forget that len(...) raises on None)
  If an exception was intended by the last bullet point, I prefer an explicit raise:
  undefined
  
  if foo is None: raise ValueError("foo may not be None")
  I actually use schema validation to enforce this at the edge so the rest of my code can make reasonable assumptions, and I'm explicit about whether each field may or may not be None.
  
  My point is that if your variable can be None then you need the same pattern for the length check.
  So for the Pythonic version:
  undefined
  
  if (foo is not None) and not foo: ...
  For the explicit length check:
  undefined
  
  if (foo is not None) and (len(foo) == 0): ...
  Honestly you're probably better off using type hints and catching such things with static checks and not adding the None check.
  
  This is what I would come up with:
  py
  
  try: if len(foo) == 0: ... except TypeError: ...
  There is no need to add a None check, as foo being None should be considered as a faulty input. Avoiding the possibility of foo being None from the beginning using static checks or testing is of course the preferred solution. But in reality we do not work in such optimal environments, at least I can say that from the perspective of data science, where often procedural, untested code is produced that runs only a few times. But I get your point and I think both paths are viable, but I am also okay with being in the wrong here,
  
  That's terrible, and I would block that PR in a heartbeat, unless there was a very good reason for it (given context). I would instead prefer:
  undefined
  
  if foo is None: ...
  Exceptions are useful for bubbling up errors, they're a massive code smell if you're catching something thrown by local logic. Just like you shouldn't catch IndexError right after indexing a list, you shouldn't catch TypeError right after checking the length. If you need to check parameters, check them at the start of your function and return early.
  
  Sir, I will make sure to never bother you with a PR and my terrible, terrible code ;)
  
  Rejecting a PR shouldn't be offensive, it should be a learning opportunity, both for the reviewer and the submitter. If I reject it, I'll give a clear reason why, and suggestions on how to fix it. I'll also engage in conversation if you're not clear on why I made a given comment, as well as a defense for why your code should be accepted as-is (i.e. that context I'm talking about).
  So please bother me with terrible, terrible code. I want to take time out of my day to help contributors learn, and I like pointing out areas where I learn something as well (like, "hey, this is really clever and also really easy to read, good job!"). I'm not always right, but I do have a lot of experience that I think others could benefit from. I know I was deeply appreciative of constructive criticism as a new dev, and I hope that's true for the people I provide reviews for.

146 comments