This is a short article showing an example of using the key argument with Python’s built-in min and max functions. I’ve used this argument in the past when working with sorted, but I wasn’t aware that it’s not the only place where it can be used.
Example: finding the most recent event from a collection of events
Let’s assume we’re working with some kind of event data. Each event has 2 attributes: a timestamp (when it happened) and a value.
Here’s its representation in the code:
from dataclasses import dataclass

@dataclass
class SomeEvent:
    timestamp: int  # epoch in milliseconds
    value: int
You’ve collected a batch of such events (which are not sorted by timestamp).
events = [
    SomeEvent(timestamp=1720270961271, value=100),
    SomeEvent(timestamp=1720270974201, value=105),
    SomeEvent(timestamp=1720270953749, value=102),
    SomeEvent(timestamp=1720270980151, value=110),
    SomeEvent(timestamp=1720270967495, value=97),
]
How would you return the event with the minimum or maximum timestamp?
Solution
For loop / sorted
Initially I thought I could just iterate over these events and keep track of the event with the min/max timestamp seen so far, replacing it whenever the current event’s timestamp is lower/higher. Something like:
from typing import Sequence

def find_event_with_min_timestamp_1(events: Sequence[SomeEvent]) -> SomeEvent:
    event_with_min_timestamp = events[0]
    for event in events[1:]:
        if event.timestamp < event_with_min_timestamp.timestamp:
            event_with_min_timestamp = event
    return event_with_min_timestamp
Another approach you could take is to sort the events using the timestamp as the key and return the first (or the last) one:
def find_event_with_min_timestamp_2(events: Sequence[SomeEvent]) -> SomeEvent:
    return sorted(events, key=lambda x: x.timestamp)[0]
The latter solution is much more concise - it became a one-liner.
What would happen, though, if we passed an empty sequence as an argument? Well, both versions would raise an IndexError. Let’s add a guarding if statement and return None in that case:
def find_event_with_min_timestamp_1(events: Sequence[SomeEvent]) -> SomeEvent | None:
    if not events:
        return None
    event_with_min_timestamp = events[0]
    for event in events[1:]:
        if event.timestamp < event_with_min_timestamp.timestamp:
            event_with_min_timestamp = event
    return event_with_min_timestamp

def find_event_with_min_timestamp_2(events: Sequence[SomeEvent]) -> SomeEvent | None:
    if not events:
        return None
    return sorted(events, key=lambda x: x.timestamp)[0]
Because of the extra if statement, the sorted version is no longer a one-liner.
Can we do better than that?
I think we can.
min function
It turns out that, similarly to sorted, Python’s built-in min and max functions also accept a key argument: a one-argument function that is applied to each element to produce the value used for comparison.
Our function would look like this:
def find_event_with_min_timestamp_3(events: Sequence[SomeEvent]) -> SomeEvent:
    return min(events, key=lambda x: x.timestamp)
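To see the key argument in action, here is a small self-contained sketch that repeats the SomeEvent definition and the sample batch from above and runs min over it. The variable name oldest is mine, used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SomeEvent:
    timestamp: int  # epoch in milliseconds
    value: int


events = [
    SomeEvent(timestamp=1720270961271, value=100),
    SomeEvent(timestamp=1720270974201, value=105),
    SomeEvent(timestamp=1720270953749, value=102),
    SomeEvent(timestamp=1720270980151, value=110),
    SomeEvent(timestamp=1720270967495, value=97),
]

# The key function is applied to each element and its results are compared,
# but min returns the original element, not the key value.
oldest = min(events, key=lambda x: x.timestamp)
print(oldest)  # SomeEvent(timestamp=1720270953749, value=102)
```

Note that the whole event comes back, so its value is still available without a second lookup.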
What if min receives an empty iterable? A ValueError is raised. However, you can handle such cases by providing a default argument:
def find_event_with_min_timestamp_3(events: Sequence[SomeEvent]) -> SomeEvent | None:
    return min(events, key=lambda x: x.timestamp, default=None)
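A quick way to convince yourself of the empty-input behaviour is the sketch below. It repeats the SomeEvent definition so it can run on its own; the names no_events, raised and result are only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SomeEvent:
    timestamp: int  # epoch in milliseconds
    value: int


no_events: list[SomeEvent] = []

# Without default, an empty iterable makes min raise ValueError.
try:
    min(no_events, key=lambda x: x.timestamp)
    raised = False
except ValueError:
    raised = True

# With default, that value is returned instead of raising.
result = min(no_events, key=lambda x: x.timestamp, default=None)

print(raised, result)  # True None
```

Keep in mind that default has to be passed as a keyword argument, and that it only applies to the single-iterable form of min and max.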
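Although all 3 versions implement the same logic, the min one is the most concise. Unlike the sorted version, it also needs only a single pass over the events instead of sorting the whole sequence.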
Closing remarks
In this article I’ve shown an example of using the key argument with the min and max built-in functions. It may save you some lines of code.
BTW, I don’t always strive to pack everything into the fewest lines of code. Make sure that your code remains readable.
(16/52) This is the 16th post of my blogging challenge (publishing 52 posts in 2024).