Todo Kata - Python Part 1
Posted on October 11, 2020 in Tutorial
In this part of the Todo kata, we will cover the Python implementation. If you are new to the series, I would recommend reading the intro first. It should give sufficient background to be able to follow along even without being familiar with previous posts.
Series Outline
- Intro
- F# Series
- Python Series
- Part 1 - Done (you are here)
- Part 2 - Todo
- Part 3 - SQLite
Full source code is available here.
Setup
Note: The version of Python on my machine - Python 3.6 - is relatively old, and there are several nice features that have been added to the language since which I will point out along the way.
We will start by creating a folder for the project and setting up a virtual environment:
$ mkdir PyTodo
$ cd PyTodo
$ python3 -m venv venv
$ code .
VS Code will automatically detect and activate the virtual environment, but you can also manually run source ./venv/bin/activate
on a *nix system or .\venv\Scripts\activate
on Windows.
To use mypy as the linter, we install it and configure the .vscode/settings.json.
(venv) $ python -m pip install mypy
.vscode/settings.json
{
"python.linting.mypyEnabled": true,
"python.linting.enabled": true,
}
Of course, you can use another linter and skip the type hints. However, even for a simple program like this, type hints caught a number of bugs at design time.
With our setup complete, we can start writing some code.
done_domain.py
The domain logic of done
consists just of the CompletedItem
class and some functions for returning dates we will use to query for completed items.
from datetime import datetime, timedelta
from typing import NamedTuple
class CompletedItem(NamedTuple):
item: str
completed_on: datetime
def __str__(self) -> str:
return f'[{self.completed_on.isoformat()}] {self.item}'
def create_default(item: str) -> CompletedItem:
return CompletedItem(item, datetime.now())
The CompletedItem
class is just a NamedTuple
with a nice string representation.
We will use that when writing a CompletedItem
to a file and when printing it to the console.
(Note: a dataclass
would also be a nice way of implementing CompletedItem
in Python 3.7+.)
The date functions are nothing special:
def _start_of_day(date: datetime) -> datetime:
return date.replace(hour=0, minute=0, second=0, microsecond=0)
def _start_of_week(date: datetime) -> datetime:
return _start_of_day(date - timedelta(days=date.isoweekday()))
def days_ago(date: datetime, num_days: int) -> datetime:
return _start_of_day(date - timedelta(days=num_days))
def weeks_ago(date: datetime, num_weeks: int) -> datetime:
return _start_of_week(date - timedelta(days=7*num_weeks))
Note the leading _
to mark "private" functions - we will not use _start_of_day
or _start_of_week
outside of this module.
file.py
Next we will create some helper functions for reading from and writing to files.
We will only use append_line
for done
(the others will be used in todo
), but we may as well define them all now.
import os
from typing import List
def write_all_lines(path: str, lines: List[str]):
with open(path, 'w') as f:
for line in lines:
print(line, file=f)
def append_line(path: str, line: str):
with open(path, 'a') as f:
print(line, file=f)
def read_all_lines(path: str) -> List[str]:
if not os.path.exists(path):
return []
with open(path) as f:
return [line.strip() for line in f.readlines()]
The print(line, file=f)
idiom is a nice way of writing a line to a file without having to remember to specify the new line at the end.
It is easy to forget the \n
in f.writelines([f'{line}\n' for line in lines])
and get everything on one line!
It is also easy to forget to strip()
the new lines when reading a file, but such is life.
done_file.py
You may have noticed in Part 1 of the F# series that we had domain functions (in domain.fs) related to parsing completed items from strings. Since that logic is specific to file persistence, it really does not belong in the domain - we will correct that this time around.
import re
import os
import file
import done_domain as done
from datetime import datetime
from typing import List, Optional
# define a file path where completed items will be stored
PATH = 'todo.done.txt'
def _parse_datetime(iso_date: str) -> datetime:
return datetime.strptime(iso_date, "%Y-%m-%dT%H:%M:%S.%f")
# could add a print statement when parse fails
def try_parse(done_item: str) -> Optional[done.CompletedItem]:
pattern = r'^\[(?P<completedOn>.*)\] (?P<item>.*)'
matches = re.match(pattern, done_item)
if matches:
completed_on = _parse_datetime(matches.group('completedOn'))
item = matches.group('item')
return done.CompletedItem(item, completed_on)
return None
def parse(done_item: str) -> done.CompletedItem:
'''Raises `ValueError` if parse fails.'''
completed_item = try_parse(done_item)
if completed_item:
return completed_item
else:
raise ValueError("`done_item` must be of format '[isodate] completed item'")
Recall that we are calling completed_on.isoformat()
to get a string representation of the completed time.
In Python 3.7+, there is an inverse function fromisoformat
to get back the datetime object, but since we are using 3.6 we have to parse manually.
It is useful to be familiar with how to do this anyway when the need arises to parse dates encountered in the wild.
The parse
function is basically the equivalent of our explicitly extracting the Option.Value
in the F# version - we do it to get rid of the Optional
while understanding we should be sure the parse does not fail.
You could see this as a drawback of using type hints in that we almost doubled the amount of parsing code just to have the types work out.
With try_parse
alone, we would still get a runtime exception if we tried to access the returned CompletedItem
, so you could argue we are no better off having parse
raise a runtime exception slightly earlier.
However, having try_parse
return an Optional[done.CompletedItem]
gives it an honest type signature.
Callers know that the parse may fail and return None
, enabling them to respond accordingly at design time without having to peek at the definition, look at any documentation, or run any code.
Without type hints, we may still assume that the parse could fail (hence the try
), but what is the failure behavior?
Does it raise an exception (and if so, what?), return a Tuple
of (success_status, value)
, return None
, something else?
Unit tests are a great way of exposing and documenting the API in cases like this without type hints (and may still be a good idea since type hints are ignored at runtime), but having a linter like mypy do some of that work for you definitely takes off cognitive load.
Finally, notice that in both try_parse
and parse
we assign a variable and then immediately use it in an if
statement.
Python 3.8 introduced assignment expressions (aka the walrus operator) to allow assigning a variable and using it in one line:
if (matches := re.match(pattern, done_item) is not None:
item = matches.group('item')
# ...
Wouldn't that be convenient? This use case of working with the match on a regular expression was actually part of the original justification for introducing assignment expressions into the language in PEP 572.
Now we can finish off with the functions for reading and writing completed items to a file.
def _check_completed_since(completed_item: Optional[done.CompletedItem], completed_since: datetime) -> bool:
return completed_since < completed_item.completed_on if completed_item else False
def get(path: str, completed_since: datetime) -> List[done.CompletedItem]:
if not os.path.exists(path):
return []
with open(path) as f:
return [
parse(line.strip())
for line in f.readlines()
if _check_completed_since(try_parse(line), completed_since)
]
def save(path: str, completed_items: List[done.CompletedItem]) -> None:
for completed_item in completed_items:
file.append_line(path, str(completed_item))
We use Python's ability to easily operate on lines in a file as they are read (for line in f.readlines()
) to filter out items completed before the specified date.
Our returning an Optional
from try_parse
alerts us that _check_completed_since
will have to account for the possibility of a null reference.
Also note we explicitly write str(completed_item)
to the file to get the __str__
representation we defined for the CompletedItem
class.
shared.py
The last piece we need before we get to the executable program is a function for displaying to the console.
We will put this in a "shared" module since both done
and todo
will use it.
from typing import Any, Iterable
def display(data: Iterable[Any]):
for datum in data: print(str(datum))
We make a call to str
just to be explicit (print
will already use the __str__
representation implicitly).
done.py
Now we can actually configure and define the CLI (command-line interface).
We will use partial application (via partial
from functools
) to create get
and save
functions which will access the path defined in done_file.py to read and write completed items.
We will also define a helper function save_from_string
to save a collection of strings as completed items.
This will be more helpful later when we technically pass a list of todo items to be completed, but it also gives us the ability to save multiple completed items at once with one command with a little modification to the command-line parsing.
We will leave that as an exercise.
import done_file
import done_domain
from functools import partial
from typing import Iterable
save = partial(done_file.save, done_file.PATH)
get = partial(done_file.get, done_file.PATH)
def save_from_string(items: Iterable[str]):
save([done_domain.create_default(item) for item in items])
With the configuration complete, we can work on handling the incoming command-line arguments. We will use the click package to help with this.
(venv) $ python -m pip install click
I was hesitant at first to install a new package as opposed to just parsing from sys.argv
or using argparse
, but I have to admit after trying it out of curiosity that click
is wonderful to use.
import click
import shared
from datetime import datetime
# ...
@click.group()
def cli(): pass
@cli.command(name='a')
@click.argument('completed_item', type=str)
def save_cli(completed_item: str):
save_from_string([completed_item])
@cli.command(name='d')
@click.argument('number_of_days_ago', type=int)
def get_by_days(number_of_days_ago):
completed_since = done_domain.days_ago(datetime.now(), number_of_days_ago)
shared.display(get(completed_since))
@cli.command(name='w')
@click.argument('number_of_weeks_ago', type=int)
def get_by_weeks(number_of_weeks_ago):
completed_since = done_domain.weeks_ago(datetime.now(), number_of_weeks_ago)
shared.display(get(completed_since))
if __name__ == "__main__":
cli()
The @click.group()
decorator lets us define a mutually exclusive set of commands which are configured with the @cli.command()
decorator.
In this case, we define 'a', 'd', and 'w' commands to 'add' a completed item and retrieving items completed a number of 'days' or 'weeks' ago.
The call to cli()
under the if __name__ == "__main__"
idiom handles parsing the incoming command-line arguments and dispatching to the appropriate function.
That's it! We can run the program with commands like this:
(venv) $ python done.py a "complete python todo kata part 1"
(venv) $ python done.py d 0
[2020-10-11T10:33:03.696278] complete python todo kata part 1
Wrapping Up
That wraps up the Python implementation of done
.
Next we will cover todo
and get a small look at reactive programming with RxPY.