Welcome to Chicken Turtle Util’s documentation!

Chicken Turtle Util (CTU) is a Python utility library.

The API reference starts with an overview of all features and then details each of them; most entries include examples. For a full overview of features, see both the module contents overview of the API reference and the table of contents of the user guide (in the sidebar), as they complement each other.

The API reference makes heavy use of a type language; for example, to describe exactly what arguments can be passed to a function.

Dependencies are grouped by module. For example, when using chicken_turtle_util.data_frame, you should pip install 'chicken_turtle_util[data_frame]'. To install dependencies of all modules, use pip install 'chicken_turtle_util[all]'. If you are not familiar with pip, see pip’s quickstart guide.

While all features are documented and tested, the API changes frequently. When it changes, the major version is bumped and a changelog is kept to help you upgrade. Fixes will not be backported. It is recommended to pin the major version in your setup.py, e.g. for 2.x.y:

install_requires = ['chicken_turtle_util==2.*', ...]
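As a minimal sketch of how the per-module extras and the major version pin combine, the snippet below builds a requirement string in standard pip syntax. The choice of the data_frame extra and of major version 4 is illustrative, not a recommendation.

```python
# Requirement string for a hypothetical project's setup.py.
# The extra ('data_frame') and the pinned major version (4) are
# illustrative; pick the ones your project actually needs.
MAJOR_VERSION = 4
EXTRAS = ['data_frame']

requirement = 'chicken_turtle_util[{}]=={}.*'.format(','.join(EXTRAS), MAJOR_VERSION)

# Pass this list as install_requires=... to setuptools.setup().
install_requires = [requirement]
print(install_requires)
```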

Contents:

API reference

See modules for a short description of each module. For a full listing of the contents of all modules, see the module contents overview.


Modules

algorithms
asyncio Extensions to asyncio.
click click utilities
configuration
data_frame
debug
dict
exceptions Exception classes: UserException and InvalidOperationError.
function Function manipulation, like functools.
hashlib hashlib additions
http HTTP utilities.
inspect Similar to inspect module.
iterable Utility functions for working with iterables.
logging Logging utilities.
multi_dict multi-dict utilities. Multi-dicts can map keys to multiple values.
observable Observable collections.
path Extensions to pathlib.
pymysql
series
set Set utilities.
sqlalchemy
test Test utilities.

Module contents overview

algorithms

multi_way_partitioning
spread_points_in_hypercube
toset_from_tosets

asyncio

stubborn_gather Stubbornly wait for awaitables, despite some of them raising

click

argument Like click.argument, but by default required=True
assert_runs Invoke click command and assert it completes successfully
option Like click.option, but by default show_default=True, required=True
password_option Like click.option, but by default prompt=True, hide_input=True, show_default=False, required=True.

configuration

ConfigurationLoader

data_frame

assert_equals
equals
replace_na_with_none
split_array_like

debug

pretty_memory_info

dict

pretty_print_head
DefaultDict
invert
assign

exceptions

exc_info Get exc_info tuple from exception
UserException Exception with message to show the user.
InvalidOperationError When an operation is illegal/invalid (in the current state), regardless of what arguments you throw at it.
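To illustrate what an exc_info tuple is, here is a minimal standard-library sketch. The helper name exc_info_of is hypothetical and is not CTU's function; the actual signature of exceptions.exc_info may differ.

```python
import traceback

def exc_info_of(exception):
    """Hypothetical helper: build a (type, value, traceback) tuple."""
    return (type(exception), exception, exception.__traceback__)

info = None
try:
    raise ValueError('bad input')
except ValueError as ex:
    info = exc_info_of(ex)

# The tuple has the shape that functions in the traceback module accept.
lines = traceback.format_exception(*info)
print(''.join(lines))
```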

function

compose Compose functions
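Function composition can be sketched with the standard library as follows. The name compose_sketch and the right-to-left application order are assumptions made for illustration; check the reference entry of function.compose for its actual behaviour.

```python
from functools import reduce

def compose_sketch(*functions):
    """
    Hypothetical helper: compose_sketch(f, g)(x) == f(g(x)).

    With no functions given, the result is the identity function.
    """
    def composed(x):
        # Apply the last function first, then work towards the first one.
        return reduce(lambda value, function: function(value), reversed(functions), x)
    return composed

to_abs_str = compose_sketch(str, abs)
print(to_abs_str(-3))
```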

hashlib

base85_digest Get base 85 encoded digest of hash
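A base 85 encoded digest can be produced with the standard library alone, as sketched below. The helper name base85_digest_sketch is hypothetical, and base64.b85encode is only one of several base 85 variants; which variant CTU uses is not stated here.

```python
import base64
import hashlib

def base85_digest_sketch(hash_):
    """Hypothetical helper: base 85 encode the digest of a hashlib hash object."""
    return base64.b85encode(hash_.digest()).decode('ascii')

# A SHA-512 digest is 64 bytes; base 85 encodes every 4 bytes as 5 characters.
digest = base85_digest_sketch(hashlib.sha512(b'hello'))
print(len(digest), digest)
```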

http

download Download an HTTP resource to a file

inspect

call_args Get function call arguments as a single dict

iterable

sliding_window Iterate using a sliding window
partition Split iterable into partitions
is_sorted Get whether iterable is sorted ascendingly
flatten Flatten shallowly zero or more times
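The following sketch shows one way to express a sliding window and a sortedness check using only the standard library. The function names, the size parameter and its default are assumptions for illustration; they are not the signatures of iterable.sliding_window or iterable.is_sorted.

```python
from collections import deque
from itertools import islice

def sliding_window_sketch(iterable, size=2):
    """Yield consecutive overlapping tuples of `size` items."""
    iterator = iter(iterable)
    window = deque(islice(iterator, size), maxlen=size)
    if len(window) == size:
        yield tuple(window)
    for item in iterator:
        window.append(item)  # the deque drops the oldest item itself
        yield tuple(window)

def is_sorted_sketch(iterable):
    """Return True if each item is <= its successor."""
    return all(a <= b for a, b in sliding_window_sketch(iterable, size=2))

print(list(sliding_window_sketch(range(5), size=3)))
print(is_sorted_sketch([1, 2, 2, 3]))
```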

logging

configure Configure root logger to log INFO to stderr and DEBUG to log file.
set_level Temporarily change log level of logger
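Temporarily changing a log level is naturally expressed as a context manager. The sketch below uses only the standard library; set_level_sketch is a hypothetical name, and whether logging.set_level takes a logger object or a logger name is not assumed here.

```python
import logging
from contextlib import contextmanager

@contextmanager
def set_level_sketch(logger, level):
    """Hypothetical helper: set `logger` to `level` for the duration of a with block."""
    original = logger.level
    logger.setLevel(level)
    try:
        yield
    finally:
        logger.setLevel(original)  # restore even if the body raised

logger = logging.getLogger('set_level_example')
logger.setLevel(logging.WARNING)
with set_level_sketch(logger, logging.DEBUG):
    level_inside = logger.level
level_after = logger.level
```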

multi_dict

MultiDict A multi-dict view of a {hashable => {hashable}} dict.

observable

Set Observable set

path

assert_equals Assert 2 files are equal
assert_mode Assert last 3 octal mode digits match given mode exactly
chmod Change file mode bits
hash Hash file or directory
read Get file contents
remove Remove file or directory (recursively), unless it’s missing
write Create or overwrite file with contents
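As an illustration of "remove file or directory (recursively), unless it's missing", here is a small pathlib-based sketch. remove_sketch is a hypothetical name and a simplification; it does not reproduce the handling of symlinks and disappearing children that the changelog describes for path.remove.

```python
import shutil
import tempfile
from pathlib import Path

def remove_sketch(path):
    """Hypothetical helper: remove a file or directory tree; do nothing if missing."""
    if path.is_dir() and not path.is_symlink():
        shutil.rmtree(str(path))
    else:
        try:
            path.unlink()
        except FileNotFoundError:
            pass  # already gone, which is the desired end state

with tempfile.TemporaryDirectory() as temp_dir:
    target = Path(temp_dir) / 'data'
    (target / 'nested').mkdir(parents=True)
    (target / 'nested' / 'file.txt').write_text('contents')
    remove_sketch(target)   # removes the whole tree
    removed = not target.exists()
    remove_sketch(target)   # missing path: no error
```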

pymysql

patch

series

assert_equals
equals
invert
split

set

merge_by_overlap Of a list of sets, merge those that overlap, in place.
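Merging overlapping sets in place can be sketched as below. This is an illustrative algorithm under the assumption that the input is a plain list of sets; it is not necessarily how set.merge_by_overlap is implemented.

```python
def merge_by_overlap_sketch(sets):
    """Hypothetical helper: merge the sets in `sets` that overlap, in place."""
    i = 0
    while i < len(sets):
        merged = False
        # Walk backwards so that popping does not shift unvisited indices.
        for j in range(len(sets) - 1, i, -1):
            if sets[i] & sets[j]:
                sets[i] |= sets.pop(j)
                merged = True
        # A grown set may now overlap sets it did not overlap before,
        # so only move on once a full scan merged nothing.
        if not merged:
            i += 1

groups = [{1, 2}, {3, 4}, {2, 3}, {5}]
merge_by_overlap_sketch(groups)
print(groups)
```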

sqlalchemy

log_sql
pretty_sql

test

assert_text_contains Assert long string contains given string
assert_text_equals Assert long strings are equal
assert_matches
assert_search_matches
temp_dir_cwd pytest fixture that sets current working directory to a temporary directory

Python type language

When documenting code, it is often necessary to refer to the type of an argument or a return value. Here, I introduce a language for doing so in a semi-formal manner.

First off, I define these pseudo-types:

  • iterable: something you can iterate over once (or more) using iter
  • iterator: something you can call next on
  • collection: something you can iterate over multiple times
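The difference between these pseudo-types can be seen with built-in Python objects. The variable names below are only illustrative.

```python
numbers = [1, 2, 3]                 # collection: can be iterated multiple times
first_pass = list(numbers)
second_pass = list(numbers)         # same items again

iterator = iter(numbers)            # iterator: supports next()
first = next(iterator)
rest = list(iterator)               # consumes the remaining items
again = list(iterator)              # the iterator is exhausted by now

squares = (n * n for n in numbers)  # generator: an iterable that can be consumed once
squares_first_pass = list(squares)
squares_second_pass = list(squares)
```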

I define the rest of the type language through examples:

pathlib.Path

Expects a pathlib.Path-like, i.e. anything that looks like a pathlib.Path (duck typing) is allowed. None is not allowed.

exact(pathlib.Path)

Expects a Path or derived class instance, so no duck typing (and no None).

pathlib.Path or None

Expects a pathlib.Path-like or None. When None is allowed, it must be explicitly specified like this.

bool or int

Expects a boolean or an int.

{bool}

A set of booleans.

{any}

A set of anything.

{'apples' => bool, 'name' => str}

A dictionary with keys ‘apples’ and ‘name’, which have a boolean and a string as value, respectively. (Note that the : token is already used by Sphinx, and -> is usually used for lambdas, so we use => instead.)

dict(apples=bool, name=str)

Equivalent to the previous example.

Parameters
----------
field : str
dict_ : {field => bool}

A dictionary with a single key, given by the value of field, which here is another parameter (but it can be any expression, e.g. a global).

{apples => bool, name => str}

Not equivalent to the apples dict earlier. apples and name are references to the value used as a key.

(bool,)

Tuple of a single bool.

[bool]

List of 0 or more booleans.

[(bool, bool)]

List of tuples of boolean pairs.

[(first :: bool, second :: bool)]

Same type as the previous example, but the names make it easier to refer to the first and second bool in your parameter description.

{item :: int}

Set of int. We can refer to the set elements as item.

iterable(bool)

Iterable of bool. Something you can call iter on.

iterator(bool)

Iterator of bool. Something you can call next on.

type_of(expression)

The type of expression. Avoid this where possible; naming the type directly is more specific.

Parameters
----------
a : SomeType
b : type_of(a.nodes[0].key_function)

b has the type of the retrieved function.

(int, str, k=int) -> bool

Function that takes an int and a str as positional args, an int as keyword arg named ‘k’ and returns a bool.

func :: int -> bool

Function that takes an int and returns a bool. We can refer to it as func.
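To show the type language in context, here is a docstring for a hypothetical function, count_matching, written in the same Parameters layout as the examples above. Combining iterable(...) with named tuple elements is my extrapolation from the individual examples, not a form shown in this reference.

```python
def count_matching(pairs, predicate):
    """
    Count the pairs for which the predicate holds.

    Parameters
    ----------
    pairs : iterable((first :: int, second :: str))
        Pairs to inspect.
    predicate : (int, str) -> bool
        Called with ``first`` and ``second`` of each pair.

    Returns
    -------
    int
        Number of pairs for which ``predicate`` returned True.
    """
    return sum(1 for first, second in pairs if predicate(first, second))

pairs = [(1, 'a'), (2, 'b'), (3, 'ccc')]
matches = count_matching(pairs, lambda length, text: len(text) == length)
```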

Developer documentation

Documentation for developers/contributors of Chicken Turtle Util.

The project follows a simple project structure and associated workflow. Please read its documentation.

Project decisions

API design

If it’s a path, expect a pathlib.Path, not a str.

If extending a module from another project, e.g. pandas, use the same name as the module. Do not add from pandas import * to the extended module: although it would let the user access functions of the real pandas module through the extended module, we have no control over additions to the real pandas, which could lead to name clashes later on.

Decorators and context managers should not be provided directly but should be returned by a function. This avoids confusion over whether or not parentheses should be used (@f vs @f()), and parameters can easily be added in the future.
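A minimal sketch of this convention follows. The decorator logged and its prefix parameter are hypothetical; the point is only that the decorator is always obtained by calling a function, so a parameter can be added later without changing existing call sites.

```python
import functools

calls = []

def logged(prefix='call'):
    """Return a decorator; it is always used with parentheses: @logged()."""
    def decorator(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            calls.append('{} {}'.format(prefix, function.__name__))
            return function(*args, **kwargs)
        return wrapper
    return decorator

@logged()
def add(a, b):
    return a + b

@logged(prefix='invoke')  # a parameter added later; @logged() users are unaffected
def negate(a):
    return -a

result = add(1, 2) + negate(1)
```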

If a module is a collection of instances of something, give it a plural name, else make it singular. E.g. exceptions for a collection of Exception classes, but function for a set of related functions operating on functions.

API implementation

Do not prefix imports with an underscore. Imported names are also exported, but neither help nor the Sphinx documentation lists them, so users should realise they are not meant to be used. E.g. after import numpy as np in module.py, the name is accessible as module.np, but it is not mentioned in help(module) or in the Sphinx documentation.

Changelog

Semantic versioning is used (starting with v3.0.0).

v4.1.1

  • Fixes:
    • add missing keys to extras_require: hashlib, multi_dict, test

v4.1.0

  • Backwards incompatible changes: None
  • Enhancements/additions:
    • click.assert_runs: pass on extra args to click’s invoke()
    • path.chmod, path.remove: ignore disappearing children instead of raising
    • Add exceptions.exc_info: exc_info tuple as seen in function parameters in the traceback standard module
    • Add extras_require['all'] to setup.py: union of all extra dependencies
  • Fixes:
    • path.chmod: do not follow symlinks
    • iterable.flatten: removed debug prints: +, -
  • Internal / implementation details:
    • use simple project structure instead of Chicken Turtle Project
    • pytest-catchlog instead of pytest-capturelog
    • extras_require['dev']: test dependencies were missing
    • test_http created existing_file in working dir instead of in test dir

v4.0.1

  • Fixed: README formatting error

v4.0.0

  • Major:
    • path.digest renamed to path.hash (and added hash_function parameter)
    • renamed cli to click
    • require Python 3.5 or newer
    • Changed: asyncio.stubborn_gather:
      • raise CancelledError if all its awaitables raised CancelledError.
      • raise summary exception if any awaitable raises exception other than CancelledError
      • log exceptions, as soon as they are raised
  • Minor:
    • Added:
      • click.assert_runs
      • hashlib.base85_digest
      • logging.configure
      • path.assert_equals
      • path.assert_mode
      • test.assert_matches
      • test.assert_search_matches
      • test.assert_text_contains
      • test.assert_text_equals
  • Fixes:
    • path.remove: raised when path.is_symlink() or contains a symlink
    • path.digest/hash: directory hash collisions were more likely than necessary
    • pymysql.patch: change was not picked up in recent pymysql versions

v3.0.1

  • Fixed: README formatting error

v3.0.0

  • Removed:
    • cli.Context, cli.BasicsMixin, cli.DatabaseMixin, cli.OutputDirectoryMixin
    • pyqt module
    • URL_MAX_LENGTH
    • various module: Object, PATH_MAX_LENGTH
  • Enhanced:
    • data_frame.split_array_like: columns defaults to df.columns
    • sqlalchemy.pretty_sql: much better formatting
  • Added:
    • algorithms.toset_from_tosets: Create totally ordered set (toset) from tosets
    • configuration.ConfigurationLoader: loads a single configuration from one or more files, located in directories according to XDG standards
    • data_frame.assert_equals: Assert 2 data frames are equal
    • data_frame.equals: Get whether 2 data frames are equal
    • dict.assign: assign one dict to the other through mutations
    • exceptions.InvalidOperationError: raise when an operation is illegal/invalid, regardless of the arguments you throw at it (in the current state).
    • inspect.call_args: Get function call arguments as a single dict
    • observable.Set: set which can be observed for changes
    • path.chmod: change file or directory mode bits (optionally recursively)
    • path.digest: Get SHA512 checksum of file or directory
    • path.read: get file contents
    • path.remove: remove file or directory (recursively), unless it’s missing
    • path.write: create or overwrite file with contents
    • series.assert_equals: Assert 2 series are equal
    • series.equals: Get whether 2 series are equal
    • series.split: Split values
    • test.temp_dir_cwd: pytest fixture that sets current working directory to a temporary directory

v2.0.4

No changelog
