Welcome to Chicken Turtle Util’s documentation!¶
Chicken Turtle Util (CTU) is a Python utility library. It has been renamed to pytil; for the latest version, see pytil.
The API reference starts with an overview of all the features and then gets down to the details of each of them. Most of the reference provides examples. For a full overview of features, see both the module contents overview of the API reference and the table of contents of the user guide (in the sidebar); the two are complementary.
The API reference makes heavy use of a type language; for example, to describe exactly what arguments can be passed to a function.
Dependencies are grouped by module. For example, when using chicken_turtle_util.data_frame, you should pip install 'chicken_turtle_util[data_frame]'. To install dependencies of all modules, use pip install 'chicken_turtle_util[all]'. If you are not familiar with pip, see pip’s quickstart guide.
While all features are documented and tested, the API changes frequently. Each backwards incompatible change bumps the major version, and a changelog is kept to help you upgrade. Fixes will not be backported. It is recommended to pin the major version in your setup.py, e.g. for 2.x.y:
install_requires = ['chicken_turtle_util==2.*', ...]
Contents:
API reference¶
See modules for a short description of each module. For a full listing of the contents of all modules, see the module contents overview.
Modules¶
algorithms |
asyncio | Extensions to asyncio.
click | click utilities
configuration |
data_frame |
debug |
dict |
exceptions | Exception classes: UserException and InvalidOperationError.
function | Function manipulation, like functools.
hashlib | hashlib additions
http | HTTP utilities.
inspect | Similar to inspect module.
iterable | Utility functions for working with iterables.
logging | Logging utilities.
multi_dict | multi-dict utilities. Multi-dicts can map keys to multiple values.
observable | Observable collections.
path | Extensions to pathlib.
pymysql |
series |
set | Set utilities.
sqlalchemy |
test | Test utilities.
Module contents overview¶
algorithms
multi_way_partitioning |
spread_points_in_hypercube |
toset_from_tosets |
asyncio
stubborn_gather | Stubbornly wait for awaitables, despite some of them raising
click
argument | Like click.argument, but by default required=True
assert_runs | Invoke click command and assert it completes successfully
option | Like click.option, but by default show_default=True, required=True
password_option | Like click.option, but by default prompt=True, hide_input=True, show_default=False, required=True.
configuration
ConfigurationLoader |
data_frame
assert_equals |
equals |
replace_na_with_none |
split_array_like |
debug
pretty_memory_info |
dict
pretty_print_head |
DefaultDict |
invert |
assign |
exceptions
exc_info | Get exc_info tuple from exception
UserException | Exception with message to show the user.
InvalidOperationError | When an operation is illegal/invalid (in the current state), regardless of what arguments you throw at it.
function
compose | Compose functions
hashlib
base85_digest | Get base 85 encoded digest of hash
http
download | Download an HTTP resource to a file
inspect
call_args | Get function call arguments as a single dict
iterable
sliding_window | Iterate using a sliding window
partition | Split iterable into partitions
is_sorted | Get whether iterable is sorted ascendingly
flatten | Flatten shallowly zero or more times
logging
configure | Configure root logger to log INFO to stderr and DEBUG to log file.
set_level | Temporarily change log level of logger
multi_dict
MultiDict | A multi-dict view of a {hashable => {hashable}} dict.
observable
Set | Observable set
path
assert_equals | Assert 2 files are equal
assert_mode | Assert last 3 octal mode digits match given mode exactly
chmod | Change file mode bits
hash | Hash file or directory
read | Get file contents
remove | Remove file or directory (recursively), unless it’s missing
write | Create or overwrite file with contents
pymysql
patch |
series
assert_equals |
equals |
invert |
split |
set
merge_by_overlap | Of a list of sets, merge those that overlap, in place.
sqlalchemy
log_sql |
pretty_sql |
test
assert_text_contains | Assert long string contains given string
assert_text_equals | Assert long strings are equal
assert_matches |
assert_search_matches |
temp_dir_cwd | pytest fixture that sets current working directory to a temporary directory
Python type language¶
When documenting code, it is often necessary to refer to the type of an argument or a return value. Here, I introduce a language for doing so in a semi-formal manner.
First off, I define these pseudo-types:
- iterable: something you can iterate over once (or more) using iter
- iterator: something you can call next on
- collection: something you can iterate over multiple times
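The difference between these three pseudo-types can be illustrated with plain Python built-ins. This is only a sketch; the variable names are illustrative and not part of CTU.

```python
# A list is a collection: it can be iterated over any number of times.
numbers = [1, 2, 3]
first_pass = list(numbers)
second_pass = list(numbers)

# iter() works on any iterable and returns an iterator.
iterator = iter(numbers)
first_item = next(iterator)  # an iterator is something you can call next on

# An iterator is itself iterable, but only once: afterwards it is exhausted.
remaining = list(iterator)
after_exhaustion = list(iterator)

# A generator is an iterable (and an iterator), but not a collection.
squares = (n * n for n in numbers)
squares_once = list(squares)
squares_twice = list(squares)

print(first_pass, second_pass)
print(first_item, remaining, after_exhaustion)
print(squares_once, squares_twice)
```

A function documented as taking a collection may therefore loop over its argument twice, whereas one documented as taking an iterable should consume it in a single pass.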
I define the rest of the type language through examples:
pathlib.Path
Expects a pathlib.Path-like, i.e. anything that looks like a pathlib.Path (duck typing) is allowed. None is not allowed.
exact(pathlib.Path)
Expects a Path or derived class instance, so no duck typing (and no None).
pathlib.Path or None
Expects a pathlib.Path-like or None. When None is allowed, it must be explicitly specified like this.
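As a rough illustration of the three Path notations so far, the sketch below documents two functions with them. The functions describe and describe_exact and the class FakePath are hypothetical names made up for this example; they are not part of CTU.

```python
from pathlib import Path


def describe(path):
    """
    Parameters
    ----------
    path : pathlib.Path or None
        Path to describe. None means no path was given.

    Returns
    -------
    str
    """
    if path is None:
        return 'no path'
    # Duck typing: rely only on the attribute we need, not on the class.
    return path.name


def describe_exact(path):
    """
    Parameters
    ----------
    path : exact(pathlib.Path)

    Returns
    -------
    str
    """
    # exact(...) rules out duck typing, so an isinstance check is appropriate.
    # Instances of classes derived from Path still pass this check.
    if not isinstance(path, Path):
        raise TypeError('expected a pathlib.Path instance')
    return path.name


class FakePath:
    """Hypothetical Path-like object; it only provides ``name``."""
    name = 'fake.txt'


print(describe(None))
print(describe(Path('a/b.txt')))
print(describe(FakePath()))
```

Here describe accepts FakePath because it looks enough like a Path for its purposes, while describe_exact rejects it.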
bool or int
Expects a boolean or an int.
{bool}
A set of booleans.
{any}
A set of anything.
{'apples' => bool, 'name' => str}
A dictionary with keys ‘apples’ and ‘name’ which respectively have a boolean and a string as value. (Note that the : token is already used by Sphinx, and -> is usually used for lambdas, so we use => instead.)
dict(apples=bool, name=str)
Equivalent to the previous example.
Parameters
----------
field : str
dict_ : {field => bool}
A dictionary with one key, given by the value of field, which here is another parameter (but it can be any expression, e.g. a global).
{apples => bool, name => str}
Not equivalent to the apples dict earlier. Here, apples and name are not string literals but references to the values used as keys.
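To make the dictionary notations above concrete, here are values that would conform to them. This is a sketch; is_enabled and the example values are hypothetical and only serve as illustration.

```python
# {'apples' => bool, 'name' => str}: the keys are the literals 'apples' and 'name'.
literal_keys = {'apples': True, 'name': 'basket'}

# dict(apples=bool, name=str) describes the same shape.
same_shape = dict(apples=True, name='basket')


def is_enabled(field, dict_):
    """
    Parameters
    ----------
    field : str
    dict_ : {field => bool}
        The single key is the value of ``field``, not the string 'field'.

    Returns
    -------
    bool
    """
    return dict_[field]


print(literal_keys == same_shape)
print(is_enabled('debug', {'debug': False}))
```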
(bool,)
Tuple of a single bool.
[bool]
List of 0 or more booleans.
[(bool, bool)]
List of tuples of boolean pairs.
[(first :: bool, second :: bool)]
Equivalent to the previous example, but this way you can more easily refer to the first and second bool in your parameter description.
{item :: int}
Set of int. We can refer to the set elements as item.
iterable(bool)
Iterable of bool. Something you can call iter on.
iterator(bool)
Iterator of bool. Something you can call next on.
type_of(expression)
Type of expression. Avoid when possible in order to be as specific as possible.
Parameters
----------
a : SomeType
b : type_of(a.nodes[0].key_function)
b has the type of the retrieved function.
(int, str, k=int) -> bool
Function that takes an int and a str as positional args, an int as keyword arg named ‘k’ and returns a bool.
func :: int -> bool
Function that takes an int and returns a bool. We can refer to it as func.
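The function notations can be read as follows. This is a sketch in which matches and count_matching are hypothetical functions written only to show a conforming signature and a docstring that uses the notation.

```python
def matches(number, text, k=0):
    """
    A function conforming to ``(int, str, k=int) -> bool``.

    Returns whether the length of text plus k equals number.
    """
    return len(text) + k == number


def count_matching(values, func):
    """
    Parameters
    ----------
    values : iterable(int)
    func : int -> bool
        Predicate. Values for which func returns True are counted.

    Returns
    -------
    int
    """
    return sum(1 for value in values if func(value))


print(matches(5, 'abc', k=2))
print(count_matching([1, 2, 3, 4], lambda n: n % 2 == 0))
```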
Developer documentation¶
Documentation for developers/contributors of Chicken Turtle Util.
The project follows a simple project structure and associated workflow. Please read its documentation.
Project decisions¶
API design¶
If it’s a path, expect a pathlib.Path, not a str.
If extending a module from another project, e.g. pandas, use the same name as the module. While a from pandas import * would allow the user to access functions of the real pandas module through the extended module, we have no control over additions to the real pandas, which could lead to name clashes later on, so don’t.
Decorators and context managers should not be provided directly but should be returned by a function. This avoids confusion over whether or not parentheses should be used (@f vs @f()), and parameters can easily be added in the future.
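The decorator guideline above could look like this in practice. This is a minimal sketch: logged, its prefix parameter and the calls list are hypothetical names for illustration, not part of CTU.

```python
import functools

calls = []  # records each decorated call, for demonstration only


def logged(prefix='call'):
    """Return a decorator. It is always applied as @logged(), never @logged."""
    def decorator(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            calls.append('{} {}'.format(prefix, function.__name__))
            return function(*args, **kwargs)
        return wrapper
    return decorator


@logged()
def add(a, b):
    return a + b


# A parameter was added later without changing how existing users apply it.
@logged(prefix='invoke')
def double(x):
    return 2 * x


total = add(1, 2)
doubled = double(4)
print(total, doubled, calls)
```

Because the decorator is always obtained by calling logged, adding the prefix parameter does not break code that already uses @logged().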
If a module is a collection of instances of something, give it a plural name, else make it singular. E.g. exceptions for a collection of Exception classes, but function for a set of related functions operating on functions.
API implementation¶
Do not prefix imports with an underscore. Imported names are also exported, but neither help nor the Sphinx documentation will include them, so a user should realise they should not be used. E.g. import numpy as np in module.py can be accessed with module.np, but it isn’t mentioned in help(module) or the Sphinx documentation.
Changelog¶
Semantic versioning is used (starting with v3.0.0).
4.1.2¶
Announce rename to pytil.
4.1.1¶
- Fixes:
  - add missing keys to extras_require: hashlib, multi_dict, test
4.1.0¶
- Backwards incompatible changes: None
- Enhancements/additions:
  - click.assert_runs: pass on extra args to click’s invoke()
  - path.chmod, path.remove: ignore disappearing children instead of raising
  - Add exceptions.exc_info: exc_info tuple as seen in function parameters in the traceback standard module
  - Add extras_require['all'] to setup.py: union of all extra dependencies
- Fixes:
  - path.chmod: do not follow symlinks
  - iterable.flatten: removed debug prints: +, -
- Internal / implementation details:
  - use simple project structure instead of Chicken Turtle Project
  - pytest-catchlog instead of pytest-capturelog
  - extras_require['dev']: test dependencies were missing
  - test_http created existing_file in working dir instead of in test dir
v4.0.1¶
- Fixed: README formatting error
v4.0.0¶
- Major:
  - path.digest renamed to path.hash (and added hash_function parameter)
  - renamed cli to click
  - require Python 3.5 or newer
  - Changed:
    - asyncio.stubborn_gather:
      - raise CancelledError if all its awaitables raised CancelledError.
      - raise summary exception if any awaitable raises exception other than CancelledError
      - log exceptions, as soon as they are raised
- Minor:
  - Added:
    - click.assert_runs
    - hashlib.base85_digest
    - logging.configure
    - path.assert_equals
    - path.assert_mode
    - test.assert_matches
    - test.assert_search_matches
    - test.assert_text_contains
    - test.assert_text_equals
- Fixes:
  - path.remove: raised when path.is_symlink() or contains a symlink
  - path.digest/hash: directory hash collisions were more likely than necessary
  - pymysql.patch: change was not picked up in recent pymysql versions
v3.0.1¶
- Fixed: README formatting error
v3.0.0¶
- Removed:
  - cli.Context, cli.BasicsMixin, cli.DatabaseMixin, cli.OutputDirectoryMixin
  - pyqt module
  - URL_MAX_LENGTH
  - various module: Object, PATH_MAX_LENGTH
- Enhanced:
  - data_frame.split_array_like: columns defaults to df.columns
  - sqlalchemy.pretty_sql: much better formatting
- Added:
  - algorithms.toset_from_tosets: Create totally ordered set (toset) from tosets
  - configuration.ConfigurationLoader: loads a single configuration from one or more files directory according to XDG standards
  - data_frame.assert_equals: Assert 2 data frames are equal
  - data_frame.equals: Get whether 2 data frames are equal
  - dict.assign: assign one dict to the other through mutations
  - exceptions.InvalidOperationError: raise when an operation is illegal/invalid, regardless of the arguments you throw at it (in the current state).
  - inspect.call_args: Get function call arguments as a single dict
  - observable.Set: set which can be observed for changes
  - path.chmod: change file or directory mode bits (optionally recursively)
  - path.digest: Get SHA512 checksum of file or directory
  - path.read: get file contents
  - path.remove: remove file or directory (recursively), unless it’s missing
  - path.write: create or overwrite file with contents
  - series.assert_equals: Assert 2 series are equal
  - series.equals: Get whether 2 series are equal
  - series.split: Split values
  - test.temp_dir_cwd: pytest fixture that sets current working directory to a temporary directory
v2.0.4¶
No changelog