Text segmentation is the task of splitting a text of any length into segments by placing boundaries between atomic units (e.g., morphemes, words, lines, sentences, paragraphs, or sections).
This package is a collection of metrics for comparing text segmentations and evaluating automatic text segmenters. Both new (Boundary Similarity, Segmentation Similarity) and traditional (WindowDiff, Pk) metrics are included, as well as inter-coder agreement coefficients and confusion matrices based upon a boundary edit distance.
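To make the idea of a window-based metric concrete, here is a minimal, self-contained sketch of WindowDiff in plain Python. It is not this package's implementation: the function names, the representation of a segmentation as a tuple of segment sizes, and the default window size (half the mean reference segment size, a common convention) are all assumptions made for illustration.

```python
def masses_to_boundaries(masses):
    """Convert segment sizes into one 0/1 flag per gap between adjacent units.

    (2, 3) describes 5 units with a boundary after the 2nd unit,
    which gives [0, 1, 0, 0].
    """
    flags = []
    for mass in masses:
        flags.extend([0] * (mass - 1))
        flags.append(1)
    return flags[:-1]  # the end of the text is not a boundary


def window_diff(reference, hypothesis, k=None):
    """Illustrative WindowDiff: fraction of windows whose boundary counts differ.

    `reference` and `hypothesis` are tuples of segment sizes (an assumed
    representation) that must cover the same number of units.
    """
    ref = masses_to_boundaries(reference)
    hyp = masses_to_boundaries(hypothesis)
    if len(ref) != len(hyp):
        raise ValueError("segmentations must cover the same number of units")

    n_units = len(ref) + 1
    if k is None:
        # Assumed default: half the mean reference segment size, at least 1.
        k = max(1, round(n_units / len(reference) / 2))

    windows = n_units - k
    if windows <= 0:
        raise ValueError("window size must be smaller than the number of units")

    errors = 0
    for i in range(windows):
        # Compare boundary counts in the k gaps between unit i and unit i + k.
        if sum(ref[i:i + k]) != sum(hyp[i:i + k]):
            errors += 1
    return errors / windows


if __name__ == "__main__":
    reference = (2, 3, 6)   # three segments over 11 units
    hypothesis = (5, 6)     # misses the first boundary
    print(window_diff(reference, hypothesis, k=2))
```

A score of 0 means the two segmentations agree in every window; higher values indicate more disagreement. Here the hypothesis misses one boundary, so only the windows that span that boundary are penalized.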

.. image:: https://readthedocs.org/projects/segeval/badge/?version=latest
    :target: https://segeval.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status