The (Canonical) CitationExtractor
Todo
Insert one liner description.
About
TODO.
Overview
Running the extraction pipeline
There are two ways of running the citation extraction pipeline:
- by using a command line interface
- by calling its API directly
Command Line Interface
API
Input/Output
IOB/CONLL
Functions to deal with input/output of data in CONLL/IOB format.
- citation_extractor.io.iob.count_tokens(instances)
  Short summary.
  Parameters: instances (type) – Description of parameter instances.
  Returns: Description of returned object.
  Return type: type
- citation_extractor.io.iob.file_to_instances(inp_file)
  Reads an IOB file and converts it into a list of instances.
  Parameters: inp_file (type) – Path to the input IOB file.
  Returns: A list of tuples, where tuple[0] is the token and tuple[1] contains its assigned label.
  Return type: list