Welcome to pyorg’s documentation!¶
pyorg is a Python library for working with Org mode files and interacting with Org mode through Emacs itself.
This project, and especially the documentation, is a work in progress.
Installation¶
Emacs dependencies¶
pyorg requires the org-json package to be installed in Emacs so that it can extract syntax trees from files.
Installing the package¶
Just clone the repository and run the setup script:
git clone https://github.com/jlumpe/pyorg
cd pyorg
python setup.py install
Quickstart¶
Getting the data from Emacs to Python¶
Create the following example file in Emacs:
#+title: Example file
* Header 1
Section 1
** Header 2
Section 2
*** Header 3
Section 3
**** Header 4
Section 4
* Markup
A paragraph with *bold*, /italic/, _underline_, +strike+, =verbatim=, and ~code~
objects.
* TODO [#A] A headline with a TODO and tags :tag1:tag2:
DEADLINE: <2019-06-29 Sat>
Use the org-json-export-buffer command to export it as example.json.
Now, read the JSON file with pyorg:
import json
from pyorg.io import org_node_from_json

# Load the JSON exported from Emacs and build the AST from it
with open('example.json') as f:
    data = json.load(f)

doc = org_node_from_json(data)
Explore the AST structure¶
doc is an OrgNode which is the root node of the AST:
>>> doc
OrgOutlineNode(type='org-data')
It has the type org-data, which is always the root node of the buffer. Its contents are a section node and some more headline nodes:
>>> doc.contents
[OrgNode(type='section'),
OrgOutlineNode(type='headline'),
OrgOutlineNode(type='headline'),
OrgOutlineNode(type='headline')]
We can print a simple representation of the outline tree with the dump_outline() method:
>>> doc.dump_outline()
Example file
  0. Header 1
    0. Header 2
      0. Header 3
        0. Header 4
  1. Markup
  2. A headline with a TODO and tags
Get the 2nd headline (3rd item in root node’s contents) and print the full AST subtree, along with each node’s properties:
>>> hl2 = doc[2]
>>> hl2.dump(properties=True)
headline
  :archivedp = False
  :commentedp = False
  :footnote-section-p = False
  :level = 1
  :post-affiliated = 120
  :post-blank = 2
  :pre-blank = 0
  :priority = None
  :raw-value = 'Markup'
  :tags = []
  :title = ['Markup']
  :todo-keyword = None
  :todo-type = None
  0 section
    :post-affiliated = 129
    :post-blank = 2
    0 paragraph
      :post-affiliated = 129
      :post-blank = 0
      0 'A paragraph with '
      1 bold
        :post-blank = 0
        0 'bold'
      2 ', '
      3 italic
        :post-blank = 0
        0 'italic'
      4 ', '
      5 underline
        :post-blank = 0
        0 'underline'
      6 ', '
      7 strike-through
        :post-blank = 0
        0 'strike'
      8 ', '
      9 verbatim
        :post-blank = 0
        :value = 'verbatim'
      10 ', and '
      11 code
        :post-blank = 0
        :value = 'code'
      12 '\nobjects.\n'
Check the third headline’s properties to get the TODO information and tags:
>>> hl3 = doc[3]
>>> hl3.props
{'title': ['A headline with a TODO and tags'],
'deadline': OrgTimestampNode(type='timestamp'),
'post-affiliated': 301,
'commentedp': False,
'archivedp': False,
'footnote-section-p': False,
'post-blank': 0,
'todo-type': 'todo',
'todo-keyword': 'TODO',
'tags': ['tag1', 'tag2'],
'priority': 65,
'level': 1,
'pre-blank': 0,
'raw-value': 'A headline with a TODO and tags'}
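The priority value 65 in the output above is the character code of the letter in the [#A] cookie, since Org represents priorities as characters. A minimal standard-library illustration of that mapping follows; it does not use pyorg, and how the priority_chr attribute listed in the API reference behaves is not shown here.

```python
# Org stores a headline priority as the character code of its cookie letter.
priority = 65  # the value of hl3.props['priority'] in the output above

letter = chr(priority)
print(letter)  # A

# The reverse mapping, from cookie letter back to the stored code.
code = ord('A')
print(code)  # 65
```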
Org file structure¶
Nodes¶
See OrgNodeType, OrgNode.
Outline structure¶
See OrgOutlineNode.
Additional specialized node types¶
Reading in Org file data¶
The main function of this package is to read in Org mode documents as Abstract Syntax Trees (ASTs) where they can be processed and converted/exported into other formats. See the documentation for the org element API for more information about the AST structure.
Reading from JSON export¶
Rather than attempting to parse .org files directly, pyorg is designed to work with the output of the org-json Emacs package. This simply converts the AST generated by the org package itself into machine-readable JSON format. This has the advantage of also including all of your personal Org mode settings and customizations in Emacs (such as link abbreviations).
Parsing Org files directly¶
pyorg has very limited capability to parse .org files without the help of Emacs.
See the pyorg.parse module.
Interfacing directly with Org mode¶
High-level interface to Org mode¶
Converting org file data to other formats¶
Plain text¶
Creating your own converters¶
Subclass pyorg.convert.base.OrgConverterBase.
The agenda¶
Support for the agenda is a work in progress. See pyorg.agenda.
pyorg¶
pyorg package¶
Subpackages¶
pyorg.convert package¶
Subpackages¶
class pyorg.convert.html.converter.OrgHtmlConverter(config=None, **kw)
    Bases: pyorg.convert.base.OrgConverterBase
    DEFAULT_CONFIG = {'date_format': '%Y-%m-%d %a', 'image_extensions': ('.png', '.jpg', '.gif', '.tiff'), 'latex_delims': ('$$', '$$'), 'latex_inline_delims': ('\\(', '\\)'), 'resolve_link': {}}
    DEFAULT_RESOLVE_LINK = {'http': True, 'https': True}
    INLINE_NODES = frozenset({'example-block', 'target', 'link', 'latex-fragment', 'verbatim', 'subscript', 'paragraph', 'italic', 'underline', 'inline-babel-call', 'inline-src-block', 'statistics-cookie', 'code', 'superscript', 'fixed-width', 'timestamp', 'entity', 'export-snippet', 'line-break', 'macro', 'bold', 'radio-target', 'footnote-reference', 'table-cell', 'strike-through'})
    TAGS = {'babel-call': None, 'bold': 'strong', 'center-block': 'div', 'code': 'code', 'comment': None, 'example-block': 'pre', 'fixed-width': 'pre', 'headline': 'article', 'horizontal-rule': 'hr', 'italic': 'em', 'item': 'li', 'link': 'a', 'org-data': 'article', 'paragraph': 'p', 'property-drawer': None, 'quote-block': 'blockquote', 'radio-target': 'span', 'section': 'section', 'statistics-cookie': 'span', 'strike-through': 's', 'subscript': 'sub', 'superscript': 'sup', 'timestamp': 'span', 'underline': 'u', 'verbatim': 'span', 'verse-block': 'p'}
    convert(node, dom=False, **kwargs)
        Convert org node to HTML.
        Parameters:
        - node (pyorg.ast.OrgNode) – Org node to convert.
        - dom (bool) – Return HTML element instead of string.
        Return type: str or HtmlElement
pyorg.convert.html.converter.to_html(node, dom=False, **kwargs)
    Convert org node to HTML.
    Parameters:
    - node (pyorg.ast.OrgNode) – Org node to convert.
    - dom (bool) – Return HTML element instead of string.
    - kwargs – Keyword arguments to OrgHtmlConverter constructor.
    Return type: str or HtmlElement
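To illustrate how a node-type-to-tag table such as TAGS can drive a conversion, here is a minimal sketch that uses only the standard library. The (type, children) tuples and the render function are hypothetical stand-ins, not pyorg objects, and this is not how OrgHtmlConverter itself is implemented.

```python
from html import escape

# A subset of the TAGS mapping listed for OrgHtmlConverter.
TAGS = {'bold': 'strong', 'italic': 'em', 'underline': 'u',
        'strike-through': 's', 'code': 'code', 'paragraph': 'p'}


def render(node):
    """Render a stand-in node: either a str or a (type, children) tuple."""
    if isinstance(node, str):
        return escape(node)
    node_type, children = node
    inner = ''.join(render(child) for child in children)
    tag = TAGS.get(node_type)
    # In this sketch, types without a tag are rendered as their bare contents.
    return inner if tag is None else f'<{tag}>{inner}</{tag}>'


paragraph = ('paragraph', ['A paragraph with ', ('bold', ['bold']), ' & ',
                           ('italic', ['italic']), '.'])
print(render(paragraph))
# <p>A paragraph with <strong>bold</strong> &amp; <em>italic</em>.</p>
```

With a real document you would call to_html(node) on an OrgNode instead of building tuples by hand.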
class pyorg.convert.html.element.HtmlElement(tag, children=None, attrs=None, inline=False, post_ws=False)
    Bases: object
    Lightweight class to represent an HTML element.
    inline
        Whether to render children in an inline context. If False each child will be rendered on its own line. If True whitespace will only be added before/after children according to the post_ws attribute of the child.
        Type: bool
    classes
Export org mode AST nodes to HTML.
Submodules¶
Convert org mode AST nodes to JSON.
class pyorg.convert.json.OrgJsonConverter(config=None, **kw)
    Bases: pyorg.convert.base.OrgConverterBase
    DEFAULT_CONFIG = {'date_format': '%Y-%m-%d %a', 'image_extensions': ('.png', '.jpg', '.gif', '.tiff'), 'include_agenda_extra': True, 'include_agenda_headline': True, 'object_type_key': '$$data_type'}
Module contents¶
Convert org AST to other formats.
pyorg.elisp package¶
Submodules¶
Base classes for Emacs Lisp abstract syntax trees.
class pyorg.elisp.ast.Form
    Bases: pyorg.elisp.ast.ElispAstNode
    Pretty much everything is a form, right?
class pyorg.elisp.ast.Literal(pyvalue)
    Bases: pyorg.elisp.ast.Form
    Basic self-evaluating forms like strings, numbers, etc.
    PY_TYPES = (<class 'str'>, <class 'int'>, <class 'float'>)
class pyorg.elisp.ast.Symbol(name)
    Bases: pyorg.elisp.ast.Form
    Elisp symbol.
    isconst
class pyorg.elisp.ast.Cons(car, cdr)
    Bases: pyorg.elisp.ast.Form
    A cons cell.
class pyorg.elisp.ast.List(items)
    Bases: pyorg.elisp.ast.Form
    A list…
    islist = False
class pyorg.elisp.ast.Quote(form)
    Bases: pyorg.elisp.ast.Form
    A quoted Elisp form.
class pyorg.elisp.ast.Raw(src)
    Bases: pyorg.elisp.ast.ElispAstNode
    Just raw code to be pasted in at this point.
A DSL for writing Elisp in Python.
God help us all.
pyorg.elisp.dsl.E
    Singleton object which implements the DSL.
class pyorg.elisp.dsl.ElispSingleton
    Bases: object
    Singleton object which implements the DSL.
    static C(car, cds)
        Create a Cons cell, converting arguments.
    static Q(value)
        Quote value, converting Python strings to symbols.
    R
        alias of pyorg.elisp.ast.Raw
    static S(*names)
        Create a list of symbols.
Module contents¶
Build and print Emacs Lisp abstract syntax trees in Python.
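The idea of printing Elisp source from Python values can be sketched in a few lines. The Sym class and to_elisp function below are hypothetical stand-ins written for this illustration; they are not the pyorg.elisp classes, whose printing API is not documented above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """Stand-in for an Elisp symbol (hypothetical, not pyorg.elisp.ast.Symbol)."""
    name: str


def to_elisp(value):
    """Print a Python value as Elisp source text."""
    if isinstance(value, bool):  # check bool before int, since bool is an int
        return 't' if value else 'nil'
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # Elisp string literals escape backslashes and double quotes.
        escaped = value.replace('\\', '\\\\').replace('"', '\\"')
        return f'"{escaped}"'
    if isinstance(value, Sym):
        return value.name
    if isinstance(value, (list, tuple)):
        return '(' + ' '.join(to_elisp(item) for item in value) + ')'
    raise TypeError(f'cannot convert {type(value).__name__} to Elisp')


print(to_elisp([Sym('message'), 'Hello %s', 'world']))  # (message "Hello %s" "world")
print(to_elisp([Sym('+'), 1, 2.5]))                     # (+ 1 2.5)
```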
Submodules¶
pyorg.agenda module¶
class pyorg.agenda.OrgAgendaItem(text, **kwargs)
    Bases: object
    An agenda item.
    headline
        Headline node item came from.
        Type: OrgAstNode
    deadline
        Timestamp node.
        Type: OrgAstNode
    view_priority
        Relative priority assigned to the item in the agenda buffer it was exported from.
        Type: int
    List of tags.
    Type: list
    priority_code
pyorg.ast module¶
Work with org file abstract syntax trees.
See https://orgmode.org/worg/dev/org-syntax.html for a description of the org syntax.
class pyorg.ast.DispatchNodeType(default, registry=None, doc=None)
    Bases: pyorg.util.SingleDispatchBase
    Generic function which dispatches on the node type of its first argument.
pyorg.ast.NODE_CLASSES = {'headline': <class 'pyorg.ast.OrgOutlineNode'>, 'org-data': <class 'pyorg.ast.OrgOutlineNode'>, 'table': <class 'pyorg.ast.OrgTableNode'>, 'timestamp': <class 'pyorg.ast.OrgTimestampNode'>}
    Mapping from org element/node types to their Python class.
pyorg.ast.ORG_NODE_TYPES
= {'babel-call': OrgNodeType('babel-call'), 'bold': OrgNodeType('bold'), 'center-block': OrgNodeType('center-block'), 'clock': OrgNodeType('clock'), 'code': OrgNodeType('code'), 'comment': OrgNodeType('comment'), 'comment-block': OrgNodeType('comment-block'), 'diary-sexp': OrgNodeType('diary-sexp'), 'drawer': OrgNodeType('drawer'), 'dynamic-block': OrgNodeType('dynamic-block'), 'entity': OrgNodeType('entity'), 'example-block': OrgNodeType('example-block'), 'export-block': OrgNodeType('export-block'), 'export-snippet': OrgNodeType('export-snippet'), 'fixed-width': OrgNodeType('fixed-width'), 'footnote-definition': OrgNodeType('footnote-definition'), 'footnote-reference': OrgNodeType('footnote-reference'), 'headline': OrgNodeType('headline'), 'horizontal-rule': OrgNodeType('horizontal-rule'), 'inline-babel-call': OrgNodeType('inline-babel-call'), 'inline-src-block': OrgNodeType('inline-src-block'), 'inlinetask': OrgNodeType('inlinetask'), 'italic': OrgNodeType('italic'), 'item': OrgNodeType('item'), 'keyword': OrgNodeType('keyword'), 'latex-environment': OrgNodeType('latex-environment'), 'latex-fragment': OrgNodeType('latex-fragment'), 'line-break': OrgNodeType('line-break'), 'link': OrgNodeType('link'), 'macro': OrgNodeType('macro'), 'node-property': OrgNodeType('node-property'), 'org-data': OrgNodeType('org-data'), 'paragraph': OrgNodeType('paragraph'), 'plain-list': OrgNodeType('plain-list'), 'planning': OrgNodeType('planning'), 'property-drawer': OrgNodeType('property-drawer'), 'quote-block': OrgNodeType('quote-block'), 'radio-target': OrgNodeType('radio-target'), 'section': OrgNodeType('section'), 'special-block': OrgNodeType('special-block'), 'src-block': OrgNodeType('src-block'), 'statistics-cookie': OrgNodeType('statistics-cookie'), 'strike-through': OrgNodeType('strike-through'), 'subscript': OrgNodeType('subscript'), 'superscript': OrgNodeType('superscript'), 'table': OrgNodeType('table'), 'table-cell': OrgNodeType('table-cell'), 'table-row': 
OrgNodeType('table-row'), 'target': OrgNodeType('target'), 'timestamp': OrgNodeType('timestamp'), 'underline': OrgNodeType('underline'), 'verbatim': OrgNodeType('verbatim'), 'verse-block': OrgNodeType('verse-block')}
    Mapping from names of all AST node types to OrgNodeType instances.
class pyorg.ast.OrgNode(type_, props=None, contents=None, keywords=None, parent=None, outline=None)
    Bases: object
    A node in an org file abstract syntax tree.
    Implements the sequence protocol as a sequence containing its child nodes (identically to contents). Also allows accessing property values by indexing with a string key.
    type
        Node type, obtained from org-element-type.
        Type: OrgNodeType
    outline
        Most recent outline node in the node’s ancestors (not including self).
        Type: OrgOutlineNode
    children
        Iterator over all child AST nodes (in contents or keyword/property values).
    dump(index=None, properties=False, indent=' ', _level=0)
        Print a debug representation of the node and its descendants.
    is_outline = False
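The indexing behaviour described for OrgNode (integer index for child nodes, string index for properties) can be sketched with a small stand-in class. Node below is hypothetical and only mirrors the documented contract; it is not the pyorg implementation.

```python
from collections.abc import Sequence


class Node(Sequence):
    """Stand-in node: int or slice index gives children, str index gives a property."""

    def __init__(self, type_, props=None, contents=None):
        self.type = type_
        self.props = dict(props or {})
        self.contents = list(contents or [])

    def __len__(self):
        return len(self.contents)

    def __getitem__(self, key):
        if isinstance(key, str):
            return self.props[key]   # property lookup
        return self.contents[key]    # child lookup, same as self.contents[key]


headline = Node('headline', {'level': 1, 'raw-value': 'Markup'}, [Node('section')])
print(len(headline), headline[0].type, headline['raw-value'])  # 1 section Markup
```

In the quickstart, doc[2] and hl3.props rely on the same two access paths.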
class pyorg.ast.OrgNodeType
    Bases: pyorg.ast.OrgNodeType
    The properties of an org AST node type.
    is_element
        Whether this node type is an element. “An element defines syntactical parts that are at the same level as a paragraph, i.e. which cannot contain or be included in a paragraph.”
        Type: bool
    is_object
        Whether this node type is an object. All nodes which are not elements are objects. “An object is a part that could be included in an element.”
        Type: bool
    is_greater_element
        Whether this node type is a greater element. “Greater elements are all parts that can contain an element.”
        Type: bool
    is_object_container
        Whether this node type is an object container, i.e. can directly contain objects.
        Type: bool
class pyorg.ast.OrgOutlineNode(type_, *args, title=None, id=None, **kw)
    Bases: pyorg.ast.OrgNode
    Org node that is a component of the outline tree.
    Corresponds to the root org-data node or a headline node.
    section
        Org node with type “section” that contains the outline node’s direct content (not part of any nested outline nodes).
        Type: OrgNode
    has_todo
    is_outline = True
    outline_children
        Iterable over child outline nodes.
    priority_chr
class pyorg.ast.OrgTableNode(type_, props=None, contents=None, keywords=None, parent=None, outline=None)
    Bases: pyorg.ast.OrgNode
    An org node with type “table”.
    rows
        List of standard rows.
        Type: list of OrgNode
    blocks()
        Standard rows divided into “blocks”, which were separated by rule rows.
        Return type: list of list of OrgNode
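Splitting rows into blocks at rule rows can be illustrated with itertools.groupby. In this sketch rows are plain lists and rule rows are the string '---'; both are assumptions made for the example, since real table rows are OrgNode objects.

```python
from itertools import groupby


def split_blocks(rows, is_rule):
    """Group consecutive standard rows, treating rule rows as separators."""
    return [list(group)
            for rule, group in groupby(rows, key=is_rule)
            if not rule]


rows = [['a', 'b'], '---', ['1', '2'], ['3', '4'], '---', ['5', '6']]
blocks = split_blocks(rows, is_rule=lambda row: row == '---')
print(blocks)
# [[['a', 'b']], [['1', '2'], ['3', '4']], [['5', '6']]]
```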
class pyorg.ast.OrgTimestampNode(type_, *args, **kwargs)
    Bases: pyorg.ast.OrgNode
    An org node with type “timestamp”.
    begin
        Begin date, parsed from properties.
        Type: datetime
    end
        End date, parsed from properties.
        Type: datetime
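As a rough sketch of what "parsed from properties" can mean, the function below builds datetime objects from the year/month/day/hour/minute properties that the org element API defines for timestamps. It assumes these appear in the props dict under keys such as 'year-start' (without the leading colon, as in the quickstart output); timestamp_bounds is a hypothetical helper, not part of pyorg.

```python
from datetime import datetime


def timestamp_bounds(props):
    """Build (begin, end) datetimes from org-element style timestamp properties."""
    def build(suffix):
        return datetime(
            props[f'year-{suffix}'],
            props[f'month-{suffix}'],
            props[f'day-{suffix}'],
            props.get(f'hour-{suffix}') or 0,    # None for date-only timestamps
            props.get(f'minute-{suffix}') or 0,
        )
    return build('start'), build('end')


# Properties as they might look for the deadline <2019-06-29 Sat>.
props = {
    'year-start': 2019, 'month-start': 6, 'day-start': 29,
    'hour-start': None, 'minute-start': None,
    'year-end': 2019, 'month-end': 6, 'day-end': 29,
    'hour-end': None, 'minute-end': None,
}
begin, end = timestamp_bounds(props)
print(begin.isoformat())  # 2019-06-29T00:00:00
```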
pyorg.ast.as_secondary_string(obj)
    Convert argument to a “secondary string” (list of nodes or strings).
    Parameters: obj (OrgNode or str or list)
    Return type: list
    Raises: TypeError – if obj is not a str or OrgNode or iterable of these.
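The contract described for as_secondary_string can be sketched as follows. Node and to_secondary_string are hypothetical stand-ins that follow the documented behaviour (wrap a single item, accept an iterable, reject anything else); the real function may differ in detail.

```python
class Node:
    """Stand-in for an AST node (hypothetical)."""

    def __init__(self, type_):
        self.type = type_


def to_secondary_string(obj):
    """Return obj as a list of strings and/or nodes, or raise TypeError."""
    if isinstance(obj, (str, Node)):
        return [obj]  # a single item becomes a one-element list
    try:
        items = list(obj)
    except TypeError:
        raise TypeError(f'expected str, Node or iterable, got {type(obj).__name__}') from None
    for item in items:
        if not isinstance(item, (str, Node)):
            raise TypeError(f'invalid item of type {type(item).__name__}')
    return items


bold = Node('bold')
print(to_secondary_string('Markup'))                 # ['Markup']
print(len(to_secondary_string(['A ', bold, ' b'])))  # 3
```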
pyorg.ast.dispatch_node_type(parent=None)
    Decorator to create DispatchNodeType instance from default implementation.
pyorg.ast.get_node_type(obj, name=False)
    Get type of AST node, returning None for other types.
pyorg.ast.node_cls(type_)
    Register a node class for a particular type in NODE_CLASSES.
pyorg.emacs module¶
Interface with Emacs and run commands.
class pyorg.emacs.Emacs(cmd=('emacs', '--batch'), client=False, verbose=1)
    Bases: object
    Interface to Emacs program.
    eval(source, process=False, **kwargs)
        Evaluate ELisp source code and return output.
        Parameters:
        - source (str or list) – Elisp code. If a list of strings will be enclosed in progn.
        - process (bool) – If True return the subprocess.CompletedProcess object, otherwise just return the value of stdout.
        - kwargs – Passed to run().
        Returns: Command output or completed process object, depending on value of process.
    getoutput(args, **kwargs)
        Get output of command.
        Returns: Value of stdout.
    getresult(source, is_json=False, **kwargs)
        Get parsed result from evaluating the Elisp code.
        Return type: Parsed value.
    run(args, check=True, verbose=None)
        Run the Emacs command with a list of arguments.
        Raises: subprocess.CalledProcessError – If check=True and return code is nonzero.
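The way eval() is described (a list of forms wrapped in progn, then handed to the Emacs command) can be sketched by building the argument list by hand. build_eval_command is a hypothetical helper, and passing the code through Emacs's --eval option is an assumption about how such a call could be made, not a statement about pyorg's internals.

```python
def build_eval_command(source, cmd=('emacs', '--batch')):
    """Build an argument list that evaluates Elisp source with Emacs."""
    if not isinstance(source, str):
        # Several forms are wrapped in a single progn form.
        source = '(progn {})'.format(' '.join(source))
    return [*cmd, '--eval', source]


args = build_eval_command(["(require 'org)", '(princ (org-version))'])
print(args)
# ['emacs', '--batch', '--eval', "(progn (require 'org) (princ (org-version)))"]
```

Such a list could then be passed to subprocess.run(args, check=True, capture_output=True, text=True), which raises subprocess.CalledProcessError on a nonzero return code, matching the behaviour documented for run().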
pyorg.interface module¶
class pyorg.interface.Org(emacs=None, orgdir=None)
    Bases: object
    Interface to org mode.
    emacs
        Type: pyorg.emacs.Emacs
    orgdir
        Directory org files are read from.
        Type: OrgDirectory
    agenda(key='t', raw=False)
        TODO Read agenda information.
        Parameters: key (str) – TODO
        Return type: list[dict]
    open_org_file(path, focus=False)
        Open an org file in the org directory for editing in Emacs.
        Parameters:
        - path (str or pathlib.Path) – File path relative to org directory.
        - focus (bool) – Switch window/input focus to opened buffer.
    read_org_file(path, assign_ids=True)
        Read and parse an org file.
        Parameters:
        - path (str or pathlib.Path) – File path relative to org directory.
        - assign_ids (bool) – Assign IDs to outline nodes. See pyorg.ast.assign_outline_ids().
    read_org_file_direct(path, raw=False)
        Read and parse an org file directly from Emacs.
        Always reads the current file and does not use cached data, or perform any additional processing other than parsing.
        Parameters:
        - path (str or pathlib.Path) – File path relative to org directory.
        - raw (bool) – Don’t parse and just return raw JSON exported from Emacs.
class pyorg.interface.OrgDirectory(path)
    Bases: object
    The directory where the user’s org files are kept.
    path
        Absolute path to org directory.
        Type: pathlib.Path
    get_abs_path(path)
        Get absolute path from path relative to org directory.
        Path will be normalized with “..” components removed.
        Return type: pathlib.Path
        Raises: ValueError – If the path is not relative or is outside of the org directory (can happen if it contains “..” components).
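The checks described for get_abs_path (reject absolute paths, collapse ".." components, reject results outside the org directory) can be sketched with the standard library. resolve_in_root is a hypothetical helper that uses POSIX path semantics so the example behaves the same on any platform; it assumes root is already absolute and normalized.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_in_root(root, path):
    """Return root/path normalized, or raise ValueError if it escapes root."""
    root = PurePosixPath(root)
    path = PurePosixPath(path)
    if path.is_absolute():
        raise ValueError(f'path must be relative: {path}')
    # normpath collapses ".." components lexically, without touching the disk.
    full = PurePosixPath(posixpath.normpath(root / path))
    if full != root and root not in full.parents:
        raise ValueError(f'path is outside of {root}: {path}')
    return full


print(resolve_in_root('/home/user/org', 'notes/../todo.org'))  # /home/user/org/todo.org
```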
    list_files(path=None, recursive=False, hidden=False)
        List org files within the org directory.
        Paths are relative to the org directory.
        Parameters:
        - path (str or pathlib.Path) – Optional subdirectory to search through.
        - recursive (bool) – Recurse through subdirectories.
        - hidden (bool) – Include hidden files.
        Return type: Iterator over pathlib.Path instances.
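A listing with the same parameters can be sketched with pathlib globbing. iter_org_files is a hypothetical helper; treating any path component that starts with a dot as hidden is an assumption made for this example.

```python
import tempfile
from pathlib import Path


def iter_org_files(root, subdir=None, recursive=False, hidden=False):
    """Yield paths of .org files under root, relative to root."""
    root = Path(root)
    base = root / subdir if subdir else root
    pattern = '**/*.org' if recursive else '*.org'
    for file in sorted(base.glob(pattern)):
        rel = file.relative_to(root)
        # Skip files that are hidden or that live in a hidden directory.
        if not hidden and any(part.startswith('.') for part in rel.parts):
            continue
        if file.is_file():
            yield rel


with tempfile.TemporaryDirectory() as tmp:
    for name in ('a.org', '.hidden.org', 'sub/b.org', 'notes.txt'):
        target = Path(tmp, name)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    found = [p.as_posix() for p in iter_org_files(tmp, recursive=True)]

print(found)  # ['a.org', 'sub/b.org']
```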
pyorg.io module¶
Read (and write) org mode data from JSON and other formats.
pyorg.io.agenda_item_from_json(data)
    Parse an agenda item from JSON data.
    Parameters: data (dict)
    Return type: OrgAgendaItem
pyorg.parse module¶
(Partially) parse org files.
Parse tags from string.
    Parameters: string (str) – Tags separated by colons.
    Returns: List of tags.
    Return type: list[str]
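Splitting a colon-separated tag string, as found at the end of the third headline in the quickstart, takes one line of standard-library Python. split_tags is a hypothetical name used for illustration only.

```python
def split_tags(string):
    """Split a colon-separated tag string such as ':tag1:tag2:' into a list."""
    return [tag for tag in string.strip().split(':') if tag]


print(split_tags(':tag1:tag2:'))  # ['tag1', 'tag2']
```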
pyorg.parse.read_file_keywords(file)
    Read file-level keywords from an .org file (without using Emacs).
    Limitations: only reads up to the first element in the initial section (excluding comments). If the initial section does contain such an element, any keywords directly preceding it (not separated with a blank line) will be considered affiliated keywords of that element and ignored.
    Will not parse org markup in keyword values.
    All keys are converted to uppercase.
    Keys which appear more than once will have values in a list.
    Parameters: file – String or open file object or stream in text mode.
    Return type: dict
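A simplified version of this behaviour can be sketched with a regular expression. read_keywords below is a hypothetical helper: it uppercases keys, collects repeated keys into lists, and stops at the first line that is not a keyword, comment, or blank line. It deliberately ignores the affiliated-keyword rule described above.

```python
import re

KEYWORD_RE = re.compile(r'^#\+(\w+):\s*(.*)$')


def read_keywords(text):
    """Read leading '#+KEY: value' lines from org source text into a dict."""
    keywords = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or (line.startswith('#') and not line.startswith('#+')):
            continue  # skip blank lines and comments
        match = KEYWORD_RE.match(line)
        if match is None:
            break  # the first non-keyword element ends the keyword block
        key, value = match.group(1).upper(), match.group(2).strip()
        if key not in keywords:
            keywords[key] = value
        elif isinstance(keywords[key], list):
            keywords[key].append(value)
        else:
            keywords[key] = [keywords[key], value]
    return keywords


source = '#+title: Example file\n#+filetags: a\n# a comment\n#+FILETAGS: b\n\n* Header 1\n#+title: ignored\n'
print(read_keywords(source))
# {'TITLE': 'Example file', 'FILETAGS': ['a', 'b']}
```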
pyorg.util module¶
Misc. utility code.
class pyorg.util.SingleDispatch(default, registry=None, doc=None)
    Bases: pyorg.util.SingleDispatchBase
    Generic function which dispatches on the type of its first argument.
    validate_key(key)
        Validate and possibly replace a key before an implementation is registered under it.
        Default implementation simply returns the argument. Subclasses may wish to override this. An error should be raised for invalid keys.
        Parameters: key – Key passed to register().
        Return type: Key to use for registration, which may be different than argument.
class pyorg.util.SingleDispatchBase(default, registry=None, doc=None)
    Bases: abc.ABC
    ABC for a generic function which dispatches on some trait of its first argument.
    May be bound to an object or class as a method.
    Concrete subclasses must implement one of the get_key() or iter_keys() methods.
    default
        Default implementation.
        Type: callable
    bind(instance, owner=None)
        Get a version of the function bound to the given instance as a method.
        Parameters:
        - instance – Object instance to bind to.
        - owner
    register(key, impl=None)
        Register an implementation for the given key.
        Parameters:
        - key – Key to register method under. May also be a list of keys.
        - impl (callable) – Implementation to register under the given key(s). If None will return a decorator function that completes the registration.
        Returns: None if impl is given. Otherwise returns a decorator that will register the function it is applied to.
        Return type: function or None
    validate_key(key)
        Validate and possibly replace a key before an implementation is registered under it.
        Default implementation simply returns the argument. Subclasses may wish to override this. An error should be raised for invalid keys.
        Parameters: key – Key passed to register().
        Return type: Key to use for registration, which may be different than argument.
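The registration pattern described above (a default implementation, a key derived from the first argument, and register() usable either directly or as a decorator) can be sketched in a small class. KeyDispatch is a hypothetical stand-in, not the pyorg.util implementation, and it omits binding and key validation.

```python
class KeyDispatch:
    """Generic function that picks an implementation from a key of its first argument."""

    def __init__(self, default, get_key):
        self.default = default
        self.get_key = get_key
        self.registry = {}

    def register(self, key, impl=None):
        keys = key if isinstance(key, list) else [key]

        def decorator(func):
            for k in keys:
                self.registry[k] = func
            return func

        if impl is None:
            return decorator  # used as @dispatcher.register(key)
        decorator(impl)
        return None

    def __call__(self, arg, *args, **kwargs):
        impl = self.registry.get(self.get_key(arg), self.default)
        return impl(arg, *args, **kwargs)


# Dispatch on the 'type' entry of a dict standing in for a node.
describe = KeyDispatch(lambda node: 'other', get_key=lambda node: node['type'])


@describe.register(['bold', 'italic'])
def _describe_markup(node):
    return 'markup'


describe.register('headline', lambda node: 'outline')

print(describe({'type': 'bold'}), describe({'type': 'headline'}), describe({'type': 'table'}))
# markup outline other
```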