ckipnlp.container.util.parse_tree module
This module provides tree containers for parsed sentences.
- class ckipnlp.container.util.parse_tree.ParseNodeData(role: Optional[str] = None, pos: Optional[str] = None, word: Optional[str] = None)
Bases: BaseTuple, _ParseNodeData
A parse node.
- Variables
role (str) – the semantic role.
pos (str) – the POS-tag.
word (str) – the text term.
Note
This class is a subclass of tuple. To change an attribute, please create a new instance instead.
Data Structure Examples
- Text format
Used for from_text() and to_text().
    'Head:Na:中文字'  # role / POS-tag / text-term
- List format
Not implemented.
- Dict format
Used for from_dict() and to_dict().
    {
        'role': 'Head',   # role
        'pos': 'Na',      # POS-tag
        'word': '中文字',  # text term
    }
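The text and dict formats above can be sketched with a plain NamedTuple. NodeData below is a hypothetical stand-in written for illustration, not the ckipnlp class. It assumes the three-field 'role:POS-tag:word' form and only borrows the documented from_text(), to_text(), and to_dict() names.

```python
from typing import NamedTuple, Optional


class NodeData(NamedTuple):
    """Hypothetical stand-in for ParseNodeData (not the ckipnlp class)."""

    role: Optional[str] = None
    pos: Optional[str] = None
    word: Optional[str] = None

    @classmethod
    def from_text(cls, text: str) -> "NodeData":
        # Assumption: exactly three ':'-separated fields, as in the example.
        role, pos, word = text.split(":", 2)
        return cls(role, pos, word)

    def to_text(self) -> str:
        # Only valid when all three fields are set.
        return ":".join((self.role, self.pos, self.word))

    def to_dict(self) -> dict:
        return dict(self._asdict())


node = NodeData.from_text("Head:Na:中文字")
print(node.to_dict())

# Tuples are immutable, so "changing" a field means building a new instance.
renamed = node._replace(pos="Nab")
print(renamed.to_text())
```

The original node is left untouched by _replace(), which is the behaviour the note about tuple subclasses describes.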
- class ckipnlp.container.util.parse_tree.ParseNode(tag=None, identifier=None, expanded=True, data=None)
Bases: Base, Node
A parse node for tree.
- Variables
data (ParseNodeData) –
See also
treelib.tree.Node
Please refer to https://treelib.readthedocs.io/ for built-in usage.
Data Structure Examples
- Text format
Not implemented.
- List format
Not implemented.
- Dict format
Used for to_dict().
    {
        'role': 'Head',   # role
        'pos': 'Na',      # POS-tag
        'word': '中文字',  # text term
    }
- data_class
alias of ParseNodeData
- class ckipnlp.container.util.parse_tree.ParseRelation(head: ParseNode, tail: ParseNode, relation: ParseNode)
Bases: Base, _ParseRelation
A parse relation.
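- Variables
head (ParseNode) – the head node.
tail (ParseNode) – the tail node.
relation (ParseNode) – the relation node.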
Notes
The parent of the relation node is always the common ancestor of the head node and tail node.
Data Structure Examples
- Text format
Not implemented.
- List format
Not implemented.
- Dict format
Used for to_dict().
    {
        'head': { 'role': 'Head', 'pos': 'Nab', 'word': '中文字' },  # head node
        'tail': { 'role': 'particle', 'pos': 'Td', 'word': '耶' },  # tail node
        'relation': 'particle',  # relation
    }
- class ckipnlp.container.util.parse_tree.ParseTree(tree=None, deep=False, node_class=None, identifier=None)
Bases: Base, Tree
A parse tree.
See also
treelib.tree.Tree
Please refer to https://treelib.readthedocs.io/ for built-in usage.
Data Structure Examples
- Text format
Used for from_text() and to_text().
    'S(Head:Nab:中文字|particle:Td:耶)'
- List format
Not implemented.
- Dict format
Used for from_dict() and to_dict(). A dictionary such as { 'id': 0, 'data': { ... }, 'children': [ ... ] }, where 'data' is a dictionary with the same format as ParseNodeData.to_dict(), and 'children' is a list of dictionaries of subtrees with the same format as this tree.
    {
        'id': 0,
        'data': {
            'role': None,
            'pos': 'S',
            'word': None,
        },
        'children': [
            {
                'id': 1,
                'data': {
                    'role': 'Head',
                    'pos': 'Nab',
                    'word': '中文字',
                },
                'children': [],
            },
            {
                'id': 2,
                'data': {
                    'role': 'particle',
                    'pos': 'Td',
                    'word': '耶',
                },
                'children': [],
            },
        ],
    }
- Penn Treebank format
Used for from_penn() and to_penn().
    [
        'S',
        [ 'Head:Nab', '中文字', ],
        [ 'particle:Td', '耶', ],
    ]
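To show how the dict and text formats above relate, here is a minimal sketch that renders the documented dictionary as the documented text string. tree_to_text is a hypothetical helper working on plain dictionaries, not part of ckipnlp. Joining the non-empty fields with ':' and the children with '|' is inferred from the two examples only.

```python
def tree_to_text(node: dict) -> str:
    """Render a dict-format subtree in the text format (illustrative only)."""
    data = node["data"]
    # Skip fields that are None, so the root renders as 'S' rather than ':S:'.
    label = ":".join(
        field
        for field in (data["role"], data["pos"], data["word"])
        if field is not None
    )
    if not node["children"]:
        return label
    children = "|".join(tree_to_text(child) for child in node["children"])
    return f"{label}({children})"


tree = {
    "id": 0,
    "data": {"role": None, "pos": "S", "word": None},
    "children": [
        {
            "id": 1,
            "data": {"role": "Head", "pos": "Nab", "word": "中文字"},
            "children": [],
        },
        {
            "id": 2,
            "data": {"role": "particle", "pos": "Td", "word": "耶"},
            "children": [],
        },
    ],
}

print(tree_to_text(tree))  # S(Head:Nab:中文字|particle:Td:耶)
```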
- classmethod from_text(data)
Construct an instance from text format.
- Parameters
data (str) – A parse tree in text format (ParseClause.clause).
See also
- to_text(node_id=None)
Transform to plain text.
- Parameters
node_id (int) – Output the plain text format for the subtree under node_id.
- Returns
str
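The text format read by from_text() and written by to_text() can be parsed with a small recursive-descent routine. The sketch below is not the library's parser. parse_tree_text is a hypothetical name, node IDs are assumed to be assigned in pre-order (which matches the dict example above), and words containing '(', ')' or '|' are not handled.

```python
import re


def parse_tree_text(text: str) -> dict:
    """Parse e.g. 'S(Head:Nab:中文字|particle:Td:耶)' into the dict format."""
    # Split on the structural characters and keep them as tokens.
    tokens = [tok for tok in re.split(r"([()|])", text) if tok]
    pos = 0
    next_id = 0

    def parse_node() -> dict:
        nonlocal pos, next_id
        label = tokens[pos]
        pos += 1
        node_id = next_id
        next_id += 1

        is_internal = pos < len(tokens) and tokens[pos] == "("
        if is_internal:
            # Internal labels are assumed to be 'POS' or 'role:POS'.
            fields = label.split(":")
            role = fields[0] if len(fields) > 1 else None
            tag, word = fields[-1], None
        else:
            # Leaf labels are assumed to be 'role:POS:word'.
            role, tag, word = label.split(":", 2)

        node = {
            "id": node_id,
            "data": {"role": role, "pos": tag, "word": word},
            "children": [],
        }
        if is_internal:
            pos += 1  # consume '('
            while True:
                node["children"].append(parse_node())
                closing = tokens[pos]
                pos += 1  # consume '|' or ')'
                if closing == ")":
                    break
        return node

    return parse_node()


parsed = parse_tree_text("S(Head:Nab:中文字|particle:Td:耶)")
print(parsed["data"], [child["id"] for child in parsed["children"]])
```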
- classmethod from_dict(data)
Construct an instance from Python built-in containers.
- Parameters
data (dict) – A parse tree in dictionary format.
- to_dict(node_id=None)
Transform to Python built-in containers.
- Parameters
node_id (int) – Output the dictionary format for the subtree under node_id.
- Returns
dict
- to_penn(node_id=None, *, with_role=True, with_word=True, sep=':')
Transform to Penn Treebank format.
- Parameters
node_id (int) – Output the Penn Treebank format for the subtree under node_id.
with_role (bool) – Whether to include the role tag.
with_word (bool) – Whether to include the word.
sep (str) – The separator between role and POS-tag.
- Returns
list
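The effect of the with_role, with_word, and sep options can be illustrated on the dict format. tree_to_penn below is a hypothetical helper, not the ckipnlp method. The default output matches the Penn Treebank example in this module, but the behaviour with the flags switched off (tag only, word dropped) is an assumption.

```python
def tree_to_penn(node, *, with_role=True, with_word=True, sep=":"):
    """Convert a dict-format subtree to a nested Penn-style list (sketch)."""
    data = node["data"]
    if with_role and data["role"] is not None:
        tag = data["role"] + sep + data["pos"]
    else:
        tag = data["pos"]

    penn = [tag]
    if with_word and data["word"] is not None:
        penn.append(data["word"])
    for child in node["children"]:
        penn.append(
            tree_to_penn(child, with_role=with_role, with_word=with_word, sep=sep)
        )
    return penn


tree = {
    "id": 0,
    "data": {"role": None, "pos": "S", "word": None},
    "children": [
        {
            "id": 1,
            "data": {"role": "Head", "pos": "Nab", "word": "中文字"},
            "children": [],
        },
        {
            "id": 2,
            "data": {"role": "particle", "pos": "Td", "word": "耶"},
            "children": [],
        },
    ],
}

print(tree_to_penn(tree))
print(tree_to_penn(tree, with_role=False, with_word=False))
```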
- get_children(node_id, *, role)
Get children of a node with given role.
- Parameters
node_id (int) – ID of target node.
role (str) – the target role.
- Yields
ParseNode – the child nodes with the given role.
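Filtering children by role amounts to a simple generator. The sketch below works on the dict format rather than on a ParseTree; find_subtree and iter_children are hypothetical helpers used only for illustration.

```python
def find_subtree(node, node_id):
    """Return the dict-format subtree whose 'id' equals node_id, or None."""
    if node["id"] == node_id:
        return node
    for child in node["children"]:
        found = find_subtree(child, node_id)
        if found is not None:
            return found
    return None


def iter_children(tree, node_id, *, role):
    """Yield the direct children of node_id whose role equals `role`."""
    parent = find_subtree(tree, node_id)
    if parent is None:
        raise KeyError(node_id)
    for child in parent["children"]:
        if child["data"]["role"] == role:
            yield child


tree = {
    "id": 0,
    "data": {"role": None, "pos": "S", "word": None},
    "children": [
        {
            "id": 1,
            "data": {"role": "Head", "pos": "Nab", "word": "中文字"},
            "children": [],
        },
        {
            "id": 2,
            "data": {"role": "particle", "pos": "Td", "word": "耶"},
            "children": [],
        },
    ],
}

for child in iter_children(tree, 0, role="particle"):
    print(child["id"], child["data"]["word"])
```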
- get_heads(root_id=None, *, semantic=True, deep=True)
Get all head nodes of a subtree.
- Parameters
root_id (int) – ID of the root node of target subtree.
semantic (bool) – use the semantic policy if true, otherwise the syntactic policy. In semantic mode, return the DUMMY or head node instead of the syntactic Head.
deep (bool) – find heads recursively.
- Yields
ParseNode – the head nodes.
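A rough sketch of how the semantic and deep options could interact is shown below. It is not the library's algorithm. iter_heads is a hypothetical helper on the dict format, the priority order (DUMMY, then head, then Head) is an assumption read from the parameter description, and the nested example tree is a made-up variation of the one in this module.

```python
def iter_heads(node, *, semantic=True, deep=True):
    """Yield head nodes of a dict-format subtree (illustrative policy only)."""
    children = node["children"]
    if not children:
        # Assumption: a leaf is its own head.
        yield node
        return

    # Assumption: semantic mode prefers DUMMY, then 'head', over 'Head'.
    role_priority = ["DUMMY", "head"] if semantic else []
    role_priority.append("Head")

    heads = []
    for role in role_priority:
        heads = [child for child in children if child["data"]["role"] == role]
        if heads:
            break

    for head in heads:
        if deep and head["children"]:
            # Descend until a leaf head is reached.
            yield from iter_heads(head, semantic=semantic, deep=True)
        else:
            yield head


# Made-up tree: the Head of S is itself a phrase with its own Head.
tree = {
    "id": 0,
    "data": {"role": None, "pos": "S", "word": None},
    "children": [
        {
            "id": 1,
            "data": {"role": "Head", "pos": "NP", "word": None},
            "children": [
                {
                    "id": 2,
                    "data": {"role": "Head", "pos": "Nab", "word": "中文字"},
                    "children": [],
                },
            ],
        },
        {
            "id": 3,
            "data": {"role": "particle", "pos": "Td", "word": "耶"},
            "children": [],
        },
    ],
}

print([n["id"] for n in iter_heads(tree)])
print([n["id"] for n in iter_heads(tree, deep=False)])
```

With deep=True the sketch descends into the phrase and reports the leaf head, while deep=False stops at the direct child.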
- get_relations(root_id=None, *, semantic=True)
Get all relations of a subtree.
- Parameters
root_id (int) – ID of the subtree root node.
semantic (bool) – please refer to get_heads() for policy details.
- Yields
ParseRelation – the relations.
- get_subjects(root_id=None, *, semantic=True, deep=True)
Get the subject node of a subtree.
- Parameters
root_id (int) – ID of the root node of target subtree.
semantic (bool) – please refer to get_heads() for policy details.
deep (bool) – please refer to get_heads() for policy details.
- Yields
ParseNode – the subject node.
Notes
A node can be a subject if it is any of the following:
- a head of an NP
- a head of a subnode (N) of S with the subject role
- a head of a subnode (N) of S with a neutral role that precedes the head (V) of S