ckipnlp.container.util.parsed_tree module

This module provides tree containers for sentence parsing.

class ckipnlp.container.util.parsed_tree.ParsedNodeData[source]

Bases: ckipnlp.container.base.BaseTuple, ckipnlp.container.util.parsed_tree._ParsedNodeData

A parser node.

Variables
  • role (str) – the semantic role.

  • pos (str) – the POS-tag.

  • word (str) – the text term.

Note

This class is a subclass of tuple. To change an attribute, create a new instance instead.
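
The note above can be illustrated with a minimal sketch. It uses a plain `collections.namedtuple` stand-in with the same three fields; the name `NodeData` is hypothetical and is not the library class, so treat this as an illustration of the tuple behaviour rather than of ckipnlp itself.

```python
from collections import namedtuple

# Hypothetical stand-in with the same three fields; not the ckipnlp class.
NodeData = namedtuple('NodeData', ['role', 'pos', 'word'])

node = NodeData(role='Head', pos='Na', word='中文字')

try:
    node.pos = 'Nab'  # tuple fields are read-only, so this raises AttributeError
except AttributeError:
    assignment_failed = True
else:
    assignment_failed = False

# Build a new instance with one field changed instead of mutating in place.
updated = node._replace(pos='Nab')
```

The original `node` is left untouched; only `updated` carries the new POS-tag.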

Data Structure Examples

Text format

Used for from_text() and to_text().

'Head:Na:中文字'  # role / POS-tag / text-term
Dict format

Used for from_dict() and to_dict().

{
    'role': 'Head',   # role
    'pos': 'Na',      # POS-tag
    'word': '中文字',  # text term
}
List format

Not implemented.

from_list = NotImplemented
to_list = NotImplemented
classmethod from_text(data)[source]

Construct an instance from text format.

Parameters

data (str) – text such as 'Head:Na:中文字'.

Note

  • 'Head:Na:中文字' -> role = 'Head', pos = 'Na', word = '中文字'

  • 'Head:Na' -> role = 'Head', pos = 'Na', word = None

  • 'Na' -> role = None, pos = 'Na', word = None
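
The three cases above can be sketched as a small splitting routine. This is an assumed re-implementation for illustration only, not the library's code; `NodeData` and `parse_node_text` are hypothetical names.

```python
from collections import namedtuple

# Hypothetical stand-in for the node data tuple; not the ckipnlp class.
NodeData = namedtuple('NodeData', ['role', 'pos', 'word'])


def parse_node_text(text):
    """Split 'role:pos:word' text following the three cases listed above."""
    # Split at most twice so that a word containing ':' stays intact
    # (an assumption of this sketch).
    parts = text.split(':', 2)
    if len(parts) == 3:
        role, pos, word = parts
    elif len(parts) == 2:
        role, pos = parts
        word = None
    else:
        role, pos, word = None, parts[0], None
    return NodeData(role, pos, word)


full = parse_node_text('Head:Na:中文字')
no_word = parse_node_text('Head:Na')
pos_only = parse_node_text('Na')
```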

to_text()[source]
class ckipnlp.container.util.parsed_tree.ParsedNode(tag=None, identifier=None, expanded=True, data=None)[source]

Bases: ckipnlp.container.base.Base, treelib.node.Node

A parser node for tree.

Variables

data (ParsedNodeData) –

See also

treelib.node.Node

Please refer to https://treelib.readthedocs.io/ for built-in usage.

Data Structure Examples

Text format

Not implemented.

Dict format

Used for to_dict().

{
    'role': 'Head',   # role
    'pos': 'Na',      # POS-tag
    'word': '中文字',  # text term
}
List format

Not implemented.

data_class

alias of ParsedNodeData

from_dict = NotImplemented
from_text = NotImplemented
to_text = NotImplemented
from_list = NotImplemented
to_list = NotImplemented
to_dict()[source]
class ckipnlp.container.util.parsed_tree.ParsedRelation[source]

Bases: ckipnlp.container.base.Base, ckipnlp.container.util.parsed_tree._ParsedRelation

A parser relation.

Variables
  • head (ParsedNode) – the head node.

  • tail (ParsedNode) – the tail node.

  • relation (ParsedNode) – the relation node (the semantic role of this node is the relation).

Notes

The parent of the relation node is always the common ancestor of the head node and tail node.

Data Structure Examples

Text format

Not implemented.

Dict format

Used for to_dict().

{
    'head': { 'role': 'Head', 'pos': 'Nab', 'word': '中文字' }, # head node
    'tail': { 'role': 'particle', 'pos': 'Td', 'word': '耶' }, # tail node
    'relation': 'particle',  # relation
}
List format

Not implemented.

from_dict = NotImplemented
from_text = NotImplemented
to_text = NotImplemented
from_list = NotImplemented
to_list = NotImplemented
property head_first
to_dict()[source]
class ckipnlp.container.util.parsed_tree.ParsedTree(tree=None, deep=False, node_class=None, identifier=None)[source]

Bases: ckipnlp.container.base.Base, treelib.tree.Tree

A parsed tree.

See also

treelib.tree.Tree

Please refer to https://treelib.readthedocs.io/ for built-in usage.

Data Structure Examples

Text format

Used for from_text() and to_text().

'S(Head:Nab:中文字|particle:Td:耶)'
Dict format

Used for from_dict() and to_dict(). A dictionary such as { 'id': 0, 'data': { ... }, 'children': [ ... ] }, where 'data' is a dictionary with the same format as ParsedNodeData.to_dict(), and 'children' is a list of dictionaries of subtrees with the same format as this tree.

{
    'id': 0,
    'data': {
        'role': None,
        'pos': 'S',
        'word': None,
    },
    'children': [
        {
            'id': 1,
            'data': {
                'role': 'Head',
                'pos': 'Nab',
                'word': '中文字',
            },
            'children': [],
        },
        {
            'id': 2,
            'data': {
                'role': 'particle',
                'pos': 'Td',
                'word': '耶',
            },
            'children': [],
        },
    ],
}
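
To show how the dictionary format relates to the text format, here is a minimal sketch that walks the nested dictionary and rebuilds the text form. It operates on plain dictionaries only; `node_to_text` and `tree_to_text` are hypothetical helpers and are not the library's `to_text()` implementation.

```python
def node_to_text(data):
    """Join the non-empty fields of a node dictionary with ':'."""
    fields = (data.get('role'), data.get('pos'), data.get('word'))
    return ':'.join(field for field in fields if field is not None)


def tree_to_text(tree):
    """Recursively render a tree dictionary as 'node(child|child|...)'."""
    text = node_to_text(tree['data'])
    children = tree.get('children', [])
    if children:
        text += '(' + '|'.join(tree_to_text(child) for child in children) + ')'
    return text


tree = {
    'id': 0,
    'data': {'role': None, 'pos': 'S', 'word': None},
    'children': [
        {'id': 1,
         'data': {'role': 'Head', 'pos': 'Nab', 'word': '中文字'},
         'children': []},
        {'id': 2,
         'data': {'role': 'particle', 'pos': 'Td', 'word': '耶'},
         'children': []},
    ],
}

result = tree_to_text(tree)  # 'S(Head:Nab:中文字|particle:Td:耶)'
```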
List format

Not implemented.

node_class

alias of ParsedNode

from_list = NotImplemented
to_list = NotImplemented
static normalize_text(tree_text)[source]

Text normalization.

Remove leading number and trailing #.
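
A rough sketch of this kind of normalization follows. The exact shape of the leading number is not specified here, so the sketch assumes a hypothetical prefix of digits followed by whitespace; `normalize_tree_text` is an illustrative name and not the library's implementation.

```python
import re


def normalize_tree_text(tree_text):
    """Drop an assumed leading number and any trailing '#' characters."""
    text = tree_text.strip()
    # Assumption: the leading number is a run of digits followed by whitespace.
    text = re.sub(r'^\d+\s+', '', text)
    return text.rstrip('#')


cleaned = normalize_tree_text('1 S(Head:Nab:中文字|particle:Td:耶)#')
unchanged = normalize_tree_text('S(Head:Nab:中文字|particle:Td:耶)')
```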

classmethod from_text(data, *, normalize=True)[source]

Construct an instance from text format.

Parameters
  • data (str) – A parsed tree in text format.

  • normalize (bool) – Do text normalization using normalize_text().

to_text(node_id=None)[source]

Transform to plain text.

Parameters

node_id (int) – Output the plain text format for the subtree under node_id.

Returns

str

classmethod from_dict(data)[source]

Construct an instance from Python built-in containers.

Parameters

data (dict) – A parsed tree in dictionary format.

to_dict(node_id=None)[source]

Transform to Python built-in containers.

Parameters

node_id (int) – Output the dictionary format for the subtree under node_id.

Returns

dict

show(*, key=<function ParsedTree.<lambda>>, idhidden=False, **kwargs)[source]

Show pretty tree.

get_children(node_id, *, role)[source]

Get children of a node with given role.

Parameters
  • node_id (int) – ID of target node.

  • role (str) – the target role.

Yields

ParsedNode – the children nodes with given role.
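
The filtering idea can be sketched on the dictionary format alone. The helper below is a hypothetical illustration that works on plain dictionaries; it is not the library method, which operates on the tree object and yields ParsedNode instances.

```python
def iter_children_with_role(tree, node_id, role):
    """Yield child dictionaries of node `node_id` whose role equals `role`."""
    stack = [tree]
    while stack:
        node = stack.pop()
        if node['id'] == node_id:
            for child in node['children']:
                if child['data']['role'] == role:
                    yield child
            return
        # Keep searching the subtrees for the target node.
        stack.extend(node['children'])


tree = {
    'id': 0,
    'data': {'role': None, 'pos': 'S', 'word': None},
    'children': [
        {'id': 1,
         'data': {'role': 'Head', 'pos': 'Nab', 'word': '中文字'},
         'children': []},
        {'id': 2,
         'data': {'role': 'particle', 'pos': 'Td', 'word': '耶'},
         'children': []},
    ],
}

heads = list(iter_children_with_role(tree, 0, 'Head'))
particles = list(iter_children_with_role(tree, 0, 'particle'))
```

A generator mirrors the "Yields" contract above: callers can stop early without scanning the remaining children.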

get_heads(root_id=None, *, semantic=True, deep=True)[source]

Get all head nodes of a subtree.

Parameters
  • root_id (int) – ID of the root node of target subtree.

  • semantic (bool) – use the semantic policy if True, otherwise the syntactic policy. In semantic mode, return the DUMMY or head node instead of the syntactic Head.

  • deep (bool) – find heads recursively.

Yields

ParsedNode – the head nodes.

get_relations(root_id=None, *, semantic=True)[source]

Get all relations of a subtree.

Parameters
  • root_id (int) – ID of the subtree root node.

  • semantic (bool) – see get_heads() for policy details.

Yields

ParsedRelation – the relations.

get_subjects(root_id=None, *, semantic=True, deep=True)[source]

Get the subject node of a subtree.

Parameters
  • root_id (int) – ID of the root node of target subtree.

  • semantic (bool) – see get_heads() for policy details.

  • deep (bool) – see get_heads() for details.

Yields

ParsedNode – the subject node.

Notes

A node can be a subject if it is any of the following:

  1. a head of an NP

  2. a head of a subnode of S with the subject role

  3. a head of a subnode of S with the neutral role that precedes the head of S