ckipnlp.container.ner module

This module provides containers for NER sentences.

class ckipnlp.container.ner.NerToken(word, ner, idx, **kwargs)[source]

Bases: BaseTuple, _NerToken

A named-entity recognition token.

Variables
  • word (str) – the token word.

  • ner (str) – the NER-tag.

  • idx (Tuple[int, int]) – the starting and ending indices.

Note

This class is a subclass of tuple. To change an attribute, create a new instance instead.
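
The note above can be illustrated with a small standalone sketch. It does not import ckipnlp: `Token` is a hypothetical stand-in built on `typing.NamedTuple`, on the assumption that NerToken behaves like an ordinary named tuple when a field is assigned.

```python
from typing import NamedTuple, Tuple


class Token(NamedTuple):
    """Hypothetical stand-in for NerToken (not the library class)."""
    word: str
    ner: str
    idx: Tuple[int, int]


token = Token(word='中文字', ner='LANGUAGE', idx=(0, 3))

# Fields of a tuple subclass are read-only, so assignment fails.
try:
    token.ner = 'ORG'
    mutated = True
except AttributeError:
    mutated = False

# Build a new instance instead of modifying the existing one.
retagged = Token(word=token.word, ner='ORG', idx=token.idx)
```

The original `token` is left unchanged; only `retagged` carries the new tag.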

Data Structure Examples

Text format

Not implemented

List format

Used for from_list() and to_list().

[
    '中文字',    # token word
    'LANGUAGE', # NER-tag
    (0, 3),     # starting / ending index.
]
Dict format

Used for from_dict() and to_dict().

{
    'word': '中文字',   # token word
    'ner': 'LANGUAGE', # NER-tag
    'idx': (0, 3),     # starting / ending index.
}
CkipTagger format

Used for from_tagger() and to_tagger().

(
    0,          # starting index
    3,          # ending index
    'LANGUAGE', # NER-tag
    '中文字',    # token word
)
classmethod from_tagger(data)[source]

Construct an instance from CkipTagger format.

to_tagger()[source]

Transform to CkipTagger format.
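
To make the field order of the three formats concrete, here is a minimal standalone sketch of the conversions for a single token. It is not the library's implementation: `Token`, `token_from_tagger`, `token_to_tagger`, and `token_to_dict` are hypothetical names used only for illustration, and the sketch relies solely on the layouts shown in the examples above.

```python
from typing import NamedTuple, Tuple


class Token(NamedTuple):
    """Hypothetical stand-in for NerToken (not the library class)."""
    word: str
    ner: str
    idx: Tuple[int, int]


def token_from_tagger(data):
    # CkipTagger format: (starting index, ending index, NER-tag, token word)
    start, end, ner, word = data
    return Token(word=word, ner=ner, idx=(start, end))


def token_to_tagger(token):
    # Reorder the fields back into the CkipTagger layout.
    return (*token.idx, token.ner, token.word)


def token_to_dict(token):
    # Dict format: keys 'word', 'ner', 'idx'.
    return dict(token._asdict())


token = token_from_tagger((0, 3, 'LANGUAGE', '中文字'))
as_list = list(token)             # list format: [word, ner, idx]
as_dict = token_to_dict(token)    # dict format
as_tagger = token_to_tagger(token)
```

Note that the CkipTagger layout puts the indices first and the word last, the reverse of the list and dict layouts.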

class ckipnlp.container.ner.NerSentence(initlist=None)[source]

Bases: BaseSentence

A named-entity recognition sentence.

Data Structure Examples

Text format

Not implemented

List format

Used for from_list() and to_list().

[
    [ '美國', 'GPE', (0, 2), ],   # named entity 1
    [ '參議院', 'ORG', (3, 5), ], # named entity 2
]
Dict format

Used for from_dict() and to_dict().

[
    { 'word': '美國', 'ner': 'GPE', 'idx': (0, 2), },   # named entity 1
    { 'word': '參議院', 'ner': 'ORG', 'idx': (3, 5), }, # named entity 2
]
CkipTagger format

Used for from_tagger() and to_tagger().

[
    ( 0, 2, 'GPE', '美國', ),   # named entity 1
    ( 3, 5, 'ORG', '參議院', ), # named entity 2
]
item_class

alias of NerToken

classmethod from_tagger(data)[source]

Construct an instance from CkipTagger format.

to_tagger()[source]

Transform to CkipTagger format.
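
At the sentence level, the same reordering applies to every entity in the list. The following sketch works on plain Python data rather than on NerSentence itself; the helper names `sentence_list_to_tagger` and `sentence_tagger_to_list` are hypothetical, and the sample data is copied from the examples above.

```python
def sentence_list_to_tagger(sentence):
    """Convert list-format entities [word, ner, (start, end)] to CkipTagger tuples."""
    return [(start, end, ner, word) for word, ner, (start, end) in sentence]


def sentence_tagger_to_list(sentence):
    """Convert CkipTagger tuples (start, end, ner, word) back to list format."""
    return [[word, ner, (start, end)] for start, end, ner, word in sentence]


sentence = [
    ['美國', 'GPE', (0, 2)],
    ['參議院', 'ORG', (3, 5)],
]

tagger_sentence = sentence_list_to_tagger(sentence)
round_trip = sentence_tagger_to_list(tagger_sentence)
```

Converting to the CkipTagger layout and back should yield the original list-format data.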

class ckipnlp.container.ner.NerParagraph(initlist=None)[source]

Bases: BaseList

A list of named-entity recognition sentences.

Data Structure Examples

Text format

Not implemented

List format

Used for from_list() and to_list().

[
    [ # Sentence 1
        [ '中文字', 'LANGUAGE', (0, 3), ],
    ],
    [ # Sentence 2
        [ '美國', 'GPE', (0, 2), ],
        [ '參議院', 'ORG', (3, 5), ],
    ],
]
Dict format

Used for from_dict() and to_dict().

[
    [ # Sentence 1
        { 'word': '中文字', 'ner': 'LANGUAGE', 'idx': (0, 3), },
    ],
    [ # Sentence 2
        { 'word': '美國', 'ner': 'GPE', 'idx': (0, 2), },
        { 'word': '參議院', 'ner': 'ORG', 'idx': (3, 5), },
    ],
]
CkipTagger format

Used for from_tagger() and to_tagger().

[
    [ # Sentence 1
        ( 0, 3, 'LANGUAGE', '中文字', ),
    ],
    [ # Sentence 2
        ( 0, 2, 'GPE', '美國', ),
        ( 3, 5, 'ORG', '參議院', ),
    ],
]
item_class

alias of NerSentence

classmethod from_tagger(data)[source]

Construct an instance from CkipTagger format.

to_tagger()[source]

Transform to CkipTagger format.
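
A paragraph adds one more level of nesting: a list of sentences, each a list of entities. As a rough sketch under the same assumptions as before (plain Python data, no ckipnlp import, hypothetical helper name `paragraph_dict_to_tagger`), the dict format can be mapped to the CkipTagger format like this:

```python
def paragraph_dict_to_tagger(paragraph):
    """Map dict-format entities to CkipTagger tuples, sentence by sentence."""
    return [
        [(*entity['idx'], entity['ner'], entity['word']) for entity in sentence]
        for sentence in paragraph
    ]


paragraph = [
    [  # Sentence 1
        {'word': '中文字', 'ner': 'LANGUAGE', 'idx': (0, 3)},
    ],
    [  # Sentence 2
        {'word': '美國', 'ner': 'GPE', 'idx': (0, 2)},
        {'word': '參議院', 'ner': 'ORG', 'idx': (3, 5)},
    ],
]

tagger_paragraph = paragraph_dict_to_tagger(paragraph)
```

The sentence boundaries are preserved; only the per-entity layout changes.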