ckipnlp.container.util.wspos module

This module provides containers for word-segmented sentences with part-of-speech tags.

class ckipnlp.container.util.wspos.WsPosToken(word: Optional[str] = None, pos: Optional[str] = None)[source]

Bases: ckipnlp.container.base.BaseTuple, ckipnlp.container.util.wspos._WsPosToken

A word with POS-tag.

Variables
  • word (str) – the word.

  • pos (str) – the POS-tag.

Note

This class is a subclass of tuple. To change an attribute, create a new instance instead.
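The note above can be illustrated with a minimal sketch. It uses a hypothetical `Token` named tuple as a stand-in for `WsPosToken` (it is not the library class), assuming only that both behave like ordinary immutable tuples with `word` and `pos` fields.

```python
from typing import NamedTuple, Optional


class Token(NamedTuple):
    """Hypothetical stand-in for WsPosToken; not the library class."""
    word: Optional[str] = None
    pos: Optional[str] = None


token = Token(word='中文字', pos='Na')

# Tuples are immutable, so assigning to a field raises AttributeError.
try:
    token.pos = 'Nb'
    mutated = True
except AttributeError:
    mutated = False

# Build a new instance carrying the changed field instead.
retagged = Token(word=token.word, pos='Nb')
```

The original `token` is left untouched; `retagged` is a separate instance.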

Data Structure Examples

Text format

Used for from_text() and to_text().

'中文字(Na)'  # word / POS-tag

List format

Used for from_list() and to_list().

[
    '中文字', # word
    'Na',    # POS-tag
]

Dict format

Used for from_dict() and to_dict().

{
    'word': '中文字', # word
    'pos': 'Na',     # POS-tag
}

classmethod from_text(data)[source]

Construct an instance from text format.

Parameters

data (str) – text such as '中文字(Na)'.

Note

  • '中文字(Na)' -> word = '中文字', pos = 'Na'

  • '中文字' -> word = '中文字', pos = None
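The two cases in the note can be sketched in plain Python. This is an illustration of the mapping only, not the library's implementation: `Token` and `parse_token` are hypothetical names, and the rule that a trailing parenthesized group is the POS-tag is an assumption drawn from the examples above.

```python
import re
from typing import NamedTuple, Optional


class Token(NamedTuple):
    """Hypothetical stand-in for WsPosToken; not the library class."""
    word: Optional[str] = None
    pos: Optional[str] = None


# Assumption: an optional trailing "(TAG)" is the POS-tag,
# and everything before it is the word.
_TOKEN_PATTERN = re.compile(r'^(?P<word>.*?)(?:\((?P<pos>[^()]*)\))?$', re.DOTALL)


def parse_token(text: str) -> Token:
    """Split text such as '中文字(Na)' into a (word, pos) pair."""
    match = _TOKEN_PATTERN.match(text)
    # The 'pos' group is None when the text carries no parenthesized tag.
    return Token(word=match.group('word'), pos=match.group('pos'))


tagged = parse_token('中文字(Na)')   # word='中文字', pos='Na'
untagged = parse_token('中文字')     # word='中文字', pos=None
```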

class ckipnlp.container.util.wspos.WsPosSentence[source]

Bases: object

A helper class for data conversion of word-segmented and part-of-speech sentences.

classmethod from_text(data)[source]

Convert text format to word-segmented and part-of-speech sentences.

Parameters

data (str) – text such as '中文字(Na)\u3000耶(T)'.

Returns

static to_text(word, pos)[source]

Convert word-segmented and part-of-speech sentences to text format.

Parameters
  • word – the word-segmented sentence.

  • pos – the part-of-speech sentence.

Returns

str – text such as '中文字(Na)\u3000耶(T)'.
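As a rough sketch of the sentence-level text format, the functions below pair each word with its tag and join the tokens with `'\u3000'` (the full-width space used in the examples above). `sentence_to_text` and `sentence_from_text` are hypothetical helpers operating on plain lists; they are not the `WsPosSentence` methods and may differ from them in edge cases.

```python
from typing import List, Optional, Sequence, Tuple

SEPARATOR = '\u3000'  # ideographic (full-width) space between tokens


def sentence_to_text(words: Sequence[str], tags: Sequence[str]) -> str:
    """Join parallel word and tag lists into 'word(tag)' tokens."""
    if len(words) != len(tags):
        raise ValueError('words and tags must have the same length')
    return SEPARATOR.join(f'{word}({tag})' for word, tag in zip(words, tags))


def sentence_from_text(text: str) -> Tuple[List[str], List[Optional[str]]]:
    """Split text into parallel word and tag lists."""
    words: List[str] = []
    tags: List[Optional[str]] = []
    for item in text.split(SEPARATOR):
        if item.endswith(')') and '(' in item:
            # Drop the closing parenthesis, then split at the last opening one.
            word, _, tag = item[:-1].rpartition('(')
        else:
            word, tag = item, None  # no tag attached to this token
        words.append(word)
        tags.append(tag)
    return words, tags


text = sentence_to_text(['中文字', '耶'], ['Na', 'T'])
words, tags = sentence_from_text(text)
```

Keeping words and tags as two parallel lists mirrors the `to_text(word, pos)` signature, which takes the two sentences as separate arguments.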

class ckipnlp.container.util.wspos.WsPosParagraph[source]

Bases: object

A helper class for data conversion of word-segmented and part-of-speech sentence lists.

classmethod from_text(data)[source]

Convert text format to word-segmented and part-of-speech sentence lists.

Parameters

data (Sequence[str]) – list of sentences such as '中文字(Na)\u3000耶(T)'.

Returns

static to_text(word, pos)[source]

Convert word-segmented and part-of-speech sentence lists to text format.

Parameters
  • word – the word-segmented sentence list.

  • pos – the part-of-speech sentence list.

Returns

List[str] – list of sentences such as '中文字(Na)\u3000耶(T)'.
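At the paragraph level the same conversion applies once per sentence, producing one string per sentence. The sketch below uses hypothetical helpers on plain nested lists rather than the `WsPosParagraph` methods, and assumes every token in the input text carries a tag.

```python
from typing import List, Sequence, Tuple

SEPARATOR = '\u3000'  # ideographic (full-width) space between tokens


def paragraph_to_text(
    word_sentences: Sequence[Sequence[str]],
    pos_sentences: Sequence[Sequence[str]],
) -> List[str]:
    """Render each (words, tags) sentence pair as one text line."""
    return [
        SEPARATOR.join(f'{word}({tag})' for word, tag in zip(words, tags))
        for words, tags in zip(word_sentences, pos_sentences)
    ]


def paragraph_from_text(lines: Sequence[str]) -> Tuple[List[List[str]], List[List[str]]]:
    """Split text lines into parallel word and tag sentence lists.

    Assumption: every token has the form 'word(tag)'.
    """
    word_sentences: List[List[str]] = []
    pos_sentences: List[List[str]] = []
    for line in lines:
        # rpartition yields (word, '(', tag); [::2] keeps word and tag.
        pairs = [item[:-1].rpartition('(')[::2] for item in line.split(SEPARATOR)]
        word_sentences.append([word for word, _ in pairs])
        pos_sentences.append([tag for _, tag in pairs])
    return word_sentences, pos_sentences


lines = paragraph_to_text([['中文字', '耶'], ['啊']], [['Na', 'T'], ['I']])
word_sentences, pos_sentences = paragraph_from_text(lines)
```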