ckipnlp.pipeline.core module

This module provides the core CKIPNLP pipeline.

class ckipnlp.pipeline.core.CkipDocument(*, raw=None, text=None, ws=None, pos=None, ner=None, parsed=None)

Bases: collections.abc.Mapping

The core document.
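Because the class derives from collections.abc.Mapping, a document can presumably be read like a read-only dictionary keyed by field name. The sketch below is a hypothetical stand-in (SketchDocument is not part of ckipnlp) showing what that base class implies. It assumes every constructor field is exposed as a key, which the text above does not confirm.

```python
from collections.abc import Mapping


class SketchDocument(Mapping):
    """Hypothetical stand-in for a Mapping-based document (not ckipnlp code)."""

    # Assumption: the keys mirror the keyword-only constructor arguments.
    _FIELDS = ("raw", "text", "ws", "pos", "ner", "parsed")

    def __init__(self, *, raw=None, text=None, ws=None, pos=None, ner=None, parsed=None):
        self.raw, self.text, self.ws = raw, text, ws
        self.pos, self.ner, self.parsed = pos, ner, parsed

    def __getitem__(self, key):
        if key not in self._FIELDS:
            raise KeyError(key)
        return getattr(self, key)

    def __iter__(self):
        return iter(self._FIELDS)

    def __len__(self):
        return len(self._FIELDS)


doc = SketchDocument(raw="中文字耶")
print(doc["raw"])      # item access, supplied by __getitem__
print("ws" in doc)     # membership test, supplied by the Mapping mixin
print(dict(doc))       # unfilled fields stay None
```

Implementing only __getitem__, __iter__, and __len__ is enough: Mapping supplies keys(), items(), get(), and membership tests on top of them.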

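Variables
  • raw (str) – The unsegmented input text.

  • text (TextParagraph) – The sentences.

  • ws (SegParagraph) – The word-segmented sentences.

  • pos (SegParagraph) – The part-of-speech-tagged sentences.

  • ner (NerParagraph) – The named-entity recognition results.

  • parsed (ParsedParagraph) – The parsed sentences.
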
class ckipnlp.pipeline.core.CkipPipeline(*, sentence_segmenter_kind=<DriverKind.BUILTIN: 1>, word_segmenter_kind=<DriverKind.TAGGER: 2>, pos_tagger_kind=<DriverKind.TAGGER: 2>, sentence_parser_kind=<DriverKind.CLASSIC: 3>, ner_chunker_kind=<DriverKind.TAGGER: 2>, lazy=True)

Bases: object

The core pipeline.

Parameters
  • sentence_segmenter_kind (DriverKind) – The type of sentence segmenter.

  • word_segmenter_kind (DriverKind) – The type of word segmenter.

  • pos_tagger_kind (DriverKind) – The type of part-of-speech tagger.

  • ner_chunker_kind (DriverKind) – The type of named-entity recognition chunker.

  • sentence_parser_kind (DriverKind) – The type of sentence parser.

Other Parameters

lazy (bool) – Whether to initialize the drivers lazily, that is, on first use rather than when the pipeline is constructed.
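To illustrate what lazy initialization generally means, here is a minimal sketch. LazyDriver and make_segmenter are hypothetical names invented for this example, and the code is not taken from ckipnlp. It only shows the general pattern of deferring an expensive setup step until the first call.

```python
class LazyDriver:
    """Hypothetical wrapper that builds the real driver only when needed."""

    def __init__(self, factory, lazy=True):
        self._factory = factory
        self._driver = None
        if not lazy:
            self._driver = factory()  # eager mode: pay the setup cost now

    def __call__(self, *args, **kwargs):
        if self._driver is None:
            self._driver = self._factory()  # first use triggers the setup
        return self._driver(*args, **kwargs)


loads = []


def make_segmenter():
    loads.append("loaded")           # stands in for loading a large model
    return lambda text: list(text)   # toy "segmenter": one character per token


segmenter = LazyDriver(make_segmenter, lazy=True)
loads_before_first_call = list(loads)  # still empty: nothing loaded yet
tokens = segmenter("你好")
loads_after_first_call = list(loads)   # one load, triggered by the call above
```

With lazy=True, constructing the wrapper is cheap and unused stages never load anything. With lazy=False, the setup cost is paid up front, which may be preferable when a predictable first-call latency matters.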

get_text(doc)

Apply sentence segmentation.

Parameters

doc (CkipDocument) – The input document.

Returns

doc.text (TextParagraph) – The sentences.

Note

This routine modifies doc in place.
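The combination of "returns doc.text" and "modifies doc in place" can be sketched as follows. This get_text is a hypothetical stand-in with a toy punctuation-based splitter, not the library's segmenter. Skipping the work when doc.text is already set is likewise an assumption.

```python
import re
from types import SimpleNamespace


def get_text(doc):
    """Fill doc.text if it is missing, then return it (toy stand-in)."""
    if doc.text is None:
        # Toy splitter: a run of non-punctuation plus one closing mark, if any.
        doc.text = re.findall(r"[^。！？!?]+[。！？!?]?", doc.raw)
    return doc.text


doc = SimpleNamespace(raw="今天天氣很好。我們去公園！", text=None)
sentences = get_text(doc)

# The returned object is the very attribute now stored on the document.
same_object = sentences is doc.text
```

Since the result is stored on the document, callers can ignore the return value and read doc.text later. Passing a copy of the document is the way to keep the original untouched.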

get_ws(doc)

Apply word segmentation.

Parameters

doc (CkipDocument) – The input document.

Returns

doc.ws (SegParagraph) – The word-segmented sentences.

Note

This routine modifies doc in place.

get_pos(doc)

Apply part-of-speech tagging.

Parameters

doc (CkipDocument) – The input document.

Returns

doc.pos (SegParagraph) – The part-of-speech-tagged sentences.

Note

This routine modifies doc in place.

get_ner(doc)

Apply named-entity recognition.

Parameters

doc (CkipDocument) – The input document.

Returns

doc.ner (NerParagraph) – The named-entity recognition results.

Note

This routine modifies doc in place.

get_parsed(doc)

Apply sentence parsing.

Parameters

doc (CkipDocument) – The input document.

Returns

doc.parsed (ParsedParagraph) – The parsed sentences.

Note

This routine modifies doc in place.
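Since all five methods take the same document and fill in one field each, a caller might run them in sequence. The helper below is a sketch that accepts any object exposing the get_* methods documented above. annotate_all and RecordingPipeline are hypothetical names, and the stage order shown is an assumption rather than a documented requirement.

```python
# Sketch only: nothing here is taken from ckipnlp's source.
STAGES = ("get_text", "get_ws", "get_pos", "get_ner", "get_parsed")


def annotate_all(pipeline, doc, stages=STAGES):
    """Call each requested stage in order and return the mutated doc."""
    for name in stages:
        getattr(pipeline, name)(doc)
    return doc


class RecordingPipeline:
    """Hypothetical stand-in that records calls instead of running models."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. the stages.
        if name not in STAGES:
            raise AttributeError(name)

        def stage(doc):
            field = name[len("get_"):]
            self.calls.append(name)
            doc[field] = f"<{field} result>"  # placeholder, not real output
            return doc[field]

        return stage


demo = RecordingPipeline()
result = annotate_all(demo, {"raw": "example"})
print(demo.calls)

# With ckipnlp installed, the same helper would presumably be called as:
#   from ckipnlp.pipeline.core import CkipDocument, CkipPipeline
#   annotate_all(CkipPipeline(), CkipDocument(raw="..."))
```

Passing a shorter stages tuple, for example ("get_text", "get_ws"), limits the work to the fields that are actually needed.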