ckipnlp.pipeline.kernel module
This module provides the kernel CKIPNLP pipeline.
- class ckipnlp.pipeline.kernel.CkipDocument(*, raw=None, text=None, ws=None, pos=None, ner=None, conparse=None)
Bases: Mapping
The kernel document.
- Variables
raw (str) – The unsegmented text input.
text (TextParagraph) – The sentences.
ws (SegParagraph) – The word-segmented sentences.
pos (SegParagraph) – The part-of-speech sentences.
ner (NerParagraph) – The named-entity recognition results.
conparse (ParseParagraph) – The constituency-parsing sentences.
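Because the document derives from Mapping, its fields can presumably be read by key as well as by attribute. The sketch below illustrates that idea with a hypothetical stand-in class, ToyDocument; it is not the library's implementation, and the choice to expose all six field names as keys (including those still set to None) is an assumption.

```python
from collections.abc import Mapping


class ToyDocument(Mapping):
    """Hypothetical stand-in for CkipDocument; not the library's code."""

    # Assumed key set: the six keyword-only fields from the signature.
    _FIELDS = ("raw", "text", "ws", "pos", "ner", "conparse")

    def __init__(self, *, raw=None, text=None, ws=None, pos=None,
                 ner=None, conparse=None):
        self.raw = raw
        self.text = text
        self.ws = ws
        self.pos = pos
        self.ner = ner
        self.conparse = conparse

    def __getitem__(self, key):
        # Only the declared fields are valid keys.
        if key not in self._FIELDS:
            raise KeyError(key)
        return getattr(self, key)

    def __iter__(self):
        return iter(self._FIELDS)

    def __len__(self):
        return len(self._FIELDS)


doc = ToyDocument(raw="中文字耶，啊哈哈哈")
fields = dict(doc)  # Mapping supplies keys(), items(), and dict() conversion
```

With this layout, doc["raw"] and doc.raw refer to the same value, and fields that no stage has filled yet stay None.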
- class ckipnlp.pipeline.kernel.CkipPipeline(*, sentence_segmenter='default', word_segmenter='tagger', pos_tagger='tagger', con_parser='classic-client', ner_chunker='tagger', lazy=True, opts={})
Bases: object
The kernel pipeline.
- Parameters
sentence_segmenter (str) – The type of sentence segmenter.
word_segmenter (str) – The type of word segmenter.
pos_tagger (str) – The type of part-of-speech tagger.
ner_chunker (str) – The type of named-entity recognition chunker.
con_parser (str) – The type of constituency parser.
lazy (bool) – Whether to initialize the drivers lazily.
opts (Dict[str, Dict]) – The driver options. Key: driver name (e.g. ‘sentence_segmenter’); Value: a dictionary of options.
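One way to read the lazy and opts parameters is that each driver is built on first use and receives the option dictionary stored under its own name. The following is a minimal sketch of that pattern under those assumptions; ToyLazyPipeline, its init_log attribute, and the "sep" option are hypothetical names invented for illustration and do not exist in ckipnlp.

```python
class ToyLazyPipeline:
    """Hypothetical stand-in illustrating lazy driver setup; not CkipPipeline."""

    def __init__(self, *, word_segmenter="tagger", lazy=True, opts=None):
        self._kinds = {"word_segmenter": word_segmenter}
        self._opts = dict(opts or {})
        self._drivers = {}
        self.init_log = []  # records each driver construction
        if not lazy:
            # Eager mode: build every driver up front.
            for name in self._kinds:
                self._driver(name)

    def _driver(self, name):
        # Build the driver on first request, then reuse it.
        if name not in self._drivers:
            options = self._opts.get(name, {})
            self.init_log.append((name, self._kinds[name], options))
            # Toy driver: split on the configured separator (whitespace if unset).
            self._drivers[name] = lambda text, _o=options: text.split(_o.get("sep"))
        return self._drivers[name]

    def get_ws(self, text):
        return self._driver("word_segmenter")(text)


lazy_pipeline = ToyLazyPipeline(opts={"word_segmenter": {"sep": "|"}})
before = len(lazy_pipeline.init_log)   # nothing has been built yet
words = lazy_pipeline.get_ws("a|b|c")  # first call triggers construction
eager = ToyLazyPipeline(lazy=False)    # builds its driver in __init__
```

In this sketch the lazy pipeline pays the construction cost only for the stages that are actually called, which matters when a driver loads a large model or opens a network connection.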
- get_text(doc)
Apply sentence segmentation.
- Parameters
doc (CkipDocument) – The input document.
- Returns
doc.text (TextParagraph) – The sentences.
Note
This routine modifies doc in place.
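The note means the returned value and the document's field are the same object: the stage stores its result on doc and hands that result back. A minimal sketch of the contract follows, using a hypothetical toy_get_text function and a SimpleNamespace in place of a real document. The split on the full-width comma and the skip-if-already-set check are assumptions made for illustration, not the library's behavior.

```python
from types import SimpleNamespace


def toy_get_text(doc):
    """Hypothetical stand-in for a pipeline stage; not the library's code."""
    if doc.text is None:  # assumed: skip work when the field is already filled
        # Naive split on the full-width comma, purely for illustration.
        doc.text = doc.raw.split("，")
    return doc.text


doc = SimpleNamespace(raw="中文字耶，啊哈哈哈", text=None)
sentences = toy_get_text(doc)
```

After the call, sentences and doc.text refer to the same list, so callers can either use the return value or read the field from the document later.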
- get_ws(doc)
Apply word segmentation.
- Parameters
doc (CkipDocument) – The input document.
- Returns
doc.ws (SegParagraph) – The word-segmented sentences.
Note
This routine modifies doc in place.
- get_pos(doc)
Apply part-of-speech tagging.
- Parameters
doc (CkipDocument) – The input document.
- Returns
doc.pos (SegParagraph) – The part-of-speech sentences.
Note
This routine modifies doc in place.
- get_ner(doc)
Apply named-entity recognition.
- Parameters
doc (CkipDocument) – The input document.
- Returns
doc.ner (NerParagraph) – The named-entity recognition results.
Note
This routine modifies doc in place.
- get_conparse(doc)
Apply constituency parsing.
- Parameters
doc (CkipDocument) – The input document.
- Returns
doc.conparse (ParseParagraph) – The constituency-parsing sentences.
Note
This routine modifies doc in place.