ckipnlp.ws package

class ckipnlp.ws.CkipWs(*, logger=False, ini_file=None, lex_list=None, **kwargs)

Bases: object

The CKIP word segmentation driver.

Parameters
Other Parameters

**kwargs – the configs for CKIPWS, passed to ckipnlp.util.ini.create_ws_ini(); ignored if ini_file is set.

Danger

Never instantiate more than one object of this class!
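One way to respect the warning above is to route every construction through a single cached accessor. The following is only a minimal sketch: make_singleton and get_ws are hypothetical names, not part of ckipnlp, and the commented usage assumes ckipnlp and the CKIPWS library are installed.

```python
import functools


def make_singleton(factory):
    """Wrap a zero-argument factory so that it runs at most once.

    Hypothetical helper, not part of ckipnlp.
    """
    @functools.lru_cache(maxsize=None)
    def get_instance():
        # The first call runs the factory; later calls return the cached object.
        return factory()

    return get_instance


# Hypothetical usage (requires ckipnlp and the CKIPWS library):
#   from ckipnlp.ws import CkipWs
#   get_ws = make_singleton(lambda: CkipWs(logger=False))
#   ws = get_ws()        # constructed on the first call
#   same_ws = get_ws()   # the same object is returned afterwards
```

Code that needs the driver then calls get_ws() instead of CkipWs(...), so a second instance is never created within the process.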

static normalize_text(text)

Text normalization output.

Replaces the keywords ()+-:|&# with their full-width counterparts.
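To illustrate the kind of replacement described, here is a pure-Python sketch. It assumes that "full-width" means the Unicode Halfwidth and Fullwidth Forms counterparts of those eight characters. The function normalize_text_sketch is a hypothetical stand-in, and the library's actual mapping may differ.

```python
# Offset between an ASCII character and its full-width form
# (for example "(" U+0028 -> "（" U+FF08). Assumed mapping, see lead-in.
FULLWIDTH_OFFSET = 0xFEE0

# The keyword characters listed in the description above.
KEYWORDS = "()+-:|&#"

# str.translate accepts a dict that maps code points to replacement strings.
_TABLE = {ord(ch): chr(ord(ch) + FULLWIDTH_OFFSET) for ch in KEYWORDS}


def normalize_text_sketch(text):
    """Return text with the keyword characters replaced by full-width forms."""
    return text.translate(_TABLE)
```

Characters outside the keyword set pass through unchanged, so ordinary Chinese text is unaffected by this sketch.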

apply(text, *, normalize=True)

Parse a sentence.

Parameters
  • text (str) – the input sentence.

  • normalize (bool) – do text normalization (see normalize_text()).

Returns

str – the output sentence.

Hint

One may also call this method as __call__().

apply_list(ilist, *, normalize=True)

Parse a list of sentences.

Parameters
  • ilist (List[str]) – the list of input sentences.

  • normalize (bool) – do text normalization (see normalize_text()).

Returns

List[str] – the list of output sentences.
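As a usage sketch for apply_list(), the helper below pairs each input sentence with its output. segment_pairs is a hypothetical name, ws is assumed to be an already-constructed CkipWs object, and the pairing assumes the outputs come back in input order, one per sentence, which the description suggests but does not state.

```python
from typing import List, Tuple


def segment_pairs(ws, sentences: List[str],
                  normalize: bool = True) -> List[Tuple[str, str]]:
    """Pair each input sentence with its segmented output.

    Hypothetical helper; `ws` is assumed to be a CkipWs driver.
    """
    if not sentences:
        # Nothing to segment, so skip the driver call entirely.
        return []
    # Assumption: one output per input, returned in the same order.
    outputs = ws.apply_list(sentences, normalize=normalize)
    return list(zip(sentences, outputs))
```

For a single sentence, ws.apply(text) or the equivalent ws(text) can be called directly instead.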

apply_file(ifile, ofile, uwfile='')

Segment a file.

Parameters
  • ifile (str) – the input file.

  • ofile (str) – the output file (will be overwritten).

  • uwfile (str) – the unknown word file (will be overwritten).
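Because apply_file() overwrites both output paths, it can be convenient to point it at a scratch directory. The sketch below does that with temporary files. segment_text_via_files is a hypothetical helper, ws is assumed to be an already-constructed CkipWs object, and UTF-8 is an assumed file encoding that is not confirmed by the description above.

```python
import os
import tempfile


def segment_text_via_files(ws, text, encoding="utf-8"):
    """Segment text through apply_file() using throwaway files.

    Hypothetical helper; the encoding is an assumption.
    Returns (segmented_text, unknown_words_text).
    """
    with tempfile.TemporaryDirectory() as tmp:
        ifile = os.path.join(tmp, "input.txt")
        ofile = os.path.join(tmp, "output.txt")
        uwfile = os.path.join(tmp, "unknown_words.txt")

        with open(ifile, "w", encoding=encoding) as f:
            f.write(text)

        # Both ofile and uwfile are overwritten by the driver.
        ws.apply_file(ifile, ofile, uwfile)

        with open(ofile, encoding=encoding) as f:
            segmented = f.read()

        # Read the unknown-word file only if the driver produced one.
        unknown = ""
        if os.path.exists(uwfile):
            with open(uwfile, encoding=encoding) as f:
                unknown = f.read()

    return segmented, unknown
```

Both results are read back before the temporary directory is removed, so nothing is left on disk afterwards.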