Introduction¶
External Links¶
Requirements¶
Attention
For Python 2 users, please use PyCkip 0.4.2 instead.
CKIPWS (Optional)¶
CKIP Word Segmentation Linux version 20190524+
CKIPParser (Optional)¶
CKIP Parser Linux version 20190506+ (20190725+ recommended)
Installation¶
Denote <ckipws-linux-root> as the root path of the CKIPWS Linux version, and <ckipparser-linux-root> as the root path of the CKIPParser Linux version.
Install Using Pip¶
pip install --upgrade ckipnlp
pip install --no-deps --force-reinstall --upgrade ckipnlp \
--install-option='--ws' \
--install-option='--ws-dir=<ckipws-linux-root>' \
--install-option='--parser' \
--install-option='--parser-dir=<ckipparser-linux-root>'
Omit the --ws/--ws-dir options if you do not have CKIPWS, and the --parser/--parser-dir options if you do not have CKIPParser.
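The choice between the two commands above can be sketched as a small helper that adds only the options for the components that are present. The function name build_pip_command and its parameters are hypothetical and not part of ckipnlp; the flags themselves are the ones shown in the commands above.

```python
def build_pip_command(ws_dir=None, parser_dir=None):
    """Return the pip argument list for installing ckipnlp.

    ws_dir / parser_dir are the CKIPWS / CKIPParser root paths. Pass None
    for a component you do not have and its options are left out.
    (Hypothetical helper, shown for illustration only.)
    """
    if ws_dir is None and parser_dir is None:
        # Plain install without either optional backend.
        return ["pip", "install", "--upgrade", "ckipnlp"]

    cmd = ["pip", "install", "--no-deps", "--force-reinstall",
           "--upgrade", "ckipnlp"]
    if ws_dir is not None:
        cmd += ["--install-option=--ws",
                f"--install-option=--ws-dir={ws_dir}"]
    if parser_dir is not None:
        cmd += ["--install-option=--parser",
                f"--install-option=--parser-dir={parser_dir}"]
    return cmd


if __name__ == "__main__":
    # Example: CKIPWS is available, CKIPParser is not.
    print(" ".join(build_pip_command(ws_dir="<ckipws-linux-root>")))
```

The returned list can be passed to subprocess.run(cmd, check=True) if you prefer to run the installation from a script.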
Installation Options¶
| Option | Detail | Default Value |
|---|---|---|
| --ws | Enable/disable CKIPWS. | False |
| --parser | Enable/disable CKIPParser. | False |
| --ws-dir | CKIPWS root directory. | |
| | CKIPWS libraries directory. | |
| | CKIPWS share directory. | |
| --parser-dir | CKIPParser root directory. | |
| | CKIPParser libraries directory. | |
| | CKIPParser share directory. | |
| | “Data2” directory. | |
| | “Rule” directory. | |
| | “RDB” directory. | |
Usage¶
See http://ckipnlp.readthedocs.io/ for API details.
CKIPWS¶
import ckipnlp.ws
print(ckipnlp.__name__, ckipnlp.__version__)

# Create the word segmenter
ws = ckipnlp.ws.CkipWs(logger=False)

# Segment a single sentence
print(ws('中文字喔'))

# Segment a list of sentences
for line in ws.apply_list(['中文字喔', '啊哈哈哈']):
    print(line)

# Segment a file, writing the results to ofile and uwfile
ws.apply_file(ifile='sample/sample.txt', ofile='output/sample.tag', uwfile='output/sample.uw')
with open('output/sample.tag') as fin:
    print(fin.read())
with open('output/sample.uw') as fin:
    print(fin.read())
CKIPParser¶
import ckipnlp.parser
print(ckipnlp.__name__, ckipnlp.__version__)

# Create the parser
ps = ckipnlp.parser.CkipParser(logger=False)

# Parse a single sentence
print(ps('中文字喔'))

# Parse a list of sentences
for line in ps.apply_list(['中文字喔', '啊哈哈哈']):
    print(line)

# Parse a file, writing the trees to ofile
ps.apply_file(ifile='sample/sample.txt', ofile='output/sample.tree')
with open('output/sample.tree') as fin:
    print(fin.read())
Utilities¶
import ckipnlp
print(ckipnlp.__name__, ckipnlp.__version__)
from ckipnlp.util.ws import *
from ckipnlp.util.parser import *
# Format CkipWs output
ws_text = ['中文字(Na) 喔(T)', '啊哈(I) 哈哈(D)']
# Show Sentence List
ws_sents = WsSentenceList.from_text(ws_text)
print(repr(ws_sents))
print(ws_sents.to_text())
# Show Each Sentence
for ws_sent in ws_sents: print(repr(ws_sent))
for ws_sent in ws_sents: print(ws_sent.to_text())
# Show CkipParser output as tree
tree_text = '#1:1.[0] S(theme:NP(possessor:N‧的(head:Nhaa:我|Head:DE:的)|Head:Nab(DUMMY1:Nab(DUMMY1:Nab:早餐|Head:Caa:、|DUMMY2:Naa:午餐)|Head:Caa:和|DUMMY2:Nab:晚餐))|quantity:Dab:都|target:PP(Head:P30:往|DUMMY:NP(property:Ncb:天|Head:Ncda:上))|Head:VA11:飛|aspect:Di:了)#'
tree = ParserTree.from_text(tree_text)
tree.show()
# Get heads of tree
for node in tree.get_heads(): print(node)
# Get heads of node 1
for node in tree.get_heads(1): print(node)
# Get heads of node 2
for node in tree.get_heads(2): print(node)
# Get heads of node 13
for node in tree.get_heads(13): print(node)
# Get relations
for rel in tree.get_relations(): print(rel)
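To make the word(POS) text format used in ws_text above more concrete, here is a minimal pure-Python sketch that splits such a string into (word, POS) pairs. It is not how WsSentenceList is implemented; parse_ws_sentence is a hypothetical name, and the sketch assumes tokens are separated by whitespace and each ends with a parenthesised tag.

```python
def parse_ws_sentence(text):
    """Split 'word(POS) word(POS) ...' into a list of (word, pos) tuples."""
    pairs = []
    for token in text.split():
        # The POS tag sits in the last pair of parentheses of each token.
        word, sep, rest = token.rpartition("(")
        if not sep or not rest.endswith(")"):
            raise ValueError(f"malformed token: {token!r}")
        pairs.append((word, rest[:-1]))
    return pairs


if __name__ == "__main__":
    for sentence in ["中文字(Na) 喔(T)", "啊哈(I) 哈哈(D)"]:
        print(parse_ws_sentence(sentence))
```

Using rpartition rather than a split on the first "(" keeps a word intact even if the word itself contains an opening parenthesis.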
FAQ¶
Danger
Due to the underlying C implementation, both CkipWs and CkipParser can only be instantiated once.
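One way to respect this restriction in a larger program is to create the object lazily in one place and hand the same object to every caller. The sketch below is a generic pattern, not something ckipnlp provides; the class name SingleInstance and its get method are hypothetical.

```python
import threading


class SingleInstance:
    """Build an object on first use and return that same object afterwards."""

    def __init__(self, factory):
        self._factory = factory        # zero-argument callable that builds the object
        self._lock = threading.Lock()  # guards against two threads building at once
        self._instance = None
        self._created = False

    def get(self):
        with self._lock:
            if not self._created:
                self._instance = self._factory()
                self._created = True
            return self._instance


if __name__ == "__main__":
    # Stand-in factory for demonstration; a real program would pass a
    # callable that constructs the segmenter or parser instead.
    holder = SingleInstance(lambda: object())
    print(holder.get() is holder.get())
```

With ckipnlp installed, the factory could be, for example, lambda: ckipnlp.ws.CkipWs(logger=False), so that the constructor runs at most once per process.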
Tip
The CKIPWS throws “what(): locale::facet::_S_create_c_locale name not valid”. What should I do?
Install locale data.
apt-get install locales-all
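To check whether a given locale is installed before or after running the command above, a small probe using Python's standard locale module can help. This is only a sketch: the function name locale_available is hypothetical, it tests the LC_CTYPE category only, and the locale name in the example is an assumption, since the text above does not say which locale CKIPWS requests.

```python
import locale


def locale_available(name):
    """Return True if the C library accepts `name` for the LC_CTYPE category."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        # Always restore whatever was active before the probe.
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    # "zh_TW.UTF-8" is an assumed example name, not taken from the CKIPWS docs.
    for name in ["C", "zh_TW.UTF-8"]:
        print(name, locale_available(name))
```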
Tip
The CKIPParser throws “ImportError: libCKIPParser.so: cannot open shared object file: No such file or directory”. What should I do?
Add the following command to ~/.bashrc:
export LD_LIBRARY_PATH=<ckipparser-linux-root>/lib:$LD_LIBRARY_PATH
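To confirm that the export took effect, you can check whether any directory listed in LD_LIBRARY_PATH actually contains the library file. The helper below is a hypothetical sketch using only the standard library. It looks at LD_LIBRARY_PATH alone, so a None result does not rule out the loader finding the library by other means.

```python
import os


def find_shared_library(filename, search_path=None):
    """Return the first directory in `search_path` containing `filename`, or None.

    `search_path` is an os.pathsep-separated string; when omitted, the
    current value of LD_LIBRARY_PATH is used.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    for directory in search_path.split(os.pathsep):
        # Skip empty entries such as those left by a trailing separator.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            return directory
    return None


if __name__ == "__main__":
    print(find_shared_library("libCKIPParser.so"))
```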