We have developed NLP analysis tools and their corpora and released them on GitHub. All of the tools were developed using deep learning techniques as well as statistical ones. Please visit our project's GitHub site (https://github.com/sgnlplabeling/nlp_labeling).
My paper related to WSD was accepted at COLING 2018 as a full paper. Its title and abstract are as follows:
Word Sense Disambiguation Based on Word Similarity Calculation Using Word Vector Representation from a Knowledge-based Graph
Word sense disambiguation (WSD) is the task of determining
the sense of an ambiguous word according to its context. Many existing WSD
studies have been using an external knowledge-based unsupervised approach
because it has fewer word set constraints than supervised approaches requiring
training data. In this paper, we propose a new WSD method to generate the context
of an ambiguous word by using similarities between an ambiguous word and words
in the input document. In addition, to leverage our WSD method, we further
propose a new word similarity calculation method based on the semantic network
structure of BabelNet. We evaluate the proposed methods on the SemEval-2013 and
SemEval-2015 English WSD datasets. Experimental results demonstrate that the
proposed WSD method significantly improves on the baseline WSD method.
Furthermore, our WSD system outperforms the state-of-the-art WSD systems on the
SemEval-2013 dataset. Finally, it has higher performance than the state-of-the-art
unsupervised knowledge-based WSD system in the average performance of both
datasets.
COLING 2018 will be held in Santa Fe, New Mexico, USA, August 20-26, 2018.
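The idea in the abstract above, choosing context words by their similarity to the ambiguous word and then picking the sense closest to that context, can be illustrated with a small sketch. This is not the paper's implementation: the vectors here are hand-made toy values rather than representations derived from BabelNet, and the names `select_context`, `disambiguate`, `vectors`, and `sense_vectors` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_context(target, doc_words, vectors, top_k=2):
    """Keep the top_k document words most similar to the ambiguous target word."""
    candidates = [w for w in doc_words if w != target and w in vectors]
    ranked = sorted(candidates,
                    key=lambda w: cosine(vectors[target], vectors[w]),
                    reverse=True)
    return ranked[:top_k]

def disambiguate(context, sense_vectors, vectors):
    """Return the sense whose vector is most similar to the selected context."""
    def score(sense):
        return sum(cosine(sense_vectors[sense], vectors[w]) for w in context)
    return max(sense_vectors, key=score)

# Toy 3-dimensional vectors (assumed values; axes roughly: finance, river, other).
vectors = {
    "bank":    [0.7, 0.5, 0.1],
    "money":   [0.9, 0.1, 0.0],
    "loan":    [0.8, 0.0, 0.1],
    "weather": [0.0, 0.1, 0.9],
}
# Hypothetical sense inventory for "bank".
sense_vectors = {
    "bank_finance": [1.0, 0.0, 0.0],
    "bank_river":   [0.0, 1.0, 0.0],
}

document = ["bank", "loan", "money", "weather"]
context = select_context("bank", document, vectors)
print(context, disambiguate(context, sense_vectors, vectors))
```

In this toy setting the unrelated word "weather" is filtered out of the context before the senses are scored, which is the role the similarity-based context generation plays in the abstract.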
My paper related to NER was accepted at CIKM 2017. Its abstract is as follows:
Korean named-entity recognition (NER) systems have been developed mainly at the morphological level, and they are commonly based on a pipeline framework that
identifies named-entities (NEs) following the morphological analysis. However,
this framework can mean that the performance of NER systems is degraded,
because errors from the morphological analysis propagate into NER systems. This
paper proposes a novel syllable-level NER system, which does not require a
morphological analysis and can achieve a similar or better performance compared
with the morphological-level NER systems. In addition, because the proposed
system does not require a morphological analysis step, its processing speed is
about 1.9 times faster than those of the previous morphological-level NER systems.
CIKM 2017 will be held in Singapore, November 6-10, 2017.
My paper related to text classification was published in Information Processing and Management (SSCI & SCIE). The title is "How to Use Negative Class Information for Naive Bayes Classification" and its abstract is as follows:
The Naive Bayes (NB) classifier is a popular classifier for text classification problems due to its simple, flexible framework and its reasonable performance. In this paper, we present how to effectively utilize negative class information to improve NB classification. As opposed to information retrieval, supervised learning based text classification already obtains class information, a negative class as well as a positive class, from a labeled training dataset. Since the negative class can also provide significant information to improve the NB classifier, the negative class information is applied to the NB classifier through two phases of indexing and class prediction tasks. As a result, the new classifier using the negative class information consistently performs better than the traditional multinomial NB classifier.
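As a rough illustration of the abstract above, the sketch below trains a standard multinomial NB classifier with Laplace smoothing and optionally scores each class against its negative class (all other classes pooled together) as a log-odds ratio. The log-odds scoring is an assumption chosen for illustration only; it is not the two-phase indexing and prediction method described in the paper, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (tokens, label). Returns (word counts per class, doc counts, vocabulary)."""
    word_counts, doc_counts, vocab = {}, Counter(), set()
    for tokens, label in docs:
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab

def log_likelihood(tokens, counts, vocab_size):
    """Sum of log P(token | class) with Laplace (add-one) smoothing."""
    total = sum(counts.values())
    return sum(math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens)

def predict(tokens, model, use_negative=True):
    """Return the best label. Assumes at least two classes when use_negative is True."""
    word_counts, doc_counts, vocab = model
    n_docs = sum(doc_counts.values())
    tokens = [t for t in tokens if t in vocab]  # ignore unseen words
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / n_docs)
        score += log_likelihood(tokens, counts, len(vocab))
        if use_negative:
            # Negative class: pool the word counts of every other class.
            neg_counts = Counter()
            for other, other_counts in word_counts.items():
                if other != label:
                    neg_counts.update(other_counts)
            neg_docs = n_docs - doc_counts[label]
            neg = math.log(neg_docs / n_docs)
            neg += log_likelihood(tokens, neg_counts, len(vocab))
            score -= neg  # log-odds of the class against its negative class
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training set (assumed data).
training_docs = [
    (["goal", "match", "team"], "sports"),
    (["team", "win", "score"], "sports"),
    (["election", "vote", "party"], "politics"),
    (["vote", "policy", "party"], "politics"),
]
model = train(training_docs)
print(predict(["team", "score", "match"], model))
```

With only two balanced classes, the plain and negative-class scores pick the same label; any difference would only appear with more classes or skewed class sizes, where the pooled negative class carries information that a single competing class does not.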
You can freely get the PDF version of this paper from the link https://authors.elsevier.com/a/1VSJt15hYdYMhA until September 15, 2017. This is my fourth paper on text classification using negative class information: SIGIR 2012, Pattern Recognition Letters 2015, JASIST 2015, and IPM 2017. I'm still interested in this topic, so I hope to do more studies on it.
I gave an invited talk at KISTI. The title is "Text Classification and Summarization (Using Natural Language Processing and Machine Learning Techniques)." http://web.donga.ac.kr/yjko/talks/TC&TS(Youngjoong%20Ko).pdf