Sunday, March 24, 2019
We have developed NLP analysis tools and their corpora and released them on our GitHub site. All of the tools have been built with deep learning techniques as well as statistical ones. Please visit our project's GitHub site: https://github.com/sgnlplabeling/nlp_labeling
Wednesday, May 16, 2018
My accepted paper in COLING 2018
My paper on WSD was accepted at COLING 2018 as a full paper. Its title and abstract are as follows:
Word Sense Disambiguation Based on Word Similarity Calculation Using Word Vector Representation from a Knowledge-based Graph
Word sense disambiguation (WSD) is the task of determining the sense of an ambiguous word according to its context. Many existing WSD studies have used an external knowledge-based unsupervised approach because it has fewer word-set constraints than supervised approaches, which require training data. In this paper, we propose a new WSD method that generates the context of an ambiguous word by using similarities between the ambiguous word and words in the input document. In addition, to leverage our WSD method, we further propose a new word similarity calculation method based on the semantic network structure of BabelNet. We evaluate the proposed methods on the SemEval-2013 and SemEval-2015 English WSD datasets. Experimental results demonstrate that the proposed WSD method significantly improves the baseline WSD method. Furthermore, our WSD system outperforms the state-of-the-art WSD systems on the SemEval-2013 dataset. Finally, it achieves higher average performance across both datasets than the state-of-the-art unsupervised knowledge-based WSD system.
COLING 2018 will be held in Santa Fe, New Mexico, USA, August 20-26, 2018.
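To make the idea concrete, here is a minimal sketch of knowledge-based WSD by word-vector similarity, assuming pre-computed vectors for words and for each candidate sense (e.g., derived from the BabelNet graph). The data structures and names are only illustrative and are not the paper's actual implementation.

import numpy as np

def cosine(u, v):
    # Cosine similarity with a small epsilon to avoid division by zero.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def build_context(target, document_words, word_vectors, top_k=10):
    # Keep the document words most similar to the ambiguous target word.
    scored = [(w, cosine(word_vectors[target], word_vectors[w]))
              for w in document_words if w != target and w in word_vectors]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [w for w, _ in scored[:top_k]]

def disambiguate(target, document_words, word_vectors, sense_vectors):
    # Pick the candidate sense whose vector best matches the generated context.
    context = build_context(target, document_words, word_vectors)
    best_sense, best_score = None, float("-inf")
    for sense, sense_vec in sense_vectors[target].items():
        score = sum(cosine(sense_vec, word_vectors[w]) for w in context)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

Here word_vectors maps words to vectors and sense_vectors maps an ambiguous word to a dictionary of candidate senses and their vectors; how those vectors are built from the knowledge-based graph is the core contribution of the paper and is not reproduced here.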
Saturday, August 5, 2017
My accepted paper in CIKM 2017
My paper on NER was accepted at CIKM 2017. Its abstract is as follows:
Korean named-entity recognition (NER) systems have been developed mainly at the morphological level, and they are commonly based on a pipeline framework that identifies named entities (NEs) after morphological analysis. However, this framework can degrade NER performance because errors from the morphological analysis propagate into the NER system. This paper proposes a novel syllable-level NER system, which does not require a morphological analysis and achieves similar or better performance than morphological-level NER systems. In addition, because the proposed system does not require a morphological analysis step, its processing speed is about 1.9 times faster than that of previous morphological-level NER systems.
CIKM 2017 will be held in Singapore, November 6-10, 2017.
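As a toy illustration of what syllable-level input looks like (not the paper's actual feature set or model), a Korean sentence can be fed to the tagger as a sequence of syllables with BIO labels, with no morphological analysis step in between:

def to_syllables(sentence):
    # Split the raw sentence into syllable tokens; spaces become an explicit boundary symbol.
    return [ch if ch != " " else "<SP>" for ch in sentence]

sentence = "홍길동은 서울에 산다"   # "Hong Gil-dong lives in Seoul"
print(to_syllables(sentence))
# A syllable-level BIO annotation (person, location) might look like:
# 홍/B-PER 길/I-PER 동/I-PER 은/O <SP>/O 서/B-LOC 울/I-LOC 에/O <SP>/O 산/O 다/O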
Friday, July 28, 2017
My published paper in Information Processing and Management (IPM 2017)
My paper related to text classification was published in Information Processing and Management (SSCI & SCIE). The title is "How to Use Negative Class Information for Naive Bayes Classification" and its abstract is as follows:
The Naive Bayes (NB) classifier is popular for text classification problems due to its simple, flexible framework and its reasonable performance. In this paper, we present how to effectively utilize negative class information to improve NB classification. As opposed to information retrieval, supervised-learning-based text classification already obtains class information, a negative class as well as a positive class, from a labeled training dataset. Since the negative class can also provide significant information to improve the NB classifier, the negative class information is applied to the NB classifier through two phases: indexing and class prediction. As a result, the new classifier using the negative class information consistently performs better than the traditional multinomial NB classifier.
You can freely download the PDF version of this paper from https://authors.elsevier.com/a/1VSJt15hYdYMhA until September 15, 2017.
This is my fourth paper about text classification using negative class information: SIGIR 2012, Pattern Recognition Letters 2015, JASIST 2015, and IPM 2017. I'm still interested in this topic, so I hope to do more studies on it.
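To illustrate the general idea, here is a minimal multinomial NB sketch that also uses the complement (negative) class when scoring a document, contrasting P(w|c) against P(w|not c) with a log-likelihood ratio. This is only one simple way to use negative class information and is not the exact formulation in the paper; class priors are also omitted for brevity.

import math
from collections import Counter

def train(docs, labels):
    # docs: list of token lists; labels: parallel list of class labels.
    classes = set(labels)
    pos_counts = {c: Counter() for c in classes}   # counts from documents of class c
    neg_counts = {c: Counter() for c in classes}   # counts from all other documents
    vocab = set()
    for tokens, y in zip(docs, labels):
        vocab.update(tokens)
        for c in classes:
            (pos_counts[c] if c == y else neg_counts[c]).update(tokens)
    return classes, pos_counts, neg_counts, vocab

def predict(tokens, classes, pos_counts, neg_counts, vocab):
    V = len(vocab)
    best_class, best_score = None, float("-inf")
    for c in classes:
        pos_total = sum(pos_counts[c].values())
        neg_total = sum(neg_counts[c].values())
        score = 0.0
        for w in tokens:
            p_pos = (pos_counts[c][w] + 1) / (pos_total + V)   # Laplace smoothing
            p_neg = (neg_counts[c][w] + 1) / (neg_total + V)
            score += math.log(p_pos) - math.log(p_neg)         # log-likelihood ratio
        if score > best_score:
            best_class, best_score = c, score
    return best_class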
Thursday, July 6, 2017
Text Classification and Summarization (Using Natural Language Processing and Machine Learning Techniques)
I gave an invited talk at KISTI. The title is "Text Classification and Summarization (Using Natural Language Processing and Machine Learning Techniques)."
http://web.donga.ac.kr/yjko/talks/TC&TS(Youngjoong%20Ko).pdf
Friday, June 2, 2017
How to Develop NLP Tools with DNN Techniques
I gave an invited talk at the IT 21 Global Conference on June 2, 2017. The title is "How to develop NLP tools with DNN techniques."
http://web.donga.ac.kr/yjko/talks/NLP_Tools_with_DNN(Youngjoong%20Ko).pdf
Friday, March 4, 2016
The Basic Concept of TensorFlow
I am preparing to teach TensorFlow in my graduate course. TensorFlow is Google's open-source software library for machine learning. The first class is about the basic concept of TensorFlow.
The next topic will be "Practice of NNet with the MNIST data."
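For reference, the basic concept boils down to an example like this in the TensorFlow 1.x API of that time: you first build a computation graph and then execute it in a session (recent TensorFlow versions run eagerly instead, so this session-based style applies to the 1.x API).

import tensorflow as tf

a = tf.constant(3.0, name="a")    # nodes are added to a default graph
b = tf.constant(4.0, name="b")
c = tf.add(a, b, name="c")        # nothing is computed yet

with tf.Session() as sess:        # a session executes the graph
    print(sess.run(c))            # -> 7.0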