Natural Language Processing, Machine Learning & Information Retrieval

Friday, April 19, 2019

Getting Started with Word Embedding

I have uploaded the second video ("Getting Started with Word Embedding").

https://youtu.be/d3Pb0_8cI4A

Your support would be much appreciated.

Saturday, April 6, 2019

Natural Language Processing with Deep Neural Networks

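Going forward, I plan to record videos on natural language processing, machine learning, information retrieval, and text mining, and to upload them to YouTube on an ongoing basis.

The first video I am making is titled "Natural Language Processing with Deep Neural Networks".

You can watch the first video at the link below.

https://youtu.be/sNq5xbtXhSI

Please watch it, and please support me so that I can keep uploading useful videos.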

Sunday, March 24, 2019

Opening NLP analysis tools and their corpora

We have developed NLP analysis tools and their corpora and released them on GitHub.
All of the tools were developed using deep learning techniques as well as statistical ones.

Please visit our project's GitHub site (https://github.com/sgnlplabeling/nlp_labeling).

Wednesday, May 16, 2018

My accepted paper in COLING 2018

My paper on WSD was accepted at COLING 2018 as a full paper. Its title and abstract are as follows:

Word Sense Disambiguation Based on Word Similarity Calculation Using Word Vector Representation from a Knowledge-based Graph

Word sense disambiguation (WSD) is the task to determine the sense of an ambiguous word according to its context. Many existing WSD studies have been using an external knowledge-based unsupervised approach because it has fewer word set constraints than supervised approaches requiring training data. In this paper, we propose a new WSD method to generate the context of an ambiguous word by using similarities between an ambiguous word and words in the input document. In addition, to leverage our WSD method, we further propose a new word similarity calculation method based on the semantic network structure of BabelNet. We evaluate the proposed methods on the SemEval-2013 and SemEval-2015 for English WSD dataset. Experimental results demonstrate that the proposed WSD method significantly improves the baseline WSD method. Furthermore, our WSD system outperforms the state-of-the-art WSD systems in the Semeval-13 dataset. Finally, it has higher performance than the state-of-the-art unsupervised knowledge-based WSD system in the average performance of both datasets.

COLING 2018 will be held in Santa Fe, New Mexico, USA, August 20-26, 2018.

Saturday, August 5, 2017

My accepted paper in CIKM 2017

My paper on NER was accepted at CIKM 2017. Its abstract is as follows:

Korean named-entity recognition (NER) systems have been developed mainly on the morphological-level, and they are commonly based on a pipeline framework that identifies named-entities (NEs) following the morphological analysis. However, this framework can mean that the performance of NER systems is degraded, because errors from the morphological analysis propagate into NER systems. This paper proposes a novel syllable-level NER system, which does not require a morphological analysis and can achieve a similar or better performance compared with the morphological-level NER systems. In addition, because the proposed system does not require a morphological analysis step, its processing speed is about 1.9 times faster than those of the previous morphological-level NER systems.
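To make the idea of syllable-level labelling concrete, here is a minimal sketch of how entity spans could be turned into one tag per syllable, with no morphological analysis step. The BIO tagging scheme, the function name `syllable_bio_tags`, and the example sentence are assumptions for illustration only; the abstract does not say which scheme or model the paper uses, and this shows only the labelling unit, not the recognizer itself.

```python
def syllable_bio_tags(sentence, entities):
    """Assign one BIO tag to every character of the sentence.

    Each precomposed Hangul syllable is a single character in Python,
    so character offsets double as syllable offsets here.
    entities: list of (start, end, label) with end exclusive (hypothetical format).
    """
    tags = ["O"] * len(sentence)
    for start, end, label in entities:
        tags[start] = f"B-{label}"          # first syllable of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining syllables of the entity
    return list(zip(sentence, tags))


# Hypothetical example: "서울" (Seoul) marked as a location.
example = "서울에 간다"
pairs = syllable_bio_tags(example, [(0, 2, "LOC")])
for syllable, tag in pairs:
    print(repr(syllable), tag)
```

A sequence labeller trained on such pairs predicts tags directly from syllables, so errors from a separate morphological analyzer cannot propagate into it.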

CIKM 2017 will be held in Singapore, November 6-10, 2017.

Friday, July 28, 2017

My published paper in Information Processing and Management (IPM 2017)

My paper on text classification was published in Information Processing and Management (SSCI & SCIE). The title is "How to Use Negative Class Information for Naive Bayes Classification" and its abstract is as follows:

The Naive Bayes (NB) classifier is a popular classifier for text classification problems due to its simple, flexible framework and its reasonable performance. In this paper, we present how to effectively utilize negative class information to improve NB classification. As opposed to information retrieval, supervised learning based text classification already obtains class information, a negative class as well as a positive class, from a labeled training dataset. Since the negative class can also provide significant information to improve the NB classifier, the negative class information is applied to the NB classifier through two phases of indexing and class prediction tasks. As a result, the new classifier using the negative class information consistently performs better than the traditional multinomial NB classifier.
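For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of a traditional multinomial NB classifier with Laplace smoothing. It is only the baseline, not the negative-class method proposed in the paper, which the abstract does not describe in enough detail to reproduce. The function names, the smoothing parameter `alpha`, and the toy data are all assumptions for illustration.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log likelihoods from tokenized documents."""
    vocab = {w for doc in docs for w in doc}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)

    model = {}
    for c in class_counts:
        # Laplace (add-alpha) smoothing over the whole vocabulary.
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        model[c] = {
            "log_prior": math.log(class_counts[c] / len(labels)),
            "log_likelihood": {
                w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
            },
        }
    return model


def predict(model, doc):
    """Return the class with the highest log posterior; unseen words are skipped."""
    def score(c):
        ll = model[c]["log_likelihood"]
        return model[c]["log_prior"] + sum(ll[w] for w in doc if w in ll)
    return max(model, key=score)


# Toy data, invented for this sketch.
docs = [
    ["cheap", "pills", "buy"],
    ["meeting", "schedule", "today"],
    ["buy", "cheap", "now"],
    ["project", "meeting", "notes"],
]
labels = ["spam", "ham", "spam", "ham"]
model = train_multinomial_nb(docs, labels)
prediction = predict(model, ["cheap", "meeting", "buy", "buy"])
print(prediction)
```

Note that this baseline scores each class using only that class's own word statistics; the paper's contribution is to additionally exploit information from the negative class.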

You can download the PDF version of this paper for free from https://authors.elsevier.com/a/1VSJt15hYdYMhA until September 15, 2017.

This is my fourth paper on text classification using negative class information: SIGIR 2012, Pattern Recognition Letters 2015, JASIST 2015, and now IPM 2017. I'm still interested in this topic, so I hope to do more work on it.

Thursday, July 6, 2017

Text Classification and Summarization (Using Natural Language Processing and Machine Learning Techniques)

I gave an invited talk at KISTI. The title is "Text Classification and Summarization (Using Natural Language Processing and Machine Learning Techniques)."

http://web.donga.ac.kr/yjko/talks/TC&TS(Youngjoong%20Ko).pdf