Abstract
In this paper, we propose a method for combining text written in different styles and apply it to a sightseeing information generation system. Information sources on the Web vary widely in expression, from formal to casual. For example, Wikipedia uses literary expression, whereas many SNSs such as Twitter use colloquial expression. Since these different sources may each contain useful and valuable information, research on combining different expression styles is important for reducing unnaturalness and for implementing user-friendly applications. In the proposed method, an N-gram model is first constructed from Wikipedia as examples of literary style. It is then used to convert colloquial style to literary style. In the application to the sightseeing information generation system, the user's location is used as input. The system retrieves basic information about the user's location from Wikipedia and word-of-mouth information from social media. The colloquial style in the social media data is converted to literary style. Sentences are then selected to extract valuable and useful information for sightseeing. As output, the proposed system combines information from the two types of sources and generates an informative sightseeing document. We evaluated both the style conversion method and the proposed sightseeing information generation system. The experimental results show that the conversion method was able to change colloquial style to literary style. For the sightseeing application, the results suggest that combining sightseeing information from more than one source is effective.
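The abstract does not spell out how the N-gram model drives the conversion, so the following is only a minimal sketch of one plausible reading: a bigram model is counted from literary-style sentences, and each colloquial token is replaced by whichever candidate makes the whole sentence score highest under that model. The function names, the add-one smoothing, the hand-written `candidates` table, and the English toy sentences are all assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter
from itertools import product


def build_bigram_model(sentences):
    """Count unigrams and bigrams over tokenized literary-style sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def log_score(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a token sequence."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )


def to_literary(tokens, candidates, unigrams, bigrams):
    """Return the candidate rewrite that the literary bigram model scores highest."""
    # Tokens without an entry in the (hypothetical) candidate table stay as they are.
    options = [candidates.get(token, [token]) for token in tokens]
    best = max(product(*options), key=lambda seq: log_score(list(seq), unigrams, bigrams))
    return list(best)


# Toy stand-in for literary-style text such as Wikipedia sentences (assumed data).
literary_corpus = [
    ["the", "temple", "is", "very", "beautiful"],
    ["the", "garden", "is", "very", "quiet"],
]
# Hypothetical table of colloquial tokens and their possible replacements.
candidates = {"super": ["super", "very"]}

unigrams, bigrams = build_bigram_model(literary_corpus)
colloquial = ["the", "temple", "is", "super", "beautiful"]
result = to_literary(colloquial, candidates, unigrams, bigrams)
print(" ".join(result))
```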
Original language | English |
---|---|
Pages (from-to) | 1827-1841 |
Number of pages | 15 |
Journal | International Journal of Innovative Computing, Information and Control |
Volume | 10 |
Issue number | 5 |
Publication status | Published - 2014 Oct 1 |
Externally published | Yes |
Keywords
- N-gram
- Case frames
- Natural language processing
- Sentence generation
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Information Systems
- Computational Theory and Mathematics