Taiyō Corpus

The Taiyō Corpus is a text database of the periodical Taiyō, which was read by a wide range of readers from the end of the 19th century to the beginning of the 20th century. The individual articles exhibit a variety of writing styles and orthographic characteristics, making the corpus an excellent resource for studying the development of modern Japanese. The Taiyō Corpus provides the texts of 3,409 articles from 60 issues published between 1895 and 1925, amounting to approximately 15 million characters. All texts are formatted in XML, allowing for efficient text retrieval and data handling. Shown below are samples of the original text of Taiyō (top left), the XML document (top right), the XML document formatted with an XSL style sheet (bottom right), and the dialogue of the information retrieval tool provided with the corpus (bottom left).

Images of the Taiyō Corpus
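
Because the texts are XML, retrieval can in principle be done with any standard XML parser. The following is only a minimal sketch in Python of what such handling might look like. The element and attribute names (`issue`, `article`, `p`, `title`, `year`) and the sample text are hypothetical placeholders chosen for illustration; they are not the corpus's actual schema, which should be taken from the documentation distributed with the corpus.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The tag and attribute names here are
# illustrative assumptions, NOT the real Taiyō Corpus schema.
SAMPLE = """<issue year="1895" number="1">
  <article title="Sample A">
    <p>first paragraph text</p>
    <p>second paragraph text</p>
  </article>
  <article title="Sample B">
    <p>another paragraph</p>
  </article>
</issue>"""


def find_articles(xml_text, keyword):
    """Return (article title, paragraph text) pairs containing keyword."""
    root = ET.fromstring(xml_text)
    hits = []
    for article in root.iter("article"):
        for para in article.iter("p"):
            # itertext() also collects text nested in any inline child elements
            text = "".join(para.itertext())
            if keyword in text:
                hits.append((article.get("title"), text))
    return hits


def count_characters(xml_text):
    """Return a dict mapping each article title to its paragraph character count."""
    root = ET.fromstring(xml_text)
    return {
        article.get("title"): sum(
            len("".join(para.itertext())) for para in article.iter("p")
        )
        for article in root.iter("article")
    }
```

Calling `find_articles(SAMPLE, "second")` would return the single matching paragraph together with its article title, and `count_characters(SAMPLE)` gives a per-article character count of the kind behind a corpus-size figure. With the real corpus, the tag names would need to be replaced by those defined in its own documentation.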


Copyright (c) The National Institute for Japanese Language. All Rights Reserved.
