Developing a text-based corpus of the language of Japanese comics (manga)

Unser-Schutz, Giancarla; Newman, John; Baayen, Harald; Rice, Sally

doi:10.1163/9789401206884_012

Unser-Schutz, Giancarla. "Developing a text-based corpus of the language of Japanese comics (manga)." Corpus-based Studies in Language Use, Language Learning, and Language Documentation. Eds. John Newman, Harald Baayen and Sally Rice. Language and Computers. Amsterdam [etc.]: Rodopi, 2011. 213–38.
Added by: joachim (23/02/2017, 12:09) Last edited by: joachim (23/02/2017, 12:16)

Resource type: Book Chapter
Language: en: English
DOI: 10.1163/9789401206884_012
BibTeX citation key: UnserSchutz2011a

Categories: General
Keywords: Digitalization, Japan, Language, Manga
Creators: Baayen, Newman, Rice, Unser-Schutz
Publisher: Rodopi (Amsterdam [etc.])
Collection: Corpus-based Studies in Language Use, Language Learning, and Language Documentation

Abstract

While demands for corpora from media which mix visual and linguistic elements have increased in recent years with developments in corpus-based linguistics research, the actual creation and design of such corpora present many unique problems. Most centrally, there remains much to be considered in terms of how to isolate and meaningfully represent their linguistic data. In line with these trends, in this paper I introduce a 687,654-character (55,415 entries) corpus of the language from Japanese comics (manga). Many of the issues encountered in its design are found with other media – newspaper stories, advertisements, political cartoons – which mix the visual with the linguistic. This paper describes how such unusual text could be of interest to other researchers, and the approaches taken here may help others with similar projects.
