Encoding Transcultural Texts - Applying TEI to Early Modern Chinese Christian Literature

Authors: Wang, Wenlu

Date: Thursday, 7 September 2023, 4:15pm to 5:45pm

Location: Main Campus, L 1


This project centers on the data modeling and encoding of early modern Chinese Christian literature. It demonstrates the effectiveness of structuring and encoding these texts in TEI-XML for accumulating and analyzing Chinese Christian text data, and it contributes to the discussion on developing models for encoding and linking transcultural texts.

Over 1,050 works were co-produced by European missionaries and Chinese converts between 1582 and the mid-19th century. These texts, including religious manuals and translations of Scholastic philosophy, document various aspects of the transcultural encounters between Europe and China during the early modern period. Consequently, they serve as essential source materials for research on, among other topics, the Catholic Church’s global mission and global knowledge circulation in the early modern era. In the past few decades, bibliographic databases and large-scale digitization and reprint projects of collections in European and Chinese libraries and archives have made a major part of the texts available to the research community. Thanks to these publications, a series of print-based transcriptions have been published, further improving the accessibility of the texts.

However, digital versions with searchable full text are not yet available; moreover, no efforts have been made to provide the text data in structured, machine-readable formats such as TEI-XML. Drawing on experience with encoding texts in East Asian languages in TEI-XML, especially Buddhist texts, this project applies TEI encoding to Chinese Christian texts and explores ways of making the texts available more economically and sustainably, and to a wider community.

As an initial effort, this project encoded a collection of texts known as “Dottrina Christiana,” which consists of fundamental Christian tenets that converts need to learn before receiving baptism and recite in religious practices afterwards. Often anonymous, this genre survives in dozens of variants that differ in length and wording, which had previously frustrated attempts at comprehensive analysis. The critical apparatus framework of the TEI Guidelines proved effective for recording the variants efficiently and precisely while leaving room for additions whenever a new variant is identified. The project adopts the parallel segmentation method and encodes in the neutral style (Figure 1), since its purpose is to provide an overall understanding of the fluidity of the texts and evidence for grouping and dating each variant. Besides the <app> element, the encoding uses a set of structural tags to show the basic structure of the text: <front>, <body>, and <back>, and, within <body>, <div>, <head>, and <p>.

Fig. 1.
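
The following is a minimal sketch of what a parallel-segmented apparatus entry can look like and how the running text of each witness can be rebuilt from it. It is not a reproduction of Figure 1 or of the project's data: the sigla A and B, the sample sentence, and the helper name `witness_text` are all hypothetical, and the use of bare <rdg> elements without a <lem> is an assumption about how no single witness is treated as the base text.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical fragment: two invented witnesses (A, B) that differ in one term.
# Every reading sits in <rdg>; no <lem> is used, so no witness is privileged.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">信'
    '<app><rdg wit="#A">天主</rdg><rdg wit="#B">陡斯</rdg></app>'
    '全能者</p>'
)


def witness_text(p, siglum):
    """Rebuild the running text of one witness from a parallel-segmented <p>."""
    parts = [p.text or ""]
    for child in p:
        if child.tag == f"{{{TEI_NS}}}app":
            # Keep only the reading(s) attested by the requested witness.
            for rdg in child.findall("tei:rdg", NS):
                if f"#{siglum}" in rdg.get("wit", "").split():
                    parts.append("".join(rdg.itertext()))
        else:
            parts.append("".join(child.itertext()))
        # Text that follows the child element is shared by all witnesses.
        parts.append(child.tail or "")
    return "".join(parts)


paragraph = ET.fromstring(SAMPLE)
for siglum in ("A", "B"):
    print(siglum, witness_text(paragraph, siglum))
```

Because each <app> holds all readings at one locus, a newly identified variant can be recorded by adding its siglum to an existing <rdg> or by adding a further <rdg>, without restructuring the surrounding text.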

At this point, the project has encoded seven primary variants and has used existing visualization tools, such as the Versioning Machine and the TEI Critical Apparatus Toolbox, to revise and improve the encoding. Based on the encoded data and visualizations, the project provides the first systematic analysis of this genre, grouping and dating the variants. This makes it possible to map the gradual development of translations of key theological terms and of the content taught to catechumens and converts in different periods. Although both the encoding guidelines and the visualization tools generally worked well with these early modern Chinese Christian texts, several issues remain unresolved. One is how to encode and visualize warigaki 割書 (shuanghang xiaozhu 双行小注 in Chinese), i.e. text written in smaller characters in two evenly split lines that functions as annotation to the main text (Figure 2). Another is how to present and compare in parallel the sections encoded as <div>, which vary in length; the project could not achieve this with existing visualization tools.

Fig. 2.
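
As one possible starting point for the open question of warigaki, the sketch below marks the double-line annotation as an inline <note> and separates it from the main text when reading the file. This is only a candidate encoding, not the project's solution: the value type="warigaki", the placeholder characters (正文 for main text, 小注 for the small-character note), and the helper name `split_main_and_notes` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment with placeholder text: main text, an inline
# double-line annotation, then more main text.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">正文甲'
    '<note type="warigaki" place="inline">小注乙</note>'
    '正文丙</p>'
)


def split_main_and_notes(p):
    """Return (main_text, [note_texts]) for a <p> containing warigaki notes."""
    main = [p.text or ""]
    notes = []
    for child in p:
        if child.tag == f"{{{TEI_NS}}}note" and child.get("type") == "warigaki":
            # The annotation is collected separately, not merged into the main text.
            notes.append("".join(child.itertext()))
        else:
            main.append("".join(child.itertext()))
        # Text after the child element belongs to the main text.
        main.append(child.tail or "")
    return "".join(main), notes


main_text, note_texts = split_main_and_notes(ET.fromstring(SAMPLE))
print(main_text)
print(note_texts)
```

Keeping the annotation in its own element leaves the main text available for collation, but it does not by itself address how a viewer should render the two split lines.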

Chinese Christian texts are transcultural in nature. Some are direct translations of works in European languages, and others are free adaptations. They are also part of the multilingual body of works produced in the early modern global mission of the Catholic Church, whose activities ranged across Asia, the Americas, and Africa. Drawing on previous studies of parallel corpora of Buddhist texts and multilingual Bibles, this project is exploring ways to link Chinese Christian texts with their European originals and with early modern Japanese Christian texts translated from the same originals. In addition to a paragraph- or sentence-level parallel corpus, the project plans to encode key theological terms that were rendered into Chinese and Japanese according to either pronunciation or meaning. Encoding Chinese Christian literature as transcultural texts shares general practices and common challenges with multilingual corpus building in Buddhist studies and Biblical studies. These shared practices may serve as a gateway through which researchers of the history of Christianity in Asia can connect with broader research communities.
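
To illustrate what sentence-level linking could involve, the sketch below aligns a Latin segment with a Chinese one through a TEI <linkGrp> and resolves the link into a pair of strings. It is a sketch under stated assumptions rather than the project's model: the xml:id values, the choice of <seg> and <linkGrp type="translation">, the sample sentences, and the helper name `aligned_pairs` are illustrative, and the fragment has not been validated against a TEI schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical fragment: one Latin and one Chinese segment joined by a link.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <div xml:lang="la"><seg xml:id="la-1">Pater noster, qui es in caelis</seg></div>
  <div xml:lang="zh"><seg xml:id="zh-1">在天我等父者</seg></div>
  <linkGrp type="translation"><link target="#la-1 #zh-1"/></linkGrp>
</body>"""


def aligned_pairs(root):
    """Resolve every <link> into a tuple of the texts of the segments it targets."""
    by_id = {
        seg.get(f"{{{XML_NS}}}id"): seg
        for seg in root.iter(f"{{{TEI_NS}}}seg")
    }
    pairs = []
    for link in root.iter(f"{{{TEI_NS}}}link"):
        # @target holds space-separated pointers of the form "#id".
        ids = [pointer.lstrip("#") for pointer in link.get("target", "").split()]
        pairs.append(tuple("".join(by_id[i].itertext()) for i in ids))
    return pairs


pairs = aligned_pairs(ET.fromstring(SAMPLE))
for pair in pairs:
    print(" | ".join(pair))
```

A Japanese rendering of the same passage could be added as a third pointer in the same @target, so that all versions of a passage are reachable from one link.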


Apollon, Daniel, Claire Bélisle, and Philippe Régnier, eds. Digital Critical Editions. University of Illinois Press, 2014.

Burghart, Marjorie. “The TEI Critical Apparatus Toolbox: Empowering Textual Scholars through Display, Control, and Comparison Features.” Journal of the Text Encoding Initiative, no. 10 (December 7, 2016). https://doi.org/10.4000/jtei.1520.

Nagasaki, Kiyonori, Toru Tomabechi, and Masahiro Shimoda. “Towards a Digital Research Environment for Buddhist Studies.” Literary and Linguistic Computing 28, no. 2 (June 2013): 296–300. https://doi.org/10.1093/llc/fqs076.

Schreibman, Susan, Amit Kumar, and Jarom McDonald. “The Versioning Machine.” Literary and Linguistic Computing 18, no. 1 (April 1, 2003): 101–7. https://doi.org/10.1093/llc/18.1.101.

Standaert, Nicolas, and Nora Van den Bosch. “Mapping the Printing of Sino-European Intercultural Books in China (1582–c.1823).” East Asian Publishing and Society 12, no. 2 (October 11, 2022): 130–91. https://doi.org/10.1163/22106286-12341367.

Standaert, Nicolas, and Ad Dudink. “Chinese Christian Texts Database (CCT-Database).” Accessed August 1, 2023. http://www.arts.kuleuven.be/sinology/cct.

The CrossWire Bible Society. “OSIS Manual.” Accessed August 1, 2023. https://crosswire.org/osis/.

The SAT Daizōkyō Text Database Committee (SAT). “The SAT Daizōkyō Text Database.” Accessed August 1, 2023. http://21dzk.l.u-tokyo.ac.jp/SAT/.

About the author

WANG Wenlu is a project researcher at Tokyo College, The University of Tokyo. Her research fields include the history of Christianity in China and intercultural encounters between Europe and East Asia. She is now working on a manuscript about Chinese translations of Christian doctrine (1580s-1720s) and a TEI critical edition of a Manila Chinese Christian work.
