Research blog

Below we present the texts of COVID-19 campaign posters that we collected in Hong Kong and Guangzhou. These posters were officially released between the outbreak of COVID-19 and May 7, 2020.

The poster texts are categorized into Chinese-English bilingual datasets and monolingual datasets in either Chinese or English, with pinyin transcription added for the Chinese text.

In every dataset, modality devices are marked in bold, negative polarity markers are underlined with single lines, personal references are underlined with double lines, and terms referring to social distancing are marked with rectangles by the authors.

Posted on Aug. 27, 2021, 1:46 p.m.

In this article we present the Database of Word-Level Statistics for Mandarin Chinese (DoWLS-MAN). The database addresses the lack of agreement on phonological syllable segmentation in Mandarin by offering phonological features for each lexical item according to 16 schematic representations of the syllable (8 with tone and 8 without tone). Lexical statistics that differ for a given phonological word or nonword as syllable segmentation changes belong to the variant category; they include subtitle lexical frequency, phonological neighborhood density measures, homophone density, and network science measures. The invariant characteristics include each item's lexical tone, phonological transcription, and syllable structure, among others.
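To illustrate why a measure such as phonological neighborhood density falls into the variant category, here is a minimal sketch in Python. It assumes the common one-unit edit definition of a neighbor (one substitution, insertion, or deletion) and uses a hypothetical four-syllable toy lexicon with two illustrative segmentations (segment-by-segment and onset-rime). Neither the lexicon, the function names, nor these two segmentations are taken from DoWLS-MAN or claimed to match any of its 16 schemas.

```python
from typing import Dict, Tuple

Units = Tuple[str, ...]


def is_neighbor(a: Units, b: Units) -> bool:
    """True if a and b differ by exactly one unit (substitution, insertion, or deletion)."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Substitution: exactly one position differs.
        return sum(x != y for x, y in zip(a, b)) == 1
    # Insertion/deletion: removing one unit from the longer form yields the shorter.
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def neighborhood_density(lexicon: Dict[str, Units]) -> Dict[str, int]:
    """Count, for each item, how many other items in the lexicon are its neighbors."""
    return {
        word: sum(is_neighbor(units, other)
                  for other_word, other in lexicon.items() if other_word != word)
        for word, units in lexicon.items()
    }


# Hypothetical toy lexicon (toneless pinyin syllables), segmented two ways.
SEGMENTAL = {
    "ban": ("b", "a", "n"),
    "bang": ("b", "a", "ng"),
    "man": ("m", "a", "n"),
    "bei": ("b", "e", "i"),
}
ONSET_RIME = {
    "ban": ("b", "an"),
    "bang": ("b", "ang"),
    "man": ("m", "an"),
    "bei": ("b", "ei"),
}

print(neighborhood_density(SEGMENTAL))
print(neighborhood_density(ONSET_RIME))
```

In this toy lexicon, "ban" and "bei" are neighbors under the onset-rime segmentation (one rime substitution) but not under the segment-by-segment one (two segments differ), so the same item receives a different density depending on the schema.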

Posted on Aug. 25, 2021, 11:04 a.m.

詞性的理論基礎,集合論還是類型論?
Are Grammatical Categories Set-theoretical or Type-theoretical?

Posted on Dec. 16, 2020, 11:01 a.m.

A paper by Dr. I-Hsuan Chen (former CBS PDF who just completed her appointment), Prof. Huang, and Dr. Politzer-Ahles has been accepted for publication in Frontiers in Psychology, section Language Sciences. It provides new evidence that prosodic cues influence scalar interpretation, though they play different roles in quantity-contrast and type-contrast. Colleagues interested in reading and commenting on the pre-final version are welcome to contact the authors.

I-Hsuan Chen, Chu-Ren Huang, and Stephen Politzer-Ahles. To Appear 2018. Determining the types of contrasts: the influences of prosody on pragmatic inferences. Frontiers in Psychology, section Language Sciences.

Abstract
This study explores the issues involving pragmatic inferences with prosodic cues. Although there is a well-established literature from multiple languages demonstrating how different pragmatic inferences can be applied to the same syntactic structure, few studies discuss whether prosody can determine types of alternative sets based on the same syntactic structure. In Mandarin Chinese, the same sentence containing a numeral-classifier phrase as a negative polarity item can be employed for two types of scalar inferences based on either the numeral or the noun. The sentence wo yi zhi mayi dou mei kan dao ("I didn't even see one ant") can induce two different scalar inferences: Quantity-contrast (‘I did not see one ant, much less two ants, three ants, and so on’ by drawing a contrast against the minimal quantity of one), and Type-contrast (‘I did not see an ant, much less a dog, a cat, a human being, and so on’ by drawing a contrast against the minimally surprising type, that of ants). Taking advantage of sentences with the same syntactic structure and lexical items, our study examines whether prosodic conditions can guide people to choose pragmatic inferences from a set of options based on the same syntactic structure. The experiments of this study are designed to answer whether prosody interacts with contextual information in this grammatical structure. The results suggest that Mandarin speakers can use sentence prosody to determine which inference is intended, at least in experimental contexts that directly probe explicit awareness of prosody. Prosody does play a role in inducing scalar inferences, but contextual information can override the effects of prosody. Each prosodic pattern can evoke a specific set of scalar inferences, but quantity-contrast inferences are favored over type-contrast inferences. Our experiments show that prosodic prominence can serve as a linguistic cue to pragmatic inferences.

Keywords: prosody, scalar inferences, numeral-classifier phrases, negative polarity items, intonation.

Posted on Oct. 15, 2018, 3:15 p.m.

A paper by Dr. Shichang Wang (PhD from CBS, 2016, currently at Shandong University), Prof. Huang, Dr. Yao, and Dr. Chan has been accepted for publication in Language and Linguistics 20.2 (April 2019). It explores the relation between lexical semantic processing and headedness based on semantic transparency. The study used the innovative method of crowdsourcing to build a semantic transparency dataset that includes transparency ratings for each compound and both of its constituent roots. Colleagues interested in reading and commenting on the pre-final version are welcome to contact the authors. The semantic transparency dataset will be available worldwide through LDC, UPenn, and is open to our colleagues for research. Please see the second page for a description of the dataset.

Shichang Wang, Chu-Ren Huang, Yao Yao and Wing Shan Angel Chan. 2019. The effect of morphological structure on semantic transparency ratings. Language and Linguistics 20.2. April 2019.

Abstract
Semantic transparency deals with the interface between lexical semantics and morphology. It is an important linguistic phenomenon in Chinese in the context of predicting the meanings of compounds from their constituents. Despite the prominence of compounding in Chinese morpho-lexical processes, to date there is no semantic transparency dataset available to support verifiable and replicable quantitative analysis of semantic transparency in Mandarin Chinese. In addition, the relation between semantic transparency and morphological structure has not been systematically examined. This paper reports a crowdsourcing-based experiment designed for the construction of a large semantic transparency dataset of Chinese compounds which includes semantic transparency ratings of both the compound and each constituent root of the compound. We also present an analysis of the effects of morphological structure on semantic transparency using the constructed dataset. Our study found that in a transparent modifier-head compound, the head tends to get a greater semantic transparency rating than the modifier. Interestingly, no such effect is observed in coordinative compounds. This result suggests that compounds of different morphological structures are processed differently and that the concept of head plays an important role in the word-formation process of compounding. We advocate that crowdsourcing can be a highly instrumental method to collect linguistic judgements and to construct language resources in Chinese language studies. In addition, the proposed methodology of comparing constituent transparency and word transparency sheds light on the relation between morpho-lexical structure and cognitive processing of lexical meanings.

Keywords: compound semantic transparency, constituent semantic transparency, semantic transparency dataset, headedness, crowdsourcing

Posted on Oct. 15, 2018, 3:09 a.m.

Long, Yunfei, Rong Xiang, Qin Lu, Dan Xiong, Chu-Ren Huang, Chenlin Bi, Minglei Li. 2018. Learning Heterogeneous Network Embedding from Text and Links. IEEE Access. 6.1.1-11.
Published Online 2018-10-02.
To Appear: 2018-12
DOI: 10.1109/ACCESS.2018.2873044
Early Access: https://ieeexplore.ieee.org/document/8478654/
IEEE Access® is a multidisciplinary, applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest.
JCR IF: 3.557 (2017)
SJR Q1 in Computer Science (Misc.) and Engineering (Misc.)

Posted on Oct. 8, 2018, 10:46 a.m.

ZHAO Qingqing (CBS) 赵青青 has been awarded the “Faculty of Humanities Distinguished Thesis Award for PhD Students 2017/18” (with a certificate and a $3,000 book coupon), to be presented at the FH Congregation on 3 November 2018.

Posted on Sept. 14, 2018, 11:43 a.m.

The Routledge Handbook of Chinese Applied Linguistics
Edited by Chu-Ren Huang, Zhuo Jing-Schmidt, Barbara Meisterernst

The Routledge Handbook of Chinese Applied Linguistics is written for those wanting to acquire comprehensive knowledge of China, the diaspora, and the Sino-sphere communities through Chinese language.

It examines how Chinese language is used in different contexts, and how the use of Chinese language affects culture, society, expression of self, and persuasion of others, as well as how neurophysiological aspects of language disorder affect how we function, and how the advance of technology changes the way the Chinese language is used and perceived. The handbook concentrates on the cultural, societal, and communicative characteristics of the Chinese language environment.

Focusing on language use in action, in context, and in vivo, this book intends to lay empirical grounds for collaboration and synergy among different fields.

For pre-order, please go to:

Routledge website: https://www.routledge.com/The-Routledge-Handbook-of-Chinese-Applied-Linguistics/Chu-Ren-Jing-Schmidt-Meisterernst/p/book/9781138650732

OR

Amazon website: https://www.amazon.com/Routledge-Handbook-Chinese-Applied-Linguistics/dp/1138650730/ref=sr_1_1?ie=UTF8&qid=1534211030&sr=8-1&keywords=The+Routledge+Handbook+of+Chinese+Applied+Linguistics

Posted on Aug. 28, 2018, 1:17 p.m.

>Conference Time<
Dec 1-3, 2018

>Conference Venue<
The Hong Kong Polytechnic University, Hong Kong SAR

>Submission Deadline<
To Be Confirmed

>Conference Enquiry<
paclic32@polyu.edu.hk

>Conference Host<
Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University

>Conference Goals<
The long tradition of PACLIC conferences emphasizes the synergy of theoretical analysis and processing of language - from theoretical frameworks to cognitive accounts, from lexical processing to language understanding, and from computational modelling to multi-lingual applications.

The most important purpose of PACLIC conferences is to provide a forum where researchers in different fields of language study in different areas in the Pacific-Asia region working on issues pertaining to different languages can come together and talk, get to know each other, learn old wisdom, be enlightened by new insights and generally get entertained intellectually, and come home ready to initiate a new research program with new research partners in a new state of mind.

We value hybrid talks on linguistic principles and implementation details, massive data collection and extraction of abstract rules, automated proficiency evaluation and philosophical contemplation on language learning, although pure theory and pure technology will also be appreciated. We welcome bilingual or multilingual research, while due respect will be paid to monolingual research.

>Publications<
Conference proceedings will be published in open-access digital formats. Past PACLIC proceeding papers have been indexed in Scopus (since PACLIC 19 in 2005) and listed in ACL Anthology. According to Google Scholar, PACLIC currently has an h5-index of 10 and the h5-median is 13.
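For readers unfamiliar with these two metrics: in Google Scholar's definition, the h5-index is the largest number h such that h articles published in the last five complete years have at least h citations each, and the h5-median is the median citation count of those h articles. A minimal sketch of the arithmetic follows; the function name and the citation counts are hypothetical and are not PACLIC data.

```python
from statistics import median
from typing import List, Tuple


def h5_metrics(citations: List[int]) -> Tuple[int, float]:
    """Return (h5-index, h5-median) for citation counts of papers from a five-year window."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers each have at least `rank` citations
        else:
            break
    core = ranked[:h]  # the papers that make up the h5-index
    return h, (median(core) if core else 0.0)


# Hypothetical citation counts, for illustration only.
print(h5_metrics([25, 18, 13, 9, 6, 5, 3, 1]))
```

Because the median is taken only over the papers that make up the index, the h5-median can never be lower than the h5-index.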

Posted on Feb. 20, 2018, 3:41 p.m.

8 May 2018, co-located with LREC
Call for Papers
Long and short paper submission deadline: January 5, 2018

The LiNCR (pronounced as ‘linker’) workshop aims to provide a venue to explore a new generation of language resources which link and aggregate cognitive, behavioral, and neuroimaging measurement data to a shared set of richly annotated linguistic data. The issues will include, but are not limited to, the ontology for aggregation of neuro-cognitive data with linguistic facts, how to interpret experimental data when linked to additional linguistic facts, how to design experiments that allow the same data sets to be shared across different experimental modalities, how to link and normalize data from subjects with special cognitive conditions to the norms, how to link and aggregate multilingual data, and stochastic solutions for data aggregation and learning. In addition to providing a forum for presenting existing LiNCRs as well as innovative research based on integrated heterogeneous datasets, we also welcome project notes and discussions that may address issues and challenges arising from new types of LiNCRs. The workshop will also include breakout forums for initial discussion to form consortia for future collaboration.

Posted on Nov. 1, 2017, 2:27 p.m.

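CLSW19 will be held from Saturday, May 26 to Monday, May 28, 2018 at National Chung Cheng University (中正大學), hosted by the university's Graduate Institute of Linguistics.
Papers presented at CLSW will, after post-conference revision, be included in the conference proceedings. The volume of selected English papers will be published by Springer (LNAI) and indexed in EI and SCOPUS. This year's workshop will present a best paper award, and will also recommend outstanding papers for publication in journals including Acta Scientiarum Naturalium Universitatis Pekinensis (北京大學學報(自然科學版)), Journal of Chinese Information Processing (中文信息學報), Applied Linguistics (語言文字應用), Journal of Zhengzhou University (Natural Science Edition) (鄭州大學學報(理學版)), Lingua Sinica, and International Journal of Knowledge and Language Processing (IKLP).
Conference dates: May 26-28, 2018
Submission deadline: December 31, 2017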

Posted on Oct. 19, 2017, 8:30 p.m.

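The Sixteenth China National Conference on Computational Linguistics (CCL 2017) will be held at Nanjing Normal University on October 13-15, 2017. CCL is the flagship conference of the Chinese Information Processing Society of China (CIPS), the largest association of natural language processing experts and scholars in China. It was held every two years from 1991 and has been held annually since 2013. Over more than 20 years of development, it has gained broad academic influence and become the most authoritative, best regarded, and largest academic conference in natural language processing in China (more than 600 registered participants in 2016). CCL focuses on the computational processing of the various languages within China and provides a high-level platform for in-depth exchange on the latest academic and technical results in computational linguistics.
CCL 2017 and NLP-NABD 2017 timeline:
• Paper submission deadline: June 10, 2017
• Notification of acceptance: July 10, 2017
• Final version due: July 25, 2017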

Posted on June 11, 2017, 6:34 a.m.

The 31st Pacific Asia Conference on Language, Information and Computation (PACLIC 31)
November 16-18, 2017
University of the Philippines Cebu
Cebu, Philippines

Organized by:
Linguistic Society of the Philippines
Computing Society of the Philippines – Special Interest Group on Natural Language Processing
University of the Philippines Cebu
National University

News
Paper Submission Deadline: May 15, 2017

Posted on April 5, 2017, 12:08 p.m.

CCL 2017 and NLP-NABD 2017 timeline:

• Paper submission deadline: June 1, 2017

• Notification of acceptance: July 1, 2017

• Final version due: July 15, 2017

Posted on March 28, 2017, 10:06 a.m.

The 18th Chinese Lexical Semantics Workshop (CLSW2017) will be held on May 18-20, 2017 at Leshan Normal University, Leshan, Sichuan, China.
This workshop series has been held in different Asia Pacific cities, including Hong Kong, Beijing, Taipei, Singapore, Xiamen, Hsin Chu, Yantai, Suzhou, Wuhan, Zhengzhou and Macao.

CLSW2017: 18 – 20 May 2017
Paper submission deadline: 22 January 2017
Email: clsw2017@163.com

Posted on Dec. 4, 2016, 4:13 a.m.

COBRA is concerned with the dynamics of conversational interactions between people at the interface between the linguistic, physiological, cognitive and brain levels.
The objective of COBRA is to train a new generation of young researchers to accurately characterize how speakers coordinate with each other in a conversation, with a focus on the mechanisms employed by speakers in the building up of shared meaning.

As part of the COBRA consortium, our project proposes collaboration with a European research network, in particular with the CNRS-LPL at Aix-Marseille University. This proposed European research network will stand at the crossroads of language, social interaction, cognition, and the human brain. The main goal of this project is to train our participants in interdisciplinary research competence, especially in the use of new technology to study linguistic and cognitive activities during conversation, as well as in multilingual research methodology in the context of international collaboration.

Posted on Nov. 21, 2016, 8:01 a.m.

The 55th annual meeting of the Association for Computational Linguistics (ACL) will take place in Vancouver, Canada. ACL 2017 will be held at the Westin Bayshore Hotel in downtown Vancouver from July 30th through August 4th, 2017.

As in previous years, the program of the conference includes poster sessions, tutorials, workshops, and demonstrations in addition to the main conference. ACL is the premier conference of the field of computational linguistics, covering a broad spectrum of diverse research areas that are concerned with computational approaches to natural language.
Important dates:
Submission Deadline (Long & Short Papers): Monday, February 6, 2017
Author Response Period: Monday – Wednesday, March 13 – 15, 2017
Notification of Acceptance: Thursday, March 30, 2017
Tutorials: Sunday, July 30, 2017
Main Conference: Monday – Wednesday, July 31 – August 2, 2017
Workshops: Thursday – Friday, August 3 – 4, 2017

Posted on Nov. 2, 2016, 1:54 p.m.

The Linguistics and Language Technology (LLT) Group will organize the 4th VariAmu Workshop on 6th-7th Jan. The introduction to the workshop is as follows:

Languages differ in individual variations, in historical changes, and in different socio-cultural contexts. Linguistic differences even show up in different neurocognitions in spite of the common belief that all human beings share the same language faculty.

Although it seems self-evident that any two languages, such as Chinese and French, are different from each other, there has never been a concerted effort to systematically map out how and why two languages differ.

Posted on Oct. 16, 2016, 5:24 p.m.

The 30th Pacific Asia Conference on Language, Information and Computation (PACLIC 30) will be held at Kyung Hee University in Seoul from Friday, 28 October to Sunday, 30 October 2016.

The conference is co-hosted by the Korea Society of Language and Information, Kyung Hee University Institute of Study of Language and Information, and KAIST.

The PACLIC series of conferences emphasizes the synergy of theoretical analysis and processing of language, and provides a forum for researchers in different fields of language study in the Pacific-Asia region to share their findings and interests in the formal and empirical study of languages.

Organized under the auspices of the PACLIC Steering Committee, PACLIC 30 will be the latest installment of our long-standing collaborative efforts among theoretical and computational linguists in the Pacific-Asia region.

Posted on Oct. 16, 2016, 5:21 p.m.