ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information
Zijun Sun, Xiaoya Li, Xiaofei Sun, Yuxian Meng, Xiang Ao, Qing He, Fei Wu and Jiwei Li


Shannon.AI, Zhejiang University, Key Lab of Intelligent Information Processing of Chinese Academy of Sciences

arXiv v1 [cs.CL], 30 Jun 2021. The code and pretrained models are publicly available. To appear at ACL 2021.

Abstract

Recent pretraining models in Chinese neglect two important aspects specific to the Chinese language: glyph and pinyin, which carry significant syntactic and semantic information for language understanding. In this work, we propose ChineseBERT, which incorporates both the glyph and pinyin information of Chinese characters into language model pretraining. The glyph embedding is obtained from different fonts of a Chinese character and is able to capture character semantics from visual features, while the pinyin embedding characterizes the pronunciation of Chinese characters and handles the highly prevalent heteronym phenomenon in Chinese (the same character having different pronunciations with different meanings). Pretrained on a large-scale unlabeled Chinese corpus, the proposed ChineseBERT model yields significant performance boosts over baseline models with fewer training steps. The proposed model achieves new SOTA performances on a wide range of Chinese NLP tasks, including machine reading comprehension, natural language inference, text classification and sentence pair matching, and competitive performances in named entity recognition and word segmentation.

1 Introduction

Large-scale pretrained models have become a fundamental backbone for various natural language processing tasks such as natural language understanding (Liu et al., 2019b), text classification (Reimers and Gurevych, 2019; Chai et al., 2020) and question answering (Clark and Gardner, 2017; Lewis et al., 2020). Apart from English NLP tasks, pretrained models have also demonstrated their effectiveness for various Chinese NLP tasks (Sun et al., 2019, 2020; Cui et al., 2019a, 2020).

Since pretraining models were originally designed for English, two important aspects specific to the Chinese language are missing in current large-scale pretraining: glyph-based information and pinyin-based information. For the former, a key aspect that distinguishes Chinese from languages such as English and German is that Chinese is a logographic language. The logograms of Chinese characters encode semantic information. For example, 液 (liquid), 河 (river) and 湖 (lake) all have the radical 氵 (water), which indicates that they are all related to water in semantics. Intuitively, the rich semantics behind Chinese character glyphs should enhance the expressiveness of Chinese NLP models. This idea has motivated a variety of work on learning and incorporating Chinese glyph information into neural models (Sun et al., 2014; Shi et al., 2015; Liu et al., 2017; Dai and Cai, 2017; Su and Lee, 2017; Meng et al., 2019), but not yet large-scale pretraining. For the latter, pinyin, the Romanized sequence of a Chinese character representing its pronunciation(s), is crucial in modeling both semantic and syntactic information that cannot be captured by contextualized or glyph embeddings. This aspect is especially important considering the highly prevalent heteronym phenomenon in Chinese, where the same character has multiple pronunciations, each of which is associated with a specific meaning.
Each pronunciation is associated with a specific pinyin expression. At the semantic level, for example, the Chinese character 乐 has two distinctly different pronunciations: it can be pronounced yuè [ɥe51], which means music, or lè [lɤ51], which means happy. (Among the 7,000 common characters in Chinese, about 700 have multiple pronunciations, according to the Contemporary Chinese Dictionary.)
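The heteronym phenomenon can be observed directly with the open-sourced pypinyin package, which is also used later in this paper to build pinyin sequences. The snippet below is only an illustration; the exact outputs depend on the pypinyin version and its dictionaries.

```python
# Illustration of the heteronym phenomenon with the pypinyin package.
# Outputs in the comments are what a recent pypinyin version is expected
# to produce; they may vary with the package version and its dictionaries.
from pypinyin import pinyin, Style

# All candidate pronunciations of a single character.
print(pinyin("乐", heteronym=True))      # e.g. [['lè', 'yuè', ...]]

# In context, the phrase dictionary picks the appropriate reading.
print(pinyin("音乐"))                    # [['yīn'], ['yuè']]  "music"
print(pinyin("快乐"))                    # [['kuài'], ['lè']]  "happy"

# Tones rendered as trailing digits, the format used for ChineseBERT-style
# pinyin sequences (e.g. "mao1" for 猫).
print(pinyin("猫", style=Style.TONE3))   # [['mao1']]
```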

On the syntax level, pronunciations help identify the part-of-speech of a character. For example, the character 还 has two pronunciations: huán [xwan35] and hái [xai35], where the former is the verb meaning return and the latter is the adverb meaning also. Different pronunciations of the same character cannot be distinguished by the glyph embedding, since the logogram is the same, or by the char-id embedding, since both point to the same character ID, but they can be characterized by pinyin.

In this work, we propose ChineseBERT, a model that incorporates the glyph and pinyin information of Chinese characters into the process of large-scale pretraining. The glyph embedding is based on different fonts of a Chinese character and is able to capture character semantics from the visual surface character forms. The pinyin embedding models the different semantic meanings that share the same character form and thus bypasses the limitation of interwound morphemes behind a single character. For a Chinese character, the glyph embedding, the pinyin embedding and the character embedding are combined to form a fusion embedding, which models the distinctive semantic properties of that character. With less training data and fewer training epochs, ChineseBERT achieves significant performance boosts over baselines across a wide range of Chinese NLP tasks. It achieves new SOTA performances on machine reading comprehension, natural language inference, text classification and sentence pair matching, and results comparable to SOTA performances in named entity recognition and word segmentation.

2 Related Work

2.1 Large-Scale Pretraining in NLP

Recent years have witnessed substantial work on large-scale pretraining in NLP. BERT (Devlin et al., 2018), which is built on top of the Transformer architecture (Vaswani et al., 2017), is pretrained on large-scale unlabeled text corpora with the Masked Language Model (MLM) and Next Sentence Prediction (NSP) objectives. Following this trend, considerable progress has been made by modifying the masking strategy (Yang et al., 2019; Joshi et al., 2020), the pretraining tasks (Liu et al., 2019a; Clark et al., 2020) or the model backbone (Lan et al., 2020; Lample et al., 2019; Choromanski et al., 2020). Specifically, RoBERTa (Liu et al., 2019b) proposed to remove the NSP pretraining task since it had been shown to offer no benefit for downstream performance. The GPT series (Radford et al., 2019; Brown et al., 2020) and other BERT variants (Lewis et al., 2019; Song et al., 2019; Lample and Conneau, 2019; Dong et al., 2019; Bao et al., 2020; Zhu et al., 2020) adapted the paradigm of large-scale unsupervised pretraining to text generation tasks such as machine translation, text summarization and dialog generation, so that generative models can also enjoy the benefit of large-scale pretraining.

Unlike English, Chinese has its own characteristics in terms of syntax, lexicon and pronunciation, and pretrained Chinese models should reflect these features. Li et al. (2019b) proposed to use the Chinese character as the basic unit instead of the word or subword units used in English (Wu et al., 2016; Sennrich et al., 2016). ERNIE (Sun et al., 2019, 2020) applied three types of masking strategies, char-level masking, phrase-level masking and entity-level masking, to enhance the ability to capture multi-granularity semantics. Cui et al. (2019a, 2020) pretrained models using the Whole Word Masking strategy, where all characters within a Chinese word are masked together.
In this way, the model learns to address a more challenging task than predicting word components. More recently, Zhang et al. (2020) developed the largest Chinese pretrained language model to date, CPM. It is pretrained on 100GB of Chinese data and has 2.6B parameters, comparable to the 2.7B-parameter version of GPT-3 (Brown et al., 2020). Xu et al. (2020) released CLUE, the first large-scale Chinese Language Understanding Evaluation benchmark, facilitating research in large-scale Chinese pretraining.

2.2 Learning Glyph Information

Learning glyph information from surface Chinese character forms has gained attention since the rise of deep neural networks. Inspired by word embeddings (Mikolov et al., 2013b,a), Sun et al. (2014); Shi et al. (2015); Li et al. (2015); Yin et al. (2016) used indexed radical embeddings to capture character semantics, improving model performances on a wide range of Chinese NLP tasks. Another way of incorporating glyph information is to view characters as images, so that glyph information can be learned naturally through image modeling. However, early attempts at learning visual features were not entirely successful. Liu et al. (2017); Shao et al. (2017); Zhang and LeCun (2017); Dai

and Cai (2017) used CNNs to extract glyph features from character images but did not achieve consistent performance boosts across all tasks. Su and Lee (2017); Tao et al. (2019) obtained positive results on the word analogy and word similarity tasks but did not further evaluate the learned glyph embeddings on more tasks. Meng et al. (2019) applied glyph embeddings to a broad array of Chinese tasks. They designed a specific CNN structure for character feature extraction and used image classification as an auxiliary objective to regularize the influence of a limited number of images. Song and Sehanobish (2020); Xuan et al. (2020) extended the idea of Meng et al. (2019) to the task of named entity recognition (NER), significantly improving performances over vanilla BERT models.

3 Model

[Figure 1: An overview of ChineseBERT. The fusion layer consumes three D-dimensional embeddings: char embedding, glyph embedding and pinyin embedding. The three embeddings are first concatenated, and then mapped to a D-dimensional embedding through a fully connected layer to form the fusion embedding.]

[Figure 2: An overview of inducing the glyph embedding. For each Chinese character, we use three types of fonts (FangSong, XingKai and LiShu), each rendered as an image. The images are concatenated into a tensor, which is flattened and passed to an FC layer to obtain the glyph embedding.]

[Figure 3: An overview of inducing the pinyin embedding. For any Chinese character, e.g. 猫 (cat) in this case, a CNN with width 2 is applied to the sequence of Romanized pinyin letters, followed by max-pooling to derive the final pinyin embedding.]

[Figure 4: An overview of the fusion layer. We concatenate the char embedding, the glyph embedding and the pinyin embedding, and use an FC layer with a learnable matrix W_F to induce the fusion embedding.]

3.1 Overview

Figure 1 shows an overview of the proposed ChineseBERT model. For each Chinese character, its char embedding, glyph embedding and pinyin embedding are first concatenated, and then mapped to a D-dimensional embedding through a fully connected layer to form the fusion embedding. The fusion embedding is then added to the position embedding, and the result is fed as input to the BERT model. Since we do not use the NSP pretraining task, we omit the segment embedding.
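To make the overview concrete, below is a minimal PyTorch sketch of the three component embeddings and the fusion layer as described in Figures 2-4 and Section 3.2. The sizes, class names and vocabulary handling are illustrative assumptions rather than the released ChineseBERT implementation; in particular, the glyph branch assumes pre-rendered and flattened font images, and the pinyin branch assumes fixed-length, pre-converted pinyin letter sequences.

```python
# A minimal sketch of ChineseBERT's fusion embedding (hypothetical shapes and names).
import torch
import torch.nn as nn

class GlyphEmbedding(nn.Module):
    """Flatten the stacked font images of a character and project with an FC layer."""
    def __init__(self, image_feat_dim: int = 2352, out_dim: int = 768):
        super().__init__()
        self.proj = nn.Linear(image_feat_dim, out_dim)

    def forward(self, font_images: torch.Tensor) -> torch.Tensor:
        # font_images: (batch, seq_len, image_feat_dim), the three fonts flattened together
        return self.proj(font_images)

class PinyinEmbedding(nn.Module):
    """Embed the Romanized pinyin letters, run a width-2 CNN, then max-pool."""
    def __init__(self, letter_vocab: int = 32, letter_dim: int = 128,
                 out_dim: int = 768, pinyin_len: int = 8):
        super().__init__()
        self.letter_emb = nn.Embedding(letter_vocab, letter_dim)   # letter_vocab is hypothetical
        self.conv = nn.Conv1d(letter_dim, out_dim, kernel_size=2)
        self.pinyin_len = pinyin_len

    def forward(self, pinyin_ids: torch.Tensor) -> torch.Tensor:
        # pinyin_ids: (batch, seq_len, pinyin_len) letter ids, padded with a special letter
        b, s, l = pinyin_ids.shape
        x = self.letter_emb(pinyin_ids.view(b * s, l))              # (b*s, l, letter_dim)
        x = self.conv(x.transpose(1, 2))                            # (b*s, out_dim, l-1)
        x = x.max(dim=-1).values                                    # max-pool over positions
        return x.view(b, s, -1)

class FusionEmbedding(nn.Module):
    """Concatenate char/glyph/pinyin embeddings and map 3D -> D with one FC layer."""
    def __init__(self, vocab_size: int = 21128, dim: int = 768, max_pos: int = 512):
        super().__init__()
        self.char_emb = nn.Embedding(vocab_size, dim)
        self.glyph = GlyphEmbedding(out_dim=dim)
        self.pinyin = PinyinEmbedding(out_dim=dim)
        self.fuse = nn.Linear(3 * dim, dim)                         # the learnable W_F
        self.pos_emb = nn.Embedding(max_pos, dim)

    def forward(self, char_ids, font_images, pinyin_ids):
        fused = self.fuse(torch.cat(
            [self.char_emb(char_ids), self.glyph(font_images), self.pinyin(pinyin_ids)],
            dim=-1))
        positions = torch.arange(char_ids.size(1), device=char_ids.device)
        return fused + self.pos_emb(positions)                      # fed into the BERT encoder
```

In this sketch, char_ids, font_images and pinyin_ids would come from a preprocessing step that looks up character ids, renders each character in the three fonts, and converts its pypinyin output (Roman letters plus a tone token, padded to length 8) into letter ids; that plumbing is assumed rather than shown.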
We use both Whole Word Masking (WWM) (Cui et al., 2019a) and Char Masking (CM) for pretraining (see Section 4.2 for details).

3.2 Input

The input to the model is the sum of the learnable absolute positional embedding and the fusion embedding, where the fusion embedding is based on the char embedding, the glyph embedding and the pinyin embedding of the corresponding character. The char embedding is analogous to the token embedding used in BERT but operates at the character granularity. Below we describe how the glyph embedding, the pinyin embedding and the fusion embedding are induced.

Glyph Embedding
We follow Meng et al. (2019) in using three types of Chinese fonts, FangSong, XingKai and LiShu, each of which is instantiated as an image with floating point pixel values ranging from 0 to 255. Different from Meng et al. (2019), which used CNNs to convert the images to representations, we use an FC layer. We first flatten the image tensor into a 2,352-dimensional vector. The flattened vector is fed to an FC layer to obtain the output glyph vector.

Pinyin Embedding
The pinyin embedding of each character is used to decouple the different semantic meanings belonging to the same character form, as shown in Figure 3. We use the open-sourced pypinyin package to generate pinyin sequences for the constituent characters of the input. pypinyin combines machine learning models with dictionary-based rules to infer the pinyin of a character given its context. The pinyin of a Chinese character is a sequence of Roman letters, with one of four diacritics denoting the tone. We use special tokens to denote tones, which are appended to the end of the Roman letter sequence. We apply a CNN with width 2 to the pinyin sequence, followed by max-pooling to derive the resulting pinyin embedding. This makes the output dimensionality immune to the length of the input pinyin sequence. The length of the input pinyin sequence is fixed at 8; when the actual pinyin sequence is shorter, the remaining slots are filled with a special letter "-".

Fusion Embedding
Once we have the char embedding, the glyph embedding and the pinyin embedding of a character, we concatenate them to form a 3D-dimensional vector. The fusion layer maps the 3D-dimensional vector to a D-dimensional one through a fully connected layer. The fusion embedding is added to the position embedding and fed to the BERT layers. An illustration is shown in Figure 4.

Output
The output is the corresponding contextualized representation of each input Chinese character (Devlin et al., 2018).

4 Pretraining Setup

4.1 Data

We collected our pretraining data from CommonCrawl. After pre-processing (such as removing documents with too much English text and filtering out HTML tags), about 10% of the data is retained as high-quality pretraining data, containing 4B Chinese characters in total. We use the LTP toolkit (Che et al., 2010) to identify the boundaries of Chinese words for whole word masking.

4.2 Masking Strategies

We use two masking strategies, Whole Word Masking (WWM) and Char Masking (CM), for ChineseBERT. Li et al. (2019b) suggested that using Chinese characters as the basic input unit can alleviate the out-of-vocabulary issue in Chinese. We thus adopt the method of masking random characters in the given context, denoted Char Masking. On the other hand, a large number of Chinese words consist of multiple characters, for which the CM strategy may be too easy for the model. For example, given the input context 我喜欢逛紫禁 [M] (I like going to the Forbidden [M]), the model can easily predict that the masked character is 城 (City). Hence, we follow Cui et al. (2019a) in using WWM, a strategy that masks out all characters within a selected word, mitigating this easy-prediction shortcoming of the CM strategy. Note that for both WWM and CM, the basic input unit is Chinese characters; the main difference lies in how characters are masked and how the model predicts them.
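As a rough illustration of the two strategies, the sketch below masks either whole words (using word boundaries such as those produced by a segmenter like LTP) or individual characters. The [M] placeholder, the boundary format and the 15% selection rate mirror the descriptions in this paper (the full masking probabilities are given in Section 4.3); the rest is a simplifying assumption.

```python
# A simplified sketch of Char Masking (CM) vs Whole Word Masking (WWM).
# Word boundaries are assumed to come from a word segmenter such as LTP.
import random

def char_masking(chars, mask_rate=0.15, mask_token="[M]"):
    """Mask randomly chosen characters, independently of word boundaries."""
    return [mask_token if random.random() < mask_rate else c for c in chars]

def whole_word_masking(words, mask_rate=0.15, mask_token="[M]"):
    """Select words, then mask every character inside each selected word."""
    out = []
    for word in words:                               # e.g. ["我", "喜欢", "逛", "紫禁城"]
        if random.random() < mask_rate:
            out.extend(mask_token for _ in word)     # all characters of the word
        else:
            out.extend(word)
    return out

random.seed(0)
words = ["我", "喜欢", "逛", "紫禁城"]               # produced by a word segmenter
chars = [c for w in words for c in w]
print(char_masking(chars))          # may leave partially masked words such as 紫禁[M]
print(whole_word_masking(words))    # masks 紫禁城 as a whole when the word is selected
```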
4.3 Pretraining Details

Different from Cui et al. (2019a), who pretrained their model based on the official pretrained Chinese BERT model, we train the ChineseBERT model from scratch. To encourage the model to learn both long-term and short-term dependencies, we alternate pretraining between packed input and single input, where the packed input is the concatenation of multiple sentences with a maximum length of 512, and the single input is a single sentence. We feed the packed input with probability 0.9 and the single input with probability 0.1.
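A rough sketch of this alternation between packed and single inputs is given below. The character-tokenized sentence representation, the 512-character limit and the 0.9/0.1 probabilities follow the description above; everything else (function names, corpus handling) is an illustrative assumption.

```python
# Sketch of alternating between packed input (concatenated sentences, up to 512
# characters) and single input (one sentence), as described above.
import random

MAX_LEN = 512
PACKED_PROB = 0.9   # probability of producing a packed input

def make_training_instance(sentences):
    """sentences: a sequence of character-tokenized sentences (lists of characters)."""
    if random.random() < PACKED_PROB:
        packed = []
        for sent in sentences:
            if len(packed) + len(sent) > MAX_LEN:
                break
            packed.extend(sent)                  # concatenate sentences up to the limit
        return packed
    return list(sentences[0])                    # a single sentence

random.seed(1)
corpus = [list("我喜欢逛紫禁城"), list("今天天气很好"), list("我们去公园")]
print("".join(make_training_instance(corpus)))
```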

We apply Whole Word Masking 90% of the time and Char Masking 10% of the time. The masking probability for each word/char is 15%. If the i-th word/char is chosen, we mask it 80% of the time, replace it with a random word/char 10% of the time and keep it unchanged 10% of the time. We also use the dynamic masking strategy to avoid duplicate training instances (Liu et al., 2019b).

We use two model setups, base and large, respectively consisting of 12/24 Transformer layers, with input dimensionality 768/1,024 and 12/16 heads per layer. This makes our models comparable to other BERT-style models in terms of model size. Upon submission of this paper, we have trained the base model for 500K steps with a maximum learning rate of 1e-4, warmup of 20K steps and a batch size of 3.2K, and the large model for 280K steps with a maximum learning rate of 3e-4, warmup of 90K steps and a batch size of 8K. After pretraining, the model can be directly finetuned on downstream tasks in the same way as BERT (Devlin et al., 2018).

5 Experiments

We conduct extensive experiments on a variety of Chinese NLP tasks. Models are separately finetuned on task-specific datasets for evaluation. Concretely, we use the following tasks: Machine Reading Comprehension (MRC), Natural Language Inference (NLI), Text Classification (TC), Sentence Pair Matching (SPM), Named Entity Recognition (NER) and Chinese Word Segmentation (CWS).

We compare ChineseBERT to the current state-of-the-art ERNIE (Sun et al., 2019, 2020), BERT-wwm (Cui et al., 2019a) and MacBERT (Cui et al., 2020) models. ERNIE adopts various masking strategies including token-level, phrase-level and entity-level masking to pretrain BERT on large-scale heterogeneous data. BERT-wwm/RoBERTa-wwm continues pretraining on top of the official pretrained Chinese BERT/RoBERTa models with the Whole Word Masking pretraining strategy. Unless otherwise specified, we use BERT/RoBERTa to denote BERT-wwm/RoBERTa-wwm and omit wwm. MacBERT improves upon RoBERTa by using the MLM-As-Correction (MAC) pretraining strategy as well as the sentence-order prediction (SOP) task. It is worth noting that BERT and BERT-wwm do not have large versions available online, so we omit the corresponding performances. A comparison of these models is shown in Table 1.

                 ERNIE          BERT-wwm   MacBERT        ChineseBERT
Data Source      Heterogeneous  Wikipedia  Heterogeneous  CommonCrawl
Vocab Size       18K            21K        21K            21K
Input Unit       Char           Char       Char           Char
Masking          T/P/E          WWM        WWM/N          WWM/CM
Task             MLM/NSP        MLM        MAC/SOP        MLM
Training Steps   -              2M         1M             1M
Init Checkpoint                 BERT       BERT           random
# Token                         0.4B       5.4B           5B

Table 1: Comparison of data statistics between ERNIE (Sun et al., 2019), BERT-wwm (Cui et al., 2019a), MacBERT (Cui et al., 2020) and our proposed ChineseBERT. T: Token, P: Phrase, E: Entity, WWM: Whole Word Masking, N: N-gram, CM: Char Masking, MLM: Masked Language Model, NSP: Next Sentence Prediction, MAC: MLM-As-Correction, SOP: Sentence Order Prediction.

It is worth noting that the number of training steps of the proposed model is significantly smaller than that of the baseline models. Different from BERT-wwm and MacBERT, which are initialized with pretrained BERT, the proposed model is initialized from scratch. Due to the additional glyph and pinyin embeddings, the proposed model cannot be directly initialized from a vanilla BERT model, as the model structures differ. Even though it is initialized from scratch, the proposed model is trained for fewer steps than BERT-wwm and MacBERT are further trained after their BERT initialization.
5.1 Machine Reading Comprehension

Machine reading comprehension tests a model's ability to answer questions based on given contexts. We use two datasets for this task: CMRC 2018 (Cui et al., 2019b) and CJRC (Duan et al., 2019). CMRC 2018 is a span-extraction style dataset, while CJRC additionally contains yes/no questions and no-answer questions. CMRC 2018 and CJRC respectively contain 10K/3.2K/4.9K and 39K/6K/6K instances for training/dev/test. Test results for CMRC 2018 are evaluated on the CLUE leaderboard. Note that the CJRC dataset is different from the one used in Cui et al. (2019a), as Cui et al. (2019a) did not release their train/dev/test split; we thus run their released models on the CJRC split used in this work for comparison. Results are shown in Table 2 and Table 3.
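Tables 2 and 3 report Exact Match (EM) and span-level F1. For readers unfamiliar with these MRC metrics, the sketch below shows one common way to compute them for a single predicted answer span; it is an illustration, not the official CMRC/CJRC evaluation script, and the character-level tokenization is an assumption.

```python
# Illustrative EM and token-level F1 for span-extraction MRC.
from collections import Counter

def exact_match(prediction: str, gold: str) -> float:
    """EM is 1.0 only when the predicted span equals the gold span exactly."""
    return float(prediction.strip() == gold.strip())

def span_f1(prediction: str, gold: str) -> float:
    """Token-level F1 between predicted and gold spans.
    Each Chinese character is treated as a token (an assumption;
    official scripts may tokenize differently)."""
    pred_tokens = list(prediction.strip())
    gold_tokens = list(gold.strip())
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# A prediction that overlaps the gold answer but misses its exact boundary gets
# partial credit from F1 and no credit from EM, which is why a larger EM gain
# indicates better detection of exact answer spans.
print(exact_match("紫禁城", "紫禁城"), span_f1("紫禁城", "紫禁城"))    # 1.0 1.0
print(exact_match("逛紫禁城", "紫禁城"), span_f1("逛紫禁城", "紫禁城"))  # 0.0 ~0.857
```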

[Table 2: Performances of different models on CMRC. EM is reported for comparison. Models pretrained on extended data are marked.]

[Table 3: Performances of different models on the MRC dataset CJRC (EM and F1 on dev and test). Results for baseline models are obtained with their released models. Models pretrained on extended data are marked.]

As we can see, ChineseBERT yields a significant performance boost on both datasets, and the improvement in EM is larger than that in F1 on the CJRC dataset, which indicates that ChineseBERT is better at detecting exact answer spans.

5.2 Natural Language Inference (NLI)

The goal of NLI is to determine the entailment relationship between a hypothesis and a premise. We use the Cross-lingual Natural Language Inference (XNLI) dataset (Conneau et al., 2018) for evaluation. The corpus is a crowd-sourced collection of 5K test and 2.5K dev pairs for the MultiNLI corpus. Each sentence pair is annotated with an entailment, neutral or contradiction label. We use the official machine-translated Chinese data for training.

[Table 4: Performances of different models on XNLI. Accuracy is reported for comparison. Models pretrained on extended data are marked.]

Results are presented in Table 4, which shows that ChineseBERT achieves the best performance for both the base and large setups.

5.3 Text Classification (TC)

In text classification, the model is required to categorize a piece of text into one of several specified classes. We follow Cui et al. (2019a) in using THUCNews (Li and Sun, 2007) and ChnSentiCorp for this task. THUCNews is a subset of THUCTC, with 50K/5K/10K data points for training/dev/test, evenly distributed over 10 domains including sports, finance, etc. ChnSentiCorp is a binary sentiment classification dataset containing 9.6K/1.2K/1.2K data points for training/dev/test. The two datasets are relatively simple, with vanilla BERT achieving an accuracy above 95%. Hence, apart from THUCNews and ChnSentiCorp, we also use TNEWS, a more difficult dataset included in the CLUE benchmark (Xu et al., 2020). TNEWS is a 15-class short news text classification dataset with 53K/10K/10K data points for training/dev/test. Results are shown in Table 5. On ChnSentiCorp and THUCNews, the improvement from ChineseBERT is marginal, as baselines already achieve quite high results on these two datasets. On the TNEWS dataset, ChineseBERT outperforms all other models. The ERNIE model performs only slightly worse than ChineseBERT; this is because ERNIE is trained on additional web data, which is beneficial for modeling web news text covering a wide range of domains.

[Table 5: Performances of different models on the TC datasets ChnSentiCorp, THUCNews and TNEWS. The results on TNEWS are taken from the CLUE paper (Xu et al., 2020). Accuracy is reported for comparison. Models pretrained on extended data are marked.]

[Table 6: Performances of different models on the SPM datasets LCQMC and BQ Corpus. Accuracy is reported for comparison. Models pretrained on extended data are marked.]

5.4 Sentence Pair Matching (SPM)

For SPM, the model is asked to determine whether a given sentence pair expresses the same semantics. We use the LCQMC (Liu et al., 2018) and BQ Corpus (Chen et al., 2018) datasets for evaluation. LCQMC is a large-scale Chinese question matching corpus for judging whether two given questions have the same intent, with 23.9K/8.8K/12.5K sentence pairs for training/dev/test. BQ Corpus is another large-scale Chinese dataset containing 100K/10K/10K sentence pairs for training/dev/test. Results are shown in Table 6. We can see that ChineseBERT generally outperforms MacBERT on LCQMC but slightly underperforms BERT-wwm on BQ Corpus. We hypothesize this is because the domain of BQ Corpus fits the pretraining data of BERT-wwm better than that of ChineseBERT.

[Table 7: Performances of different models on the NER datasets OntoNotes 4.0 and Weibo. Precision (P), recall (R) and F1 (F) on the test sets are reported for comparison.]

5.5 Named Entity Recognition (NER)

For NER tasks (Chiu and Nichols, 2016; Lample et al., 2016; Li et al., 2019a), the model is asked to identify named entities within a piece of text, which is formalized as a sequence labeling task. We use OntoNotes 4.0 (Weischedel et al., 2011) and Weibo NER (Peng and Dredze, 2015) for this task. OntoNotes has 18 named entity types and Weibo has 4. OntoNotes and Weibo respectively contain 15K/4K/4K and 1,350/270/270 instances for training/dev/test. Results are shown in Table 7. As we can see, ChineseBERT significantly outperforms BERT and RoBERTa in terms of F1. In spite of a slight loss in precision for the base version, the gains in recall are particularly large, leading to an overall boost in F1.

5.6 Chinese Word Segmentation

The task divides text into words and is formalized as a character-based sequence labeling task. We use the PKU and MSRA datasets for Chinese word segmentation. PKU consists of 19K/2K sentences for training and test, and MSRA consists of 87K/4K sentences for training and test. The output character representation is fed to a softmax layer for the final predictions. Results are shown in Table 8, where we can see that ChineseBERT outperforms BERT-wwm and RoBERTa-wwm on both datasets and on both metrics.

[Table 8: Performances of different models on the CWS datasets MSRA and PKU. F1 and accuracy (Acc) are reported for comparison. Models pretrained on extended data are marked.]

6 Ablation Studies

In this section, we conduct ablation studies to understand the behaviors of ChineseBERT. We use the Chinese named entity recognition dataset OntoNotes 4.0 for analysis, and all models are based on the base version.
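The ablation settings examined below only change which component embeddings enter the fusion layer. A hypothetical way to express them, on top of the fusion sketch from Section 3, is shown here; the flag names and configuration class are ours, not from the released code.

```python
# Hypothetical configuration of the ablation variants: the fusion layer simply
# concatenates fewer component embeddings (flag names are illustrative).
from dataclasses import dataclass

@dataclass
class FusionConfig:
    use_glyph: bool = True
    use_pinyin: bool = True
    dim: int = 768

    @property
    def concat_dim(self) -> int:
        # The char embedding is always used; glyph and pinyin are optional.
        return self.dim * (1 + int(self.use_glyph) + int(self.use_pinyin))

ABLATIONS = {
    "ChineseBERT":   FusionConfig(True, True),
    "-glyph":        FusionConfig(False, True),    # pinyin + char-id only
    "-pinyin":       FusionConfig(True, False),    # glyph + char-id only
    "-glyph-pinyin": FusionConfig(False, False),   # char-id only, i.e. RoBERTa-like
}

for name, cfg in ABLATIONS.items():
    print(f"{name:>14}: concatenated dim before W_F = {cfg.concat_dim}")
```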

[Table 9: Performances (precision, recall and F1 on OntoNotes 4.0) for models without glyph or pinyin information. Relative to the full ChineseBERT model, -Glyph drops F1 by 1.52, -Pinyin by 1.17, and -Glyph-Pinyin by 1.89.]

6.1 The Effect of Glyph Embeddings and Pinyin Embeddings

We would like to explore the effects of glyph embeddings and pinyin embeddings. For a fair comparison, we pretrained the different models on the same dataset, with the same number of training steps and the same model size. The setups include -glyph, where glyph embeddings are removed and only pinyin and char-id embeddings are used; -pinyin, where pinyin embeddings are removed and only glyph and char-id embeddings are used; and -glyph-pinyin, where only char-id embeddings are used and the model degenerates to RoBERTa. We finetune the different models on the OntoNotes NER dataset for comparison. Results are shown in Table 9. As can be seen, removing either glyph embeddings or pinyin embeddings results in performance degradation, and removing both has the greatest negative impact on F1, a drop of about 2 points. This validates the importance of both pinyin and glyph embeddings for modeling Chinese semantics. The reason why -glyph-pinyin performs worse than RoBERTa is that the model used here is trained on less data and for a smaller number of training steps.

6.2 The Effect of Training Data Size

We hypothesize that glyph and pinyin embeddings also serve as a strong regularization over text semantics, which means that the proposed ChineseBERT model should be able to perform better with less training data. We randomly sample 10%-90% of the training data while maintaining the ratio of samples with entities to samples without entities. We run each experiment five times and report the average F1 value on the test set.

[Figure 5: Test F1-score against training size for BERT, RoBERTa and ChineseBERT (performances when varying the training size).]

Figure 5 shows the results. As can be seen, ChineseBERT performs better across all setups. With less than 30% of the training data, the improvement of ChineseBERT is slight, but with over 30% of the training data the improvement becomes larger. This is because ChineseBERT still requires sufficient training data to fully train the glyph and pinyin embeddings, and insufficient training data leads to inadequate training.

7 Conclusion

In this paper, we introduce ChineseBERT, a large-scale pretrained Chinese NLP model. It leverages the glyph and pinyin information of Chinese characters to enhance the model's ability to capture context semantics from surface character forms and to disambiguate polyphonic characters in Chinese. The proposed ChineseBERT model achieves significant performance boosts across a wide range of Chinese NLP tasks. It also performs better than vanilla pretrained models with less training data, indicating that the introduced glyph embeddings and pinyin embeddings serve

as a strong regularizer for semantic modeling in Chinese. Future work involves training a larger version of ChineseBERT.

Acknowledgement

This work is supported by the Key-Area Research and Development Program of Guangdong Province (No. 2019B ). We also want to acknowledge the National Key R&D Program of China (2020AAA ) and the Beijing Academy of Artificial Intelligence (BAAI).

References

Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, and Hsiao-Wuen Hon. Unilmv2: Pseudo-masked language models for unified language model pre-training. Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language models are few-shot learners. Duo Chai, Wei Wu, Qinghong Han, Fei Wu, and Jiwei Li. Description based text classification with reinforcement learning. Wanxiang Che, Zhenghua Li, and Ting Liu. LTP: A Chinese language technology platform. In Coling 2010: Demonstrations, pages 13-16, Beijing, China. Coling 2010 Organizing Committee. Jing Chen, Qingcai Chen, Xin Liu, Haijun Yang, Daohe Lu, and Buzhou Tang. The BQ corpus: A large-scale domain-specific Chinese corpus for sentence semantic equivalence identification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages , Brussels, Belgium. Association for Computational Linguistics. Jason PC Chiu and Eric Nichols. Named entity recognition with bidirectional lstm-cnns. Transactions of the Association for Computational Linguistics, 4: Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamás Sarlós, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, and Adrian Weller. Rethinking attention with performers. CoRR. Christopher Clark and Matt Gardner. Simple and effective multi-paragraph reading comprehension. arxiv preprint arxiv: Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. Electra: Pretraining text encoders as discriminators rather than generators. In International Conference on Learning Representations. Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R Bowman, Holger Schwenk, and Veselin Stoyanov. Xnli: Evaluating crosslingual sentence representations. arxiv preprint arxiv: Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, and Guoping Hu. Revisiting pretrained models for chinese natural language processing. arxiv preprint arxiv: Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, and Guoping Hu. 2019a. Pre-training with whole word masking for chinese bert. arxiv preprint arxiv: Yiming Cui, Ting Liu, Wanxiang Che, Li Xiao, Zhipeng Chen, Wentao Ma, Shijin Wang, and Guoping Hu. 2019b. A span-extraction dataset for Chinese machine reading comprehension. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages , Hong Kong, China. Association for Computational Linguistics. Falcon Z Dai and Zheng Cai. Glyph-aware embedding of chinese characters.
In Proceedings of the First Workshop on Subword and Character Level Models in NLP, Copenhagen, Denmark, September 7, 2017, pages Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova Bert: Pre-training of deep bidirectional transformers for language understanding. arxiv preprint arxiv: Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, and Hsiao-Wuen Hon Unified language model pre-training for natural language understanding and generation. In Advances in Neural Information Processing Systems, volume 32, pages Curran Associates, Inc. Xingyi Duan, Baoxin Wang, Ziyue Wang, Wentao Ma, Yiming Cui, Dayong Wu, Shijin Wang, Ting Liu, Tianxiang Huo, Zhen Hu, and et al Cjrc: A reliable human-annotated benchmark dataset for chinese judicial reading comprehension. Chinese Computational Linguistics, page Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S Weld, Luke Zettlemoyer, and Omer Levy Spanbert:

10 Improving pre-training by representing and predicting spans. Transactions of the Association for Computational Linguistics, 8: Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer Neural architectures for named entity recognition. arxiv preprint arxiv: Guillaume Lample and Alexis Conneau Crosslingual language model pretraining. Advances in Neural Information Processing Systems (NeurIPS). Guillaume Lample, Alexandre Sablayrolles, Marc Aurelio Ranzato, Ludovic Denoyer, and Hervé Jégou memory layers with product keys. Advances in Neural Information Processing Systems (NeurIPS). Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut Albert: A lite bert for self-supervised learning of language representations. In International Conference on Learning Representations. Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arxiv preprint arxiv: Patrick Lewis, Ethan Perez, Aleksandara Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al Retrieval-augmented generation for knowledge-intensive nlp tasks. arxiv preprint arxiv: Jingyang Li and Maosong Sun Scalable term selection for text categorization. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages Xiaoya Li, Jingrong Feng, Yuxian Meng, Qinghong Han, Fei Wu, and Jiwei Li. 2019a. A unified mrc framework for named entity recognition. arxiv preprint arxiv: Xiaoya Li, Yuxian Meng, Xiaofei Sun, Qinghong Han, Arianna Yuan, and Jiwei Li. 2019b. Is word segmentation necessary for deep learning of Chinese representations? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages , Florence, Italy. Association for Computational Linguistics. Yanran Li, Wenjie Li, Fei Sun, and Sujian Li Component-enhanced chinese character embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, pages Frederick Liu, Han Lu, Chieh Lo, and Graham Neubig Learning character-level compositionality with visual features. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers, pages Xiaodong Liu, Pengcheng He, Weizhu Chen, and Jianfeng Gao. 2019a. Multi-task deep neural networks for natural language understanding. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages , Florence, Italy. Association for Computational Linguistics. Xin Liu, Qingcai Chen, Chong Deng, Huajun Zeng, Jing Chen, Dongfang Li, and Buzhou Tang Lcqmc: A large-scale chinese question matching corpus. In Proceedings of the 27th International Conference on Computational Linguistics, pages Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019b. Roberta: A robustly optimized bert pretraining approach. arxiv preprint arxiv: Yuxian Meng, Wei Wu, Fei Wang, Xiaoya Li, Ping Nie, Fan Yin, Muyu Li, Qinghong Han, Xiaofei Sun, and Jiwei Li Glyce: Glyph-vectors for chinese character representations. 
In Advances in Neural Information Processing Systems, volume 32, pages Curran Associates, Inc. Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013a. Efficient estimation of word representations in vector space. arxiv preprint arxiv: Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013b. Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems, 26: Nanyun Peng and Mark Dredze Named entity recognition for chinese social media with jointly trained embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever Language models are unsupervised multitask learners. OpenAI Blog, 1(8). Nils Reimers and Iryna Gurevych Sentencebert: Sentence embeddings using siamese bertnetworks. arxiv preprint arxiv: Rico Sennrich, Barry Haddow, and Alexandra Birch Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational

11 Linguistics (Volume 1: Long Papers), pages , Berlin, Germany. Association for Computational Linguistics. Yan Shao, Christian Hardmeier, Jörg Tiedemann, and Joakim Nivre Character-based joint segmentation and pos tagging for chinese using bidirectional rnn-crf. In Proceedings of the Eighth International Joint Conference on Natural Language Processing, IJCNLP 2017, Taipei, Taiwan, November 27 - December 1, Volume 1: Long Papers, pages Xinlei Shi, Junjie Zhai, Xudong Yang, Zehua Xie, and Chao Liu Radical embedding: Delving deeper to Chinese radicals. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages , Beijing, China. Association for Computational Linguistics. Chan Hee Song and Arijit Sehanobish Using chinese glyphs for named entity recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 34(10): Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, and Tie- Yan Liu Mass: Masked sequence to sequence pre-training for language generation. In International Conference on Machine Learning, pages Tzu-Ray Su and Hung-Yi Lee Learning chinese word representations from glyphs of characters. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, September 9-11, 2017, pages Yaming Sun, Lei Lin, Nan Yang, Zhenzhou Ji, and Xiaolong Wang Radical-enhanced chinese character embedding. In International Conference on Neural Information Processing, pages Springer. Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, and Hua Wu Ernie: Enhanced representation through knowledge integration. arxiv preprint arxiv: Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu, and Haifeng Wang Ernie 2.0: A continual pre-training framework for language understanding. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05): Hanqing Tao, Shiwei Tong, Tong Xu, Qi Liu, and Enhong Chen Chinese embedding via stroke and glyph information: A dual-channel view. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin Attention is all you need. In Advances in neural information processing systems, pages Ralph Weischedel, Sameer Pradhan, Lance Ramshaw, Martha Palmer, Nianwen Xue, Mitchell Marcus, Ann Taylor, Craig Greenberg, Eduard Hovy, Robert Belvin, et al Ontonotes release 4.0. LDC2011T03, Philadelphia, Penn.: Linguistic Data Consortium. Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, and Jeffrey Dean Google s neural machine translation system: Bridging the gap between human and machine translation. 
Liang Xu, Hai Hu, Xuanwei Zhang, Lu Li, Chenjie Cao, Yudong Li, Yechen Xu, Kai Sun, Dian Yu, Cong Yu, Yin Tian, Qianqian Dong, Weitang Liu, Bo Shi, Yiming Cui, Junyi Li, Jun Zeng, Rongzhao Wang, Weijian Xie, Yanting Li, Yina Patterson, Zuoyu Tian, Yiwen Zhang, He Zhou, Shaoweihua Liu, Zhe Zhao, Qipeng Zhao, Cong Yue, Xinrui Zhang, Zhengliang Yang, Kyle Richardson, and Zhenzhong Lan CLUE: A Chinese language understanding evaluation benchmark. In Proceedings of the 28th International Conference on Computational Linguistics, pages , Barcelona, Spain (Online). International Committee on Computational Linguistics. Zhenyu Xuan, Rui Bao, and Shengyi Jiang Fgn: Fusion glyph network for chinese named entity recognition. Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R Salakhutdinov, and Quoc V Le Xlnet: Generalized autoregressive pretraining for language understanding. In Advances in neural information processing systems, pages Rongchao Yin, Quan Wang, Peng Li, Rui Li, and Bin Wang Multi-granularity chinese word embedding. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages Xiang Zhang and Yann LeCun Which encoding is the best for text classification in chinese, english, japanese and korean? arxiv preprint arxiv: Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, and Maosong Sun Cpm: A large-scale generative chinese pre-trained language model.

Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, and Tieyan Liu Incorporating bert into neural machine translation. In International Conference on Learning Representations.


More information

2009 Japanese First Language Written examination

2009 Japanese First Language Written examination Victorian Certificate of Education 2009 SUPERVISOR TO ATTACH PROCESSING LABEL HERE STUDENT NUMBER Letter Figures Words JAPANESE FIRST LANGUAGE Written examination Monday 16 November 2009 Reading time:

More information

* CUSUM EWMA PCA TS79 A DOI /j. issn X Incipient Fault Detection in Papermaking Wa

* CUSUM EWMA PCA TS79 A DOI /j. issn X Incipient Fault Detection in Papermaking Wa 2 *. 20037 2. 50640 CUSUM EWMA PCA TS79 A DOI 0. 980 /j. issn. 0254-508X. 207. 08. 004 Incipient Fault Detection in Papermaking Wastewater Treatment Processes WANG Ling-song MA Pu-fan YE Feng-ying XIONG

More information

现代汉语语料库基本加工规格说明书

现代汉语语料库基本加工规格说明书 TP391 The Basic Processing of Contemporary Chinese Corpus at Peking University SPECIFICATION YU Shi-wen DUAN Hui-ming ZHU Xue-feng Bing SWEN (Institute of Computational Linguistics, Peking University,

More information

Microsoft Word - A200810-897.doc

Microsoft Word - A200810-897.doc 基 于 胜 任 特 征 模 型 的 结 构 化 面 试 信 度 和 效 度 验 证 张 玮 北 京 邮 电 大 学 经 济 管 理 学 院, 北 京 (100876) E-mail: weeo1984@sina.com 摘 要 : 提 高 结 构 化 面 试 信 度 和 效 度 是 面 试 技 术 研 究 的 核 心 内 容 近 年 来 国 内 有 少 数 学 者 探 讨 过 基 于 胜 任 特 征

More information

yòu xù 373 375 xiá : guà jué qi n mi o dú k ng tóng luán xié háng yè jiào k n z z n shèn chì x 1óng l n t n kuáng qi q ch qì yì yùn yo q w zhuàn sù yí qìng hé p suì x tán cuàn mi o jù yú qìng shì sh

More information

Dan Buettner / /

Dan Buettner / / 39 1 2015 1 Vol. 39 No. 1 January 2015 74 Population Research 80 + /60 + 90 + 90 + 0 80 100028 Measuring and Comparing Population Longevity Level across the Regions of the World Lin Bao Abstract Appropriate

More information

08陈会广

08陈会广 第 34 卷 第 10 期 2012 年 10 月 2012,34(10):1871-1880 Resources Science Vol.34,No.10 Oct.,2012 文 章 编 号 :1007-7588(2012)10-1871-10 房 地 产 市 场 及 其 细 分 的 调 控 重 点 区 域 划 分 理 论 与 实 证 以 中 国 35 个 大 中 城 市 为 例 陈 会 广 1,

More information

θ 1 = φ n -n 2 2 n AR n φ i = 0 1 = a t - θ θ m a t-m 3 3 m MA m 1. 2 ρ k = R k /R 0 5 Akaike ρ k 1 AIC = n ln δ 2

θ 1 = φ n -n 2 2 n AR n φ i = 0 1 = a t - θ θ m a t-m 3 3 m MA m 1. 2 ρ k = R k /R 0 5 Akaike ρ k 1 AIC = n ln δ 2 35 2 2012 2 GEOMATICS & SPATIAL INFORMATION TECHNOLOGY Vol. 35 No. 2 Feb. 2012 1 2 3 4 1. 450008 2. 450005 3. 450008 4. 572000 20 J 101 20 ARMA TU196 B 1672-5867 2012 02-0213 - 04 Application of Time Series

More information

Microsoft Word - 論文封面-980103修.doc

Microsoft Word - 論文封面-980103修.doc 淡 江 大 學 中 國 文 學 學 系 碩 士 在 職 專 班 碩 士 論 文 指 導 教 授 : 呂 正 惠 蘇 敏 逸 博 士 博 士 倚 天 屠 龍 記 愛 情 敘 事 之 研 究 研 究 生 : 陳 麗 淑 撰 中 華 民 國 98 年 1 月 淡 江 大 學 研 究 生 中 文 論 文 提 要 論 文 名 稱 : 倚 天 屠 龍 記 愛 情 敘 事 之 研 究 頁 數 :128 校 系 (

More information

PCA+LDA 14 1 PEN mL mL mL 16 DJX-AB DJ X AB DJ2 -YS % PEN

PCA+LDA 14 1 PEN mL mL mL 16 DJX-AB DJ X AB DJ2 -YS % PEN 21 11 2011 11 COMPUTER TECHNOLOGY AND DEVELOPMENT Vol. 21 No. 11 Nov. 2011 510006 PEN3 5 PCA + PCA+LDA 5 5 100% TP301 A 1673-629X 2011 11-0177-05 Application of Electronic Nose in Discrimination of Different

More information

現代學術之建立 陳平 998 7-3-3592-6 美學十五講 淩繼堯 美學 23 7-3-643-4 論集 徐複觀 書店出版社 的方位 陳寶生 宣傳 敦煌文藝出版社 論集續篇 徐複觀 書店出版社 莊子哲學 王博 道家 7-3-666-3 的天方學 沙宗平 伊斯蘭教 7-3-6844- 周易 經傳十

現代學術之建立 陳平 998 7-3-3592-6 美學十五講 淩繼堯 美學 23 7-3-643-4 論集 徐複觀 書店出版社 的方位 陳寶生 宣傳 敦煌文藝出版社 論集續篇 徐複觀 書店出版社 莊子哲學 王博 道家 7-3-666-3 的天方學 沙宗平 伊斯蘭教 7-3-6844- 周易 經傳十 東西方比較研究 範明生, 陳超南 物流發展報告 物流與採購聯合會 物流發展報告 物流與採購聯合會 物流發展報告 丁俊發 唯物史觀與歷史科學 地理學 社會科學院出版 23 23 物流 研究報告 2 物資出版社 22 7-547-88-5 物流 物資出版社 7-547-22-3 龐卓恒 歷史唯物主義 高等教育出版社 7-4-4333-X 周尚意, 孔翔, 朱竑 地理學 高等教育出版社 7-4-446-

More information

2010 Japanese First Language Written examination

2010 Japanese First Language Written examination Victorian Certificate of Education 2010 SUPERVISOR TO ATTACH PROCESSING LABEL HERE STUDENT NUMBER Letter Figures Words JAPANESE FIRST LANGUAGE Written examination Monday 15 November 2010 Reading time:

More information

jiàn shí

jiàn shí jiàn shí hào x n càn w i huàng ji zhèn yù yàng chèn yù bì yuàn ji ng cóng (11) qiàn xué 1 yì bì èi zhé mó yù ù chái sè bá píng sh chài y l guàn ch n shì qí fú luè yáo d n zèn x yì yù jù zhèn

More information

SVM OA 1 SVM MLP Tab 1 1 Drug feature data quantization table

SVM OA 1 SVM MLP Tab 1 1 Drug feature data quantization table 38 2 2010 4 Journal of Fuzhou University Natural Science Vol 38 No 2 Apr 2010 1000-2243 2010 02-0213 - 06 MLP SVM 1 1 2 1 350108 2 350108 MIP SVM OA MLP - SVM TP391 72 A Research of dialectical classification

More information

~ ~ ~

~ ~ ~ 33 4 2014 467 478 Studies in the History of Natural Sciences Vol. 33 No. 4 2014 030006 20 20 N092 O6-092 A 1000-1224 2014 04-0467-12 200 13 Roger Bacon 1214 ~ 1292 14 Berthold Schwarz 20 Luther Carrington

More information

2005 5,,,,,,,,,,,,,,,,, , , 2174, 7014 %, % 4, 1961, ,30, 30,, 4,1976,627,,,,, 3 (1993,12 ),, 2

2005 5,,,,,,,,,,,,,,,,, , , 2174, 7014 %, % 4, 1961, ,30, 30,, 4,1976,627,,,,, 3 (1993,12 ),, 2 3,,,,,, 1872,,,, 3 2004 ( 04BZS030),, 1 2005 5,,,,,,,,,,,,,,,,, 1928 716,1935 6 2682 1928 2 1935 6 1966, 2174, 7014 %, 94137 % 4, 1961, 59 1929,30, 30,, 4,1976,627,,,,, 3 (1993,12 ),, 2 , :,,,, :,,,,,,

More information

NCCC Swim Team James Logan High school - 8/5/2018 Results - Adult Event 102 Mixed Yard Medley Relay Team Relay Finals Time 1 Beida-Zhonglian

NCCC Swim Team James Logan High school - 8/5/2018 Results - Adult Event 102 Mixed Yard Medley Relay Team Relay Finals Time 1 Beida-Zhonglian NCCC Swim Team James Logan High school - 8/5/2018 Results - Adult Event 102 Mixed 60-69 200 Yard Medley Relay 1 Beida-Zhonglian B 3:11.63 2 Beida-Zhonglian A 3:18.29 HY-TEK's MEET MANAGER Event 103 Mixed

More information

STEAM STEAM STEAM ( ) STEAM STEAM ( ) 1977 [13] [10] STEM STEM 2. [11] [14] ( )STEAM [15] [16] STEAM [12] ( ) STEAM STEAM [17] STEAM STEAM STEA

STEAM STEAM STEAM ( ) STEAM STEAM ( ) 1977 [13] [10] STEM STEM 2. [11] [14] ( )STEAM [15] [16] STEAM [12] ( ) STEAM STEAM [17] STEAM STEAM STEA 2017 8 ( 292 ) DOI:10.13811/j.cnki.eer.2017.08.017 STEAM 1 1 2 3 4 (1. 130117; 2. + 130117; 3. 130022;4. 518100) [ ] 21 STEAM STEAM STEAM STEAM STEAM STEAM [ ] STEAM ; ; [ ] G434 [ ] A [ ] (1970 ) E-mail:ddzhou@nenu.edu.cn

More information

國立中山大學學位論文典藏.PDF

國立中山大學學位論文典藏.PDF I II III The Study of Factors to the Failure or Success of Applying to Holding International Sport Games Abstract For years, holding international sport games has been Taiwan s goal and we are on the way

More information

致 谢 本 论 文 能 得 以 完 成, 首 先 要 感 谢 我 的 导 师 胡 曙 中 教 授 正 是 他 的 悉 心 指 导 和 关 怀 下, 我 才 能 够 最 终 选 定 了 研 究 方 向, 确 定 了 论 文 题 目, 并 逐 步 深 化 了 对 研 究 课 题 的 认 识, 从 而 一

致 谢 本 论 文 能 得 以 完 成, 首 先 要 感 谢 我 的 导 师 胡 曙 中 教 授 正 是 他 的 悉 心 指 导 和 关 怀 下, 我 才 能 够 最 终 选 定 了 研 究 方 向, 确 定 了 论 文 题 目, 并 逐 步 深 化 了 对 研 究 课 题 的 认 识, 从 而 一 中 美 国 际 新 闻 的 叙 事 学 比 较 分 析 以 英 伊 水 兵 事 件 为 例 A Comparative Analysis on Narration of Sino-US International News Case Study:UK-Iran Marine Issue 姓 名 : 李 英 专 业 : 新 闻 学 学 号 : 05390 指 导 老 师 : 胡 曙 中 教 授 上 海

More information

2015 Chinese FL Written examination

2015 Chinese FL Written examination Victorian Certificate of Education 2015 SUPERVISOR TO ATTACH PROCESSING LABEL HERE Letter STUDENT NUMBER CHINESE FIRST LANGUAGE Written examination Monday 16 November 2015 Reading time: 11.45 am to 12.00

More information

致 谢 本 人 自 2008 年 6 月 从 上 海 外 国 语 大 学 毕 业 之 后, 于 2010 年 3 月 再 次 进 入 上 外, 非 常 有 幸 成 为 汉 语 国 际 教 育 专 业 的 研 究 生 回 顾 三 年 以 来 的 学 习 和 生 活, 顿 时 感 觉 这 段 时 间 也

致 谢 本 人 自 2008 年 6 月 从 上 海 外 国 语 大 学 毕 业 之 后, 于 2010 年 3 月 再 次 进 入 上 外, 非 常 有 幸 成 为 汉 语 国 际 教 育 专 业 的 研 究 生 回 顾 三 年 以 来 的 学 习 和 生 活, 顿 时 感 觉 这 段 时 间 也 精 英 汉 语 和 新 实 用 汉 语 课 本 的 对 比 研 究 The Comparative Study of Jing Ying Chinese and The New Practical Chinese Textbook 专 业 : 届 别 : 姓 名 : 导 师 : 汉 语 国 际 教 育 2013 届 王 泉 玲 杨 金 华 1 致 谢 本 人 自 2008 年 6 月 从 上 海 外

More information

Fig. 1 Frame calculation model 1 mm Table 1 Joints displacement mm

Fig. 1 Frame calculation model 1 mm Table 1 Joints displacement mm 33 2 2011 4 ol. 33 No. 2 Apr. 2011 1002-8412 2011 02-0104-08 1 1 1 2 361003 3. 361009 3 1. 361005 2. GB50023-2009 TU746. 3 A Study on Single-span RC Frame Reinforced with Steel Truss System Yuan Xing-ren

More information

,, [1 ], [223 ] :, 1) :, 2) :,,, 3) :,, ( ),, [ 6 ],,, [ 3,728 ], ; [9222 ], ;,,() ;, : (1) ; (2),,,,, [23224 ] ; 2,, x y,,, x y R, ( ),,, :

,, [1 ], [223 ] :, 1) :, 2) :,,, 3) :,, ( ),, [ 6 ],,, [ 3,728 ], ; [9222 ], ;,,() ;, : (1) ; (2),,,,, [23224 ] ; 2,, x y,,, x y R, ( ),,, : 24 3 2010 5 J OU RNAL OF CHIN ESE IN FORMA TION PROCESSIN G Vol. 24, No. 3 May, 2010 : 100320077 (2010) 0320117207 1, 1, 1, 2 (1.,100871 ; 2.,100084) :,,,,,,; : ( ) ( ) (,3 600 ),, ABC : ;; ; ; ;;; : TP391

More information

13-4-Cover-1

13-4-Cover-1 106 13 4 301-323 302 2009 2007 2009 2007 Dewey 1960 1970 1964 1967 303 1994 2008 2007 2008 2001 2003 2006 2007 2007 7 2013 2007 2009 2009 2007 2009 2012 Kendall 1990 Jacoby 1996 Sigmon 1996 1 2 3 20062000

More information

謝 辭 能 夠 完 成 這 本 論 文, 首 先 當 然 要 感 謝 的 是 我 的 指 導 教 授, 謝 林 德 老 師 這 段 時 間 老 師 不 厭 其 煩 的 幫 我 檢 視 論 文, 每 次 與 老 師 討 論 後 總 是 收 穫 很 多, 在 臺 灣 求 的 這 段 期 間 深 深 地

謝 辭 能 夠 完 成 這 本 論 文, 首 先 當 然 要 感 謝 的 是 我 的 指 導 教 授, 謝 林 德 老 師 這 段 時 間 老 師 不 厭 其 煩 的 幫 我 檢 視 論 文, 每 次 與 老 師 討 論 後 總 是 收 穫 很 多, 在 臺 灣 求 的 這 段 期 間 深 深 地 文 教 碩 博 士 位 程 碩 士 位 論 文 指 導 教 授 : 謝 林 德 博 士 Dr. Dennis Schilling 華 同 形 漢 字 詞 之 比 較 及 教 建 議 以 台 灣 八 千 詞 及 韓 漢 字 語 辭 典 分 析 為 例 Semantic and Pragmatic Features of Chinese and Korean Homographic Words with

More information

1 引言

1 引言 P P 第 40 卷 Vol.40 第 7 期 No.7 计 算 机 工 程 Computer Engineering 014 年 7 月 July 014 开 发 研 究 与 工 程 应 用 文 章 编 号 :1000-348(014)07-081-05 文 献 标 识 码 :A 中 图 分 类 号 :TP391.41 摘 基 于 图 像 识 别 的 震 象 云 地 震 预 测 方 法 谢 庭,

More information

% % 99% Sautman B. Preferential Policies for Ethnic Minorities in China The Case

% % 99% Sautman B. Preferential Policies for Ethnic Minorities in China The Case 1001-5558 2015 03-0037-11 2000 2010 C95 DOI:10.16486/j.cnki.62-1035/d.2015.03.005 A 1 2014 14CRK014 2013 13SHC012 1 47 2181 N. W. Journal of Ethnology 2015 3 86 2015.No.3 Total No.86 20 70 122000 2007

More information

4 51 1 Scoones 1998 20 60 70 2002 2001 20 90 World Bank DFID Sussex IDS 2012 2 2. 1 UNDP CARE DFID DFID DFID 1997 IDS 2009 1 5

4 51 1 Scoones 1998 20 60 70 2002 2001 20 90 World Bank DFID Sussex IDS 2012 2 2. 1 UNDP CARE DFID DFID DFID 1997 IDS 2009 1 5 38 4 2014 7 Vol. 38 No. 4 July 2014 50 Population Research * 30 4 100872 Livelihood and Development Capacity of Families Obeying the Family Planning Policy in Rural China A Sustainable Livelihood Analytical

More information

ap15_chinese_interpersoanal_writing_ _response

ap15_chinese_interpersoanal_writing_ _response 2015 SCORING GUIDELINES Interpersonal Writing: 6 EXCELLENT excellence in 5 VERY GOOD Suggests excellence in 4 GOOD 3 ADEQUATE Suggests 2 WEAK Suggests lack of 1 VERY WEAK lack of 0 UNACCEPTABLE Contains

More information

A Study on Grading and Sequencing of Senses of Grade-A Polysemous Adjectives in A Syllabus of Graded Vocabulary for Chinese Proficiency 2002 I II Abstract ublished in 1992, A Syllabus of Graded Vocabulary

More information

10384 27720071152270 UDC SHIBOR - Research on Dynamics of Short-term Shibor via Parametric and Nonparametric Models 2 0 1 0 0 5 2 0 1 0 0 5 2 0 1 0 0 5 2010 , 1. 2. Shibor 2006 10 8 2007 1 4 Shibor

More information

Corpus Word Parser 183

Corpus Word Parser 183 95 182 2010 1946 5 15 1948 6 15 1949 3 15 8 1 2011 2012 11 8 2015 12 31 Corpus Word Parser 183 2017. 1 ROST Content Mining 2003 20 60 2003 184 2003 20 60 1999 2009 2003 Discourse Analysis 1952 Language

More information

: : : : : ISBN / C53:H : 19.50

: : : : : ISBN / C53:H : 19.50 : : : : 2002 1 1 2002 1 1 : ISBN 7-224-06364-9 / C53:H059-53 : 19.50 50,,,,,,, ; 50,,,,,,,, 1 ,,,,,,,,,,,,,, ;,,,,,,,,, 2 ,,,, 2002 8 3 ( 1 ) ( 1 ) Deduction One Way of Deriving the Meaning of U nfamiliar

More information

1 VLBI VLBI 2 32 MHz 2 Gbps X J VLBI [3] CDAS IVS [4,5] CDAS MHz, 16 MHz, 8 MHz, 4 MHz, 2 MHz [6] CDAS VLBI CDAS 2 CDAS CDAS 5 2

1 VLBI VLBI 2 32 MHz 2 Gbps X J VLBI [3] CDAS IVS [4,5] CDAS MHz, 16 MHz, 8 MHz, 4 MHz, 2 MHz [6] CDAS VLBI CDAS 2 CDAS CDAS 5 2 32 1 Vol. 32, No. 1 2014 2 PROGRESS IN ASTRONOMY Feb., 2014 doi: 10.3969/j.issn.1000-8349.2014.01.07 VLBI 1,2 1,2 (1. 200030 2. 200030) VLBI (Digital Baseband Convertor DBBC) CDAS (Chinese VLBI Data Acquisition

More information

Microsoft Word - Chord_chart_-_Song_of_Spiritual_Warfare_CN.docx

Microsoft Word - Chord_chart_-_Song_of_Spiritual_Warfare_CN.docx 4:12 : ( ) D G/D Shang di de dao shi huo po de D G/D A/D Shi you gong xiao de D G/D Shang di de dao shi huo po de D D7 Shi you gong xiao de G A/G Bi yi qie liang ren de jian geng kuai F#m Bm Shen zhi hun

More information

1對外華語文詞彙教學的策略研究_第三次印).doc

1對外華語文詞彙教學的策略研究_第三次印).doc 37 92 1 16 1 2 3 4 5 6 7 8????? 9????????? 10???????????????????? 11? 12 13 14 15 16 The Strategy Research of Teaching Chinese as a Second Language Li-Na Fang Department of Chinese, National Kaohsiung

More information

lí yòu qi n j n ng

lí yòu qi n j n ng lí yòu qi n j n ng zhì lú yu n ch nghé liú g p jiá ji n gè liè du zhù g jù yuán cù cì qióng zhu6 juàn p zh n túmí nòu jiong y yùndu láo x n xiá zhì yùn n n gúo jiào zh

More information

a a a 1. 4 Izumi et al Izumi & Bigelow b

a a a 1. 4 Izumi et al Izumi & Bigelow b 26 2012 2 * 10 6 1996 2002 2006 1996 2007 2004 2004 60 4 30 1998 2006 2006-2007 1. 1 * ' 2010 2011 254 2000 2005a 1999 3 2000 2004 2008 1. 2 2004 2005a 1. 3 1 2 3 4 5 4 2000 2004 2005a 1. 4 Izumi et al.

More information

University of Science and Technology of China A dissertation for master s degree Research of e-learning style for public servants under the context of

University of Science and Technology of China A dissertation for master s degree Research of e-learning style for public servants under the context of 中 国 科 学 技 术 大 学 硕 士 学 位 论 文 新 媒 体 环 境 下 公 务 员 在 线 培 训 模 式 研 究 作 者 姓 名 : 学 科 专 业 : 导 师 姓 名 : 完 成 时 间 : 潘 琳 数 字 媒 体 周 荣 庭 教 授 二 一 二 年 五 月 University of Science and Technology of China A dissertation for

More information

Preface This guide is intended to standardize the use of the WeChat brand and ensure the brand's integrity and consistency. The guide applies to all d

Preface This guide is intended to standardize the use of the WeChat brand and ensure the brand's integrity and consistency. The guide applies to all d WeChat Search Visual Identity Guidelines WEDESIGN 2018. 04 Preface This guide is intended to standardize the use of the WeChat brand and ensure the brand's integrity and consistency. The guide applies

More information

2009 Korean First Language Written examination

2009 Korean First Language Written examination Victorian Certificate of Education 2009 SUPERVISOR TO ATTACH PROCESSING LABEL HERE STUDENT NUMBER Letter Figures Words KOREAN FIRST LANGUAGE Written examination Tuesday 20 October 2009 Reading time: 2.00

More information

2 JCAM. June,2012,Vol. 28,NO. 6 膝 关 节 创 伤 性 滑 膜 炎 是 急 性 创 伤 或 慢 性 劳 损 所 致 的 关 节 滑 膜 的 无 菌 性 炎 症, 发 病 率 达 2% ~ 3% [1], 为 骨 伤 科 临 床 的 常 见 病 多 发 病 近 年 来

2 JCAM. June,2012,Vol. 28,NO. 6 膝 关 节 创 伤 性 滑 膜 炎 是 急 性 创 伤 或 慢 性 劳 损 所 致 的 关 节 滑 膜 的 无 菌 性 炎 症, 发 病 率 达 2% ~ 3% [1], 为 骨 伤 科 临 床 的 常 见 病 多 发 病 近 年 来 针 灸 临 床 杂 志 2012 年 第 28 卷 第 6 期 1 临 床 研 究 针 刀 治 疗 膝 关 节 创 伤 性 滑 膜 炎 的 临 床 研 究 向 伟 明 1, 丁 思 明 1, 张 秀 芬 2, 权 五 成 2, 唐 吉 莲 1, 杨 友 金 1, 黄 涣 强 1, 颜 勋 1, 曾 晓 宇 1, 朱 传 芳 1 1, 张 雄 ( 1. 重 庆 市 梁 平 县 第 二 人 民 医 院,

More information

píng liú zú

píng liú zú píng liú zú l láng nèn bó ch yán y n tuò x chèn r cu n ch n cù ruò zhì qù zuì m ng yíng j n bì yìn j yì héng cù ji n b n sh ng qi n lì quó k xì q n qiáo s ng z n nà p i k i y yíng gài huò ch

More information

2006中國文學研究範本檔

2006中國文學研究範本檔 中 國 文 學 研 究 第 三 十 九 期 2015 年 01 月 頁 223~258 臺 灣 大 學 中 國 文 學 研 究 所 由 心 到 腦 從 腦 的 語 義 脈 絡 論 晚 清 民 初 的 文 化 轉 型 * 徐 瑞 鴻 提 要 傳 統 的 中 醫 理 論 以 心 為 神 明 之 主, 掌 管 思 維 記 憶 與 情 感, 此 一 觀 點 在 近 現 代 受 到 西 方 解 剖 學 的 巨

More information

2008 Nankai Business Review 61

2008 Nankai Business Review 61 150 5 * 71272026 60 2008 Nankai Business Review 61 / 62 Nankai Business Review 63 64 Nankai Business Review 65 66 Nankai Business Review 67 68 Nankai Business Review 69 Mechanism of Luxury Brands Formation

More information

清 华 大 学

清 华 大 学 清 华 大 学 综 合 论 文 训 练 题 目 : 基 于 网 络 用 户 行 为 分 析 的 传 染 病 发 病 趋 势 研 究 系 专 姓 别 : 计 算 机 科 学 与 技 术 业 : 计 算 机 科 学 与 技 术 名 : 许 丹 青 指 导 教 师 : 刘 奕 群 助 理 研 究 员 2010 年 6 月 27 日 中 文 摘 要 近 年 来, 传 染 病 的 传 播 与 流 行 已

More information

OncidiumGower Ramsey ) 2 1(CK1) 2(CK2) 1(T1) 2(T2) ( ) CK1 43 (A 44.2 ) CK2 66 (A 48.5 ) T1 40 (

OncidiumGower Ramsey ) 2 1(CK1) 2(CK2) 1(T1) 2(T2) ( ) CK1 43 (A 44.2 ) CK2 66 (A 48.5 ) T1 40 ( 35 1 2006 48 35-46 OncidiumGower Ramsey ) 2 1(CK1) 2(CK2) 1(T1) 2(T2) (93 5 28 95 1 9 ) 94 1-2 5-6 8-10 94 7 CK1 43 (A 44.2 ) CK2 66 (A 48.5 ) T1 40 (A 47.5 ) T2 73 (A 46.6 ) 3 CK2 T1 T2 CK1 2006 8 16

More information

(CIP) : /. :, (/ ) ISBN T S H CI P (2006) XIANGPIAOWANLI JIUW ENH UA YU CH ENGYU

(CIP) : /. :, (/ ) ISBN T S H CI P (2006) XIANGPIAOWANLI JIUW ENH UA YU CH ENGYU (CIP) : /. :, 2006. 12 (/ ) ISBN 7-81064-916-7... - - - - -. T S971-49 H136. 3 CI P (2006) 116729 XIANGPIAOWANLI JIUW ENH UA YU CH ENGYU 105 100037 68418523 ( ) 68982468 ( ) www.cnup.cnu.cn E- mail cnup@

More information

BC04 Module_antenna__ doc

BC04 Module_antenna__ doc http://www.infobluetooth.com TEL:+86-23-68798999 Fax: +86-23-68889515 Page 1 of 10 http://www.infobluetooth.com TEL:+86-23-68798999 Fax: +86-23-68889515 Page 2 of 10 http://www.infobluetooth.com TEL:+86-23-68798999

More information

Microsoft PowerPoint ARIS_Platform_en.ppt

Microsoft PowerPoint ARIS_Platform_en.ppt ARIS Platform www.ixon.com.tw ARIS ARIS Architecture of Integrated Information System Prof. Dr. Dr. h.c. mult. August-Wilhelm Scheer ARIS () 2 IDS Scheer AG International Presence >> Partners and subsidiaries

More information

,,,,, (,1988: 630) 218

,,,,, (,1988: 630) 218 * 1 19 20 * 1,,,,, (,2006) 217 2018. 1 1959 453 1959 472 1 20 20 1928 1929 2014 20 30 1,,,,, (,1988: 630) 218 2003 405 1930 2005 1 2005 2 20 20 1930 2003 405 1934 1936 2003 411 2003 413 2005 206 2005 219

More information

2017 CCAFL Chinese in Context

2017 CCAFL Chinese in Context Student/Registration Number Centre Number 2017 PUBLIC EXAMINATION Chinese in Context Reading Time: 10 minutes Working Time: 2 hours and 30 minutes You have 10 minutes to read all the papers and to familiarise

More information

前 言 香 港 中 文 大 學 優 質 學 校 改 進 計 劃 ( 下 稱 計 劃 ) 團 隊 自 1998 年 起 積 極 於 本 地 推 動 理 論 及 實 踐 並 重 的 學 校 改 進 工 作, 並 逐 步 發 展 成 為 本 地 最 具 規 模 的 校 本 支 援 服 務 品 牌, 曾 支

前 言 香 港 中 文 大 學 優 質 學 校 改 進 計 劃 ( 下 稱 計 劃 ) 團 隊 自 1998 年 起 積 極 於 本 地 推 動 理 論 及 實 踐 並 重 的 學 校 改 進 工 作, 並 逐 步 發 展 成 為 本 地 最 具 規 模 的 校 本 支 援 服 務 品 牌, 曾 支 香 港 中 文 大 學 香 港 教 育 研 究 所 優 質 學 校 改 進 計 劃 : 學 習 差 異 支 援 及 學 校 起 動 計 劃 聯 合 主 辦 中 學 聯 校 教 師 專 業 發 展 日 主 題 : 照 顧 學 習 差 異 日 期 :2011 年 12 月 9 日 ( 星 期 五 ) 時 間 : 上 午 9 時 正 至 下 午 4 時 15 分 地 點 : 樂 善 堂 余 近 卿 中 學

More information

<B3ACBDDD>

<B3ACBDDD> 0 岚 卷 第 二 编 第 二 编 岚 卷 121 日 照 122 第 二 编 安东卫城池图 丁士价 1676 1727 字介臣 号龙溪 丁景次子 日照丁氏 十支 六世 日照市岚区后村镇丁家皋陆人 康熙五十六年 1717 丁酉科举人 与同邑秦 yi 尹纯儒为同科举人 拣选 知县 后参加会试屡试不第 遂弃举子业 居家课子训侄 以故四弟士 可考中甲辰科举人 诸子孙皆累试前茅 丁士价教育子弟兢兢业业 读

More information

穨423.PDF

穨423.PDF Chinese Journal of Science Education 2002,, 423-439 2002, 10(4), 423-439 1 2 1 1 1 2 90 8 10 91 4 9 91 8 22 ) NII 1995 7 14, 1999 1997 (Cooperative Remotely Accessible Learning CORAL) 424 (Collaborative

More information