20,000+ Chinese sentences with translations and pinyin

Category: 

Author(s): 

Brian Vaughan

Description: 

These flashcards contain Chinese sentences with English translations and pinyin.

It's been said that it's best to study a language using sentence flashcards,
instead of individual words, but it can be difficult to find good sets of
entire sentences with translations and pronunciations... so I built this set.

The pinyin may have some errors. It was generated programmatically by searching
the CEDICT dictionary.
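
A minimal sketch of how a dictionary-based pinyin lookup could work, assuming a
greedy longest-match search. This is not the code used to build the set: the
function name to_pinyin and the toy lexicon are hypothetical, and loading the
real CEDICT file is left out.

```python
def to_pinyin(sentence, lexicon):
    """Greedy longest-match lookup; unknown characters pass through unchanged."""
    max_len = max((len(word) for word in lexicon), default=1)
    out = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            chunk = sentence[i:i + size]
            if chunk in lexicon:
                out.append(lexicon[chunk])
                i += size
                break
        else:
            # No dictionary entry covers this character: keep it as-is.
            out.append(sentence[i])
            i += 1
    return " ".join(out)


# Toy lexicon for illustration only (hypothetical, not loaded from CEDICT).
toy_lexicon = {"我": "wo3", "作主": "zuo4 zhu3", "作": "zuo4", "主": "zhu3"}
print(to_pinyin("我作主", toy_lexicon))  # wo3 zuo4 zhu3
```

A lookup like this returns one stored reading per entry, so a character with
more than one pronunciation can come out wrong. That is one plausible source of
the errors mentioned above.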

These sample sentences are broken up into two files:
zh-en_sentences.xml:
I find this one more useful. The questions in this file are Chinese
sentences. The answers are pinyin pronunciations and English
translations.

en-zh_sentences.xml:
This is more for practicing writing. The questions contain English and
pinyin, and the answer is the Chinese expression.

Each file is broken up into a number of categories labeled as described here:

HSK level:
All of the sentences came from sample sentences intended to illustrate a
particular word. The HSK level in the category name is the HSK level of
the word the sentence illustrates. Note that this "HSK level" runs from
1-4. I have no idea how that correlates to actual HSK scores, but since
HSK scores range from 1-11, I know they are not equivalent.

Source of words and HSK "level":
http://www.chinese-forums.com/vocabulary/

Limited to:

Sentences are then broken up further into 5 categories based on the HSK
level of the words those sentences contain.

This is based on a search of all characters in each level, including the
characters that longer words are composed of. This is why even sentences
for HSK level 4 words can appear in "limited 1."

For example, 作主 (zuo4zhu3) is an HSK level 4 word. It contains 2
characters which both appear in other HSK level 1 words, and so the
sample sentence for 作主 (assuming that sentence contains no other
difficult words) might appear in the category "HSK 4; limited 1;"

Since some characters are not found in any of the HSK level sets, there
are categories containing "limited 5."
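
The "limited" rule described above can be sketched roughly as follows. This is
an assumption about the logic, not the original code: the function names, the
toy word lists, and their level assignments are hypothetical. Each character
gets the lowest level of any word containing it, and a sentence gets the
highest level among its characters, with 5 for characters found in no level.

```python
def char_levels(words_by_level):
    """Map each character to the lowest HSK level of any word containing it."""
    levels = {}
    for level in sorted(words_by_level):
        for word in words_by_level[level]:
            for ch in word:
                # setdefault keeps the first (lowest) level seen for a character.
                levels.setdefault(ch, level)
    return levels


def is_cjk(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def limited_level(sentence, levels, unknown_level=5):
    """Highest character level in the sentence; punctuation is ignored."""
    found = [levels.get(ch, unknown_level) for ch in sentence if is_cjk(ch)]
    # Assumption: a sentence with no Chinese characters counts as level 1.
    return max(found, default=1)


# Toy word lists with made-up level assignments, for illustration only.
toy_words = {1: ["工作", "主要"], 4: ["作主"]}
levels = char_levels(toy_words)
print(limited_level("作主。", levels))   # 1
print(limited_level("作主吧", levels))   # 5
```

Here 作 and 主 both occur in level 1 words, so a sentence made only of those
characters lands in "limited 1" even though 作主 itself is a level 4 word.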

Part number:
Within each HSK level there are many sentences. I've divided them up
into parts so that the maximum size would be somewhere around 500
sentences.

Before doing so, I sorted by length of the sentence. This means that
sentences in categories labeled "part 1" will be shorter (and
presumably grammatically simpler) than sentences in categories labeled
"part 4."

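The sort-then-split step for part numbers could look something like this. It is
a sketch under assumptions: the function name split_into_parts is hypothetical,
and simple fixed-size chunking is used here, which may differ from how the parts
in this set were actually balanced.

```python
def split_into_parts(sentences, max_size=500):
    """Sort sentences by length, then cut the sorted list into chunks."""
    ordered = sorted(sentences, key=len)
    return [ordered[i:i + max_size] for i in range(0, len(ordered), max_size)]


# Tiny example with a small max_size so the parts are easy to see.
parts = split_into_parts(["aaa", "a", "aa", "aaaa", "aaaaa"], max_size=2)
for number, part in enumerate(parts, start=1):
    print("part", number, part)
# part 1 ['a', 'aa']
# part 2 ['aaa', 'aaaa']
# part 3 ['aaaaa']
```

Because the list is sorted before it is cut, every sentence in an earlier part
is no longer than any sentence in a later part.
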
The sentences in this collection are the example sentences on dict.cn. I
couldn't find specific licensing information associated with the example
sentences, so, if there's a problem, someone let me know and I'll gladly take
it down. I'm under the impression that dict.cn was built with a free
share-and-share-alike corpus... it being web-based and all anyway.

The sentences are freely available on the internet anyway, so, if I do have to
take this down, I'll gladly share the code I used to generate the lists.

contributed by:
Brian Vaughan
http://brianvaughan.net/
nairbv AT yahoo DOT com

Source: 

sample sentences from dict.cn

Mnemosyne Compatibility: 

  • 2.1+

Card Set File: 
