I’m sure most of us who have worked with the Siku quanshu 四庫全書 database have dreamed of extracting the texts of whole books without having to copy them page by page. It turns out that some books are indeed available as text files here. The list is not complete, but it includes quite a few books from the History (史), Philosophy (子), and Literature (集) sections. Some books from the Classics (經) section are also available, but not that many.
If you are lucky enough to find the title you are looking for, you may need to convert the encoding of the downloaded file before the text displays properly. A tool that I have found handy is Encoding Master. Use it to open the file and convert it from DOS Chinese Simplified (GBK) to UTF-8. Now you have a clean text file of your favorite book!
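If you would rather script the conversion than use a GUI tool, the same GBK to UTF-8 step can be sketched in a few lines of Python. This is only a minimal sketch: the function name `convert_to_utf8` and the file names in the usage comment are made up for illustration, and it assumes the downloaded file really is GBK-encoded (decoding will raise an error if it is not).

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="gbk"):
    """Re-encode a text file from source_encoding (assumed GBK) to UTF-8.

    Returns the decoded text so it can be inspected or processed further.
    """
    raw = Path(src).read_bytes()
    # Raises UnicodeDecodeError if the file is not valid in source_encoding.
    text = raw.decode(source_encoding)
    # Write bytes directly so the original line endings are left untouched.
    Path(dst).write_bytes(text.encode("utf-8"))
    return text


# Hypothetical usage (file names are placeholders):
# convert_to_utf8("downloaded_book.txt", "downloaded_book_utf8.txt")
```

If decoding fails on a stray character, passing `source_encoding="gb18030"` (a superset of GBK) may be worth trying before concluding that the file uses some other encoding.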
Some points of caution:
- The text is in simplified Chinese.
- This comes from an online forum, so it can disappear at any time (although apparently it has been there for over a year now).
- Not all files are from Siku quanshu. (Read the disclaimer in Paragraph #6 at the top of the page.) In any case, the files come from totally unknown sources, so they are as reliable as Wikipedia.
- Depending on your understanding of the copyright status of digitized old texts, you might feel guilty using these files.
This is the most comprehensive list of clean Siku quanshu texts that I have seen so far. If anyone knows of a better source, I’d appreciate the information very much.