Kongzi Beta-3 and the Quest for Icons

All the bugs seem to be worked out. I figured out how to use Inno Setup, which is an amazing tool (for Windows, anyway). I still don’t know what free multi-platform installers I can use, or what commercial ones would be best for me.

So gee, now I am faced with *many* problems that I didn’t expect to run into so soon:

  • I need an icon! What icon will I use to represent my program?
  • I need to start entering Chinese characters into the program’s dictionary
  • Ditto for shortcuts.
  • I still need to add more quiz & test types.
  • I’m sure there are 100 miscellaneous user interface niceties I’ve forgotten. Like printing. Kongzi doesn’t print anything.
  • How much to charge? Free trial or crippleware?
  • Gee I guess I need a dedicated website now? (Can you run a business on WordPress? Unlikely that you would even want to, but..)
  • Should I keep talking about Kongzi’s features and code, or keep them as trade secrets?

I found a great blog about mISVs (micro independent software vendors) just like myself, called Joel on Software. They have a great discussion forum called the Business of Software. I haven’t had time to read it but it has a lot of useful tips.

I’ll say one thing, I’m grateful for the company. Developing Kongzi has been far more of a lone wolf project than Netwhack – until I found Joel on Software I didn’t know there even was a community for people like me 🙂

Then, suddenly, I found the Micro-ISV Journal (via a link on Joel’s). Things are looking up!

Kongzi Beta-2

[Screenshots: Kongzi Beta-2: Tag Editor; Kongzi Beta-2: Add Entries]

Kongzi is growing – it is now 4,000 lines of code. Feast your eyes on the new Tag Tree System (shown above).

Additionally, I’ve improved the load/save functions – you can now export tag lists, and they are generally much smaller than exporting word lists – by several times. For example, one popular application seems to require 23 bytes for one dictionary entry in a word list, on average. Kongzi requires around 7 bytes for a tag and an entry, on average. Convenience is improved as well – you don’t need 30 different word lists to use Kongzi – you only need one dictionary file per language.

I’m amazed at how far the program has come. The user interface is still amateurish, but the program is bursting with a power I didn’t expect. For example, take the innovative Shortcut System I’ve designed. It allows end users to create and share shortcut files (just like they can create and share tag tree files). What this really means is that users do not need an IME to type pinyin – they can just type ni3 and it will automatically resolve to nǐ. If you prefer Japanese, you can configure “no” to mean . It’s so configurable it’s scary – you can type ni3- and it will resolve to . Although, there is a catch – it depends on end users (or myself) to slowly enter all the shortcuts that users might want to use. Anyone with an IME doesn’t need the shortcuts system, but it saves a heck of a lot of time switching between Pinyin, ASCII, and Chinese all the time. The silver lining here is that once entered, the shortcut can be saved, and is integrated within the program. So slowly but surely all of the important tags (read: Pinyin and Hiragana/Katakana/whatever) will be created and distributed with the program.
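For the curious, here is a rough sketch of how a shortcut table like this could work. It is not Kongzi’s actual code: the class name ShortcutTable, the define/expand methods, the “hao3” entry and the longest-match rule are all my assumptions for illustration. Only the ni3 example comes from the description above. Non-ASCII characters are written as \u escapes so the file compiles regardless of source encoding.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a user-editable shortcut table (not Kongzi's real code).
class ShortcutTable {
    private final Map<String, String> shortcuts = new HashMap<>();
    private int longestKey = 0; // length of the longest shortcut defined so far

    // Register a shortcut, e.g. "ni3" for pinyin "ni" with the third tone.
    void define(String typed, String replacement) {
        shortcuts.put(typed, replacement);
        longestKey = Math.max(longestKey, typed.length());
    }

    // Scan left to right; at each position prefer the longest shortcut that matches,
    // so a longer entry such as "ni3-" would win over "ni3" if both were defined.
    String expand(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            boolean matched = false;
            int maxLen = Math.min(longestKey, input.length() - i);
            for (int len = maxLen; len > 0; len--) {
                String replacement = shortcuts.get(input.substring(i, i + len));
                if (replacement != null) {
                    out.append(replacement);
                    i += len;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.append(input.charAt(i)); // no shortcut here, keep the character
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        ShortcutTable table = new ShortcutTable();
        table.define("ni3", "n\u01D0");   // n + i-with-caron
        table.define("hao3", "h\u01CEo"); // h + a-with-caron + o (made-up extra entry)
        System.out.println(table.expand("ni3 hao3"));
    }
}
```

Because the table is just a map of typed text to replacement text, saving and sharing it as a file would be straightforward, which is the whole point of the system.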

The program is less than 200k in size; although if you don’t have a Chinese Unicode font you will need the version which includes the Bitstream font the program uses by default.

Over the next few days I plan to add innovative, cutting edge features like a mix and match quiz type, and a memory game with Chinese characters (again, or Japanese, Korean, whatever you have a font and an IME or shortcuts for).

I will also be experimenting with licensing numbers soon, so I can distribute the program. Keep an eye on this space, I’ll be giving away dozens (maybe a hundred or more) free copies to beta testers soon!

Dynamically modifying nodes in a JTree

First let me just say I hate JTrees. Now, let’s proceed.

When you first try to learn how to use a JTree, the examples you find will set up a tree, say with DefaultMutableTreeNodes, and then add them to the tree, which gets displayed. All neat and nice. Well that’s crap, I say, because they don’t teach you anything, they just leave you alone to figure things out. And when you try to use this level of understanding in the real world something strange happens. When you add the first child of a node like “node.add(childnode)”, it works, but when you add more children it doesn’t display properly. What’s maddening is that every diagnostic you could imagine will tell you that the nodes are there, it’s just that the JTree isn’t displaying them.

What you were never told, and SHOULD have been told, is that the JTree maintains a data model and a selection model separate from the actual data. The JTree only redraws when its model announces a change, and calling node.add() directly changes the node without telling the model anything. The separation exists so that you can present existing data structures in tree form without having to extend DefaultMutableTreeNode to create your tree node element. That allows you to display, as a tree, stuff that really isn’t a tree. Why you would want to do this is up to you. But the rest of us, who just extend DefaultMutableTreeNode because it is the right thing to do at the time, need to know this.
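If you want to see the silence for yourself, here is a small sketch. It uses a counting listener as a stand-in for the JTree (a JTree listens to its model in the same way), so it runs without any window. The class names TreeEventDemo and CountingListener are made up for this example.

```java
import javax.swing.event.TreeModelEvent;
import javax.swing.event.TreeModelListener;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.DefaultTreeModel;

// Hypothetical demo: shows which operations the model actually announces.
class TreeEventDemo {
    // Counts every change notification the model sends out.
    static class CountingListener implements TreeModelListener {
        int events = 0;
        public void treeNodesChanged(TreeModelEvent e) { events++; }
        public void treeNodesInserted(TreeModelEvent e) { events++; }
        public void treeNodesRemoved(TreeModelEvent e) { events++; }
        public void treeStructureChanged(TreeModelEvent e) { events++; }
    }

    public static void main(String[] args) {
        DefaultMutableTreeNode root = new DefaultMutableTreeNode("root");
        DefaultTreeModel model = new DefaultTreeModel(root);
        CountingListener listener = new CountingListener();
        model.addTreeModelListener(listener);

        // Direct add: the node really is attached, but no listener is told.
        root.add(new DefaultMutableTreeNode("quiet child"));
        System.out.println("children=" + root.getChildCount() + " events=" + listener.events);

        // Through the model: same structural change, plus a nodes-inserted event.
        model.insertNodeInto(new DefaultMutableTreeNode("announced child"),
                root, root.getChildCount());
        System.out.println("children=" + root.getChildCount() + " events=" + listener.events);
    }
}
```

The first line printed shows one child and zero events, which is exactly the "the nodes are there but nothing displays" situation.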

When I first ran into this error, I panicked and wondered what I did wrong. The first solution I found was a common one, but was actually quite useless to me (and, I suspect, many others) because of how I wanted to extend TreeNodes for my data structure, and why I wanted to do this. I’ll tell you that solution anyways. It’s to add the nodes like this:

DefaultTreeModel model = (DefaultTreeModel) jtree.getModel();
model.insertNodeInto(
    (DefaultMutableTreeNode) newnode,
    (DefaultMutableTreeNode) parentnode,
    parentnode.getChildCount());

Wow, huh? Who would have thought? And that’s why I hate JTrees. And you know what, this is exactly the kind of crap Ruby people bring up when they say Java is bloated. But.. rather than being like everyone else and forcing you to adopt some twisted Java paradigm that doesn’t make sense, giving Ruby people more ammo, I will teach you a dirty little hack that works, and allows you to keep thinking the way you want to think about jtrees and data structures. So this is for both the 95% of us that will never need to display non-tree data as trees, and also for the 5% of us who are capable of beating Ruby people at their own game.

Just call model.reload().

Yeah it’s that simple. All those jerks that tried to make you type
DefaultTreeModel model = (DefaultTreeModel) jtree.getModel();
model.insertNodeInto(
    (DefaultMutableTreeNode) newnode,
    (DefaultMutableTreeNode) parentnode,
    parentnode.getChildCount());

can be safely ignored. All you really need to do is

model.reload();

Okay, maybe ((DefaultTreeModel) yourtree.getModel()).reload(), if you don’t keep a model lying around (getModel() hands back a plain TreeModel, which has no reload(), so the cast is needed), but same basic thing. Simple, huh? What’s even better is that if you’re loading the information in from a save file or something, and you need to do a lot of .add() operations, you can get away with just .add()ing and it will likely take less processor time. Just model.reload() when you’re done and you’re good to go.
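Here is a sketch of the bulk-load pattern, again with a listener standing in for the JTree so it runs without a window. The names BulkLoadDemo, StructureListener and bulkLoad are invented for this example. One caveat I believe applies: a structure-changed event can also reset which nodes the JTree shows as expanded, so check that before relying on it in a tree your users have opened up. DefaultTreeModel also has a reload(node) variant that limits the refresh to one subtree.

```java
import javax.swing.event.TreeModelEvent;
import javax.swing.event.TreeModelListener;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.DefaultTreeModel;

// Hypothetical demo: many silent add() calls followed by a single reload().
class BulkLoadDemo {
    // Stand-in for a JTree: remembers how many structure-changed events arrived.
    static class StructureListener implements TreeModelListener {
        int structureChanges = 0;
        public void treeNodesChanged(TreeModelEvent e) { }
        public void treeNodesInserted(TreeModelEvent e) { }
        public void treeNodesRemoved(TreeModelEvent e) { }
        public void treeStructureChanged(TreeModelEvent e) { structureChanges++; }
    }

    // Adds all labels under the root with plain add(), then announces the result once.
    static void bulkLoad(DefaultTreeModel model, String[] labels) {
        DefaultMutableTreeNode root = (DefaultMutableTreeNode) model.getRoot();
        for (String label : labels) {
            root.add(new DefaultMutableTreeNode(label)); // silent, fires no event
        }
        model.reload(); // one structure-changed event covers the whole batch
    }

    public static void main(String[] args) {
        DefaultTreeModel model = new DefaultTreeModel(new DefaultMutableTreeNode("root"));
        StructureListener listener = new StructureListener();
        model.addTreeModelListener(listener);

        bulkLoad(model, new String[] {"one", "two", "three"});

        DefaultMutableTreeNode root = (DefaultMutableTreeNode) model.getRoot();
        System.out.println("children=" + root.getChildCount()
                + " structureChanges=" + listener.structureChanges);
    }
}
```

Three nodes go in and the listener hears about it exactly once, compared with one event per node if you used insertNodeInto for each.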

As mentioned this practically encourages you to store your tree data in extended DefaultMutableTreeNodes – since you will only need one copy of the data, ever, regardless of how many jtrees you want to use. Just call model.reload for each jtree. Hey, it’s better than calling insertNodeWhatever for every jtree each time you add a node, right? Of course it is.

Good luck out there 🙂

Seven Things about Me

Uh oh. I knew this would happen eventually. Chessman of FormosaNeijia has tagged me with one of those ridiculous memes. Well at least he tagged me first, before my arch-rival Tim Fong. Life is good 🙂

There are a few people I’d like to counter-tag. One is the guy at Dynamic Balancing Tai Chi, just because I know he won’t do it 😉

Second would be good ol’ Scott because I think he has an interesting, narrow-but-deep view of things CMA.

Finally, we might as well tag Joanna Zorya, because she refuses to be cubbyholed (still deciding if that’s a good or bad thing :)) and socializing with her in this way is perhaps our last option. Since the link above isn’t a blog, let me link Reeling Silk, her student’s blog, as she posts there quite often.

After those three I’ll wait to see if Tim Fong says anything 🙂

Okay, here we go.

1. I have studied Tai Chi for Twenty Years
I’ve studied Taoist Tai Chi for close to 5 years, Yang Style for another 5, and most recently Chen style for around 5 years. Along the way I’ve easily taken 5 years’ worth of “breaks” 😉

2. I’m a Christian.
I thought about editing this to #1 but that’s just paranoia. One of the greatest uncertainties I have as a Christian is reconciling if God can hear our prayers or not. From what I know of theology (and this is hardcore theology, not your average Sunday school “french pastry” theology) God may actually be currently “unwilling”, shall we say, to hear our prayers.

3. I am a 4 dan in Wei-Qi, known in Japanese as “I-Go”.
I have taught Go in Chinese school in Canada. As a white guy that’s saying something. I don’t play much Go anymore. I’ve been busy. At my level seriously studying Go tends to take over everything else. I’ve decided I’d rather study Tai Chi. THAT being said, I’ve decided to study Chinese in the meantime.

4. I’m studying Chinese.
I’m one of those perpetual 500 Chinese word guys. But I have plans to improve 😉 I’m writing an MCI Chinese Textbook. If I succeed it will be the only one of its kind in the world.

5. I’m an amateur software developer.
Lone wolf – I wouldn’t have it any other way. I wrote LinCycles which is still being distributed via SlackWare if my sources are right. I also wrote NetWhack, which exceeded my expectations and includes a full VT100 simulator in Java. I have written various other programs and systems including a dating website, software for the University of Toronto and a few different Engineering Companies, oh and an I-Ching program. Lost the sources for that one. Pity, it was really cool. I’m currently working on Kongzi. Watch for it. It’s going to be good.

6. I’m in a continuous state of culture shock.
Or maybe not. I’m not sure. What’s culture shock? It’s got to have hit me by now, I’ve been here for 2 years. Living in China is the best thing that ever happened to me. But is it just a cover-up for mistakes I’ve made in my past? It’s best not to think of it, I suppose. China is what it is. Yay me.

7. Recently had a cute little baby boy. Dear lord what a life-shifting event.
I think I’m still adjusting. Everything is just moving way too fast and did I mention I have five projects ranging from tai chi to go to programming to reading the latest Harry Potter? Oh and is he ever cute. Maybe it’s just because he’s a mixed baby, dunno, but whenever we go shopping, crowds gather. He really does look handsome, and believe me I have seen some pretty sick looking babies on baby food bottles and what not. He looks like he could be on a magazine cover and I’m not even joking. No pictures, please 😉

8. I’m an amateur investor.
Have been for several years. I saved my father thousands of dollars on a foolish stock play he made once – didn’t tell him, took a 2 week holiday in Asia – gave him a lot of money back – everyone was happy. Those were the days. Nowadays I’m angry because I need to go to the dentist, buy baby stuff, pay off my student loan.. Sigh, if only I could fast forward a couple years to when I was super rich 😉 everything would be much better.

Check out GSS and CDE. Check the fundamentals, read the write ups. I’m not an investor so I won’t tell you if I think you should buy or short those two stocks – you will figure that out for yourself. But there’s something going down here. Be advised. These two are huge plays on gold and silver. But like I said don’t take my word for it. Do your own research. This one’s for my readers only, I have never and likely never will ever again write about stocks (except once, on r.m-a).

Oh I’m sorry, that’s eight things.

I would have liked to talk about my musical interests as well – I have produced my own CD on mp3.com although I believe I’ve closed that account by now, so you won’t be able to download or buy my CD (I had nine sales in the 3 months it was up.) I’m currently learning the Recorder. It’s a fascinating instrument.

Good luck and thanks for asking.

Chinese Sudoku Puzzles

Wow – two blog entries in one day! I couldn’t help posting this little gem. I found a sudoku puzzle on the net and started replacing the numbers with Chinese Characters. Enjoy!

Number One

Neat huh. I liked doing that so much I went out and did another one. Hey, if you like Sudoku, and you’re learning Chinese, why not?

Number Two

If you solve both puzzles, send me an e-mail with the answer and I’ll mention you in my next blog entry. You could be famous!

The Missing Link

Before I took Chinese 101 at the U of T way back when, I had tried to learn Chinese by myself. This unique experience helped me to put learning Chinese in an academic environment in perspective. Since that time, I’ve also had the pleasure of being able to speak, read and write more than 500 Chinese words but being unable to communicate in Chinese.

Where did it all go wrong? What did I do to deserve this?

Well, here I am now in China teaching English. And I’ve learned something new. There is a missing link. First, let me give you my version of the process people go through when they try to learn Chinese.

1. Memorize as many characters as possible
3. Profit

Yes folks – it really is that simple. No matter if you’re learning on your own or as part of a degree program, private lessons, etc – this is the real deal. First, memorize as many Chinese characters as you can and then – profit.

So like I said I’ve been teaching English in China for a couple of years now, and let me tell you – the resources that Chinese people have to learn English run circles around what we have to learn Chinese in every conceivable way. The reason for this is of course there is far more demand to learn English in China than there is to learn Chinese in America. And this has led to us having to put up with some pretty outdated teaching methods, simply because there isn’t any money in designing a proper Chinese curriculum.

As a result of my observations on how English is being taught in China right now, and how Chinese is being taught in Universities and in textbooks across America, I’d like to make a few suggestions for anyone who wants to learn Chinese. People who are responsible for designing a curriculum would also do well to pay heed.

1. You need a copy of the Far East 3000 Chinese Character Dictionary.
2. You need to drill flashcards.
3. You need massive comprehensible input.

Surprise! #3 is the missing link. In fact, it is so vital and important that arguably it is the final step. You see, in all the Chinese Textbooks I’ve seen, invariably every lesson has a structure somewhat like this:

a) Present 10-20 characters
b) Present one or two (usually one) example dialog featuring all of those characters.
c) Miscellaneous notes. Like “Culture Notes” or “Interesting Facts” or whatever.

Friends, let me tell you that this approach is totally bogus. It’s nothing but memorization. And there are two important problems with this. Number one, once you get around 300 or 400 characters stuck in that brain, it’s going to be awfully hard to memorize new ones. That is to say, as you learn new characters you’re going to start to forget old ones. The second problem is that the dialogues are usually fake. Which means no one would speak like that, or that it’s a farfetched scenario. How useful is it, really, to spend your time memorizing endless situations that you will probably never run into? How wise is it to decide on a textbook based on the content of the dialogues alone? It isn’t. Yet Universities use textbooks like Interactions for no other reason than all of the dialogues are about University students trying to learn Chinese. And don’t get me wrong, Interactions is one of the better academic-quality textbooks out there.

Wanna know the best one though? Far East Everyday Chinese. It presents several hundred characters per volume (not too many) and it provides several dialogues per lesson. That’s very good. Yet it still does not follow the principles of MCI.

The truth is that I feel most Chinese students set their standards too high (i.e. trying to read newspapers) because they don’t know any other way. For example in America there isn’t a lot of access to very easy Chinese children’s books. Yet such books can be an invaluable tool for the aspiring Chinese student. Even though I only know 500 or so characters, I have several Children’s books I can read right now. That allows me to read and practice my Chinese – actually using it – which is how people really learn.

So I think that the best way to learn Chinese would be to go through a memorization period of no more than 100 to 200 carefully selected characters. These characters would be selected to comprise the entirety of the first “children’s book” or “short story” that the students would need to read through/memorize. These should be introduced in the form of question and answer, so that the students can instantly practice using the Chinese by asking each other questions from day one. This also reinforces grammar.

Once the (say) 200 character limit has been reached, the student must transition over to an MCI approach. This approach presents short stories, conversations, etc. on a related theme (similar to “Let’s Talk in English”, the best language learning tool I have ever seen). There should be hundreds if not thousands of such stories and articles. They should be graded in perhaps three or four levels:

Low: For people with a 200-400 word vocabulary.
Medium: For people with a 400-800 word vocabulary.
High: For people with an 800-1500 word vocabulary.
Advanced: For people with a 1500-3000 word vocabulary.

How this is done is simple: the concept of the key word. Every (say) 50 or so words of “low” level story, for example, will introduce a “key word”. This key word will be from a carefully selected pool of characters designed to allow more variety at the medium level; and so on. The key word will have a definition and pronunciation and one or two example sentences. The examples will of course represent grammar as well as vocabulary.

One can see that the difficulty of this approach is in its design. The approach I have arrived at is to do a frequency analysis of children’s books. This approach seems to work extremely well; I currently have about 700 characters from a series of children’s books that I would classify as “medium level” in the above scheme. The key is to work out what characters would be required in the low level.

One problem with the “old” (common) method of teaching Chinese in America is that characters that you might think are relatively common and should be introduced early, really aren’t. For example, the number seven is the 920th most common character. Most books will teach you to count first. But based on a frequency analysis this is a mistake, since you will rarely encounter the number seven (for instance).

Of course, a purely frequency based approach is tedious as there will be no meaningful dialogue for quite some time.

And this is why the targeted frequency analysis I’ve come up with works so well. In the series I’ve analyzed, the frequency analysis becomes meaningful because it allows you to quickly arrive at a level which will allow you to read meaningful and interesting Chinese. Once you have achieved this goal, you can then switch to a frequency analysis like the one which rates number seven at 920 – and you will be able to rapidly transition from children’s books to more advanced material such as daily newspapers.

The hard data can be summed up in the following table. Only the first six books of more than thirty are shown, but as you can see, by the 6th book less than 10% of the characters encountered are new, but complete and wildly different stories can easily be written with only 200 characters. This is the basis for the low level being 200 to 400 characters as presented above.

Name of Book    Unique:Book    New:Unique (New to Pool)    Pool
2034-1          137:246        137:137 (100%)              137
2034-2          104:252        66:104 (64%)                203
2034-3          108:298        52:108 (26%)                255
2034-4          94:228         42:94 (16%)                 297
2034-5          140:342        50:140 (17%)                347
2035-6          137:331        35:137 (7%)                 386
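The bookkeeping behind a table like the one above can be sketched in a few lines of Java. This is only an illustration, not the tool I actually used: the class name PoolAnalysis, the Row type and the two tiny sample “books” are made up, and I am assuming that only Han characters are counted (punctuation is skipped). The sample sentences are written as \u escapes so the file compiles regardless of source encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a cumulative character-pool analysis across a book series.
class PoolAnalysis {
    // One row of the report for a single book.
    static final class Row {
        final int total;  // Han characters in the book, counting repeats
        final int unique; // distinct Han characters in the book
        final int added;  // distinct characters not seen in any earlier book
        final int pool;   // size of the cumulative pool after this book
        Row(int total, int unique, int added, int pool) {
            this.total = total;
            this.unique = unique;
            this.added = added;
            this.pool = pool;
        }
    }

    // Processes the books in reading order and returns one row per book.
    static List<Row> analyze(List<String> books) {
        Set<Integer> pool = new HashSet<>();
        List<Row> rows = new ArrayList<>();
        for (String text : books) {
            Set<Integer> unique = new HashSet<>();
            int total = 0;
            for (int cp : text.codePoints().toArray()) {
                // Assumption: skip anything that is not a Han character (punctuation etc.).
                if (Character.UnicodeScript.of(cp) != Character.UnicodeScript.HAN) {
                    continue;
                }
                total++;
                unique.add(cp);
            }
            int added = 0;
            for (int cp : unique) {
                if (pool.add(cp)) { // add() returns true only for characters new to the pool
                    added++;
                }
            }
            rows.add(new Row(total, unique.size(), added, pool.size()));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> books = Arrays.asList(
                // "wo you yi zhi gou." (I have a dog.)
                "\u6211\u6709\u4e00\u53ea\u72d7\u3002",
                // "wo you yi zhi mao. ni you gou ma?" (I have a cat. Do you have a dog?)
                "\u6211\u6709\u4e00\u53ea\u732b\u3002\u4f60\u6709\u72d7\u5417\uff1f");
        for (Row r : analyze(books)) {
            System.out.println(r.unique + ":" + r.total + "  "
                    + r.added + ":" + r.unique + "  pool=" + r.pool);
        }
    }
}
```

With the two sample sentences, the first book contributes 5 characters to the pool and the second adds only 3 more, even though it is nearly twice as long. That shrinking “new” column is the effect the real table shows.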

To conclude, the missing link is ensuring that you have enough reading material which is linked to a targeted frequency analysis. Creating such a massive structure is a daunting task. It is far more difficult than it sounds. However, I am almost done that MCI Chinese textbook I mentioned in a previous post. It’s called “Da Jia Shuo Zhong Wen”. I think it’s going to be good 🙂

Kongzi

[Screenshots: Kongzi Beta Screenshot 1; Kongzi Beta Screenshot 4; Kongzi Beta Screenshot 2; Kongzi Beta Screenshot 3; Kongzi Beta Screenshot 5; Kongzi - 3sec build time]

For an exciting taste of a new program I’m working on, check out the images above. The program’s name is Kongzi, and all basic features (load, save, add, edit, quiz, stat recording) are finished. I am beginning to devote my time to increasing the default dictionary that will come with Kongzi, and to adding cool stuff like an “idiom of the day” or something. Although of course the program is already mature enough to let you create your own custom dictionaries should you want to. And you probably will if you’re a student in a Language course – you’d want to create your own lessons/units, and you can. Kongzi already handles Chinese, Japanese, and Korean – in fact, you can use it to drill any language you have a font for.

The program is 3000 SLOC in size. This seems rather large to me considering NetWhack is just a little larger but took a LOT longer to write. Chalk it up to Java’s hardball approach to GUI layout and design. So far Kongzi has taken me only a week to write, which did surprise me given its size – that’s a productive 400 SLOC a day, which I guess you could say I’m proud of. Also, my box builds it in only 3 seconds which I think is pretty cool.

In other news… Well, I think the time has come for me to stop blogging about Martial Arts. I don’t have very many more interesting things I care to share, quite frankly. I’ll still participate in discussions on other forums or do the odd book review, if I feel I have something positive to contribute. In the meantime this blog will head in other directions. I’m re-focusing on that MCI Chinese Textbook I talked about earlier on, and perhaps the Chi FAQ – I also should probably update the Lineage Project to be at least where it was way back in the nineties (I lost my backups).

Good luck and let me know what you think of my new program!