KanjiDB
Kanji Search

KanjiDB Kanji Search Documentation

The KanjiDB Kanji search system is intended for English-speaking users with nonexistent or limited knowledge of kanji, the Japanese logographic writing system based on Chinese hanzi. The system is designed to streamline the process of either (a) locating the pronunciation or meaning of a kanji character or (b) locating a kanji character with a specific meaning. Of course, the system is by no means limited to these uses, and if you find something new to use it for, have at it. Better yet, email me, especially if you see a way to improve the search.

To search for kanji compounds, or the Japanese translations of English words, you should use the Dictionary Search.

Users literate in kanji may also find this system useful, as it indexes a large number of rarely used characters. This system is not very useful for searching for or translating hanzi: Japanese uses a subset of hanzi, has often assigned different meanings to hanzi characters, and tends to use simplified versions of many hanzi characters in modern writing. For more information on the kanji writing system, try the Wikipedia article.

This system would be completely impossible without the remarkable work of Dr. Jim Breen and his contributors to the KANJIDIC and EDICT datasets which form the backbone of KanjiDB. It would also be impossible without the generous licence for the use of that data provided by the Electronic Dictionary Research Group of Monash University.

Direct Kanji Search

Direct Kanji Search allows you to jump directly to the entry for a specific kanji by entering one of its character encodings or the character itself. Characters are also searchable based on their (approximate) frequency of use in modern Japanese, although only 2500 characters are indexed as such.
  1. Direct Kanji Entry - Copy and paste a kanji character into the search box.
  2. Frequency of Use Index - A number between 1 and 2501 indicating the kanji character's rank by frequency of use in modern Japanese. The data is derived from an analysis of word frequencies in the Mainichi Shimbun over 4 years by Alexandre Girardi. Note that the frequency counts are biased towards words and kanji used in newspaper articles, and that the relative frequencies of the last few hundred kanji are quite imprecise.
  3. JIS Code - The four-digit hexadecimal encoding of the character in the JIS X 0208 or JIS X 0212 standard. Note that two characters may be represented by the same four-digit value: one from JIS X 0208 and one from JIS X 0212.
  4. EUC Code - the hexadecimal encoding of the character in the EUC-JP encoding standard. Note that JIS X 0208 characters will have four hex digits, and JIS X 0212 characters will have six hex digits, the first two always being '8f'.
  5. Shift-JIS Code - The four-digit hexadecimal encoding of the character in the Shift-JIS standard. Note that two characters may be represented by the same four-digit value: one from JIS X 0208 and one from JIS X 0212.
  6. Kuten Code - The four-digit decimal kuten code. This code locates a kanji character on a 94-by-94 grid. For example, the character 亜 is located at 16-01, hence its kuten code is 1601.
  7. Unicode - The four digit hexadecimal encoding of the character under the Unicode standard.
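
The codes in the list above are related to one another, and the relationship can be sketched in a few lines of Python. This is only an illustration built on Python's standard `euc_jp` and `shift_jis` codecs; the function name `kanji_codes` and the dictionary keys are made up for this example, and it is not a description of how KanjiDB itself stores or computes these values.

```python
def kanji_codes(ch):
    """Derive the Direct Kanji Search codes for one character (illustrative only)."""
    euc = ch.encode("euc_jp")
    if len(euc) == 2 and euc[0] >= 0xA1:
        charset = "JIS X 0208"
    elif len(euc) == 3 and euc[0] == 0x8F:
        charset = "JIS X 0212"  # three EUC bytes, always led by 0x8f
    else:
        raise ValueError("not a JIS X 0208 / JIS X 0212 character")

    hi, lo = euc[-2], euc[-1]
    codes = {
        "charset": charset,
        "unicode": format(ord(ch), "04x"),
        "euc": euc.hex(),
        # The JIS code is the EUC byte pair with the high bit of each byte cleared.
        "jis": format(((hi - 0x80) << 8) | (lo - 0x80), "04x"),
        # Kuten is the row and cell on the 94-by-94 grid, two decimal digits each.
        "kuten": "%02d%02d" % (hi - 0xA0, lo - 0xA0),
    }
    try:
        codes["sjis"] = ch.encode("shift_jis").hex()
    except UnicodeEncodeError:
        # Python's shift_jis codec has no mapping for this character.
        codes["sjis"] = None
    return codes
```

For example, `kanji_codes("亜")` gives the kuten code 1601 mentioned above, along with JIS 3021, EUC b0a1, Shift-JIS 889f and Unicode 4e9c.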

Characteristic Kanji Search

Characteristic Kanji Search performs searches based on information about the visual characteristics of a kanji character, as well as its English meanings and Japanese pronunciation. Each field will accept more than one entry, and filling out more than one field will help greatly narrow your search. Note that De Roo codes and Four Corner codes do not index all characters in the database.
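
As a rough picture of how several fields narrow a search, here is a minimal Python sketch. It assumes that multiple entries in one field are alternatives (any may match) and that every filled-in field must match; that combination rule is an assumption made for the example, not something confirmed about KanjiDB. The `Entry` class, the `search` function and the four-character sample table are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    kanji: str
    radical: int
    strokes: int
    readings: tuple
    meanings: tuple


# A tiny hand-written sample, not the real database.
SAMPLE = [
    Entry("口", 30, 3, ("ku", "kuchi"), ("mouth",)),
    Entry("日", 72, 4, ("nichi", "jitsu", "hi", "ka"), ("day", "sun")),
    Entry("月", 74, 4, ("getsu", "gatsu", "tsuki"), ("month", "moon")),
    Entry("木", 75, 4, ("boku", "moku", "ki", "ko"), ("tree", "wood")),
]


def search(entries, radicals=(), strokes=(), readings=(), meanings=()):
    """Return kanji matching every non-empty field (assumed AND across fields)."""

    def field_ok(wanted, have):
        # An empty field places no restriction; otherwise any one entry may match.
        return not wanted or any(w in have for w in wanted)

    return [
        e.kanji
        for e in entries
        if field_ok(radicals, (e.radical,))
        and field_ok(strokes, (e.strokes,))
        and field_ok(readings, e.readings)
        and field_ok(meanings, e.meanings)
    ]
```

With this sample, `search(SAMPLE, strokes=[4])` returns three characters, while adding `meanings=["moon"]` narrows the result to 月 alone.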

  1. Radical Numbers - The standard radical index of a kanji character. See Radicals.
  2. Stroke Counts - The number of strokes used to write a character. This can occasionally be confusing (for example, a character that looks as though it needs 4 strokes may actually be written with 3).
  3. De Roo Codes - Father Joseph De Roo's method for describing kanji. See De Roo Codes.
  4. Four Corner Codes - The Four Corner method for describing kanji. Note that the codes listed in the database may contain some errors. See Four Corner Codes.
  5. Readings - The phonetic pronunciation of the kanji character in Japanese (most characters have multiple pronunciations dependent on the context). This field will accept either roman characters or kana (the two Japanese syllabic alphabets). When entering roman text, be aware that the database uses a version of revised Hepburn romanization when searching readings, although, due to the difficulty of typing macrons, extended vowels are written by doubling the letter (e.g. きいて = kiite, the -te form of きく, to hear).

    You should also be aware that the database only searches for one romanization per kana. In the case of kana such as は (ha, but pronounced wa when used to denote the topic marker particle) or へ (he, but pronounced e when used to denote the destination particle), the more general pronunciation is the one the romanization engine will render. For example, the result for konnichi wa (hello, good day) will not turn up on a search for the string "konnichi wa" or "konnichiwa", because the romanization engine does not presently recognize the wa syllable as corresponding to the kana は (it would instead match わ, which is always pronounced wa). Searching for "konnichi ha" or "konnichiha" will turn up the correct result. Of course, searching for the string "こんにちは" takes the romanization engine out of the loop and will return the desired result without having to accommodate the limitations of the engine. Future upgrades to the engine may be better able to handle kana with somewhat ambiguous readings.

    There is, naturally, an exception. The kana を is technically pronounced wo, but this pronunciation is obsolete (except occasionally when it is used in names, which often have archaic spelling rules). Ignoring the name exception, it is only ever used to denote the direct object particle (pronounced o), so the romanization of the kana is not ambiguous: a search for "o" will turn it up, and a search for "wo" will not. For example, the result for en o gaku (to draw a circle) will turn up on a search for the string "en o gaku" or "enogaku".

  6. English Terms - English translations for a character. Note that many terms in Japanese are written as compounds of multiple kanji. This search only scans the English meanings of individual kanji. For English searches against compounds, see the Dictionary Search.
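
The one-romanization-per-kana behaviour described under Readings can be pictured with a short Python sketch. Everything here is hypothetical: the `KANA` table covers only a handful of kana, and `romanize` and `matches` are invented names, not the actual romanization engine. The sketch assumes spaces in a query are ignored and that a query containing kana is compared directly, bypassing romanization.

```python
# One fixed romanization per kana: は is always "ha", へ is always "he",
# and を is always "o". Only a few kana are listed for illustration.
KANA = {
    "い": "i", "え": "e", "が": "ga", "き": "ki", "く": "ku", "こ": "ko",
    "ち": "chi", "て": "te", "に": "ni", "は": "ha", "へ": "he",
    "わ": "wa", "を": "o", "ん": "n",
}


def romanize(kana):
    """Render a kana string using the single romanization listed for each kana."""
    return "".join(KANA[ch] for ch in kana)


def matches(query, reading):
    """Check a query against a kana reading, ignoring spaces in the query."""
    q = "".join(query.split()).lower()
    if any(ch in KANA for ch in q):
        # Kana input is compared directly, so romanization never gets involved.
        return q == reading
    return q == romanize(reading)
```

Under these assumptions `matches("konnichi ha", "こんにちは")` succeeds and `matches("konnichiwa", "こんにちは")` does not, while `romanize("きいて")` gives "kiite" with the doubled vowel.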

Compositional Kanji Search

Compositional Kanji Search (or "multi-radical" search) performs searches based on common visual elements in kanji characters. Although there are more than 200 such common elements to choose from, with a little practice this search is almost ridiculously useful.

For a given kanji character, check off the elements which appear in the glyph and click Search. Some component elements can appear very similar; these are specially marked in the element list. Also note that the original indexing restricted itself to JIS X 0208 kanji to describe the common elements. In some cases, JIS X 0212 kanji are more representative in describing them, and these characters have been replaced where noticed. If you have a suggestion for a compositional element which would be better represented by a different character, email me.

The index is based on work done in 1994/1995 by Michael Raine in which he analyzed all the JIS1/2 kanji and identified the constituent radicals and other common elements. Revisions were made between 1995 and 2001 by Jim Breen and others. Further revisions specific to the KanjiDB database were made in 2006.

Note that the Compositional Kanji Search only indexes JIS X 0208 kanji. JIS X 0212 kanji are not searched.
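
The check-off search described in this section amounts to finding every kanji whose element set contains all of the selected elements. A minimal Python sketch of that idea follows. The `ELEMENTS` table is a hand-written illustration, not the actual index, and the name `compositional_search` is hypothetical; the real search may rank or filter results differently.

```python
# Illustrative decompositions only; the real index covers the JIS X 0208 kanji.
ELEMENTS = {
    "明": {"日", "月"},
    "時": {"日", "土", "寸"},
    "林": {"木"},
    "杏": {"木", "口"},
    "唱": {"口", "日"},
}


def compositional_search(selected, index=ELEMENTS):
    """Return the kanji whose element set contains every selected element."""
    wanted = set(selected)
    if not wanted:
        return []  # nothing checked off, nothing to search for
    return [kanji for kanji, parts in index.items() if wanted <= parts]
```

Checking off 日 alone returns 明, 時 and 唱 from this table; checking off both 日 and 月 leaves only 明, which is why selecting more elements narrows the result so quickly.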


KanjiDB makes use of the EDICT and KANJIDIC datasets, as well as a number of EDICT-format dictionaries. EDICT, KANJIDIC, and related files and terms are the property of The Electronic Dictionary Research and Development Group at Monash University and are used in conformance with the Group's license. All EDICT-format dictionary files are the property of their respective owners and are used in conformance with license restrictions where applicable. All other content © 2006-2008 Martin Thorne.