projects:mai:dictionary
Table of Contents
dictionary
two tables, dict and mean, joined by dict.id = mean.did
- dict contains the thai word
- mean contains one or more english translations for each thai word
dict.g is type
- o - one syllable word or syllable
- m - multi-syllable word
- s - symbol, ie, ๆ
pos Part of Speech, included in mean table
- n noun
- v verb
- c conjunction
- p preposition
- j adjective
- e adverb
- r pronoun
- a particle
- i interjection
- s syllable - note: this is not a part of speech
One meaning can have multiple pos.
pm parse manual
- cp, ru, tl have been manually edited by the user, because the parser failed
- click the pm checkbox in the editor to open these fields to editing
- replaces and combines tlm and cpm
- Note: pm = true indicates the parser failed. this field must be updated when the parser succeeds
pf parse fixed (y/n)
- the record cannot be re-parsed
- cp, ru, tl cannot be changed by a user
- the testsuite is built dynamically selecting these records from the dict
- when we train an AI to do parsing, these records become the training set
- Note: when pf = true, pm may be true or false
search special
in addition to search exact and search partial:
- find words which have some known glyphs (replacing Noam::collect())
- find words which have only known glyphs
- pf = true/false
- pm = true/false
- lc = ก
- lc.length > 1
- lc includes ก
- ditto fc, tm, t
- includes rule
- type = m/o/s
options
- include only known words
- include only new words
implement via sql or dictionary iterate
ToDo
- rewrite translate()
- rewrite translit()
- rewrite iterate()
- repurpose tlm/cpm to pm/pf (see above)
- implement search special (see above)
projects/mai/dictionary.txt · Last modified: 2023/01/12 11:13 by jhagstrand