Monthly Archives: July 2017

“words” are [scientifically] baseless things!

i wrote this 25 minutes ago to apertium-stuff@lists.sourceforge.net ( https://sourceforge.net/p/apertium/mailman/message/35955660/ ):

“words” are [scientifically] baseless things!

where have they come from? just from the spaces between them. who decided to put spaces there, and why? i think they did not have good proofs, else we would know those proofs. the only theory i know of is about lexemes, which are put in dictionaries, and their word forms.

also, “words” in grammar come from old grammars written long ago for latin, arabic, etc., but that is not an authoritative source. you should know how many errors there were in the old sciences of chemistry, medicine, astronomy.

as far as i know, apertium already does not stick to traditional words: for example, for turkic languages some words which are written separately are treated as word modifier tags in apertium.

but lemmas with modifier tags are still used in apertium, and as far as i know there is no way to show whether some other word combines with the lemma only, or with the lemma plus some suffix(es)…

but i think the real atoms of syntax are morphemes, and this idea has been written about by several authors in several books.

also i think that syntax and morphology should be redivided and renamed. one of them (syntax?) should include all the trees of both syntax and morphology (a similar idea is also suggested in a book), and part of morphology should go to a science named something like “surface decoration of syntax trees”.

the difference is in the possibly different priority/order in which morphemes are combined. in many cases the resulting meaning is similar, because in those cases a(bc) = (ab)c: something can be written “a bc” but have the meaning (ab)c, and there may be not much practical problem if a translation program treats it as a(bc), since a(bc) = (ab)c. for example, “a” can be an adverb, “b” a verb and “c” a gerund suffix, as in “frankly speaking”.
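the two groupings a(bc) and (ab)c can be written down explicitly. below is a minimal python sketch, not anything taken from apertium: the names `bracketings` and `leaves` and the segmentation "frankly" + "speak" + "-ing" are only assumptions for illustration. it lists every binary grouping of a morpheme sequence and shows that all of them give the same written sequence.

```python
from typing import List, Tuple, Union

# A tree is either a single morpheme (str) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def bracketings(morphemes: List[str]) -> List[Tree]:
    """Return every binary grouping of a morpheme sequence (hypothetical helper)."""
    if len(morphemes) == 1:
        return [morphemes[0]]
    result: List[Tree] = []
    # Try every place where the sequence can be cut in two.
    for split in range(1, len(morphemes)):
        for left in bracketings(morphemes[:split]):
            for right in bracketings(morphemes[split:]):
                result.append((left, right))
    return result


def leaves(tree: Tree) -> List[str]:
    """Read the morphemes of a tree back in written order."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


# a = adverb, b = verb stem, c = gerund suffix (segmentation assumed for the example)
trees = bracketings(["frankly", "speak", "-ing"])
for tree in trees:
    print(tree, "->", " ".join(leaves(tree)))
```

for three morphemes this gives exactly the two groupings a(bc) and (ab)c. the written order alone does not say which of them is meant.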

i can give an example where this makes a practical difference. in turkic languages the verb negation suffix is written attached to the verb, and in apertium it is also handled as a tag. usually an adverb combines with the verb stem (ie the part without the negation suffix), and negation applies to the phrase consisting of verb and adverb. for example, “кызу бармады” (“qozu barmado”) in tatar is “did not go fast” and has the structure “{{кызу бар}ма}ды”, “did not {go fast}”. but you cannot use this as a rule: a similarly written sequence of morphemes can also have another structure. “бөтенләй эшләмәде” (“botonlay islamadi”) means “(he/she/it/they) has not worked at all” and has the structure “{бөтенләй} {эшләмәде}”, “{did not work} {at all}”, or “{{бөтенләй} {эшләмә}}де”, “did {{not work} {at all}}”. (alternatively it could have the structure “{{{бөтенләй эшлә}мә}де}” and the meaning “did not make wholly”, “did not {make wholly}”.)
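the two structures can be compared in a small python sketch. this is only an illustration under my own assumptions, not apertium's data model: the classes `Morpheme` and `Node`, the helper `negation_scope` and the gloss labels "NEG" and "PAST" are hypothetical names. each morpheme is a separate tree node, and the function reports which morphemes the negation suffix applies to.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Morpheme:
    form: str   # written form
    gloss: str  # rough english gloss, or a label such as NEG / PAST


@dataclass(frozen=True)
class Node:
    left: "Tree"
    right: "Tree"


Tree = Union[Morpheme, Node]


def glosses(tree: Tree) -> List[str]:
    """Collect the glosses of all morphemes under a tree, in written order."""
    if isinstance(tree, Morpheme):
        return [tree.gloss]
    return glosses(tree.left) + glosses(tree.right)


def negation_scope(tree: Tree) -> Optional[List[str]]:
    """Return the glosses that the NEG suffix attaches to, or None if there is no NEG."""
    if isinstance(tree, Morpheme):
        return None
    if isinstance(tree.right, Morpheme) and tree.right.gloss == "NEG":
        return glosses(tree.left)
    return negation_scope(tree.left) or negation_scope(tree.right)


# {{кызу бар}ма}ды : did not {go fast}
go_fast = Node(
    Node(Node(Morpheme("кызу", "fast"), Morpheme("бар", "go")), Morpheme("ма", "NEG")),
    Morpheme("ды", "PAST"),
)

# {{бөтенләй} {эшләмә}}де : did {{not work} {at all}}
work_at_all = Node(
    Node(Morpheme("бөтенләй", "at all"), Node(Morpheme("эшлә", "work"), Morpheme("мә", "NEG"))),
    Morpheme("де", "PAST"),
)

print(negation_scope(go_fast))      # negation covers adverb + verb
print(negation_scope(work_at_all))  # negation covers the verb only
```

both phrases are written as adverb, stem, negation suffix, past suffix, but in the first tree the negation covers “fast” and “go”, and in the second it covers only “work”. a tree whose nodes are whole word forms cannot record this difference.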

to translate this correctly from tatar to english it is better to use morphemes, instead of words, as atoms and tree nodes, because you have to find the correct tree structure before you translate, and you have to be able to put morphemes at the correct places in the tree. as far as i remember, apertium does not use syntax trees at all for now, or uses them only for some language pairs, or you have some tool for them and are experimenting with it, but it puts whole word forms in the tree nodes.

probably there are also other examples with other suffixes. tatar also has an imperative mood suffix, for which i expect to find a similar example, and i do not rule out such a problem with other suffixes, such as negation and gerund suffixes, when translating from one language to another.