Archive for the ‘linguistics’ Category

“words” are [scientifically] baseless things!

i wrote this 25 minutes ago to apertium-stuff@lists.sourceforge.net : https://sourceforge.net/p/apertium/mailman/message/35955660/ :

“words” are [scientifically] baseless things!

where have they come from? just from the spaces between them. who decided to put spaces there, and why? i think they did not have good proofs, else we would know those proofs. i know only the theory about lexemes, which are put in dictionaries, and their word forms.

also “words” in grammar come from old grammars written in old times for latin, arabic, etc. but that is not an authoritative source. you should know how many errors there were in the old sciences of chemistry, medicine, astronomy.

as far as i know, apertium already does not stick to traditional words: for example, for turkic languages some words which are written separately are used as word modifier tags in apertium.

but lemmas with modifier tags are still used in apertium, and as far as i know there is no way to show whether some other word is used with the lemma only, or with the lemma plus some suffix(es)…

but i think the real atoms of syntax are morphemes, and this idea has been written about by several authors in several books.

also i think that syntax and morphology should be redivided and renamed. one of them (syntax?) should include all trees of both syntax and morphology. (a similar idea is also suggested in a book.) and part of morphology should go to a science named something like “surface decoration of syntax trees”.

the difference is in the possibly different priority/order of applying morphemes. in many cases the resulting meaning is similar, because in those cases a(bc) = (ab)c ; it can be written “a bc” but have the meaning (ab)c, and there is not much practical problem if a translation program uses it as a(bc), since a(bc) = (ab)c. for example, “a” can be an adverb, “b” – a verb and “c” – a gerund suffix, as in “frankly speaking”.
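the a(bc) = (ab)c point can be sketched in python. this is only an illustration and not something apertium does: the nested tuples, the `leaves` helper and the spelling “ing” for the gerund suffix are assumptions made for the example.

```python
# a tree is a nested tuple; a leaf is a morpheme written as a string
def leaves(tree):
    """Return the morphemes of a tree in surface (left-to-right) order."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree:
        result.extend(leaves(child))
    return result

a, b, c = "frankly", "speak", "ing"   # adverb, verb, gerund suffix

written = (a, (b, c))   # a(bc): the grouping suggested by the spelling "frankly speaking"
meant = ((a, b), c)     # (ab)c: the gerund suffix applied to the phrase "frankly speak"

same_surface = leaves(written) == leaves(meant)   # the morpheme sequence is identical
same_structure = written == meant                 # the bracketing is not
```

the two trees cannot be told apart from the written sequence alone, which is harmless only while both bracketings give the same meaning.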

i can give an example where this makes a practical difference. in turkic languages the verb negation suffix is written attached to the verb, and in apertium it is also used as a tag. usually an adverb is used with the verb stem (ie with the part without the negation suffix) and negation is applied to the phrase consisting of verb and adverb. for example: “кызу бармады” – “qozu barmado” in tatar is “did not go fast” and has the structure “{{кызу бар}ма}ды” – “did not {go fast}”. but you cannot use this as a rule, a similarly written sequence of morphemes can also have another structure: “бөтенләй эшләмәде” – “botonlay islamadi” means “(he/she/it/they) has not worked at all” and it has the structure “{бөтенләй} {эшләмәде}” – “{did not work} {at all}” , or “{{бөтенләй} {эшләмә}}де” – “did {{not work} {at all}}”. ( alternatively it could have the structure “{{{бөтенләй эшлә}мә}де}” and the meaning “did not make wholly” – “did not {make wholly}”. )
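the brace notation above can be read by a small program. the following python sketch is only an illustration under stated assumptions: `parse`, `leaves` and `scope_of` are hypothetical helpers (not apertium functions), braces are assumed to be balanced, and in the second example “эшләмә” is written split as “эшлә мә” so that the negation suffix is a separate node.

```python
def parse(text):
    """Turn brace notation like "{{a b}c}d" into nested lists of morphemes."""
    stack = [[]]          # stack[-1] is the group currently being filled
    token = ""
    for ch in text:
        if ch in "{}" or ch.isspace():
            if token:                 # any delimiter ends the current morpheme
                stack[-1].append(token)
                token = ""
            if ch == "{":
                stack.append([])
            elif ch == "}":
                group = stack.pop()
                stack[-1].append(group)
        else:
            token += ch
    if token:
        stack[-1].append(token)
    return stack[0]

def leaves(tree):
    """Return the morphemes of a tree in surface order."""
    if isinstance(tree, str):
        return [tree]
    return [m for child in tree for m in leaves(child)]

def scope_of(tree, morpheme):
    """Return the morphemes grouped directly with `morpheme`, or None if it is absent."""
    if isinstance(tree, str):
        return None
    if morpheme in tree:
        return [m for child in tree if child != morpheme for m in leaves(child)]
    for child in tree:
        found = scope_of(child, morpheme)
        if found is not None:
            return found
    return None

go_fast = parse("{{кызу бар}ма}ды")             # did not {go fast}
at_all = parse("{{бөтенләй} {эшлә мә}}де")      # did {{not work} {at all}}
wholly = parse("{{{бөтенләй эшлә}мә}де}")       # did not {make wholly}

neg_go_fast = scope_of(go_fast, "ма")   # negation covers adverb + verb
neg_at_all = scope_of(at_all, "мә")     # negation covers the verb only
neg_wholly = scope_of(wholly, "мә")     # negation covers adverb + verb
```

here `neg_at_all` contains only the verb stem, while `neg_go_fast` and `neg_wholly` contain both the adverb and the verb stem, although the written sequence of adverb, stem, negation and tense suffix looks the same in all three cases.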

to translate this correctly from tatar to english it is better to use morphemes as atoms, as tree nodes instead of words, because you have to find the correct tree structure before you translate, and you have to be able to put morphemes at the correct places of the tree. as i remember, apertium does not use syntax trees at all for now, or uses them only for some language pairs, or you have some instrument for them and are experimenting with them, but it puts whole word forms in the tree nodes.

probably there are also other examples with other suffixes. there is also an imperative mood suffix in tatar, with which i expect to find a similar example, and i do not rule out the same problem with other suffixes, like negation and gerund suffixes, when translating between other languages.

Sentence syntax trees should be made from morphemes. Semantically ordered trees.

Previous version of this paper.

I have made a new edition. sha1: 7839a0260d009f1e65b7c2c3c9183f22c8369e86 . md5: 2e17164501cc67ba72f9a12f011a6554 . size: 239398 bytes.

right (correct) analysis of phrase structure…

grammarians do not separate morphemes when they build a dependency tree, they do it like this:
[image: he has read the last known bug 1]
this is my example. another example from https://en.wikipedia.org/wiki/Dependency_grammar :

a better way used by grammarians is to connect not only words but blocks of words, like this:
[image: he has read the last known bug 2]
this is my example. another example from https://en.wikipedia.org/wiki/Government_and_binding_theory :

and i make the analysis / dependency tree separating morphemes and using them as if they were “separate words”, like this:
[image: he has read the last known bug -mine2]
2013-november-30: this was not correct in the order of have, he, s, read – i have fixed it: [image: he has read the last known bug -mine-correcting]
in other words, the separateness of words exists only in writing, it is not a feature of language itself.

though past tense is shown as “[PAST]” in the 2nd image in the 2nd article ( http://upload.wikimedia.org/wikipedia/en/7/7c/HeSmashedTheVase1.png ), it is not written as a separate word (ie as “ed”), it is not even written at all, only the root of the verb is written, so it is not clear. but in the other images they do not separate morphemes: “likes”, “saw” have only 1 connection each. and they show some “almost glued” parts separately in the first article i have linked:

– “'s”, “'ll”, “'ve” are separated, but again, “would” is not separated as will+ed.

i wrote about this in tatar yesterday: http://qdinar.wp.kukmara-rayon.ru/2013/11/25/grammatika-no-nicik-yasa-w-doros/ , https://vk.com/wall17077748_2708 .

last year i wrote a suggestion in several forums to write morphemes separately in ural-altaic languages.

2015-02-21 : i have made a paper about this and other things: http://qdb.wp.kukmara-rayon.ru/?attachment_id=311 .

there are no adverbs, adjectives, nouns in turkic languages and in english

i have written in http://sourceforge.net/mailarchive/message.php?msg_id=29490187 today ie 2012-07-22 at 13:14 UTC+4 :
hello
another reason why there are no cases in turkic languages:
we can add other postfixes after those postfixes, for example, loq, raq: oyda – at home, oydalik – being at home, the fact that somebody is at home. urmanga – in the direction of the forest, urmangaraq – more in the direction of the forest.
and this is not possible with cases in russian.
and another reason:
these postfixes in turkic languages are not strictly separated from other suffixes, and also not technically/grammatically separated. so all other suffixes can just as easily be considered case suffixes. for example, lo and siz, which you mentioned. and even more suffixes: cho, which corresponds to “er” in english: ischi – worker. since the “nin” suffix, which means “of”, is considered a case suffix, “cho” also can be. by the way, how have grammarians so easily included “nin” among the case suffixes? it is the only one of the 6 in tatar which generally creates a word that describes a noun, all the others generally describe a verb, and, unlike them, it generally requires the “(s)i(n)” suffix added at the end of the word (noun) it describes. (all the other suffixes are used to create different arguments of the main verb of the sentence, and the nin suffix is not, so it looks like it is just copied from the existence of the russian genitive.) though lo and siz and cho and others differ from “nin” in that they do not require “(s)i(n)” at the end of the described word, by this logic they could also be considered case suffixes, i think.
the main mistake made by grammarians is that they have not understood why these categories are distinguished in european languages. of course the same semantic meanings exist in all languages, and they just copied, “created” (in quotes) categories in turkic languages that correspond semantically to words which are in different categories in european languages. that is possible, but it is not correct. the same mistake is made not only by turkic grammarians but also by english grammarians.
the error in english grammar is the categorisation into adverbs, adjectives, nouns: while these are always marked in russian, they are not always marked in english, so grammatically there are no such categories in english. the same holds in turkic languages, as in jonathan’s example “without coins” – “tiyinsiz”. another example: “fast”. it can be an adverb and an adjective, and many of its meanings as adjective and as adverb are the same. and where the meanings are not the same, it is just because it semantically cannot be an adverb or an adjective. for example, in http://en.wiktionary.org/wiki/fast : “Of people: steadfast, with unwavering feeling” has no corresponding meaning as an adverb, but that is just because it semantically cannot. also “(computing, of a piece of hardware) Able to transfer data in a short period of time”. the “Of dyes or colours: not running or fading when subjected to detrimental conditions such as wetness or intense light; permanent [from 17th c.]” meaning i think could be used as an adverb, for example, with the verb “color”: “to color fast”. maybe it is used? i think it is used and has just semantically merged/fused with the “In a firm or secure manner, securely; in such a way as not to be moved [from 10th c.]” meaning. “Immediately following in place or time; close, very near [from 13th c.]” is only an adverb. could it not be an adjective? i think it could, for example, something like “fast arrival”. i do not know real english well, so it is better if you investigate this. what about nouns? i mentioned nouns too. can these be nouns? – adjectives can just be used as nouns, can they not? somebody can make 2 criticisms: 1. a noun can have the “s” plural suffix, adjectives cannot; adjectives can have the “er” suffix, nouns cannot. but i can argue that these just semantically cannot be used, and why not? they can both be used: fasters .
and grammatically the plural suffix in english is used only once, after a block of words (like in turkic languages); it is not used after so called “adjectives” just because of that, ie only when they have a word after them and are not the last word of the block of words. as i have shown, if it is the last word, it can have that suffix, though with the presence of “er” it is considered an “adjective” by modern grammarians. 2. another possible criticism: in english adverbs, adjectives, nouns have different positions in the sentence. my answer: that is another thing, that is about the role in the sentence. for example, “i am fast” – here it is in object role. “i go fast” – here it is in adverbial role. (this is because “am” generally requires an object here, and “go” requires not a direct object but an indirect object or adverbials.) the noun or adverb ranges of meaning are distinguished just semantically, without grammatical markers. it is like when we say “leg of man” and “leg of elephant”: the corresponding meaning is automatically selected semantically, the leg of a man and the leg of an elephant are not the same.
the same holds in turkic languages. there the positions of “nouns”, “adverbs” are different from those in english, while “adjectives”, ie words in adjective role, are in the same position as in english – before the described noun.

there are no cases in turkic and finno-ugric languages

http://sourceforge.net/mailarchive/message.php?msg_id=29490187 :

( > at the beginning of a line means quoting, and the phrases after it are mostly not mine)

i posted at 2012-07-03 08:40 UTC (12:40 pm MSK):

hello.
i also think there are no cases in turkic languages and probably also in most finno-ugric languages, and in other uralic and altaic languages.
because:

what is called a case ending in indoeuropean and semitic languages is clearly divided from prepositions in that prepositions come before the word, and cases come after the word (case endings are at the end of words, after the main part of the word),
and a second, less clear division is that
prepositions are not modified for different words, while cases look different for different words (this second rule has a few exceptions, for example, the english ‘s case suffix is always the same “s”, and the russian “о” preposition may differ for different words: “о”, “об”, “обо”).

while in turkic languages there are no prepositions,

almost no suffixes that differ for different words as greatly as in indoeuropean languages: for example, in russian, the genitive “suffix” may be “i”, “a”, “”, “ey”, “ogo”, “ih”, etc, and likewise all other cases, while in turkic languages they do not differ so strongly, but only slightly: “non”, “nin”, “don”, “din”, for example, for the so called “genitive”.

no, the stronger difference from case endings is that
turkic case suffixes are agglutinative/clitic,
but cases in indo-european are inflectional (and may be fusional), that means, the main part of the word, for many types of nouns, is always used with a case ending, even in the nominative case, though some classes of nouns can be used with an “empty” case ending in some cases, and an empty case ending can mean different cases: for example, “stol” and “knig” in russian both have empty case endings, but “stol” is the nominative case, and “knig” is the genitive case of the plural form.

in turkic languages, the “main part” of the word (ie with an “empty” ending, ie no ending) is just a noun in the nominative case, and all case suffixes are just like prepositions that are written after the word instead of before it, so they are postpositions. they do differ slightly depending on the word, as i said, “non”, “nin”, etc, but the same happens with prepositions in indoeuropean languages, as i said, “o”, “ob”, “obo” in russian, and there are other examples: “v”, “vo”, “k”, “ko”. but both of them, prepositions and postpositions, do not change the word to which they connect. cases are not like that: as i said, they are not just set next to the nominative case of the noun, they modify its last part (ending). this is why they are called cases in the languages where they truly exist; those can be called “casitive” languages. these languages, for example, the indoeuropean ones, are called “inflectional”, and this inflectionality is in the cases. in turkic languages there is no such thing. the so called case suffixes, which are written connectedly, together with the noun as one word, like prefixes, and the so called postpositions, which are written separately from the word to which they apply in modern turkic orthographies, should not be classified as 2 things, they should be classified together, as 1 class, and all things in it can be called suffix, postfix, postposition. all these postfixes, as i will call them, differ from each other slightly in various properties. for example, the “cha” suffix does not take stress on itself: kita’pcha (it is not classified as a case suffix in modern official grammar, but rather as a suffix that creates a new lexeme/meaning, but in fact its meaning is constant, so it is a grammatical thing), while “qa” takes stress on itself: “kitapqa’ ”. most of them take stress on themselves, only several don’t. one more that doesn’t: “bilan”: “kita’pbilan” (it is written separately in modern orthography).
but the so called postpositions and so called cases of modern turkic languages sometimes have a feature that also exists in true “casitive” languages: a preposition always requires a certain case of the word to which it applies. that is so in russian: “o knige” – “about book”, where “book” must be in the prepositional case. and for example, when the “I” pronoun is used in english after a preposition, it must be in the accusative case: “about me”; but pronouns in english are an exception among all nouns in this behavior. the same feature is there in tatar: “with me” is not “minbilan” but “minimbilan”, ie the so called genitive case is required, but that is an exception for pronouns, like in english. there is also a postposition, and maybe there are others too, that requires a so called case suffix to be applied to the word to which it applies: “taba”, which means “in direction of”, requires the suffix of the so called dative case: “maktapka taba” (maktap is school).

2012/7/3, Mikel Forcada :
> Thanks a lot, guys!
>

>
> I am absolutely persuaded that calling these things cases is wrong.
> There is no “nominative” case, but the absolute form of the word. And
> then the genitive, accusative,etc.. are clitic postpositions that
> attach to the last member of the NP, clearly a noun. I have argued
> about this with Basques for ages.
>

then i have posted (more…)