Indian Lexicon: An Overview

Indian Lexicon
DEDICATED TO: PA_N.INI and TOLKA_PPIYAN-
Dr. S. Kalyanaraman 11 May 1998.

The author assumes full responsibility for the semantic and etymological judgements made and for the errors that may have crept in over thousands of database iterations in organizing the semantic clusters found in the word lists (the lexicon includes over half a million Indian words). The author is indebted to Prof. Krishnamurthy, who first observed, in a review (1994) of an earlier draft of the Lexicon (alphabetically sequenced), that the model of Carl Darling Buck's work for Indo-European languages may also be adapted. The author hopes that, despite the impossibility of 'dating' the origin of a word, its inherent limitations, its omissions, intentional or otherwise, and the errors that will in due course be pointed out by scholars specialized in their fields, the Indian Lexicon will be a tentative but bold start of a skeleton dictionary of the Indian linguistic area ca. 3000 B.C., and that it will be expanded further to include modern words.


Advanced Search Engines:

Organizing (Clustering) the lexemes of Indian languages

Elucidates the method used to organize the synonyms in alphabetic, semantic sequences and to facilitate comprehensive searches for ANY INDIAN LANGUAGE WORD OR ANY ENGLISH MEANING.
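A minimal sketch of how such a two-way lookup might be organized, assuming each semantic cluster carries one English gloss and a list of (language, word) pairs. The class names `Cluster` and `LexiconIndex`, the cluster numbers, and the sample entries are hypothetical illustrations, not taken from the Lexicon's database; the words are written with the `_` convention for long vowels used on this page.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Cluster:
    cluster_id: int                               # hypothetical cluster number
    gloss: str                                    # English meaning of the cluster
    lexemes: list = field(default_factory=list)   # (language, word) pairs


class LexiconIndex:
    """Looks up clusters by any member word or by an English gloss word."""

    def __init__(self):
        self.clusters = {}
        self.by_word = defaultdict(set)
        self.by_gloss = defaultdict(set)

    def add(self, cluster):
        self.clusters[cluster.cluster_id] = cluster
        # Index every word of the English gloss.
        for token in cluster.gloss.lower().split():
            self.by_gloss[token].add(cluster.cluster_id)
        # Index every lexeme, whatever its language.
        for _language, word in cluster.lexemes:
            self.by_word[word.lower()].add(cluster.cluster_id)

    def search(self, query):
        """Return clusters matching the query as a lexeme or as a gloss word."""
        key = query.lower()
        ids = self.by_word.get(key, set()) | self.by_gloss.get(key, set())
        return [self.clusters[i] for i in sorted(ids)]

    def alphabetical(self):
        """Return all clusters in alphabetical order of their English gloss."""
        return sorted(self.clusters.values(), key=lambda c: c.gloss.lower())


# Illustrative entries only (assumed for this sketch).
index = LexiconIndex()
index.add(Cluster(1, "water", [("Tamil", "ni_r"), ("Sanskrit", "ni_ra")]))
index.add(Cluster(2, "fish", [("Tamil", "mi_n"), ("Sanskrit", "mi_na")]))

for cluster in index.search("water"):
    print(cluster.cluster_id, cluster.gloss, cluster.lexemes)
```

The same `search` call accepts either an English meaning ("fish") or a word of any language ("mi_na"), which is the behaviour the description above asks for; a real index would also need to handle spelling variants and multi-word glosses.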

An Overview and Objectives

This is a comparative study of the 'semantics' of lexemes of all the languages of India (which may also be referred to, in a geographical/historical phrase, as the Indian linguistic area). The objective of the lexicon is to discover the semantic repertoire of India ca. 3000 B.C. to further facilitate efforts at deciphering the inscriptions and script of the Sarasvati-Sindhu civilization.

The Indian Lexicon establishes an Indian Linguistic Area, ca. 3000 B.C. by authenticating the use of the lexemes for inscriptions of the civilization of the ancient period.

This Indian lexicon seeks to establish a semantic concordance across the languages or numeraire facile of the Indian linguistic area: from Brahui to Santali to Bengali, from Kashmiri to Mundarica to Sinhalese, from Marathi to Hindi to Nepali, from Sindhi or Punjabi or Urdu to Tamil. A semantic structure binds the languages of India, which may have diverged morphologically or phonologically, as evidenced in the oral tradition of Vedic texts, or in the epigraphy, literary works or lexicons of the historical periods. This lexicon, therefore, goes beyond the commonly held belief in an Indo-European language and is anchored on proto-Indian sememes.

The work covers over 8,300 semantic clusters which span and bind the Indian languages. The basic finding is that thousands of terms of the Vedas, the Munda languages (e.g., Santali, Mundarica, Sora; cf. Munda lexemes in Sanskrit) (37 kb.), the so-called Dravidian languages and the so-called Indo-Aryan languages have common roots. This belies the received wisdom of cleavage between, for example, the Dravidian or Munda and the Aryan languages.

The idea of a semantic dictionary

Carl Darling Buck, A dictionary of selected synonyms in the principal Indo-European languages: a contribution to the history of ideas, 1949, Univ. of Chicago Press. "The associations underlying semantic changes are so complex that no rigid classification of the latter is possible. Many changes may be variously viewed. In a sense, each word has its individual semantic history... in the history of words for domestic animals the conspicuous feature is the frequent interchange between classes of the same species, as when words of the same cognate group denote in different languages 'bull', 'ox', or 'cow', and in another species 'ram, wether', or 'lamb', or show a shift from 'wether', through an intermediate generic use, to 'ewe'. "Semantic borrowing" refers to the borrowing not of the formal word but of some special meaning. There are, of course, great numbers of actual loanwords, some in Greek from pre-Greek sources, many in Latin from Greek, still more in most of the European languages from Latin or in many cases more specifically from French; again from early Germanic and later from German in Balto-Slavic and from Slavic in Rumanian. But besides these there are "translation words". A special use of a familiar foreign word was adopted for the usually corresponding native word. Thus Lat. na_vis 'ship' came to be used in Christian times for the 'nave' of a church... Semantic word study may proceed from two opposite points of view, form or meaning. For example, one may study the history of Lat. di_cere 'say' and its cognates in Latin, or, with enlarged scope, its cognates in all the Indo-European languages; in other words the diverse uses of derivatives of the Indo-European root *deik- and its probable sense. Such is the material brought together in the etymological dictionaries of the usual type. Conversely, one may start from the notion 'say' and study the history of words used to express it in different languages...
By the study of synonyms, their etymology and semantic history, one seeks to show the various sources of a given notion, the trails of its evolution... also presents an interesting picture of word distribution... Even for the Indo-European field anything like a complete semantic dictionary is beyond probable realization at present... The specialist can recognize these, and at the same time is aware of how large a proportion of the current etymologies, even in most of the best etymological dictionaries, are uncertain, with varying degrees of probability or plausibility... and it is best to let the facts speak for themselves in each case."

Sememes (213 kb.)

"Sememe [sememic(s)] is a term used in SEMANTIC theories to refer to a minimal UNIT of MEANING. For some, a sememe is equivalent to the meaning of a MORPHEME; for others it is a FEATURE of meaning, equivalent to the notion of 'semantic COMPONENT' or 'semantic feature' in some theories. The term sememics is used as part of the description of strata in STRATIFICATIONAL GRAMMAR; the 'sememic stratum', which handles the SYSTEMS of semantic relationship between LEXICAL ITEMS, is here distinguished from the HYPERSEMEMIC stratum, at which is analysed the relationship between LANGUAGE and the external world. SEMOTACTICS, in this approach, involves the study of SEQUENTIAL arrangement of sememes. See Robins 1980: Ch.8." (David Crystal, A Dictionary of Linguistics and Phonetics, Blackwell, 1991).

These sememes should be distinguished from dha_tus or verb roots since the radicals span both nouns and verbs and also include attributive thoughts connoted, for example, by adjectives.

Sememes are clustered in the Indian Lexicon more like classemes, a term used by some European linguists, e.g. Eugene Coseriu, to refer to the relatively abstract SEMANTIC FEATURES shared by LEXICAL items belonging to different semantic FIELDS, e.g. animate/inanimate, adult/child. The term contrasts with the irreducible semantic features (SEMES) which work, at a very particular level, within a particular semantic field, e.g. table being identified in terms of 'number of legs', 'shape', etc. (See Lyons 1977: Ch.9.)

Many sememes are from Sanskrit, which reinforced the development of the literary structures of historical periods of all the languages flowing from the proto-Indian lingua franca. CDIAL (with comparative etymological groups collected over a period of 40 years until 1966) provides thousands of possible derivations or phonological reconstructions of 'old Indo-Aryan form' in Sanskrit, within its 14,845 head-words. A magnificent attempt was made in the past, by linguists of unsurpassed erudition, to identify the sememes of Indian languages. A notable result was the formation and delineation of the grammatical rules of the Sanskrit language.

Indian Lexicon establishes that over 3000 etyma of the Dravidian Etymological Dictionary (DEDR) have concordant sememes in the lexemes of Indo-Aryan and Munda languages, thus negating the linguists' differentiation of the Dravidian tongues from the Indo-Aryan and Munda language streams.
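A concordance of this kind can be thought of as a check on which conventional language families are represented inside each cluster. The following is only a sketch under stated assumptions: the family labels in `FAMILY_OF` follow the conventional classification that the Lexicon questions, the cluster numbers are hypothetical, and the sample words are illustrative rather than drawn from DEDR or the Lexicon. Counting families in a cluster shows a shared meaning and similar form; by itself it does not establish common origin.

```python
# Conventional family labels (an assumption of this sketch).
FAMILY_OF = {
    "Tamil": "Dravidian",
    "Telugu": "Dravidian",
    "Sanskrit": "Indo-Aryan",
    "Hindi": "Indo-Aryan",
    "Santali": "Munda",
}


def families_in(lexemes):
    """Return the set of families represented by (language, word) pairs."""
    return {FAMILY_OF[lang] for lang, _word in lexemes if lang in FAMILY_OF}


def spanning_clusters(clusters, min_families=2):
    """Return ids of clusters whose lexemes come from at least min_families families."""
    return [
        cluster_id
        for cluster_id, lexemes in clusters.items()
        if len(families_in(lexemes)) >= min_families
    ]


# Hypothetical cluster numbers with illustrative members.
clusters = {
    101: [("Tamil", "ni_r"), ("Sanskrit", "ni_ra")],      # 'water'
    102: [("Tamil", "mi_n"), ("Sanskrit", "mi_na")],      # 'fish'
    103: [("Sanskrit", "pa_ni_ya"), ("Hindi", "pa_ni_")],  # 'drinkable; water'
}

print(spanning_clusters(clusters))
```

Cluster 103 is left out because both of its members carry the same family label; raising `min_families` to 3 would restrict the result to clusters attested in all three streams.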

The Indus (Sarasvati-Sindhu) Script decipherment problem

Many lexemes will be dated to circa 3000 B.C., when the most expansive civilization of the times flourished in thousands of settlement sites in the Sarasvati-Sindhu doab. This dating for the selected lexemes is based on a suggested rebus/semantic clustering method to decipher the script of the civilization. The underlying language may be called the Indian; hence, the lexicon is called "An Indian Lexicon".

A paradigm change is posited that circa 3000 B.C., the Indo-Aryan, Dravidian and Munda sub-families of languages had not been differentiated fully. This hypothesis has to be validated further through more linguistic/lexical studies.

Etyma in Niruktam (25 kb.)

Roots (Dha_tupa_t.ha) (156 kb.)

Dha_tupa_t.ha (lit. a study of roots or verb forms), reputedly by Pa_n.ini, provides a list of 2200 roots (in ten technical classes) with almost all irregular and noteworthy forms, which can be expanded in the series of active, passive, causal, desiderative and derivative groups.

The classification of verbal bases into the ten classes is based on the vikaran.a, the inserted conjugational affix, i.e. the conjugational sign placed between the root and the terminations.
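The ten classes can be laid out as a small lookup table. This is a sketch for orientation only: the class names, sample roots and signs below are supplied from standard Sanskrit grammars rather than from this page, the signs are given in simplified surface form (not as Pa_n.ini's technical labels), and the names `Gana`, `GANAS` and `describe` are hypothetical.

```python
from typing import NamedTuple


class Gana(NamedTuple):
    name: str   # traditional class name, taken from its first root
    root: str   # sample root that names the class
    sign: str   # simplified class sign inserted before the terminations


# Assumed from standard grammars; long vowels use the page's "_" convention.
GANAS = {
    1: Gana("bhva_di", "bhu_", "a"),
    2: Gana("ada_di", "ad", "(none)"),
    3: Gana("juhotya_di", "hu", "(none; root reduplicated)"),
    4: Gana("diva_di", "div", "ya"),
    5: Gana("sva_di", "su", "nu"),
    6: Gana("tuda_di", "tud", "a (accented)"),
    7: Gana("rudha_di", "rudh", "na (infixed in the root)"),
    8: Gana("tana_di", "tan", "u"),
    9: Gana("krya_di", "kri_", "na_"),
    10: Gana("cura_di", "cur", "aya"),
}


def describe(class_number):
    """Return a one-line description of a conjugation class."""
    gana = GANAS[class_number]
    return f"Class {class_number} ({gana.name}): root {gana.root}, sign {gana.sign}"


for number in sorted(GANAS):
    print(describe(number))
```

The table records only the class sign; actual stem formation also involves vowel strengthening and sandhi, which this sketch does not attempt.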

Verb forms (Whitney) (42 kb.)

The roots, verb-forms, and primary derivatives of the Sanskrit language: A supplement to his Sanskrit grammar.

Whitney's work lists all the quotable roots of the Sanskrit language together with the tense and conjugation-systems made from them, the noun and adjective (infinitival and participial) formations that attach themselves most closely to the verb, and the other derivative noun and adjective-stems usually classed as primary. "... since etymology is from beginning to end a matter of balancing probabilities, and thick-set with uncertainties and chances of error. It has been my intention to err rather upon the side of liberality of inclusion than the opposite... main intent is to furnish the means of examining in their chronologic entirety the groups of words and forms that cluster about the so-called roots in Sanskrit, that they may be studied, and have their relations determined, with more complete understanding... The meanings added after the roots by no means claim to be exhaustive; they are in general intended only to identify the root... The classes of (verb-)forms that contain the most puzzling problems are the reduplicated ones, and the present stems ending in ya..." (Whitney, 1885, pp. v to xiii).

Phonetic guide (Basic sounds of the language)

Abbreviations: Grammatical

Abbreviations used for linguistic categories and other languages

Bibliography (Textual sources of lexemes)