Characteristics of the Hungarian language in AX Semantics

Fundamentals

Hungarian is an agglutinative language with multiple suffixes for words (maximally 6 suffixes for one word). Hungarian doesn't need gender.

Opinions differ on how many cases exist in Hungarian and what is or is not an adverbial suffix. On the AX NLG platform, we consider 18 cases.

grammatical namevaluesexamples
numbersingulara lakás
(the apartment)
plurala lakások
(the apartments)
cases (noun)nominativelakás
dativelakásnak
accusativelakást
terminativelakásig
translativelakássá
adessivelakásnál
ablativelakástól
allativelakáshoz
illativelakásba
elativelakásból
delativelakásról
essive-formallakásként
essive-modallakásul
inessivelakásban
instrumental-comitativelakással
causal-finallakásért
sublativelakásra
superessivelakáson
adjectivesbefore nouna régi lakás
(the old apartment)
verb tensespresentő lát (he sees)
pastő látta (he saw)

The standard order of a noun phrase in Hungarian is the following:

determiner + numeral + adjective + noun + preposition

See for example:

ezen a  három  gyönyörű   folyón               át
these   three  beautiful  rivers[sg,superess]  across
DET     NUM    ADJ        NOUN                 PREP
"about these three popular books"

In Hungarian prepositions are rather expressed via case markings and, if used, they are mostly put as postpositions behind the noun. Some postpositions like át can be used both before or after the noun, e.g. át a folyón = a folyón át ["across the river"].

Lexicon

Nouns

For Hungarian nouns in lexicon, the inflection table needs to encode the case and number. The preposition switch and determiner switch are also available.

The lexicon entry for lakás (apartment) contains:

  • inflection table for case and number:
SingularPlural
nominativelakáslakások
dativelakásnaklakásoknak
accusativelakástlakásokat
terminativelakásiglakásokig
translativelakáslakásokká
adessivelakásnállakásoktól
ablativelakástóllakásoktól
allativelakáshozlakásokhoz
illativelakásbalakásokba
elativelakásbóllakásokból
delativelakásróllakásokról
essive-formallakáskéntlakásokként
essive-modallakásul-
inessivelakásbanlakásokban
instrumental-comitativelakássallakásokkal
causal-finallakásértlakásokért
sublativelakásralakásokra
superessivelakásonlakásokon

The lexicon entry for Egyesült Arab Emírségekben (the United Arab Emirates):

  • case and number (like above inflection table)
  • always get the definite determiner ("az")

Note

If you need lexicon entries for countries, write to the support about that and you will get them for Hungarian with automatic handling of determiners.

Adjectives

Hungarian adjectives do not inflect, when used in the attributive position before a noun. There is no need to add lexical entries for Hungarian adjectives.

Verbs

The verb lemma in Hungarian is always the third person singular indefinite present tense. The AX NLG platform covers the most common inflection rules for verbs. Namely, most verbs will automatically inflect correctly after the user configures number, person, and tense in the verb container. However, you should manually add the verb to the lexicon if you find that a verb inflects inaccurately.

Hungarian verbs have definite and indefinite conjugation (e.g. látok "I see it" vs. látom "I see (something)"); the AX NLG platform offers the indefinite type of conjugation as default.

Container settings

Numerals

The noun will automatically agree with the numeral number when you use a numeral variable. There is no need to add additional branches for numeral. Four types of numerals are possible on the AX NLG platform: cardinal, cardinal as digit, ordinal, and ordinal as digit.

cardinalordinal
textKilenc egér a szobámban. (Nine mice in my room.)A kilencedik nap az iskolában. (The ninth day at school.)
digit9 egér a szobámban. (9 mice in my room.)A 9. nap az iskolában .(The 9th day at school.)

For Hungarian, both cardinal and ordinal numerals are written out until 100 on the platform, otherwise (above 100) the output is in digit form. Take cardinal numerals for example:

száz autó
(one hundred cars)
vs.
101 autó
(101 cars)

Determiner

Definite, indefinite, demonstrative, and possessive determiners are available for Hungarian on the AX NLG platform. These determiners will be automatically added when users configure them in the container. On the other hand, if you manually set the determiner switch for the specific noun in the lexicon, it will automatically switch to another determiner when you add the determiner to the container.

If the definite article "a"; is used in a container and it is followed by a vowel, it is automatically adapted to "az":

a nagy autó (the big car)
az iskola (the school)

Possessive determiners are put inside the noun between plural and case marker:

könyve-i[pl.]-m[poss.1st.sg]-et[acc.] (könyveimet = my books)

Case Switch

  • a lot of countries don't need a lexicon entry. So the default for determiners has to be "none" and only added for the (less common) ones that need a definite article.

In the sentence:

Ausztráliából Párizsba utaztak.
(They traveled from Australia to Paris.)

The container configuration needs the cases annotated: elative for Ausztrália and illative for Párizs to apply them correctly.

For some countries the case is not elative but sublative (the suffix changes from ból to ról). For example,

Máltáról Párizsba utaztak.
(They traveled from Malta to Paris.)

Another case switch happens from illative to delative when for example Máltá is in the second position in the sentence:

Párizsból Máltára utaztak.
(They traveled from Paris to Malta.)

Here the suffix ba changes to ra. Both case switches are saved in the lexicon for the countries that need these switches.

Determiner Switch

Determiners can be switched according to lexical information. If you set the determiner switch for a specific noun in the lexicon, it will automatically switch to another determiner when you add the determiner you intend to switch in the container. The container setting for Németországban in the first example is: determiner is unset (blank) and case="inessive".

Németországban
(in Germany)

In the second example, the container settings for Egyesült Arab Emírségek (United Arab Emirates) are still: determiner is unset (blank) and case="inessive". However, the determiner switches from none to definite, because it is configured in the lexicon entry for Egyesült Arab Emírségek. As the result, the definite determiner (i.e. az) is added to Egyesült Arab Emírségekben (case="inessive").

az Egyesült Arab Emírségekben
(in the United Arab Emirates)

Note

If the lexicon entry of a country includes a switch from none to definite, there is still a way to use the country without the article (e.g., just "United Arab Emirates"). The determiner will always remain none by setting determiner=none in the container. Only an unset determiner (blank) triggers the switch from none to another determiner.

Vowel harmony

The AX platform applies vowel harmonies automatically. Vowel harmony means that many inflection suffixes have different variants, that are used depending on what kind of vowels a word contains. Normally this choice is between front and back vowel variants. For example the 1st person singular present tense suffix for verbs has the variants -ok, -ek and -ük and therefore the choice is between back, short-front and long-front vowel words. Since the verb lát includes only a back vowel, the corresponding suffix is -ok and the inflected form is látok.

lát (to see) -> látok (I see)