Characteristics of the Czech language in AX Semantics

Fundamentals

In Czech, the gender of a noun, together with its number and case, determines the form of the accompanying adjectives, determiners, numerals, and pronouns.

Czech has three genders for nouns: masculine (animate and inanimate), feminine and neuter. There are three numbers: singular, dual, and plural. Additionally, Czech has seven cases for nouns.

grammatical name    values               examples
gender              masculine inanimate  starý dům (an old house)
                    masculine animate    starý doktor (an old doctor)
                    feminine             stará žena (an old woman)
                    neuter               staré auto (an old car)
number              singular             oko (eye)
                    plural               oka (eyes)
                    dual                 oči (eyes)
cases (noun)        nominative           pes (dog)
                    genitive             zvonek psa (dog's bell)
                    dative               Dám míček psu. (I give the ball to the dog.)
                    accusative           Vidím psa. (I see the dog.)
                    instrumental         Slepec chodí se psem. (The blind man walks with a dog.)
                    locative             Pták přistane na psu. (The bird lands on the dog.)
                    vocative             To bylo dobré, pse. (This was good, dog.)
adjectives (noun)   before noun          červené jablko (red apple)
verb tenses         present              on čeká (he waits)
                    future               on bude čekat (he will wait)
                    past                 on čekal (he waited)
                    passive participle   čekán

The standard order of a noun phrase in Czech is the following:

preposition + determiner + numeral + adjective + noun

See for example:

o      těchto  třech  populárních  knihách
about  these   three  popular      books[pl,loc]
PREP   DET     NUM    ADJ          NOUN
"about these three popular books"

Lexicon

Nouns

Czech nouns are inflected for number and case. Nouns should be added to the lexicon with their grammatical gender if they do not inflect regularly. If the lexicon entry is missing, the NLG platform tries to find the most probable gender based on heuristics.

Lexicon entries for nouns may also be necessary for inflecting determiners, adjectives, and pronouns correctly. If a lexicon entry is required but missing, these accompanying words are omitted from the output.

Examples

The basic lexicon entry for ruka (hand) contains:

  • gender: feminine
  • inflection table for case and number:
              Singular  Plural   Dual
Nominative    ruka      ruce     ruce
Genitive      ruky      ruk      rukou
Dative        ruce      rukám    rukám
Accusative    ruku      ruce     ruce
Instrumental  rukou     rukami   rukama
Locative      ruce      rukách   rukou
Vocative      ruko      ruce     ruce
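
To make the shape of such an entry concrete, here is a minimal sketch in Python. The dictionary layout and the inflect helper are hypothetical illustrations and do not reflect the platform's actual storage format; only the gender and the word forms come from the table above.

```python
# Hypothetical in-memory shape for a noun lexicon entry.
# The structure is an assumption for illustration; the forms are from the table above.
RUKA = {
    "lemma": "ruka",
    "gender": "feminine",
    "forms": {
        "nominative":   {"singular": "ruka",  "plural": "ruce",   "dual": "ruce"},
        "genitive":     {"singular": "ruky",  "plural": "ruk",    "dual": "rukou"},
        "dative":       {"singular": "ruce",  "plural": "rukám",  "dual": "rukám"},
        "accusative":   {"singular": "ruku",  "plural": "ruce",   "dual": "ruce"},
        "instrumental": {"singular": "rukou", "plural": "rukami", "dual": "rukama"},
        "locative":     {"singular": "ruce",  "plural": "rukách", "dual": "rukou"},
        "vocative":     {"singular": "ruko",  "plural": "ruce",   "dual": "ruce"},
    },
}


def inflect(entry: dict, case: str, number: str) -> str:
    """Look up the stored form for a case and number (hypothetical helper)."""
    return entry["forms"][case][number]


print(inflect(RUKA, "instrumental", "dual"))
print(inflect(RUKA, "genitive", "plural"))
```

Storing every cell explicitly, rather than deriving forms from a stem, is what lets an irregular noun like ruka override the regular inflection.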

Note

If you need lexicon entries for countries, contact support. You will receive Czech entries that include automatic handling of prepositions.

Adjectives

In the lexicon, the inflection table encodes gender, case, and number. For adjective position, the default is "before noun".

Verbs

Czech verbs inflect for person, number, tense, and in some cases gender. The most common verbs are encoded in our software. If a verb inflects incorrectly, you should add it to the lexicon.

Please note that the future tense can be formed in two ways. For imperfective verbs it is formed by taking the future tense conjugation of the verb být (to be) and the infinitive (only the conjugated form needs to be in a container). For perfective verbs, the present form expresses the future.

budu dělat
(I will be doing)

udělám
(I will do, I will have done)
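
The two ways of forming the future can be sketched as follows. This is only an illustration: the verb dictionaries and the future function are hypothetical names, and the perfective entry lists a single present form for brevity.

```python
# Future tense forms of být (to be), keyed by (person, number).
BYT_FUTURE = {
    (1, "sg"): "budu", (2, "sg"): "budeš", (3, "sg"): "bude",
    (1, "pl"): "budeme", (2, "pl"): "budete", (3, "pl"): "budou",
}

# Hypothetical verb records; only the fields needed for this sketch.
DELAT = {"infinitive": "dělat", "aspect": "imperfective"}
UDELAT = {
    "infinitive": "udělat",
    "aspect": "perfective",
    "present": {(1, "sg"): "udělám"},  # excerpt, not a full paradigm
}


def future(verb: dict, person: int, number: str) -> str:
    """Return the future form of a verb for the given person and number."""
    if verb["aspect"] == "perfective":
        # Perfective verbs: the present form already expresses the future.
        return verb["present"][(person, number)]
    # Imperfective verbs: future of být plus the infinitive.
    return f"{BYT_FUTURE[(person, number)]} {verb['infinitive']}"


print(future(DELAT, 1, "sg"))
print(future(UDELAT, 1, "sg"))
```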

The past tense is formed from the past participle (past tense setting) of the verb and the present tense form of the verb být. The form of být is omitted in the 3rd person.

dělal jsem
(I [male] did)

dělal / dělala / dělalo
(he did / she did / it did)
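
As a rough sketch of this rule, the following Python snippet combines an already gender-inflected participle with the present tense of být and drops the auxiliary in the 3rd person. The function name past_tense is an assumption made for this example.

```python
# Present tense forms of být used as the past-tense auxiliary.
# The 3rd person is deliberately absent because the auxiliary is omitted there.
BYT_PRESENT = {
    (1, "sg"): "jsem", (2, "sg"): "jsi",
    (1, "pl"): "jsme", (2, "pl"): "jste",
}


def past_tense(participle: str, person: int, number: str) -> str:
    """Build the past tense from a participle that already agrees in gender and number."""
    if person == 3:
        return participle
    return f"{participle} {BYT_PRESENT[(person, number)]}"


print(past_tense("dělal", 1, "sg"))
print(past_tense("dělala", 3, "sg"))
```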

Container settings

Determiner

The AX NLG platform supports the following determiners for Czech: demonstrative, proximal, distal, and possessive.

Numerals

When a numeral variable is used, the noun automatically agrees with the numeral. Four types of numerals are possible on the AX NLG platform: cardinal, cardinal as digit, ordinal, and ordinal as digit.

        cardinal               ordinal
text    devět dní (nine days)  devátý den (the ninth day)
digit   9 dní (9 days)         9. den (the 9th day)

For Czech, both cardinal and ordinal numerals up to 30 are written out on the platform. The outputs of other numerals are in digit form. Take cardinal numbers (written out vs. digit) for example:

třicet automobilů
(thirty cars)
vs.
31 automobilů
(31 cars)
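
The threshold can be pictured with a small sketch. The names below are hypothetical, the word list is only an excerpt, and falling back to digits for a word missing from the excerpt is a choice made for this example, not documented platform behavior.

```python
# Numerals up to this value are written out, per the text above.
WRITTEN_OUT_LIMIT = 30

# Excerpt of cardinal words; a real table would cover every value up to the limit.
CARDINAL_WORDS = {9: "devět", 30: "třicet"}


def render_cardinal(n: int) -> str:
    """Return the written-out cardinal up to the limit, otherwise the digit form."""
    if n <= WRITTEN_OUT_LIMIT and n in CARDINAL_WORDS:
        return CARDINAL_WORDS[n]
    return str(n)


print(render_cardinal(30), "automobilů")
print(render_cardinal(31), "automobilů")
```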

In Czech, case and number for noun/adjective change based on numerals (if no other case than nominative or accusative is set), for example:

Numeral  Case / number        Example
1        Nominative/Singular  1 červený dům (1 red house)
2-4      Nominative/Plural    3 červené domy (3 red houses)
>= 5     Genitive/Plural      7 červených domů (7 red houses)

Also note that a verb agreeing with a phrase whose numeral is 1 or higher than 4 stands in the singular, while numerals from 2 to 4 take a plural verb:

3 domy zůstávají
(3 houses stay)

23 domů zůstává / 1 dům zůstává
(23 houses stay / 1 house stays)
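
The numeral rules for nouns and verbs can be summarized in a short sketch. The function names and the tiny form tables are assumptions for illustration. The sketch covers only positive whole numbers with no explicitly set case, as in the examples above.

```python
def noun_agreement(n: int) -> tuple:
    """Case and number of the noun after the numeral n (no other case set)."""
    if n < 1:
        raise ValueError("this sketch only covers numerals from 1 upward")
    if n == 1:
        return ("nominative", "singular")
    if n <= 4:
        return ("nominative", "plural")
    return ("genitive", "plural")


def verb_number(n: int) -> str:
    """Number of the verb agreeing with a phrase that contains the numeral n."""
    if n < 1:
        raise ValueError("this sketch only covers numerals from 1 upward")
    return "plural" if 2 <= n <= 4 else "singular"


# Forms taken from the examples in the text.
DUM = {
    ("nominative", "singular"): "dům",
    ("nominative", "plural"): "domy",
    ("genitive", "plural"): "domů",
}
ZUSTAVAT = {"singular": "zůstává", "plural": "zůstávají"}


def sentence(n: int) -> str:
    """Combine numeral, noun, and verb using the agreement rules above."""
    return f"{n} {DUM[noun_agreement(n)]} {ZUSTAVAT[verb_number(n)]}"


for count in (1, 3, 23):
    print(sentence(count))
```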

Number

Czech's basic numbers are singular and plural, but some remnants of the dual survive. The dual is retained only for nouns denoting paired body parts such as eyes, legs, and ears. For example, when "noha" (leg) refers to the body part, the dual form is used, but when it refers to the leg of a chair or table, the regular plural is used:

barva nohou
(the color of the legs [anatomical])
barva noh
(the color of the legs [table/chair])
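
One way to picture this choice is a lookup that selects the dual form only when the noun is used in its anatomical sense. The anatomical flag and the function name are hypothetical. How the sense is signaled on the platform is not described here.

```python
# Genitive forms of "noha" from the example above, keyed by grammatical number.
GENITIVE_FORMS = {
    "noha": {"plural": "noh", "dual": "nohou"},
}


def genitive_of_many(lemma: str, anatomical: bool) -> str:
    """Pick the dual for paired body parts, otherwise the regular plural."""
    forms = GENITIVE_FORMS[lemma]
    number = "dual" if anatomical and "dual" in forms else "plural"
    return forms[number]


print("barva", genitive_of_many("noha", anatomical=True))
print("barva", genitive_of_many("noha", anatomical=False))
```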

Additionally, a few of these nouns switch gender when the number changes: neuter in the singular and feminine in the dual. This is very rare, so please contact support if you encounter such a case and need to define different genders per number to inflect associated adjectives and verbs correctly. For example:

Number    Gender    Example
Singular  Neuter    modré oko (a blue eye)
Plural    Neuter    mastná oka (greasy eyes/grease drops [in the soup])
Dual      Feminine  modré oči (blue eyes [anatomical])

Prepositions

If you configure a preposition in the container, it is adapted automatically when phonetic assimilation is required. In both examples below, the container setting is preposition="v". In the second example, however, the preposition v changes to ve because the following word begins with the same consonant, a similar consonant, or a consonant cluster.

v Německu
(in Germany)

ve Slovinsku
(in Slovenia)
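
The following sketch approximates this adaptation with a deliberately simplified heuristic. It is an assumption for illustration, not the platform's rule: it vocalizes v before v, f, or w, and before a cluster that starts with s, z, š, or ž. Real Czech usage has further cases that this does not cover.

```python
VOWELS = set("aáeéěiíoóuúůyý")


def adapt_v(next_word: str) -> str:
    """Return 'v' or 've' for the following word (simplified, assumed heuristic)."""
    word = next_word.lower()
    if not word:
        return "v"
    first = word[0]
    second = word[1] if len(word) > 1 else ""
    # A cluster here means two consonant letters at the start of the word.
    starts_with_cluster = (
        first not in VOWELS and second != "" and second not in VOWELS
    )
    if first in ("v", "f", "w"):
        return "ve"  # same or similar consonant
    if first in ("s", "z", "š", "ž") and starts_with_cluster:
        return "ve"  # sibilant-initial cluster
    return "v"


for place in ("Německu", "Slovinsku"):
    print(adapt_v(place), place)
```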

Preposition switch

On the AX NLG platform, the settings for the container Německo (Germany) are: preposition="v" and case="loc".

v Německu
(in Germany)

For a different place, such as Seychely (the Seychelles), the container settings are the same as above (preposition="v", case="loc"), but the lexical information for the noun switches the preposition from v to na:

na Seychelách
(in the Seychelles)
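
A minimal sketch of such a switch is a per-noun override that replaces the configured preposition. The table names and the locative_phrase function are hypothetical. The platform's lexicon format may look different.

```python
# Hypothetical lexical information: per-noun preposition overrides and locative forms.
PREPOSITION_OVERRIDES = {
    "Seychely": {"v": "na"},
}
LOCATIVE_FORMS = {
    "Německo": "Německu",
    "Seychely": "Seychelách",
}


def locative_phrase(lemma: str, preposition: str = "v") -> str:
    """Apply the noun's preposition override, if any, then add the locative form."""
    overrides = PREPOSITION_OVERRIDES.get(lemma, {})
    effective = overrides.get(preposition, preposition)
    return f"{effective} {LOCATIVE_FORMS[lemma]}"


print(locative_phrase("Německo"))
print(locative_phrase("Seychely"))
```

Keeping the override with the noun means the container settings can stay identical for every place name.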