Characteristics of the Czech language in AX Semantics
Fundamentals
In Czech, you need to know the gender of a noun to form (together with number and case) the accompanying adjectives, determiners, numerals, and pronouns correctly.
Czech has three genders for nouns: masculine (animate and inanimate), feminine and neuter. There are three numbers: singular, dual, and plural. Additionally, Czech has seven cases for nouns.
grammatical name | values | examples |
---|---|---|
gender | masculine inanimate | starý dům (an old house) |
gender | masculine animate | starý doktor (an old doctor) |
gender | feminine | stará žena (an old woman) |
gender | neuter | staré auto (an old car) |
number | singular | oko (eye) |
number | plural | oka (eyes) |
number | dual | oči (eyes) |
cases (noun) | nominative | pes (dog) |
cases (noun) | genitive | zvonek psa (dog's bell) |
cases (noun) | dative | Dám míček psu. (I give the ball to the dog.) |
cases (noun) | accusative | Vidím psa. (I see the dog.) |
cases (noun) | instrumental | Slepec chodí se psem. (The blind man walks with a dog.) |
cases (noun) | locative | Pták přistane na psu. (The bird lands on the dog.) |
cases (noun) | vocative | To bylo dobré, pse. (This was good, dog.) |
adjectives (noun) | before noun | červené jablko (red apple) |
verb tenses | present | on čeká (he waits) |
verb tenses | future | on bude čekat (he will wait) |
verb tenses | past | on čekal (he waited) |
verb tenses | passive participle | čekán |
The standard order of a noun phrase in Czech is the following:
preposition + determiner + numeral + adjective + noun
See for example:
o těchto třech populárních knihách
about these three popular books[pl,loc]
PREP DET NUM ADJ NOUN
"about these three popular books"
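The slot order above can be sketched in Python. This is only an illustration: `SLOT_ORDER` and `build_noun_phrase` are hypothetical names, not part of the AX NLG platform, and the words are assumed to be passed in already inflected.

```python
# Hypothetical helper: arranges already-inflected words in the standard
# Czech noun phrase order described above.
SLOT_ORDER = ["preposition", "determiner", "numeral", "adjective", "noun"]


def build_noun_phrase(parts):
    """Join the filled slots in standard order, skipping empty ones."""
    return " ".join(parts[slot] for slot in SLOT_ORDER if parts.get(slot))


phrase = build_noun_phrase({
    "noun": "knihách",
    "adjective": "populárních",
    "numeral": "třech",
    "determiner": "těchto",
    "preposition": "o",
})
print(phrase)
```

Slots that are not filled are simply left out, so the same helper also covers shorter phrases such as an adjective plus a noun.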
Lexicon
Nouns
Czech nouns are inflected for number and case. Nouns should be added to the lexicon with their grammatical gender if they do not inflect regularly. If the lexicon entry is missing, the NLG platform tries to find the most probable gender based on heuristics.
Lexicon entries for nouns may also be necessary for inflecting determiners, adjectives and pronouns correctly. If a lexicon entry is required but missing, these accompanying words are omitted.
Examples
The basic lexicon entry for ruka (hand) contains:
- gender: feminine
- inflection table for case and number:
Case | Singular | Plural | Dual |
---|---|---|---|
Nominative | ruka | ruce | ruce |
Genitive | ruky | ruk | rukou |
Dative | ruce | rukám | rukám |
Accusative | ruku | ruce | ruce |
Instrumental | rukou | rukami | rukama |
Locative | ruce | rukách | rukou |
Vocative | ruko | ruce | ruce |
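As a rough illustration, the entry above can be pictured as a nested mapping from case and number to a word form. The dictionary layout and the `inflect` helper are assumptions made for this sketch; the platform's actual lexicon format is not described here.

```python
# Hypothetical in-memory shape of the lexicon entry for "ruka" (hand).
# The forms are copied from the inflection table above.
RUKA = {
    "lemma": "ruka",
    "gender": "feminine",
    "forms": {
        "nominative":   {"singular": "ruka",  "plural": "ruce",   "dual": "ruce"},
        "genitive":     {"singular": "ruky",  "plural": "ruk",    "dual": "rukou"},
        "dative":       {"singular": "ruce",  "plural": "rukám",  "dual": "rukám"},
        "accusative":   {"singular": "ruku",  "plural": "ruce",   "dual": "ruce"},
        "instrumental": {"singular": "rukou", "plural": "rukami", "dual": "rukama"},
        "locative":     {"singular": "ruce",  "plural": "rukách", "dual": "rukou"},
        "vocative":     {"singular": "ruko",  "plural": "ruce",   "dual": "ruce"},
    },
}


def inflect(entry, case, number):
    """Look up the form for one case/number combination."""
    return entry["forms"][case][number]


print(inflect(RUKA, "instrumental", "dual"))
```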
Note
If you need lexicon entries for countries, contact support. You will receive Czech entries that include automatic handling of prepositions.
Adjectives
In the lexicon, the inflection table encodes gender, case, and number. For adjective position, the default is "before noun".
Verbs
Czech verbs inflect for person, number, tense, and in some cases gender. The most common verbs are encoded in our software. If a verb inflects incorrectly, you should add it to the lexicon.
Please note that the future tense can be formed in two ways. For imperfective verbs, it combines the future tense form of the verb být (to be) with the infinitive (only the conjugated form needs to be in a container). For perfective verbs, the present form expresses the future.
budu dělat
(I will be doing)
udělám
(I will do, I will have done)
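The two ways of forming the future can be sketched as follows. The function name and its parameters are hypothetical and only illustrate the rule; the verb's aspect is assumed to be known by the caller.

```python
# Future-tense forms of "být" (to be), keyed by (person, number).
BYT_FUTURE = {
    (1, "sg"): "budu", (2, "sg"): "budeš", (3, "sg"): "bude",
    (1, "pl"): "budeme", (2, "pl"): "budete", (3, "pl"): "budou",
}


def future_tense(aspect, form, person, number):
    """Build a future-tense verb phrase.

    For imperfective verbs, `form` is the infinitive and is combined with
    the future form of "být". For perfective verbs, `form` is the already
    conjugated present form, which expresses the future on its own.
    """
    if aspect == "imperfective":
        return f"{BYT_FUTURE[(person, number)]} {form}"
    if aspect == "perfective":
        return form
    raise ValueError(f"unknown aspect: {aspect}")


print(future_tense("imperfective", "dělat", 1, "sg"))
print(future_tense("perfective", "udělám", 1, "sg"))
```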
The past tense is formed by the past participle (past tense setting) of the verb and the present tense form of the verb být, which is omitted in the 3rd person.
dělal jsem
(I [male] did)
dělal / dělala / dělalo
(he did / she did / it did)
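A minimal sketch of the past-tense rule, assuming the caller supplies a participle that already agrees in gender and number. The `past_tense` helper is a hypothetical name used only for illustration.

```python
# Present-tense forms of "být" used as the past-tense auxiliary.
# The 3rd person is deliberately absent because the auxiliary is omitted there.
BYT_AUXILIARY = {
    (1, "sg"): "jsem", (2, "sg"): "jsi",
    (1, "pl"): "jsme", (2, "pl"): "jste",
}


def past_tense(participle, person, number):
    """Combine the past participle with the auxiliary, if one is needed."""
    auxiliary = BYT_AUXILIARY.get((person, number))
    return f"{participle} {auxiliary}" if auxiliary else participle


print(past_tense("dělal", 1, "sg"))
print(past_tense("dělala", 3, "sg"))
```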
Container settings
Determiner
The AX NLG platform supports the following determiners for Czech: demonstrative, proximal, distal, and possessive.
Numerals
The noun automatically agrees with the numeral when a numeral variable is used. Four types of numerals are possible on the AX NLG platform: cardinal, cardinal as digit, ordinal, and ordinal as digit.
form | cardinal | ordinal |
---|---|---|
text | devět dní (nine days) | devátý den (the ninth day) |
digit | 9 dní (9 days) | 9. den (the 9th day) |
For Czech, both cardinal and ordinal numerals up to 30 are written out on the platform. The outputs of other numerals are in digit form. Take cardinal numbers (written out vs. digit) for example:
třicet automobilů
(thirty cars)
vs.
31 automobilů
(31 cars)
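The threshold behavior can be sketched as below. This is an illustration under stated assumptions: `render_cardinal` is a hypothetical helper, the word table contains only the two numerals quoted in this section, and falling back to digits for numbers missing from the table is a choice made for this sketch, not documented platform behavior.

```python
# Numerals up to this limit are written out; larger ones appear as digits.
WRITTEN_OUT_LIMIT = 30

# Illustrative subset only: the words that appear in this section.
CARDINAL_WORDS = {9: "devět", 30: "třicet"}


def render_cardinal(n, noun):
    """Render a cardinal numeral plus an already-inflected noun."""
    if n <= WRITTEN_OUT_LIMIT and n in CARDINAL_WORDS:
        return f"{CARDINAL_WORDS[n]} {noun}"
    # Above the limit (or missing from the sample table): use digits.
    return f"{n} {noun}"


print(render_cardinal(30, "automobilů"))
print(render_cardinal(31, "automobilů"))
```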
In Czech, case and number for noun/adjective change based on numerals (if no other case than nominative or accusative is set), for example:
Numeral | case / number | example |
---|---|---|
1 | Nominative/Singular | 1 červený dům (1 red house) |
2-4 | Nominative/Plural | 3 červené domy (3 red houses) |
>= 5 | Genitive/Plural | 7 červených domů (7 red houses) |
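The table above translates directly into a small selection function. The name `noun_agreement` is hypothetical, and the sketch only covers the case where no case other than nominative or accusative is set.

```python
def noun_agreement(n):
    """Return (case, number) for a noun counted by the numeral n,
    following the table above."""
    if n < 1:
        raise ValueError("the table only covers numerals from 1 upwards")
    if n == 1:
        return ("nominative", "singular")
    if n <= 4:
        return ("nominative", "plural")
    return ("genitive", "plural")


for n in (1, 3, 7):
    print(n, noun_agreement(n))
```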
Also note that a verb agreeing with a phrase that includes the numeral 1 or a numeral higher than 4 is in the singular:
3 domy zůstávají
(3 houses stay)
23 domů zůstává / 1 dům zůstává
(23 houses stay / 1 house stays)
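The verb agreement rule can be sketched in the same way. `verb_number` and the small form table are hypothetical and use only the verb forms from the examples above.

```python
def verb_number(n):
    """Return the grammatical number of a verb agreeing with a numeral phrase:
    plural for 2-4, singular for 1 and for numerals higher than 4."""
    if n < 1:
        raise ValueError("the rule only covers numerals from 1 upwards")
    return "plural" if 2 <= n <= 4 else "singular"


# Forms of "zůstávat" (to stay) taken from the examples above.
ZUSTAVAT = {"singular": "zůstává", "plural": "zůstávají"}

for n in (1, 3, 23):
    print(n, ZUSTAVAT[verb_number(n)])
```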
With plurale tantum nouns (which only exist in the plural) the collective numeral forms are used. See an example below:
dvoje dveře
(2 (sets of) doors)
vs.
dvě auta
(2 cars)
Number
Czech's basic numbers are singular and plural, but some remnants of the dual number remain. The dual survives only for nouns representing paired body parts such as eyes, legs, and ears. For example, when "noha" (leg) refers to the part of the body, the dual form is used, but when it refers to a leg of a chair or table, the regular plural is used:
barva nohou
(the color of the legs [anatomical])
barva noh
(the color of the legs [table/chair])
Additionally, a few of these nouns switch gender when the number changes: neuter in the singular and feminine in the dual. This is very rare, so please contact support if you encounter such a noun and need to define different genders per number to inflect associated adjectives and verbs correctly. For example:
Number | Gender | example |
---|---|---|
Singular | Neuter | modré oko (a blue eye) |
Plural | Neuter | mastná oka (greasy eyes/grease drops [in the soup]) |
Dual | Feminine | modré oči (blue eyes [anatomical]) |
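To make the effect on agreement concrete, here is a minimal sketch in which the gender is looked up per number before the adjective ending is chosen. All names are hypothetical, and the ending table is limited to the three nominative combinations shown in the table above.

```python
# Gender of "oko" (eye) per number, as in the table above.
OKO_GENDER = {"singular": "neuter", "plural": "neuter", "dual": "feminine"}

# Nominative noun forms from this page.
OKO_NOMINATIVE = {"singular": "oko", "plural": "oka", "dual": "oči"}

# Nominative adjective endings, restricted to the combinations shown above.
ADJECTIVE_ENDING = {
    ("neuter", "singular"): "é",
    ("neuter", "plural"): "á",
    ("feminine", "dual"): "é",
}


def adjective_with_oko(stem, number):
    """Pick the gender for the given number, then inflect the adjective."""
    gender = OKO_GENDER[number]
    ending = ADJECTIVE_ENDING[(gender, number)]
    return f"{stem}{ending} {OKO_NOMINATIVE[number]}"


print(adjective_with_oko("modr", "singular"))
print(adjective_with_oko("mastn", "plural"))
print(adjective_with_oko("modr", "dual"))
```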
Prepositions
If users configure a preposition in the container, it is adapted automatically when phonetic assimilation is required. In both examples below, the container setting is preposition="v". In the second example, however, the preposition v changes to ve because the following word starts with the same or a similar consonant, or with several consonants.
v Německu
(in Germany)
ve Slovinsku
(in Slovenia)
Preposition switch
On the AX NLG platform, the settings for the container Německo (Germany) are preposition="v" and case="loc".
v Německu
(in Germany)
For the sentence below with a different place, Seychely (the Seychelles), the platform settings are the same as above (preposition="v", case="loc"), but the lexical information switches the preposition from v to na:
na Seychelách
(in the Seychelles)
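The switch can be pictured as a lexicon lookup that overrides the configured preposition. The data layout and the `locative_phrase` helper are assumptions made for this sketch and do not reflect the platform's internal implementation.

```python
# Hypothetical lexicon data: locative forms, plus per-noun overrides that
# replace the configured preposition with another one.
LOCATIVE_FORMS = {"Německo": "Německu", "Seychely": "Seychelách"}
PREPOSITION_OVERRIDES = {"Seychely": {"v": "na"}}


def locative_phrase(lemma, preposition="v"):
    """Apply a lexicon override to the configured preposition, if any,
    and combine it with the locative form of the noun."""
    overrides = PREPOSITION_OVERRIDES.get(lemma, {})
    effective = overrides.get(preposition, preposition)
    return f"{effective} {LOCATIVE_FORMS[lemma]}"


print(locative_phrase("Německo"))
print(locative_phrase("Seychely"))
```

Keeping the override in the lexical data means the container settings stay identical for every place name, which matches the behavior described above.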