Characteristics of the Polish language in AX Semantics
Fundamentals
Polish has three genders for nouns: masculine, feminine and neuter. Masculine nouns also differentiate between personal, animate and inanimate. Furthermore, there are two numbers: singular and plural.
Polish has seven cases for nouns: nominative, accusative, genitive, dative, instrumental, locative, and vocative.
grammatical category | values | examples |
---|---|---|
gender | masculine inanimate | stary port (the old port) |
gender | masculine animate | stary pies (the old dog) |
gender | masculine personal | stary człowiek (the old man) |
gender | feminine | stara kobieta (the old woman) |
gender | neuter | stare krzesło (the old chair) |
number | singular | czerwony dom (a red house) |
number | plural | dwa duże krzesła (two big chairs) |
case (noun) | nominative | pies (dog) |
case (noun) | accusative | Widzę psa. (I see the dog.) |
case (noun) | genitive | dzwonek psa (dog's bell) |
case (noun) | dative | Daję psu jego piłkę. (I give the ball to the dog.) |
case (noun) | instrumental | Niewidomy chodzi z psem. (The blind man walks with a dog.) |
case (noun) | locative | Kot siedzi na psie. (The cat sits on the dog.) |
case (noun) | vocative | To było dobre, psie! (This was good, dog!) |
verb tense | present | On pisze (He writes) |
verb tense | past | On pisał (He wrote) / Ona pisała (She wrote) |
verb tense | future | On będzie pisał (He will write) |
The standard order of a noun phrase in Polish is the following:
preposition + determiner + numeral + adjective + noun
See for example:
o tych trzech popularnych książkach
about these three popular books[pl,loc]
PREP DET NUM ADJ NOUN
"about these three popular books"
Lexicon
Nouns
For Polish nouns the lexicon needs to encode gender, animacy, preposition changes and case changes. If the lexicon entry is missing, the NLG platform tries to find the most probable gender based on heuristics. The case forms should be added to the lexicon if they are not regular.
Lexicon entries for nouns may also be needed to inflect determiners, adjectives, numerals and pronouns correctly. If a required lexicon entry is missing, these words are omitted from the output.
Examples
The basic lexicon entry for dom (house) contains:
Feature | Singular | Plural |
---|---|---|
Gender | masculine inanimate | masculine inanimate |
Nominative | dom | domy |
Genitive | domu | domów |
Dative | domowi | domom |
Accusative | dom | domy |
Instrumental | domem | domami |
Locative | domu | domach |
Vocative | domie | domy |
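As a rough illustration, the inflection table above can be pictured as a nested mapping from case and number to a word form. The dictionary layout and the `form` helper are hypothetical and only mirror the table; they do not describe how the platform stores lexicon entries.

```python
# Hypothetical in-memory picture of the lexicon entry for "dom" (house).
# The layout is an assumption for illustration, not the platform's format.
DOM = {
    "gender": "masculine inanimate",
    "forms": {
        "nominative": {"singular": "dom", "plural": "domy"},
        "genitive": {"singular": "domu", "plural": "domów"},
        "dative": {"singular": "domowi", "plural": "domom"},
        "accusative": {"singular": "dom", "plural": "domy"},
        "instrumental": {"singular": "domem", "plural": "domami"},
        "locative": {"singular": "domu", "plural": "domach"},
        "vocative": {"singular": "domie", "plural": "domy"},
    },
}


def form(entry, case, number):
    """Return the stored word form for the given case and number."""
    return entry["forms"][case][number]


print(form(DOM, "locative", "plural"))
```

Only irregular forms need to be entered on the platform; the full table is spelled out here to keep the sketch self-contained.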
Additionally, the lexicon entry for Seszele (Seychelles) contains:
- gender, case, and number (like the inflection table above)
- replace preposition do with na in accusative
Note
If you need lexicon entries for countries, contact support. You will receive them for Polish with automatic handling of determiners.
Adjectives
Polish adjectives agree with the noun in gender/animacy, number and case. For adjectives that inflect irregularly, lexicon entries may need to be added.
Verbs
The most common verbs are encoded in our software. If a verb is inflected incorrectly, add it to the lexicon.
Please note that the future tense can be formed by taking the future tense form of the verb być (to be) and putting the past tense form of the main verb in a separate container after it. See for example the verb pytać (to ask):
On będzie pytał
(He will [future] ask [past])
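The composition described above can be sketched as a lookup of the auxiliary followed by the already inflected past form. `compound_future` and its parameters are hypothetical names for illustration; on the platform the two parts sit in separate containers, and the past form still has to agree with the subject in gender and number.

```python
# Future-tense forms of "być" (to be), keyed by (person, number).
BYC_FUTURE = {
    (1, "singular"): "będę",
    (2, "singular"): "będziesz",
    (3, "singular"): "będzie",
    (1, "plural"): "będziemy",
    (2, "plural"): "będziecie",
    (3, "plural"): "będą",
}


def compound_future(person, number, past_form):
    """Join the auxiliary with an already inflected past-tense form."""
    return f"{BYC_FUTURE[(person, number)]} {past_form}"


print("On", compound_future(3, "singular", "pytał"))
print("Ona", compound_future(3, "singular", "pytała"))
```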
Container settings
Determiner
The AX NLG platform supports the following determiners in Polish: demonstrative, possessive, and quantifier (every). Polish determiners exist as independent modifiers, for example:
Twoja gazeta jest przestarzała.
(Your newspaper is outdated.)
Ja czytam tę gazetę.
(I’m reading this newspaper.)
Numerals
The noun automatically agrees with the numeral when a numeral variable is used. Four types of numerals are possible on the AX NLG platform: cardinal, cardinal as digit, ordinal, and ordinal as digit.
format | cardinal | ordinal |
---|---|---|
text | Dziewięć myszy w moim pokoju. (Nine mice in my room.) | Dziewiąty dzień w szkole. (The ninth day at school.) |
digit | 9 myszy w moim pokoju. (9 mice in my room.) | 9. dzień w szkole. (The 9th day at school.) |
For Polish, both cardinal and ordinal numerals are written out up to and including 100 on the platform. Above 100, the output is in digit form. Take cardinal numerals for example:
sto samochodów
(one hundred cars)
vs.
101 samochodów
(101 cars)
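The threshold can be sketched as a single comparison. `render_cardinal`, the `limit` parameter and the two-entry word list are assumptions made for this illustration; a real word list would cover every number up to 100.

```python
# Tiny hypothetical word list; a real one would cover every number up to 100.
CARDINAL_WORDS = {9: "dziewięć", 100: "sto"}


def render_cardinal(n, words=CARDINAL_WORDS, limit=100):
    """Spell out n up to the limit, otherwise return it as digits."""
    if n <= limit:
        # Raises KeyError for numbers missing from the demo word list.
        return words[n]
    return str(n)


print(render_cardinal(100), "samochodów")
print(render_cardinal(101), "samochodów")
```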
In Polish, the case and number of the noun and adjective depend on the numeral. This applies only if the case set for the phrase is nominative or accusative. For example:
Numeral | case / number | example |
---|---|---|
1 | Nominative/Singular | 1 czerwony dom (1 red house) |
2-4 | Nominative/Plural | 3 czerwone domy (3 red houses) |
>= 5 | Genitive/Plural | 7 czerwonych domów (7 red houses) |
Also note that a verb agreeing with a phrase that includes the numeral 1 or a numeral higher than 4 stands in the singular:
3 domy zostają
(3 houses stay)
25 domów zostaje / 1 dom zostaje
(25 houses stay / 1 house stays)
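The agreement rules from the table and the verb examples can be sketched as two small functions. The function names are hypothetical, and the sketch follows the table literally for the case where no case other than nominative or accusative is set; it ignores any finer distinctions the platform may make.

```python
def noun_agreement(n):
    """Case and number of noun and adjective for the numeral n."""
    if n < 1:
        raise ValueError("the table only covers numerals from 1 upwards")
    if n == 1:
        return ("nominative", "singular")
    if n <= 4:
        return ("nominative", "plural")
    return ("genitive", "plural")


def verb_number(n):
    """Number of the agreeing verb: plural only for the numerals 2 to 4."""
    return "plural" if 2 <= n <= 4 else "singular"


for n in (1, 3, 7, 25):
    print(n, noun_agreement(n), verb_number(n))
```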
Preposition
If users configure prepositions in the container, they are adapted automatically when phonetic assimilation is required. The prepositions w and z change to we and ze before words starting with certain consonant clusters. For example:
we Wrocławiu
(in Wrocław)
w Austrii
(in Austria)
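A deliberately simplified heuristic for this adaptation is sketched below. The trigger sets and the `adapt_preposition` helper are assumptions for illustration, not the platform's actual rule: the longer form is chosen when the next word starts with a similar-sounding consonant that is directly followed by another consonant. Digraphs such as sz and fixed expressions are not handled.

```python
VOWELS = set("aąeęioóuy")

# Hypothetical, simplified triggers: first letters that call for the longer
# form when they are directly followed by another consonant.
TRIGGERS = {"w": set("wf"), "z": set("szśźż")}


def adapt_preposition(preposition, next_word):
    """Return 'we'/'ze' instead of 'w'/'z' before a matching cluster."""
    triggers = TRIGGERS.get(preposition)
    word = next_word.lower()
    if triggers is None or len(word) < 2:
        return preposition
    starts_cluster = word[0] in triggers and word[1] not in VOWELS
    return preposition + "e" if starts_cluster else preposition


print(adapt_preposition("w", "Wrocławiu"), "Wrocławiu")
print(adapt_preposition("w", "Austrii"), "Austrii")
```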
Preposition switch
In the sentence
Jechali z Paryża do Australii.
(They were traveling from Paris to Australia.)
no article is added to Australii and the preposition do is not changed.
On our platform, the settings for the container Australii are: preposition="do" and case="genitive".
For the same sentence with another country, e.g. Seszele (Seychelles), the platform settings are the same, but the lexical information changes the result:
Jechali z Paryża na Seszele.
(They were traveling from Paris to the Seychelles.)
Comparing the two examples above, the preposition is changed from do to na and the case is switched to accusative.
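The interaction between container settings and lexical information can be sketched as follows. The `LEXICON` layout, the `replace_preposition` key and the `realize` helper are hypothetical names chosen for this example; they only imitate the behavior described above.

```python
# Hypothetical lexicon: only "Seszele" carries a preposition replacement.
LEXICON = {
    "Australia": {"genitive": "Australii"},
    "Seszele": {
        "accusative": "Seszele",
        "replace_preposition": {"do": ("na", "accusative")},
    },
}


def realize(noun, preposition, case):
    """Apply the lexicon's preposition replacement, if any, then inflect."""
    entry = LEXICON[noun]
    preposition, case = entry.get("replace_preposition", {}).get(
        preposition, (preposition, case)
    )
    return f"{preposition} {entry[case]}"


# Both containers use the same settings: preposition="do", case="genitive".
print("Jechali z Paryża", realize("Australia", "do", "genitive") + ".")
print("Jechali z Paryża", realize("Seszele", "do", "genitive") + ".")
```

The container settings stay identical in both calls; only the lexicon entry decides whether the preposition and case are swapped.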