Characteristics of the Catalan language in AX Semantics
Fundamentals
In Catalan, you need to know the number and gender of a noun to form the accompanying adjectives, determiners, numerals, and pronouns correctly.
Catalan has two genders for nouns: masculine and feminine. There are also two numbers: singular and plural. Additionally, Catalan has only one case for nouns, but three cases for pronouns.
grammatical name | values | examples |
---|---|---|
number | singular | un cotxe vell (one old car) |
number | plural | cinc cotxes vells (five old cars) |
gender | masculine | metge vell (old doctor) |
gender | feminine | dona vella (old woman) |
case (noun) | nominative | el gos (the dog) |
case (pronoun) | nominative | Mario construeix una casa. Ell construeix una casa. (Mario builds a house. He builds a house.) |
case (pronoun) | accusative | Mario construeix una casa. Mario la construeix. (Mario builds a house. Mario builds it.) |
case (pronoun) | dative | La Maria regala un llibre a Luigi. La Maria li regala un llibre. (Maria gives Luigi a book. Maria gives him a book.) |
adjective position | after noun | una poma vermella (a red apple) |
adjective position | before noun | una bona poma (a good apple) |
verb tenses | present | ell canta (he sings) |
verb tenses | past (imperfect) | ell cantava (he sang) |
verb tenses | future | ell cantarà (he will sing) |
The standard order of a noun phrase in Catalan is the following:
preposition + determiner + numeral + noun + adjective
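The slot order above can be sketched in Python as follows. This is only an illustration: `SLOT_ORDER` and `build_noun_phrase` are hypothetical names, not part of the AX NLG platform, and the sketch ignores adjectives whose lexicon position is "before noun".

```python
# Hypothetical slot order copied from the pattern above; not a platform API.
SLOT_ORDER = ("preposition", "determiner", "numeral", "noun", "adjective")


def build_noun_phrase(**slots):
    """Join the filled slots in the standard Catalan noun-phrase order."""
    unknown = set(slots) - set(SLOT_ORDER)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    # Empty or missing slots are skipped; the rest keep the fixed order.
    return " ".join(slots[name] for name in SLOT_ORDER if slots.get(name))


# The keyword order does not matter; the output order is fixed.
phrase = build_noun_phrase(
    adjective="populars",
    noun="cantants",
    numeral="tres",
    determiner="aquests",
    preposition="sobre",
)
print(phrase)
```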
See for example:
sobre aquests tres cantants populars
about these three singers popular
PREP DET NUM NOUN ADJ
"about these three popular singers"
Lexicon
Nouns
Catalan nouns are inflected for number. When the lexicon entry is missing, the NLG platform will try to find the most probable gender based on heuristics. However, nouns should be added to the lexicon with their grammatical gender if they do not inflect regularly.
Lexicon entries for nouns may also be necessary for inflecting determiners, adjectives, numerals, and pronouns correctly. If a required lexicon entry is missing, these words are omitted from the output.
Examples
The basic lexicon entry for pare (father) contains:
- gender: masculine
- inflection table for case and number:
Case | Singular | Plural |
---|---|---|
Nominative | pare | pares |
Adjectives
In the lexicon, the inflection table encodes number and gender. For adjective position, the default is "after noun".
Verbs
Catalan verbs inflect for person, number, and tense. The most common verbs are encoded in our software. If a verb inflects incorrectly, you should add it to the lexicon.
Container settings
Determiner
The AX NLG platform supports the following determiners for Catalan: definite, indefinite, demonstratives (proximal & distal), and possessives.
Numerals
Nouns automatically agree in number with the numeral when a numeral variable is used. Four types of numerals are possible on the AX NLG platform: cardinal, cardinal as digit, ordinal, and ordinal as digit.
Format | cardinal | ordinal |
---|---|---|
text | nou dies (nine days) | el novè dia (the ninth day) |
digit | 9 dies (9 days) | el 9è dia (the 9th day) |
For Catalan, both cardinal and ordinal numerals are written out up to and including 20 on the platform; above 20, the output is in digit form. For example:
vint cotxes
(twenty cars)
vs.
21 cotxes
(21 cars)
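The threshold rule for cardinals can be sketched in Python as follows. The function names are hypothetical and do not reflect the platform's implementation. The sketch covers masculine cardinal forms only (feminine forms such as "una" and "dues" are left out), and it assumes that values below 1 are also rendered as digits, which the text above does not specify.

```python
# Masculine cardinal forms for 1-20 in standard Catalan.
CARDINALS = [
    "un", "dos", "tres", "quatre", "cinc", "sis", "set", "vuit", "nou", "deu",
    "onze", "dotze", "tretze", "catorze", "quinze", "setze", "disset",
    "divuit", "dinou", "vint",
]
SPELL_OUT_LIMIT = 20  # numerals up to and including 20 are written out


def render_cardinal(n: int) -> str:
    """Return the written-out form for 1..20, otherwise the digits."""
    if 1 <= n <= SPELL_OUT_LIMIT:
        return CARDINALS[n - 1]
    # Assumption: everything outside 1..20 falls back to digit form.
    return str(n)


def render_count(n: int, singular: str, plural: str) -> str:
    """Combine the numeral with a noun that agrees with it in number."""
    noun = singular if n == 1 else plural
    return f"{render_cardinal(n)} {noun}"


print(render_count(20, "cotxe", "cotxes"))
print(render_count(21, "cotxe", "cotxes"))
```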
Preposition and Determiner: contraction
Prepositions configured in the container are adapted automatically whenever they must contract with the following determiner.
"al" is a contraction of the preposition "a" (to) and the definite article "el" (the). For example:
"Vull anar al parc."
(I want to go to the park.)
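A minimal Python sketch of this contraction step is shown below. The table lists the common Catalan contractions of "a", "de", and "per" with the masculine articles "el" and "els"; whether the platform covers exactly this set is an assumption, and `join_preposition` is a hypothetical helper, not a platform function. The caller is expected to pass the article in its final form, so an elided article (l') is simply attached to the noun and never contracted.

```python
# Preposition + masculine definite article contractions in Catalan.
CONTRACTIONS = {
    ("a", "el"): "al",
    ("a", "els"): "als",
    ("de", "el"): "del",
    ("de", "els"): "dels",
    ("per", "el"): "pel",
    ("per", "els"): "pels",
}


def join_preposition(preposition: str, article: str, noun: str) -> str:
    """Contract preposition and article where a contracted form exists."""
    contracted = CONTRACTIONS.get((preposition, article))
    if contracted:
        return f"{contracted} {noun}"
    # An elided article such as l' attaches directly to the noun.
    if article.endswith("'"):
        return f"{preposition} {article}{noun}"
    return f"{preposition} {article} {noun}"


print(join_preposition("a", "el", "parc"))
print(join_preposition("a", "la", "platja"))
```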
Determiner Switch
Determiners can be switched based on lexicon information. If you configure a determiner switch for a noun in the lexicon, the platform automatically switches to the target determiner whenever the container uses the determiner that the switch replaces. In the first example, the container settings for Alemanya (Germany) are: preposition="a", determiner unset (blank), and case="nominative".
a Alemanya
(to Germany)
In the second example, the container settings for Països Baixos (Netherlands) are the same: preposition="a", determiner unset (blank), and case="nominative". However, the determiner switches from none to definite, because this switch is configured in the lexicon entry for Països Baixos. The definite determiner then contracts automatically with the preposition, so it becomes als.
als Països Baixos
(to the Netherlands)
Note
If the lexicon entry of a country includes a switch from none to definite, you can still use the country without the article (e.g., just "Netherlands"): set determiner=none in the container, and the determiner always remains none. Only an unset determiner (blank) triggers the switch from none to another determiner.
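The difference between an unset determiner and an explicit none can be sketched in Python as follows. The names `LEXICON_SWITCH` and `resolve_determiner` are hypothetical and only model the none-to-definite case described above. In this sketch, Python's `None` stands for the unset (blank) container setting, while the string "none" stands for an explicit determiner=none.

```python
from typing import Optional

# Hypothetical lexicon data: nouns whose unset determiner switches away
# from "none". Alemanya has no entry here, so it keeps no determiner.
LEXICON_SWITCH = {"Països Baixos": "definite"}


def resolve_determiner(noun: str, determiner: Optional[str] = None) -> str:
    """Return the determiner type that is effectively used for the noun."""
    if determiner is None:
        # Unset (blank): the lexicon switch applies if one is configured.
        return LEXICON_SWITCH.get(noun, "none")
    # Any explicit setting, including "none", is kept as configured.
    return determiner


print(resolve_determiner("Alemanya"))
print(resolve_determiner("Països Baixos"))
print(resolve_determiner("Països Baixos", "none"))
```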