Characteristics of the Catalan language in AX Semantics

Fundamentals

In Catalan, you need to know the number and gender of a noun to form the accompanying adjectives, determiners, numerals, and pronouns correctly.

Catalan has two genders for nouns: masculine and feminine. There are also two numbers: singular and plural. Additionally, Catalan has only one case for nouns, but three cases for pronouns.

grammatical name     values             examples
number               singular           un cotxe vell (one old car)
                     plural             cinc cotxes vells (five old cars)
gender               masculine          metge vell (old doctor)
                     feminine           dona vella (old woman)
case (noun)          nominative         el gos (the dog)
case (pronoun)       nominative         Mario construeix una casa. Ell construeix una casa. (Mario builds a house. He builds a house.)
                     accusative         Mario construeix una casa. Mario la construeix. (Mario builds a house. Mario builds it.)
                     dative             La Maria regala un llibre a Luigi. La Maria li regala un llibre. (Maria gives Luigi a book. Maria gives him a book.)
adjectives (noun)    after noun         una poma vermella (a red apple)
                     before noun        una bona poma (a good apple)
verb tenses          present            ell canta (he sings)
                     past (imperfect)   ell cantava (he sang)
                     future             ell cantarà (he will sing)

The standard order of a noun phrase in Catalan is: preposition + determiner + numeral + noun + adjective. For example:

sobre   aquests   tres    cantants   populars
about   these     three   singers    popular
PREP    DET       NUM     NOUN       ADJ
"about these three popular singers"

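The ordering rule can be illustrated with a minimal Python sketch. The slot names and the `build_noun_phrase` helper are hypothetical and only serve the illustration; they are not part of the AX NLG platform.

```python
# Hypothetical slot names; their order mirrors the noun phrase pattern
# preposition + determiner + numeral + noun + adjective.
SLOT_ORDER = ("preposition", "determiner", "numeral", "noun", "adjective")


def build_noun_phrase(**slots):
    """Join the filled slots in the standard Catalan noun phrase order."""
    unknown = set(slots) - set(SLOT_ORDER)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    # Empty or missing slots are skipped.
    return " ".join(slots[name] for name in SLOT_ORDER if slots.get(name))


# The keyword order does not matter; the output order is fixed by SLOT_ORDER.
phrase = build_noun_phrase(
    noun="cantants",
    adjective="populars",
    numeral="tres",
    determiner="aquests",
    preposition="sobre",
)
print(phrase)  # sobre aquests tres cantants populars
```

The sketch always places the adjective after the noun. Adjectives that precede the noun, such as bona in una bona poma, would need an extra position setting.
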
Lexicon

Nouns

Catalan nouns are inflected for number. When the lexicon entry is missing, the NLG platform will try to find the most probable gender based on heuristics. However, nouns should be added to the lexicon with their grammatical gender if they do not inflect regularly.

Lexicon entries for nouns may also be necessary for inflecting determiners, adjectives, numerals, and pronouns correctly. If such an entry is required but missing, these accompanying words are omitted from the output.

Examples

The basic lexicon entry for pare (father) contains:

  • gender: masculine
  • inflection table for case and number:

                 Singular    Plural
    Nominative   pare        pares
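
As a rough illustration, the information in this entry could be modelled as follows. The `NounEntry` class and its field names are assumptions made for this sketch; they do not reflect the platform's actual lexicon format.

```python
from dataclasses import dataclass, field


@dataclass
class NounEntry:
    """Hypothetical model of a noun lexicon entry."""
    lemma: str
    gender: str  # "masculine" or "feminine"
    # Inflection table: (case, number) -> inflected form
    forms: dict = field(default_factory=dict)

    def inflect(self, number, case="nominative"):
        # Catalan nouns have a single case, so nominative is the default.
        return self.forms[(case, number)]


pare = NounEntry(
    lemma="pare",
    gender="masculine",
    forms={
        ("nominative", "singular"): "pare",
        ("nominative", "plural"): "pares",
    },
)
print(pare.inflect("plural"))  # pares
```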

Adjectives

In the lexicon, the inflection table encodes number and gender. For adjective position, the default is "after noun".
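
A minimal sketch of how such an entry might drive agreement and placement is shown below. `AdjectiveEntry` and `attach` are hypothetical names used only for illustration, not platform identifiers.

```python
from dataclasses import dataclass


@dataclass
class AdjectiveEntry:
    """Hypothetical model of an adjective lexicon entry."""
    forms: dict  # (gender, number) -> inflected form
    position: str = "after noun"  # default position


def attach(adjective, noun, gender, number):
    """Pick the agreeing form and place it before or after the noun."""
    form = adjective.forms[(gender, number)]
    if adjective.position == "before noun":
        return f"{form} {noun}"
    return f"{noun} {form}"


vermell = AdjectiveEntry(forms={
    ("masculine", "singular"): "vermell",
    ("feminine", "singular"): "vermella",
    ("masculine", "plural"): "vermells",
    ("feminine", "plural"): "vermelles",
})

# "bon" is the masculine singular form used before a noun.
bo = AdjectiveEntry(
    forms={
        ("masculine", "singular"): "bon",
        ("feminine", "singular"): "bona",
        ("masculine", "plural"): "bons",
        ("feminine", "plural"): "bones",
    },
    position="before noun",
)

print(attach(vermell, "poma", "feminine", "singular"))  # poma vermella
print(attach(bo, "poma", "feminine", "singular"))       # bona poma
```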

Verbs

Catalan verbs inflect for person, number, and tense. The most common verbs are encoded in our software. If a verb inflects incorrectly, you should add it to the lexicon.

Container settings

Determiner

The AX NLG platform supports the following determiners for Catalan: definite, indefinite, demonstratives (proximal + distal), and possessives.

Numerals

When a numeral variable is used, the noun automatically agrees in number with the numeral. Four types of numerals are possible on the AX NLG platform: cardinal, cardinal as digit, ordinal, and ordinal as digit.

         cardinal                ordinal
text     nou dies (nine days)    el novè dia (the ninth day)
digit    9 dies (9 days)         el 9è dia (the 9th day)

For Catalan, both cardinal and ordinal numerals are written out up to and including 20; above 20, the output is in digit form. For example:

vint cotxes
(twenty cars)
21 cotxes
(21 cars)
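
The threshold can be sketched in a few lines of Python. The `render_cardinal` function is a hypothetical helper, not a platform function. It covers only the masculine cardinal forms and assumes that values below 1 fall back to digits, which the text above does not specify.

```python
# Masculine cardinal forms for 1..20 (feminine forms such as "una" and
# "dues" are left out of this sketch).
CARDINALS = [
    "un", "dos", "tres", "quatre", "cinc", "sis", "set", "vuit", "nou", "deu",
    "onze", "dotze", "tretze", "catorze", "quinze", "setze", "disset",
    "divuit", "dinou", "vint",
]


def render_cardinal(n):
    """Write out 1..20 as words; use digits for everything else."""
    if 1 <= n <= 20:
        return CARDINALS[n - 1]
    # Assumption: values outside 1..20 are rendered as digits.
    return str(n)


print(f"{render_cardinal(20)} cotxes")  # vint cotxes
print(f"{render_cardinal(21)} cotxes")  # 21 cotxes
```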

Preposition and Determiner: contraction

Prepositions configured in the container are adapted automatically whenever phonetic assimilation with the following determiner is required.

"al" is a contraction of the preposition "a" (to) and the definite article "el" (the). For example:

"Vull anar al parc." 
(I want to go to the park.)
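
A lookup table is one simple way to picture this behaviour. The sketch below lists the standard Catalan contractions of a, de, and per with the masculine articles el and els. The function name is hypothetical, and the platform's internal implementation may differ.

```python
# Standard contractions of preposition + masculine definite article.
CONTRACTIONS = {
    ("a", "el"): "al",
    ("a", "els"): "als",
    ("de", "el"): "del",
    ("de", "els"): "dels",
    ("per", "el"): "pel",
    ("per", "els"): "pels",
}


def join_preposition_determiner(preposition, determiner):
    """Return the contracted form if one exists, else both words."""
    return CONTRACTIONS.get(
        (preposition, determiner), f"{preposition} {determiner}"
    )


print(join_preposition_determiner("a", "el"), "parc")    # al parc
print(join_preposition_determiner("a", "la"), "platja")  # a la platja
```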

Determiner Switch

Determiners can be switched according to lexicon information. If the lexicon specifies a definite determiner for a noun (e.g., for certain country names), that determiner is activated for the container. In the first example, the default determiner setting is used: None (no article). The container settings are determiner="None" and case="nominative".

Viatgen a Alemanya.
(They travel to Germany.)

In the second example, the container settings for Països Baixos (Netherlands) are still determiner="None" and case="nominative", but a determiner switch is set in the noun's lexicon information.

Viatgen als Països Baixos.
(They travel to the Netherlands.)

Thus, the determiner switches from none to definite. The definite determiner then contracts automatically with the preposition a, yielding als.
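
The two examples can be traced with a small Python sketch. The `DETERMINER_SWITCH` table and the `render_destination` function are hypothetical stand-ins for the lexicon information and the container logic; they are not the platform's API.

```python
# Hypothetical lexicon information: nouns that switch to a definite determiner.
DETERMINER_SWITCH = {"Països Baixos": "els"}

# Contractions of the preposition "a" with the masculine definite articles.
CONTRACTIONS = {("a", "el"): "al", ("a", "els"): "als"}


def render_destination(preposition, noun, determiner=None):
    """Render preposition + noun, applying the determiner switch if set."""
    # The container asks for no determiner; the lexicon may override this.
    if determiner is None:
        determiner = DETERMINER_SWITCH.get(noun)
    if determiner is None:
        return f"{preposition} {noun}"
    # Contract preposition and determiner where a contracted form exists.
    head = CONTRACTIONS.get(
        (preposition, determiner), f"{preposition} {determiner}"
    )
    return f"{head} {noun}"


print("Viatgen", render_destination("a", "Alemanya"))       # Viatgen a Alemanya
print("Viatgen", render_destination("a", "Països Baixos"))  # Viatgen als Països Baixos
```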