Characteristics of the Catalan language in AX Semantics


In Catalan, you need to know the number and gender of a noun to form the accompanying adjectives, determiners, numerals, and pronouns correctly.

Catalan has two genders for nouns: masculine and feminine. There are also two numbers: singular and plural. Additionally, Catalan has only one case for nouns, but three cases for pronouns.

grammatical name    values            examples
number              singular          un cotxe vell (one old car)
                    plural            cinc cotxes vells (five old cars)
gender              masculine         metge vell (old doctor)
                    feminine          dona vella (old woman)
case (noun)         nominative        el gos (the dog)
case (pronoun)      nominative        Mario construeix una casa. Ell construeix una casa.
                                      (Mario builds a house. He builds a house.)
                    accusative        Mario construeix una casa. Mario la construeix.
                                      (Mario builds a house. Mario builds it.)
                    dative            La Maria regala un llibre a Luigi. La Maria li regala un llibre.
                                      (Maria gives Luigi a book. Maria gives him a book.)
adjective position  after noun        una poma vermella (a red apple)
                    before noun       una bona poma (a good apple)
verb tenses         present           ell canta (he sings)
                    past (imperfect)  ell cantava (he sang)
                    future            ell cantarà (he will sing)

The standard order of a noun phrase in Catalan is the following:

preposition + determiner + numeral + noun + adjective

See for example:

sobre   aquests tres    cantants populars
about   these   three   singers  popular
PREP    DET     NUM     NOUN     ADJ
"about these three popular singers"

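The slot order above can be illustrated with a minimal Python sketch. It is not part of the AX NLG platform: the class NounPhrase, its field names, and the adjective_before_noun flag are hypothetical and only show how the slots line up. Placing a pre-nominal adjective between the numeral and the noun is an assumption based on general Catalan word order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NounPhrase:
    """Hypothetical container for the slots of a Catalan noun phrase."""

    noun: str
    preposition: Optional[str] = None
    determiner: Optional[str] = None
    numeral: Optional[str] = None
    adjective: Optional[str] = None
    # The default adjective position is "after noun".
    adjective_before_noun: bool = False

    def render(self) -> str:
        # preposition + determiner + numeral + noun + adjective,
        # with the adjective moved in front of the noun on request.
        if self.adjective_before_noun:
            core = [self.adjective, self.noun]
        else:
            core = [self.noun, self.adjective]
        slots = [self.preposition, self.determiner, self.numeral, *core]
        # Empty slots are simply skipped.
        return " ".join(slot for slot in slots if slot)


phrase = NounPhrase(
    noun="cantants",
    preposition="sobre",
    determiner="aquests",
    numeral="tres",
    adjective="populars",
)
print(phrase.render())  # sobre aquests tres cantants populars
```

The sketch only concatenates forms that are already inflected; agreement in gender and number is left to the caller.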


Catalan nouns are inflected for number. When the lexicon entry is missing, the NLG platform will try to find the most probable gender based on heuristics. However, nouns should be added to the lexicon with their grammatical gender if they do not inflect regularly.

Lexicon entries for nouns may also be necessary to inflect determiners, adjectives, numerals, and pronouns correctly. If a required lexicon entry is missing, these words are omitted from the output.


The basic lexicon entry for pare (father) contains:

  • gender: masculine
  • inflection table for case and number:


For adjectives, the inflection table in the lexicon encodes number and gender. The default adjective position is "after noun".


Catalan verbs inflect for person, number, and tense. The most common verbs are encoded in our software. If a verb inflects incorrectly, you should add it to the lexicon.

Container settings


The AX NLG platform supports the following determiners for Catalan: definite, indefinite, demonstratives (proximal & distal), and possessives.


Nouns automatically agree in number with the numeral when a numeral variable is used. Four types of numerals are possible on the AX NLG platform: cardinal, cardinal as digit, ordinal, and ordinal as digit.

         cardinal               ordinal
text     nou dies (nine days)   el novè dia (the ninth day)
digit    9 dies (9 days)        el 9è dia (the 9th day)

For Catalan, both cardinal and ordinal numerals are written out up to 20 on the platform; above 20, the output is in digit form. For example:

vint cotxes
(twenty cars)
21 cotxes
(21 cars)
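
The threshold rule for cardinals can be sketched in Python as follows. This is an illustration only, not the platform's implementation: the names CATALAN_CARDINALS, render_cardinal, and count_phrase are hypothetical, the word list uses the standard Catalan cardinals from 1 to 20, and the handling of values outside that range (digits) is an assumption. Gender agreement of numerals (un/una, dos/dues) and ordinals are not modelled.

```python
# Standard Catalan cardinals for 1..20 (masculine forms only).
CATALAN_CARDINALS = {
    1: "un", 2: "dos", 3: "tres", 4: "quatre", 5: "cinc",
    6: "sis", 7: "set", 8: "vuit", 9: "nou", 10: "deu",
    11: "onze", 12: "dotze", 13: "tretze", 14: "catorze", 15: "quinze",
    16: "setze", 17: "disset", 18: "divuit", 19: "dinou", 20: "vint",
}


def render_cardinal(n: int) -> str:
    """Write the cardinal out as text up to 20, otherwise as digits."""
    return CATALAN_CARDINALS.get(n, str(n))


def count_phrase(n: int, singular: str, plural: str) -> str:
    """Combine the numeral with a noun that agrees with it in number."""
    noun = singular if n == 1 else plural
    return f"{render_cardinal(n)} {noun}"


print(count_phrase(20, "cotxe", "cotxes"))  # vint cotxes
print(count_phrase(21, "cotxe", "cotxes"))  # 21 cotxes
```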

Preposition and Determiner: contraction

Prepositions configured in the container are adapted automatically whenever phonetic assimilation with the following determiner is required.

"al" is a contraction of the preposition "a" (to) and the definite article "el" (the). For example:

"Vull anar al parc." 
(I want to go to the park.)
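
A contraction of this kind can be sketched as a simple lookup in Python. The function name and the table below are hypothetical and do not describe the platform's internals. Only "al" and "als" appear in this documentation; the entries for "de" and "per" are the standard Catalan contractions and are included as an assumption. Elision of the article before a vowel (l') is not handled here.

```python
# Preposition + masculine definite article contractions in standard Catalan.
CONTRACTIONS = {
    ("a", "el"): "al",
    ("a", "els"): "als",
    ("de", "el"): "del",
    ("de", "els"): "dels",
    ("per", "el"): "pel",
    ("per", "els"): "pels",
}


def join_preposition_determiner(preposition: str, determiner: str) -> str:
    """Return the contracted form if one exists, else both words."""
    key = (preposition, determiner)
    return CONTRACTIONS.get(key, f"{preposition} {determiner}")


print(join_preposition_determiner("a", "el") + " parc")  # al parc
print(join_preposition_determiner("a", "la") + " casa")  # a la casa
```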

Determiner Switch

Determiners can be switched according to lexicon information. If you set a determiner switch for a specific noun in the lexicon, the determiner configured in the container is automatically replaced by the switch's target determiner whenever the container uses the determiner the switch starts from.

In the first example, the container settings for Alemanya (Germany) are: preposition="a", determiner unset (blank), and case="nominative". The output contains no determiner.

a Alemanya
(to Germany)

In the second example, the container settings for Països Baixos (Netherlands) are the same: preposition="a", determiner unset (blank), and case="nominative". However, the determiner switches from none to definite because this switch is configured in the lexicon entry for Països Baixos. The definite determiner then contracts automatically with the preposition, giving als.

als Països Baixos
(to the Netherlands)


If the lexicon entry of a country includes a switch from none to definite, you can still use the country name without the article (e.g., just "Netherlands"): set determiner=none explicitly in the container, and the determiner remains none. Only an unset (blank) determiner triggers the switch from none to another determiner.
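
The difference between an unset determiner and an explicit determiner=none can be sketched in Python. This is a simplified model under stated assumptions, not the platform's code: LEXICON_SWITCH and resolve_determiner are hypothetical names, an unset container value is represented by Python's None, and only the switch from none is modelled.

```python
from typing import Optional

# Hypothetical lexicon data: noun -> determiner used instead of "none".
LEXICON_SWITCH = {
    "Països Baixos": "definite",
}


def resolve_determiner(noun: str, container_determiner: Optional[str] = None) -> str:
    """Decide which determiner type is used for a noun.

    container_determiner=None models an unset (blank) container field;
    the string "none" models an explicit determiner=none setting.
    """
    if container_determiner is not None:
        # An explicit setting, including "none", is never switched.
        return container_determiner
    # Unset: apply the lexicon switch if there is one, otherwise stay at none.
    return LEXICON_SWITCH.get(noun, "none")


print(resolve_determiner("Alemanya"))               # none
print(resolve_determiner("Països Baixos"))          # definite
print(resolve_determiner("Països Baixos", "none"))  # none
```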