Characteristics of the Portuguese language in AX Semantics
Fundamentals
In Portuguese, you need to know the gender of the noun in order to form the accompanying determiners, adjectives, and numerals correctly.
Portuguese has two genders for nouns: masculine and feminine. There are two numbers: singular and plural.
Nouns have only one case, the nominative. Personal pronouns additionally have accusative and dative forms.
| grammatical name | values | examples |
|---|---|---|
| gender | masculine | o porto azul (the blue port) |
| | feminine | a cadeira azul (the blue chair) |
| number | singular | uma casa vermelha (a red house) |
| | plural | duas casas vermelhas (two red houses) |
| cases (noun) | nominative | o cão (the dog) |
| cases (pronoun) | nominative | Mário constrói uma casa. Ele constrói uma casa. (Mário builds a house. He builds a house.) |
| | accusative | Mário constrói uma casa. Mário a constrói. (Mário builds a house. Mário builds it.) |
| | dative | Maria dá um livro a Luigi. Maria lhe dá um livro. (Maria gives Luigi a book. Maria gives him a book.) |
| adjectives (noun) | after-noun | o telefone preto (the black phone) |
| | before-noun | o excelente produto (the excellent product) |
| verb tenses | present | ele escreve (he writes) |
| | past (preterite) | ele escreveu (he wrote) |
| | past participle | escrito (written) |
| | imperfect | ele escrevia (he was writing / he used to write) |
| | future | ele escreverá (he will write) |
| | gerund | escrevendo (writing) |
The standard order of a noun phrase in Portuguese is the following:
preposition + determiner + numeral + noun + adjective
See for example:
com estes três livros populares
with these three books[pl] popular
PREP DET NUM NOUN ADJ
"with these three popular books"
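The slot order above can be sketched in a few lines of Python. This is only an illustration, not how the platform builds phrases: the function name `assemble_noun_phrase` and its parameters are hypothetical, and every word is assumed to be already inflected.

```python
from typing import Optional


def assemble_noun_phrase(
    noun: str,
    preposition: Optional[str] = None,
    determiner: Optional[str] = None,
    numeral: Optional[str] = None,
    adjective: Optional[str] = None,
) -> str:
    """Join the filled slots in the standard order:
    preposition + determiner + numeral + noun + adjective.

    Hypothetical helper: all forms are assumed to be inflected already.
    """
    slots = [preposition, determiner, numeral, noun, adjective]
    # Skip slots that were left empty.
    return " ".join(slot for slot in slots if slot)


print(assemble_noun_phrase(
    "livros",
    preposition="com",
    determiner="estes",
    numeral="três",
    adjective="populares",
))
# com estes três livros populares
```

Adjectives marked "before noun" in the lexicon would need a different slot order, which this sketch does not cover.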
Lexicon
Nouns
For Portuguese nouns, the lexicon needs to encode gender and determiner changes. If the lexicon entry is missing, the NLG platform tries to determine the most probable gender using heuristics. Irregular plural forms should be added to the lexicon.
Lexicon entries for nouns may also be necessary for inflecting determiners, adjectives, and pronouns correctly. If a required lexicon entry is missing, these words are omitted from the output entirely.
Examples
The basic lexicon entry for casa (house) contains:
- gender: f
- inflection table for case and number:
| | Singular | Plural |
|---|---|---|
| Nominative | casa | casas |
Note
If you need lexicon entries for countries, contact support. You will receive Portuguese entries with automatic handling of determiners.
Adjectives
The lexicon can encode an adjective's inflection table for gender and number as well as its position (before or after the noun).
The default adjective position is "after noun". Certain adjectives should stay before the noun. In that case, select "before noun" in the lexicon. For instance, adjectives like bom ("good") or belo ("beautiful") often precede the noun:
o bom livro
(the good book)
Verbs
The most common verbs are already encoded in the software. If a verb is inflected incorrectly, add it to the lexicon.
Container settings
Determiners
The AX NLG platform supports the following determiners for Portuguese: definite, indefinite, demonstrative, distal, proximal, possessive, and quantifier (every).
Pronouns
The AX NLG platform supports the following pronouns for Portuguese: proximal demonstrative, personal, reflexive, and possessive.
Note that there are three variants of the accusative form of the personal pronoun (o, lo, no), all of which are offered in the pronoun options. Which variant is used depends on the word preceding the pronoun. We recommend choosing "Personal" if it is clear that the "lo"-variant is needed, and "Personal (nasal)" if it is clear that the "no"-variant is needed. When the preceding verb can change depending on the data, we recommend using "Personal (base)", which automatically chooses between the "o"- and "no"-variant depending on the preceding word.
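The automatic choice described for "Personal (base)" can be pictured with a short Python sketch. It is not the platform's implementation. It assumes, from general Portuguese grammar, that the "no"-variant follows words ending in a nasal sound (-m, -ão, -õe) and that the "o"-variant is used otherwise. The function name `choose_base_variant` is hypothetical, and only the masculine singular form is handled.

```python
# Assumption: these endings mark a nasal final sound in the preceding verb form.
NASAL_ENDINGS = ("m", "ão", "õe")


def choose_base_variant(preceding_word: str) -> str:
    """Pick between the "o"- and "no"-variant of the accusative pronoun.

    Hypothetical helper that mirrors the behaviour described for
    "Personal (base)"; it never returns the "lo"-variant.
    """
    word = preceding_word.lower()
    return "no" if word.endswith(NASAL_ENDINGS) else "o"


for verb in ("compra", "compram", "dão", "põe"):
    print(f"{verb}-{choose_base_variant(verb)}")
# compra-o
# compram-no
# dão-no
# põe-no
```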
Numerals
When a numeral variable is used, the noun automatically agrees in number with the numeral. Four types of numerals are available on the AX NLG platform: cardinal, cardinal as digit, ordinal, and ordinal as digit.
| | cardinal | ordinal |
|---|---|---|
| text | nove dias (nine days) | o nono dia (the ninth day) |
| digit | 9 dias (9 days) | o 9º dia (the 9th day) |
For Portuguese, both cardinal and ordinal numerals are written out as text up to and including 100. Above 100, the output is in digit form. Take cardinal numerals as an example:
cem automóveis
(one hundred cars)
101 automóveis
(101 cars)
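The cut-off at 100 can be illustrated with a small Python sketch. It is an approximation under stated assumptions, not the platform's code: the names `cardinal_words` and `render_cardinal` are hypothetical, the spellings are the European Portuguese masculine forms (for example dezasseis), and numbers below 1 simply fall back to digits because the text above does not specify them.

```python
# Masculine European Portuguese forms for 1-19 (index 0 is unused).
UNITS = [
    "", "um", "dois", "três", "quatro", "cinco", "seis", "sete", "oito",
    "nove", "dez", "onze", "doze", "treze", "catorze", "quinze",
    "dezasseis", "dezassete", "dezoito", "dezanove",
]
TENS = {
    20: "vinte", 30: "trinta", 40: "quarenta", 50: "cinquenta",
    60: "sessenta", 70: "setenta", 80: "oitenta", 90: "noventa",
}


def cardinal_words(n: int) -> str:
    """Spell out a cardinal between 1 and 100 (hypothetical helper)."""
    if not 1 <= n <= 100:
        raise ValueError("only 1..100 are spelled out in this sketch")
    if n == 100:
        return "cem"
    if n < 20:
        return UNITS[n]
    unit = n % 10
    tens = n - unit
    # Tens and units are joined with "e", e.g. "vinte e um".
    return TENS[tens] if unit == 0 else f"{TENS[tens]} e {UNITS[unit]}"


def render_cardinal(n: int) -> str:
    """Text up to and including 100, digits otherwise."""
    return cardinal_words(n) if 1 <= n <= 100 else str(n)


print(render_cardinal(100), "automóveis")  # cem automóveis
print(render_cardinal(101), "automóveis")  # 101 automóveis
```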
Preposition contractions
If users configure prepositions and determiners in the container, the two are contracted automatically wherever phonetic assimilation requires it. For instance, the preposition `em` is contracted with the definite determiner `a`:
na cozinha
(in the kitchen)
Additionally, if users configure prepositions with demonstrative pronouns or third-person personal pronouns, they are also adjusted automatically where contractions apply. Take the preposition `de` for example:
destas [de + estas]
(of these)
dele [de + ele]
(of him)
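One way to picture these contractions is a lookup table keyed by the preposition and the word that follows it. The Python sketch below is illustrative only: the table `CONTRACTIONS` and the function `contract` are hypothetical names, and the table covers just a handful of standard Portuguese contractions rather than everything the platform handles.

```python
# A small, incomplete table of standard contractions (illustration only).
CONTRACTIONS = {
    ("em", "o"): "no",
    ("em", "a"): "na",
    ("em", "os"): "nos",
    ("em", "as"): "nas",
    ("de", "o"): "do",
    ("de", "a"): "da",
    ("de", "estas"): "destas",
    ("de", "ele"): "dele",
}


def contract(preposition: str, following: str) -> str:
    """Return the contracted form if one is listed, else both words."""
    return CONTRACTIONS.get(
        (preposition, following), f"{preposition} {following}"
    )


print(contract("em", "a"), "cozinha")  # na cozinha
print(contract("de", "estas"))         # destas
print(contract("de", "ele"))           # dele
print(contract("com", "a"), "casa")    # com a casa
```

Pairs that are not in the table, such as `com` + `a`, are left as two separate words.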
Determiner switch
Determiners can be switched according to lexical information. If a determiner switch is set for a specific noun in the lexicon, the platform automatically replaces the determiner selected in the container with the switch's target determiner whenever the selected determiner is the one the switch applies to. The container setting for `Berlim` in the first example is: `preposition="para"`, determiner is unset (blank), and `case="nominative"`.
para Berlim
(to Berlin)
In the second example, the container setting for `Holanda` is: `preposition="para"`, determiner is unset (blank), and `case="nominative"`. However, the determiner switches from `none` to `definite`, because this switch is configured in the lexicon entry for `Holanda`. As a result, the definite determiner (i.e. `a`) is added to `Holanda`.
para a Holanda
(to the Netherlands)
Note
If the lexicon entry of a country includes a switch from none to definite, there is still a way to use the country without the article (e.g., just "Netherlands"). Setting `determiner=none` explicitly in the container keeps the determiner at `none`. Only an unset determiner (blank) triggers the switch from `none` to another determiner.
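The difference between an unset determiner and an explicit `none` can be sketched in Python as follows. The lexicon layout, the dictionary `LEXICON`, and the functions `resolve_determiner` and `render` are all hypothetical. They only mimic the behaviour described in this section and cover just the two example nouns.

```python
from typing import Optional

# Hypothetical lexicon: Holanda switches an unset determiner to "definite".
LEXICON = {
    "Holanda": {"gender": "f", "switch": {"none": "definite"}},
    "Berlim": {"switch": {}},
}
DEFINITE_ARTICLE = {"m": "o", "f": "a"}


def resolve_determiner(noun: str, container_determiner: Optional[str]) -> str:
    """None means the container field was left blank (unset)."""
    if container_determiner is None:
        # Only a blank determiner lets the lexicon switch apply.
        switch = LEXICON.get(noun, {}).get("switch", {})
        return switch.get("none", "none")
    # An explicit setting, including "none", is always respected.
    return container_determiner


def render(preposition: str, noun: str,
           container_determiner: Optional[str] = None) -> str:
    determiner = resolve_determiner(noun, container_determiner)
    parts = [preposition]
    if determiner == "definite":
        parts.append(DEFINITE_ARTICLE[LEXICON[noun]["gender"]])
    parts.append(noun)
    return " ".join(parts)


print(render("para", "Berlim"))           # para Berlim
print(render("para", "Holanda"))          # para a Holanda
print(render("para", "Holanda", "none"))  # para Holanda
```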
Language Variants
The AX NLG platform offers two variants of the Portuguese language:
- Portugal (Standard)
- Brazil
The differences between these variants are mostly lexical. The grammatical difference shows in the forms of the possessive determiners: for example, "my book" is `o meu livro` in Portugal, but `meu livro` in Brazil.