Characteristics of the Arabic language in AX Semantics

Fundamentals

In Arabic, you need to know the gender and number of a noun to form the accompanying numerals, adjectives, and pronouns correctly.

Arabic has two genders for nouns: masculine and feminine. There are three numbers: singular, dual and plural. Additionally, Arabic has three cases (nominative, accusative, and genitive) for nouns.

grammatical name | values | examples
gender | masculine | معطف جديد [jadīd miʕṭaf] (new coat)
gender | feminine | مكتبة جَدِيدَة [jadīda maktaba] (new library)
case | nominative | الكلب [alkalb] (the dog)
case | accusative | أرى الكلبَ [alkalba ʔarā] (I see the dog)
case | genitive | جرس الكلب [alkalbi jaras] (the dog's bell)
number | singular | معطف [miʕṭafun] (a coat)
number | dual | معطفان [miʕṭafāni] (two coats)
number | plural | خمس معاطف [maʕāṭifu ḵams] (five coats)
adjectives (noun) | after noun (right-to-left order) | تفاحة حمراء [ḥamrāʔ tuffāḥa] (red apple)
verb tenses | present | هو يساعد [yusāʕidu huwa] (he helps)
verb tenses | past | هو ساعد [sāʕada huwa] (he helped)

The standard order (right-to-left order) of a noun phrase in Arabic is the following:

preposition + determiner + noun + numeral + adjective

For example:

حول هذه الكتب الثلاثة الجديدة
ADJ     NUM    NOUN  DET PREP
[aljadida althalathat alkutub hadhih hawl]
"about these three new books"

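Unicode strings hold Arabic text in logical (reading) order, so the pattern above can be sketched as a plain join of slots, even though the result is displayed right to left. The slot names and the function `build_noun_phrase` are hypothetical illustrations and not part of the AX NLG platform.

```python
# Hypothetical slot names; the order follows the noun phrase pattern above.
SLOT_ORDER = ["preposition", "determiner", "noun", "numeral", "adjective"]


def build_noun_phrase(slots: dict[str, str]) -> str:
    """Join the filled slots in logical (reading) order, skipping empty ones."""
    return " ".join(slots[name] for name in SLOT_ORDER if slots.get(name))


phrase = build_noun_phrase({
    "preposition": "حول",
    "determiner": "هذه",
    "noun": "الكتب",
    "numeral": "الثلاثة",
    "adjective": "الجديدة",
})
print(phrase)  # rendered right to left by a bidi-aware terminal or editor
```
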
Note

Arabic has a right-to-left writing system. The AX NLG platform only supports left-to-right scripts for now. To use right-to-left text, you have to copy and paste it into the platform.

Lexicon

Nouns

Arabic nouns inflect for number and case. Add nouns to the lexicon if they are irregular. Lexicon entries for nouns may also be necessary to inflect determiners, adjectives, numerals, and pronouns correctly. If such an entry is required but missing, these accompanying words are omitted.

Examples

The basic lexicon entry for كلب [kalb] (dog) contains:

  • gender: m
  • inflection table for case and number:

Case | Singular | Dual | Plural
Nominative | كلب [kalb] | كَلْبَانِ [kalbāni] | كِلَابٌ [kilābun]
Accusative | كَلْبًا [kalban] | كَلْبَيْنِ [kalbayni] | كِلَابًا [kilāban]
Genitive | كَلْبٍ [kalbin] | كَلْبَيْنِ [kalbayni] | كِلَابٍ [kilābin]
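
As a rough illustration, the entry above can be modelled as a gender value plus a table keyed by case and number. The dictionary layout and the helper `lookup_form` are assumptions made for this sketch; they do not reflect the platform's actual storage format.

```python
# Hypothetical in-memory shape of the lexicon entry for "dog";
# not the AX NLG platform's real data model.
KALB_ENTRY = {
    "lemma": "كلب",
    "gender": "m",
    "forms": {
        ("nominative", "singular"): "كلب",
        ("nominative", "dual"): "كَلْبَانِ",
        ("nominative", "plural"): "كِلَابٌ",
        ("accusative", "singular"): "كَلْبًا",
        ("accusative", "dual"): "كَلْبَيْنِ",
        ("accusative", "plural"): "كِلَابًا",
        ("genitive", "singular"): "كَلْبٍ",
        ("genitive", "dual"): "كَلْبَيْنِ",
        ("genitive", "plural"): "كِلَابٍ",
    },
}


def lookup_form(entry: dict, case: str, number: str):
    """Return the stored form for a case/number pair, or None if it is missing."""
    return entry["forms"].get((case, number))


print(lookup_form(KALB_ENTRY, "genitive", "plural"))
```

Returning `None` for a missing cell keeps the lookup explicit about gaps in the table instead of guessing a form.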

Adjectives

In Arabic, the default position for an adjective is "after noun". The inflection table can encode case, gender, and number.

Verbs

Arabic verbs inflect for person, number, gender, and tense. The most common verbs are encoded in our software. If a verb is inflected incorrectly, add it to the lexicon.

Container settings

Determiner

The AX NLG platform supports the following determiners for Arabic: definite and possessive. In general, Arabic determiners are written together with the noun (or adjective), without a space. Determiners are used with both nouns and adjectives.

Pronoun

The AX NLG platform supports the following pronouns for Arabic: personal, demonstrative (proximal + distal), relative (which) and 3rd person possessives.

Numerals

Four types of numerals are possible on the AX NLG platform: cardinal, cardinal as digit, ordinal, and ordinal as digit. Take يوم [yawm] (day) for example:

form | cardinal | ordinal
text | تسعة أيام [ʔayyām tisʕa] (nine days) | اليوم التاسع [at-tāsiʕ al-yawma] (the ninth day)
digit | 9 أيام [ʔayyām tisʕa] (9 days) | اليوم 9 [at-tāsiʕ al-yawma] (the 9th day)

For Arabic, both cardinal and ordinal numerals are written out up to 100; above 100, the output is in digit form.
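
The threshold can be expressed as a one-line check. This is a minimal sketch: the function name is hypothetical, and treating exactly 100 as still written out is an assumption based on the wording "above 100".

```python
# Assumption: 100 itself is still written out; only larger values fall back to digits.
WRITTEN_OUT_LIMIT = 100


def numeral_rendering(value: int) -> str:
    """Return 'text' if the numeral would be written out, otherwise 'digit'."""
    return "text" if value <= WRITTEN_OUT_LIMIT else "digit"


for value in (9, 100, 101):
    print(value, numeral_rendering(value))
```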

Cardinals one and two and all ordinals follow the noun, like adjectives. For cardinals above two, the noun follows the numeral:

noun + numeral (right-to-left):
نقطة واحدة [wāḥida nuqṭa] (one point)
vs.
numeral + noun (right-to-left):
خمس نقاط [niqāṭ ḵams] (five points)

Additionally, if the cardinal "2" is used in a noun phrase, the noun takes its dual form and the numeral itself (i.e. two) is not written out. Take نقطة [nuqṭa] (point) for example:

with numeral:
نقطة واحدة [wāḥida nuqṭa]
vs.
without numeral:
نقطتان [nuqṭatāni]
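
The placement rules for cardinals can be summarised in a small sketch. The function `cardinal_phrase` and its parameters are hypothetical; all word forms are passed in by the caller, so nothing here stands in for the platform's own inflection.

```python
def cardinal_phrase(value: int, numeral: str, singular: str, dual: str, counted: str) -> str:
    """Order a cardinal and its noun following the placement rules above.

    `counted` is the noun form used after numerals above two
    (the plural in the "five points" example).
    """
    if value < 1:
        raise ValueError("this sketch only covers positive cardinals")
    if value == 1:
        return f"{singular} {numeral}"  # noun + numeral
    if value == 2:
        return dual                     # dual noun, numeral not written out
    return f"{numeral} {counted}"       # numeral + noun


print(cardinal_phrase(1, "واحدة", "نقطة", "نقطتان", "نقاط"))
print(cardinal_phrase(2, "", "نقطة", "نقطتان", "نقاط"))
print(cardinal_phrase(5, "خمس", "نقطة", "نقطتان", "نقاط"))
```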

Determiner switch

Determiners can be switched according to lexical information. If you set a determiner switch for a specific noun in the lexicon, the determiner you select in the container is automatically replaced by the one defined in the switch whenever that noun is used.

Contractions

On the AX NLG platform, specific Arabic prepositions ("ب", "ك", "ل") are automatically joined with determiners.

preposition + determiner + noun, not joined (right-to-left):
حالة + definite det + prep (مع) -> مع الحالَةُ [al-ḥālatu maʕa] (with the case)
vs.
preposition + determiner + noun, joined (right-to-left):
حالة + definite det + prep (ب) -> بالحالة [bialhala] (in the case)
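
The joining behaviour can be sketched as follows. The function `attach_preposition` is hypothetical and ignores vowel marks. Dropping the alif of the article after ل follows general Arabic orthography; whether the platform handles that case in exactly this way is an assumption.

```python
JOINING_PREPOSITIONS = {"ب", "ك", "ل"}  # the prepositions listed above
DEFINITE_ARTICLE = "ال"


def attach_preposition(preposition: str, noun: str) -> str:
    """Combine a preposition with a definite noun, joining where required."""
    if preposition not in JOINING_PREPOSITIONS:
        # Other prepositions stay separate words.
        return f"{preposition} {DEFINITE_ARTICLE}{noun}"
    if preposition == "ل":
        # Standard orthography drops the alif of the article after this preposition.
        return f"لل{noun}"
    return f"{preposition}{DEFINITE_ARTICLE}{noun}"


print(attach_preposition("مع", "حالة"))
print(attach_preposition("ب", "حالة"))
```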