Characteristics of the Arabic language in AX Semantics
In Arabic, you need to know the gender and number of a noun to form the accompanying numerals, adjectives, and pronouns correctly. Arabic has two genders for nouns: masculine and feminine. There are three numbers: singular, dual, and plural. Additionally, Arabic has three cases (nominative, accusative, and genitive) for nouns.
|gender||masculine||معطف جديد [jadīd miʕṭaf]|
|gender||feminine||مكتبة جَدِيدَة [jadīda maktaba]|
|case||accusative||أرى الكلبَ [alkalba ʔarā] (I see the dog)|
|case||genitive||جرس الكلب [alkalbi jaras] (the dog's bell)|
|adjectives (noun)||after noun|
|verb tenses||present||هو يساعد|
The standard order of a noun phrase in Arabic, read right to left, is:
preposition + determiner + noun + numeral + adjective. For example:
حول هذه الكتب الثلاثة الجديدة
ADJ NUM NOUN DET PREP
[aljadida althalathat alkutub hadhih hawl]
"about these three new books"
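The slot order above can be sketched in a few lines of Python. This is only an illustration: `SLOT_ORDER` and `build_noun_phrase` are hypothetical names, not part of the AX NLG platform. The strings are stored in logical (reading) order, so the preposition comes first in memory even though it is displayed rightmost.

```python
# Hypothetical slot order for an Arabic noun phrase, in logical (reading) order.
SLOT_ORDER = ["preposition", "determiner", "noun", "numeral", "adjective"]


def build_noun_phrase(slots: dict[str, str]) -> str:
    """Join the filled slots in logical order; missing or empty slots are skipped."""
    return " ".join(slots[name] for name in SLOT_ORDER if slots.get(name))


# The example phrase from the text: "about these three new books".
phrase = build_noun_phrase({
    "adjective": "الجديدة",
    "noun": "الكتب",
    "preposition": "حول",
    "numeral": "الثلاثة",
    "determiner": "هذه",
})
print(phrase)
```

The slots can be passed in any order because the function always reads them back in the fixed sequence.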
Arabic is written right to left. The AX NLG platform currently supports only left-to-right scripts, so if you want to use right-to-left text, you have to copy and paste it.
Arabic nouns inflect for number and case. Add a noun to the lexicon if it is not regular. Lexicon entries for nouns may also be needed to inflect determiners, adjectives, numerals, and pronouns correctly. If a required lexicon entry is missing, these words are omitted.
The basic lexicon entry for كلب [kalb] (dog) contains:
- gender: m
- inflection table for case and number:
In Arabic, the default position for an adjective is "after noun". In the inflection table, case, gender, and number can be encoded.
Arabic verbs inflect for person, number, gender, and tense. The most common verbs are encoded in our software. If a verb is inflected incorrectly, add it to the lexicon.
The AX NLG platform supports the following determiners for Arabic: definite and possessive. In general, there is no space between an Arabic determiner and the noun (or adjective) it attaches to. Determiners are used with both nouns and adjectives.
The AX NLG platform supports the following pronouns for Arabic: personal, demonstrative (proximal + distal), relative (which) and 3rd person possessives.
Four types of numerals are possible on the AX NLG platform: cardinal, cardinal as digit, ordinal, and ordinal as digit. Take يوم [yawm] (day) for example:
(the ninth day)
(the 9th day)
For Arabic, both cardinal and ordinal numerals are written out up to 100; above 100, the output is in digit form.
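As a rough illustration of this threshold, the following Python sketch returns which form applies to a given value. The function name and the constant are hypothetical and only mirror the rule stated above; they are not taken from the platform.

```python
# Hypothetical constant mirroring the rule in the text: values up to and
# including 100 are written out, larger values are rendered as digits.
WRITTEN_OUT_LIMIT = 100


def numeral_output_form(value: int) -> str:
    """Return 'written out' or 'digits' for a cardinal or ordinal value."""
    return "written out" if value <= WRITTEN_OUT_LIMIT else "digits"


print(numeral_output_form(9))
print(numeral_output_form(100))
print(numeral_output_form(101))
```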
The cardinals one and two and all ordinals follow the noun, like adjectives. For cardinals above two, the noun follows the numeral:
noun + numeral (right-to-left): نقطة واحدة [wāḥida nuqṭa] (one point) vs. numeral + noun (right-to-left): خمس نقاط [niqāṭ ḵams] (five points)
Additionally, if the cardinal "2" is used in a noun phrase, the dual form of the noun is used and the numeral itself (i.e. two) is not written out. Take نقطة [nuqṭa] (point) for example:
with numeral: نقطة واحدة [wāḥida nuqṭa] (one point) vs. without numeral: نقطتان [nuqṭatāni] (two points)
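The placement rules for cardinals can be sketched as follows. This is a minimal illustration, not the platform's implementation: the `Noun` class and `cardinal_phrase` function are hypothetical, the caller supplies the numeral word, and the sketch deliberately covers only counts from 1 to 10, where the counted noun appears in the plural for 3 and above.

```python
from dataclasses import dataclass


@dataclass
class Noun:
    """Hypothetical container for the three number forms of a noun."""
    singular: str
    dual: str
    plural: str


def cardinal_phrase(noun: Noun, count: int, numeral_word: str = "") -> str:
    """Combine a noun with a cardinal (1 to 10) in logical (reading) order."""
    if not 1 <= count <= 10:
        raise ValueError("this sketch only covers counts from 1 to 10")
    if count == 1:
        # "One" follows the noun, like an adjective.
        return f"{noun.singular} {numeral_word}"
    if count == 2:
        # The dual form already expresses "two"; no numeral is written.
        return noun.dual
    # Above two, the numeral comes first and the noun follows.
    return f"{numeral_word} {noun.plural}"


point = Noun(singular="نقطة", dual="نقطتان", plural="نقاط")
print(cardinal_phrase(point, 1, "واحدة"))
print(cardinal_phrase(point, 2))
print(cardinal_phrase(point, 5, "خمس"))
```

The three calls reproduce the examples from the text: one point, two points (dual only), and five points.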
Determiners can be switched according to lexical information. If you set a determiner switch for a specific noun in the lexicon, the determiner you marked for switching is automatically replaced by the other determiner whenever you use it with that noun in a container.
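Conceptually, a determiner switch behaves like a per-noun lookup. The Python sketch below is purely illustrative: the lexicon layout, the `determiner_switch` key, the noun key, and the replacement value are all placeholders and do not reflect the platform's actual lexicon format.

```python
# Hypothetical lexicon: the noun key, the "determiner_switch" field, and the
# replacement value are placeholders chosen for illustration only.
lexicon = {
    "example_noun": {"determiner_switch": {"definite": "other_determiner"}},
}


def resolve_determiner(noun: str, requested: str, lexicon: dict) -> str:
    """Return the switched determiner if the noun defines one, else the requested one."""
    switches = lexicon.get(noun, {}).get("determiner_switch", {})
    return switches.get(requested, requested)


print(resolve_determiner("example_noun", "definite", lexicon))
print(resolve_determiner("unlisted_noun", "definite", lexicon))
```

Nouns without a switch entry keep the determiner that was requested in the container.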
On the AX NLG platform, specific Arabic prepositions ("ب", "ك", "ل") are automatically joined with determiners.
preposition + determiner + noun (right-to-left), not joined: حالة + definite det + prep (مع) -> مع الحالَةُ [al-ḥālatu maʕa] (with the case)
vs.
preposition + determiner + noun (right-to-left), joined: حالة + definite det + prep (ب) -> بالحالة [bialhala] (in the case)
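The joining behavior can be approximated with a short Python sketch. The function and constant names are hypothetical, and the output is undiacritized. The special case for ل follows standard Arabic orthography, where the alif of the definite article is dropped after this preposition; whether the platform handles it the same way is an assumption here.

```python
# The joining prepositions are taken from the text; everything else in this
# sketch (names, structure) is hypothetical.
JOINING_PREPOSITIONS = {"ب", "ك", "ل"}
DEFINITE_ARTICLE = "ال"


def attach_preposition(preposition: str, noun: str) -> str:
    """Combine a preposition with a definite noun, in logical (reading) order."""
    if preposition == "ل":
        # Standard orthography drops the alif of the article after ل.
        # Nouns that themselves begin with ل are not handled in this sketch.
        return "لل" + noun
    if preposition in JOINING_PREPOSITIONS:
        # Joined: no space between preposition, article, and noun.
        return preposition + DEFINITE_ARTICLE + noun
    # Not joined: the preposition stays a separate word.
    return f"{preposition} {DEFINITE_ARTICLE}{noun}"


print(attach_preposition("مع", "حالة"))
print(attach_preposition("ب", "حالة"))
```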