Creating a Data Model

To generate variant texts, the software first needs a dataset that contains structured data for each text to be generated. The software takes the information about the content of each text from this data, so you can later generate one text per data record.

This document will help you design a model for the data on your topic.

Step 1: Create a Concept Based on a Product

When designing a data model, imagine that the content you are writing about is an item with properties that describe it, regardless of whether you are actually writing about products, categories, events, football matches, and so on.

Step 2: Define the Characteristics

Consider what characteristics your "products" have and what your texts should say.

If you are writing about events, for example, these could be the date, time, location, type of event, and so on, each of which is to be described individually in the text for each event. These properties can vary per "product" and thus per text, which is why they have to be recorded in a structured form in the data so that the software can access them.

Step 3: Make a Table

Write each property in its own Excel column, e.g.:

Column 1: Type_of_event, Column 2: Date, Column 3: Location, and so forth

WARNING

  • Do not use spaces in data field names (connect words with _)
  • Include a column "uid" that contains a unique ID for each record
  • Include a data field "name" and fill it in for each record
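
The three rules above can be checked automatically before the data is uploaded. The following is a minimal sketch in Python, not part of the software itself: the function name check_records and the assumption that field names consist only of letters, digits, and underscores are illustrative choices.

```python
import re

# Assumption for this sketch: a valid field name contains only
# letters, digits, and underscores (so no spaces).
FIELD_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def check_records(records):
    """Return a list of human-readable problems found in the records."""
    problems = []
    seen_uids = set()
    for index, record in enumerate(records):
        # Rule 1: no spaces (or other special characters) in field names.
        for field in record:
            if not FIELD_NAME.match(field):
                problems.append(f"record {index}: invalid field name {field!r}")
        # Rule 2: every record needs a unique "uid".
        uid = record.get("uid")
        if uid is None or uid == "":
            problems.append(f"record {index}: missing uid")
        elif uid in seen_uids:
            problems.append(f"record {index}: duplicate uid {uid!r}")
        else:
            seen_uids.add(uid)
        # Rule 3: every record needs a filled-in "name".
        if not record.get("name"):
            problems.append(f"record {index}: missing name")
    return problems
```

An empty result means that the records follow all three rules; otherwise each entry names the record and the rule it violates.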

Step 4: Set Data

Fill in the properties of your "products" now.

Each row corresponds to a product, so you maintain all the properties of a product in one row.
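
To illustrate how one table row becomes one record, here is a small Python sketch that reads a table exported as CSV and turns each row into a dictionary keyed by the column names. The table content is hypothetical example data, and the CSV export is an assumption; the software may accept the spreadsheet in other ways.

```python
import csv
import io
import json

# Hypothetical table content, as it might look after exporting
# the spreadsheet to CSV. The first line holds the column names.
TABLE = """uid,name,Type_of_event,Date,Location
1,Summer Concert,Concert,2024-07-12,Berlin
2,Book Fair,Fair,2024-10-16,Frankfurt
"""


def rows_to_records(csv_text):
    """Turn each table row into one record keyed by the column names."""
    return list(csv.DictReader(io.StringIO(csv_text)))


records = rows_to_records(TABLE)
# Show the first record as a JSON document.
print(json.dumps(records[0], indent=2))
```

Note that csv.DictReader returns every value as a string, so numeric fields such as uid would need to be converted separately if numbers are required.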

It is important to ensure uniformity! Always use the same term for the same thing, and do not write whole sentences in one field: the more granular the data, the better.
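
Inconsistent spellings are easy to overlook in a large table. The following Python sketch lists values of one field that differ only in capitalization or surrounding whitespace. The function name and the field "Color" in the usage example are hypothetical, and this simple check will not catch real synonyms.

```python
from collections import defaultdict


def find_inconsistent_values(records, field):
    """Group values of one field that differ only in case or surrounding whitespace."""
    variants = defaultdict(set)
    for record in records:
        value = record.get(field)
        if isinstance(value, str):
            # Normalize only for comparison; the original spelling is kept.
            variants[value.strip().casefold()].add(value)
    # Report only the groups that contain more than one spelling.
    return {key: sorted(values) for key, values in variants.items() if len(values) > 1}
```

For example, if a hypothetical column "Color" contains both "dark green" and "Dark Green", the function returns both spellings under the key "dark green", so you can decide on one of them.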

Example: The properties of a sofa are divided into individual columns, so that many sofas can be entered with their properties.

[Image: data model table]

After formulating and configuring the sentences, you can then generate the following text, for example:

[Image: data model text]

Dark green passages: contents that come directly from the data
Light green passages: contents that can be derived from the data

Example File to Fill in

You can use the following file as a template for filling in your data:

Click to download

Real World Example

A sophisticated data model can also be used to implement translation processes. Here is an excerpt of such a document:

{
	"uid":1,
	"Gender": "Kvinne",
	"Gender_en": "Woman",
	"User_Skills": "Avansert, Expert",
	"User_Skills_en": "Advanced, Expert",
	"Winter_Ski_Ski_Boots_Type": "Piste",
	"Winter_Ski_Ski_Boots_Type_en": "Piste",
	"Winter_Binding": "Inkludert",
	"Winter_Binding_en": "Included"
}

The data fields ending in _en contain the English values and are used to build the text logic and the general text in English. The data fields without this ending contain the translated content that is used in the text. In Writer, the triggers of the statements and branches are based on the English data fields. The content of these statements and branches contains containers that output the content of the other (non-English) data fields.

Text generation then works as follows: you send the document to a collection with the desired language, and the text is generated automatically in that language.

The advantage of this data model is that it can also be used for any other language: you simply replace the content of the non-English data fields and send the new document to the desired collection.
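
The replacement step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name replace_content_fields is hypothetical, the German value in the usage note is invented example data, and sending the document to a collection is not shown.

```python
def replace_content_fields(document, new_content, suffix="_en"):
    """Return a copy of the document with new values for the non-English fields.

    The English fields (ending in the suffix) and the uid are left untouched,
    because the text logic depends on them.
    """
    unknown = set(new_content) - set(document)
    if unknown:
        raise KeyError(f"fields not in document: {sorted(unknown)}")
    protected = {field for field in document if field.endswith(suffix) or field == "uid"}
    blocked = set(new_content) & protected
    if blocked:
        raise ValueError(f"fields must not be replaced: {sorted(blocked)}")
    # Build a new dictionary so that the original document stays unchanged.
    return {**document, **new_content}
```

Calling the function on the excerpt above with {"Gender": "Frau"} would yield a document whose "Gender" field holds the German value, while "Gender_en" still contains "Woman" and continues to drive the triggers.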

TIP

If you do not have the resources to manage such translation processes on your side, you can also use the NLG Cloud. Use the Lexicon as a dictionary for unknown or ambiguous entries, and use the Lookup nodes in Transform to translate the content of your data fields into the right language.