What to expect in this guide
This Basic Guide is the starting point for learning to use our platform: it leads you through the main steps of text generation. You will learn how to create projects, import and analyze data, write statements, and generate text.
On this page you will get an introduction to the topic, and you can immediately practice every step you have learned on the platform. At the end of the seminar you will have your first project with automatically generated text.
Requirements: You don't need any prior knowledge for this Basic Guide.
Text Projects in Sequence
A text generation project can be divided into individual segments, each of which focuses on a certain task. The tasks in these segments build on each other, so they are usually completed one after the other. In practice, however, the tasks sometimes interlock; for instance, you may go back to analyze your data after writing some statements.
Data input Text generation is based on structured data. The software can only talk about things that can be derived from the data. So the first step is to upload the data you want to use.
Data analysis Take a look at the data and check which information you can extract from it.
Text conception You decide which information the text should contain, how this information should be formulated, and how your text should be structured.
Rule set In the rule set you can access your data and create logical evaluations about it. You can also define which words should be used for the outcomes of those evaluations.
Quality assurance After you have finished configuring the project, you perform a quality assurance step in which you check the project and its results for correctness.
Text generation Now you are ready to generate your own automated texts.
Start a Text Generation Project and import Data
On the NLG Platform the organizational unit is the "Project". A Project contains everything you need to generate texts:
- the data that you import.
- the ruleset you develop for your texts.
- your statements, i.e. the parts of the text you write.
Assess and prepare Data for the NLG Platform
Data is a basic prerequisite for text generation on the NLG platform. It provides the essential input for the content of your texts. In order to get meaningful and useful texts, the data you use has to meet a few requirements.
Structured format of the data
Data must be provided in a structured format. This means that data should be provided in separate data fields; you cannot use continuous text as a data source. Tables such as those in Microsoft Excel are fine; for more complex data structures, the JSON format is supported.
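To illustrate, here is a sketch of what such a structured document could look like in JSON. The field names and values are hypothetical, chosen only to show that each fact lives in its own field rather than in continuous text.

```python
import json

# Hypothetical product record in a structured form: one fact per field.
document = {
    "uid": "sku-1001",             # unique identifier for matching texts later
    "brand": "Samsung",
    "name": "Galaxy Note 4",
    "category": "smartphone",
    "display_inches": 5.7,          # numeric field with a consistent unit
    "colors": ["black", "white"],  # list with clearly separated entries
}

# Serialized, this is the kind of JSON document the platform can ingest.
print(json.dumps(document, indent=2))
```

A flat table (as in Excel or CSV) would carry the same information with one column per field; JSON additionally allows nested structures such as the `colors` list.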
Quality criteria for the data
Structure by itself is not sufficient – data must also meet certain quality requirements:
Technical quality criteria require uniform filling (all data sets express the same fact identically: “black” always has to be “black”) and machine-processability. Data can be processed by a machine if, for example, lists are clearly separated, the same units are always used, or words always appear in the same grammatical form.
Editorial quality criteria include, for example, correct spelling of the content, the significance of the data fields (a field that always has the same value is textually less relevant than a field with many different values), and whether enough data records actually have data in the field to make writing a text worthwhile.
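Both criteria can be checked mechanically before you start a project. The following sketch (with made-up records and field names) counts the distinct values per field: it exposes non-uniform filling ("black" vs. "Black" counting as two values) and low-significance fields (only one distinct value).

```python
from collections import Counter

# Hypothetical records; field names and values are illustrative only.
records = [
    {"color": "black", "brand": "Samsung"},
    {"color": "Black", "brand": "Samsung"},   # non-uniform: "Black" vs "black"
    {"color": "black", "brand": "Samsung"},
]

def field_values(field):
    """Count how often each distinct value occurs in a field."""
    return Counter(r.get(field) for r in records)

# Uniform filling: the same fact should always be expressed identically.
print(field_values("color"))       # reveals "black" and "Black" as separate values

# Significance: a field with only one distinct value adds little to a text.
print(len(field_values("brand")))  # 1 distinct value -> textually less relevant
```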
Start your text generation project:
- Create the project and the collection that you will work with in this tutorial.
- Learn how to upload data to your collection.
- Take a look at one document stored in JSON format.
From data to nodes
The data you have uploaded has to be analyzed and prepared for use in your text by defining which fields will serve as variables and what kinds of values they contain. On the NLG Platform, this procedure is called "adding data nodes". The created data nodes are then available for further processing in other areas of the platform.
To help you decide which fields to use in the text, the software first runs an analysis. Then you choose the categories and create nodes that you can use in the next steps of the text generation.
Take a closer look at your imported data and create the data nodes
- Understand how the software analyzes the data.
- Add some nodes for further use in your project.
Write statements and define the variable parts
After making the relevant parts of your data available for your text, the next step is to design a text concept that describes what kind of information will appear in your texts. On the NLG platform, a "statement" is the content unit in this text concept. For example, a headline or an introduction are statements that can be found in most projects.
- One statement can contain one or more sentences.
- You can define different verbal output for different data values within your statements.
- With triggers you decide under which condition a statement will appear.
A statement is composed of static parts that you formulate yourself and of variable parts that are either transferred directly from your data fields or derived from them. To define the variable parts of your statements, you mark these parts as "containers" at the correct positions in the statement.
Within the containers you have several options to specify these variable parts: the output of a container can be a simple data value or content you define in self-created variables, and containers offer many configuration possibilities. In the container settings you can manage the content, the grammar, the triggers, and the formatting of each container.
Whenever you want the content of a container to appear under a certain condition, in a certain style, or with a certain grammatical role, case, number, etc., you can adjust that in its settings. For this purpose, you can set three different types of containers:
- variable output
- grammar provider
- static text
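Conceptually, a statement with variable-output containers behaves like a template: static text surrounds slots that are filled from the current data set. The sketch below is not the platform's own syntax, just an illustration using the example sentence from this guide; the field names are assumptions.

```python
# Conceptual sketch (not the platform's syntax): a statement mixes static
# text with containers whose output is pulled from data fields.
def headline(doc):
    # "presents the" is static text; brand, category and name are
    # variable-output containers filled from the data set.
    return f"{doc['brand']} presents the {doc['category']} {doc['name']}."

# A Preview Test Object: one real data set used while writing.
test_object = {"brand": "Samsung", "category": "smartphone", "name": "Galaxy Note 4"}
print(headline(test_object))
# -> Samsung presents the smartphone Galaxy Note 4.
```

On the platform itself, grammar providers additionally adjust case, number, and agreement of such slots, which a naive template like this cannot do.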
Preview Test Object
When writing on the NLG platform you are not using placeholders, but formulating your statement based on a single real data set: "Samsung presents the smartphone Galaxy Note 4." For this purpose, you can configure suitable data sets as Preview Test Objects:
- In the Data Sources, click on the star in the row of a document.
- In the Composer, you can choose between the designated Preview Test Objects or configure further test objects.
Switching between different Preview Test Objects is a good method to review your statements and check how they vary with different documents.
Write a statement and create containers
- Get to know the sections of the Write Tab.
- Add statements and create containers.
- Adjust the grammatical output.
- Check your possible outputs by changing the Preview Test Object.
Define the logic and conditions
In order to output more than just data values, you can define your own conditions for outputs. On the NLG platform, this is done with nodes and connections in the Transform Tab of the Composer, which provides a graphical environment for formulating conditions.
Nodes - anatomy and functions
In the Transform Area, nodes look like small boxes and allow you to modify, evaluate, and pass on data. Different nodes can be connected via the small orange and grey plugs, the so-called ports. You use nodes whenever you want to add individual logic to your project.
In the preview field at the base of each node, you can see the output for the current test object. Each time you change the test object, the previews update automatically. Similar to the statement preview, the node preview lets you check whether your conditions work correctly. The color (red or green) of the small circles at the lower edge of a node indicates whether a condition is true or false for the chosen test object.
Like data nodes, most node types have ports to connect to other nodes. To use the content of one node in another, simply drag from the output port of the first node to the input port of the target node. Please note that the colors of the ports have to match for them to connect.
| Position / color | Function |
| --- | --- |
| Left-side ports | Input ports for incoming values |
| Right-side ports | Output ports for results |
| Grey ports | Primarily distribute content |
| Orange ports | Primarily have a switch function |

| Node type | Function |
| --- | --- |
| Data nodes | Allow you to process the values in your data fields and to insert the variables at the correct positions in the statements. (You can create them in the ANALYZE tab.) |
| Conditions | Let you create different processing branches that lead to different variable outputs. |
| Variables | The only node types that can be used directly in the text, either to trigger an entire statement or to control the output within a statement. |
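The idea behind a condition node can be sketched in a few lines of code. The function below is an illustration, not platform syntax: it plays the role of a condition node that reads a data field through its input port and exposes true/false on its output port (the red/green circle in the preview). The field name is an assumption.

```python
# Sketch of a condition node (illustrative): it evaluates a data field
# and yields True or False for the current test object.
def condition_has_colors(doc):
    return len(doc.get("colors", [])) > 0

# Two test objects: switching between them shows whether the condition
# holds (green circle) or fails (red circle) for each document.
doc_a = {"colors": ["black", "white"]}
doc_b = {"colors": []}
print(condition_has_colors(doc_a))  # True  -> dependent content is rendered
print(condition_has_colors(doc_b))  # False -> dependent content is suppressed
```

A variable node connected to this condition could then output different wording on the true and false branches.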
Set up conditions and logic
- Get familiar with the layout of the Transform Tab.
- Configure conditions for different field values.
- Create triggers.
- Process conditions for use in your text.
- Use branches for creating different output.
When writing a statement, you can branch off at certain points to create more than one way of expressing your information, or to activate different statements depending on the data. Branches can contain one or more words, phrases, or even entire sentences. The branches appear one below the other in the WRITE section, so you can easily keep track of your construction. For each branch you can set different branching modes to decide which branch to render, so you can easily implement logic while writing your text.
What you can do with branches:
- Set synonyms - words, sentence parts or complete sentences for variation in your texts.
- Define a proper verbal output for different data values.
- Manage variants for your statements.
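The effect of branches can be sketched as follows. This is an illustration only (the synonym list, field names, and the `mode` parameter are assumptions): several branches express the same fact, and a branching mode decides which one is rendered, e.g. a random choice for variation.

```python
import random

# Three branches expressing the same information with synonyms.
synonyms = ["presents", "introduces", "unveils"]

def render(doc, mode="random"):
    """Pick a branch according to the branching mode and render the statement."""
    verb = random.choice(synonyms) if mode == "random" else synonyms[0]
    return f"{doc['brand']} {verb} the {doc['name']}."

doc = {"brand": "Samsung", "name": "Galaxy Note 4"}
print(render(doc))               # varies between the three synonyms
print(render(doc, mode="first"))  # deterministic: always the first branch
```

Generating for many documents with a random mode is what produces varied, non-repetitive texts across a collection.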
Use Branches to put different statements into words
- Practice working with conditions and triggers.
- Create branches that span single words or parts of sentences.
Compose your text
After you have written your statements, you can arrange them: You set the order of their appearance and define under what conditions each statement is triggered or blocked. To create a wider range of different texts, you can also choose to trigger different parts of one statement.
Nodes as variables: Trigger
To switch a statement on or off, you can use the node type "trigger". A trigger is a mechanism for activating and deactivating containers, statements, or stories. You define your trigger nodes in the Transform or in the Write area. These are the only logical nodes that can be used to trigger content in the Write Tab. They consist of a single truth port that can be linked to a condition node. When organizing your statements and creating your stories in the Narrate Tab, you can set triggers for each statement or story as well.
Tip: When you create a new story, the default trigger setting is "off". So if you want the new story to appear, remember to switch the trigger on.
Manage your composition in the Narrate Tab
When you switch to the Narrate Tab, you can put your story together and determine its course: You manage the composition of the text, the style of the statements, the triggers, and the names for your story and statements. In the Narrate Tab you also control the order of the statements in your texts. To increase their variety, you can create several stories and vary the order of the statements. In addition, you can assign triggers to the stories that determine the condition under which a story should be used.
Organize your statements and put them into different stories
- Get familiar with the layout of the Narrate Tab.
- Activate and deactivate statements under certain conditions.
- Name your statements and add the associated triggers in the Narrate Tab.
- Create different stories.
Review and generate the texts
To control the quality of the generated texts, it is recommended to check your output under different conditions and with different documents. In the Review Tab you can look at your data sets and the texts they would produce, and check for spelling mistakes, content mistakes, or logic errors. In this view you can also see how the statements work together.
The actual generation is the final step in your text project. Depending on your use case and the stage of your project, you might want to produce different amounts and subsets of your texts. Therefore, different modes of text generation are available. You can decide whether you would like to generate single texts or the entire text mass.
| Mode | Description |
| --- | --- |
| Single | Produce a text for a single document. The text is produced instantly. |
| All | Produce texts for all documents in a collection. This is handled as a bulk operation. |
| Filtered | Produce texts for a subset of your data, defined by setting filters. You can filter a collection by search terms or by text production status. Filters are only available inside a collection and do not work across collections. This is handled as a bulk operation as well. |
| Auto generation | In the collection settings, you can enable "auto generation", which triggers text production automatically when a new document is added. Automatic regeneration of a text when a document changes is also available. |
How to export the generated texts
If you want to handle your text exports manually, you can download text exports in the web interface. These files are updated automatically and contain a snapshot of your text production, which can be up to an hour old. The file formats are again JSON, CSV and Excel. You can select the format in the settings of every collection. The UID allows you to match the produced text to your database. The exports contain both raw text and HTML.
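The UID-based matching mentioned above can be sketched like this. The export snippet is made up (the real column names of the export files may differ), but it shows the principle: each row carries the UID, so the generated text can be joined back to the corresponding record in your own database.

```python
import csv
import io

# Hypothetical CSV export snippet; column names are illustrative.
export_csv = """uid,text
sku-1001,Samsung presents the smartphone Galaxy Note 4.
sku-1002,Acme launches the tablet PadOne.
"""

# Index the generated texts by UID for matching against your database.
texts_by_uid = {row["uid"]: row["text"]
                for row in csv.DictReader(io.StringIO(export_csv))}

print(texts_by_uid["sku-1001"])
```

The same join works for the JSON export, where each exported object would carry its UID as a field.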
Review your project and generate one or more texts out of your ruleset
- Perform a quality assurance check in the Review Tab.
- Get an overview of the Results Area.
- Generate your texts.
Congrats! You have successfully completed the AX Seminar; you are no longer a newbie in NLG. It is like learning a new language or getting a driver’s license: to improve your skills, the best thing you can do is practice.
With the AX Seminar you have acquired basic knowledge of the principles of text generation and the skills to work with the NLG Cloud. Now you are able to take the first steps in your own project. Do you have structured data? Ideas for statements? Then start your project right now!
Do you have any special requirements or further questions? The AX Semantics support team can assist you quickly via chat on the NLG platform.