The Analyze Section

The data Analysis feature helps you understand the data you have uploaded to the AX NLG Cloud. You can find this feature in the Transform tab of the Composer area. It gives you a brief overview of the distinct values stored in your data and counts them, separately for each data field.

To plan your writing and create the text concept, you need to know what to expect in your data. Directly after uploading your data in the Data Sources area, you get detailed information broken down by individual datasets.

The Analysis, on the other hand, provides you with an evaluation of the data, such as the distribution of its values. It displays all data fields found in the chosen Analysis, broken down by path, count and type. From the results of an Analysis you can also directly create Data nodes or search for documents with specific values.

An example: Imagine a data field “color” that you want to write a text about. You need to know exactly which colors occur in your data and whether they all fit into your text. Of the values “red”, “green”, “blue” and “multi color”, all work except “multi color”, which you would have to convert to “multi colored” to fit in a sentence. The Analysis delivers exactly that information and also counts how often each color value appears across all your data.
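To make the idea of counting values per data field concrete, here is a minimal Python sketch. It is not how the AX NLG Cloud computes its Analysis; the sample documents and the helper name count_values are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical sample documents; the field name "color" follows the example above.
documents = [
    {"color": "red"},
    {"color": "green"},
    {"color": "red"},
    {"color": "multi color"},
    {"color": "blue"},
]


def count_values(docs, field):
    """Count how often each distinct value of `field` occurs in `docs`."""
    return Counter(doc[field] for doc in docs if field in doc)


color_counts = count_values(documents, "color")

# List the distinct values, most frequent first.
for value, count in color_counts.most_common():
    print(f"{value}: {count}")
```

A list like this shows at a glance that “multi color” occurs in the data and needs special handling before it fits into a sentence.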

TIP

You can view around 200 different data fields per analysis at once.

Filter for Analyses

You can create multiple Analyses for each project, each covering a selected subset of the data. In this view you can access the Analyses, which you can name individually, at any time using the list in the left panel. Their results are also available in the Transform tool via the Analysis result node.

Run a new Analysis by clicking Start. If your Collections have changed, you can refresh your Analysis.

For each new Analysis, you choose which part of the data it covers:

  • Use all data in this project
  • Only languages: filter the languages
  • Only collections: choose the collections you want to use

Information breakdown

After running an Analysis, you get a table with an overview of the analyzed data, grouped into the following categories:

  • Path: field names in hierarchical order
  • Count: the number of filled fields
  • Type(s): whether the content is text or number

For each Path there is additional detailed information about the distribution of data types and values. The Type distribution section shows you whether your data field has mixed content types and which data types it contains.
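The breakdown by path, count and type can be pictured with a short Python sketch. This is only an illustration under stated assumptions: the dotted path notation, the rule that None and empty strings count as unfilled, the type labels, and the function names walk, type_name and breakdown are all hypothetical and not taken from the AX NLG Cloud.

```python
from collections import defaultdict


def walk(value, path=""):
    """Yield (path, leaf_value) pairs for a nested dict."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from walk(child, f"{path}.{key}" if path else key)
    else:
        yield path, value


def type_name(value):
    """Return a coarse type label for a leaf value."""
    # bool is a subclass of int in Python, so it has to be checked first.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "text"
    return type(value).__name__


def breakdown(docs):
    """Map each path to its number of filled fields and the types seen."""
    summary = defaultdict(lambda: {"count": 0, "types": set()})
    for doc in docs:
        for path, value in walk(doc):
            if value is None or value == "":
                continue  # assumption: None and "" count as unfilled
            summary[path]["count"] += 1
            summary[path]["types"].add(type_name(value))
    return dict(summary)


# Hypothetical sample documents with a nested field and a mixed-type field.
documents = [
    {"name": "Shirt", "size": {"width": 40, "label": "M"}},
    {"name": "Scarf", "size": {"width": "wide", "label": ""}},
]

result = breakdown(documents)
for path in sorted(result):
    print(path, result[path]["count"], sorted(result[path]["types"]))
```

In this sample, size.width holds a number in one document and a text in the other, which is the kind of mixed content type that a type distribution makes visible.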

The segment Value distribution offers you a more detailed insight into the values of your data fields. You get information about how diverse the values in your data fields are. You can also look up the documents with certain values and select them directly as Preview Test Objects.

Number distribution

Numbers found in your data field are processed for the number distribution.

Results you will get include:

  • minimum value
  • maximum value
  • average
  • median
  • 25th percentile
  • 75th percentile
  • bar diagram for the numeric value distribution
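If you want to reproduce figures like these outside the platform, the following Python sketch computes them with the standard library. It rests on assumptions: the sample values are invented, the percentiles use linear interpolation (method="inclusive"), which may differ from the method the Analysis uses, and the equal-width binning for the bar diagram is only one possible choice.

```python
import statistics

# Hypothetical numeric values of one data field.
values = [3, 7, 8, 12, 15, 20, 22]


def summarize(numbers):
    """Return the basic distribution figures for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(numbers, n=4, method="inclusive")
    return {
        "minimum": min(numbers),
        "maximum": max(numbers),
        "average": statistics.mean(numbers),
        "median": statistics.median(numbers),
        "percentile_25": q1,
        "percentile_75": q3,
    }


def histogram(numbers, bins=4):
    """Count the values per equal-width bin, as input for a bar diagram."""
    lo, hi = min(numbers), max(numbers)
    width = (hi - lo) / bins or 1  # avoid a zero width if all values are equal
    counts = [0] * bins
    for number in numbers:
        # The maximum value falls on the upper edge, so clamp it to the last bin.
        index = min(int((number - lo) / width), bins - 1)
        counts[index] += 1
    return counts


summary = summarize(values)
bar_counts = histogram(values)

for name, figure in summary.items():
    print(f"{name}: {figure}")
for index, count in enumerate(bar_counts):
    print(f"bin {index}: " + "#" * count)
```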

If you need the calculated values in your text, you do not need to recreate them. Use the Analyze Result Node in Transform to access this information and use it for your text logic or text.