ANALYZE

The Data Analysis feature helps you understand the data you have uploaded to the AX NLG Cloud. You can find this feature in the Transform tab of the Composer area. It gives a brief overview of the values stored in your data and counts them, separately for each data field.

To plan your writing and create the text concept, you need to know what to expect in your data. Directly after uploading your data, the Data Sources area gives you detailed information broken down by individual data sets. The Analyses, on the other hand, provide you with an evaluation of the data, such as the distribution of the values.

An Analysis displays all data fields it could find in the selected data, broken down by path, count and type. From the results of an Analysis you can also directly create Data nodes or search for documents with specific values.

An example: Imagine a data field “color” that you want to write about. You need to know exactly which colors are in your data and whether they all fit into your text. Of “red”, “green”, “blue” and “multicolor”, all work except “multicolor”, which you would have to convert to “multicolored” to fit in a sentence. This feature delivers exactly that information and even counts how often each color occurs across all your data.
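The color example can be illustrated with a minimal Python sketch that counts how often each value of a field occurs. It only shows the kind of overview an Analysis gives you and is not the platform's implementation. The sample documents and the helper `value_distribution` are hypothetical.

```python
from collections import Counter

# Hypothetical sample documents; in practice these would be your uploaded data.
documents = [
    {"color": "red"},
    {"color": "green"},
    {"color": "red"},
    {"color": "multicolor"},
    {"color": "blue"},
    {"name": "no color here"},
]


def value_distribution(docs, field):
    """Count how often each value of `field` occurs; skip documents without it."""
    return Counter(doc[field] for doc in docs if field in doc)


color_counts = value_distribution(documents, "color")

# Print the values from most to least frequent.
for value, count in color_counts.most_common():
    print(f"{value}: {count}")
```

A list like this makes it easy to spot values such as “multicolor” that need a conversion before they fit into a sentence.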

You can view around 200 different data fields per analysis at once.

Filter for Analyses

Multiple Analyses can be created for each project, each covering a selected subset of the data. You can name each Analysis individually and access it at any time using the list in the left panel of this view. The results are also available in the Transform tool via the Analysis result node.

Run a new Analysis by clicking Start. If your Collections change, you can refresh your Analysis.

For any new Analysis you choose the area of the data it will cover:

  • Use all data in this project
  • Only languages: filter the languages
  • Only collections: choose the collections you want to use
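Conceptually, each option narrows the set of documents that the Analysis evaluates. The following Python sketch shows one way to picture this. The `language` and `collection` keys, the sample values and the function `select_documents` are hypothetical, and whether the filters can be combined in the platform is an assumption of this sketch.

```python
def select_documents(docs, languages=None, collections=None):
    """Return the documents matching the given scope.

    `None` means "no restriction", so calling this without filters
    corresponds to using all data in the project.
    """
    selected = []
    for doc in docs:
        if languages is not None and doc.get("language") not in languages:
            continue
        if collections is not None and doc.get("collection") not in collections:
            continue
        selected.append(doc)
    return selected


# Hypothetical documents with assumed metadata keys.
documents = [
    {"id": 1, "language": "en-US", "collection": "shoes"},
    {"id": 2, "language": "de-DE", "collection": "shoes"},
    {"id": 3, "language": "en-US", "collection": "hats"},
]

all_docs = select_documents(documents)
english_only = select_documents(documents, languages={"en-US"})
shoes_only = select_documents(documents, collections={"shoes"})
```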

Information breakdown

After running an Analysis, you get a table with an overview of all datasets of the analyzed data, sorted into the following categories:

  • Path: field names in hierarchical order
  • Count: the number of filled fields
  • Type(s): whether the content is text or number
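The following Python sketch shows how such a table of path, count and type(s) could be derived from nested documents. It is an illustration under stated assumptions, not the platform's implementation: the dot-separated path notation, the type labels, and the rule that `None` and empty strings count as unfilled are all assumptions, and the function names are hypothetical.

```python
from collections import defaultdict


def iter_paths(value, prefix=""):
    """Yield (path, leaf_value) pairs for a nested dict, joining keys with dots."""
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{prefix}.{key}" if prefix else key
            yield from iter_paths(child, child_path)
    else:
        yield prefix, value


def type_name(value):
    """Map a Python value to a coarse type label (the labels are assumptions)."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "text"
    return type(value).__name__


def summarize(docs):
    """Return {path: {"count": n, "types": {...}}} over all filled fields."""
    summary = defaultdict(lambda: {"count": 0, "types": set()})
    for doc in docs:
        for path, value in iter_paths(doc):
            if value is None or value == "":
                continue  # treat None and empty strings as unfilled (assumption)
            summary[path]["count"] += 1
            summary[path]["types"].add(type_name(value))
    return dict(summary)


# Hypothetical sample documents.
documents = [
    {"name": "Shirt", "color": "red", "size": {"width": 40, "height": 60}},
    {"name": "Scarf", "color": "", "size": {"width": "wide"}},
]

table = summarize(documents)
for path in sorted(table):
    row = table[path]
    print(path, row["count"], sorted(row["types"]))
```

In this sample, `size.width` holds a number in one document and text in the other, which is the kind of mixed content a type overview is meant to reveal.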

For each Path there is additional detailed information about the distribution of data types and values. The Type distribution section shows you whether your data field has mixed content types and which data types occur.

The Value distribution section gives you a more detailed insight into the values of your data fields and shows how diverse they are. You can also look up the documents with certain values and select them directly as Preview Test Objects.

Number distribution

Numbers found in your data field are processed for the number distribution.

Results you will get:

  • minimum value
  • maximum value
  • average
  • median
  • 25th percentile
  • 75th percentile
  • bar diagram for the numeric value distribution
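To show what the summary figures in this list mean, here is a minimal Python sketch that computes them with the standard `statistics` module. The function `number_distribution` and the sample values are hypothetical. The interpolation method used for the percentiles is an assumption, since the method the platform uses is not stated here, and the bar diagram is not covered.

```python
import statistics


def number_distribution(values):
    """Return min, max, average, median and quartiles for a list of numbers."""
    # method="inclusive" treats the values as the complete data set.
    # Which interpolation the platform applies is an assumption.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "max": max(values),
        "average": statistics.mean(values),
        "median": statistics.median(values),
        "p25": q1,
        "p75": q3,
    }


# Hypothetical numeric values of one data field.
prices = [10, 20, 30, 40, 50]
stats = number_distribution(prices)

for name, value in stats.items():
    print(f"{name}: {value}")
```

Different percentile methods can give slightly different results on small data sets, so values computed this way may not match the platform's figures exactly.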

If you need the calculated values in your text, you do not need to recreate them. Use the Analyze Result Node in Transform to access this information and use it for your text logic or text.