Quick Start Guide
This guide helps you get started with Xill4. It describes the basic principles of migrations and helps you set up your first project. For a step-by-step guide on how to use Xill4, please refer to our HelloWorld tutorial.
ETL
ETL is short for Extract, Transform and Load. It is the process of extracting data from a source repository, transforming it into a format that is suitable for the target repository, and loading it into that target repository. Running a migration usually requires several of these ETL processes; combined, we call them a migration street. Xill4 allows you to set up a migration street using low code.
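The three ETL stages can be illustrated with a minimal Python sketch. This is not Xill4 code: in Xill4 these stages are built as low-code flows. The in-memory repositories, the field names and the function names below are all hypothetical and only serve to show how the stages hand data to each other.

```python
from typing import Iterable

# Hypothetical in-memory stand-in for a source repository.
SOURCE_REPOSITORY = [
    {"id": "1", "title": " Annual report ", "kind": "document"},
    {"id": "2", "title": "Invoices", "kind": "folder"},
]


def extract(source: Iterable[dict]) -> list:
    """Extract: read every record from the source repository."""
    return [dict(record) for record in source]


def transform(records: list) -> list:
    """Transform: reshape each record into the format the target expects."""
    return [
        {"name": record["title"].strip(), "type": record["kind"].upper()}
        for record in records
    ]


def load(records: list, target: list) -> int:
    """Load: write the transformed records to the target, return the count."""
    target.extend(records)
    return len(records)


# Hypothetical in-memory stand-in for a target repository.
target_repository = []
loaded = load(transform(extract(SOURCE_REPOSITORY)), target_repository)
```

Each function only depends on the output of the previous one, which is why the stages can be built and tested as separate processes.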
Your first migration street
To set up a migration street, you first need to create a project. A project is a container for all the ETL processes you need to set up. Each ETL process is represented by one or more flows.
To create a project, use the Project -> New Project menu item.
After you have created a project, you can start setting up the ETL processes.
First, you need to set up an extraction process. This is the process that extracts data from a source repository. To do this, create a new flow or use one of the accelerators. To create a new flow, use the Flow -> New Flow menu item.
To open the flow, click it or use the flow menu button (...) -> Open in new tab.
As each flow needs a starting point, let's start by adding a Trigger component. This is the first component in a flow and the one that is triggered when the flow is started. Now add the API component that matches the source repository you want to extract data from, and connect the Trigger component to it.
Debugging
To debug a flow, you can use the Print component to print the data that is passed through the flow, or the Logger component to print the data to the console. You can also use the Counter component to count the number of times it has received input.
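The idea behind these debugging components is a pass-through step that observes data without changing it. The sketch below shows that idea in plain Python. The class name `CountingTap` and its parameters are hypothetical and do not reflect how the Print, Logger or Counter components are implemented or configured.

```python
class CountingTap:
    """Pass-through step that counts, and optionally prints, what it receives."""

    def __init__(self, label: str, echo: bool = False) -> None:
        self.label = label
        self.echo = echo
        self.count = 0

    def __call__(self, item: dict) -> dict:
        self.count += 1
        if self.echo:
            print(f"[{self.label}] #{self.count}: {item}")
        return item  # hand the item on unchanged


tap = CountingTap("after-api", echo=True)
extracted = [{"id": "1"}, {"id": "2"}, {"id": "3"}]
passed_on = [tap(item) for item in extracted]
```

Because the tap returns its input unchanged, it can be placed between any two steps and removed again without affecting the result.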
Let's add a Print component to the flow and connect it to the API component. Now you need to configure the API component to make it work with the source repository you want to extract data from. For this, refer to the documentation of that specific API component.
After you have configured the API component, you can start the flow. You should be able to see the data that is extracted from the source repository in the Print component.
Content Store
The next thing we need to do is prepare the content to be stored in the Content Store. Usually, multiple types of content go through the flow, and each needs to be routed differently. For this we use the Conditional Split component. Let's add a Conditional Split component and connect it to the API component. Now we need to configure it to route the different content types that come out of our API component.
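Routing by content type can be pictured as a function that names the output each item should go to. The following Python sketch illustrates only the concept. The `kind` field, the output names and the function name are assumptions made for the example, not the Conditional Split component's actual configuration.

```python
def conditional_split(item: dict) -> str:
    """Return the name of the output an item should be routed to."""
    if item.get("kind") == "folder":
        return "folders"
    if item.get("kind") == "document":
        return "documents"
    return "other"  # catch-all so unexpected content is not silently dropped


items = [
    {"id": "1", "kind": "folder"},
    {"id": "2", "kind": "document"},
    {"id": "3", "kind": "shortcut"},
]

# Collect the items per output, in the order they arrived.
routed = {"folders": [], "documents": [], "other": []}
for item in items:
    routed[conditional_split(item)].append(item)
```

A catch-all output is a deliberate choice here: it makes content types that were not anticipated visible instead of losing them.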
To further process the content, we need to format it. This is done with the Template Engine component. Let's add a Template Engine component and connect it to the Conditional Split component. For more complex transformations, we can use a combination of components.
For setting up the Template Engine component, refer to the Template Engine section.
Finally, we add the Content Store component and connect its input to the output of the Template Engine component. The Content Store component is the last in the flow. It is responsible for storing the content that is extracted from the source repository. Before storing the content, it validates its format and warns you if it is incorrect.
Transformation
The next step in ETL is the transformation process. This is the process that transforms the content extracted from the source repository into a format that is suitable for loading into the target repository. To do this, we create a new flow and call it Transformation.
Again, we start by adding a Trigger component. The next thing we need to do is retrieve the content from the Content Store. We do this with the Document Retrieve component.
Let's add a Document Retrieve component and connect it to the Trigger component. After we have configured the Document Retrieve component, we need to transform the content that is retrieved from the Content Store and make it suitable for the target repository. In the transformation process, we transform the fields and values from the source object of the document to the target object of the document. The source object stays untouched. For more information about how documents are stored in the Content Store, refer to the Content Store section.
Besides the common fields, we need to transform the fields and values that are specific to the source repository. Those are stored in source.properties. How we need to transform those fields and values depends on the target repository. For our accelerators, these fields and values are documented; please refer to that documentation for more information.
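As a rough illustration of this step, the Python sketch below derives a target object from the source object, including the repository-specific values in source.properties, while leaving the source object untouched. Only the names source, target and source.properties come from this guide. The individual fields (`name`, `dept`, `confidential`), the lookup table and the function name are hypothetical; the real document layout is described in the Content Store section.

```python
import copy

# Hypothetical document as it might come out of the Content Store.
document = {
    "source": {
        "name": "annual-report.pdf",
        "properties": {"dept": "fin", "confidential": "yes"},
    },
}

# Hypothetical mapping from source-specific codes to target values.
DEPARTMENT_NAMES = {"fin": "Finance", "hr": "Human Resources"}


def transform_document(doc: dict) -> dict:
    """Return a copy of the document with a target object added."""
    result = copy.deepcopy(doc)  # never modify the caller's document
    source = result["source"]
    props = source.get("properties", {})
    result["target"] = {
        "name": source["name"],  # common field, copied as-is
        "properties": {
            "department": DEPARTMENT_NAMES.get(props.get("dept"), "Unknown"),
            "confidential": props.get("confidential") == "yes",
        },
    }
    return result


transformed = transform_document(document)
```

Keeping the source object unchanged means a transformation can be rerun with different rules without extracting the content again.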
Finally, we store the transformed documents using the Content Store component again. Alternatively, it is also possible to send updates to the Content Store, which allows for partial updates and for updating multiple documents at the same time. This is done with the Document Update component.
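The difference between storing a whole document and sending a partial update can be sketched in Python as follows. The in-memory `store`, the selection function and the helper names are hypothetical and say nothing about how the Document Update component is configured. The sketch only shows that a partial update touches the named fields and can be applied to several documents at once.

```python
from typing import Callable


def merge_into(document: dict, changes: dict) -> None:
    """Apply a partial update: only the fields named in `changes` are touched."""
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(document.get(key), dict):
            merge_into(document[key], value)  # descend into nested objects
        else:
            document[key] = value


def update_documents(
    store: dict, selector: Callable[[dict], bool], changes: dict
) -> int:
    """Apply `changes` to every selected document, return how many were updated."""
    updated = 0
    for document in store.values():
        if selector(document):
            merge_into(document, changes)
            updated += 1
    return updated


# Hypothetical in-memory stand-in for the Content Store.
store = {
    "a": {"source": {"name": "a.txt"}, "migration": {"migrate": False}},
    "b": {"source": {"name": "b.pdf"}, "migration": {"migrate": False}},
}

updated = update_documents(
    store,
    lambda doc: doc["source"]["name"].endswith(".txt"),
    {"migration": {"migrate": True}},
)
```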
Load
The last step in ETL is the load process. This is the process that loads the transformed content into the target repository. To do this, create a new flow or use one of the accelerators.
During the load, we read the content from the Content Store and load it into the target repository using an API component for that specific repository. We only load content that has been transformed and is scheduled for migration (migration.migrate is set to true). After we have loaded the content, we need to update the document in the Content Store to indicate that the content has been loaded. We do this by setting migration.id to the id that the target repository has assigned to the document.
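The load logic described above can be sketched in Python. The fields migration.migrate and migration.id are taken from this guide. Everything else is an assumption for illustration: the `FakeTargetRepository` class, its `create` method, the id format, and the use of a `target` key to recognise transformed documents.

```python
import itertools


class FakeTargetRepository:
    """Hypothetical target repository that assigns an id to each new item."""

    def __init__(self) -> None:
        self._sequence = itertools.count(1)
        self.items = {}

    def create(self, content: dict) -> str:
        new_id = f"target-{next(self._sequence)}"
        self.items[new_id] = content
        return new_id


def load_documents(documents: list, target: FakeTargetRepository) -> int:
    """Load transformed, scheduled documents and record their target ids."""
    loaded = 0
    for document in documents:
        migration = document.setdefault("migration", {})
        if migration.get("migrate") is not True:
            continue  # not scheduled for migration
        if "target" not in document:
            continue  # assumption: untransformed documents have no target object
        # Record the id assigned by the target repository in migration.id.
        migration["id"] = target.create(document["target"])
        loaded += 1
    return loaded


documents = [
    {"target": {"name": "a.txt"}, "migration": {"migrate": True}},
    {"target": {"name": "b.txt"}, "migration": {"migrate": False}},
    {"source": {"name": "c.txt"}, "migration": {"migrate": True}},
]
repository = FakeTargetRepository()
loaded_count = load_documents(documents, repository)
```

Writing the target id back to the document is what later makes it possible to tell which documents have already been loaded.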
A typical load flow starts with a Trigger component, followed by a Document Retrieve component, a Template Engine component, a Generic Resolver component and an API component. The last component is the Document Update component, which updates the document in the Content Store by setting migration.id.
The Generic Resolver component is used to resolve from the source repository to the target repository. For more information about how to use the Generic Resolver component, refer to the Generic Resolver section.