Version: 4.51.0

File System Crawler

The FS Crawler (File System Crawler) recursively scrapes a directory on a file system or retrieves the metadata of a single file. It works with either the local file system or cloud storage. To access the local file system, set the XILL4_WORKDIRS environment variable to the path of the directory to be accessed.

Configuration

Display Settings
Example: configuration of root directories to scrape on the file system:

{
  "rootName": {
    "path": "C:\\Windows",
    "recursive": true
  },
  "secondRootName": {
    "path": "C:\\Drivers",
    "recursive": true
  }
}
Example: configuration of root directories to scrape on a cloud storage:

{
  "rootName": {
    "path": "<cloud-storage>://test1",
    "recursive": true
  },
  "secondRootName": {
    "path": "<cloud-storage>://test2",
    "recursive": true
  }
}

The keys rootName and secondRootName are descriptive and can have any value as long as they are unique. The recursive parameter is passed through to each result output, so that it is possible to determine whether recursion is enabled for the configured root directory.
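
Windows paths contain backslashes, and JSON requires each backslash inside a string to be escaped as a double backslash. As a minimal sketch, assuming the configuration is prepared with a Python script (an illustration only, not part of the FS Crawler itself), the standard json module takes care of the escaping. The root names and paths below are the illustrative ones used in the examples above.

```python
import json

# Illustrative root names and paths; replace them with your own.
# Raw strings (r"...") keep the single backslashes of the Windows paths.
roots = {
    "rootName": {"path": r"C:\Windows", "recursive": True},
    "secondRootName": {"path": r"C:\Drivers", "recursive": True},
}

# json.dumps escapes each backslash, producing valid JSON text
# that can be pasted into the configuration.
config_text = json.dumps(roots, indent=2)
print(config_text)
```

Parsing the generated text back with json.loads returns the original paths, which is a quick way to check that the configuration is valid JSON.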


To retrieve the metadata of a single file, no configuration is required. Use the fullPath key in a message to specify the path of the file whose metadata should be retrieved.

Rate Limiting

By default, rate limiting is enabled at 10 jobs per 100 ms. The limit and the interval can both be changed in the configuration. Setting both values to 0 disables rate limiting.

Request limit

The maximum number of requests allowed during the interval.

Interval

The length of the interval in milliseconds, that is, the window in which the requests happen. The value should be a multiple of 250.
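
To make the interaction between the request limit and the interval concrete, here is a minimal fixed-window limiter sketch in Python. It is an illustration of the general technique, not the FS Crawler's actual implementation; the class name FixedWindowLimiter, its parameters, and the choice to treat a 0 in either value as "disabled" are assumptions made for this example.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` jobs per `interval_ms` window (hypothetical helper)."""

    def __init__(self, limit=10, interval_ms=100, clock=time.monotonic):
        self.limit = limit
        self.interval = interval_ms / 1000.0  # seconds
        self.clock = clock
        self.window_start = clock()
        self.count = 0

    def try_acquire(self):
        """Return True if a job may run now, False if the window is full."""
        # Assumption: a 0 in either setting disables rate limiting.
        if self.limit == 0 or self.interval == 0:
            return True
        now = self.clock()
        if now - self.window_start >= self.interval:
            # A new window has started: reset the counter.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


limiter = FixedWindowLimiter(limit=10, interval_ms=100)
allowed = sum(limiter.try_acquire() for _ in range(15))
print(f"{allowed} of 15 jobs were allowed in the first window")
```

A caller that receives False would typically wait until the next window starts before retrying.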

Inputs | Outputs


FS Crawler ports: Input, Output, Error
The input takes objects with a fullPath key that contains the path to scrape. This can be the path to either a file or a directory. If the recursive key is set to false, only the entries of that directory are output. If the recursive key is set to true, the entries of all subfolders are output as well.

Note: When fullPath is a file, the recursive key is ignored and can be omitted.

Example:
{
  "fullPath": "C:/Users",
  "recursive": true
}
If the incoming data does not define fullPath, the component falls back to the root directories in the configuration. At least one of the two is required.
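
The input rules above can be sketched in Python as follows. This only illustrates the described behavior and is not the component's implementation: the helper names resolve_targets and list_entries are hypothetical, and defaulting recursive to false when the key is absent is an assumption.

```python
import os


def resolve_targets(message, config):
    """Return (path, recursive) pairs to crawl (hypothetical helper)."""
    full_path = message.get("fullPath")
    if full_path:
        # The message defines fullPath, so the configuration is not consulted.
        # Assumption: a missing recursive key is treated as False.
        return [(full_path, bool(message.get("recursive", False)))]
    if config:
        # No fullPath: fall back to the configured root directories.
        return [
            (root["path"], bool(root.get("recursive", False)))
            for root in config.values()
        ]
    raise ValueError("either fullPath or a root configuration is required")


def list_entries(path, recursive):
    """Yield entry paths under `path`; a file yields only itself."""
    if os.path.isfile(path):
        # For a single file the recursive flag is ignored.
        yield path
        return
    with os.scandir(path) as it:
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        yield entry.path
        # Descend into subfolders only when recursion is enabled.
        if recursive and entry.is_dir(follow_symlinks=False):
            yield from list_entries(entry.path, True)


if __name__ == "__main__":
    for path, recursive in resolve_targets({"fullPath": ".", "recursive": False}, {}):
        for entry_path in list_entries(path, recursive):
            print(entry_path)
```

With recursive set to false, only the direct entries of the directory are listed; with true, each subfolder's entries follow the subfolder itself.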