File System Crawler
The FS Crawler, or File System Crawler, allows you to recursively scrape a directory on a file system or retrieve the metadata of a single file. Either the local file system or cloud storage can be used. To access the local file system, the XILL4_WORKDIRS environment variable must be set to the path of the directory to be accessed.
Configuration
{
"rootName": {
"path": "C:\\Windows",
"recursive": true
},
"secondRootName": {
"path": "C:\\Drivers",
"recursive": true
}
}
{
"rootName": {
"path": "<cloud-storage>://test1",
"recursive": true
},
"secondRootName": {
"path": "<cloud-storage>://test2",
"recursive": true
}
}
The keys rootName and secondRootName are descriptive and can have any value as long as they are unique. The recursive parameter is passed through to each result output, so that it can be determined whether recursion is enabled for the configured root directory.
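Backslashes in Windows paths have to be escaped when they appear inside a JSON string. The following is a minimal sketch, not part of the component itself, that generates a configuration of the shape described above with Python's standard json module so the escaping is handled automatically. The root names and paths are placeholders taken from the example.

```python
import json

# Root names are free-form labels; they only need to be unique.
# The paths are placeholders and should point to real directories.
roots = {
    "rootName": {"path": r"C:\Windows", "recursive": True},
    "secondRootName": {"path": r"C:\Drivers", "recursive": True},
}

# json.dumps writes each backslash as \\ so the result is valid JSON.
config_text = json.dumps(roots, indent=2)
print(config_text)
```

Using a dictionary also makes the uniqueness requirement explicit, since a Python dictionary cannot hold the same key twice.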
To retrieve the metadata of a single file, no configuration is required. Use the fullPath key in a message to specify the path of the file whose metadata should be retrieved.
Rate Limiting
By default, rate limiting is enabled with 10 jobs per 100ms. The limit and interval can be changed in the configuration. Rate limiting can be disabled by setting these values to 0.
Request limit: The maximum number of requests during the interval.
Interval: The interval in milliseconds in which the requests happen. Should be a multiple of 250.
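How the component enforces the limit internally is not documented here. As an illustration only, a limit of N requests per interval can be read as a fixed-window counter, sketched below. The class name FixedWindowLimiter, its parameters, and the injectable clock are hypothetical and exist only for this example.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` calls per `interval_ms` window (illustrative helper)."""

    def __init__(self, limit, interval_ms, clock=time.monotonic):
        self.limit = limit
        self.interval = interval_ms / 1000.0  # seconds
        self.clock = clock
        self.window_start = clock()
        self.count = 0

    def allow(self):
        # A limit or interval of 0 disables rate limiting entirely.
        if self.limit == 0 or self.interval == 0:
            return True
        now = self.clock()
        if now - self.window_start >= self.interval:
            # The previous window has elapsed: start a new one.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

With the default values described above, FixedWindowLimiter(10, 100) would accept ten calls within a window and reject further calls until the next window starts.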
Inputs | Outputs
fullPath: key that contains the path to scrape. This can be either a path to a file or a path to a directory. If the recursive key is set to false, only the entries of the current directory are output. If the recursive key is set to true, all entries of the subfolders are output as well. If fullPath is a file, the recursive key is ignored and can be omitted.
Example:
{
"fullPath" : "C:/Users",
"recursive": true
}
If no fullPath is defined, the component checks the configuration instead. At least one of the two is required.
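The effect of the recursive key can be illustrated with a small local sketch. This is not the component's implementation; the function list_entries is a hypothetical helper that only mimics the behaviour described above using Python's standard library.

```python
import os


def list_entries(full_path, recursive=False):
    """Return the paths a crawl of full_path would cover (illustrative only)."""
    if os.path.isfile(full_path):
        # For a single file, the recursive flag has no effect.
        return [full_path]
    if not recursive:
        # Only the direct entries of the directory.
        return sorted(entry.path for entry in os.scandir(full_path))
    found = []
    # os.walk visits the directory and every subfolder below it.
    for dirpath, dirnames, filenames in os.walk(full_path):
        for name in dirnames + filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Calling list_entries("C:/Users") would return only the direct entries of that directory, while list_entries("C:/Users", recursive=True) would also include the entries of all subfolders.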