Document Aggregate
The Document Aggregate component allows you to query for documents using an aggregation pipeline. The aggregation is executed on the documents collection.
An aggregation pipeline consists of one or more stages that process documents:
- Each stage performs an operation on the collection documents. For example, a stage can filter documents, group documents, and calculate values.
- The documents that are output from a stage are passed to the next stage.
- An aggregation pipeline can return results for groups of documents. For example, return the total, average, maximum, and minimum values.
- Dynamic values in the pipeline are allowed using dot notation (see the example below).
Handlebars
This component lets you use Handlebars templates. More information about Handlebars can be found in this section.
Stages
- $addFields: Adds new fields to documents. Similar to $project, $addFields reshapes each document in the stream; specifically, by adding new fields to output documents that contain both the existing fields from the input documents and the newly added fields. $set is an alias for $addFields.
- $bucket: Categorizes incoming documents into groups, called buckets, based on a specified expression and bucket boundaries.
- $bucketAuto: Categorizes incoming documents into a specific number of groups, called buckets, based on a specified expression. Bucket boundaries are automatically determined in an attempt to evenly distribute the documents into the specified number of buckets.
- $collStats: Returns statistics regarding a collection or view.
- $count: Returns a count of the number of documents at this stage of the aggregation pipeline. Distinct from the $count aggregation accumulator.
- $facet: Processes multiple aggregation pipelines within a single stage on the same set of input documents. Enables the creation of multi-faceted aggregations capable of characterizing data across multiple dimensions, or facets, in a single stage.
- $geoNear: Returns an ordered stream of documents based on the proximity to a geospatial point. Incorporates the functionality of $match, $sort, and $limit for geospatial data. The output documents include an additional distance field and can include a location identifier field.
- $graphLookup: Performs a recursive search on a collection. To each output document, adds a new array field that contains the traversal results of the recursive search for that document.
- $group: Groups input documents by a specified identifier expression and applies the accumulator expression(s), if specified, to each group. Consumes all input documents and outputs one document per each distinct group. The output documents only contain the identifier field and, if specified, accumulated fields.
- $indexStats: Returns statistics regarding the use of each index for the collection.
- $limit: Passes the first n documents unmodified to the pipeline, where n is the specified limit. For each input document, outputs either one document (for the first n documents) or zero documents (after the first n documents).
- $listSessions: Lists all sessions that have been active long enough to propagate to the system.sessions collection.
- $lookup: Performs a left outer join to another collection in the same database to filter in documents from the "joined" collection for processing.
- $match: Filters the document stream to allow only matching documents to pass unmodified into the next pipeline stage. $match uses standard MongoDB queries. For each input document, outputs either one document (a match) or zero documents (no match).
- $merge: Writes the resulting documents of the aggregation pipeline to a collection. The stage can incorporate the results into an output collection: insert new documents, merge documents, replace documents, keep existing documents, fail the operation, or process documents with a custom update pipeline. The $merge stage must be the last stage in the pipeline.
- $out: Writes the resulting documents of the aggregation pipeline to a collection. The $out stage must be the last stage in the pipeline.
- $planCacheStats: Returns plan cache information for a collection.
- $project: Reshapes each document in the stream, such as by adding new fields or removing existing fields. For each input document, outputs one document. See also $unset for removing existing fields.
- $redact: Reshapes each document in the stream by restricting the content for each document based on information stored in the documents themselves. Incorporates the functionality of $project and $match. Can be used to implement field-level redaction. For each input document, outputs either one or zero documents.
- $replaceRoot: Replaces a document with the specified embedded document. The operation replaces all existing fields in the input document, including the _id field. Specify a document embedded in the input document to promote the embedded document to the top level.
- $replaceWith: Replaces a document with the specified embedded document. The operation replaces all existing fields in the input document, including the _id field. $replaceWith is an alias for the $replaceRoot stage.
- $sample: Randomly selects the specified number of documents from its input.
- $search: Performs a full-text search of the field or fields in an Atlas collection.
- $set: Adds new fields to documents. Similar to $project, $set reshapes each document in the stream; specifically, by adding new fields to output documents that contain both the existing fields from the input documents and the newly added fields. $set is an alias for the $addFields stage.
- $setWindowFields: Groups documents into windows and applies one or more operators to the documents in each window.
- $skip: Skips the first n documents, where n is the specified skip number, and passes the remaining documents unmodified to the pipeline. For each input document, outputs either zero documents (for the first n documents) or one document (after the first n documents).
- $sort: Reorders the document stream by a specified sort key. Only the order changes; the documents remain unmodified. For each input document, outputs one document.
- $sortByCount: Groups incoming documents based on the value of a specified expression, then computes the count of documents in each distinct group.
- $unionWith: Performs a union of two collections; i.e. combines pipeline results from two collections into a single result set.
- $unset: Removes/excludes fields from documents. $unset is an alias for the $project stage that removes fields.
- $unwind: Deconstructs an array field from the input documents to output a document for each element. Each output document replaces the array with an element value. For each input document, outputs n documents, where n is the number of array elements; n can be zero for an empty array.

For all aggregation pipeline operators, check the MongoDB documentation.
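To illustrate how stages chain together, the hypothetical pipeline below filters documents, unwinds an array field, and returns the five most frequent values. The field names status and tags are placeholders, not fields your collection necessarily contains:

```json
[
  { "$match": { "status": "active" } },
  { "$unwind": "$tags" },
  { "$sortByCount": "$tags" },
  { "$limit": 5 }
]
```

Each stage receives the documents produced by the previous one: $match drops non-matching documents, $unwind emits one document per array element, $sortByCount groups and counts them, and $limit caps the result.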
Configuration
A MongoDB connection string.
Example: mongodb://<username>:<password>@localhost:27017/<databaseName>
Here <databaseName> is the database to store content.
Whether or not to use TLS, in case your MongoDB server requires it.
Allow Invalid Certificates: Checking this disables certificate validation. Warning: specifying this option in a production environment makes your application insecure and potentially vulnerable to expired certificates and to foreign processes posing as valid client instances.
Certificate Authority File: One or more certificate authorities to trust when making a TLS connection. In order to access the local filesystem, the XILL4_WORKDIRS environment variable must be set to the path of the directory to be accessed.
Example: .\ca.pem
Aggregation pipeline. When left empty, the aggregation is translated to an empty array, which returns all documents from the collection.
Example: [{"$match" : { "name":"John" }}]
By enabling allowDiskUse, MongoDB can process the sort operation even if it requires more than 100 megabytes of system memory. If this option is disabled and the operation requires more than 100 megabytes of system memory, MongoDB returns an error: `Executor error during find command :: caused by :: Sort exceeded memory limit`
Enable case insensitive collation settings: By default, MongoDB sorts uppercase characters before lowercase characters. Enabling case insensitive collation settings disables this behavior.
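For reference, case-insensitive ordering in MongoDB is expressed through a collation document; a sketch of what this option corresponds to is shown below (the locale value en is an assumption for English-language data; strength 2 compares base characters and diacritics but ignores case):

```json
{ "locale": "en", "strength": 2 }
```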
Rate Limit
The rate limit settings are used to throttle the number of documents that are sent into the flow. The minimum interval is set at 10 milliseconds. The minimum batch size is 1 outgoing message per interval.
Batch size: Allows you to configure the batch size.
Interval: The interval in milliseconds in which the batches are sent.
Inputs | Outputs
A valid JSON object containing keys that will be used by Handlebars in the aggregation pipeline in the configuration.
The example below uses $match and $group to calculate the number of files and the total size for the corresponding binary documents.
Incoming message:
{
  "kind": "BINARY"
}
Dynamic values can be used in the aggregation, referenced with {{key}}
as in the example below with kind.
Configuration aggregation:
[
  {
    "$match": {
      "kind": "{{kind}}"
    }
  },
  {
    "$group": {
      "_id": "$source.extension",
      "count": { "$sum": 1 },
      "totalSize": { "$sum": "$source.byteSize" }
    }
  }
]
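This pipeline outputs one document per distinct source.extension value, containing only the identifier and the two accumulated fields. A result could look like the following (the extensions and numbers are purely illustrative):

```json
[
  { "_id": ".pdf", "count": 12, "totalSize": 3481600 },
  { "_id": ".docx", "count": 4, "totalSize": 102400 }
]
```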