Content Store
The Content Store is the Xill4 database concept for storing any type of content targeted for migration. It uses a Mongo database and a combination of schemas. In Mongo terminology, each stored data record is called a (BSON) document. To avoid confusion with documents or content from an external repository, the term RECORD is used when referring to an actual file or document.
A very important concept about the Content Store is that each migratable object is stored separately.
Kinds of content
The Content Store distinguishes eight different kinds of content:
- ROOT
- CONTAINER
- RECORD
- BINARY
- RELATION
- ACL
- AUDITLOG
- PRINCIPAL
The CONTAINER, RECORD and BINARY are the three most important ones, because together they form about 80% of the actual content.
ROOT
The ROOT document(s) are the starting point(s) of a migration. They hold the ID of the target repository container objects where the migrated content needs to land.
CONTAINER
The CONTAINER kind is used for hierarchical content types like folders, sites, documentLibraries, archives, sub-sites, lists.
RECORD
The RECORD kind is used for any actual content like pages, paragraphs, list-items, documents. It holds the metadata and references to the actual binaries. When representing a file system, this means that each file will have both a RECORD document and a BINARY document stored.
BINARY
The BINARY kind is used to represent a physical file. It holds the metadata about the file and contains a reference to its location.
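To make the RECORD and BINARY pairing concrete, here is a minimal sketch of how a single file could be described by two separate documents. All field names (`kind`, `name`, `size`, `location`, `binaries`) and the ID format are hypothetical and chosen only for illustration; they are not taken from the actual Content Store schemas.

```python
from pathlib import PurePosixPath


def describe_file(path: str, size_bytes: int) -> tuple[dict, dict]:
    """Build one RECORD and one BINARY document for a single file.

    Field names and ID format are assumptions, not the real schema.
    """
    name = PurePosixPath(path).name

    # BINARY: metadata about the physical file plus a reference to its location.
    binary = {
        "_id": f"binary:{path}",
        "kind": "BINARY",
        "name": name,
        "size": size_bytes,
        "location": path,
    }

    # RECORD: the content item itself, referencing its binary by ID.
    record = {
        "_id": f"record:{path}",
        "kind": "RECORD",
        "name": name,
        "binaries": [binary["_id"]],
    }
    return record, binary


record, binary = describe_file("/share/reports/q1.pdf", 52_431)
```

The two dictionaries would be stored as two separate documents, which matches the idea that each migratable object is stored separately.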
RELATION
The RELATION kind is used for relationship objects other than those representing a parent-child structure or (language) versions.
ACL
The ACL kind is used for representing access control list objects.
AUDITLOG
The AUDITLOG kind is used for representing audit log objects.
PRINCIPAL
The PRINCIPAL kind is used for representing user and group objects.
Schemas
To store these different kinds of content, the Content Store supports seven different schemas. Schemas describe the requirements of the content being stored in the Content Store.
The section below describes the different schemas and their purpose.
Root schema
The root schema is used for the ROOT kind.
Content schema
The content schema can be considered the main schema. It is used for the CONTAINER and RECORD kinds.
Binary schema
The binary schema is used for the BINARY kind.
Relation schema
The relation schema is used for the RELATION kind.
This schema is in a beta state and is highly subject to change.
ACL schema
The ACL schema is used for the ACL kind.
This schema is in an alpha state and is highly subject to change.
AuditLog schema
The AuditLog schema is used for the AUDITLOG kind.
This schema is in an alpha state and is highly subject to change.
Principal schema
The Principal schema is used for the PRINCIPAL kind.
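The relationship between kinds and schemas described above can be summarised as a simple lookup. The sketch below is illustrative only: the `Kind` enum, the lowercase schema labels, and the `schema_for` helper are assumptions made for this example and are not part of the Content Store itself.

```python
from enum import Enum


class Kind(Enum):
    """The eight kinds of content distinguished by the Content Store."""
    ROOT = "ROOT"
    CONTAINER = "CONTAINER"
    RECORD = "RECORD"
    BINARY = "BINARY"
    RELATION = "RELATION"
    ACL = "ACL"
    AUDITLOG = "AUDITLOG"
    PRINCIPAL = "PRINCIPAL"


# Schema labels are hypothetical short names for the schemas listed above.
SCHEMA_FOR_KIND = {
    Kind.ROOT: "root",
    Kind.CONTAINER: "content",  # CONTAINER and RECORD share the content schema
    Kind.RECORD: "content",
    Kind.BINARY: "binary",
    Kind.RELATION: "relation",
    Kind.ACL: "acl",
    Kind.AUDITLOG: "auditlog",
    Kind.PRINCIPAL: "principal",
}


def schema_for(kind: str) -> str:
    """Return the schema label for a kind name; raises ValueError if unknown."""
    return SCHEMA_FOR_KIND[Kind(kind)]
```

For example, `schema_for("CONTAINER")` and `schema_for("RECORD")` both return `"content"`, reflecting that the content schema serves two kinds.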
Storing binaries
Besides the metadata, the Content Store is also able to store binary content. It does this using a technique called GridFS. That is why components that work with binary files will have a Mongo database connection string and a database name.
GridFS consists of two collections, binaries.files and binaries.chunks:
- binaries.files: stores the file's metadata. For details, see The files Collection.
- binaries.chunks: stores the binary chunks. For details, see The chunks Collection.
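The split between the two collections can be illustrated without a database. The sketch below builds, in memory, one document of the shape found in binaries.files and the matching documents for binaries.chunks, using the standard GridFS field names (`length`, `chunkSize`, `uploadDate`, `filename`, `files_id`, `n`, `data`) and the GridFS default chunk size of 255 KiB. The function name `to_gridfs_documents` and the string ID are assumptions for this example; a real Mongo driver performs this work itself and assigns its own IDs.

```python
from datetime import datetime, timezone

DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size in bytes


def to_gridfs_documents(file_id, filename, data: bytes,
                        chunk_size: int = DEFAULT_CHUNK_SIZE):
    """Return (files_doc, chunk_docs) mimicking GridFS storage of one file.

    In-memory illustration only; nothing is written to Mongo, and real
    chunk documents additionally carry their own generated _id.
    """
    # One metadata document per file, as stored in binaries.files.
    files_doc = {
        "_id": file_id,
        "length": len(data),
        "chunkSize": chunk_size,
        "uploadDate": datetime.now(timezone.utc),
        "filename": filename,
    }

    # One document per chunk, as stored in binaries.chunks.
    # "n" is the zero-based sequence number used to reassemble the file.
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


payload = b"x" * 600_000
files_doc, chunk_docs = to_gridfs_documents("demo-id", "q1.pdf", payload)
```

With a 600,000-byte payload and the default chunk size, this yields three chunk documents; joining their `data` fields in order of `n` reproduces the original bytes.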