Architecture overview
The SDaaS platform is used to create smart data platforms as backing services, enabling your applications to leverage the full potential of the Semantic Web.
Out of the box, the SDaaS platform contains an extensible set of modules that connect to several Knowledge Graphs through optimized driver modules.
A smart data service conforms to the KEES specifications and is realized by a customization of the SDaaS docker image. It contains one or more scripts calling a set of commands implemented by modules. The command behavior can be modified through configuration variables.
node "SDaaS platform " as SDaaSDocker <<docker image>>{
component Module {
collections "commands" as Command
collections "configuration variables" as ConfigurationVariable
}
}
node "smart data service" as SDaaSService <<docker image>>{
component "SDaaS Script" as App
}
database "Knowledge Graphs" as GraphStore
cloud "Linked Data" as Data
SDaaSService ---|> SDaaSDocker : extends
Command ..(0 Data : HTTP
Command ..(0 GraphStore : SPARQL
Command o. ConfigurationVariable
App --> Command : calls
App --> ConfigurationVariable : set/get
New modules can be added to extend the platform and match special needs.
Data Model
The SDaaS data model is designed around a few concepts: Configuration Variables, Functions, Commands, and Modules.
SDaaS configuration variables
An SDaaS configuration variable is a bash environment variable that the platform uses as a configuration option. Configuration variables have a default value; they can be changed statically in the Docker image or at run time through docker run, a Docker orchestrator, or user scripts.
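The default-and-override behaviour described above can be sketched with standard bash parameter expansion. This is only an illustration: EXAMPLE_LOG_LEVEL and show_level are hypothetical names, not documented SDaaS options or functions.

```bash
#!/usr/bin/env bash
# EXAMPLE_LOG_LEVEL is a hypothetical variable used only for illustration.

# Assign the default only when the variable is unset or empty, so a value
# already present in the environment (for example from `docker run -e`) wins.
: "${EXAMPLE_LOG_LEVEL:=info}"

# Hypothetical helper that reads the configuration variable.
show_level() {
    printf 'log level: %s\n' "$EXAMPLE_LOG_LEVEL"
}

show_level

# A user script can still change the value at run time.
EXAMPLE_LOG_LEVEL=debug
show_level
```

Because the default is applied with `:=`, the same script works unchanged whether the value comes from the image, from the container environment, or from the script itself.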
The following taxonomy applies to SDaaS configuration variables:
class "SDaaS Configuration variable" as ConfigVariable
class "SID Variable" as SidVariable
class "Platform Variable" as PlatformVariable <<read only>>
interface EnvironmentVariable
ConfigVariable --|> EnvironmentVariable
SidVariable --|> ConfigVariable
ConfigVariable <|- PlatformVariable
Environment Variable
It is a shell variable.
Platform variable
It is a variable defined by the SDaaS docker image that should not be changed outside the Dockerfile.
SID Variable
It is a special configuration variable that states a graph store property. The general syntax is <sid>_<var name>. For example, the variable STORE_TYPE refers to the name of the driver module that must be used to access the graph store connected by the sid STORE. Some drivers can require or define other sid variables with their default values.
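As a rough sketch of how a <sid>_<var name> variable could be resolved from a sid, the snippet below uses bash indirect expansion. The helper sid_var is hypothetical (it is not an SDaaS function), and the value w3c is an assumed setting chosen for the example.

```bash
#!/usr/bin/env bash
# Assumed setting for the example: the sid STORE uses a driver named "w3c".
STORE_TYPE="w3c"

# sid_var is a hypothetical helper, not part of the platform.
# Usage: sid_var <sid> <var name>
# Prints the value of <sid>_<var name>, or fails if it is not defined.
sid_var() {
    local sid="$1" name="$2"
    local ref="${sid}_${name}"
    # ${!ref+x} expands to "x" only when the variable named by $ref is set.
    if [ -z "${!ref+x}" ]; then
        printf 'undefined sid variable: %s\n' "$ref" >&2
        return 1
    fi
    printf '%s\n' "${!ref}"
}

sid_var STORE TYPE
```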
See all available configuration variables in the installation guide
SDaaS Functions
An SDaaS function is a bash function embedded in the platform, for example sd_log. A bash function accepts a fixed set of positional parameters, writes output on stdout, and returns 0 on success or an error code otherwise.
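The calling convention described above (fixed positional parameters, output on stdout, 0 on success or an error code otherwise) can be illustrated with a small stand-alone function. example_join is a hypothetical name invented for this sketch; it is not part of the platform.

```bash
#!/usr/bin/env bash
# example_join is a hypothetical function used only to show the contract.
# Usage: example_join <separator> <first> <second>
example_join() {
    # Fixed positional parameters: exactly three are expected.
    if [ "$#" -ne 3 ]; then
        printf 'usage: example_join <separator> <first> <second>\n' >&2
        return 2    # non-zero error code on failure
    fi
    printf '%s%s%s\n' "$2" "$1" "$3"    # the result goes to stdout
    return 0                            # 0 signals success
}

# Callers capture stdout and branch on the return code.
if result="$(example_join ':' left right)"; then
    printf 'joined: %s\n' "$result"
else
    printf 'example_join failed\n' >&2
fi
```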
The following taxonomy applies to SDaaS functions:
Interface "Bash function" as BashFunction
Class "Sdaas function" as SdaasFunction
Class "Driver virtual function" as DriverVirtualFunction
Class "Driver function implementation" as DriverFunction
SdaasFunction --|> BashFunction
DriverFunction --|> SdaasFunction
DriverVirtualFunction --|> SdaasFunction
DriverFunction <- DriverVirtualFunction: calls
Bash Function
It is the interface of a generic function defined in the scope of a bash process.
Driver virtual function
It is a function that acts as a proxy for a driver method; its first parameter is always the sid. A driver virtual function has the syntax sd_driver_<method name> (e.g. sd_driver_load) and requires a fixed set of positional parameters.
Driver function implementation
It is a function that implements a driver virtual function for a specific graph store engine driver. A driver function implementation has the syntax sd_<driver name>_<method name> (e.g. sd_w3c_load) and expects a fixed set of positional parameters, which are not checked. Driver function implementations should be called only by a driver virtual function.
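The relationship between a driver virtual function and its implementations can be sketched as a name-based dispatch. The functions below are mocks written for this illustration: they reuse the names sd_driver_load and sd_w3c_load to show the naming scheme, but their bodies, the second parameter (a graph name) and the STORE_TYPE value are assumptions, not the platform's actual code.

```bash
#!/usr/bin/env bash
# Everything below is an illustrative mock, not the platform's source code.
STORE_TYPE="w3c"    # assumed setting: the sid STORE uses the w3c driver

# Mock driver function implementation: sd_<driver name>_<method name>.
# It does not validate its parameters and only reports the call.
sd_w3c_load() {
    local sid="$1" graph="$2"
    printf 'w3c driver: load into %s for sid %s\n' "$graph" "$sid"
}

# Mock driver virtual function: the sid is always the first parameter.
# It looks up <sid>_TYPE and forwards the call to the matching implementation.
sd_driver_load() {
    local sid="$1"
    local type_var="${sid}_TYPE"
    local driver="${!type_var:-}"
    local impl="sd_${driver}_load"
    # declare -F succeeds only if a function with that name is defined.
    if [ -z "$driver" ] || ! declare -F "$impl" >/dev/null; then
        printf 'no load implementation for sid %s\n' "$sid" >&2
        return 1
    fi
    "$impl" "$@"
}

sd_driver_load STORE urn:example:graph
```

With this kind of dispatch, callers only depend on the virtual function, so switching the graph store engine would amount to changing the <sid>_TYPE variable.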
SDaaS commands
A Command is an SDaaS function that conforms to the SDaaS command requirements, for example sd_sparql_update. A command writes output on stdout, logs on stderr, and returns 0 on success or an error code otherwise.
In a script, SDaaS commands should be called through the sd function using the following syntax: sd <module name> <function name> [options] [operands]. The sd function allows flexible output and error management and automatically includes all required modules. Direct calls to command functions should be made only inside module implementations.
For instance, calling sd -A sparql update executes the command function sd_sparql_update, including the sparql module and aborting the script in case of error. This is equivalent to sd_include sparql; sd_sparql_update || sd_abort.
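The equivalence above can be made concrete with a simplified dispatcher. This is a sketch under stated assumptions: sd, sd_include, sd_abort and sd_sparql_update are replaced here by mocks, only the -A flag is handled, and the real functions shipped with the platform may behave differently.

```bash
#!/usr/bin/env bash
# Illustrative mocks only: the real functions ship with the platform.
sd_include()       { printf 'including module %s\n' "$1" >&2; }
sd_abort()         { printf 'aborting script\n' >&2; exit 1; }
sd_sparql_update() { printf 'update executed with %s argument(s)\n' "$#"; }

# Simplified dispatcher: sd [-A] <module name> <function name> [args...]
sd() {
    local abort_on_error=0
    if [ "${1:-}" = "-A" ]; then
        abort_on_error=1
        shift
    fi
    if [ "$#" -lt 2 ]; then
        printf 'usage: sd [-A] <module name> <function name> [options] [operands]\n' >&2
        return 2
    fi
    local module="$1" func="$2"
    shift 2
    sd_include "$module" || return     # make sure the module is available
    local status=0
    "sd_${module}_${func}" "$@" || status=$?
    if [ "$status" -ne 0 ] && [ "$abort_on_error" -eq 1 ]; then
        sd_abort                       # with -A, an error ends the script
    fi
    return "$status"
}

sd -A sparql update
```

Without -A, the sketch simply returns the command's error code, leaving the decision to the calling script.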
The following taxonomy applies to commands:
Class Command
Class "Facts Provision" as DataProvidingCommand
Class "Ingestion Command" as IngestionCommand
Class "Query Command" as QueryCommand
Class "Learning Command" as LearningCommand
Class "Reasoning Command" as ReasoningCommand
Class "Enriching Command" as EnrichingCommand
Class "SDaaS function" as SDaaSFunction
Class "Store Command" as StoreCommand
Class "Compound Command" as CompoundCommand
Command --|>SDaaSFunction
DataProvidingCommand --|> Command
StoreCommand --|> Command
IngestionCommand --|> StoreCommand
QueryCommand --|> StoreCommand
LearningCommand -|> CompoundCommand
LearningCommand --|> DataProvidingCommand
LearningCommand --|> IngestionCommand
CompoundCommand <|- ReasoningCommand
ReasoningCommand --|> IngestionCommand
ReasoningCommand --|> QueryCommand
EnrichingCommand --|> ReasoningCommand
EnrichingCommand --|> LearningCommand
Compound Command
It is a command resulting from the composition of two or more commands, usually in a pipeline.
Facts Provision
It is a command that provides RDF triples as output.
Query Command
It is a command that can extract information from the knowledge graph.
Ingestion Command
It is a command that stores facts into a knowledge graph.
Reasoning Command
It is a command that both queries and ingests data into the same knowledge graph according to some rules.
Learning Command
It is a command that provides and ingests facts into the knowledge graph.
Enriching Command
It is a command that queries the knowledge base, discovers new data, and injects the results back into the knowledge base.
Store Command
It is a command that interacts with a knowledge base. It accepts the -s *SID* and -D "sid=*SID*" options.
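To show how these command types fit together, the sketch below composes a facts provision with an ingestion command in a pipeline, which is the shape of a learning command. Both functions are hypothetical mocks (example_provide_facts and example_ingest are not SDaaS commands); the default sid STORE and the printed summary are assumptions made for the example.

```bash
#!/usr/bin/env bash
# Illustrative mocks only; names, output and the default sid are assumptions.

# Facts provision: writes RDF triples (N-Triples syntax) to stdout.
example_provide_facts() {
    printf '<urn:example:s> <urn:example:p> "o" .\n'
}

# Ingestion (store) command: accepts -s SID or -D "sid=SID" and reads
# triples from stdin. A real command would send them to the graph store;
# this mock only counts them.
example_ingest() {
    local sid="STORE" opt OPTIND=1
    while getopts 's:D:' opt; do
        case "$opt" in
            s) sid="$OPTARG" ;;
            D) case "$OPTARG" in
                   sid=*) sid="${OPTARG#sid=}" ;;
               esac ;;
            *) return 2 ;;
        esac
    done
    local count=0 line
    while IFS= read -r line; do
        if [ -n "$line" ]; then
            count=$((count + 1))
        fi
    done
    printf 'sid=%s triples=%d\n' "$sid" "$count"
}

# A learning command: provision and ingestion composed in a pipeline.
example_provide_facts | example_ingest -s MYSTORE
example_provide_facts | example_ingest -D "sid=MYSTORE"
```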
SDaaS modules
An SDaaS module is a collection of commands and configuration variables that conforms to the module building requirements.
You can explicitly include the content of a module with the sd_include command.
The module taxonomy is depicted in the following image:
class "SDaaS module" as Module
Abstract "Driver" as AbstractDriver
Class "Core Module" as Core
Class "Driver implementation" as DriverImplementation
Class "Command Module" as CommandModule
Class "Store module" as StoreModule
Core --|> Module
CommandModule --|> Module
AbstractDriver --|> Module
DriverImplementation <- AbstractDriver : calls
StoreModule --|> CommandModule
StoreModule --> AbstractDriver : includes
Core <- CommandModule : includes
Command Module
Modules that implement a set of related SDaaS commands. They always include the Core Module and can depend on other modules.
Core Module
A singleton module that exposes core commands and must be loaded before using any SDaaS feature.
Driver
A singleton module that exposes the abstract Driver function interface to handle connections with a generic graph store engine.
Driver implementation
Modules that implement the function interface exposed by the Abstract Driver for a specific graph store engine.
Store Module
Modules that export store commands that connect to a graph store using the functions exposed by the Driver module. A store module always includes the driver module.
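The include relationships in this taxonomy can be sketched as a small dependency-aware loader. This is not how sd_include is implemented: example_include and example_deps are hypothetical, and the dependency table (including the assumption that sparql is a store module) only mirrors the rules stated above.

```bash
#!/usr/bin/env bash
# example_include is a hypothetical stand-in for sd_include.
EXAMPLE_LOADED=""    # space-separated list of modules already included

# Assumed dependency table mirroring the taxonomy above.
example_deps() {
    case "$1" in
        core)   ;;                     # the core module has no dependencies
        driver) echo core ;;
        sparql) echo core driver ;;    # a store module includes the driver
        *)      return 1 ;;
    esac
}

example_include() {
    local module="$1" deps dep
    case " $EXAMPLE_LOADED " in
        *" $module "*) return 0 ;;     # already included: nothing to do
    esac
    deps="$(example_deps "$module")" || {
        printf 'unknown module: %s\n' "$module" >&2
        return 1
    }
    for dep in $deps; do
        example_include "$dep" || return
    done
    EXAMPLE_LOADED="${EXAMPLE_LOADED:+$EXAMPLE_LOADED }$module"
}

example_include sparql
example_include sparql    # the second call is a no-op
printf 'loaded: %s\n' "$EXAMPLE_LOADED"
```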
The big picture
The resulting big picture of the SDaaS platform data model is:
package "User Application" {
class "User Application" as Application
}
package "SDaaS platform" #aliceblue;line:blue;line.dotted;text:blue {
class Command
class Module
class "SDaaS function" as SDaaSFunction
Abstract "Driver" as Driver
class ConfigVariable
Abstract "SID Variable" as SidVariable
interface "bash function" as Function
interface EnvironmentVariable
}
package "Backing services" {
interface "Backing service" as BakingService
interface "Graph Store" as GraphStore
interface "Knowledge Graph" as KnowledgeGraph
interface "Linked Data Platform" as RDFResource
}
package "smart data service" #aliceblue;line:blue;line.dotted;text:blue {
class "SDaaS script" as Script
class "smart data Service" as SDaaS
}
KnowledgeGraph : KEES compliance
GraphStore : SPARQL endpoint
RDFResource : URL
Command --|> SDaaSFunction
SDaaSFunction --|> Function
ConfigVariable --|> EnvironmentVariable
SidVariable --|> ConfigVariable
Driver --|> Module
GraphStore --|> BakingService
RDFResource --|> BakingService
KnowledgeGraph --|> GraphStore
BakingService <|-- SDaaS
Module *-- Command : exports
Module *-- ConfigVariable : declares
Command o.. RDFResource : learns
Driver --> KnowledgeGraph : connects
Module .> Module : includes
Function --o Script
EnvironmentVariable --o Script
Script -o SDaaS : contains
SidVariable <- Driver : uses
ConfigVariable .o Command : uses
Application ..> KnowledgeGraph : access
Backing service
It is a type of software that operates on servers, handling data storage, resource publishing, and processing tasks for an application.
Graph Store
It is a backing service that supports the SPARQL protocol and, optionally, the Graph Store Protocol.
Knowledge Graph
It is a Graph Store compliant with the KEES specification.
Linked Data Platform
It is a web information resource that exposes RDF data in one of the supported serializations, according to the W3C LDP specifications.
SDaaS script
It is a bash script that uses SDaaS commands.
smart data service
It is a backing service that includes the SDaaS platform and implements one or more SDaaS scripts.
User application
It is a (Semantic Web) application that uses a knowledge graph.