CompositeView

This view is composed of multiple sources and projections.

Processing pipeline

An asynchronous process is triggered for every view. This process can be visualized as a pipeline with different stages.

The first stage is the input of the pipeline: a stream of sources.

The last stage takes the resulting output from the pipeline and indexes it on the configured projection.

CompositeView pipeline

Sources

A source defines where the resources are retrieved from. It is the input of the pipeline.

There are 3 types of sources available.

ProjectEventStream

This source will read events in a streaming fashion from the current project where the view gets created.

The events will be consumed by the projections stage.

{
   "sources": [
      {
         "@id": "{sourceId}",
         "@type": "ProjectEventStream",
         "resourceSchemas": [ "{resourceSchema}", ...],
         "resourceTypes": [ "{resourceType}", ...],
         "resourceTag": "{tag}"
      }
   ],
   ...
}

where…

  • {sourceId}: Iri - The identifier of the source. This field is optional. When missing, a randomly generated Iri will be assigned.
  • {resourceSchema}: Iri - Selects only resources that are validated against the provided schema Iri. This field is optional.
  • {resourceType}: Iri - Selects only resources of the provided type Iri. This field is optional.
  • {tag}: String - Selects only resources with the provided tag. This field is optional.
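
As an illustration of the optional fields above, the following is a minimal Python sketch that assembles a ProjectEventStream source and leaves out every field that was not provided. The helper name project_event_stream and the tag value "v1" are hypothetical; only the JSON keys come from the definition above.

```python
from typing import Optional, Sequence


def project_event_stream(
    source_id: Optional[str] = None,
    resource_schemas: Sequence[str] = (),
    resource_types: Sequence[str] = (),
    resource_tag: Optional[str] = None,
) -> dict:
    """Build a ProjectEventStream source, omitting unset optional fields."""
    source = {"@type": "ProjectEventStream"}
    if source_id is not None:
        source["@id"] = source_id  # when omitted, a random Iri is assigned
    if resource_schemas:
        source["resourceSchemas"] = list(resource_schemas)
    if resource_types:
        source["resourceTypes"] = list(resource_types)
    if resource_tag is not None:
        source["resourceTag"] = resource_tag
    return source


# A source without any filter, and one restricted by type and tag.
minimal = project_event_stream()
filtered = project_event_stream(
    source_id="http://music.com/sources/local",
    resource_types=["http://music.com/Band"],
    resource_tag="v1",  # hypothetical tag value
)
```

The resulting dictionaries can be placed in the "sources" array of the view payload.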

CrossProjectEventStream

This source will read events in a streaming fashion from any project in the current Nexus deployment. The specified list of identities will be used to retrieve the resources from the project. If the identities do not have the resources/read permission on the project, the source will be ignored.

The events will be consumed by the projections stage.

{
   "sources": [
      {
         "@id": "{sourceId}",
         "@type": "CrossProjectEventStream",
         "project": "{project}",
         "identities": [ {_identity_}, {...} ],
         "resourceSchemas": [ "{resourceSchema}", ...],
         "resourceTypes": [ "{resourceType}", ...],
         "resourceTag": "{tag}"
      }
   ],
   ...
}

where…

  • {sourceId}: Iri - The identifier of the source. This field is optional. When missing, a randomly generated Iri will be assigned.
  • {project}: String - the target project (in the format ‘myorg/myproject’).
  • _identity_: Json object - the identity against which to enforce ACLs during the resource retrieval process.
  • {resourceSchema}: Iri - Selects only resources that are validated against the provided schema Iri. This field is optional.
  • {resourceType}: Iri - Selects only resources of the provided type Iri. This field is optional.
  • {tag}: String - Selects only resources with the provided tag. This field is optional.

RemoteProjectEventStream

This source will read events in a streaming fashion from any project in a remote Nexus deployment.

The events will then be consumed by the projections stage.

{
   "sources": [
      {
         "@id": "{sourceId}",
         "@type": "RemoteProjectEventStream",
         "project": "{project}",
         "endpoint": "{endpoint}",
         "token": "{token}",
         "resourceSchemas": [ "{resourceSchema}", ...],
         "resourceTypes": [ "{resourceType}", ...],
         "resourceTag": "{tag}"
      }
   ],
   ...
}

where…

  • {sourceId}: Iri - The identifier of the source. This field is optional. When missing, a randomly generated Iri will be assigned.
  • {project}: String - the remote project (in the format ‘myorg/myproject’).
  • {endpoint}: Iri - the Nexus deployment endpoint.
  • {token}: String - the Nexus deployment token. This field is optional. When missing, the Nexus endpoint will be accessed without authentication.
  • {resourceSchema}: Iri - Selects only resources that are validated against the provided schema Iri. This field is optional.
  • {resourceType}: Iri - Selects only resources of the provided type Iri. This field is optional.
  • {tag}: String - Selects only resources with the provided tag. This field is optional.
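
Because the token is optional, a client may prefer to read it from the environment rather than hard-code it in the payload. The sketch below shows one way to do that in Python; the helper name remote_project_event_stream and the environment variable NEXUS_REMOTE_TOKEN are assumptions made for this example.

```python
import os
from typing import Optional


def remote_project_event_stream(
    project: str, endpoint: str, token: Optional[str] = None
) -> dict:
    """Build a RemoteProjectEventStream source with an optional token."""
    source = {
        "@type": "RemoteProjectEventStream",
        "project": project,    # format: 'myorg/myproject'
        "endpoint": endpoint,  # the remote Nexus deployment endpoint
    }
    # Without a token the remote endpoint is accessed without authentication.
    if token:
        source["token"] = token
    return source


# NEXUS_REMOTE_TOKEN is a hypothetical environment variable name.
source = remote_project_event_stream(
    "remote_demo/songs",
    "https://example2.nexus.com",
    token=os.environ.get("NEXUS_REMOTE_TOKEN"),
)
```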

Intermediate Sparql space

After the events are gathered from each source, the following steps are executed:

  1. Convert event into a resource.
  2. Discard undesired resources.
  3. Store the RDF triple representation of a resource in an intermediate Sparql space. This space will be used by the projections in the following pipeline steps.
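
The steps above can be pictured with the following Python sketch. It is not the Nexus implementation: the resource dictionaries, the is_selected and fill_space helpers, and the rule that a resource matches when it has at least one of the requested types are all assumptions made for illustration. Step 1 (event to resource conversion) is taken as already done.

```python
from typing import Iterable, Optional, Sequence, Set, Tuple

Triple = Tuple[str, str, str]


def is_selected(
    resource: dict,
    schemas: Sequence[str] = (),
    types: Sequence[str] = (),
    tag: Optional[str] = None,
) -> bool:
    """Step 2: keep a resource only if it passes every configured filter."""
    if schemas and resource.get("schema") not in schemas:
        return False
    if types and not set(types) & set(resource.get("types", ())):
        return False
    if tag is not None and tag not in resource.get("tags", ()):
        return False
    return True


def fill_space(resources: Iterable[dict], **filters) -> Set[Triple]:
    """Step 3: collect the triples of the selected resources in one space."""
    space: Set[Triple] = set()
    for resource in resources:
        if is_selected(resource, **filters):
            space.update(resource["triples"])
    return space


band = {
    "types": ["http://music.com/Band"],
    "triples": [("http://music.com/muse", "http://music.com/name", "Muse")],
}
album = {
    "types": ["http://music.com/Album"],
    "triples": [("http://music.com/absolution", "http://music.com/title", "Absolution")],
}

# Only the band passes a source restricted to the Band type.
space = fill_space([band, album], types=["http://music.com/Band"])
```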

Projections

A projection defines the type of indexing and the transformations to apply to the data. It is the output of the pipeline.

There are 2 types of projections available.

ElasticSearch

This projection executes the following steps:

  1. Discard undesired resources.
  2. Transform the resource by executing a SPARQL construct query against the intermediate Sparql space.
  3. Convert the resulting RDF triples into JSON using the provided JSON-LD context.
  4. Store the resulting JSON as a Document in an ElasticSearch index.

{
   "projections": [
      {
         "@id": "{projectionId}",
         "@type": "ElasticSearch",
         "mapping": _elasticsearch mapping_,
         "query": "{query}",
         "context": _context_,
         "resourceSchemas": [ "{resourceSchema}", ...],
         "resourceTypes": [ "{resourceType}", ...],
         "resourceTag": "{tag}",
         "includeMetadata": {includeMetadata},
         "includeDeprecated": {includeDeprecated}
      }
   ],
   ...
}

where…

  • {projectionId}: Iri - The identifier of the projection. This field is optional. When missing, a randomly generated Iri will be assigned.
  • _elasticsearch mapping_: Json object - Defines the value types for the Json keys, as described in the ElasticSearch mapping documentation.
  • {resourceSchema}: Iri - Selects only resources that are validated against the provided schema Iri to perform the query. This field is optional.
  • {resourceType}: Iri - Selects only resources of the provided type Iri to perform the query. This field is optional.
  • {tag}: String - Selects only resources with the provided tag to perform the query. This field is optional.
  • {includeMetadata}: Boolean - If true, the resource’s nexus metadata (_constrainedBy, _deprecated, …) will be stored in the ElasticSearch document. Otherwise it won’t. The default value is false.
  • {includeDeprecated}: Boolean - If true, deprecated resources are also indexed. The default value is false.
  • {query}: Sparql Query - Defines the Sparql query to execute against the intermediate Sparql space for each target resource.
  • _context_: Json - the JSON-LD context value applied to the query results.
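
The example queries on this page refer to the current resource through a {resource_id} placeholder. As a rough mental model only, the Python sketch below substitutes that placeholder with the resource Iri before the query is run. How Nexus performs this binding internally is not described here, so the bind_resource helper and the plain string replacement are assumptions.

```python
def bind_resource(query: str, resource_id: str) -> str:
    """Replace every {resource_id} placeholder with the Iri in angle brackets."""
    return query.replace("{resource_id}", f"<{resource_id}>")


# A shortened construct query in the style of the example on this page.
query = (
    "prefix music: <http://music.com/> "
    "CONSTRUCT { {resource_id} music:name ?bandName } "
    "WHERE { {resource_id} music:name ?bandName }"
)

bound = bind_resource(query, "http://music.com/muse")
print(bound)
```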

Sparql

This projection executes the following steps:

  1. Discard undesired resources.
  2. Transform the resource by executing a SPARQL construct query against the intermediate Sparql space.
  3. Store the resulting RDF triples in a Blazegraph namespace.

{
   "projections": [
      {
         "@id": "{projectionId}",
         "@type": "Sparql",
         "query": "{query}",
         "resourceSchemas": [ "{resourceSchema}", ...],
         "resourceTypes": [ "{resourceType}", ...],
         "resourceTag": "{tag}",
         "includeMetadata": {includeMetadata},
         "includeDeprecated": {includeDeprecated}
      }
   ],
   ...
}

where…

  • {projectionId}: Iri - The identifier of the projection. This field is optional. When missing, a randomly generated Iri will be assigned.
  • {resourceSchema}: Iri - Selects only resources that are validated against the provided schema Iri to perform the query. This field is optional.
  • {resourceType}: Iri - Selects only resources of the provided type Iri to perform the query. This field is optional.
  • {tag}: String - Selects only resources with the provided tag to perform the query. This field is optional.
  • {includeMetadata}: Boolean - If true, the resource’s nexus metadata (_constrainedBy, _deprecated, …) will be stored in the Blazegraph namespace. Otherwise it won’t. The default value is false.
  • {includeDeprecated}: Boolean - If true, deprecated resources are also indexed. The default value is false.
  • {query}: Sparql Query - Defines the Sparql query to execute against the intermediate Sparql space for each target resource.

Payload

{
  "@id": "{someid}",
  "@type": "CompositeView",
  "sources": [ _source_, ...],
  "projections": [ _projection_, ...],
  "rebuildStrategy": {
    "@type": "Interval",
    "value": "{interval_value}"
  }
}

where…

  • _source_: Json - The source definition.
  • _projection_: Json - The projection definition.
  • {interval_value}: String - The maximum interval delay for a resource to be present in a projection, in a human-readable format (e.g.: 10 minutes).

Note: The rebuildStrategy block is optional. If missing, the view won’t be automatically restarted.
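
A client can assemble this payload programmatically. The following Python sketch builds it from lists of sources and projections and adds the rebuildStrategy block only when an interval is given. The composite_view helper is hypothetical, and the source and projection used in the usage lines are placeholders rather than a recommended configuration.

```python
import json
from typing import Optional, Sequence


def composite_view(
    sources: Sequence[dict],
    projections: Sequence[dict],
    rebuild_interval: Optional[str] = None,
    view_id: Optional[str] = None,
) -> dict:
    """Assemble a CompositeView payload; rebuildStrategy is optional."""
    view = {
        "@type": "CompositeView",
        "sources": list(sources),
        "projections": list(projections),
    }
    if view_id is not None:
        view["@id"] = view_id
    if rebuild_interval is not None:
        view["rebuildStrategy"] = {"@type": "Interval", "value": rebuild_interval}
    return view


# Placeholder source and projection, for illustration only.
sources = [{"@type": "ProjectEventStream"}]
projections = [
    {
        "@type": "Sparql",
        "query": "CONSTRUCT { {resource_id} ?p ?o } WHERE { {resource_id} ?p ?o }",
    }
]

view = composite_view(sources, projections, rebuild_interval="10 minutes")
payload = json.dumps(view, indent=2)
```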

Example

The following example creates a Composite view containing 3 sources and 2 projections.

The incoming data from each of the sources is stored as RDF triples in the intermediate Sparql space.

The ElasticSearch projection http://music.com/bands is only going to query the Sparql space with the provided query when the current resource in the pipeline has the type http://music.com/Band.

The ElasticSearch projection http://music.com/albums is only going to query the Sparql space with the provided query when the current resource in the pipeline has the type http://music.com/Album.

The view is going to be restarted every 10 minutes if there are new resources in any of the sources since the last time the view was restarted. This makes it possible to deal with partial graph visibility issues.

{
  "@type": "CompositeView",
  "sources": [
    {
      "@id": "http://music.com/sources/local",
      "@type": "ProjectEventStream"
    },
    {
      "@id": "http://music.com/sources/albums",
      "@type": "CrossProjectEventStream",
      "project": "demo/albums",
      "identities": [
        {
          "realm": "myrealm",
          "group": "mygroup"
        }
      ]
    },
    {
      "@id": "http://music.com/sources/songs",
      "@type": "RemoteProjectEventStream",
      "project": "remote_demo/songs",
      "endpoint": "https://example2.nexus.com",
      "token": "mytoken"
    }
  ],
  "projections": [
    {
      "@id": "http://music.com/bands",
      "@type": "ElasticSearch",
      "mapping": {
        "properties": {
          "@type": {
            "type": "keyword",
            "copy_to": "_all_fields"
          },
          "@id": {
            "type": "keyword",
            "copy_to": "_all_fields"
          },
          "name": {
            "type": "keyword",
            "copy_to": "_all_fields"
          },
          "genre": {
            "type": "keyword",
            "copy_to": "_all_fields"
          },
          "album": {
            "type": "nested",
            "properties": {
              "title": {
                "type": "keyword",
                "copy_to": "_all_fields"
              },
              "released": {
                "type": "date",
                "copy_to": "_all_fields"
              },
              "song": {
                "type": "nested",
                "properties": {
                  "title": {
                    "type": "keyword",
                    "copy_to": "_all_fields"
                  },
                  "number": {
                    "type": "long",
                    "copy_to": "_all_fields"
                  },
                  "length": {
                    "type": "long",
                    "copy_to": "_all_fields"
                  }
                }
              }
            }
          },
          "_all_fields": {
            "type": "text"
          }
        },
        "dynamic": false
      },
      "query": "prefix music: <http://music.com/> prefix nxv: <https://bluebrain.github.io/nexus/vocabulary/> CONSTRUCT {{resource_id}   music:name       ?bandName ; music:genre      ?bandGenre ; music:album      ?albumId . ?albumId        music:released   ?albumReleaseDate ; music:song       ?songId . ?songId         music:title      ?songTitle ; music:number     ?songNumber ; music:length     ?songLength } WHERE {{resource_id}   music:name       ?bandName ; music:genre      ?bandGenre . OPTIONAL {{resource_id} ^music:by        ?albumId . ?albumId        music:released   ?albumReleaseDate . OPTIONAL {?albumId         ^music:on        ?songId . ?songId          music:title      ?songTitle ; music:number     ?songNumber ; music:length     ?songLength } } } ORDER BY(?songNumber)",
      "context": {
        "@base": "http://music.com/",
        "@vocab": "http://music.com/"
      },
      "resourceTypes": [
        "http://music.com/Band"
      ]
    },
    {
      "@id": "http://music.com/albums",
      "@type": "ElasticSearch",
      "mapping": {
        "properties": {
          "@type": {
            "type": "keyword",
            "copy_to": "_all_fields"
          },
          "@id": {
            "type": "keyword",
            "copy_to": "_all_fields"
          },
          "name": {
            "type": "keyword",
            "copy_to": "_all_fields"
          },
          "length": {
            "type": "long",
            "copy_to": "_all_fields"
          },
          "numberOfSongs": {
            "type": "long",
            "copy_to": "_all_fields"
          },
          "_all_fields": {
            "type": "text"
          }
        },
        "dynamic": false
      },
      "query": "prefix music: <http://music.com/> prefix nxv: <https://bluebrain.github.io/nexus/vocabulary/> CONSTRUCT {{resource_id}             music:name               ?albumTitle ; music:length             ?albumLength ; music:numberOfSongs      ?numberOfSongs } WHERE {SELECT ?albumReleaseDate ?albumTitle (sum(?songLength) as ?albumLength) (count(?albumReleaseDate) as ?numberOfSongs) WHERE {OPTIONAL { {resource_id}           ^music:on / music:length   ?songLength } {resource_id} music:released             ?albumReleaseDate ; music:title                ?albumTitle . } GROUP BY ?albumReleaseDate ?albumTitle }",
      "context": {
        "@base": "http://music.com/",
        "@vocab": "http://music.com/"
      },
      "resourceTypes": [
        "http://music.com/Album"
      ]
    }
  ],
  "rebuildStrategy": {
    "@type": "Interval",
    "value": "10 minutes"
  }
}

Endpoints

The following sections describe the endpoints that are specific to a CompositeView.

The general view endpoints are described on the parent page.

Search Documents in a projection

POST /v1/views/{org_label}/{project_label}/{view_id}/projections/{projection_id}/_search
  {...}

where {projection_id} is the @id value of the target ElasticSearch projection.

The special character _ allows a search to be performed in every ElasticSearch projection on the current view.

The supported payload is defined in the ElasticSearch documentation.
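
When the view or projection identifier is a full Iri, it contains characters that are not valid inside a URL path segment, so a client will typically percent-encode it. The Python sketch below builds (but does not send) such a search request with the standard library. The search_request helper and the BASE value are assumptions for this example, and whether a given deployment expects encoded identifiers should be checked against its own documentation.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE = "https://nexus.example.com/v1"  # example deployment address


def search_request(
    org: str, project: str, view_id: str, query: dict, projection_id: str = "_"
) -> Request:
    """Build a POST request for the _search endpoint of a projection.

    The default projection_id "_" targets every ElasticSearch projection.
    """
    url = (
        f"{BASE}/views/{org}/{project}/{quote(view_id, safe='')}"
        f"/projections/{quote(projection_id, safe='')}/_search"
    )
    return Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


term_query = {"query": {"term": {"name": {"value": "Muse"}}}}
request = search_request("myorg", "myproj", "nxv:myview", term_query)
print(request.full_url)
```

Passing the request to urllib.request.urlopen would send it; authentication headers, if required, are left out of this sketch.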

Example

Request
curl -XPOST -H "Content-Type: application/json" "https://nexus.example.com/v1/views/myorg/myproj/nxv:myview/projections/_/_search" -d \
'{
    "query": {
        "term": {
            "name": {
                "value": "Muse"
            }
        }
    }
}'
Payload
{
    "query": {
        "term": {
            "name": {
                "value": "Muse"
            }
        }
    }
}
Response
{
  "_shards": {
    "failed": 0,
    "skipped": 0,
    "successful": 0,
    "total": 0
  },
  "hits": {
    "hits": [
      {
        "_score": 0.6931472,
        "_id": "http://music.com/muse",
        "_index": "kg_2d4b3208-63ad-441b-9eb1-831722b5df88_24a0220e-c546-456f-b904-8770132f8e12_1",
        "_source": {
          "@id": "muse",
          "album": [
            {
              "@id": "absolution",
              "released": "2003-09-15",
              "song": [
                {
                  "@id": "absolution/1.json",
                  "length": 252,
                  "number": 1,
                  "title": "Apocalypse Please"
                },
                {
                  "@id": "absolution/2.json",
                  "length": 236,
                  "number": 2,
                  "title": "Time Is Running Out"
                },
                {
                  "@id": "absolution/3.json",
                  "length": 294,
                  "number": 3,
                  "title": "Sing for Absolution"
                }
              ]
            },
            {
              "@id": "black_holes_and_revelations",
              "released": "2006-07-03",
              "song": [
                {
                  "@id": "black_holes_and_revelations/1.json",
                  "length": 275,
                  "number": 1,
                  "title": "Take a Bow"
                },
                {
                  "@id": "black_holes_and_revelations/2.json",
                  "length": 239,
                  "number": 2,
                  "title": "Starlight"
                },
                {
                  "@id": "black_holes_and_revelations/3.json",
                  "length": 209,
                  "number": 3,
                  "title": "Supermassive Black Hole"
                }
              ]
            }
          ],
          "genre": [
            "progressive rock",
            "alternative rock",
            "space rock",
            "art rock",
            "electronica",
            "hard rock"
          ],
          "name": "Muse"
        },
        "_type": "_doc"
      }
    ],
    "max_score": 0.6931472,
    "total": {
      "relation": "eq",
      "value": 1
    }
  },
  "timed_out": false,
  "took": 0
}

SPARQL query in a projection

POST /v1/views/{org_label}/{project_label}/{view_id}/projections/{projection_id}/sparql
  {query}
GET /v1/views/{org_label}/{project_label}/{view_id}/projections/{projection_id}/sparql?query={query}

where {projection_id} is the @id value of the target Sparql projection.

The special character _ allows a query to be performed in every Sparql projection on the current view.

In both endpoints, {query} is defined by the SPARQL documentation.

The Content-Type HTTP header for POST requests is application/sparql-query.
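
The two variants differ in where the query travels: in the request body for POST, and as a URL-encoded query parameter for GET. The Python sketch below builds both requests with the standard library without sending them. The helper names and the BASE value are assumptions made for this example.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE = "https://nexus.example.com/v1"  # example deployment address


def sparql_endpoint(org: str, project: str, view_id: str, projection_id: str = "_") -> str:
    """Return the sparql endpoint of a projection ("_" means every Sparql projection)."""
    return (
        f"{BASE}/views/{org}/{project}/{quote(view_id, safe='')}"
        f"/projections/{quote(projection_id, safe='')}/sparql"
    )


def sparql_post(endpoint: str, query: str) -> Request:
    """POST variant: the raw query is the body, typed as application/sparql-query."""
    return Request(
        endpoint,
        data=query.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/sparql-query"},
    )


def sparql_get(endpoint: str, query: str) -> Request:
    """GET variant: the query is URL-encoded into the 'query' parameter."""
    params = urlencode({"query": query})
    return Request(endpoint + "?" + params, method="GET")


endpoint = sparql_endpoint("myorg", "myproj", "nxv:myview")
query = "SELECT ?s where {?s ?p ?o} LIMIT 2"
post_request = sparql_post(endpoint, query)
get_request = sparql_get(endpoint, query)
```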

Example

Request
curl -XPOST -H "Content-Type: application/sparql-query" "https://nexus.example.com/v1/views/myorg/myproj/nxv:myview/projections/nxv:album/sparql" -d \
'SELECT ?s where {?s ?p ?o} LIMIT 2'
Response
{
  "head": {
    "vars": [
      "s"
    ]
  },
  "results": {
    "bindings": [
      {
        "s": {
          "type": "uri",
          "value": "http://example.com/myview"
        }
      },
      {
        "s": {
          "type": "uri",
          "value": "http://example.com/other"
        }
      }
    ]
  }
}

Fetch sources statistics

GET /v1/views/{org_label}/{project_label}/{view_id}/statistics

Example

Request
curl "https://nexus.example.com/v1/views/myorg/myproj/nxv:myview/statistics"
Response
{
  "@context": [
    "https://bluebrain.github.io/nexus/contexts/search.json",
    "https://bluebrain.github.io/nexus/contexts/resource.json",
    "https://bluebrain.github.io/nexus/contexts/statistics.json"
  ],
  "_total": 3,
  "_results": [
    {
      "sourceId": "http://music.com/sources/albums",
      "projectionId": "http://music.com/albums",
      "totalEvents": 9,
      "processedEvents": 9,
      "evaluatedEvents": 4,
      "remainingEvents": 0,
      "discardedEvents": 5,
      "failedEvents": 0,
      "delayInSeconds": 0,
      "lastEventDateTime": "2020-02-03T10:42:50.002Z",
      "lastProcessedEventDateTime": "2020-02-03T10:42:50.002Z",
      "nextRestart": "2020-02-03T12:54:07.325Z"
    },
    {
      "sourceId": "http://music.com/sources/local",
      "projectionId": "http://music.com/albums",
      "totalEvents": 8,
      "processedEvents": 8,
      "evaluatedEvents": 0,
      "remainingEvents": 0,
      "discardedEvents": 8,
      "failedEvents": 0,
      "delayInSeconds": 0,
      "lastEventDateTime": "2020-02-03T10:42:50.441Z",
      "lastProcessedEventDateTime": "2020-02-03T10:42:50.441Z",
      "nextRestart": "2020-02-03T12:54:07.325Z"
    },
    {
      "sourceId": "http://music.com/sources/songs",
      "projectionId": "http://music.com/albums",
      "totalEvents": 17,
      "processedEvents": 17,
      "evaluatedEvents": 0,
      "remainingEvents": 0,
      "discardedEvents": 17,
      "failedEvents": 0,
      "delayInSeconds": 0,
      "lastEventDateTime": "2020-01-24T13:41:42.177Z",
      "lastProcessedEventDateTime": "2020-01-24T13:41:42.177Z",
      "nextRestart": "2020-02-03T12:54:07.325Z"
    }
  ]
}

where:

  • totalEvents - total number of events in the project
  • processedEvents - number of events that have been considered by the view
  • remainingEvents - number of events that remain to be considered by the view
  • discardedEvents - number of events that have been discarded (were not evaluated due to filters, e.g. did not match schema, tag or type defined in the view)
  • evaluatedEvents - number of events that have been used to update an index
  • lastEventDateTime - timestamp of the last event in the project
  • lastProcessedEventDateTime - timestamp of the last event processed by the view
  • delayInSeconds - number of seconds between the last processed event timestamp and the last known event timestamp
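
In the example response, the counters of each entry add up: processedEvents equals evaluatedEvents + discardedEvents + failedEvents, and remainingEvents equals totalEvents - processedEvents. These relationships are inferred from the example numbers and are not stated as a guarantee, so the Python sketch below should be read as a sanity check under that assumption. The helper names are hypothetical.

```python
def is_consistent(stats: dict) -> bool:
    """Check the counter relationships observed in the example response."""
    accounted = (
        stats["evaluatedEvents"] + stats["discardedEvents"] + stats["failedEvents"]
    )
    remaining = stats["totalEvents"] - stats["processedEvents"]
    return stats["processedEvents"] == accounted and stats["remainingEvents"] == remaining


def progress(stats: dict) -> float:
    """Fraction of the events that the view has already considered."""
    total = stats["totalEvents"]
    return stats["processedEvents"] / total if total else 1.0


# Counters of the first entry in the example response.
entry = {
    "totalEvents": 9,
    "processedEvents": 9,
    "evaluatedEvents": 4,
    "remainingEvents": 0,
    "discardedEvents": 5,
    "failedEvents": 0,
}

consistent = is_consistent(entry)  # 4 + 5 + 0 == 9 and 9 - 9 == 0
done = progress(entry)             # 9 / 9
```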

Fetch projection statistics

GET /v1/views/{org_label}/{project_label}/{view_id}/projections/{projection_id}/statistics

where {projection_id} is the @id value of the projection.

The special character _ allows fetching statistics from every projection on the current view.

Example

Request
curl "https://nexus.example.com/v1/views/myorg/myproj/nxv:myview/projections/nxv:albums/statistics"
Response
{
  "@context": [
    "https://bluebrain.github.io/nexus/contexts/search.json",
    "https://bluebrain.github.io/nexus/contexts/resource.json",
    "https://bluebrain.github.io/nexus/contexts/statistics.json"
  ],
  "_total": 3,
  "_results": [
    {
      "sourceId": "http://music.com/sources/albums",
      "projectionId": "http://music.com/albums",
      "totalEvents": 9,
      "processedEvents": 9,
      "evaluatedEvents": 4,
      "remainingEvents": 0,
      "discardedEvents": 5,
      "failedEvents": 0,
      "delayInSeconds": 0,
      "lastEventDateTime": "2020-02-03T10:42:50.002Z",
      "lastProcessedEventDateTime": "2020-02-03T10:42:50.002Z",
      "nextRestart": "2020-02-03T12:54:07.325Z"
    },
    {
      "sourceId": "http://music.com/sources/local",
      "projectionId": "http://music.com/albums",
      "totalEvents": 8,
      "processedEvents": 8,
      "evaluatedEvents": 0,
      "remainingEvents": 0,
      "discardedEvents": 8,
      "failedEvents": 0,
      "delayInSeconds": 0,
      "lastEventDateTime": "2020-02-03T10:42:50.441Z",
      "lastProcessedEventDateTime": "2020-02-03T10:42:50.441Z",
      "nextRestart": "2020-02-03T12:54:07.325Z"
    },
    {
      "sourceId": "http://music.com/sources/songs",
      "projectionId": "http://music.com/albums",
      "totalEvents": 17,
      "processedEvents": 17,
      "evaluatedEvents": 0,
      "remainingEvents": 0,
      "discardedEvents": 17,
      "failedEvents": 0,
      "delayInSeconds": 0,
      "lastEventDateTime": "2020-01-24T13:41:42.177Z",
      "lastProcessedEventDateTime": "2020-01-24T13:41:42.177Z",
      "nextRestart": "2020-02-03T12:54:07.325Z"
    }
  ]
}

where:

  • totalEvents - total number of events in the project
  • processedEvents - number of events that have been considered by the view
  • remainingEvents - number of events that remain to be considered by the view
  • discardedEvents - number of events that have been discarded (were not evaluated due to filters, e.g. did not match schema, tag or type defined in the view)
  • evaluatedEvents - number of events that have been used to update an index
  • lastEventDateTime - timestamp of the last event in the project
  • lastProcessedEventDateTime - timestamp of the last event processed by the view
  • delayInSeconds - number of seconds between the last processed event timestamp and the last known event timestamp

Restart projection

DELETE /v1/views/{org_label}/{project_label}/{view_id}/projections/{projection_id}/progress

where {projection_id} is the @id value of the projection.

The special character _ allows restarting the progress of every projection on the current view.

Example

Request
curl -XDELETE "https://nexus.example.com/v1/views/myorg/myproj/nxv:myview/projections/_/progress"
Response
{
  "@context": [
    "https://bluebrain.github.io/nexus/contexts/search.json",
    "https://bluebrain.github.io/nexus/contexts/resource.json",
    "https://bluebrain.github.io/nexus/contexts/progress.json"
  ],
  "_total": 1,
  "_results": [
    {
      "@type": "NoProgress"
    }
  ]
}