Snapshot version

You are browsing the docs for the snapshot version of Nexus, the latest release is available here

v1.11 To v1.12 Migration

Schema changes

Run the different scripts starting with V1_12 in order from the repository.

Indexing in Elasticsearch and Blazegraph

Those different changes in 1.12 require a full reindexing in Elasticsearch and Blazegraph:

  • default indexing in Elasticsearch now pushes to a single index
  • projects, resolvers, schemas, storages and views are not indexed anymore
  • some metadata fields are not indexed anymore

The choice of migration will depend on your current setup, needs, and uptime requirements that you want to uphold for your users.

If you can accept partial results in Elasticsearch/Blazegraph during the reindexing process, this migration can be achieved as follows:

  • Make a backup of your PostgreSQL instance
  • Deploy new instances of Elasticsearch and Blazegraph
  • Scale down Nexus Delta, update the image version and change its configuration to point to those new instances
  • Configure Delta via the plugins.elasticsearch.main-index.shards depending on your amount of data and following the Elastic recommendations about sharding
  • You can optionally allocate more resources to Delta, PostgreSQL, Blazegraph and Elasticsearch to accommodate for the load related to the indexing
  • You may also want to reindex in Elasticsearch first and then in Blazegraph if the load is too high
  • Run the following SQL script to delete the former default elasticsearch views and their progress:
DELETE FROM scoped_events WHERE type = 'elasticsearch' AND id = 'https://bluebrain.github.io/nexus/vocabulary/defaultElasticSearchIndex';
DELETE FROM scoped_states WHERE type = 'elasticsearch' AND id = 'https://bluebrain.github.io/nexus/vocabulary/defaultElasticSearchIndex';
DELETE FROM projection_offsets WHERE module = 'elasticsearch' AND resource_id = 'https://bluebrain.github.io/nexus/vocabulary/defaultElasticSearchIndex';
DELETE FROM projection_offsets WHERE name = 'event-metrics';
  • Run the following SQL script to delete the progress of Blazegraph indexing:
DELETE FROM projection_offsets WHERE module = 'blazegraph';
  • Start Nexus Delta
  • Monitor the load of the different components and the indexing process by running the following query
SELECT * FROM projection_offsets ORDER BY updated_at DESC;

Adopt hash partitioning

  • Make a backup of your PostgreSQL instance
  • Make sure to accommodate your deployment so that it can accomodate twice your data during the migration
  • Scale down Delta
  • Run the following script to init the partition_config table
  • Run the following script to rename the partitioning tables and their indices:
-- Renaming scoped_events table and its indices
ALTER TABLE scoped_events RENAME TO scoped_events_list;
ALTER INDEX scoped_events_type_idx RENAME TO scoped_events_type_idx_list ;
ALTER INDEX scoped_events_ordering_idx RENAME TO scoped_events_ordering_idx_list ;

-- Renaming scoped_states table and its indices
ALTER TABLE scoped_states RENAME TO scoped_states_list;
ALTER INDEX scoped_states_type_idx RENAME TO scoped_states_type_idx_list ;
ALTER INDEX scoped_states_ordering_idx RENAME TO scoped_states_ordering_idx_list ;
ALTER INDEX project_uuid_idx RENAME TO project_uuid_idx_list ;
ALTER INDEX state_value_types_idx RENAME TO state_value_types_idx_list ;
  • Run the following script to create the new tables:
-- scoped_events
CREATE TABLE IF NOT EXISTS public.scoped_events(
    ordering bigint       NOT NULL DEFAULT nextval('event_offset'),
    type     text         NOT NULL,
    org      text         NOT NULL,
    project  text         NOT NULL,
    id       text         NOT NULL,
    rev      integer      NOT NULL,
    value    JSONB        NOT NULL,
    instant  timestamptz  NOT NULL,
    PRIMARY KEY(org, project, id, rev)
) PARTITION BY HASH (org, project);

CREATE INDEX IF NOT EXISTS scoped_events_type_idx ON public.scoped_events(type);
CREATE INDEX IF NOT EXISTS scoped_events_ordering_idx ON public.scoped_events (ordering);

CREATE TABLE public.scoped_states_0000 PARTITION OF public.scoped_states FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE public.scoped_states_0001 PARTITION OF public.scoped_states FOR VALUES WITH (MODULUS 2, REMAINDER 1);

-- scoped_states
CREATE TABLE IF NOT EXISTS public.scoped_states(
    ordering   bigint       NOT NULL DEFAULT nextval('state_offset'),
    type       text         NOT NULL,
    org        text         NOT NULL,
    project    text         NOT NULL,
    id         text         NOT NULL,
    tag        text         NOT NULL,
    rev        integer      NOT NULL,
    value      JSONB        NOT NULL,
    deprecated boolean      NOT NULL,
    instant    timestamptz  NOT NULL,
    PRIMARY KEY(org, project, tag, id)
) PARTITION BY HASH (org, project);

CREATE INDEX IF NOT EXISTS scoped_states_type_idx ON public.scoped_states(type);
CREATE INDEX IF NOT EXISTS scoped_states_ordering_idx ON public.scoped_states (ordering);
CREATE INDEX IF NOT EXISTS project_uuid_idx ON public.scoped_states((value->>'uuid')) WHERE type = 'project';
CREATE INDEX IF NOT EXISTS state_value_types_idx ON public.scoped_states USING GIN ((value->'types'));
  • Start Nexus with a configuration with the hash partition strategy without making it available to users.

The modulo value should remain low(at most a few hundreds) and should be set to fit your number of projects and resources

app {
  database {
    partition-strategy {
     type = hash
     # Adapt the modulo value to the number of partitions you expect
     modulo = 5
    }
  }
}  
  • Stop Nexus again once this log appears:
2025-03-05 14:05:25 INFO  c.e.b.n.d.s.p.DatabasePartitioner - The partition strategy has not been set yet, initializing...
  • The following queries should return the different partitions
SELECT table_name 
FROM information_schema.tables WHERE table_name LIKE 'scoped_events_0%';
SELECT table_name 
FROM information_schema.tables WHERE table_name LIKE 'scoped_states_0%';
  • Running a query against the table partition_config should also return a row matching your configuration:
SELECT * FROM partition_config;
  • Run the following scripts to copy data to the new tables
INSERT INTO scoped_events (SELECT * FROM scoped_events_list);
INSERT INTO scoped_states (SELECT * FROM scoped_states_list);
  • Run a count against the old tables and the new ones to check they have the same number of rows
SELECT count(*) from scoped_events_list;
SELECT count(*) from scoped_events;
SELECT count(*) from scoped_states_list;
SELECT count(*) from scoped_states;
  • Delete the previous tables:
DROP TABLE IF EXISTS public.scoped_events_list;
DROP TABLE IF EXISTS public.scoped_states_list;
  • Start Delta again