CBGM¶
This page describes how to run the CBGM directly on the VM ntg.uni-muenster.de.
Preparing the Database for the CBGM¶
All input data must be converted and imported into one Postgres database.
The mysql database ECM contains the apparatus of the Editio Critica Maior publication. This database is exported from the NTVMR application (New Testament Virtual Manuscript Room).
The mysql database Leitzeile contains the Leitzeile of the current Nestle-Aland edition or any other appropriate “Leitzeile”.
Optionally, the mysql database VarGen contains previous editorial decisions regarding the priority of the readings. If this database is not supplied, default priorities are used.
Database Preparation for CBGM¶
The scripts/cceh/import.py script copies the mysql databases into temporary tables of the
Postgres database without doing any integrity checking.
The temporary tables are named original_*.
These tables are very useful for finding and understanding data errors.
The scripts/cceh/prepare.py script reads the temporary tables in the Postgres database and
writes tables in a database structure suitable for doing the CBGM.
This structure is normalized and data integrity is enforced.
The script will print all data integrity errors found
and also log them in the file prepare.log.
Warning
The script will not complete if there are data integrity errors.
All data integrity errors that surface at this point must be fixed in the source data with the aid of the NTVMR people. Then the source data must be imported again. This is an iterative and often very time-consuming process.
Applying the CBGM¶
The scripts/cceh/cbgm.py script applies the CBGM. It is run at the start of every new project phase, and it must also be run immediately after the scripts/cceh/prepare.py script.
Starting a New Project¶
To start a new project:
create a new Postgres database,
create local copies of the mysql databases,
add an instance to the server,
prepare the new Postgres database,
run the CBGM,
restart the application server.
Worked Example¶
As an example we will create a new project: Mark Phase 3.
The name of the new Postgres database is: mark_ph3.
We assume that we have obtained two mysql database dumps from the NTVMR people:
ECM_Mark_Ph3.dump.bz2 and Nestle29.dump.bz2.
ssh into the server.
Note
You need to have permission to sudo -u postgres and sudo -u ntg.
First create a new Postgres database:
sudo -u postgres ~ntg/prj/ntg/ntg/scripts/cceh/create_database.sh mark_ph3
Then import the database dumps into the local mysql databases:
sudo -iu ntg
mysql -e "CREATE DATABASE ECM_Mark_Ph3"
mysql -e "CREATE DATABASE Nestle29"
bzcat ECM_Mark_Ph3.dump.bz2 | mysql -D ECM_Mark_Ph3
bzcat Nestle29.dump.bz2 | mysql -D Nestle29
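Before piping the dumps into mysql it can help to confirm that the files are actually present. The following is a minimal sketch, not part of the project scripts: the helper name check_dumps is hypothetical, and it only tests that each file is readable and non-empty, not that it is a valid dump.

```shell
# check_dumps FILE...
# Hypothetical helper: report every dump file that is missing, unreadable,
# or empty. Exits with status 1 if any problem was found, 0 otherwise.
# The body runs in a subshell, so "exit" only leaves the function.
check_dumps() (
    status=0
    for dump in "$@"; do
        if [ ! -r "$dump" ]; then
            echo "missing or unreadable: $dump"
            status=1
        elif [ ! -s "$dump" ]; then
            echo "empty: $dump"
            status=1
        fi
    done
    exit "$status"
)
```

It could be called as check_dumps ECM_Mark_Ph3.dump.bz2 Nestle29.dump.bz2 before running the bzcat commands.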
Then create a new server instance. The fastest way is to just copy an old instance configuration file and edit it:
cd ~/prj/ntg/ntg/instance
cp mark_ph22.conf mark_ph3.conf
emacs mark_ph3.conf
Change all relevant parts of the instance configuration file. See: API Server Configuration Files.
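Because the new file starts as a copy, it is easy to overlook a setting that still points at the old project. As a rough check, and assuming the configuration file mentions the old database name literally, the sketch below lists the lines that still contain it. The helper name stale_refs is hypothetical.

```shell
# stale_refs OLD_NAME CONF_FILE
# Hypothetical helper: print, with line numbers, every line of CONF_FILE
# that still contains OLD_NAME as a fixed string.
# Exit status: 0 = no occurrence left, 1 = occurrences printed,
# 2 = the file could not be read.
stale_refs() (
    if [ ! -r "$2" ]; then
        echo "cannot read: $2" >&2
        exit 2
    fi
    if grep -nF -- "$1" "$2"; then
        exit 1    # matches were printed: the file still needs editing
    fi
    exit 0
)
```

For the example above the call would be stale_refs mark_ph22 mark_ph3.conf. An empty output only shows that the old name is gone; it does not prove that every setting is correct.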
Use the scripts/cceh/import.py and scripts/cceh/prepare.py scripts to import the mysql databases into Postgres and prepare them for CBGM:
cd ~/prj/ntg/ntg
python3 -m scripts.cceh.import -vvv instance/mark_ph3.conf
python3 -m scripts.cceh.prepare -vvv instance/mark_ph3.conf
(Note: If you came from Starting a New Phase With Apparatus Update continue there.)
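Since prepare is only meaningful after a successful import, the two calls can be chained so that the second one is skipped when the first one fails. The sketch below assumes that the scripts exit with a non-zero status on failure, which this page does not state explicitly. The function name run_steps and the PYTHON override variable are hypothetical.

```shell
# run_steps CONF MODULE...
# Hypothetical helper: run "python3 -m MODULE -vvv CONF" for each module
# in the given order and stop at the first one that fails.
# The interpreter can be overridden with the PYTHON environment variable.
run_steps() (
    conf=$1
    shift
    for module in "$@"; do
        echo "+ ${PYTHON:-python3} -m $module -vvv $conf"
        "${PYTHON:-python3}" -m "$module" -vvv "$conf" || {
            echo "step failed: $module" >&2
            exit 1
        }
    done
)
```

A call such as run_steps instance/mark_ph3.conf scripts.cceh.import scripts.cceh.prepare would then stop after the import step if it reports an error.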
Then run the CBGM with the scripts/cceh/cbgm.py script:
python3 -m scripts.cceh.cbgm -vvv instance/mark_ph3.conf
Last, restart the application server:
sudo /bin/systemctl restart ntg
If the server doesn’t start, check for configuration errors:
sudo /bin/journalctl -u ntg
Add the database to the file scripts/cceh/active_databases.
This file controls nightly and weekly backups.
emacs scripts/cceh/active_databases
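As an alternative to editing the file by hand, the entry could be appended from the shell. This is only a sketch under a strong assumption: that active_databases lists one database name per line, which this page does not specify. Check the existing entries before relying on it. The helper name add_active_database is hypothetical.

```shell
# add_active_database NAME FILE
# Hypothetical helper: append NAME to FILE unless a line consisting of
# exactly NAME is already present.
# ASSUMPTION: FILE holds one database name per line.
add_active_database() (
    name=$1
    file=$2
    # -x matches whole lines only, -F treats the name as a fixed string
    if grep -qxF -- "$name" "$file" 2>/dev/null; then
        echo "already listed: $name"
    else
        printf '%s\n' "$name" >> "$file"
        echo "added: $name"
    fi
)
```

Running add_active_database mark_ph3 scripts/cceh/active_databases twice adds the name only once.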
If you are satisfied with the new project, you may drop the mysql databases. The application server uses the Postgres database only.
mysql -e "DROP DATABASE ECM_Mark_Ph3"
mysql -e "DROP DATABASE Nestle29"
Starting a New Phase¶
A new phase of the project is entered after the editors have completed a pass over the whole text. All editorial decisions taken during this pass are used to recalculate the CBGM for the next phase.
To start a new phase:
copy the database into a new database,
add an instance to the server, and
run the CBGM on the new instance.
Worked Example¶
As an example let us create a new Mark Phase 3 from an existing Mark Phase 2.2.
ssh into the server.
Note
You need to have permission to sudo -u postgres and sudo -u ntg.
First stop the application server and make a copy of the mark_ph22 database:
sudo -u ntg sudo /bin/systemctl stop ntg
sudo -u postgres psql -c "CREATE DATABASE mark_ph3 TEMPLATE mark_ph22 OWNER ntg"
sudo -u ntg sudo /bin/systemctl start ntg
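The application server is stopped for the copy because Postgres refuses to use a database as a template while other sessions are connected to it. When the statement is built for other phase names, a typo in a name is easy to make. The sketch below builds the same statement from its three names and rejects anything that is not a plain lowercase identifier. The function name copy_db_sql is hypothetical, and the identifier rule is deliberately stricter than what Postgres itself accepts.

```shell
# copy_db_sql OLD_DB NEW_DB OWNER
# Hypothetical helper: print the CREATE DATABASE statement used above.
# Each name must start with a lowercase letter or underscore and contain
# only lowercase letters, digits, and underscores.
copy_db_sql() (
    for ident in "$1" "$2" "$3"; do
        case $ident in
            ''|[!a-z_]*|*[!a-z0-9_]*)
                echo "invalid identifier: $ident" >&2
                exit 1 ;;
        esac
    done
    echo "CREATE DATABASE $2 TEMPLATE $1 OWNER $3"
)
```

The output can be passed to psql, for example: sudo -u postgres psql -c "$(copy_db_sql mark_ph22 mark_ph3 ntg)".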
Then create a new server instance:
sudo -iu ntg
cd ~/prj/ntg/ntg/instance
cp mark_ph22.conf mark_ph3.conf
Change all relevant parts of the instance configuration file. See: API Server Configuration Files.
emacs mark_ph3.conf
Put the old database in read-only mode (set WRITE_ACCESS="nobody"):
emacs mark_ph22.conf
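After editing, a quick check can confirm that the setting was really changed. The sketch below assumes that the setting is written on a line of its own as WRITE_ACCESS="nobody", optionally with spaces around the equals sign and with either quote style. The function name is_read_only is hypothetical.

```shell
# is_read_only CONF_FILE
# Hypothetical helper: succeed if CONF_FILE contains a line that sets
# WRITE_ACCESS to "nobody" (single or double quotes, optional spaces).
is_read_only() {
    grep -Eq "^[[:space:]]*WRITE_ACCESS[[:space:]]*=[[:space:]]*[\"']nobody[\"']" "$1"
}
```

For example: is_read_only mark_ph22.conf && echo "mark_ph22 is read-only". The change only takes effect once the application server has been restarted.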
Then run the CBGM on the new instance:
cd ~/prj/ntg/ntg
python3 -m scripts.cceh.cbgm -vvv instance/mark_ph3.conf
Last, restart the application server:
sudo /bin/systemctl restart ntg
Starting a New Phase With Apparatus Update¶
Sometimes a new phase goes hand in hand with a change in the apparatus.
To update the apparatus while maintaining (most) editorial decisions:
create a new database for the phase,
add an instance to the server,
prepare the new database with the new apparatus,
save the editorial decisions from the old database,
load the editorial decisions into the new database, and
run the CBGM on the new instance.
Worked Example¶
As an example let us create a new Mark Phase 3 from an existing Mark Phase 2.2 using a new apparatus.
First follow the steps in Starting a New Project above, until you reach the CBGM step.
Put the old database in read-only mode (set WRITE_ACCESS="nobody"):
cd ~/prj/ntg/ntg/instance
emacs mark_ph22.conf
sudo /bin/systemctl restart ntg
Then use the scripts/cceh/save_edits.py script to save the editorial decisions of the previous phase and the scripts/cceh/load_edits.py script to load them into the new instance:
cd ~/prj/ntg/ntg
python3 -m scripts.cceh.save_edits -vvv -o saved_edits.xml instance/mark_ph22.conf
python3 -m scripts.cceh.load_edits -vvv -i saved_edits.xml instance/mark_ph3.conf
The last command will also output a list of passages in the old apparatus
that are missing or different in the new apparatus and store them
in the file load_edits.log.
Then run the scripts/cceh/cbgm.py script on the new instance to apply the CBGM method:
python3 -m scripts.cceh.cbgm -vvv instance/mark_ph3.conf
Restart the application server:
sudo /bin/systemctl restart ntg
Add the database to the file scripts/cceh/active_databases.
This file controls nightly and weekly backups.
emacs scripts/cceh/active_databases
If you are satisfied with the new project, you may drop the mysql databases. The application server uses the Postgres database only.
mysql -e "DROP DATABASE ECM_Mark_Ph3"
mysql -e "DROP DATABASE Nestle29"