Scripts¶

scripts.cceh.import
scripts.cceh.prepare
scripts.cceh.cbgm
scripts.cceh.save_edits
scripts.cceh.load_edits
scripts.cceh.mk_users

scripts.cceh.import ¶

Import databases from mysql.

This script initializes the postgres database and then imports data from one or more mysql databases.

Note

Make sure to follow the steps in Database Access first.

The source databases are:

a database containing the apparatus of the Editio Critica Maior publication (ECM),
a database containing the Leitzeile, and
optionally a database containing the editorial decisions regarding the priority of the readings (VarGen).

The source tables for Acts are partitioned into 28 chapters. This is a historical accident: the software used when the CBGM was first implemented could not handle big tables. The import script joins the partitioned tables into one.

After running this script you should run the scripts/cceh/prepare.py script.

concat_tables_fdw(conn, meta, dest_table, fdw, table_mask)¶: Concatenate multiple tables into one.
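The idea of joining chapter partitions can be sketched as follows. This is a minimal illustration only: it uses an in-memory SQLite database instead of the foreign data wrapper the real script works with, and the table names (`att01` to `att28`), the column layout, and the helper `concat_tables` are hypothetical.

```python
import sqlite3


def concat_tables(conn, dest_table, source_tables):
    """Concatenate identically structured tables into dest_table."""
    first, *rest = source_tables
    # Create the destination with the columns and rows of the first partition.
    conn.execute(f'CREATE TABLE "{dest_table}" AS SELECT * FROM "{first}"')
    # Append the rows of every remaining partition.
    for name in rest:
        conn.execute(f'INSERT INTO "{dest_table}" SELECT * FROM "{name}"')
    conn.commit()


conn = sqlite3.connect(":memory:")

# Hypothetical partition names, one table per chapter of Acts.
partitions = [f"att{chapter:02d}" for chapter in range(1, 29)]
for chapter, name in enumerate(partitions, start=1):
    conn.execute(f'CREATE TABLE "{name}" (chapter INTEGER, hs TEXT, labez TEXT)')
    conn.execute(f'INSERT INTO "{name}" VALUES (?, ?, ?)', (chapter, "03", "a"))

concat_tables(conn, "att", partitions)
row_count = conn.execute("SELECT COUNT(*) FROM att").fetchone()[0]
chapter_count = conn.execute("SELECT COUNT(DISTINCT chapter) FROM att").fetchone()[0]
```

After the call, the single `att` table holds one row from each of the 28 partitions.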

copy_table_fdw(conn, dest_table, fdw, source_table)¶: Copy a table.

import_att_fdw(dbsrc, dbdest, parameters)¶

Import att and lac tables from mysql.

Import the (28 * 2) mysql tables to 2 tables in the postgres database.

import_genealogical_fdw(dbsrc, dbdest, parameters)¶

Import genealogical tables from mysql.

Import the (28 * 3) mysql tables to 3 tables in the postgres database.

This function is relevant only for Acts, where we had to import genealogical data from a previous implementation of the CBGM. It is not used for new projects.

import_nestle_fdw(dbsrc, dbdest, parameters)¶: Import Nestle table from mysql.

scripts.cceh.prepare ¶

Initialize a CBGM database.

This script converts the database structure used in the production of the ECM into a database structure suitable for doing CBGM. It

normalizes the databases,
removes manuscripts, passages and readings irrelevant to the CBGM,
builds a positive apparatus from the negative apparatus,
reconstructs the Majority Text (MT).

Database normalization is the usual process of restructuring your tables so they don’t contain redundant data.

The database must then be purged of all readings that are relevant only to the ECM and the Nestle-Aland but not to the CBGM, e.g. all passages without variants (about 2/3 of the New Testament), all corrections except those by the first hand, and readings that are clearly orthographic errors or that differ only by orthographic convention.

The script then transforms the negative apparatus into a positive apparatus, that is, an apparatus that is defined for all manuscripts at all passages.

Finally, the script reconstructs the Majority Text (MT).

After running this script you should run the scripts/cceh/cbgm.py script.

The starting point is the apparatus with all the information needed for the print edition. This data set must be processed for the CBGM. The source data represent a negative apparatus, i.e. the Greek manuscript witnesses that agree with the reconstructed initial text (Ausgangstext) are not listed explicitly. All witnesses that deviate from this text, or that have corrections or alternative readings, are listed. The goal is to obtain a positive apparatus. We need one record per first-hand Greek manuscript witness and variant passage (including lacunae). That is, for every variant passage there is explicit information on whether the manuscript follows the initial text, has a different text, or has no text at all, e.g. because the page is damaged. Corrections and alternative readings are ignored for the CBGM.

—ArbeitsablaufCBGMApg_Db.docx

build_MT_text(dba, parameters)¶

Reconstruct the Majority Text

Build a virtual manuscript that reconstructs the Majority Text (MT).

In the course of the textual history one text form prevailed, the so-called Majority Text, also called the Byzantine Text. This text form is represented by the seven manuscripts 1, 18, 35, 330, 398, 424 and 1241. For each variant passage we count and record how many of these seven manuscripts attest each reading. A reading counts as the majority reading if it

is attested by at least six of the representative manuscripts named above and at most one manuscript deviates, or

is attested by five representatives and two deviate with different readings.

—PreCo/PreCoActs/ActsMT2.pl
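The two-part rule quoted above can be sketched in a few lines. This is an illustration under stated assumptions, not the script's actual code: the function name `majority_reading` and the input shape (a dict from manuscript to labez) are hypothetical, and the sketch assumes all seven representatives have a reading at the passage, since the quoted rule does not say how lacunae are handled.

```python
from collections import Counter

# The seven representative manuscripts named in the quoted rule.
BYZ_REPRESENTATIVES = ("1", "18", "35", "330", "398", "424", "1241")


def majority_reading(readings):
    """Return the majority reading at one passage, or None if there is none.

    `readings` maps each representative manuscript to its labez
    (hypothetical input shape; all seven must be present).
    """
    counts = Counter(readings[ms] for ms in BYZ_REPRESENTATIVES)
    labez, support = counts.most_common(1)[0]
    if support >= 6:
        # Six or seven agree, so at most one manuscript deviates.
        return labez
    if support == 5 and len(counts) == 3:
        # Five agree and the two dissenters have two different readings.
        return labez
    return None


six_to_one = dict(zip(BYZ_REPRESENTATIVES, "aaaaaab"))
five_split = dict(zip(BYZ_REPRESENTATIVES, "aaaaabc"))
five_united = dict(zip(BYZ_REPRESENTATIVES, "aaaaabb"))
```

With these inputs, `six_to_one` and `five_split` both yield `'a'`, while `five_united` yields `None` because the two dissenting manuscripts agree with each other.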

copy_genealogical(dbdest, parameters)¶

Copy and fix genealogical data for Acts

This function is relevant only for Acts, where we had to import genealogical data from a previous implementation of the CBGM. It is not used for new projects.

copy_nestle(dbdest, parameters)¶: Make a working copy of the Nestle table.

copy_table(conn, source_table, dest_table, where='')¶: Make a copy of a table.

delete_corrector_hands(dba, parameters)¶

Delete later hands

Delete all corrections except those by the first hand.

Delete readings that do not stem from the first hand. […] Exception: for self-corrections, the *-reading is deleted and the C*-reading is kept.

—prepare4cbgm_6.py
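A minimal sketch of this filter, assuming a hypothetical encoding in which each entry carries a separate hand marker (`''` for no marker, `'*'` for the original hand, `'C*'` for a self-correction, `'C1'`, `'C2'`, ... for later correctors). The real script works on the sigla stored in the database, so the names and data shapes here are illustrative only.

```python
def keep_first_hand(entries):
    """Filter the entries of one passage down to first-hand readings.

    Each entry is a (manuscript, hand, labez) tuple using the
    hypothetical hand markers '', '*', 'C*', 'C1', 'C2', ...
    """
    # Manuscripts whose scribe corrected their own text at this passage.
    self_corrected = {ms for ms, hand, _ in entries if hand == "C*"}
    kept = []
    for ms, hand, labez in entries:
        if hand in ("", "C*"):
            kept.append((ms, hand, labez))
        elif hand == "*" and ms not in self_corrected:
            kept.append((ms, hand, labez))
        # Everything else (later correctors, superseded '*') is dropped.
    return kept


entries = [
    ("01", "*", "a"),   # superseded by the self-correction below
    ("01", "C*", "b"),
    ("03", "*", "a"),   # kept: no self-correction exists for 03
    ("03", "C2", "c"),  # later corrector, dropped
    ("04", "", "b"),
]
first_hand = keep_first_hand(entries)
```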

delete_invariant_passages(dba, parameters)¶

Delete invariant passages

A passage is invariant if all defined manuscripts offer the same (regularized) text. Invariant passages are irrelevant to the CBGM.

Delete passages where only one or more f- or o-readings deviate from the A text. There is no variant here.

Do not delete if a variant ‘b’ - ‘y’ appears at this variant passage.

Change 2014-12-16: Act 28,29/22 belongs to a Fehlvers. There may be no variant besides b there, only an orthographicum. So we no longer look for a variant ‘b’ to ‘y’ but count the variants instead. If getReadings returns only 1, there are no variants.

—prepare4cbgm_5.py
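The counting approach can be illustrated with a small predicate. This is a sketch under assumptions: the helper `is_invariant` is hypothetical, the readings are taken to be already regularized (errata and orthographica folded into their parent labez), and `'zz'` is taken to mark a manuscript that is not defined at the passage.

```python
def is_invariant(labez_by_ms):
    """Return True if all defined manuscripts offer the same reading.

    `labez_by_ms` maps manuscript to regularized labez; 'zz' is
    assumed to mean the manuscript has no text here.
    """
    defined = {labez for labez in labez_by_ms.values() if labez != "zz"}
    # Zero or one distinct reading means there is nothing to compare.
    return len(defined) <= 1


same_everywhere = {"01": "a", "03": "a", "04": "zz"}
with_variant = {"01": "a", "03": "b", "04": "a"}
```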

delete_lectionaries(dba, parameters)¶

Delete secondary lectionary readings

Delete all lectionary readings except L1.

If there are several lectionary readings, the L1 reading applies.

—prepare4cbgm_6.py

fill_apparatus_table(dba, parameters)¶

Fill the apparatus table with a positive apparatus.

The Att table contains a negative apparatus. A negative apparatus contains the text of the archetype (manuscript ‘A’), but lists other manuscripts only where they offer a different reading.

The Apparatus table contains a positive apparatus. A positive apparatus contains the text of all manuscripts at all passages.

Steps to transform the negative apparatus into a positive apparatus

Set all passages in all manuscripts to the reading ‘a’.
Overwrite all Fehlverse in all manuscripts with the reading ‘zu’.
Unroll the Lac table. Entries in the Lac table may span multiple passages. Overwrite every passage that is inside a lacuna with ‘zz’.
Overwrite with the readings from the negative apparatus in the Att table, if there is such a reading.

N.B. Readings in the negative apparatus do sometimes override lacunae.

In the result we have one entry in the apparatus for every manuscript and every passage.

See paper: Arbeitsablauf CBGM auf Datenbankebene, I. Vorbereitung der Datenbasis für CBGM
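The four steps can be sketched with plain dictionaries. This is an illustration only: the real script operates on database tables, and the function name, the integer passage ids, and the shapes of `fehlverse`, `lacunae` and `att` are hypothetical.

```python
def build_positive_apparatus(manuscripts, passages, fehlverse, lacunae, att):
    """Return {(manuscript, passage): labez} for every manuscript and passage."""
    # Step 1: every manuscript reads 'a' at every passage.
    apparatus = {(ms, p): "a" for ms in manuscripts for p in passages}

    # Step 2: Fehlverse get 'zu' in all manuscripts.
    for ms in manuscripts:
        for p in fehlverse:
            apparatus[(ms, p)] = "zu"

    # Step 3: unroll lacunae; one entry may span several passages.
    for ms, first, last in lacunae:
        for p in passages:
            if first <= p <= last:
                apparatus[(ms, p)] = "zz"

    # Step 4: explicit readings from the negative apparatus win,
    # even over a lacuna.
    for (ms, p), labez in att.items():
        apparatus[(ms, p)] = labez
    return apparatus


apparatus = build_positive_apparatus(
    manuscripts=["01", "03"],
    passages=[1, 2, 3, 4],
    fehlverse={4},
    lacunae=[("03", 2, 3)],
    att={("01", 1): "b", ("03", 3): "c"},
)
```

In this example manuscript 03 ends up with ‘zz’ at passage 2 but ‘c’ at passage 3, because the reading from the negative apparatus overrides the lacuna.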

fill_manuscripts_table(dba, parameters)¶: Create the Manuscripts and Ms_Ranges tables.

fill_ms_cliques_table(dba, parameters)¶: Create the ms_cliques table.

fill_passages_table(dba, parameters)¶: Create the Passages table.

fill_readings_table(dba, parameters)¶: Create the readings table.

mark_invariant_passages(dba, parameters)¶

Mark invariant passages

A passage is invariant if all defined manuscripts offer the same (regularized) text. Invariant passages are irrelevant to the CBGM.

We need to display the “Leitzeile”, so we cannot simply delete these passages. We mark them as invariant instead.

FIXME: this is moot since we get the Leitzeile from a different database altogether. OTOH Att never contained all passages anyway, so it was impossible to extract the Leitzeile out of it.

Delete passages where only one or more f- or o-readings deviate from the A text. There is no variant here.

Do not delete if a variant ‘b’ - ‘y’ appears at this variant passage.

Change 2014-12-16: Act 28,29/22 belongs to a Fehlvers. There may be no variant besides b there, only an orthographicum. So we no longer look for a variant ‘b’ to ‘y’ but count the variants instead. If getReadings returns only 1, there are no variants.

—prepare4cbgm_5.py

process_commentaries(dba, parameters)¶

Process commentaries

Commentaries often contain more than one reading of the same passage. If those readings are different we must degrade them to uncertain status.

Also promote the manuscript to ‘normal’ status by stripping the commentary suffix (in re_comm) from the hs.

20 May 2015. Commentary manuscripts like 307 cannot be treated like lectionaries where we choose the first text. If a T1 or T2 reading is found they have to be deleted. A new zw reading is created containing the old readings as suffix.

This has to be done as long as both witnesses are present.

If the counterpart of one entry belongs to the list of lacunae the witness will be treated as normal witness. The T notation can be deleted.

—prepare4cbgm_5b.py
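One way to picture the merge of several commentary readings at a single passage is the following sketch. It is an assumption-laden illustration: the helper `merge_commentary_readings`, the use of `'zz'` for a lacuna, and the `(labez, labezsuf)` return shape are hypothetical simplifications of what the script does on the database.

```python
def merge_commentary_readings(readings):
    """Merge the T1, T2, ... readings of one commentary manuscript.

    `readings` is a list of labez, with 'zz' assumed to mean lacuna.
    Returns a (labez, labezsuf) pair.
    """
    defined = sorted({labez for labez in readings if labez != "zz"})
    if not defined:
        return ("zz", "")       # no text in any counterpart
    if len(defined) == 1:
        return (defined[0], "")  # treat as a normal witness
    # Conflicting readings: degrade to an uncertain 'zw' reading that
    # carries the old readings as its suffix.
    return ("zw", "/".join(defined))
```

For example, readings `['a', 'b']` become `('zw', 'a/b')`, while `['a', 'zz']` becomes `('a', '')` because the counterpart is a lacuna.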

process_sigla(dba, parameters)¶

Process Sigla

Rewrite the manuscript sigla (hs) and delete all suffixes.

Manuscripts marked with a “V” for videtur are treated like all others. The “V” can therefore be removed. The entries for “original (*)” and “C*” are deleted as well, and finally also additions to the manuscript number such as “T1”. These entries are (so far) simply appended to the manuscript siglum.

The entry ‘videtur’, marked by a ‘V’ after the manuscript number, plays no role in the CBGM. Any ‘V’ present must be removed. The same applies to the entries ‘*’ and ‘C*’.

—prepare4cbgm_6b.py
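Since the suffixes are simply appended to the siglum, stripping them can be sketched with a regular expression. The pattern below is an assumption based only on the suffixes named in the quote (‘V’, ‘*’, ‘C*’, ‘T1’); the real script may recognize more or differently shaped suffixes, and `strip_siglum_suffixes` is a hypothetical name.

```python
import re

# Hypothetical pattern: any run of the suffixes named in the quote,
# anchored at the end of the siglum.
SUFFIXES = re.compile(r"(?:C\*|\*|V|T\d+)+$")


def strip_siglum_suffixes(hs):
    """Return the bare manuscript siglum without trailing suffixes."""
    return SUFFIXES.sub("", hs)


cleaned = [strip_siglum_suffixes(hs) for hs in ("307T1", "03*", "03C*", "P74V", "1739")]
```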

unroll_zw(dba, parameters)¶

Unroll ‘zw’ entries

When a reading cannot be classified under a labez with absolute certainty, the apparatus sets the labez to ‘zw’ (zweifelhaft, dubious) and offers a list of candidate labez in labezsuf. This list of candidates in labezsuf has to be normalized into multiple table records.

If \(N > 1\) labez candidates exist, the certainty of the unrolled records will be set to \(1 / N\).

But if all candidate labez differ only in their errata or orthographica suffix, as in ‘a/ao1/ao2’ or ‘b/b_f’, then we will output only one record with a certainty of 1.0.

Caveat: in Mark, labezsuf may contain a list of candidate labez even if labez is not ‘zw’. This is why we no longer look for ‘zw’ in labez but for ‘/’ in labezsuf.

Assign zw readings to the parent variant if only different readings of the same variant come into question (e.g. zw a/ao or b/bo_f). In these cases the letter of the parent variant replaces ‘zw’ in labez.

—prepare4cbgm_7.py
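The unrolling rule can be sketched as a pure function on the labezsuf string. Two assumptions are made here and are not confirmed by the text above: the parent labez is taken to be the first character of each candidate, and candidates that share a parent are collapsed before \(N\) is counted. The function name is reused for illustration and does not reproduce the script's database code.

```python
def unroll_zw(labezsuf):
    """Return a list of (labez, certainty) records for one entry.

    `labezsuf` is a '/'-separated list of candidates such as 'a/b'
    or 'a/ao1/ao2'.
    """
    parents = []
    for candidate in labezsuf.split("/"):
        if not candidate:
            continue  # ignore empty segments
        parent = candidate[0]  # assumption: single-letter parent labez
        if parent not in parents:
            parents.append(parent)
    if not parents:
        return []
    # One parent gives certainty 1.0, N parents give 1 / N each.
    certainty = 1.0 / len(parents)
    return [(parent, certainty) for parent in parents]
```

Under these assumptions ‘a/ao1/ao2’ and ‘b/b_f’ each produce a single record with certainty 1.0, and ‘a/b’ produces two records with certainty 0.5.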

scripts.cceh.cbgm ¶

Perform the CBGM.

This script

rebuilds the ‘A’ text from the local stemmas,
calculates the pre-coherence similarity of manuscripts, and
calculates the post-coherence ancestrality of manuscripts.

This script updates the tables shown in red in the overview. It also updates the Apparatus table where manuscript ‘A’ is concerned.

build_A_text(dba, parameters)¶

Build the ‘A’ text

The editors’ reconstruction of the archetype is recorded in the locstem table. This function generates a virtual manuscript ‘A’ from those choices.

The designation of a passage as ‘Fehlvers’ is an editorial decision that the verse is not original, so we set ‘zu’.

If the editors came to no final decision, no ‘original’ reading will be found in locstem. In this case we set ‘A’ to ‘zz’ and there will be a gap in the reconstructed text.

The Lesart of ‘A’ is always NULL, because it is a virtual manuscript.
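The decision logic can be sketched as follows. The data shapes are hypothetical: `original_readings` stands in for whatever the locstem table records as the editors' choice, and `fehlverse` for the set of passages designated as Fehlvers. Checking the Fehlvers case first is an assumption of this sketch.

```python
def build_a_text(passages, original_readings, fehlverse):
    """Return {passage: labez} for the virtual manuscript 'A'."""
    a_text = {}
    for passage in passages:
        if passage in fehlverse:
            # Editorial decision: the verse is not original.
            a_text[passage] = "zu"
        else:
            # No final decision by the editors leaves a gap ('zz').
            a_text[passage] = original_readings.get(passage, "zz")
    return a_text


a_text = build_a_text(
    passages=[1, 2, 3],
    original_readings={1: "b"},  # passage 2 is undecided
    fehlverse={3},
)
```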

scripts.cceh.save_edits ¶

Save the state of the editor tables.

This script saves the tables containing the editorial decisions. It does not save the apparatus tables.

scripts.cceh.load_edits ¶

Load a saved state of the editor tables.

This script loads the state of the tables with the editorial decisions as saved by the save_edits.py script. It does not touch the apparatus tables.

While loading, it checks for errors and discrepancies, e.g. different passage addresses. Passages in the apparatus that are not in the saved state are reset to the default: reading ‘a’ is original and every other reading is derived from ‘a’.
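The default can be sketched as a mapping from each reading to its source. The representation is hypothetical: a local stemma is modelled as `{labez: source_labez}`, with `None` standing for "original", and `restore_stemmas` is an illustrative name rather than a function of the script.

```python
def restore_stemmas(apparatus_readings, saved):
    """Return a local stemma for every passage in the apparatus.

    `apparatus_readings` maps passage to its list of labez;
    `saved` maps passage to a previously saved {labez: source} stemma.
    """
    stemmas = {}
    for passage, readings in apparatus_readings.items():
        if passage in saved:
            stemmas[passage] = saved[passage]
        else:
            # Default: 'a' is original, everything else derives from 'a'.
            stemmas[passage] = {
                labez: (None if labez == "a" else "a") for labez in readings
            }
    return stemmas


stemmas = restore_stemmas(
    apparatus_readings={1: ["a", "b", "c"], 2: ["a", "b"]},
    saved={2: {"a": "b", "b": None}},
)
```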

scripts.cceh.mk_users ¶

Initialize the database for user authentication and authorization.

Creates the tables and inserts the admin user.
