# Databanks management

## Directory structure housing the banks

As indicated in the section [Directory structure](/beedeem/installation/directory_structure.md), all databanks installed by *BeeDeeM* are located in the directory *${biobaseRootDir}*.

Let's take the example bank descriptors **PDB\_proteins**, **Uniprot\_SwissProt** and **Refseq\_Viruses**; these are '.dsc' suffixed files located in the `${installDir}/conf/descriptors` directory of *BeeDeeM*.

These banks will be installed into the directories described by their respective bank descriptors. If you open one of these descriptors (e.g. `${installDir}/conf/descriptors/PDB_proteins.dsc`) and look at the directive *db.ldir*, this is what you will see:

| Descriptor               | Installation instruction | Type | Bank name          |
| ------------------------ | ------------------------ | ---- | ------------------ |
| *PDB\_proteins.dsc*      | db.ldir=${mirrordir}     | p    | PDB\_proteins      |
| *Uniprot\_SwissProt.dsc* | db.ldir=${mirrordir}     | p    | Uniprot\_SwissProt |
| *Refseq\_Viruses.dsc*    | db.ldir=${mirrordir}     | n    | Refseq\_Viruses    |

All of these 'db.ldir', 'type' and 'bank name' directives translate into the following directory tree:

```
${biobaseRootDir}
  |
  |- n
  |    |
  |    |- Refseq_Viruses
  |
  |      --> Contains Refseq_Viruses in flat-file annotated format, 
  |          a Lucene index, a FASTA version and the BLAST databank.
  | 
  |- p
       |
       |- Uniprot_SwissProt
       |    
       |   --> Contains all data for the SwissProt databank
       |
       |- PDB_proteins
```

It is worth noting that the reserved descriptor keyword *${mirrordir}* translates to *${biobaseRootDir}* at runtime.

This example shows that two databases must NEVER be installed in the same directory. Otherwise, *BeeDeeM* may malfunction, particularly during updates: if two databases share a directory, you cannot update them individually.

To install a new database, it is best to comply with the tree structure given above.

For example, if you want to install the BLAST nucleotide database 'MyBank', install it in its own directory under:

```
${mirrordir}/n/MyBank
```

In the same way, if you want to install the flat-file protein database 'MyOtherBank', install it in its own directory under:

```
${mirrordir}/p/MyOtherBank
```
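The layout rule above (one directory per bank, grouped by type) can be checked before adding a new bank. The following is a minimal sketch, not part of *BeeDeeM*: the helper names `bank_dir` and `shared_dirs` and the root path `/data/banks` are assumptions made for illustration only.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def bank_dir(mirror_dir, bank_type, dir_name):
    """Return <mirror_dir>/<bank_type>/<dir_name>, following the tree shown above."""
    return PurePosixPath(mirror_dir) / bank_type / dir_name


def shared_dirs(mirror_dir, banks):
    """Return the directories claimed by more than one bank.

    `banks` is an iterable of (bank_name, bank_type, dir_name) tuples.
    The result maps each shared directory to the list of bank names using it.
    """
    claimed = defaultdict(list)
    for bank_name, bank_type, dir_name in banks:
        claimed[bank_dir(mirror_dir, bank_type, dir_name)].append(bank_name)
    return {path: names for path, names in claimed.items() if len(names) > 1}


# Hypothetical root directory standing in for ${mirrordir}.
banks = [
    ("Refseq_Viruses", "n", "Refseq_Viruses"),
    ("MyBank", "n", "MyBank"),
    ("MyOtherBank", "p", "MyOtherBank"),
]
print(bank_dir("/data/banks", "n", "MyBank"))
print(shared_dirs("/data/banks", banks))  # an empty dict means no directory is shared
```

An empty result means every bank has its own directory, so each one can be updated individually.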

## What to do after processing

When the installation of a sequence database is successful, it is automatically put into production for your users.

The “old” copy of the database is normally deleted, but this behaviour can be controlled using the 'history' parameter of the databank descriptor.

Let's take the example from the database descriptor "PDB\_proteins.dsc" located in `${installDir}/conf/descriptors`. Here, we are interested in two lines of this file:

```
db.name=PDB_proteins
db.ldir=${mirrordir}|p|PDB_proteins
```

The first line gives the name of the database to install and the second line specifies where it should be installed on your system. Therefore, these two lines tell *BeeDeeM* that the database [PDB](http://www.rcsb.org/pdb/home/home.do) (sequences only) will be installed in the directory `${mirrordir}|p|PDB_proteins` under the name PDB\_proteins.

When the processing of this database has succeeded, PDB\_proteins is installed into the following directory:

```
${mirrordir}/p/PDB_proteins/current/PDB_proteins
```
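To make the mapping from descriptor lines to the production directory concrete, here is a minimal sketch under stated assumptions. It is not *BeeDeeM* code: the functions `parse_descriptor` and `production_path` are hypothetical, the root `/data/banks` stands in for `${mirrordir}`, and treating lines starting with `#` as comments is an assumption about the '.dsc' format.

```python
def parse_descriptor(text):
    """Collect key=value pairs from descriptor text into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' lines carry no directive.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def production_path(props, mirror_dir):
    """Build <ldir>/current/<db.name>, with '|' in db.ldir read as a path separator."""
    ldir = props["db.ldir"].replace("${mirrordir}", mirror_dir)
    base = "/".join(ldir.split("|"))
    return f"{base}/current/{props['db.name']}"


descriptor = """\
db.name=PDB_proteins
db.ldir=${mirrordir}|p|PDB_proteins
"""
props = parse_descriptor(descriptor)
print(production_path(props, "/data/banks"))
```

With the two lines shown earlier, the sketch yields a path of the form `<root>/p/PDB_proteins/current/PDB_proteins`, matching the directory given above.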

Note the presence of the term *current* in the path: it identifies the PDB\_proteins database currently in production, i.e. the version of the database used by your users.

Now, let us suppose that you ask *BeeDeeM* to update this database on 26/02/2022 (26 February 2022). Throughout the processing of the PDB\_proteins database, *BeeDeeM* will work in a temporary directory named as follows:

```
${mirrordir}/p/PDB_proteins/download/PDB_proteins
```

Note that the sub-directory `download` is at the same level as the sub-directory `current`. This means that the PDB\_proteins database in production remains available alongside the version being installed.

If the installation procedure ends successfully and the descriptor parameter 'history' is set to '1', the directory of the previous release is renamed as follows:

```
${mirrordir}/p/PDB_proteins/current/PDB_proteins

becomes

${mirrordir}/p/PDB_proteins/currentOn20220226/PDB_proteins
```

In other words, the version that was in production moves out of production but is kept on disk. If 'history' is set to '0', the 'old' release of PDB\_proteins is automatically deleted instead.
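The naming of the archived directory can be illustrated with a short sketch. It only reproduces the `currentOn<YYYYMMDD>` pattern seen in the example above; the functions `archived_dir_name` and `old_release_fate` are hypothetical helpers, not *BeeDeeM* APIs, and handling of 'history' values other than '0' and '1' is not covered here.

```python
from datetime import date


def archived_dir_name(update_day):
    """Return the name given to the former 'current' directory, e.g. currentOn20220226."""
    return "currentOn" + update_day.strftime("%Y%m%d")


def old_release_fate(history, update_day):
    """Describe what happens to the previous release after a successful update.

    Only the two values discussed in the text ('1' and '0') are handled.
    """
    if history == "1":
        return "renamed to " + archived_dir_name(update_day)
    if history == "0":
        return "deleted"
    raise ValueError("history value not covered by this sketch: " + repr(history))


update_day = date(2022, 2, 26)
print(old_release_fate("1", update_day))
print(old_release_fate("0", update_day))
```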

If the installation procedure fails, these two directories are unchanged:

```
${mirrordir}/p/PDB_proteins/current/PDB_proteins

and

${mirrordir}/p/PDB_proteins/download/PDB_proteins
```

In this case, you should consult the *BeeDeeM* log files and search for the lines containing "WARN" (refer to [Control of execution](/beedeem/getting-started/install-banks.md#control-of-execution) for more details).

Each such "WARN" line contains the description of an error that occurred during processing. Once the problem has been identified, *BeeDeeM* can be restarted: see section [Restart after failure](/beedeem/getting-started/advanced-uses.md#restart-after-failure).
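Searching the log for "WARN" lines can be done with any text tool; the sketch below shows one way in Python. The function `warn_lines` and the log file name used in the usage example are assumptions for illustration: the actual location and name of the *BeeDeeM* log files depend on your installation.

```python
def warn_lines(log_file):
    """Return (line_number, text) pairs for every line of log_file containing 'WARN'."""
    hits = []
    # errors="replace" avoids stopping on bytes that are not valid UTF-8.
    with open(log_file, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if "WARN" in line:
                hits.append((number, line.rstrip("\n")))
    return hits


def print_warnings(log_file):
    """Print each WARN line prefixed with its line number."""
    for number, text in warn_lines(log_file):
        print(f"{number}: {text}")
```

For example, `print_warnings("beedeem.log")` would list the warnings of a log file named `beedeem.log` (a hypothetical name) in the current directory.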


