To Repository
Steps to Send to Repository
1. Clone and set an environment
A. Clone the Library repository to your computer. All the files of the repository will be copied to your computer so that you can run the code. For more information, see for example the GitLab documentation on cloning a repository.
B. Set up an environment: By default, the files and metadata will be sent to a testing instance of Nakala. If you want to send them to the production Nakala repository, you will need to provide your API key. To do that:
- Log in to Nakala (Nakala, Nakala Test), open your profile (top right-hand corner), generate a key if you have none, and copy the one you have.
- On your computer, at the root of the “library” folder, open the file env.yaml.
- Paste your Nakala key to replace the apiKey at the end of the file (“your-prod-api-key-here”), save the file and close it.
C. IMPORTANT: your Nakala API key must be kept secret. Make sure the env.yaml file is not shared with anyone or pushed to any public repository. To tell Git to ignore your local changes to this file when synchronizing with GitHub or GitLab, run the following command in a terminal from the “library” folder:
git update-index --assume-unchanged env.yaml
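Before moving on, it can help to confirm that the placeholder key has actually been replaced. The following is a minimal sketch, not part of the library: the helper name `api_key_is_set` is hypothetical, and it only checks for the placeholder text quoted above without parsing the YAML structure.

```python
from pathlib import Path

# Placeholder text quoted in the setup step above.
PLACEHOLDER = "your-prod-api-key-here"


def api_key_is_set(env_file: Path) -> bool:
    """Return True if the placeholder no longer appears in the file.

    Hypothetical helper: it does a plain text search and does not
    validate the key itself.
    """
    text = env_file.read_text(encoding="utf-8")
    return PLACEHOLDER not in text


if __name__ == "__main__":
    # Assumes the script is run from the root of the "library" folder.
    env_path = Path("env.yaml")
    if not env_path.exists():
        print("env.yaml not found; run this from the root of the library folder")
    elif api_key_is_set(env_path):
        print("env.yaml no longer contains the placeholder key")
    else:
        print("env.yaml still contains the placeholder; paste your key first")
```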
2. Scenario
- We have a set of files in a folder called XXX_data and we want to send them to the Nakala repository.
- We define subsets of files using splitters; each subset will be uploaded to the Nakala repository as one deposit.
- We want to associate rich metadata, possibly in multiple languages, to each deposit.
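To illustrate the idea of one deposit per subset, here is a rough sketch of grouping file names with a splitter. It is an assumption about what a splitter does (here, the part of the file name before a delimiter identifies the subset); the function name `group_by_splitter`, the `_` delimiter and the sample file names are all hypothetical, and the notebooks may define splitters differently.

```python
from collections import defaultdict


def group_by_splitter(filenames, splitter="_"):
    """Group file names by the part that precedes the first splitter.

    Illustrative only: each resulting group stands for one deposit.
    """
    groups = defaultdict(list)
    for name in sorted(filenames):
        key = name.split(splitter, 1)[0]
        groups[key].append(name)
    return dict(groups)


# Hypothetical file names, for illustration only.
files = ["letterA_p1.jpg", "letterA_p2.jpg", "letterB_p1.jpg"]
for deposit, members in group_by_splitter(files).items():
    print(deposit, members)
```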
By default, we add the XXX_data folder in a library/data/XXX folder. We can also use a folder outside library, but in that case we will need to modify the library/src/Edition/data_config.yaml file (i.e. specify a personal path and the characteristics and names of the files in the folder).
Identify the data folder that will be the target of all the notebooks: open library/src/Edition/data_config.yaml and modify selected_corpus on line 1; its value should be one of the entries listed in the corpus: section below it.
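The consistency rule above (selected_corpus must match an entry of the corpus: section) can be sketched as a small check. This is an illustration under assumptions: the configuration is represented as an already-loaded dictionary, the corpus: section is assumed to be a mapping keyed by corpus name (or a plain list of names), and `check_selected_corpus` is a hypothetical helper, not a function of the library.

```python
def check_selected_corpus(config: dict) -> str:
    """Return the selected corpus name, or raise if it is not declared.

    `config` stands for the parsed content of data_config.yaml; its exact
    layout is an assumption made for this sketch.
    """
    selected = config.get("selected_corpus")
    # Works whether corpus: is a mapping (keys are names) or a list of names.
    available = list(config.get("corpus") or {})
    if selected not in available:
        raise ValueError(
            f"selected_corpus={selected!r} is not one of {available}"
        )
    return selected


# Hypothetical configuration content, for illustration only.
example = {
    "selected_corpus": "XXX",
    "corpus": {"XXX": {"path": "data/XXX"}, "YYY": {"path": "data/YYY"}},
}
print(check_selected_corpus(example))
```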
3. Prepare metadata and list of files
3.1. Manage users and collections (optional)
3.1.1 Users
Open the Jupyter notebook library/src/3_ToRepository/Process_Users.ipynb and follow the documentation it contains: execute the first three code cells, up to “Select test or prod environment”. The default environment is test; set prod to True to switch to production (prod = False for test, prod = True for prod).
3.1.2. Collections
Open the Jupyter notebook library/src/3_ToRepository/Process_Collections.ipynb and follow the documentation it contains: execute the first three code cells, up to “Select test or prod environment”. The default environment is test; set prod to True to switch to production (prod = False for test, prod = True for prod).
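The effect of the prod flag in these notebooks can be pictured with a short sketch. It does not reproduce the notebooks' code: `select_environment` is a hypothetical function, and the two API addresses are assumptions that should be checked against the Nakala documentation.

```python
def select_environment(prod: bool = False) -> dict:
    """Return the settings for the test or production environment.

    Hypothetical helper; the URLs below are assumptions, not values
    taken from the notebooks.
    """
    if prod:
        return {"name": "prod", "api_url": "https://api.nakala.fr"}
    return {"name": "test", "api_url": "https://apitest.nakala.fr"}


prod = False  # default: test environment; set to True for production
environment = select_environment(prod)
print(f"Using the {environment['name']} environment")
```

Keeping test as the default means that forgetting to set the flag never writes to the production repository.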
3.2. Document metadata
A. Go to the 1_Metadata folder, open yaml_form.html in a browser. The metadata can be documented from scratch, or you can use a preexisting .yaml file. For more information on the nature of the metadata, see specific documentation. Save your work by clicking on the Generate YAML button.
B. Once the XXX_metadata.yaml for the XXX set of files we want to upload is generated on your computer, move it to the folder library/data/XXX where you also need to store the files of the XXX set.
C. Open the jupyter notebook library/src/1_Metadata/1_Create_Metadata_csv.ipynb. Run all cells. The output is the file XXX_metadata.csv in the XXX corpus folder.
D. Modify the file as needed (all titles are the same in the automatically generated file). Save the modified file as XXX_metadata_ok.csv in the XXX corpus folder.
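Since the generated file gives every deposit the same title, a quick check for repeated titles in the edited file may be useful before uploading. This is a minimal sketch under assumptions: the column name `title`, the comma delimiter and the helper `repeated_titles` are hypothetical and may not match the actual layout of the CSV.

```python
import csv
from collections import Counter
from pathlib import Path


def repeated_titles(csv_path: Path, title_column: str = "title") -> dict:
    """Return the titles that appear more than once, with their counts.

    The column name is an assumption; adjust it to the real header.
    """
    with csv_path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    counts = Counter(row.get(title_column, "") for row in rows)
    return {title: n for title, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Path pattern taken from the steps above; XXX is the corpus name.
    path = Path("data/XXX/XXX_metadata_ok.csv")
    if path.exists():
        print(repeated_titles(path))
    else:
        print(f"{path} not found")
```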
4. Send files to Nakala
A. Open the Jupyter notebook library/src/3_ToRepository/rich_nakala_uploader.ipynb.
B. By default, the files will be sent to the testing instance of Nakala. It is recommended to test first.
If the tests are conclusive, to send files to the production instance of Nakala, set prod to True in the second cell.
C. Run all cells.
DONE!