The Scopus_download job is triggered by an incoming email. It uses scopus_update_email_parser.py to scan the email for a URL; once the URL is found, the script opens it and saves the downloaded content to a specified directory. This is part of an automated process that uses Jenkins to start the update whenever an email containing a URL link is received.
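The email-to-download step can be sketched roughly as follows. This is a minimal illustration that assumes a plain-text or HTML email body and an unauthenticated HTTP(S) link; it is not the code in scopus_update_email_parser.py, and the names `find_first_url`, `download`, and `URL_PATTERN` are hypothetical.

```python
import email
import re
import shutil
import urllib.request
from email import policy
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse

# Hypothetical pattern: any http(s) link up to the next whitespace or quote.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def find_first_url(raw_email: str) -> Optional[str]:
    """Return the first URL found in the email body, or None if there is none."""
    msg = email.message_from_string(raw_email, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    text = body.get_content() if body is not None else ""
    match = URL_PATTERN.search(text)
    # Strip punctuation that commonly trails a link at the end of a sentence.
    return match.group(0).rstrip(".,;)") if match else None


def download(url: str, target_dir: str) -> Path:
    """Save the resource behind `url` into `target_dir` and return the file path."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # Assumption: the last path segment of the URL is a usable file name.
    name = Path(urlparse(url).path).name or "download.bin"
    destination = target / name
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination
```

A Jenkins job could pass the raw email text to `find_first_url` and, if a link is returned, hand it to `download` together with the working directory used by the next job.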
The Scopus_update job is triggered by the Scopus_download job. It uses load.sh and process_pub_zips.sh to extract all publication ZIP files from the specified working directory to a temporary directory, process the extracted ZIP files one by one, and parse all XML files and update the database in parallel.
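The extract-then-parse flow described above can be illustrated with a short Python sketch. The actual job uses the shell scripts load.sh and process_pub_zips.sh; the version below is only an assumption-laden approximation in which the database update is replaced by returning each XML file's root tag, and the names `parse_xml`, `process_zip`, and `process_all` are hypothetical.

```python
import tempfile
import zipfile
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List, Tuple


def parse_xml(path: Path) -> Tuple[str, str]:
    """Parse one XML file; a real job would update the database here."""
    root = ET.parse(path).getroot()
    return path.name, root.tag


def process_zip(zip_path: Path, workers: int = 4) -> List[Tuple[str, str]]:
    """Extract one ZIP into a temporary directory and parse its XML files in parallel."""
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        xml_files = sorted(Path(tmp).rglob("*.xml"))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # list() forces all parsing to finish before the temp dir is removed.
            return list(pool.map(parse_xml, xml_files))


def process_all(working_dir: str, workers: int = 4) -> List[Tuple[str, str]]:
    """Process every ZIP in the working directory, one archive at a time."""
    results: List[Tuple[str, str]] = []
    for zip_path in sorted(Path(working_dir).glob("*.zip")):
        results.extend(process_zip(zip_path, workers))
    return results
```

Archives are handled sequentially while the files inside each archive are parsed concurrently, which mirrors the "one-by-one, then parallel" split in the description; the worker count is an arbitrary default.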
There are several customizable command-line parameters available for these scripts. These options can be found in the documentation within the scripts or by using the "help" option -h while executing them.
Tables
The updated Scopus data is stored in 22 tables in our Postgres database, and all tables can be linked through primary and foreign keys. Detailed information about each table is in scopus_tables.sql.
Entity-Relationship Diagram
Refer to the entity-relationship diagram (ERD) below for all the tables and corresponding columns created by the scripts above: