Git to AWS S3 System
Revision as of 16:06, 3 August 2022
Introduction
Customers may want to isolate the actual repositories from Lingoport's products, especially for Localyzer. One option to do so is to push files to AWS S3 from the repositories and let Localyzer access only S3.
If customers decide on this option, we recommend automating the process from Git to S3 to Localyzer to the TMS and back. One of the keys here is to automate the transfer of the desired files (typically resource files such as .properties or .json) from Git to S3 and from S3 to Git, and then to on-board the Localyzer project using S3 as the data source for the resource files.
To do so, the following will be needed:
- Git/S3 System: one Linux system will host the bash scripts that automate the transfer from Git to S3 and back, most likely inside the customer's network.
- Lingoport System: The system hosting Jenkins, Dashboard, etc., connected to S3 and the TMS
- A dedicated S3 bucket: The S3 bucket will have two main top directories:
- a to_localyzer top level directory: Under this directory will be a directory per Git repository and branch. This is where the configured files (.properties, .resx, .json, etc.) coming from each Git repository will be retrieved by Localyzer for analysis and sending to translation.
- a from_localyzer top level directory: Under this directory will be a directory per Git repository and branch, created by Localyzer, with translated files. These files will be picked up by the Git/S3 System scripts and pushed to the repositories.
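As an illustration of the bucket layout described above, the following shell sketch builds a per-repository, per-branch prefix. The helper name s3_prefix and the exact directory naming (repository name, then branch) are assumptions made for this example; the actual names are decided by the git_to_s3 scripts and by Localyzer.

```shell
# Hypothetical helper: build the S3 prefix for one repository and branch.
# Assumed layout: <bucket url>/<direction>/<repository name>/<branch>/
s3_prefix() (
    bucket_url=${1%/}          # e.g. s3://my-bucket (trailing slash removed)
    direction=$2               # to_localyzer or from_localyzer
    repo_url=${3%/}            # e.g. https://github.com/mycompany/myrepo
    branch=$4
    repo_name=${repo_url##*/}  # last path component of the Git URL
    repo_name=${repo_name%.git}
    printf '%s/%s/%s/%s/\n' "$bucket_url" "$direction" "$repo_name" "$branch"
)

# Usage:
#   s3_prefix s3://my-bucket to_localyzer https://github.com/mycompany/myrepo main
```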
AWS Installation and Configuration
This section applies to both:
- Git/S3 System: Note: The disk size should be based on the total size of the files in the repositories to be on-boarded.
- Lingoport System:
On both systems, the AWS S3 client needs to be installed with the proper credentials.
On the Git/S3 System, the scripts need to be downloaded and set up with Cron with a frequency to be decided by the customer.
On the Lingoport System, the project will be on-boarded using S3 as the VCS method.
Install AWS Client V2
On the Unix box, install AWS Client (Version 2). To do so, follow the AWS installation documentation.
A quick reference for Linux:
$ whoami # should be root, or a user with 'sudo' access
$ curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
$ unzip awscliv2.zip
$ sudo ./aws/install
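After installation, it can help to confirm that the client is on the PATH before moving on. The sketch below is one way to do that; the helper name require_cmd is hypothetical and not part of the Lingoport scripts.

```shell
# Return success if the given command is available on PATH,
# otherwise print a message on stderr and return failure.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "Missing required command: $1" >&2
    return 1
}

# Usage after installation:
#   require_cmd aws && aws --version
```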
AWS User
The system authenticates to S3 by providing an AWS aws_access_key_id and the associated aws_secret_access_key.
The most common approach is to create a service account to provide these credentials.
See the associated AWS documentation for details.
The provided account must have permissions to read, download from, and write to, the associated AWS S3 bucket.
Storing the AWS Credentials
As the 'jenkins' user on the target system, create /var/lib/jenkins/.aws (~/.aws as 'jenkins'), along with a .aws/config and .aws/credentials.
Examples:
$ whoami
jenkins
$ mkdir -p ~/.aws
$ : # Substitute your region for us-east-1 as needed:
$ cat <<EOF >> ~/.aws/config
[default]
region=us-east-1
output=json
EOF
$ : # Fill in the aws_access_key_id and aws_secret_access_key per your organization's AWS service account:
$ cat <<EOF >> ~/.aws/credentials
[default]
aws_access_key_id=<access key id associated with read+write access to the target S3 bucket per your Org>
aws_secret_access_key=<secret access key associated with the aws_access_key_id above>
notes="S3 Read+Write access for <your Org>"
EOF
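Because the credentials file holds a secret key, it is reasonable to restrict it to the owning user. This is a minimal sketch and not a step required by the original instructions; the helper name secure_aws_dir is hypothetical.

```shell
# Restrict an AWS configuration directory and its files to the owner.
# $1: directory to secure (defaults to ~/.aws)
secure_aws_dir() (
    dir=${1:-"$HOME/.aws"}
    chmod 700 "$dir" || exit 1
    for f in "$dir/config" "$dir/credentials"; do
        if [ -f "$f" ]; then
            chmod 600 "$f" || exit 1
        fi
    done
)

# Usage, as the 'jenkins' user:
#   secure_aws_dir
#   aws configure list   # shows which region and credentials are picked up
```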
Test
Make sure you can read, download from, and write to the target S3 bucket. From the system, try running:
echo "Testing view access:"
aws s3 ls s3://<your bucket>/<optional path>
echo "Testing write access:"
echo "Write me." > test.txt
aws s3 cp test.txt s3://<your bucket>/<optional path>/test.txt # --sse AES256 # <--- uncomment that if encryption is required and your org uses the default AES256 encryption. Or replace with other settings as needed.
echo "Testing download access:"
rm test.txt # remove it so that you have to get it back from s3
aws s3 cp s3://<your bucket>/<optional path>/test.txt .
ls # You should see test.txt
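The same round trip can be wrapped in a function that stops at the first failing step and compares the downloaded content with what was uploaded. This is a sketch under stated assumptions: the function name s3_round_trip and the object name s3_round_trip_test.txt are hypothetical, and the test object is left in the bucket because delete permission is not among the permissions listed above.

```shell
# Round-trip check against an S3 location.
# $1: S3 URL prefix, e.g. s3://<your bucket>/<optional path>
# Remaining arguments are passed to the upload, e.g. --sse AES256
s3_round_trip() (
    target=${1%/}/s3_round_trip_test.txt   # hypothetical test object name
    shift
    workdir=$(mktemp -d) || exit 1
    echo "Write me." > "$workdir/up.txt"
    status=0
    # Upload, download again, then compare the two local copies.
    aws s3 cp "$workdir/up.txt" "$target" "$@" &&
        aws s3 cp "$target" "$workdir/down.txt" &&
        cmp -s "$workdir/up.txt" "$workdir/down.txt" || status=1
    rm -rf "$workdir"
    exit "$status"
)

# Usage:
#   s3_round_trip "s3://my-bucket/optionalsubdirs" --sse AES256 && echo "S3 access OK"
```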
Git to S3 Installation and Configuration
In order to set up the automation from Git to S3 and back, make sure you have the git_to_s3.zip file. If you do not have it, please contact support (at) lingoport dot com.
Git Access
In addition to the access to AWS S3 set up in the previous section, this system needs to be able to clone, pull, add, commit, and push files to the Git repositories of interest. There are many ways to do so.
For example, see the Lingoport Wiki Git page.
Verification
Make sure everything is set up correctly by cloning a project of interest for Localyzer. For instance:
git clone https://github.com/mycompany/myrepo
Installation
- Unzip the git_to_s3.zip file in a directory accessible by Cron jobs.
This should result in the following set of files:
- git_to_s3/scripts: where the bash scripts reside to select and transfer files to and from git/S3. Make sure the .sh files are executable. If not, run chmod +x *.sh.
- git_to_s3/config: this is where the git repository, branches, file types, and optionally directories are set up
- git_to_s3/logs: this is where the log files will end up
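The expected layout can be checked with a short script after unzipping. This is a minimal sketch; the function name check_git_to_s3_layout is hypothetical, and it only verifies the three directories listed above and marks the .sh files as executable.

```shell
# Check the unzipped git_to_s3 directory and make its scripts executable.
# $1: path to the git_to_s3 directory
check_git_to_s3_layout() (
    root=$1
    for sub in scripts config logs; do
        if [ ! -d "$root/$sub" ]; then
            echo "Missing directory: $root/$sub" >&2
            exit 1
        fi
    done
    for script in "$root"/scripts/*.sh; do
        [ -f "$script" ] || continue   # pattern matched no .sh files
        chmod +x "$script" || exit 1
    done
)

# Usage:
#   check_git_to_s3_layout /path/to/git_to_s3
```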
Configuration
Follow the README.md instructions.
The main files to configure are:
- s3config.properties: Set the S3 bucket and to_localyzer/from_localyzer directories
S3_TO_LOCALYZER=s3://<S3 URL>/to_localyzer/
S3_FROM_LOCALYZER=s3://<S3 URL>/from_localyzer/
- repositories.txt: Set the Git URL, branch, and optionally the directories to include, one per line, in the following format:
https://<giturl>/<organization>/<repository> <branch> <optionally, comma separated list of include dirs>
For instance:
https://github.com/lingoport-public/Rebel-Outfitters Payments
- gitProjectLocation.txt: Set the directory where the Git repositories will be cloned and where the files selected for S3 are placed before being pushed. It is a single line containing the directory name. For instance:
/var/lib/s3data
- fileSuffixes.txt: Set the resource file extensions so only those files are copied to the S3 bucket. For instance:
.properties
.json
.resx
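To make the configuration formats above concrete, the sketch below splits one repositories.txt line into its fields and lists the files under a directory that match a set of suffixes. It only illustrates the formats; the function names parse_repo_line and list_resource_files are hypothetical, and the actual git_to_s3 scripts may process these files differently.

```shell
# Split one repositories.txt line into URL, branch, and include directories.
# Prints the three fields on separate lines; the third is empty when omitted.
parse_repo_line() (
    set -f            # do not expand wildcards while splitting the line
    set -- $1         # split on whitespace: <url> <branch> [<dir,dir,...>]
    [ $# -ge 2 ] || exit 1
    printf '%s\n%s\n%s\n' "$1" "$2" "${3:-}"
)

# List files under a directory whose names end with one of the suffixes.
# $1: directory; remaining arguments: suffixes such as .properties .json
list_resource_files() (
    dir=$1
    shift
    for suffix in "$@"; do
        find "$dir" -type f -name "*$suffix"
    done
)

# Usage:
#   parse_repo_line "https://github.com/lingoport-public/Rebel-Outfitters Payments"
#   list_resource_files /var/lib/s3data .properties .json .resx
```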
Project Config
First, on-board your project, but set the VCS details to 'None'. Then run the associated Jenkins job once (this pre-populates several directories). Note: The Jenkins job is expected to fail as the setup is not yet complete.
Then, edit the file:
/var/lib/jenkins/Lingoport_Data/L10nStreamlining/<your group>/projects/<your project>/config/config_vcs.properties
Add the following:
...
VCS_TYPE=S3
...
S3_BUCKET_URL=s3://my-bucket/optionalsubdirs
...
If your bucket uses AES256 encryption, add the following at the end:
S3_OPTS=--sse AES256
Otherwise, leave blank:
S3_OPTS=
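If the edit needs to be scripted rather than done by hand, a small helper can set each key. This is a sketch only: set_property is a hypothetical function, it assumes keys contain no regular-expression special characters, and it moves an existing key to the end of the file.

```shell
# Set key=value in a properties file, replacing an existing line for the key
# or appending one if the key is absent.
# $1: file, $2: key, $3: value
set_property() (
    file=$1
    key=$2
    value=$3
    [ -f "$file" ] || exit 1
    tmp=$(mktemp) || exit 1
    # Copy every line except an existing assignment for this key.
    # grep exits non-zero when nothing is copied, which is not an error here.
    grep -v "^$key=" "$file" > "$tmp" || :
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"      # overwrite in place to keep owner and permissions
    rm -f "$tmp"
)

# Usage (the path is elided here; use the config_vcs.properties path above):
#   set_property "$CONFIG_FILE" VCS_TYPE S3
#   set_property "$CONFIG_FILE" S3_BUCKET_URL s3://my-bucket/optionalsubdirs
#   set_property "$CONFIG_FILE" S3_OPTS "--sse AES256"
```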