EC2 Tutorials¶
This documentation is modified from a workshop on Amazon Web Services, offered at UC Davis (and broadcast online) on March 7, 2016. The workshop page is here.
Start an Amazon Web Services computer:¶
This page shows you how to create a new “AWS instance”, or a running computer.
0. Introduction¶
Why would you use cloud computing?
- Your computer does not have enough resources to run the desired analysis (memory, processors, disk space, network bandwidth).
- You want to produce results faster than your computer can.
- You cannot install software on your computer (the application does not support your operating system, or it conflicts with other existing applications).
- You need dynamic resources, e.g., you only need a high-memory machine for a week rather than a whole year.
- You don’t want to manage the infrastructure of an HPC, or you don’t have access to one.
Start at the Amazon Web Services console EC2 launch wizard. You’ll need to sign in to EC2.
1. Switch to region US East (N. Virginia) if not already there¶

2. Click on “Launch instance.”¶

3. Select “Community AMIs.”¶

4. Search for ami-002f0f6a (ubuntu-wily-15.10-amd64-server)¶
Use ami-002f0f6a.
5. Click on “Select.”¶
6. Choose m4.large.¶

7. Click “Review and Launch.”¶
8. Click “Launch.”¶

9. Select “Create a new key pair.”¶
Note: you only need to do this the first time you create an instance. If you know where your amazon-key.pem file is, you can select ‘Use an existing key pair’ here. But you can always create a new key pair if you want, too.
If you have an existing key pair, go to step 12, “Launch instance.”

10. Enter name ‘amazon-key’.¶
11. Click “Download key pair.”¶
12. Click “Launch instance.”¶
13. Select View instances (lower right)¶

14. Bask in the glory of your running instance¶
Note that for your instance name you can use either “Public IP” or “Public DNS”. Here, the machine only has a public IP.

You can now Log into your instance with the UNIX shell or Configure your instance firewall.
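The console steps above can also be approximated from the command line. The following is a minimal sketch, assuming the AWS CLI is installed and configured with your credentials and that the key pair already exists (none of which is covered on this page). It only prints the command so you can review it before running it yourself. The helper name print_launch_command is made up for illustration; the AMI ID, instance type, and key name are the ones used in the steps above, and us-east-1 is the identifier for US East (N. Virginia).

```shell
# Print (do not run) an AWS CLI command that mirrors the console steps above.
# Assumes the AWS CLI is installed and configured; print_launch_command is a
# hypothetical helper name, not part of any tool.
print_launch_command() {
    ami="$1"; itype="$2"; key="$3"
    echo "aws ec2 run-instances --region us-east-1 --image-id $ami --instance-type $itype --key-name $key --count 1"
}

# Values taken from steps 4, 6, and 10 of this page.
print_launch_command ami-002f0f6a m4.large amazon-key
```

Copy the printed line into your terminal once you are happy with it.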
Log into your instance with the UNIX shell¶
You will need the amazon-key.pem file that was downloaded in step 11 of booting up your new instance (see Start an Amazon Web Services computer:).
Then, you can either Log into your instance from a Mac or Linux machine or Log into your instance from a Windows machine.
Log into your instance via the UNIX shell (Mac/Linux)¶
See: Log into your instance from a Mac or Linux machine
Log into your instance via MobaXTerm (Windows)¶
See: Log into your instance from a Windows machine
Logging in is the starting point for most of the follow-on tutorials. For example, you can now install and run software on your EC2 instance.
Go back to the top page to continue: EC2 Tutorials
Log into your instance from a Mac or Linux machine¶
You’ll need to do two things: first, set the permissions on amazon-key.pem:
chmod og-rwx ~/Downloads/amazon-key.pem
Then, ssh into your new machine using your key:
ssh -i ~/Downloads/amazon-key.pem ubuntu@MACHINE_NAME
where you should replace MACHINE_NAME with the public IP or hostname of your EC2 instance, which is located at the top of the host information box (see screenshot below). It should be something like 54.183.148.114 or ec2-XXX-YYY.amazonaws.com.
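The chmod step matters because ssh refuses to use a private key that other users can read. The following is a small sketch of how you might check a key file's permissions before connecting; key_is_private is a made-up helper name, and the demonstration uses a throwaway temporary file rather than your real key.

```shell
# Check that a private key file is accessible only by its owner, which is
# what `chmod og-rwx` ensures. key_is_private is a hypothetical helper name.
key_is_private() {
    # Characters 5-10 of `ls -l` output are the group and other permission bits.
    perms=$(ls -l "$1" | cut -c5-10)
    [ "$perms" = "------" ]
}

# Demonstrate on a throwaway file rather than a real key.
demo_key=$(mktemp)
chmod 644 "$demo_key"
key_is_private "$demo_key" || echo "too open: ssh would refuse this key"
chmod og-rwx "$demo_key"
key_is_private "$demo_key" && echo "ok: only the owner can access the key"
rm -f "$demo_key"
```

For your own key you would call it as key_is_private ~/Downloads/amazon-key.pem.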
Here are some screenshots!
Change permissions and execute ssh¶

Successful login¶

Host information box - MACHINE_NAME location¶

Log into your instance from a Windows machine¶
Follow the instructions at this URL:
https://angus.readthedocs.org/en/2015/amazon/log-in-with-mobaxterm-win.html
Configure your instance firewall¶
Normally, Amazon computers only allow shell logins via ssh (port 22 access). If we want to run a Web service or something else, we need to give the outside world access to other network locations on the computer.
Below, we will open port 8787, which will let us run RStudio Server, along with ports 80 (HTTP) and 443 (HTTPS). If you want to run other services, you’ll need to find the port(s) associated with those services and open those as well. (Tip: Web servers run on port 80.)
1. Select ‘Security Groups’¶
Find “Security Groups” in the lower pane of your instance’s information page, and click on “launch-wizard-N”.

2. Select ‘Inbound’¶

3. Select ‘Edit’¶

4. Select ‘Add Rule’¶

5. Enter rule information¶
Add a new rule: Custom TCP, 8787, Source Anywhere.
Add a new rule: HTTP, 80, Source Anywhere.
Add a new rule: HTTPS, 443, Source Anywhere.
6. Select ‘Save’.¶
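The same three inbound rules can be expressed with the AWS CLI. This is a sketch only, assuming the AWS CLI is installed and configured; it prints the commands rather than running them. The security group ID shown is a placeholder (use the ID of your own launch-wizard-N group), and print_ingress_rules is a made-up helper name.

```shell
# Print (do not run) AWS CLI commands equivalent to the three inbound rules
# above. The group ID is a placeholder; print_ingress_rules is a hypothetical
# helper name. Assumes the AWS CLI is installed and configured.
print_ingress_rules() {
    group_id="$1"
    shift
    for port in "$@"; do
        echo "aws ec2 authorize-security-group-ingress --group-id $group_id --protocol tcp --port $port --cidr 0.0.0.0/0"
    done
}

# 8787 = RStudio Server, 80 = HTTP, 443 = HTTPS; 0.0.0.0/0 means "Anywhere".
print_ingress_rules sg-PLACEHOLDER 8787 80 443
```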
Running RStudio Server in the cloud¶
In this section, we will run RStudio Server on a remote Amazon machine. This will require starting up an instance, configuring its network firewall, and installing and running some software.
Reference documentation for running RStudio Server on Ubuntu:
1. Start up an Amazon instance¶
Start an Ubuntu image using ami-05ed6813 on an m4.xlarge machine, as per the instructions here:
2. Configure your network firewall¶
Normally, Amazon computers only allow shell logins via ssh. Since we want to run a Web service, we need to give the outside world access to other network locations on the computer.
Follow these instructions:
Configure your instance firewall
(You can do this while the computer is booting.)
You’ll also want to update your DNS support and ensure that both DNS resolution and DNS hostnames are set to “yes” by following these instructions.
3. Log in via the shell¶
Follow these instructions to log in via the shell:
4. Install R and the RStudio tool¶
Type the following commands:
sudo docker pull rocker/tidyverse
sudo docker run -d -p 8787:8787 rocker/tidyverse
This will take a few minutes.
Upon success, you should see a long string of letters and digits printed out (the ID of the new container).
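When run with -d, docker run prints the ID of the container it started, which is a 64-character hexadecimal string. If you script this step, a quick sanity check on that output can tell you whether the container started. The sketch below uses a made-up helper name (looks_like_container_id) and a made-up sample ID.

```shell
# `docker run -d` prints the new container's ID, a 64-character hexadecimal
# string. looks_like_container_id is a hypothetical helper that checks a
# string has that shape.
looks_like_container_id() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{64}$'
}

# Made-up ID for illustration; on the instance you would capture the real one:
#   id=$(sudo docker run -d -p 8787:8787 rocker/tidyverse)
sample_id=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
looks_like_container_id "$sample_id" && echo "container started"
looks_like_container_id "docker: error" || echo "something went wrong"
```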
5. Open your RStudio Server instance¶
Finally, go to ‘http://’ + your IPv4 public hostname + ‘:8787’ in a browser, e.g.,
http://XX.XXX.XXX.XXX:8787/
and log into RStudio with the username ‘rstudio’ and the password ‘rstudio’.
Voila!
You can now just go ahead and use this, or you can “stop” it, or you can freeze it into an AMI for later use.
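Note that the container started above has no restart policy, so after a reboot of the instance you will need to start it again (for example, with ‘sudo docker start’ followed by the container ID). Files saved inside the container are kept as long as the container itself is not removed.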
Go back to the index: EC2 Tutorials.
Transfer data to and from an EC2 instance using Filezilla¶
You will need the amazon-key.pem file that was downloaded in step 11 of booting up your new instance (see Start an Amazon Web Services computer:).
Download Filezilla¶
Download the FTP application FileZilla. Note: one optional step in my install asked whether I wanted Yahoo to be my default browser, and I chose “NO”.
Open FileZilla¶
Near the top of the screen, you will need to provide the following information: host, username, and port. In place of a password, you will point FileZilla at your *.pem EC2 key file.
Password:
To let FileZilla know where your key file is, assign it through FileZilla –> Settings –> SFTP –> Add key file –> select your *.pem file.
Host:
Your host name is the public DNS of your EC2 instance, e.g., ec2-52-32-45-44.us-west-2.compute.amazonaws.com
Username:
Your username is ubuntu
Port:
By default, the port for SFTP is 22.
Once this is filled in, you can press the Quickconnect button and you will see files that are in your /home/ubuntu directory on your server. You may now move files to and fro.
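If you prefer the command line, the same transfers can be done with scp, which uses the same key, username, and port as SFTP. The sketch below only prints the commands so you can inspect them first. It assumes scp is installed locally (it ships with OpenSSH); the helper names and the example file names data.csv and results.txt are made up for illustration.

```shell
# Print (do not run) scp commands for copying files to and from the instance.
# Assumes scp is installed locally; the helper names and example file names
# are hypothetical.
KEY="$HOME/Downloads/amazon-key.pem"

print_upload() {   # print_upload LOCAL_FILE HOST
    echo "scp -i $KEY $1 ubuntu@$2:/home/ubuntu/"
}

print_download() { # print_download REMOTE_FILE HOST
    echo "scp -i $KEY ubuntu@$2:/home/ubuntu/$1 ."
}

print_upload data.csv ec2-52-32-45-44.us-west-2.compute.amazonaws.com
print_download results.txt ec2-52-32-45-44.us-west-2.compute.amazonaws.com
```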
Working with persistent storage: volumes and snapshots¶
Volumes are basically UNIX disks (“block devices”) that will persist after you terminate your instance. They are tied to a zone within a region and can only be mounted on instances within that zone.
Snapshots are an Amazon-specific feature that lets you share data on volumes between accounts. They are “read-only” backups that are created from volumes; they can be used to create new volumes in turn, and can also be shared with specific people (or made public). Snapshots are tied to a region but not a zone.
Creating persistent volumes to store data¶
0. Locate your instance zone¶

1. Click on the volumes tab¶

2. ‘Create Volume’¶

3. Configure your volume to have the same zone as your instance¶

4. Wait for your volume to be available¶

5. Select volume, Actions, Attach volume¶

6. Select instance, attachment point, and Attach¶
Here, your attachment point will be ‘/dev/sdf’ and your block device will be named ‘/dev/xvdf’.
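The relationship between the attachment point chosen in the console and the device name seen on the instance can be written down as a simple substitution. This is a sketch that assumes the naming described above (sd becomes xvd); other instance types may name devices differently, so check with lsblk. The helper name attachment_to_block_device is made up.

```shell
# Map an EC2 attachment point such as /dev/sdf to the device name the
# instance shows, /dev/xvdf, as described above. This naming is an assumption
# that holds for the setup in this tutorial; attachment_to_block_device is a
# hypothetical helper name.
attachment_to_block_device() {
    echo "$1" | sed 's|^/dev/sd|/dev/xvd|'
}

attachment_to_block_device /dev/sdf
```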

7. On your instance, list block devices¶
Type:
lsblk
You should see something like this:
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda    202:0    0    8G  0 disk
`-xvda1 202:1    0    8G  0 part /
xvdf    202:80   0  100G  0 disk
Now format the disk (ONLY ON EMPTY DISKS - THIS WILL ERASE ANY DATA ON THE DISK):
sudo mkfs -t ext4 /dev/xvdf
and mount the disk:
sudo mkdir /disk
sudo mount /dev/xvdf /disk
sudo chmod a+rwxt /disk
and voila, anything you put on /disk will be on the volume that you allocated!
The command ‘df -h’ will show you what disks are actually mounted & where.
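Because formatting erases a disk, it helps to double-check which device is the newly attached one before running mkfs. The sketch below filters lsblk output for disks that have neither partitions nor a mount point. It reads the listing on standard input so it can be tried on sample text; unused_disks is a made-up helper name. A disk it reports is only a candidate: it may still hold an unmounted filesystem with data on it.

```shell
# List disks that have no partitions and no mount point, i.e. candidates for
# the newly attached volume. Reads `lsblk -rn -o NAME,TYPE,MOUNTPOINT` style
# output on stdin; unused_disks is a hypothetical helper name.
unused_disks() {
    awk '
        { name[NR] = $1; kind[NR] = $2; mnt[NR] = $3 }
        END {
            for (i = 1; i <= NR; i++) {
                if (kind[i] != "disk" || mnt[i] != "") continue
                has_child = 0
                # A partition name starts with the name of its parent disk.
                for (j = 1; j <= NR; j++)
                    if (j != i && index(name[j], name[i]) == 1) has_child = 1
                if (!has_child) print name[i]
            }
        }'
}

# Sample input modeled on the listing above; on the instance you would run:
#   lsblk -rn -o NAME,TYPE,MOUNTPOINT | unused_disks
printf '%s\n' 'xvda disk' 'xvda1 part /' 'xvdf disk' | unused_disks
```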
Detaching volumes¶
1. Unmount it from the instance¶
Change out of the directory, stop any running programs using it, and then:
sudo umount /disk
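Before detaching the volume in the console, you may want to confirm that the unmount actually took effect. One way, sketched below, is to look for the directory in the kernel's mount table. The helper name is_mounted is made up, and the demonstration uses a small sample file in /proc/mounts format instead of the real table.

```shell
# Report whether a directory is currently a mount point by scanning a mounts
# table in /proc/mounts format (device, mount point, ...). is_mounted is a
# hypothetical helper; the second argument defaults to /proc/mounts on Linux.
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Demonstrate on sample data rather than the live mount table.
sample=$(mktemp)
printf '%s\n' '/dev/xvda1 / ext4 rw 0 0' '/dev/xvdf /disk ext4 rw 0 0' > "$sample"
is_mounted /disk "$sample" && echo "/disk is still mounted"
is_mounted /data "$sample" || echo "/data is not mounted"
rm -f "$sample"
```

On the instance, calling is_mounted /disk with no second argument checks the live table.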
2. Select volume, Actions, Detach volume¶
3. Yes, detach.¶

Note, volumes remain attached when you reboot or stop an instance, but are (of course) detached when you terminate an instance.
Terminating your instance¶
Amazon will happily charge you for running instances and/or associated ephemeral storage until the cows come home - it’s your responsibility to turn things off. The Right Way to do this for running instances is to terminate.
The caveat here is that everything ephemeral will be deleted (excluding volumes that you created/attached). So you want to make sure you transfer off anything you care about.
To terminate:
1. Select Actions, Instance State, Terminate¶
In the ‘Instances’ tab, select your instance and then go to the Actions menu.

2. Agree to terminate.¶

3. Verify status on your instance page.¶
Instance state should be either “shutting down” or “terminated”.
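Termination and the follow-up status check can also be done with the AWS CLI. The sketch below prints the two commands rather than running them, since termination cannot be undone. It assumes the AWS CLI is installed and configured; the instance ID is a placeholder and print_terminate_commands is a made-up helper name.

```shell
# Print (do not run) AWS CLI commands to terminate an instance and then query
# its state. The instance ID is a placeholder; print_terminate_commands is a
# hypothetical helper name. Assumes the AWS CLI is installed and configured.
print_terminate_commands() {
    echo "aws ec2 terminate-instances --instance-ids $1"
    echo "aws ec2 describe-instances --instance-ids $1 --query 'Reservations[].Instances[].State.Name' --output text"
}

print_terminate_commands i-PLACEHOLDER
```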

Return to index: EC2 Tutorials
Things to mention and discuss¶
When do disks go away?¶
- never on reboot;
- ephemeral disks go away on stop;
- AMI-attached volumes go away on terminate;
- attached volumes never go away on terminate and have to be explicitly deleted;
- snapshots only go away when you explicitly delete them.
What are you charged for?¶
- you are charged for a running instance at the instance price rates;
- ephemeral storage/instance-specific storage is included within that.
- when you stop an instance, you are charged at disk-space rates for the stopped disk;
- when you create a volume, you are charged for that volume until you delete it;
- when you create a snapshot, you are charged for that snapshot until you delete it.
To make sure you’re not getting charged, go to your Instance view and clear all search filters; anything that is “running” or “stopped” is costing you. Also check your volumes and your snapshots - they should be empty.
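To get a feel for how these charges add up, the arithmetic can be sketched as instance-hours times an hourly rate, plus storage size times a storage rate times duration. Every rate in the example below is a made-up placeholder, not a real AWS price, and the sketch assumes storage is billed per GB per month; look up current pricing for your region. The helper name estimate_cost is hypothetical.

```shell
# Rough cost arithmetic for the charges listed above. All rates are
# placeholders passed in by the caller, NOT real AWS prices. Assumes storage
# is billed per GB per month. estimate_cost is a hypothetical helper name.
estimate_cost() {  # estimate_cost HOURS HOURLY_RATE VOLUME_GB GB_MONTH_RATE MONTHS
    awk -v h="$1" -v r="$2" -v gb="$3" -v g="$4" -v m="$5" \
        'BEGIN { printf "%.2f\n", h * r + gb * g * m }'
}

# Example: 40 instance-hours at a made-up 0.10 per hour, plus a 100 GB volume
# kept for 2 months at a made-up 0.05 per GB-month.
estimate_cost 40 0.10 100 0.05 2
```

With these placeholder numbers the volume costs more than the instance, which is why forgotten volumes and snapshots are worth checking for.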
Regions vs zones:¶
- AMIs and Snapshots (and keys and security groups) are per region;
- Volumes and instances are per zone.
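A zone name is its region name with a letter appended, so the region a volume or instance belongs to can be read off its zone. The sketch below assumes the standard naming pattern (for example, zone us-east-1a in region us-east-1); zones that follow other naming schemes are not handled. The helper name az_to_region is made up.

```shell
# Derive the region from an availability zone name by dropping the trailing
# letter. Assumes the standard "region + letter" naming; az_to_region is a
# hypothetical helper name.
az_to_region() {
    echo "${1%[a-z]}"
}

az_to_region us-east-1a
```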