imperialWicket

am i the only croquet-playing computer nerd?

« AWS: Install Erlang/OTP on Amazon Linux AWS: Install PostgreSQL on Amazon Linux (quick and dirty) »

AWS: Install PostgreSQL 9.0 on Amazon Linux

2011-06-3

[UPDATE: Since writing this, Amazon has updated their repos (several times) for Amazon Linux. You can still use pgrpms and get the latest and greatest PostgreSQL, and that's a fine technique. However, I want to point out that you can use: "sudo yum install postgresql-9.1 postgresql-server-9.1" to install PostgreSQL 9.1 directly from the Amazon Linux repos.]

This post is a follow up to the Quick and dirty PostgreSQL installation on Amazon Linux. Use the previous article for quick installations; this post details an installation technique with which I would be more comfortable in a production environment. This post also details installation from pgrpms, instead of the Amazon repos. The rpm installation is not as bleeding edge as building from source, but pgrpms stays quite current, and this avoids manual dealings with dependencies (which is nice).

This production environment will have limited scalability, and the purpose of this post is not to detail each and every aspect of configuring the environment. I am going to focus on getting the database server installed and running, and allowing reasonably secure access. I am not going to cover backups; I am not going to go into details about PostgreSQL roles; I am not going to cover PostgreSQL tuning. These are all things to consider, and after going through this installation walk-through, you will be in a reasonable position to consider them.

The high level steps will be:

  1. Launch an EC2 instance
  2. Launch an EBS volume and mount it for our database cluster
  3. Add the pgrpms repository and disable PostgreSQL in the Amazon repositories
  4. Install and configure PostgreSQL access, ports, and listeners
  5. Start the PostgreSQL server
  6. Create Users for external access
  7. PgAdmin III usage

Launch an EC2 instance

This is a straight-forward EC2 instance running 64 bit Amazon Linux. Points of note are that you probably should avoid Micro instances for a production database server, and be sure that you set an inbound rule for PostgreSQL in your security group.

  1. Login, navigate to EC2 tab. Use the "Launch Instance" button to get started.
  2. Choose "Basic 64-bit Amazon Linux AMI"
  3. Generate a single instance, availability zone probably does not matter, and I am going to run with Micro (Micro is likely sufficient for proof of concept, but I would not recommend using it in a production server).
  4. Default "Advanced Instance Options" should be fine.
  5. Give your instance a meaningful value for the "Name" key. This is a good habit.
  6. Create/Use a key pair for which you have the private key.
  7. Create a new security group - I like to dedicate security groups to server types. If you already have a 'PostgreSQL' server, use that security group. Avoid using the 'basic' security group for all of your servers and simply adding firewall rules whenever necessary.
  8. Security group config (Note the PostgreSQL port!!!)
    • Group name: postgresql
    • Group Description: PostgreSQL database server security group
    • Inbound Rules: SSH - 22 from 0.0.0.0/0 and PostgreSQL 5434 from 0.0.0.0/0
  9. Launch the instance.
  10. Connect with something like:
    ssh -i .ssh/myKey.pem ec2-user@ec2-11-11-11-111.compute-1.amazonaws.com

Launch an EBS volume and mount

We are going to create a dedicated EBS volume for our database cluster. This provides a few benefits for our database server. We can create periodic snapshots of the database without affecting availability of our EC2 instance. We can then use those snapshots to launch new EBS volumes (allowing us to easily increase the size of the database volume, if necessary). Alternatively, we can shut down the EC2 instance, create a new EC2, and point it at the original database volume.

  1. In the AWS Management Console, select the Elastic Block Store "Volumes" option from the Navigation area of the EC2 tab.
  2. Select the "Create Volume" button
  3. Choose a size appropriate to your database needs
  4. Be sure the Availability Zone matches the zone where your root volume is located, and create the volume
  5. Select the new volume in your EBS Volumes list, and select the "Attach Volume" button.
  6. Choose your instance and assign a device ("/dev/sdb" will likely work)
  7. Connect to your EC2 instance, and execute ("/dev/sdb" must match the assigned device, and I am going to mount at "/pgdata")
    sudo su -
    yes | mkfs -t ext3 /dev/sdb
    mkdir /pgdata
    mount /dev/sdb /pgdata
    exit
    
    I am not certain why 'yes' is helpful/required here, I am going with Amazon's recommendation. I have not investigated, but if anyone can tell me how piping 'yes' to mkfs helps, I would love to know.

Update the yum repositories

I want to install the latest stable postgresql from pgrpms.org. We could just download the rpm and manually install from the file, but that inevitably results in some dependency issues. I prefer to configure an alternate yum repository for a particular keyword. So we need to update the configuration for the Amazon repositories (be sure to update both "main" and "updates", and do not forget the asterisk).

sudo vim /etc/yum.repos.d/amzn-main.repo
[At the bottom of the "[amzn-main]" section, after "enabled=1", add "exclude=postgresql*"]
sudo vim /etc/yum.repos.d/amzn-updates.repo
[Add the same exclude to the bottom of the "[amzn-updates]" section]

Download the repository/key installation rpm from pgrpms.org

wget http://yum.pgrpms.org/reporpms/9.0/pgdg-redhat-9.0-2.noarch.rpm
sudo rpm -ivh pgdg-redhat-9.0-2.noarch.rpm

Since this rpm is generated for RHEL6, we need to make a minor change to the resulting /etc/yum.repos.d/pgdg-90-redhat.repo file. Update the URLs, replacing the $releaseserver value with '6'. The Amazon Linux $releaseserver value is a date, instead of the primary version number that this repository configuration is expecting.

sudo vim /etc/yum.repos.d/pgdg-90-redhat.repo

The updated base url values should look like this (update two of them):

## ORIGINAL
# baseurl=http://yum.pgrpms.org/9.0/redhat/rhel-$releasever-$basearch
## UPDATED
baseurl=http://yum.pgrpms.org/9.0/redhat/rhel-6-$basearch

All we have done is told yum that it should use the amzn repositories for everything except packages that meet the "postgresql*" criteria. After that, install a new repository configuration for the pgrpms.org repository.

Install and configure PostgreSQL 9.0 on Amazon Linux

After updating the yum repository configurations, "yum install postgresql*" should provide us with the latest postgresql* packages from pgrmps.org. Notice that a few dependencies will come from the amazon repositories, but most of the pertinent postgresql* packages are coming from pgrpms. It is extremely likely that you do not need all of these packages. Limit the installation however you feel is appropriate.

I usually install something like the following:

sudo yum install postgresql90 postgresql90-contrib postgresql90-devel postgresql90-server 

Now we need to initialize the database cluster, edit the configuration and start the server. First remove the /pgdata/lost+found directory. PostgreSQL's initdb will fail to initialize a database cluster in /pgdata/ when there are files/directories present. Then we will change ownership of the /pgdata directory to the postgres user and group, and change to the postgres user. As the postgres user, we can configure and launch the server.

## Be careful with this
sudo rm -rf /pgdata/lost+found
sudo chown -R postgres:postgres /pgdata
sudo su -
su postgres -
/usr/pgsql-9.0/bin/initdb -D /pgdata

Edit the postgresql.conf file (be sure you are still using the postgres user):

vim /pgdata/postgresql.conf

Update the lines:

#listen_addresses = 'localhost' ...
#port = 5432 ...

To read:

listen_addresses = '*' ...
port = 5434

Edit the pg_hba.conf file (be sure you are still using the postgres user):

vim /pgdata/pg_hba.conf

Update the bottom of the file to read:

# TYPE  DATABASE        USER            CIDR-ADDRESS            METHOD

# "local" is for Unix domain socket connections only
local   all             postgres                                trust
# IPv4 local connections:
host    all             power_user      0.0.0.0/0               md5  
host    all             other_user      0.0.0.0/0               md5 
# IPv6 local connections:
host    all             all             ::1/128                 md5

Start the server:

/usr/pgsql-9.0/bin/pg_ctl start -D /pgdata

Create users for external access

Create the power_user as a superuser:

/usr/pgsql-9.0/bin/createuser power_user
Shall the new role be a superuser? (y/n) y

Create the other_user as a non-superuser, who can not create databases or roles:

/usr/pgsql-9.0/bin/createuser other_user
Shall the new role be a superuser? (y/n) n
Shall the new role be allowed to create databases? (y/n) n
Shall the new role be allowed to create more new roles? (y/n) n

Connect to the database as postgres, and set the new user passwords (Be sure you are still logged in as postgres). Also create a database for the other_user:

/usr/pgsql-9.0/bin/psql -p 5434
postgres=# ALTER USER power_user WITH PASSWORD 'aVerySecurePassword';
postgres=# ALTER USER other_user WITH PASSWORD 'anEquallySecurePassword';
postgres=# CREATE DATABASE other_user WITH OWNER other_user;

PgAdmin III usage

We can download the latest PgAdmin from the same repositories for Linux, or get the latest from pgadmin.org for Win/OS X. Be sure that you download a PgAdmin version that is compatible with the 9.0 release.

After installing and launching PgAdmin, use the plug button in the top left to configure a new server connection. Enter a name to identify your server, the host of your EC2 instance, and change the port to 5434. Connect to the Maintenance DB "postgres" as "power_user" with the password you configured.

Notice that you can also connect as "other_user", but other_user can only see the "other_user" database schema (and not the postgres data). Also notice that you can not connect remotely as the postgres user.

Final thoughts

Now you have a postgresql 9.0 installation on an AWS EC2 instance running Amazon Linux.

A couple of additional notes:

  1. It is a good idea to set a password for the postgres user, and change the local connection from trust to md5 in your pg_hba.conf file.
  2. It is also a good idea to restrict access to the power_user and other_user to a distinct IP or group of IPs. Alter the "0.0.0.0/0" value for each user in pg_hba.conf to match the appropriate IP or group.
  3. This service will not start if the instance is restarted. Run "chkconfig postgresql-9.0 on" (as root) to make it start automatically.

Let me know if I skipped any steps, or if anyone has ideas to make this implementation better. I hope this helps a few people get PostgreSQL 9.0 up and running, and keeps the connections as secure as possible. Feel free to post questions in the comments, I will do my best to assist.

27 Responses to AWS: Install PostgreSQL 9.0 on Amazon Linux

Feed for this Entry

26 Comments

  • Hi, is missing an "s" in this line:
    /usr/pgql-9.0/bin/pg_ctl start -D /pgdata

    Line Corrected:
    /usr/pgsql-9.0/bin/pg_ctl start -D /pgdata

    Thanks

    #1530 | Comment by Douglas Tondo on Jun 15, 2011 04:15pm
  • @DT - Thanks for the correction, I updated in the article.

    Also, DT highlighted some of the pros/cons of using an alternate port for your database server. While you are less likely to encounter brute force attacks and random server hits, it is entirely possible that confusion may result from a database server listening on a non-standard port. Most PostgreSQL utilities allow custom ports, but all of them (that I have used) will default to '5432'. Consider carefully whether an alternate port is worthwhile for your database server.

  • So I need to have to EBS? One added as a root device, and another one, for the database? My ubuntu AMI is installing postgres 8.4 - is it a big deal?

    #2672 | Comment by sad_tuba on Jul 30, 2011 06:46am
  • two*

    #2673 | Comment by sad_tuba on Jul 30, 2011 06:47am
  • @sad_tuba - Strictly speaking, no. If you check out the quick and dirty postgresql installation, it's all installed on the root device.

    The benefits you gain by having the postgresql data directory on a separate EBS are primarily that you can grow it more easily, and that you can share it with other AMIs more easily. It all depends on your needs with the database (up time, backups, expansion, data security, etc.). If you just need a working database, and size/uptime, etc. are not critical, I recommend using the quick and dirty installation.

    The big update from 8.4 to 9.0 has to do with native replication. I am still using 8.4 in a lot of production and test areas, as the 9.0 updates are non-critical for my purposes. If you are not searching for particular functionality, and just need a great relational database, I am sure PostgreSQL 8.4 will be more than adequate.

  • Any plans to update this document for PostgreSQL 9.1?

    #3667 | Comment by sean on Oct 12, 2011 05:36pm
  • It's in my ever-growing queue of topics, but it's behind a few things at the moment. If anyone else gets here and wants it, most of the walk through is valid if you simply replace all the '90' references with '91'.

  • Hi,

    thanks very much with this article, I'm new with Linux AND aws AND postgres so it was really helpful. However I run into a little problem maybe you can help me. Yesterday I've done all the steps and it was working. But today when I've tried to connect with pgadmin3 I get a "server is not listening" error. When I ssh to instance and "ps aux | grep postgre" I can see that postgres is not running. Maybe this is due to the fact that I've run postgresql server and then terminated ssh session. Shouldn't I run that command with & at the end or "nohup"? That is my working theory for now but what confuses me is that when I try to start server again it says that "postgres cannot access the server configuration file /pgdata/postgresql.conf" When I checked it that partiotion is completely empty???

    #5773 | Comment by Nemanja on Dec 11, 2011 12:54pm
  • ok, i've found out that pg_ctl is starting server in the background so that is not the issue. Somehow postgres server is down and /pgdata is empty. I have no idea what happend.

    #5774 | Comment by Nemanja on Dec 11, 2011 01:12pm
  • @Nemanja - Is it possible that the server was restarted?

    If so, "sudo chkconfig postgresql-9.0 on" will ensure that the service starts each time the server is restarted. Other than that, it will likely require some investigation in the logs.

  • I didn't restarted it, but i have created ami from it maybe that procedure restarts the server? But that wouldn't explain why is /pgdata completely empty. Where should i check for logs I have started server by your instructions without specifinig log file...

    #5776 | Comment by Nemanja on Dec 11, 2011 05:57pm
  • Generating an ami (and taking a snapshot) will restart the instance, if the postgres service wasn't configured to start automatically on system restart, that is likely the issue with the postgres service being down.

    Another thing, I didn't detail steps to automate the mounting of the EBS at restart (you could do this with /etc/fstab). The reason pgdata is empty is likely because your EBS volume is not mounted.

    For now, remounting the EBS volume and then restarting postgres will likely get you back to functioning. Adding the chkconfig update and updating fstab to mount that volume on restart will be good updates.

  • :) Now it's all clear to me.

    First of all, I didn't know that creating an AMI will restart my server. That explains why postgres is down. Secondly, I didn't know that partition won't be automatically mounted. I didn't figure that it wasn't mounted because there still was a /pgdata folder. The last thing I didn't know is that when you unmount a partition that folder is still there.

    So now my partition is not mounted, I've restarted the postgres which created all the files on system EBS in folder /pgdata so now my data is not on a partiotion I need it to be.

    But I've lerned a lot, thank you nicholas. I'm a less of a noob than two days a go. :)

    #5778 | Comment by Nemanja on Dec 12, 2011 06:23am
  • Glad it's back up and you are taking something positive out of this endeavor.

    If you don't have a lot of data in the db, and you want to move back to your external EBS volume, just shut down postgres, rm/mv the existing /pgdata folder, recreate the folder and mount your external EBS volume, and restart postgres. If this is just a trial server and is not mission critical, it's probably fine to run off the root EBS volume.

    Remember to delete your EBS volume in the AWS Management Console if you aren't using it. They don't cost very much, but you'll continue to see charges for it.

  • Hi,

    This is a great article - thanks for the post. One quick question - I'm coming from Debian, so I know I must be doing something stupid.

    I changed my port to 5435 in postgresql.conf and started the server. When I try to create the superuser, I get:

    createuser: could not connect to database postgres: could not connect to server: No such file or directory
    Is the server running locally and accepting connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

    Any idea why it's still listening on 5432 even though I changed it to 5435?

    #5790 | Comment by Jason on Dec 21, 2011 03:39pm
  • Great article mate! Thanks!

    #5791 | Comment by Golly on Dec 21, 2011 04:36pm
  • @Jason - Ttry 'createuser -p 5435 USERNAME' to explicitly point createuser at the correct port. By default, all of the postgresql binaries will try to connect to localhost:5432, as the current user, unless you tell them otherwise.

    I don't have an instance up right now to confirm, but I think you identified an issue in my steps. It looks like some of my steps involve changing the port, and a couple of steps were using a standard 5432.

    As soon as I confirm this, I'll update the post body to add that port to the command explicitly, thanks for catching the issue and commenting!

  • If you're interested, to auto-mount the EBS volume, I added the following line to /etc/fstab:

    /dev/xvdf /pgdata auto defaults,noatime 0 0

    Amazon instances seem to like xvdf instead of /sdb. Also, I read the noatime argument will increase performance by reducing logs of disk access time.

    The only thing I'm still having trouble with is getting it to start automatically. Running chkconfig does set the postgres state to on, but when I try to run psql, I get errors it's not running. Haven't had a chance to check the logs yet, but I imagine it has something to so with the custom data location. (Did this with no custom port defined.)

    #5794 | Comment by Jason on Dec 22, 2011 03:18pm
  • Hello, thanks for taking time to share your experience with amazon linux and postgresql 9. Good luck!

    #7552 | Comment by Rene Romero Benavides on Jan 17, 2012 04:26pm
  • Great!

    You are chaning password for same user twice:
    /usr/pgsql-9.0/bin/psql -p 5434
    postgres=# ALTER USER power_user WITH PASSWORD 'aVerySecurePassword';
    postgres=# ALTER USER power_user WITH PASSWORD 'anEquallySecurePassword';
    postgres=# CREATE DATABASE other_user WITH OWNER other_user;

    #10502 | Comment by JH on Sep 22, 2012 07:35am
  • @JH - Haha! It doesn't do all that much good to change one user's password twice in a row, does it?
    Thanks for catching that, it's corrected in the body of the post.

  • Thank you!

    I've only got one problem. When stopping my EC2 instance and restarting it the following message appears when trying to start my postgres:

    -bash-4.1$ /usr/pgsql-9.0/bin/pg_ctl start -D /pgdata
    server starting
    postgres cannot access the server configuration file "/pgdata/postgresql.conf": No such file or directory

    Seems like it doesn't save my files.
    How can I solve that problem?

    #10649 | Comment by iasonp on Oct 15, 2012 06:18am
  • Inability for postgres to access the server configuration file means that either the file isn't there, or the permissions are keeping the 'postgres' user from accessing the file.

    ls -latr /pgdata/postgresql.conf

    It seems like probably root or some other user owns the configuration file?

  • Hmm, I think I've solved the problem by not mounting the pgdata folder to a new ebs volume. I just kept it on the initial 8 gb volume and it is still there after stopping

    Maybe the mount command doesn't work for me or I need to make any additional settings to the new ebs volume.
    Or maybe the micro account doesn't allow more than one ebs?

    #10651 | Comment by iasonp on Oct 15, 2012 08:40am
  • Ah, sorry, I forgot there's no fstab data here. Check this previous comment.

    After restart, you'll need an entry in fstab to automatically mount the separate volume.

  • I'll need this soon! :)

    Thanks!

    #10691 | Comment by Neil on Oct 30, 2012 12:55pm

About You

Email address is not published

Add to the Discussion

Search