In this article, we will learn to define the various ways to set up a PostgreSQL database from scratch. For a data scientist, being able to query relational database systems is a must-have skill. SQL or Structured Query Language allows you to do this very efficiently. SQL not only lets you ask relevant questions about the data but also lets you play with the data in different ways.
Practically no real-world application is possible, without databases. Hence, the knowledge of databases and being able to deal with them are necessary parts of a data scientist’s toolbox. A database is simply an assortment of data or information that is typically stored and accessed on some type of computer system and can be handled or organized in diverse ways. Databases can be very simple, utilized for just storing data, or more complicated forms are used for modelling and design purposes. It will most often consist of some sorts of tables that make storing data more effective.
The PostgreSQL object-relational database system is designed to create a database and tables to manage data effectively. This tutorial will provide step-by-step instructions on how to build a database in Postgres so that you can explain it well in your next PostgreSQL Interview.
PostgreSQL is an object-relational open-source system for database management based on Postgres 4.2. It supports most of the SQL standards and provides a range of modern features. PostgreSQL and other relational database management systems utilise databases and tables to structure and organize their data. We can study the definition of those two terms quickly:
- databases: It segregates different sets of structures and data from one another
- tables: Describes the data structure and store the actual data values within databases
In PostgreSQL, there is also an intermediary object between tables and databases termed as the schema:
- schema: a namespace within a database that consists of tables, views, indexes, and other items.
This article won’t deal directly with PostgreSQL’s notion of a schema, but it’s good to know it’s there. Rather, we’ll be focusing on how to build and destroy PostgreSQL databases and tables.
What is a Postgres Database?
PostgreSQL is a light-weight and open-source RDBMS. The industry has well accepted it. You can learn more about PostgreSQL from its official website. Postgres has been a stable database product and is developing continuously for the past two decades. A lot of enterprise-scale applications, be it mobile or web, utilise Postgres Database as a backend relational database management system. Some of the most popular tech stacks where Postgres is mostly used are as follows.
- LAPP stack – One of the most popular web development technologies, LAPP stack comes with a combination of Apache, Linux, Postgres, PHP/Perl
- OLTP databases/Transactional Database systems – It is utilised as a general-purpose database system that supports real-time high throughout transactional workloads
- Analytical Databases / OLAP – Postgres is also commonly used to support analytical workloads in enterprises
- Geospatial Analyses – Postgres is also employed in the analysis of Geospatial data because of its support for the PostGIS plugin. PostGIS is a community-driven plugin built specifically for the Postgres Database to work with location analytics.
Postgres supports a long list of database features, including different enterprise features. Besides standard relational database features, some of the most prominent features in Postgres are:
- Streaming replication
- User-defined objects such as functions, operators, and data types
- Table inheritance
- Nested transactions
- Different unusual data types, such as IP addresses, Money, Geometry, JSON, and data ranges.
- Can execute stored procedures in over a dozen programming languages, including Java, Perl, Python, Ruby, and C/C++.
Features of PostgreSQL
The features offered by PostgreSQL are:
- PostgreSQL implemented the multi-version concurrency control (MVCC) feature at the very first. In Oracle, the multi-version concurrency control (MVCC) feature is known as the snapshot isolation.
- One of the advanced open-source databases, PostgreSQL provides different benefits for your organisation or business over other database systems. PostgreSQL is an object-relational database management system.
- Users can take part in the PostgreSQL community to post and share knowledge, bugs and doubts. Users can build or suggest the latest modules to its community since it is an open-source database management system.
- PostgreSQL has a broad range of communities owing to its own global community and different kinds of people, the development possibility is very high.
- PostgreSQL supports various programming languages such as Python, C/C++, Perl, Open Database Connectivity (ODBC), Ruby, Tcl, etc, allowing us to create different custom functions.
- SQL functions called ‘Stored Procedure’ can be utilised for a server environment
- PostgreSQL supports the below-mentioned standard procedural languages:
- Using a graphical installer such as Postgres.app or BigSQL
- Employing a package manager to install via the command line.
You can choose whichever option is convenient for you. Let’s see how to install on the command line.
1. Getting Homebrew
We will utilise Homebrew, a package manager, to install PostgreSQL on the command line. If you already have it installed, you can move to the part about installing PostgreSQL. Otherwise, let’s learn about it:
Let’s begin by using the below-mentioned command in our command line:
/usr/bin/ruby -e “$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)”
The Ruby interpreter comes pre-installed as long as you have MacOSX 10.7 or higher installed. The rest of this command (the curl command) is a tool which makes web requests from your command line. This one downloads the script at the given URL, followed by which it is executed by the Ruby interpreter. The output can be seen similar to:
Now that we have Homebrew, Postgres can be installed.
MacBook-Pro:~ engineerapart$ /us/bin/ruby -e "S(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
=> This script will install:
Press RETURN to continue or any other key to abort
==> Downloading and installing Homebrew..
remote: Total 0 (delta 0), reused @ (delta 0), pack-reused 0
HEAD is now at Zda012d Merge pull request #1311 from MikeMcQuaid/audit-no-trailing-whitespace
Updated 3 taps (caskroom/cask, homebrew/core, homebrew/services).
==> Updated Formulae
==> Installation successful!
==> Next steps
Run "brew help' to get started
Further documentation: https://git.io/brew-docs
2. Installing Postgres
Installing PostgreSQL is extremely simple since we have Homebrew. Using Homebrew, run the below-mentioned command to install PostgreSQL :
brew install postgresql
The output which you can see is:
Homebrew manages to resolve Postgres’ dependencies, which in this case are Readline and OpenSSL. It also knows about the latest versions, where to find them, and how to unpack them. One can easily miss instructions at the very end of the install:
To have launchd start postgresql now and restart at login:
brew services start postgresql
Or, if you don't want/need a background service you can just run:
pg_ctl -D /usr/local/var/postgres start
/usr/local/Cellar/postgresql/9.5.4_1: 3,147 files, 35M
Let’s make Postgres up and running and make sure that each time the computer starts up, Postgres should also start. Run the following command:
pg_ctl -D /usr/local/var/postgres start && brew services start postgresql
To make sure whether Postgres is installed and running, we have to check which version is running:
Your output should appear like
MacBook-Pro:~ engineerapart$ postgres
postgres (PostgreSQL) 9.5.4
With it, Postgres is installed.
3. Configuring Postgres
To make itself usable, Postgres works pretty hard. It creates the user postgres automatically, by default. Let’s see what other users it has created. Let’s begin by utilising the psql utility. This utility is installed with Postgres that allows you to perform administrative functions without the need to learn about their actual SQL commands.
Begin by entering:
(You may have to use sudo psql postgres for this command to function, which depends on the way your system is configured).
You’ll see output like:
MacBook-Pro:~ engineerapart$ psql postgres
Type "help" for help.
That’s the psql command line. Now, you can enter a command to check what users are installed:
It returns the following:
As you can see, the Postgres user I very well mentioned, but, engineerapart what is this user? This is the thing Postgres does to make your life easier when you first install it. By default, the username that is utilised by all processes is the one you are logged in as, on most Linux-based operating systems. You don’t have to pass your username to most programs. But if a certain program, like Postgres, doesn’t have your username configured, it will fail!
Hence, the moment Postgres is installed, it creates a database user automatically that matches your username, enabling you to get started right away.
How to create a database in PostgreSQL in mac
You can make use of any one of the two commands mentioned below to create a database in Postgresql using psql after opening the psql utility on any of the 3 major operating systems:
- CREATE DATABASE
How to create a database in postgresql with the help of CREATE DATABASE, a SQL command/query
To execute the CREATE DATABASE statement, you should a special CREATEDB privilege or have a superuser role.The following shows the syntax of the CREATE DATABASE statement:
CREATE DATABASE database_name
[OWNER = role_name]
[TEMPLATE = template]
[ENCODING = encoding]
[LC_COLLATE = collate]
[LC_CTYPE = ctype]
[TABLESPACE = tablespace_name]
[ALLOW_CONNECTIONS = true | false]
[CONNECTION LIMIT = max_concurrent_connection]
[IS_TEMPLATE = true | false ]
To create a new database:
- Mention the name of the new database first post the CREATE DATABASE keywords. In the PostgreSQL database server, the database name should be unique. If you try to create a database whose name already exists, PostgreSQL will throw an error.
Followed by mentioning the specify one or more parameters for the new database.
- A new PostgreSQL database with the name database_name will be created by “CREATE DATABASE” command.
- database_name: The name of the database to be created can be specified .
- user_name: The user who will own the new database can be mentioned, or just specify DEFAULT (it will set the owner to be the user executing the command). If you wish to create a database owned by another role, then you must be an indirect or direct member of that role or be a superuser.
- template: You can mention the name of the template which has to be cloned and from which the new database will be created, or you can just add DEFAULT to utilise the default template (the standard system database template1).
- encoding: You can add the character set encoding to use in the new database by mentioning an integer encoding number, or a string constant (like ‘SQL_ASCII’), or simply write DEFAULT to make use of the default encoding (that will set the encoding of the template database). You can mention any character set’s encoding supported by PostgreSQL server.
- locale: It is utilised to set LC_CTYPE or LC_COLLATE parameters at once. If you specify it, you cannot specify either of the two parameters.
- lc_collate: You can specify the collation order (LC_COLLATE) to use in the database created. This will affect the sort order applied to strings, e.g., in queries with ORDER BY, as well as the order used in indexes on text columns. The default collation order used is of the template database.
- lc_ctype: Specify the character classification that the new database will use. It affects the classification of character, for example, digit, lower, upper. It defaults to the LC_CTYPE of the template database.
- tablespace_name: You can specify the name of the tablespace that will be associated with the database created, or simply write DEFAULT for using the tablespace of template database. By default, all the objects that will be created in this database will have this tablespace.
- allowconn: The default value for this is true, allowing connections (other than restricted ones). but no one can connect to this database if set as false.
- connlimit: You can specify the number of concurrent connections that can be made to the database created. The value is -1, by default that implies no limit.
- istemplate: Any user with the CREATEDB privilege can clone it, if the IS_TEMPLATE is true. If false, only superusers or the database owner can clone it.
PostgreSQL CREATE DATABASE examples
Creating a new database using pgAdmin
One of the oldest of the Postgres GUIs, pgAdmin’s first version was released just a few months after Postgre’s first release in 1996. It was re-written several times, it can function on Windows, Linux, and MacOSX, and comes with powerful database management tools such as a syntax-highlighted SQL editor. Designed to function on both client machines and on deployed servers, pgAdmin can handle advanced cases that Postico cannot.
The pgAdmin tool offers you with an intuitive interface for building a new database.
First, using pgAdmin, log in to the PostgreSQL database server.
Second, right-click the Databases node and select Create > Database… menu item
A dialog box will appear for you to enter detailed information on the new database.
Third, enter the name of the database and select an owner in the general tab.
Here, for the new database, we entered sampledb as the name and postgres as the owner.
Fourth, to set the properties for the database, click the Definition tab :
You can choose the encoding, template, tablespace, collation, character type, and connection limit in the Definition tab.
The Security tab lets you define security labels and assign privileges.
The Privileges tab lets you assign privileges to a role.
Fifth, select the SQL tab to see the generated SQL statement that will execute.
Finally, click save, you will see the database available under the Server you selected in pgAdmin’s left-hand panel
Create a Database in PostgreSQL via createdb Command
In an application called createdb, PostgreSQL also offers a command-line wrapper to the CREATE DATABASE command. The only alternate functionality to createdb over its SQL counterpart is that it has the ability to run directly from the command line, and it lets a comment to be added into the database, all in one command. In SQL, this would need at least two statements: a COMMENTstatement and the CREATE DATABASE statement.
The syntax for the createdb application is:
createdb [ options ] dbname [ description ]
In the above-mentioned syntax, the name of the database to be created is dbname, options comprise the listed optional flags, and description is the comment to be added with an implicit COMMENT command.
The options arguments may be provided either as the GNU-style, double-dashed counterpart, each followed by an equals sign (=) and an argument, if required (example - -location=PATH) or a single-dashed flags, each followed by a space and an argument (e.g., -D PATH). Double-dashed flags will be more verbose, comprising an entire word, while single-dashed flags will always comprise a single letter.
The available createdb arguments are:
-D: Specifies the tablespace name for the new database.
-e: Portrays the commands that createdb sends to the server.
-E: It specifies the character encoding to be used in the database.
-l: Specifies the locale to be used in the database.
-T: For the new database, specifies which database to use as a template.
--help: Show help page about the createdb command line arguments.
-h: Displays the hostname of the machine running the server.
-p: Sets the TCP port or the local Unix domain socket file extension which the server uses tisten for connections.
-U: Specifies which username to use to connect.
-w: Instructs createdb to not issue a password prompt.
-W: Instructs createdb to issue a password prompt before connecting to a database.
C: WINDOWSsystem32>createdb - U postgres - T phoenixnap -w mydatabase;
createdb Command Common Errors:
There are a few common errors users may come across while using the createdb command. Let us check the error below:
createdb command not found : When PostgreSQL was not installed properly, this error will be thrown. From the PostgreSQL installation path, run createdb or add the psql.exe path to the system variables section in the advanced settings of your PC.
Fatal: role “username” does not exist : When users run initdb with a role without superuser privileges, this error is thrown. To solve the error, create a new Postgres user with the --superuser option or login to the default admin role, postgres.
Database creation failed: ERROR: permission denied to create a database : When you are trying to create a database with an account that doesn’t have the required permissions, this error will be shown. To resolve the error, grant superuser permissions to the role in question.
No such file or directoryIs the server running locally and accepting
connections on Unix domain socket ...? : You will come across this error when the PostgreSQL server is not currently running or wasn’t properly started.
One of the top relational databases in the world, PostgreSQL is used by individuals, small and huge businesses, alike. Though it delivers high performance and comes with powerful feature set, it is incredibly easy to get started with it. To sum up, there is over one way to build a database and a user in PostgreSQL. In this article, we have focused on three of them:
- PSQL command line
The PSQL command-line tool is more oriented toward trained tech-savvy users who are acquainted with SQL coding. On the other hand, pgAdmin deals with similar features but in a more user-friendly intuitive design, which might help novices join the club of database management.