RDBMS tools — MySQL and PostgreSQL Overview
Introduction
When it comes to relational database management systems (RDBMS), MySQL and PostgreSQL stand out as robust, feature-rich choices. Whether you’re a seasoned developer or just stepping into the world of databases, understanding these tools is crucial. Both MySQL and PostgreSQL are free, open-source, and offer versatile deployment options, including download/install, and cloud versions. Additionally, they provide flexibility with command-line interfaces and user-friendly desktop or web interfaces.
This post is a follow up to my post about [notes on] Relational Databases — RDB Management System, and it is intended to provide a short overview of these RDBMS’s — from creating databases and tables to defining keys and constraints, as well as loading data.
MySQL
MySQL is a widely used open-source relational database management system (RDBMS), while MariaDB, a fork of MySQL, was developed by some of its original creators. The flexibility of MySQL allows you to adapt it to your specific needs. The Community Edition, available under the GNU General Public License, can be freely downloaded and installed, either as a standalone or embedded in your applications. Additionally, commercial editions such as Standard, Enterprise, and Cluster versions are available, offering enhanced functionality.
For cloud deployment, MySQL provides various options. You can self-manage using virtual machine images or containers, or opt for managed services like IBM Cloud, Amazon RDS for MySQL, Azure Database for MySQL, or Google Cloud SQL for MySQL.
To interact with your databases, MySQL offers a suite of tools, including the mysql command line interface, mysqladmin for administrative tasks, MySQL Workbench for desktop use on Windows, Linux, and Mac OS, and the widely used web interface, phpMyAdmin.
The mysql command line interface allows you to issue commands interactively or from a text file. This versatile tool is demonstrated here with the ‘show databases’ command, listing available databases. In batch mode, you can store output messages in a file for later reference.
MySQL Workbench, a visual database design tool, consolidates SQL development, administration, and maintenance into a unified environment. The Administration page provides insights into connection details, server features, and facilitates tasks like data import/export and log reviews. The Schemas page grants direct access to database objects, data manipulation, and Help documentation.
For a graphical user interface (GUI) experience, phpMyAdmin proves invaluable. Upon connecting to the server, you gain access to server information and system databases. Using the intuitive tabs, you can create user databases, manage tables, load/query data, and import/export data seamlessly.
Creating Databases and Tables
In MySQL, creating databases and tables can be done through the command line, a graphical user interface, or API calls. This video demonstrates how to achieve these tasks using both the command line and the phpMyAdmin user interface.
Let’s start with the command line. To create a database, use the CREATE DATABASE
command, and for tables, employ CREATE TABLE
, specifying column names and data types. Utilize the DESCRIBE
command to view the structure of the newly created table.
Alternatively, phpMyAdmin, a popular visual tool with a web interface, simplifies the process. To create a database, navigate to the treeview on the left, click “New” under the Databases tab, input the database name, optionally select encoding, and click “Create.” The new database appears in the treeview, and the Create table tab opens.
For creating a table in phpMyAdmin, enter the table name (e.g., employee_details), specify the number of columns (e.g., four), and click “Go.” Define the columns by entering names, selecting data types, providing lengths if necessary, and clicking “Save.”
The tool then displays a summary of the table’s structure. From here, you can edit, drop, move, or normalize columns. Additional columns can also be added as needed. This user-friendly interface streamlines the process of creating and managing databases and tables in MySQL.
Loading Data
As a Data Engineer or Database Administrator, you’ll frequently need to populate databases and tables. One effective method is to back up an existing database and restore it to a new location. The tool of choice for this task is mysqldump
.
To back up an entire database, use the following command:
mysqldump -u root employees > employeesbackup.sql
Replace “employees” with your database name. If you want to back up specific tables, list their names after the database name.
To restore a backup, use the mysql command in a similar fashion:
mysql -u root destination_database < employeesbackup.sql
Note that the greater than sign (>) signifies output to the .sql file (backup), while the less than sign (<) signifies input to the database (restore).
Alternatively, you can restore a dump file directly from the mysql command prompt using the source command with the file name.
mysql> source employeesbackup.sql
For those using phpMyAdmin, the process is user-friendly. Select a database, go to the Export tab, and click Go for a Quick export, generating an SQL file. To restore, use the Import tab, selecting the destination database and clicking Go.
If you only need to populate specific rows in a table, phpMyAdmin allows manual entry on the Insert tab or execution of SQL INSERT
statements. For larger datasets, the import functionality is preferable. Use the mysql load data infile statement or mysqlimport utility for CSV file imports, ensuring the file name matches the table name.
In phpMyAdmin, importing data into tables is seamless. On the Import tab, browse to select the file, verify format and options, and click Go. You can also export data to CSV format on the Export tab, customizing options as needed.
Using Keys and Constraints
MySQL, like other relational databases, supports various keys and constraints such as primary keys, foreign keys, unique constraints, and null constraints.
Primary keys, enforcing uniqueness and disallowing nulls, can be created on one or a combination of columns. In phpMyAdmin, during table creation, add a PRIMARY index to the desired column(s) and confirm by clicking Go. You can add more columns to the primary key by similarly indexing them. The primary key icon appears next to the column name, and an index named PRIMARY is created.
To automatically generate unique IDs, use the auto-increment property. In the Create table or Structure tab, select the A_I checkbox for the primary key row and click Save. This setting ensures the database engine generates incrementing entries for the specified column (e.g., emp_id) when adding data.
Creating foreign keys to relate data across tables is also possible. In phpMyAdmin’s Relation view of the Structure tab, enter the key name, identify defining columns, and specify actions for deletion or update. The foreign key is visible in the underlying index on the Structure tab.
By default, MySQL columns are defined as not null in phpMyAdmin. You can change this during table creation or alter the column definition later. For example, enabling null values in specific columns involves checking the Null checkbox. If certain columns must have data, leave their null checkboxes blank. To ensure uniqueness, use the unique constraint. On the Structure tab, click More for the relevant column and then Unique.
PostgreSQL
Postgres, an open-source object-relational database management system, is renowned for its reliability and flexibility, supporting both relational and non-relational data types. Widely chosen for OLTP, data analytics, and geographic information systems, PostgreSQL offers diverse deployment options.
You can install PostgreSQL on your own servers, supporting macOS, UNIX, or Windows. Alternatively, it’s accessible in virtual machines, containers, or through managed services like IBM Cloud Databases for PostgreSQL, Amazon RDS, Google Cloud SQL, EnterpriseDB Cloud, or Microsoft Azure.
Various tools facilitate connecting to PostgreSQL databases:
- psql provides a command-line interface
- pgAdmin offers an open-source graphical interface available as a desktop or web application.
- Commercial options like Navicat and DBeaver support PostgreSQL, MySQL, and other databases.
psql
, an interactive command-line tool, allows for interactive queries and displays information about database objects. pgAdmin, on the other hand, handles all development and administrative tasks, offering a Query Tool for running SQL commands and an ERD Tool for creating and visualizing entity-relationship diagrams (ERDs).
In pgAdmin’s Query Tool, you can type or paste SQL queries in the upper pane, with results displayed below. Editable result sets enable data manipulation. Tabs provide insights into query plans, server messages, and asynchronous notifications.
The ERD Tool facilitates the creation of ERDs for existing databases or the generation of SQL statements for new databases. To create an ERD from an existing database, right-click the database in pgAdmin, select “Generate ERD,” and the tool visually organizes tables and relationships. You can further modify the ERD, add notes, and generate SQL statements within the tool.
Creating Databases and Loading Data
In PostgreSQL, much like other relational database management systems (RDBMS), you have various methods to create databases, tables, and load data: through a command line interface, a graphical user interface, or API calls.
Firstly, with psql, you use commands like CREATE DATABASE and CREATE TABLE, specifying column names and data types. The \d command shows the structure of the newly created table.
To load data, psql can restore a previously backed-up database using pg_dump. Just specify the destination database and the dump file, recreating tables, database objects, and the data present during the dump file creation.
Alternatively, pgAdmin, a web-based visual tool, facilitates the same tasks. To create a database, right-click Databases in the treeview, select Create, then Database. Enter the name and click Save. To restore from a dump file, select the target database, click Restore, and specify the dump file location.
Manual table creation using pgAdmin involves right-clicking Tables in the treeview, selecting Create, then Table. Enter the table name (e.g., employee_details), define columns on the Columns tab, and click Save. From here, you can use Import/Export to load data into your table.
For data import in pgAdmin, select Import in the Import/Export data box, enter the data file location, and click OK. The View/Edit Data option allows you to review the loaded data through an SQL query.
Exporting data to a CSV file is straightforward in pgAdmin; just specify the filename and click OK. To back up the entire database, the pg_dump utility comes in handy. Its syntax is similar to psql restoration, specifying the database name and dump file. Customizing the command allows output to a compressed archive file.
Views
A view in PostgreSQL is an alternative representation of data from one or more tables or views, offering the same interaction capabilities as tables — inserting, updating, and deleting data. Views are particularly useful for restricting access to sensitive data, streamlining data retrieval, and limiting access to the underlying tables.
For instance, creating a view that includes only the name and email columns from two tables allows users easy access without revealing the underlying table structure or sensitive information like salaries.
To create a view, navigate to the tree view, right-click “Views,” select “Create,” then “View.” Name the view in the Create — View box, enter the SQL code defining the view on the Code page, and click Save. The created view appears in the Views folder, with columns displayed upon expansion.
To execute the view, right-click its name, select “View/Edit Data,” and click “All Rows” to see the data included in the view.
PostgreSQL also supports materialized views, where the result set is initially materialized or saved for future use. While this means you can only query data, not update or delete it, it enhances performance as the result set is readily available, often stored in memory.
Creating a materialized view is similar to a regular view. Start in the materialized views folder, enter the view name, define it on the Definition page (e.g., including only empid and salary columns for anonymization), and click Save. Refreshing the materialized view with current rows is essential before using it, and this can be done at any time to update it with data from underlying tables.
Database Design
Creating a well-designed database is pivotal for the success of data-driven projects, impacting data integrity, reducing redundancy, enhancing application performance, and ensuring user satisfaction. The database design process involves three key steps:
Requirements Analysis:
- Analyze real-world business information and policies.
- Identify base objects and relationships between them.
- Determine data-associated information for each object.
- Gather information through reviewing existing data stores, interviewing users, and exploring potential improvements.
- Output from this stage could be a report, data diagram, or presentation for stakeholder validation.
Logical Design:
- Map requirements identified in the analysis stage to entities, attributes, and relationships.
- Identify entities, representing people, events, locations, or things.
- Define attributes for each entity, considering characteristics like first and last names for a person.
- Address many-to-many relationships by introducing an associative entity.
- Normalize entities, ensuring adherence to 1st, 2nd, and 3rd normal forms for optimal transactional performance.
Physical Design:
- Consider the impact of the chosen database management system on the design.
- Contemplate data types, naming rules, indexes, and constraints supported by the system.
- Implement naming conventions for clarity.
- Translate logical design into a physical design with tables, typed columns, and defined keys.
- Utilize an Entity-Relationship Diagram (ERD) designer for visualizing and generating SQL scripts.
- In pgAdmin, leverage the ERD Tool for designing and scripting.
In essence, dedicating time to meticulous database design upfront mitigates potential issues later, contributing to the success of your data-driven project.
If you found this blog helpful, you might want to check out some of my other data-related blogs:
Disclaimer: The notes and information presented in this blog post were compiled during the course “Introduction to Relational Databases (RDBMS)” and are intended to provide an educational overview of the subject matter for personal use.