Utilizing SQL Alongside Python
Options for Speaking Your Data Science Love Language
In the world of data analysis and data science, interacting with databases internally is largely relegated to storage solutions provided by Database Management Systems (DBMS). Among these solutions, there exist options such as Alteryx, dbt, and of course the reigning king of DBMS, SQL.
One overlooked facet of SQL — outside of the DBMS framework — is the ability to use it alongside and within your Python IDE of choice. Today we will talk about the options available for utilizing SQL with Python, and also some options available for standalone SQL-specific IDEs.
Getting Started with SQL Databases in Python
We will go over connecting to the databases of SQLite, MySQL, and PostgreSQL using Python to instantiate the libraries and tables, and also the steps needed to access them within Python. Note that when using MySQL and PostgreSQL, you will need to make sure your designated SQL server is already up and running.
SQLite is the most straightforward of the three, as the library and packages are automatically included with every Python/Anaconda installation. By default, Python installations come with a module called sqlite3. The good news about SQLite is that these databases are serverless and self-contained, meaning that there are no external databases to connect to — perfect for testing out your SQL skills in a local sandbox environment. Connecting to an SQLite database using Python code is quite simple:
Once instantiated, the new database file (named sm_app.sqlite) will be created in your path’s root directory. This location can be changed at any time using Unix commands, manually, or in SQL itself.
Dealing with MySQL
Running MySQL and PostreSQL with Python is a little bit trickier and involves a bit more leg work. As mentioned before, you will need to be sure that your SQL server is already running to be able to connect to it via Python. Once that is done, the next step is installing the necessary Python packages to handle the instantiation, as Python does not include support for MySQL and/or PostgreSQL out-of-the-box. You can install the necessary packages using the following command:
Utilizing MySQL in Python once the packages are installed also differs from SQLite in that the function for your database/table statement also requires a separate process.
- One for connecting to the database
- Another for actually creating the database/table
Connecting to the database (once you have the server up and running, of course) can be accomplished alongside the same lines as the SQLite method, except that you will also be defining your default host, username, and password of your running server.
Once this function is created, we will create yet another that will specify the creation parameters of your database.
Then call the above two functions to create your database!
Dealing with PostgreSQL
PostgreSQL runs along the same lines as the MySQL creation process. First, once again we will be installing the necessary packages for Python in order to use the server.
Then we will define our functions in order to connect to the server, and also create our database.
Finally, use the create your second function to create your database, and call the create_connection function to create your database inside of the PostgreSQL server:
The Deal with SQL IDEs
SQL IDEs (integrated development environments) function much like those for other programming languages. You can not only connect to and manage databases, but utilize system trees, create visualizations, view parent-child and one-to-many foreign keys, and modify/create tables. The use of a robust IDE also allows you to save statements for commonly used queries; bypassing the tediousness of writing and re-writing queries. A well-organized IDE might even let you circumvent the scripting process altogether once your workflow is well defined!
While there are many available IDEs available when making the decision on which one best suits your needs, we need to pay close attention to the environment being worked in, as well as the SQL server/s that you will need access to. My IDE of choice is DBeaver — an IDE that allows connections to the big 3 (SQLite, MySQL, PostgreSQL) and much more such as Oracle, DB2, SQL Server, Sybase, MS Access, Teradata, Firebird, Apache Hive, Phoenix, Presto, etc.
Some IDEs are only available on certain platforms (Windows, Mac, Linux), while others have cross-platform functionality, such as DBeaver, allowing you to save and load your scripts and statements no matter what hardware platform you may be currently on.
I recommend checking out web pages such as this for helping to make the decision on which SQL IDE is best for you and your unique work-based needs.
Choosing a great SQL IDE, and using SQL alongside Python in your environment of choice is a great way to streamline your workflow and maximize your productivity. It is also a great way to stay organized and access data from anywhere, no matter the platform you are on.
Happy coding and go show the world that SQL is not dead yet!