AUTHORED BY
Andrew Cross
DATE
09/15/2014
CATEGORY
WORD COUNT
1054
REV
0
REFERENCE IMAGE Database Automation with Python
NOTES
  1. MySQL's Connector for Python is required.
  2. This script easily saves me 10 minutes of repetitive action every day.
SOCIAL REACH

I decided to take the MySQL script I developed the other day for loading a local csv file to a database, and add a bit of automation to it with Python.

Before I really delve into this, I want to establish my setup in case someone runs into issues with my method. I’d also like to point out that I’m by no means an expert at Python, and I’m coming from a MATLAB background, so that may explain my perspective on things.

Let’s get started then, shall we? Like I mentioned, I’m working on a Windows system, and Python(X,Y) 2.7.6.1 was used to install version 2.7 of Python and all the scientific modules that accompany it. In addition, WAMP 2.5 was also installed to give me local access to a Apache/MySQL/PHP stack.

Acquire the Dependent Modules

One of the things I’ve discovered about Python coming from a MATLAB background is that it’s not nearly as “out of the box” ready. For example, even with all the modules that Python(X,Y) provides, there are still two modules that need to be installed to accomplish what I’m trying to do.

  1. Use pip to install the wget module.
  2. Download MySQL’s Connector/Python 2.0. (make sure to download version 2.0)

With the necessary modules now in place, I can now break up discussion of the program into two obvious pieces: downloading the .csv, and adding said .csv to the database.

Downloading the .csv

The .csv in question here is actually a parametric weather file that the National Renewable Energy Laboratory’s (NREL) Measurement and Instrumentation Data Center (MIDC) offers up to the public free of charge. The equipment that records this data is actually maintained by my research group, so it perfectly fits my needs.

One of the cool things about this site is that it offers several meteorological and solar measurements in resolutions as small as 1-minute. I could opt to download all these measurements at once, but the easier-to-manage method is to specify only those parameters I’m concerned with. Specifically:

  • Global Horizontal Irradiance (W/m^2)
  • Direct Normal Irradiance (W/m^2)
  • Dry Bulb Temperature (°C)

Reminder: Don’t forget to click the “Entire Day” radio button, or the data will be restricted to those times when irradiance is above 0.

Selections

Selections

By manipulating the buttons on the page, then hitting “submit”, my browser obediently downloaded the customized .csv for me. I extracted the download link from this file, examined it for parametric values, and was ready to take it into python!

import wget
import datetime
import mysql.connector

start = datetime.date(2014,6,19)
end = datetime.date(2014,9,15)

url = ("http://midcdmz.nrel.gov/apps/plot.pl?"
        "site=UNLV&start=20060318&"
        "edy=%s&emo=%s&eyr=%s&"
        "zenloc=18&amsloc=20&"
        "year=%s&month=%s&day=%s&"
        "endyear=%s&endmonth=%s&endday=%s&"
        "time=1&inst=3&inst=4&inst=9&type=data&"
        "first=3&math=0&second=-1&value=0.0&"
        "global=-1&direct=-1&diffuse=-1&user=0&axis=1" 
        %(end.day,end.month,end.year,start.year,start.month,start.day,end.year,end.month,end.day))

fileloc = (r'C:/<redacted>/%s-%s.csv' 
  %(str(start.year)+str(start.month)+str(start.day),str(end.year)+str(end.month)+str(end.day)))
  
file = wget.download(url,fileloc)

As always, the dependent modules are imported first. Next, the datetime module was used to establish date variables designating the time period I’m intending to query. A url string is then established using Python’s built-in implicit line joining syntax and the formatting operator. Now, when I change my data endpoints, the proper url will still be generated for me to download just the sensor measurements I’d like over just the time period I’m specifying without needing to use the handy web interface.

I then form another string to indicate exactly where on my computer I’d like this .csv to download to, as well as the file name it will have once it gets there, and lastly download the file with the wget.download function.

Adding the .csv to a Database

Admittedly, working with a database is significantly trickier than downloading files, but the MySQL connector actually makes the process fairly painless. First, the user’s database configuration is entered as a dictionary object, then unpacked inside the mysql.connector to establish the database connection. Next, a cursor is created and a query is formed. I won’t go into the query here, but for a general idea of what I’m doing refer to this article that discusses the LOAD INFILE process.

config = {
  'user': 'root',
  'password': '',
  'host': 'localhost',
  'database': '<redacted>',
  'raise_on_warnings': True,
}

cnx = mysql.connector.connect(**config)

cursor = cnx.cursor()
query = ("DELETE FROM `weather`; "
         "LOAD DATA LOCAL INFILE '%s' " % fileloc + 
         "INTO TABLE `weather` "
         "FIELDS TERMINATED BY ',' "
         "LINES TERMINATED BY '\\n' "
         "IGNORE 1 LINES "
         "(@date, @time, GHI, DNI, DBT) "
         "SET ddatetime = timestamp(str_to_date(@date,'%m/%d/%Y'), @time);"
         )

Lastly, the query must be executed, the changes committed, and the cursor/connection closed. The proper syntax definitely eluded me for a while. I was expecting that the (commented out) cursor.execute(query) method would run my query, but that just wouldn’t work. The reason, as I discovered, was because my query was in reality a compound query composed of a DELETE, as well as a LOAD INFILE, statement. To actually run my query, the optional multi=True parameter must be specified. This particular parameter results in an iterator, allowing the cursor to execute both statements in succession. Pass is simply included to illustrate that the iterator is run on the cursor, and no other operations occur for each result that is iterated.

for result in cursor.execute(query,multi = True):
    pass
#cursor.execute(query)
cnx.commit()
cursor.close()
cnx.close()

I will likely be referencing this dataset in future posts, so here is the script in its entirety for those that would find it useful.

UNLV_MIDC_Weather_Automation.py

Profile picture of Andrew standing at the Southern-most point in the United States.
Andrew Cross

Andrew is currently a university research engineer with a post-grad degree in mechanical engineering. He enjoys good food, motivated people, and road biking. He has still not completely come to terms with the fact he will never play center field for the Kansas City Royals.