Fracking: Water Stress in Appalachia

Well water usage in hydraulic fracturing for natural gas in the Appalachian Basin, overlaid on drought conditions from 2000-2017.


Featured post

Predicting Flight Delays using TensorFlow and Machine Learning

In complex systems such as airline travel, predicting delays can be daunting. Among the multitude of factors, such as maintenance problems, security concerns, and congestion, weather stands out as the major contributing factor to late aircraft arrivals. According to the Bureau of Transportation Statistics, weather accounted for 33 to 46% of all delay minutes during the past 10 years, a figure that includes extreme weather as well as 53% of the National Aviation System delays and spillover from previous flights (‘Aircraft Arriving Late’).

Delay Cause

Examination of the TranStats database reveals significant aircraft delays during weather such as fog, thundershowers, and snow. High winds often accompany heavy rain and are a cause of delay in themselves. In the United States, inclement weather was particularly severe in January 2017, so we chose Chicago (“The Windy City”) as the point of origin and examined flights to another major airport, Atlanta, in the relatively warm state of Georgia. To model how weather affects delays on this route, we capture factors such as temperature, precipitation, wind gust speed, and whether weather events such as rain or snow occurred, in both Chicago and Atlanta, using Weather Underground. With this dataset we are able to perform a causal analysis.

Traditional statistical techniques such as multiple linear regression offer potential help in determining weights for the various factors causing delay, but linear models are unlikely to help when most flights have zero delay and only occasionally suffer medium to large spikes in lateness. Given this complexity, machine learning offers potential help not only in model construction (open source programs such as Google’s TensorFlow make creating models relatively straightforward) but also in accuracy.

After constructing a flight-delay model, based on Martin Wicke’s automobile estimator, from a 930-row weather dataset, we observe the following:

  1. TensorFlow’s DNNRegressor (Deep Neural Network) performs better than the LinearRegressor, with about a 25% lower loss rate when each is trained on a 465-row training set.
  2. Loss rates can be further reduced by increasing the training set to about two-thirds of the total dataset, but they begin increasing again as the evaluation set becomes too small, relative to the training set, to offer any statistically meaningful comparison.
  3. When using the predict_scores method on the evaluation data sets, zeroing out the negative predictions results in a substantially improved prediction model compared with the naive model, which allows negative delays (not present in the training dataset).
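The split in item 2 and the clipping in item 3 can be sketched in plain Python. This is a minimal illustration, not the TensorFlow code behind the post: the helper names `split_rows` and `clip_negative` and all sample numbers are assumptions invented for the example.

```python
def split_rows(rows, train_fraction):
    """Split rows into (training, evaluation) lists by position."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def clip_negative(predictions):
    """Replace negative predicted delays with 0 minutes."""
    return [max(p, 0.0) for p in predictions]


# Hypothetical predicted and actual delays in minutes (not from the post's dataset).
predicted = [-4.0, 2.5, -1.0, 30.0]
actual = [0.0, 0.0, 0.0, 45.0]

clipped = clip_negative(predicted)

# When actual delays are never negative, clipping cannot increase
# the absolute error of any single prediction.
raw_errors = [abs(p - a) for p, a in zip(predicted, actual)]
clipped_errors = [abs(p - a) for p, a in zip(clipped, actual)]

# A 930-row dataset split in half gives the 465-row training set from item 1.
train, evaluation = split_rows(list(range(930)), 0.5)
```

The positional split is the simplest possible choice; shuffling the rows first is a common alternative when the data are ordered by date.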

Perhaps the ultimate benchmark in machine learning should be a simple, intuitive model. In the case of flight delays, where the most common case is no delay, the simplest model is to assume a delay of 0 minutes for every flight. Under that assumption, the average error is approximately 3 minutes. The best DNN average error was close, at about 5 minutes, but still less accurate. What is clear, however, is that the linear regression model was much worse, with 13 minutes of error on average. With further tuning, it should be possible to improve the DNN model to beat the 3-minute benchmark. Machine learning results are summarized below:


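The comparison against the zero-delay benchmark is simple arithmetic. The sketch below is an illustration only and is separate from the results summary mentioned above. It assumes "average error" means mean absolute error, which the post does not state, and every number in it is hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over paired values, in minutes."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical actual delays in minutes: mostly zero with one spike.
actual = [0.0, 0.0, 0.0, 0.0, 15.0]

# Benchmark: predict a 0-minute delay for every flight.
baseline = [0.0] * len(actual)

# Hypothetical model output for the same five flights.
model = [4.0, 3.0, 5.0, 2.0, 9.0]

baseline_mae = mean_absolute_error(baseline, actual)
model_mae = mean_absolute_error(model, actual)
```

In this made-up case the model gets closer on the delayed flight but still loses to the benchmark, because it pays a small error on every on-time flight.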
UX/UI Design Process

A finished product is often more complex than its original design, but is almost certainly simpler than the dozens if not hundreds of iterations that make up its transition from initial inspiration through ideation, prototyping, testing, development, and user adoption and upgrade. Below is a small sampling of the steps taken to develop the supply chain tracking suite:











Field Test




GIS Jobs – Oregon

GIS jobs in Oregon, August, 2017 based on 86 job postings from

Portland dominates the state job market with 42 (nearly half) of all the GIS jobs. Salem and Eugene follow with 12 and 6, respectively (14% and 7%), with the remainder of the positions along the I-5 and I-84 (Columbia River) corridors.

Industry concentration is largely in the construction / surveying market with 17 postings (20%), followed by environmental with 16 postings (19%) and city planning / government with 15 postings (17%).
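The percentages above follow directly from the posting counts. A minimal sketch of the arithmetic, using only the counts quoted in the text (the helper name `share_percent` is an assumption for the example):

```python
TOTAL_POSTINGS = 86

# Counts quoted in the text, by city and by industry.
city_counts = {"Portland": 42, "Salem": 12, "Eugene": 6}
industry_counts = {
    "construction / surveying": 17,
    "environmental": 16,
    "city planning / government": 15,
}


def share_percent(count, total=TOTAL_POSTINGS):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


city_shares = {name: share_percent(n) for name, n in city_counts.items()}
industry_shares = {name: share_percent(n) for name, n in industry_counts.items()}
```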


Copper Production and Profitability (2015)

2015 global copper production and profitability, in US cents per pound, plotted using the SNL Metals & Mining database. Most profitable country: Mexico, with a $1.30/lb profit margin. Least profitable country: Papua New Guinea, with a $1.87/lb loss.


XPO Last Mile Service Zones

A drive time analysis conducted on XPO’s 47 Last Mile locations in the United States. Data extracted using a modified JavaScript version of
to extract the addresses and longitude-latitude coordinates to a .csv file, which, when imported into an ArcGIS project, enabled the Network Drive-Time Analysis of 1, 2, and 4 hours from each location under average Wednesday 8:00 am traffic conditions. Results are overlaid on a layer showing projected population growth over 2016-2021. The areas outside the coverage areas in green present potential growth opportunities to XPO, and include Oregon, Nevada, Utah, Tennessee, and West Texas.

