data natives

We hear a lot of marketing yammer about “digital natives”, that is, folks fluent in social media and in particular marketing using social media. Writers who use this term often juxtapose such digital natives against “analog natives”, i.e., individuals who matured or were educated before online social media became such a significant part of our lives. These writers often imply that such analog natives are unable to understand the world today. This is of course ridiculous; anyone with an open mind and tenacity can develop marketing skill with social media.

I offer a different grouping of individuals that might be considered an analog (pun intended) of digital nativity: “data natives”. By this I mean individuals fluent in using heterogeneous numerical and textual data sources, along with mathematical techniques, to reach conclusions about many facets of their lives and work.

Consider the following example: When I was shopping for a used RV trailer recently, I needed to filter out trailers from the pool of candidates that were too heavy for my truck to pull. However, used RV listings only specified length of the RV trailers, rather than the weight. Therefore I used regression analysis to predict weight from length based on about twenty known length-weight data points. This simple example illustrates a data native’s approach to solving problems.
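A minimal sketch of that kind of fit in Python follows. The length and weight values and the towing limit are invented for illustration (they are not the roughly twenty data points mentioned above), and `fit_line` and `predict_weight` are hypothetical helper names.

```python
# Least-squares fit of trailer weight (lb) against length (ft).
# The data points below are invented for illustration only.
lengths = [16.0, 18.0, 20.0, 22.0, 24.0, 26.0]
weights = [2900.0, 3300.0, 3800.0, 4300.0, 4700.0, 5200.0]


def fit_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(lengths, weights)


def predict_weight(length_ft):
    """Predicted weight for a listing that only gives its length."""
    return slope * length_ft + intercept


max_towable = 4500.0  # hypothetical towing limit for the truck
for length in (19.0, 25.0):
    w = predict_weight(length)
    print(length, round(w), "ok" if w <= max_towable else "too heavy")
```

With a fitted line in hand, any listing that gives only a length can be screened against the towing limit.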

In my usage of the phrase, “data native” is meant to be a more comprehensive designation than “data scientist”, though certainly there is crossover between the two. In using the word “native” I’m implying an intimate comfort with data and data-driven decision making, an immersion in and skill with data flows comparable to fluency in one’s mother tongue. Data science is a job; data native is a way of being.

For now, data natives are as rare as data scientists. But the new world of Big Data is producing both at a rapid clip. This will likely enrich the world.

Posted in big data, data science, marketing, statistics

GNU Octave: a free, open source MATLAB-like language for numerical computing

I tend to use Python with the NumPy, SciPy, and Matplotlib stack whenever I have to do scientific computing. For statistical computing I use R whenever this Python stack does not provide the necessary features. However, I want to draw readers’ attention to another tool for free, open-source numerical computing: GNU Octave (hereafter called “Octave”), which is an interactive language closely related to MATLAB. In fact, most MATLAB code can be run in Octave with no modification; for MATLAB code that does need a change to run in Octave, the necessary modification is usually slight. MATLAB is common in engineering and scientific environments, but it costs money. For budget-constrained startups or non-profits Octave is a viable alternative. This post introduces Octave and shows its operation in both statistical and control system design roles.

Octave is covered by the GNU General Public License, which allows programmers to modify it as needed, provided that any modified versions they distribute are released under the same license with the source code available. This license also makes Octave effectively free in the financial sense. The program runs in both Linux and Windows, although the Windows setup is more complicated since you have to use Cygwin, MinGW, or Visual Studio.

Octave has a full range of plotting abilities that MATLAB users will be familiar with. The program automatically uses a third-party program, Gnuplot, to create the plots. For example, a 3D plot of a geodesic sphere rendered by Octave is shown below [1]:

gnuplot_class_1_icosahedron_frequency_2

As another example, a Nyquist plot of a dynamic system is:

my_nyquist

Like MATLAB, Octave has many toolkits, which are published as part of Octave-Forge [2], a central repository for Octave packages. Examples of such toolkits are packages for control systems analysis/design and fuzzy logic.

Under the hood, Octave uses the BLAS and LAPACK libraries to facilitate linear algebra computations, in much the same way R and Python’s NumPy package do.

Recent versions of Octave come with a GUI. Here the GUI displays the iconic “sombrero” plot, using code given by [3]:

octave_gui

Statistical Calculations with Octave

Since I’m a statistician in training, I thought I’d demonstrate a few of Octave’s basic statistical tools. First, a box plot of samples from two normal distributions:

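pkg load statistics   % recent Octave releases provide normrnd and boxplot in this package
a = normrnd(20, 3, 100, 1);   % 100 samples with mean 20 and standard deviation 3
b = normrnd(22, 3, 100, 1);   % 100 samples with mean 22 and standard deviation 3
boxplot([a, b]);
title("Comparison of Samples From Two Normal Distributions")
xlabel("Distribution")
ylabel("Sample")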

boxplot

Conducting a t-test on the two sets of samples from normal distributions:

t_test
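For readers who want to see the arithmetic behind such a test, here is a rough Python sketch. It assumes Welch's unequal-variance form of the two-sample t-test, which is not necessarily the variant Octave ran for the screenshot above. `welch_t` is a hypothetical helper, and the sketch stops at the t statistic and degrees of freedom rather than computing a p-value.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


random.seed(0)  # fixed seed so the run is repeatable
a = [random.gauss(20, 3) for _ in range(100)]
b = [random.gauss(22, 3) for _ in range(100)]
t, df = welch_t(a, b)
print("t = %.3f, df = %.1f" % (t, df))
```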

Conducting linear regression on simulated data:

x = 0:0.01:10;
noise = normrnd(0, 2, 1, length(x));
hist(noise);
title("Noise");
y = 2*x + noise + 5;
figure;
hold on;
scatter(x, y);
F = polyfit(x, y, 1);
a = F(1);
b = F(2);
y_pred = a*x + b;
plot(x, y_pred, 'r');
title("Scatter Plot and Regression Line");
hold off;

histogram

scatter

Control Systems Calculations with Octave

I also have experience designing control systems, which Octave is well suited for. Here is an example of a PID controller’s step response implemented in Octave’s control system toolkit:

pkg load control
Kp = 4
Kd = 1
Ki = 1
P=tf([1], [1 Kp/Kd Ki/Kd])
step(P, 20)

pid
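As a cross-check that needs no toolkit, the same transfer function, 1 / (s^2 + (Kp/Kd) s + Ki/Kd), can be stepped through time directly. The sketch below uses plain forward Euler integration, which is a simplification and not how Octave's `step` computes its result. `step_response` and its step size are assumptions for illustration.

```python
Kp, Kd, Ki = 4.0, 1.0, 1.0
a1 = Kp / Kd  # coefficient of s in the denominator
a0 = Ki / Kd  # constant term in the denominator


def step_response(a1, a0, t_end=20.0, dt=0.001):
    """Unit-step response of 1 / (s^2 + a1*s + a0) by forward Euler."""
    y, v = 0.0, 0.0  # output and its time derivative, both starting at rest
    times, outputs = [0.0], [0.0]
    steps = int(round(t_end / dt))
    for k in range(1, steps + 1):
        accel = 1.0 - a1 * v - a0 * y  # y'' = u - a1*y' - a0*y with u = 1
        y, v = y + dt * v, v + dt * accel
        times.append(k * dt)
        outputs.append(y)
    return times, outputs


times, outputs = step_response(a1, a0)
print("y(20) is approximately %.3f" % outputs[-1])
```

The output should settle toward 1 / a0, the DC gain of the transfer function.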

Creating the Nyquist plot shown above for this system:

nyquist(P)

my_nyquist

Conclusion

Octave is a viable, free alternative to MATLAB for many scientific computing applications.

References

1. http://octave-dome.sourceforge.net/ (I have stopped working on this. Use pyDome instead).
2. http://octave.sourceforge.net/
3. http://en.wikipedia.org/wiki/GNU_Octave

Posted in engineering, science, statistics

synthetic biology: an emerging engineering discipline

In the last decade a new engineering discipline called “synthetic biology” has emerged. It differs from the science of biology in that it applies engineering strategies to the creation of cells that perform a desired task, such as the production of drugs or biofuels. It also differs from previous genetic engineering approaches by stressing the assembly of systems composed of modular, repeatable genetic components selected from a pool of well-described candidate components. This post introduces the subject from a high level.

Modularity in Engineering Design

All mature branches of engineering stress modularity in design of systems and products, such that the designed systems are composed of simpler systems having known input and output behaviors. Examples of this design ethic are the electrical circuits (Butterworth filters from the LTspice example circuits) shown below:

LTlspice_butterworth_filter

These circuits are made of simpler components: resistors, capacitors, and inductors, each with known physical properties. Knowing the physical properties of each part enables simulation of the combined circuit, allowing prediction of its behavior. For example, the LTspice-predicted voltage responses of the top two of these filters are:

LTlspice_butterworth_filter_SIMULATION
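The predictability that modular design buys can be illustrated with the textbook magnitude response of an ideal Butterworth filter. The Python sketch below assumes a low-pass response. The cutoff frequency and order are hypothetical values, not those of the LTspice example circuits, and `butterworth_gain` is an illustrative helper name.

```python
import math


def butterworth_gain(freq_hz, cutoff_hz, order):
    """Magnitude of an ideal low-pass Butterworth filter at freq_hz."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** (2 * order))


cutoff_hz = 1000.0  # hypothetical cutoff frequency
order = 3           # hypothetical filter order
for f in (100.0, 1000.0, 10000.0):
    gain_db = 20.0 * math.log10(butterworth_gain(f, cutoff_hz, order))
    print("%8.0f Hz  %7.2f dB" % (f, gain_db))
```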

Computer-Aided Design (CAD)

In the discussion above the example was expressed in a CAD program called LTspice, which facilitates the specification, communication, and simulation of electrical circuits. Other branches of engineering use CAD for these purposes as well, for example Pro/ENGINEER by mechanical engineers and AutoCAD by civil engineers. These CAD packages also encourage modularity, as demonstrated by the multi-component system shown in Pro/ENGINEER below [1]:

ProE-Connector-for-Aras

Synthetic Biology

Synthetic biology is an approach to genetic engineering that draws from traditional engineering’s use of modular, well-described parts. DNA components of genes such as ribosome binding sites, protein coding regions, promoters, etc. are abstracted into “parts” that can be assembled with other parts—not necessarily from the same gene—into “devices”. These devices can in turn be combined into “systems” that result in a desired cellular behavior once DNA encoding the designed system is inserted into a cell. Key to this design strategy is that the engineer has a suite of genetic parts to choose from when designing the genes that they combine into larger systems. These larger systems are often said to be made of genetic “circuits”, since the designed operations can resemble switching and logic gates. We will explore sources of genetic parts shortly, but first consider CAD.

It is logical that synthetic biologists would seek CAD programs to help facilitate this design process, and such tools are beginning to emerge out of academic labs and commercial institutions. Most of these tools cover a single task in the design process, and therefore must be chained together if the designer is to go from part selection to simulation to DNA specification of the final design. This has driven the creation of markup languages to describe designs (CellML [2] and SBML [3]) so that multiple tools can work with the same design.

An example of the specification of protein production and interaction by a genetic circuit is provided by the iBioSim CAD package’s [4] tutorial:

iBioSim

Here proteins are shown in the blue boxes, and a promoter is shown in the diagonal box. The promoter is repressed by protein CI2 and activates transcription leading to the production of protein CII. An event (green box) specifies that cell division is to occur at a predefined point during the simulation. In iBioSim, all parameters of the chemical reaction dynamics must be specified prior to simulation, which is challenging because often these parameters are unknown and have to be estimated. iBioSim then enables simulation of the genetic circuit. Below we can see that the proteins created by the cell are expected to reach steady state:

iBioSim_simulation
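To give a feel for why a protein level settles to a steady state, here is a deliberately simplified Python sketch: a single protein with constant production and first-order degradation. This is not the iBioSim model. The rate constants are hypothetical, and `simulate_protein` is an illustrative name.

```python
def simulate_protein(k_prod, k_deg, t_end=100.0, dt=0.01):
    """Integrate dP/dt = k_prod - k_deg * P from P(0) = 0 by forward Euler."""
    p = 0.0
    trace = [p]
    for _ in range(int(round(t_end / dt))):
        p += dt * (k_prod - k_deg * p)
        trace.append(p)
    return trace


k_prod = 2.0   # hypothetical production rate (molecules per time unit)
k_deg = 0.1    # hypothetical degradation rate (per time unit)
trace = simulate_protein(k_prod, k_deg)
print("final level %.2f, predicted steady state %.2f" % (trace[-1], k_prod / k_deg))
```

In this simplified model the level stops changing when production balances degradation, at P = k_prod / k_deg.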

Another CAD package in development for synthetic biology is Cello, which stands for “Cell Logic” [5]. In Cello the user specifies their desired logic in a truth table, where intracellular chemicals and signals make up the inputs, and the program selects genetic parts necessary to implement the logic. Cello then specifies the DNA necessary to implement the logic [5]. In the NOR gate example shown below, one or both of two promoters activate production of a protein that represses another promoter that activates the output protein [5].

overview_NOR

overview_dnabases
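The logic of the NOR example can be written out as a truth table. The Python sketch below models only the Boolean behavior described above (either input produces a repressor, and the output appears only when the repressor is absent). It says nothing about how Cello itself represents or selects parts, and `nor_gate` is a hypothetical name.

```python
def nor_gate(input_a, input_b):
    """Model the NOR circuit: either input drives a repressor that shuts off the output."""
    repressor_made = input_a or input_b   # either input promoter produces the repressor
    output_made = not repressor_made      # output promoter is active only without repressor
    return output_made


truth_table = [(a, b, nor_gate(a, b)) for a in (False, True) for b in (False, True)]
for a, b, out in truth_table:
    print(int(a), int(b), "->", int(out))
```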

Genetic Part Registries

Several repositories have emerged to store descriptions of genetic parts and devices [6, 7], with the goal of mimicking the specification sheets associated with semiconductor parts today. As semiconductor specification sheets encourage repeatable, modular design of electrical circuits, the plan for genetic part specifications is to encourage repeatable, modular design of genetic circuits. An example of such a repository is the Registry of Standard Biological Parts [6], which provides specification sheets for promoters, ribosome binding sites, coding regions, terminators, etc., e.g., [8]:

igem_example

References

1. http://www.aras.com/integrations/MCAD/creo-parametric-connector.aspx
2. http://www.cellml.org/
3. http://sbml.org/Main_Page
4. http://www.async.ece.utah.edu/iBioSim/
5. http://cidar.bu.edu/cello/server/html/login.html
6. http://parts.igem.org/Catalog?title=Catalog
7. http://www.ncbi.nlm.nih.gov/pubmed/20160009
8. http://parts.igem.org/Part:BBa_K1216007

Posted in bioinformatics, engineering, science

building a web-enabled temperature logger

Not wanting to miss out on the “Internet of Things”, I decided to learn some of its foundational technology, namely microprocessor programming. Actually, I used a Raspberry Pi in this project instead of a classic microprocessor, but the idea is the same. Here I describe building a web-enabled temperature logger, complete with a web application to display its results.

The Challenge

I live in an RV with my cat. When I go to work I have to decide whether to leave the windows open or turn on the air conditioner to keep my cat cool, as there is no thermostat on my air conditioner. I usually decide based on the weather forecast, but really don’t know how hot it gets in the RV during the peak temperature of the day. I needed a data logger that reads the temperature regularly and stores it. Ideally such a data logger would report to a web application, so I can monitor the temperature from work.

The Solution

Raspberry Pi

I first bought a Raspberry Pi B+ computer and configured it to run Raspbian Linux. Then I added a USB WiFi dongle so the device can communicate with the Internet.

DS18B20 Digital Temperature Sensor

Next, I bought a DS18B20 digital temperature probe, and connected it to the Raspberry Pi according to the following schematic, which is slightly modified from that specified by [1]:

schematic

The 4.7 kOhm resistor came with the temperature sensor.

The resulting hardware looks like:

hardware_photo_with_resistor_10

Running the Data Logger and Connecting to the Web

On the Raspberry Pi I run the following Python code, which is slightly modified from that shown in [1]. My modification simply calls a URL containing the temperature reading that is processed by the web application described below. This code sends a reading to the URL every two minutes.

import glob
import os
import time
import urllib.parse
import urllib.request

# load the 1-Wire kernel modules used by the DS18B20
os.system('modprobe w1-gpio')
os.system('modprobe w1-therm')

base_dir = '/sys/bus/w1/devices/'
device_folder = glob.glob(base_dir + '28*')[0]
device_file = device_folder + '/w1_slave'

def read_temp_raw():
    with open(device_file, 'r') as f:
        return f.readlines()

def read_temp():
    lines = read_temp_raw()
    # wait until the sensor reports a valid CRC
    while lines[0].strip()[-3:] != 'YES':
        time.sleep(0.2)
        lines = read_temp_raw()
    equals_pos = lines[1].find('t=')
    if equals_pos != -1:
        temp_string = lines[1][equals_pos + 2:]
        return float(temp_string) / 1000.
    return None

while True:
    try:
        temp_c = read_temp()
        if temp_c is not None:
            query = urllib.parse.urlencode({'temp': temp_c})
            urllib.request.urlopen('http://my.url.com/logtemp.php?' + query).close()
        time.sleep(2. * 60.)
    except Exception:
        # on any sensor or network error, retry after a short pause
        time.sleep(5.)
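The parsing step can be tried away from the hardware. The sketch below isolates it in a hypothetical function, `parse_w1_slave`, and feeds it an illustrative two-line sample in the format the sensor file is expected to have. The sample is made up for the example, not a real capture.

```python
def parse_w1_slave(lines):
    """Return the temperature in Celsius, or None if the reading is not valid."""
    if len(lines) < 2 or not lines[0].strip().endswith('YES'):
        return None  # CRC check failed or file incomplete
    pos = lines[1].find('t=')
    if pos == -1:
        return None
    return float(lines[1][pos + 2:]) / 1000.0  # sensor reports millidegrees


# Illustrative sample of the two-line w1_slave format (not a real capture).
sample = [
    "72 01 4b 46 7f ff 0e 10 57 : crc=57 YES\n",
    "72 01 4b 46 7f ff 0e 10 57 t=23125\n",
]
print(parse_w1_slave(sample))
```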

Web Application for Storing Temperature Readings

A PHP program receives the temperature reading sent by the Python script as a GET argument. It then places the temperature value with a time stamp into a MySQL database. I chose PHP for this task because my web hosting company makes PHP deployment much easier than Django or JSP deployment:

<html>
 <head>
  <title>Log Temperature</title>
 </head>
 <body>
 <?php 

    // cast to a number so the GET value cannot inject SQL into the INSERT below
    $temp = floatval($_GET['temp']);

    $con = mysqli_connect("host", "user", "password", "database");

    // Check connection
    if (mysqli_connect_errno()) {
        die("Failed to connect to MySQL: " . mysqli_connect_error());
    }

    $sql = "INSERT INTO temperature_log VALUES (now(), $temp)";

    if (!mysqli_query($con,$sql)) {
    die('Error: ' . mysqli_error($con));
    }
    echo "1 record added";

    mysqli_close($con);
?>
 </body>
</html>
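The same store-a-reading step can be sketched in Python with a parameterized query, which keeps the incoming value from being interpreted as SQL. This is only an illustration: `sqlite3` stands in for MySQL so the example is self-contained, the `datetime('now')` call is SQLite's counterpart to MySQL's `now()`, and `log_temperature` is a hypothetical function name. The table and column names follow the PHP code in this post.

```python
import sqlite3


def log_temperature(con, raw_value):
    """Validate raw_value as a number and insert it with the current time."""
    try:
        temp = float(raw_value)
    except (TypeError, ValueError):
        return False  # reject anything that is not a number
    # The "?" placeholder keeps the value out of the SQL text itself.
    con.execute(
        "INSERT INTO temperature_log (time, temperature) VALUES (datetime('now'), ?)",
        (temp,),
    )
    con.commit()
    return True


con = sqlite3.connect(":memory:")  # in-memory stand-in for the MySQL database
con.execute("CREATE TABLE temperature_log (time TEXT, temperature REAL)")
print(log_temperature(con, "23.125"))
print(log_temperature(con, "1); DROP TABLE temperature_log; --"))
print(con.execute("SELECT COUNT(*) FROM temperature_log").fetchone()[0])
```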

Web Application for Displaying Temperature Readings

The web application for viewing the temperature readings displays a run chart and a log. The run chart is implemented in JavaScript with the jqPlot library. The application queries the MySQL database for the last 24 hours’ readings. Again, I used PHP just because it is easy to deploy on my web hosting platform.

temperature_logger_screenshot

The code for this application is:

<html>
 <head>
<link rel="stylesheet" type="text/css" href="viewtemp.css">

<script language="javascript" type="text/javascript" src="jqplot/jquery.min.js"></script>
<script language="javascript" type="text/javascript" src="jqplot/jquery.jqplot.min.js"></script>
<script language="javascript" type="text/javascript" src="jqplot/plugins/jqplot.canvasTextRenderer.min.js"></script>
<script language="javascript" type="text/javascript" src="jqplot/plugins/jqplot.canvasAxisTickRenderer.min.js"></script>
<link rel="stylesheet" type="text/css" href="jqplot/jquery.jqplot.css" />

  <title>View Temperature Log</title>
 </head>
 <body>

<center><h1>Trailer Temperature</h1></center>

<div id="chart"></div>
<br><br>

 <?php 

    $con = mysqli_connect("host", "user", "password", "database");

    // Check connection
    if (mysqli_connect_errno()) {
    echo "Failed to connect to MySQL: " . mysqli_connect_error();
    }

    $sql = "select * from temperature_log tl where tl.time >= DATE_SUB(NOW(), INTERVAL 1 DAY) order by tl.time desc";

    $result = mysqli_query($con, $sql);

?>
<table id='time_temp_table'><thead><tr><th>Time</th><th>Temperature (C)</th><th>Temperature (F)</th></tr></thead><tbody>
<?php

    $temp_array = array();
    $time_array = array();
    while($row = mysqli_fetch_array($result)) {
    $temp_f = round($row['temperature'] * 1.8 + 32.0, 2);
    echo "<tr><td id='time_entry'>" . $row['time'] . "</td><td id='temp_entry'>" . $row['temperature'] . "</td><td>" . $temp_f . "</td></tr>";	
    array_push($temp_array, $temp_f);
    array_push($time_array, $row['time']);
    }   

    $temp_array_reverse = array_reverse($temp_array);
    $time_array_reverse = array_reverse($time_array);
?>
    </tbody></table>
<?php
    mysqli_close($con);
?>

<script type="text/javascript">
<?php
$js_array = json_encode($temp_array_reverse);
echo "var tempArrayAsString = " . $js_array . ";\n";
$js_array = json_encode($time_array_reverse);
echo "var timeArrayAsString = " . $js_array . ";\n";
?>

$(document).ready(function() {

	tempArray = [];
	$.each(tempArrayAsString, function(index, value) {
		tempArray.push(parseFloat(value));
	});

	// parse a MySQL "YYYY-MM-DD HH:MM:SS" timestamp into a Date (local time);
	// the explicit radix 10 keeps values such as "08" from being read as octal
	function parseTimestamp(s) {
		var d = s.split(' ')[0].split('-');
		var t = s.split(' ')[1].split(':');
		return new Date(parseInt(d[0], 10), parseInt(d[1], 10) - 1, parseInt(d[2], 10),
				parseInt(t[0], 10), parseInt(t[1], 10), parseInt(t[2], 10));
	}

	// x values are minutes elapsed since the first reading
	data = [];
	data.push([0, tempArray[0]]);
	timeDiffList = [0];
	var dt0 = timeArrayAsString.length > 0 ? parseTimestamp(timeArrayAsString[0]) : null;
	for (var i=1; i<timeArrayAsString.length; i++) {
		var dti = parseTimestamp(timeArrayAsString[i]);

		var timeDiff = (dti - dt0) / (1000. * 60.);
		timeDiffList.push(timeDiff);

		data.push([timeDiff, tempArray[i]]);
	}

	var ticksToUse = [];
	var position_dict = {};
	for (i=0; i<timeDiffList.length; i++) {
		var a = Math.round(timeDiffList[i] / 5) * 5;
		if (a % 120 == 0) {
	 		var label = timeArrayAsString[i];
			 
			if (!position_dict.hasOwnProperty(a)) {
			    ticksToUse.push([timeDiffList[i], label]);
			    position_dict[a] = true;
			}
		}
	}
	label = timeArrayAsString[i-1];
	ticksToUse.push([timeDiffList[i-1], label]);


	$.jqplot('chart',  [data], 
	{
		series: [{showMarker: false, lineWidth: 2}],
		axesDefaults : {
			tickRenderer: $.jqplot.CanvasAxisTickRenderer ,
			tickOptions: {
				angle: -80
			}
		},
		axes: {
			xaxis: {
			label: 'Time (Minutes)',
			ticks: ticksToUse,
			},
			yaxis: {
			label: 'Temperature (Fahrenheit)',
			tickOptions: { angle: 0 }
			},
		},
	});

});
</script>
 </body>
</html>
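The data preparation done by the page (Celsius to Fahrenheit, and timestamps to minutes elapsed since the first reading) can be sketched compactly in Python. The rows below are hypothetical, and `c_to_f` and `minutes_since_first` are illustrative names rather than part of the application.

```python
from datetime import datetime


def c_to_f(temp_c):
    """Convert Celsius to Fahrenheit, rounded to two decimals like the table."""
    return round(temp_c * 1.8 + 32.0, 2)


def minutes_since_first(timestamps):
    """Minutes elapsed from the first 'YYYY-MM-DD HH:MM:SS' timestamp to each one."""
    times = [datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") for ts in timestamps]
    return [(t - times[0]).total_seconds() / 60.0 for t in times]


# Hypothetical rows as (time, temperature in C), oldest first.
rows = [
    ("2014-09-01 12:00:00", 25.0),
    ("2014-09-01 12:02:00", 25.5),
    ("2014-09-01 14:00:00", 30.0),
]
minutes = minutes_since_first([ts for ts, _ in rows])
data = [[m, c_to_f(temp_c)] for m, (_, temp_c) in zip(minutes, rows)]
print(data)
```

Each `[minutes, temperature]` pair has the same shape as the points handed to jqPlot in the page above.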

The CSS for the application is:

#time_temp_table {
    border-collapse: collapse;
    background-color: lightblue;
}

#time_temp_table th {
    border: 1px solid black;
    text-align: center;
    padding: 2px 15px 2px 15px
}

#time_temp_table td {
    border: 1px solid black;
    text-align: center;
    padding: 2px 15px 2px 15px
}

Future Plans

For the web application that displays the recorded temperatures, it would be nice to add a box plot to summarize the results.

Ultimately, I’d like to connect this hardware to my air conditioner so that it will automatically turn on when a set temperature point is reached. I’ll need some high-amp relays for this.

References

1. https://learn.adafruit.com/adafruits-raspberry-pi-lesson-11-ds18b20-temperature-sensing

Related Post

engineer moves into an RV

Posted in engineering, science

rapidly extracting a subsequence from chromosome sequence data in Java

The Challenge

We have a text file containing the nucleotides of a chromosome, say human chromosome 11, and need to be able to quickly extract a subsequence from the chromosome text given a nucleotide position and number of subsequent nucleotides to include. The problem is that chromosome files are huge, e.g. 135 megabytes for chromosome 11, so we don’t want to use typical string processing tools. An example of the text we are extracting nucleotides from is:

chromosome_snapshot

Note that this file contains no line breaks or header, so that the byte position of a nucleotide corresponds to its position in the genome.

The Solution

Our solution is to use random file access to jump to the desired start nucleotide position in the file and read forward from that position for the required number of bases. Java code that implements this strategy is:

import java.io.*;

public class ReadNucleotidePositions {

    public static void main(String[] args) {

	// user settings
	Integer start = 68081400 - 1;   // base zero
	Integer numberOfCharactersToRead = 200;
	String chromosomeFile = "chromosome_11.txt";

	String output = "";

	// try-with-resources closes the file even if an exception is thrown
	try (RandomAccessFile raf = new RandomAccessFile(chromosomeFile, "r")) {

	    // seek the start position
	    raf.seek(start);

	    // repeat read operation for the number of times specified
	    for (Integer i=0; i<numberOfCharactersToRead; i++) {

		// this has to be type "int", not type "Integer" for the cast to work
		int someCharInteger = raf.read();

		// read() returns -1 at end of file; stop instead of appending it
		if (someCharInteger == -1) {
		    break;
		}

		// cast as character and append to output string
		output += (char) someCharInteger;
	    }
	}
	catch (IOException ex) {
	    ex.printStackTrace();
	}

	System.out.println();
	System.out.println(output);
	System.out.println();
    }
}

Here we start at position 68081400 and subtract one to make the coordinate zero-based. We specify that we want to read the nucleotide at this position followed by the next 199 nucleotides in chromosome 11.

We open a “RandomAccessFile” and use the “seek” method to move the file pointer to the position we intend to start reading from. Then we loop for the number of nucleotides we are reading, using the “read” method at each iteration to extract the character at that position in the chromosome text. The value returned from the “read” method is of type “int”, which we must cast to type “char” before adding it to our String object containing the extracted sequence.

run_results
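For comparison, the same seek-and-read strategy is only a few lines in Python. This sketch assumes, as above, a file with one byte per nucleotide and no header or line breaks. `read_subsequence` is a hypothetical name, and the tiny temporary file merely stands in for a real chromosome file.

```python
import os
import tempfile


def read_subsequence(path, start_1based, count):
    """Return `count` bases beginning at 1-based position `start_1based`."""
    with open(path, "rb") as f:
        f.seek(start_1based - 1)  # byte offsets are zero-based
        return f.read(count).decode("ascii")


# Tiny stand-in file; a real chromosome file would be far larger.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("ACGTACGTTTGACCA")
    demo_path = tmp.name

result = read_subsequence(demo_path, 5, 6)
tail = read_subsequence(demo_path, 14, 10)  # asks for more bases than remain
os.remove(demo_path)
print(result, tail)
```

Reading the whole span with one `read` call avoids the per-character loop, and a request that runs past the end of the file simply returns the bases that exist.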

Finally, we check the extracted sequence against a reference (in this case the UCSC Genome Browser) to ensure our sequence extraction is accurate:

UCSC_genome_browser

Posted in bioinformatics, engineering, science

invasive species spreads to low Earth orbit

After colonizing six continents and setting up scientific outposts on the seventh, Earth’s major invasive species sent some of its members into orbit. Not long after starting to use its opposable thumbs to cultivate grain, the species built rockets capable of landing individuals on Earth’s moon and delivering its members regularly to a series of low orbit scientific test labs.

The species now has its sights on Mars, which it has not visited yet in person but has landed several probes on. These probes are collecting data necessary to facilitate colonization, such as looking for evidence of readily available water. The invasive species also is looking at the asteroid belt as a source of minerals needed for further expansion.

Known for long-term thinking, the invasive species has set up laboratories on Earth to develop advanced propulsion technologies to make space travel faster, thereby facilitating a full-blown invasion of the solar system. Pressure is building for such an outward migration from the planet as the invasive species’ population reaches seven billion, a number challenging the planet’s food, water, and energy resources.

Posted in humor, science

using bug tracking software to keep track of life’s tasks

I’ve tried a few mobile task tracking apps for my smart phone, but have found none as useful for keeping track of life’s responsibilities as using a web-based software bug tracking program called MantisBT. MantisBT is used by software engineers to log reported bugs, assign them to staff for correction, and record progress toward bug resolution.

mantis_logo

To get started, I first deployed MantisBT (a PHP application) on the web so I can access it from anywhere with an internet connection. For the underlying database, I selected MySQL.

I then logged in and created five “projects”: “Badass Data Science” to log tasks related to this blog, “Bills” to track when payments are due, “School” to track school related deadlines, “Repairs” to log repair tasks for my RV or truck, and “Miscellaneous” to track tasks that don’t fit cleanly into the other four topics. Since it is easy to add projects, I am not limited to only five categories.

all_projects

When a new task comes my way, I select the relevant project from the drop-down menu and then click on the “Report Issue” link. The following screenshot shows adding a task (building a GitHub page for the pyDome geodesic dome designer) in the “Badass Data Science” project:

creating_a_new_issue

We can then click “View Issues” to see all the tasks related to the project:

BDS_all_issues

At a glance we can see indicators of task priority (column “P”) and task severity. The rows are color-coded by status with a color key on the very bottom row. For example, five of the tasks shown in the image above have status “assigned”.

When a task is worked on or completed, one can select the task from the “View Issues” page and change its status and/or add notes.

editing_an_issue

The changing status workflow provides additional opportunity to add notes, for example information related to resolving a task:

resolving_an_issue

We can also filter issues by status, for example hiding all resolved issues:

hide_resolved

 

Posted in data science, engineering

frugal anarchy

Of all the systems that we seek freedom within and from, none pervades our lives as much as the “econosphere” we inhabit. By “econosphere”, I mean the global network of economic activity whose nodes are individuals and whose edges are trade relationships between individuals.

Even if we had no government, we’d still likely be trading goods and services. Therefore the econosphere may be more significant than the existence of government to those seeking freedom. Anarchists traditionally focus on the elimination of government as the means of increasing freedom. However, I propose that limited reliance on the econosphere is a more comprehensive goal for anarchistic thinking.

There are two paths to individual economic freedom in a free-market economy. The first is to be wealthy enough to afford whatever transactions one wants to make whenever one wants to make them. This is unfortunately out of reach for most people. The second path is voluntary frugality: limiting the transactions one makes to well-thought-out targets, such that the utility and satisfaction of purchases are maximized and very few dollars are spent on things outside those targets.

This strategy of voluntary frugality limits individual reliance on the econosphere by limiting the amount of money that an individual needs to acquire and spend, thereby enhancing their freedom to choose their path in life. I cannot think of a more practical expression of anarchism within the “real world” that we inhabit today.

Related Post

engineer moves into an RV

Posted in Uncategorized

100th post to badass data science

This marks the 100th post to badass data science. I’ve written about everything from Lady Gaga to computational fluid dynamics, usually with a science or data related spin.

I thought I’d look at my posts analytically rather than simply reminisce. First, here is a tag cloud for the first 99 posts:

tag_cloud_CROPPED

From this tag cloud, I can see that either Python or R is used in many posts, and that most posts cover statistical and data science topics. Engineering is also a frequent tag.

I then produced a graph view using NetworkX, where the nodes are tags and an edge connects two tags that occur in the same post. Here is the graph displayed as VRML:

3D_distant_view

It is a little hard to see, so here is a closer view:

3D_close_view

In this image one can see that “statistics” is a primary hub. In rotated views of the graph (not shown), Python, R, and data science show similar prominence.

Finally, I computed the frequency of the top occurring tags:

tag_frequencies

From this I see that I wrote about Python more than R, which surprised me. I expected an even split. However, this finding matches the fact that I favor using Python rather than R whenever possible, because Python is a full-featured programming language capable of easy string parsing and web deployment. It also looks like the numbers of engineering posts and science posts are split evenly, which accurately reflects my technical interactions with the world.
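The tag graph and the tag frequencies can both be derived from a list of posts and their tags. The sketch below uses only the Python standard library rather than NetworkX, and the posts in it are hypothetical, so it illustrates the idea and is not the attached script.

```python
from collections import Counter
from itertools import combinations

# Hypothetical posts, each listed with its tags.
posts = [
    ["python", "statistics", "data science"],
    ["r", "statistics"],
    ["python", "engineering"],
    ["python", "statistics"],
]

# Tag frequency: how many posts carry each tag.
tag_counts = Counter(tag for tags in posts for tag in set(tags))

# Co-occurrence edges: one undirected edge per pair of tags sharing a post,
# weighted by the number of posts in which the pair appears together.
edge_weights = Counter()
for tags in posts:
    for pair in combinations(sorted(set(tags)), 2):
        edge_weights[pair] += 1

# Degree of each tag node: number of distinct tags it is linked to.
degree = Counter()
for tag_a, tag_b in edge_weights:
    degree[tag_a] += 1
    degree[tag_b] += 1

print(tag_counts.most_common(3))
print(degree.most_common(3))
```

A tag with a high degree is a hub in the sense used above: it shares posts with many other tags.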

The Future

I do not think my posts are “badass” enough for the blog’s title, so I’ll try to up the ante. Maybe I need something involving sharks and tornadoes. The post on claiming squatters’ rights is the closest I’ve come to my goal of “badass data science”.

I actually want to branch out and write more detailed analyses of less technical things, such as policy, or take highly technical topics, such as synthetic biology, and write about them in a layperson’s voice.

Most importantly, I’ll just keep writing. I’m bound to hit on something good.

Code

Code used to create the VRML shown above is attached.

process_tags.py

Posted in Uncategorized

engineer moves into an RV

I recently moved into a travel trailer to lessen the southern California cost of living (and because I like the idea of portable structures as an answer to housing scarcity). This living arrangement sparks my engineering creativity, which is the motivation for this post. Here I discuss RV living from a mechanical and software engineer’s viewpoint.

The following sections explore renting computing power since I no longer can carry a large desktop, modeling the trailer’s computational fluid dynamics (CFD) and mechanical dynamics, setting up an online registry of good boondocking sites, examining the 12 VDC system and figuring out how to add solar and wind power generation to enable off-grid living, improving the insulation and adding a thermostat to the air conditioner, and learning how to get on without my engineering textbooks (which do not fit in the RV).

Renting Computing Power

I no longer have room for a large desktop computer; instead I’m working with a laptop. This is a severe reduction in computing power, so whenever I’m working with a high computation, memory, or disk load I’ll rent computing power from Amazon’s Web Services. Amazon EC2 instances are inexpensive and easy to set up so this strategy is very practical.

Computational Fluid Dynamics

My truck manual limits the amount of surface area perpendicular to the direction of travel that can be towed, presumably due to drag forces. So I tried to use computational fluid dynamics (CFD) to find out if making the front of a trailer more aerodynamic buys one more surface area. I was unsuccessful at this experiment (other than generating cool pictures, below) since I couldn’t figure out how to get the OpenFOAM CFD package to work with compressible flow. First I ran incompressible CFD on a block-shaped trailer model:

v_block_incompressible_CROPPED

Then I ran incompressible CFD on a trailer model where the front is arced:

v_arc_incompresible_CROPPED

The pressure at the front appears to be slightly lower for the arced model, which suggests that it is more aerodynamic. However, because the modeled flow is incompressible, I could not say much about whether a curved front would actually improve towing performance. Nonetheless, I bought a trailer with an angled front on the assumption that it will reduce towing drag.
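The trade-off between frontal area and front shape can be framed with the standard drag equation, F = 0.5 * rho * v^2 * Cd * A. The sketch below is only a rough illustration of that relationship. The drag coefficients, frontal area, and speed are made-up placeholder values, not results from the OpenFOAM runs.

```python
def drag_force(cd, area_m2, speed_m_s, air_density=1.225):
    """Aerodynamic drag in newtons from the standard drag equation."""
    return 0.5 * air_density * speed_m_s**2 * cd * area_m2


def equivalent_area(cd_reference, area_reference_m2, cd_new):
    """Frontal area that produces the same drag as the reference shape."""
    return area_reference_m2 * cd_reference / cd_new


# Hypothetical values: a boxy front (Cd 1.0) versus a rounded front (Cd 0.8).
speed = 29.0  # m/s, roughly 65 mph
boxy = drag_force(1.0, 6.0, speed)
rounded_area = equivalent_area(1.0, 6.0, 0.8)
print(f"boxy drag: {boxy:.0f} N")
print(f"a rounded front could span {rounded_area:.1f} m^2 for the same drag")
```

Because drag scales with the product of Cd and area, any reduction in Cd converts directly into allowable frontal area at a fixed drag force, which is the question the CFD runs were meant to answer.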

Online Boondocking Site Registry

I won’t be boondocking (parking somewhere for the night without utility hookups) anytime soon since I have a day job that requires my presence 40 hours a week. I’m staying in an RV park where there are showers and a laundry facility. However, I’ve been thinking about the needs of boondockers, and plan to create an online registry of good boondocking sites, if one does not already exist.

The web application for such a registry will have fields for site location and ratings that users can fill in. I’ll have it generate maps if possible. For the infrastructure of the web application, I envision writing it in Python using Django, with MySQL as the supporting database. However, it may be better to use a NoSQL database like a document store—I’ll have to investigate this more thoroughly. For hosting the application, I’ll likely use an Amazon EC2 instance.
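As a starting point, the registry record and a "sites near me" query can be sketched in plain Python before committing to Django or a particular database. Everything here is hypothetical: the class name, field names, and the 1 to 5 star scale are assumptions for illustration, and the distance calculation uses the haversine formula with a mean Earth radius of 6371 km.

```python
from dataclasses import dataclass, field
from math import asin, cos, radians, sin, sqrt


@dataclass
class BoondockingSite:
    """One registry entry: a named location plus user ratings (1-5 stars)."""
    name: str
    latitude: float
    longitude: float
    ratings: list = field(default_factory=list)

    def add_rating(self, stars):
        if not 1 <= stars <= 5:
            raise ValueError("rating must be between 1 and 5")
        self.ratings.append(stars)

    def average_rating(self):
        """Mean rating, or None if nobody has rated the site yet."""
        return sum(self.ratings) / len(self.ratings) if self.ratings else None


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def sites_near(sites, lat, lon, radius_km):
    """Sites whose coordinates fall within radius_km of the given point."""
    return [s for s in sites
            if distance_km(lat, lon, s.latitude, s.longitude) <= radius_km]


# Hypothetical entry, for illustration only.
site = BoondockingSite("Example desert pull-off", 33.0, -116.0)
site.add_rating(4)
site.add_rating(5)
print(site.average_rating(), len(sites_near([site], 33.1, -116.1, 50)))
```

In a Django version these fields would map naturally onto a model, with ratings moved to a separate table keyed by site and user.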

Part of my motivation for creating such a site is that I can see myself boondocking in the future, if I find myself between jobs.

UPDATE: After writing this I found three such boondocking registry sites online. Their addresses are http://www.boondocking.org/, https://www.boondockerswelcome.com/, and http://freecampsites.net/.

Vehicle Mechanical Dynamics

While towing the trailer, I noticed that whenever a semi-truck passed me, the front of my truck was pulled toward the passing semi, and I had to turn the steering wheel to keep my truck straight. This prompted me to explore the dynamics of towing:

model

In this very simple model, a wind gust from the semi-truck is modeled as a force F_air, which rotates the trailer clockwise (about the center of mass, which is between the wheels). The force at the hitch pin is also modeled such that it rotates the trailer clockwise when it is positive. For this “back of the envelope” calculation, I ignored the forces where the tires meet the pavement to simplify things, assuming that the tires slip when lateral forces are applied.

On the tow vehicle, I again ignored the forces at the tires and joined the vehicle to the trailer with the hitch pin forces. I then constructed the above equations of motion and solved for the angular velocity of the trailer. This shows that when the trailer is forced to rotate clockwise by a wind gust from a passing semi, my truck is rotated counter-clockwise toward the passing semi, as observed.
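The equations in the figure are not reproduced in the text, so the following is only one possible linearized version of such a model, not necessarily the same one. It treats the truck and trailer as two rigid bodies pinned at the hitch, ignores tire forces, considers only lateral motion at the instant the gust arrives, and solves for the hitch force by requiring both bodies to share the same lateral acceleration at the pin. All masses, inertias, and distances are hypothetical placeholders.

```python
def hitch_response(f_air, x_f, m_t, i_t, d_t, m_v, i_v, d_v):
    """Return (hitch_force, alpha_trailer, alpha_truck) as the gust hits.

    Sign convention: lateral forces are positive to the left, angular
    accelerations are positive counter-clockwise seen from above.
    x_f is where the gust acts on the trailer, measured forward from the
    trailer's center of mass. d_t is the distance from the trailer's
    center of mass forward to the hitch pin; d_v is the distance from the
    hitch pin forward to the truck's center of mass.
    """
    # Lateral "compliance" of both bodies as felt at the hitch pin.
    compliance = 1 / m_t + d_t**2 / i_t + 1 / m_v + d_v**2 / i_v
    # Hitch force acting on the trailer (the truck feels the opposite).
    hitch = -f_air * (1 / m_t + d_t * x_f / i_t) / compliance
    alpha_trailer = (x_f * f_air + d_t * hitch) / i_t
    alpha_truck = d_v * hitch / i_v
    return hitch, alpha_trailer, alpha_truck


# Hypothetical values: a 500 N gust acting 3 m behind the trailer's center of mass.
hitch, a_trailer, a_truck = hitch_response(
    f_air=500.0, x_f=-3.0,
    m_t=2000.0, i_t=8000.0, d_t=4.0,
    m_v=2500.0, i_v=5000.0, d_v=3.0,
)
print("trailer rotates", "clockwise" if a_trailer < 0 else "counter-clockwise")
print("truck rotates", "clockwise" if a_truck < 0 else "counter-clockwise")
```

With these placeholder values the gust acts well behind the trailer's center of mass, and the two bodies rotate in opposite directions, matching the behavior described above. In this simplified model the result depends on where the gust acts: a force applied close to the trailer's center of mass turns both bodies the same way.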

12 VDC

The RV’s lights and fan run on 12 Volts DC, so they can run off the grid, but there are no 12 VDC receptacles inside the unit for running arbitrary 12 Volt appliances. However, I have 12 VDC tools that I would like to be able to run off the grid. I can get around this by removing the fuse box cover and attaching power clips to the DC input terminals for the converter:

DC_power_attachment_25

The problem with this arrangement is that the negative terminal is dangerously close to a positive one, so if, for example, my cat bumps a clip, it might cause a short circuit and potentially a fire. I could insulate the terminal with electrical tape, but that is too much of a “hack” for my taste. If I start boondocking regularly I’ll wire in real 12 VDC receptacles, and fuse them to protect against excessive current.

I added a second marine battery when I purchased the RV, so the total amp-hour capacity is double the original amount (the batteries are identical, both new, and wired in parallel). A power converter below the fuse box converts grid 120 VAC to 12 VDC to keep the batteries charged. I check the water level in the batteries weekly to make sure the water has not boiled away during charging.
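The effect of the parallel wiring on run time can be estimated with a few lines of arithmetic. This is a rough sketch only: the 100 Ah per-battery capacity, the 5 A load, and the 50 percent usable fraction (a common rule of thumb for lead-acid batteries) are all assumptions, not measurements from this RV.

```python
def parallel_bank_capacity_ah(capacities_ah):
    """Batteries in parallel keep the same voltage and add their amp-hours."""
    return sum(capacities_ah)


def runtime_hours(capacity_ah, load_amps, usable_fraction=0.5):
    """Hours a steady load can run before reaching the usable limit."""
    return capacity_ah * usable_fraction / load_amps


# Hypothetical values: two identical 100 Ah batteries feeding a 5 A load.
bank_ah = parallel_bank_capacity_ah([100.0, 100.0])
print(f"bank capacity: {bank_ah:.0f} Ah")
print(f"run time at 5 A: {runtime_hours(bank_ah, 5.0):.0f} h")
```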

Solar Power

My RV does not have solar panels installed; I’ll add them if I start boondocking regularly. I currently have a small six-watt panel (pictured below), which works as little more than a trickle charger. Since I have no charge controller, I have to check the batteries manually while using it, and I have to disconnect it at night to prevent the batteries from discharging through it (unless I add a diode to the circuit, which has the downside of lowering the current supplied to the batteries during the day). This six-watt panel certainly does not produce enough charge during the day to recover a night’s worth of DC power use.

solar_panel_25

For true off-the-grid use I’ll buy two 130-watt panels along with a charge controller. If I can find a charge controller that feeds power back to the grid when I’m plugged in, even better (it would need a built-in inverter to do this). I expect the whole package to cost about 1200 USD. I’m not ready to commit those funds while I’m living in an RV park with easy grid access, but I plan to research the equipment now so that I’m ready to buy if my living arrangements change.
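The gap between the six-watt panel and a pair of 130-watt panels can be put in rough numbers. The sketch below assumes five peak sun hours per day, a 75 percent overall system efficiency, and a hypothetical 3 A overnight load for eight hours. All three are placeholder assumptions rather than measured values.

```python
def daily_harvest_wh(panel_watts, peak_sun_hours=5.0, system_efficiency=0.75):
    """Approximate energy delivered to the batteries per day, in watt-hours."""
    return panel_watts * peak_sun_hours * system_efficiency


def nightly_use_wh(load_amps, hours, volts=12.0):
    """Energy drawn by a steady DC load overnight, in watt-hours."""
    return load_amps * hours * volts


small_panel = daily_harvest_wh(6)        # the existing trickle charger
large_array = daily_harvest_wh(2 * 130)  # the planned pair of panels
night_load = nightly_use_wh(load_amps=3.0, hours=8.0)  # hypothetical load

print(f"6 W panel:   {small_panel:.1f} Wh/day")
print(f"260 W array: {large_array:.1f} Wh/day")
print(f"night load:  {night_load:.1f} Wh")
```

Under these assumptions the small panel recovers less than a tenth of the overnight draw, while the larger array covers it several times over.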

It may be best to design and build a charge controller from scratch, given that I may want to mix solar, grid, and wind power (below) in one system, and I’m not sure if existing controllers can accommodate all three very well. Designing such a controller would be a fun project, but would take a substantial amount of time. If I do this I’ll use an Arduino or Raspberry Pi platform for computing, and use optical relays to separate the computing circuits from the power delivery circuits.

Wind Power

I have never seen an RV with a wind generator installed, but see no reason why a small turbine such as the Primus Air 30 (pictured) cannot be used.

primus-windpower-air-30-turbine-1-ar30-10-48-v

Depending on where I’m staying, I’ll perhaps want a marine-grade turbine. However, my RV is made of aluminum, which is not marine grade, so staying near the coast may be a bad idea no matter what turbine I buy.

I envision mounting it on the bumper and the upper rear of the trailer, as shown in the cheesy image below, with attachments that allow easy removal for when the RV is in motion.

wind_power_image
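Whether a small turbine is worthwhile depends heavily on wind speed, because the power available grows with the cube of the speed: P = 0.5 * rho * A * v^3 * Cp. The sketch below illustrates this. The 1.2 m rotor diameter and the power coefficient of 0.3 are assumed round numbers for a small turbine, not manufacturer figures for the Primus Air 30.

```python
import math


def wind_power_watts(wind_speed_m_s, rotor_diameter_m,
                     power_coefficient=0.3, air_density=1.225):
    """Estimated turbine output from the swept area and wind speed."""
    swept_area = math.pi * (rotor_diameter_m / 2) ** 2
    return 0.5 * air_density * swept_area * wind_speed_m_s**3 * power_coefficient


# Assumed rotor diameter of 1.2 m; compare a light breeze with a stiff one.
for speed in (3.0, 5.0, 10.0):
    print(f"{speed:4.1f} m/s -> {wind_power_watts(speed, 1.2):6.1f} W")
```

Doubling the wind speed multiplies the estimate by eight, so site selection matters far more for wind than for solar.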

Insulation

Living in San Diego County requires little use of air conditioning or heating, but while traveling from Texas to San Diego things got hot in the trailer. The “R value” for the walls, floor, and ceiling is R-7, which leaves much to be desired. The next time I take a trip across the desert I’ll cover the trailer’s windows with reflective bubble insulation while driving. My thinking is that these window covers can be held on with Velcro for easy removal.
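To get a feel for what R-7 means, the steady-state conductive heat flow through the shell can be estimated as Q = A * dT / R, where a US R-value has units of h·ft²·°F/BTU. This is a rough sketch: the 700 ft² shell area and the 30 °F temperature difference are hypothetical, and windows, air leaks, and solar gain are ignored.

```python
def heat_flow_btu_per_hour(area_ft2, delta_t_f, r_value):
    """Steady-state conductive heat flow through a surface (US units)."""
    return area_ft2 * delta_t_f / r_value


# Hypothetical values: 700 ft^2 of shell area, 30 F hotter outside than inside.
current = heat_flow_btu_per_hour(700.0, 30.0, 7.0)
doubled = heat_flow_btu_per_hour(700.0, 30.0, 14.0)
print(f"R-7:  {current:.0f} BTU/h")
print(f"R-14: {doubled:.0f} BTU/h")
```

Heat flow is inversely proportional to the R-value, so doubling the insulation halves the conductive load the air conditioner has to remove.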

Air Conditioner Thermostat

My trailer’s air conditioner has no thermostat to stop it from running once a desired temperature is reached. This causes problems in the middle of the night when things get cold but I don’t want to get up to turn the machine off. Furthermore, continuing to blow past the point of comfort wastes electricity. To deal with this I plan to splice in a household thermostat as soon as I measure the voltage at the air conditioner’s on/off switch, assuming I can make the voltages match. If a household thermostat proves unsuitable, I’ll design a controller from scratch, again using an Arduino or Raspberry Pi system.
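The core logic of a from-scratch controller is small. The sketch below shows one common approach, a cooling thermostat with a dead band (hysteresis) so the compressor does not cycle rapidly around the setpoint. It is hypothetical and hardware-free: reading the temperature sensor and driving the relay on a Raspberry Pi or Arduino are left out, and the setpoint and dead band values are placeholders.

```python
class Thermostat:
    """Cooling thermostat with a dead band to avoid rapid on/off cycling."""

    def __init__(self, setpoint_f, dead_band_f=2.0):
        self.setpoint_f = setpoint_f
        self.dead_band_f = dead_band_f
        self.cooling = False

    def update(self, temperature_f):
        """Return True if the air conditioner should be running."""
        if temperature_f >= self.setpoint_f + self.dead_band_f / 2:
            self.cooling = True    # too warm: switch on
        elif temperature_f <= self.setpoint_f - self.dead_band_f / 2:
            self.cooling = False   # cool enough: switch off
        # Inside the dead band the previous state is kept.
        return self.cooling


# Simulated readings in degrees Fahrenheit, setpoint 75 with a 2 degree band.
thermostat = Thermostat(setpoint_f=75.0)
readings = [78.0, 76.0, 74.5, 73.9, 74.5, 76.1]
states = [thermostat.update(t) for t in readings]
print(states)  # [True, True, True, False, False, True]
```

On real hardware, update() would be called every few seconds with a fresh sensor reading, and its return value would drive a relay wired across the on/off switch.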

Relying on Wikipedia and the Web for Engineering Knowledge

To fit into the RV, I had to get rid of all my engineering and science textbooks. Now I’m relying on the internet whenever I need to look something up.

Related Posts

selecting travel trailers by regression

designing a battery array to power a CPAP machine
