I decided to take a look at R this weekend between our family events. I had looked at R before when I ran across the R Tutorial. I bookmarked it and decided I would come back later. The other day at work a professor performing big data analysis started a conversation about R and offered to create some examples. He recommended working through the examples and tweaking those as a way to learn R.
Upon further consideration, I decided to go back to R Tutorial and work through the basics of input and data types. I felt like I needed a base before trying some of the examples. This approach seemed to work well, at least for me. I worked through the first 2-3 sections of the tutorial. When I got to the plotting section, things got interesting. Creating plots with the built-in functions seemed very straightforward and powerful. This led me on a quest to find more plotting libraries with subjectively more visually appealing graphs (the default plots aren't too shabby).
I had used Google's Charting Tools before and was curious to see if there was an option to use them from R. It turns out there is an excellent library called googleVis. This library creates the HTML, JavaScript and CSS required to create graphs from R. The resulting graphs are visually appealing and interactive. The one downside was that, since they rely on the Google Charting Tools, they require an Internet connection to pull down the required JavaScript from Google. I will probably use googleVis in the future for some charts, but it wasn't ideal for what I had in mind.
The next graphing library I looked at was ggplot2. From the website examples, this library appeared to be exactly what I was looking for. Each of the functions is well documented in the reference manual. When I started trying to use the library, I had already pulled in my dataset from a CSV file. My initial problem was figuring out how the data was passed to the ggplot function and how this related to the qplot function. The "grammar of graphics" was getting the best of me. After searching the web, I was able to find the R Cookbook, which had more examples of scatter plots, not to mention a lot of other good R information. These examples provided the missing link for me: how to supply the data to the graphing functions to get the plots to work. With the combination of ggplot2 and R Cookbook, I was able to create graphs that provided some additional insight into the data.
Other Thoughts
Some other things I wanted to note:
Installing packages available on CRAN is extremely easy and they just worked.
After starting with the binary for R, I then found RStudio. It looks like it is in early development, but it quickly became my default environment. With an editor, workspace and console, it was hard to find something better.
R has a "batch" processing mode (plus Rscript) which looks interesting for processing data outside the environment.
These were just some of my early experiences with R. Overall, I really enjoyed using R and the ideas started flowing on how I might use it: everything from analyzing spending at home to analyzing data at work. Next, I hope to look at using R with ggplot2 to analyze the results from Apache Bench. If the results turn out interesting, I hope to find time to share them.
Running a virtual machine is extremely handy for development or trying out different configurations. VirtualBox is handy virtualization software, especially since it is free. My goal was to set up a virtual machine to familiarize myself with a few different configuration management tools. I could have tried Vagrant, but since we use CentOS for most of our servers and Vagrant defaults to Debian, I went ahead and installed CentOS myself from a boot.iso.
I wanted to start my CentOS virtual machine in VirtualBox and then minimize it. I planned on using ssh to connect from my host machine's terminal. Since my guest machine was running in NAT mode, I had to tell VirtualBox to forward a port from my host machine to my guest machine. I decided to forward port 2222 on my host machine to port 22 on my guest machine for ssh. From the Terminal, I ran the following commands:
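The original commands are not reproduced here. As a rough sketch only, the legacy setextradata form of NAT port forwarding looked something like the output of the script below. The "pcnet" adapter in the path and the rule name "ssh" are assumptions on my part, not values taken from this setup, and the helper function forward_port_cmds is made up for illustration. The script only prints the commands so they can be checked first.

```shell
#!/bin/sh
# Print the VBoxManage commands that forward a host port to a guest port
# for a VM running in NAT mode. Nothing is changed until the output is run.
#
# Assumptions (not from the original post): the guest uses the "pcnet"
# network adapter, and the forwarding rule is named after the service.
forward_port_cmds() {
    vm=$1; rule=$2; host_port=$3; guest_port=$4
    key="VBoxInternal/Devices/pcnet/0/LUN#0/Config/$rule"
    printf 'VBoxManage setextradata "%s" "%s/Protocol" TCP\n' "$vm" "$key"
    printf 'VBoxManage setextradata "%s" "%s/GuestPort" %s\n' "$vm" "$key" "$guest_port"
    printf 'VBoxManage setextradata "%s" "%s/HostPort" %s\n' "$vm" "$key" "$host_port"
}

# Host port 2222 -> guest port 22 (ssh) for the VM named "Centos5".
forward_port_cmds Centos5 ssh 2222 22
```

Once the three printed commands look right for your VM, they can be piped to sh before booting the guest. Newer VirtualBox releases can do the same thing with a single call such as VBoxManage modifyvm "Centos5" --natpf1 "guestssh,tcp,,2222,,22".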
Your VM name "Centos5" could be different as well as the "VBoxInternal..." path. Once this was done, I booted my VM and then I was able to ssh to the guest:
ssh -p 2222 127.0.0.1
From there, I was able to ssh to my local VM and try out some different configuration management tools.
A few weeks ago I wanted to start learning Erlang. A co-worker pointed out the Ruby Programming Challenge for Newbies that they were completing in Ruby. I decided to try the RPCFN #4, but write it in Erlang. This probably isn’t the most concise or best implementation, but it was a good exercise to encourage me to look at Erlang.
%% This was inspired from the ruby programming challenge for newbies.
%% http://rubylearning.com/blog/2009/11/26/rpcfn-rubyfun-4/
-module(polynomials).
%% -compile(export_all).
-export([poly_epr/1]).
-import(string, [concat/2]).
-import(lists, [append/1]).

%% include the test module
-include_lib("eunit/include/eunit.hrl").

%% Create a polynomial expression from an array of numbers
poly_epr(List) when is_list(List), length(List) >= 2 ->
    P = gen_epr([], List),
    case R = join_epr(P) of
        "" -> "0";
        _ -> R
    end;
poly_epr(_) ->
    {error, "Need at least 2 coefficients"}.

%% generate the polynomial expression
gen_epr(Poly, []) ->
    case Poly of
        [] -> "0";
        _ -> Poly
    end;
gen_epr(Poly, [H|T]) ->
    Poly ++ gen_epr([term(H, length(T))], T).

%% join the expression terms together
join_epr([]) -> "0";
join_epr([H|T]) ->
    H ++ append([check_neg(X) || X <- T]).

%% add appropriate sign in front of expression term
check_neg([]) -> "";
check_neg(Val = "-" ++ _T) -> Val;
check_neg(Val) -> concat("+", Val).

%% create an expression term
term(1, Expo) -> expo(Expo);
term(-1, Expo) -> "-" ++ expo(Expo);
term(0, _Expo) -> "";
term(Val, 0) -> integer_to_list(Val);
term(Val, Expo) when is_number(Val), is_number(Expo) ->
    concat(integer_to_list(Val), expo(Expo));
term(_Val, _Expo) -> "".

%% create the exponent expression
expo(1) -> "x";
expo(E) -> concat("x^", integer_to_list(E)).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% TESTS

%% term tests
term1_test() ->
    "x^2" = term(1, 2).
term_negative_value_test() ->
    "-x^2" = term(-1, 2).
term0_test() ->
    "" = term(0, 5).
term_zero_exponent_test() ->
    "5" = term(5, 0).
term_bad_values_test() ->
    "" = term("str", "more").

%% poly tests from the rpcfn
poly_epr1_test() ->
    ?assert("3x^3+4x^2-3" =:= poly_epr([3, 4, 0, -3])).
poly_first_negative_test() ->
    ?assert("-3x^4-4x^3+x^2+6" =:= poly_epr([-3, -4, 1, 0, 6])).
poly_simple_test() ->
    ?assert("x^2+2" =:= poly_epr([1, 0, 2])).
poly_first_minus_one_test() ->
    ?assert("-x^3-2x^2+3x" =:= poly_epr([-1, -2, 3, 0])).
poly_all_zera_test() ->
    ?assert("0" =:= poly_epr([0, 0, 0])).
poly_test_error_test() ->
    {error, Msg} = poly_epr([1]),
    ?assert("Need at least 2 coefficients" =:= Msg).
You need to have eunit set up in your code path. Then you can start the Erlang shell and run the tests. Really, they pass!
$ erl
Erlang R13B03 (erts-5.7.4) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]

Eshell V5.7.4  (abort with ^G)
1> c(polynomials).
{ok,polynomials}
2> polynomials:test().
  All 11 tests passed.
ok
3>
Autotest, which is part of ZenTest, is a very handy testing application. It runs tests as changes are made to the code. When using it, I would accidentally leave it running and then notice something using up CPU cycles. It would turn out to be the autotest process, which was still scanning files for changes every so often. I would then stop autotest, only to have to start it up again the next time I worked on the project.
A while ago, I ran across autotest-fsevent for the Mac. It uses the Mac's FSEvent core service to be notified when files change, rather than scanning for changes. This appears to have really cut down on CPU usage, especially when nothing has changed, and autotest is still notified immediately when a file does change.
I would also recommend upgrading to the latest autotest-growl. I was using an older version and there have been improvements.
This year I made the trip to Oshkosh for the EAA AirVenture airshow. One of the coolest planes I got to see there (and there were many nice planes) was the WhiteKnightTwo. The last time I was at Oshkosh, I was able to see Scaled Composites' Boomerang, which was an aerodynamic wonder. Looking at the WK2, the twin-fuselage approach reminded me of the Boomerang, but it was different because of the symmetry of the plane. In looking for differences, I noticed the right fuselage had an extra exhaust on the upper outside in the back. I could not reason out what it was used for, but it was a difference. It also appeared that there was more visible wiring in the left fuselage, presumably for flight instruments.
Overall, it is a fascinating aircraft, especially if it allows for the start of commercial space flight. I can't wait to see it happen, and the folks at Scaled have done some pretty remarkable things.
Last week, my wife and I put down new mulch in the natural areas of our yard. As I was loading cart after cart of mulch to lug across the yard, I started thinking we might not be able to get all 8 yards of mulch spread that day. I really wanted to get this done so I didn't have to worry about it over the weekend. Looking at the pile, it did not seem to be getting any smaller and it was approaching lunch time. A thought came to mind: how can I make this into a smaller task that I could accomplish in a few hours instead of looking at the entire pile over the period of a day? What if I worked to split the mulch in the middle into two separate piles (dividing and conquering)? That could make the task interesting (kids might like it) and give me a smaller goal to attempt to reach.
How many times do I find myself asking that question? How can I break task X down into something smaller so I can feel like I am accomplishing something? My understanding of goal setting comes from studying GTD, Agile development processes and a history of playing basketball. GTD and Agile development teach this as one of their core concepts (if I understand them correctly). The further along I got playing basketball, the more I found we were breaking down plays or watching video in smaller chunks to analyze how we could get better. Breaking issues down into smaller tasks seems to be fairly important and is a skill that I frequently find myself using (as long as I don't over-plan).
Back to my pile of mulch. I was able to get it divided into two halves.
From there, I proceeded to break it into smaller tasks. I focused on the smaller half first and then started breaking off the corners of the larger half that was left. It sure did help and we were able to get all of the mulch spread by the end of the day (YAY!). Once again, setting smaller goals may not have made a difference in how fast I got my part done, but it did help me focus on the small units of work that I needed to get done.
While playing sports (namely basketball) for a number of years, I played on a number of different teams and had a number of different coaches (2 main coaches in high school and college). Looking back, it was very interesting to see the different styles in coaches and players. I was pretty lucky for the most part: the majority of my experiences were on teams that understood the concept of "team play".
One thing I didn't completely realize was how important the coach's leadership was and the values they instilled. When a player first joined the team, they wouldn't completely fit in, or at the very least they would struggle a little. With these core values, provided by the coach originally and promoted by upperclassmen, the freshman (or new transfer) had a base to build from. As they grew as a player and person (going from freshman to sophomore and so on), their playing skills would develop, but they would also continue to 'buy in' (or gel) to the team values. Of course, this happened at a different pace for each player.
I am starting to see that again in agile software development. As teams shift and grow, you go through this process over and over. I think it is important to have those underlying values, but the overall appearance of the team will reflect the current players. I am learning there is a delicate balance between emphasizing values, letting the team gel and using the strengths of the new players. If too much emphasis is placed on values, it suppresses the strengths of new players. If the values are shifted too much, you lose the history that has brought you success in the past. I feel like the point that allows your team to gel the quickest is somewhere in the middle, depending on the number of returning starters you have from the previous season (or project).
I really like CouchDB and the flexibility it provides. It is currently under heavy development (the last release was an incubating release) and things are changing frequently. Most of the time, this means improvements or new features, but sometimes this leads to breaking backward compatibility. The other night I tried the latest clone of couchrest with this commit. I looked at the breaking changes page (based on the comment on the commit on github). I found the reference to the moved view URLs which was the problem I was having. After reading the discussion on the mailing list, it sounds like it is a good idea and is worth breaking backward compatibility.
To update to the latest version of CouchDB (basically working from the trunk), here are the steps I took:
This (home grown) blog is now running on Ruby 1.9. Since I wrote it using Sinatra and couchrest, it was pretty easy to get working. The only real trick was adding a fake JSON gem to fulfill a dependency on JSON (which is now part of Ruby 1.9). I found this reference on IsItRuby19, which came in very handy in determining which gems work on Ruby 1.9.
Last night, Esther Derby came and talked to the local Agile RTP group. Her talk was called "What's a Manager To Do?". With agile teams being self-organizing, what is the manager's role in this situation? An agile team can start to take on some of the items that a manager used to do. Task assignment and tracking are two items that seemed like good candidates for the team to take over. There are some things that require a manager that aren't going away. HR-related issues and budgeting are two examples.
There are several grey areas that could go either way depending on the maturity and organization of the agile team. For instance, conflict/friction issues might need to be handled by a manager, but for the health of the team, might be best addressed at the team or individual level. If someone reports to their manager that someone on the team isn't performing and the manager goes to this individual and says 'I heard you weren't performing (doing or not doing X)', this isn't going to be good for the team. Who can they trust on the team? If the team has been letting an issue fester for a long time (not addressing the issue with the other team member), this can cause even more issues. From Esther's experience, it sounded like 1) openness between the team members is critical and 2) the manager should really fine-tune their observation skills so they can catch issues as they come up.
Another grey area is the decision making process. If the team assumes they have the right to make decision X and the manager vetoes this right, that can really discourage the team. It can also work the other way, where the team is assuming the manager is going to make a decision, but the manager is expecting that from the team. Esther's suggestion was to lay out a decision matrix that helps clarify who is responsible for making decisions. Although it wasn't directly said (or I didn't hear it), I suppose this helps with the transparency of the team, which is a common theme in the agile space.
An underlying theme, from my perspective, was the reference to teams, in particular team sports, where it truly is a group of players working together (her example was basketball). Many of the agile concepts that appeal to me came from my experiences playing basketball. I am frequently reminded of how similar my experiences playing on a team were to being on a team at work, in everything from how to interact with others to keeping a goal in mind and staying on target.
Overall, the talk was good and reaffirmed some of the ideas we were already using. Clarifying decision making, dealing with friction and the "team" concept were some of the key points I am taking away from last night.