Friday, June 27, 2008

hackystat data in json

so, i got to thinking that it would be cool to switch to json for our REST data-interchange format. it should speed up the network transfer. for example, 10 of my xml offline files from my last post about hackystat performance add up to 1.64 MB, but the same data stored in inline (compact) json is 876 KB. about half the size! i'm not sure, but i think the parsers for json are really fast too.

here is an example of the xml and json formats (i found a cool little converter at http://www.thomasfrank.se/xml_to_json.html):


<sensordata>
   <timestamp>2008-06-26T21:11:54.314-10:00</timestamp>
   <runtime>2008-06-26T21:11:54.314-10:00</runtime>
   <tool>Checkstyle</tool>
   <sensordatatype>CodeIssue</sensordatatype>
   <resource>E:\java\svn\hidden\hidden\hidden\package.html</resource>
   <owner>kagawaa@hahah.hahaha</owner>
   <properties/>
</sensordata>


{
   "sensordata": {
     "timestamp": "2008-06-26T21:11:54.314-10:00",
     "runtime": "2008-06-26T21:11:54.314-10:00",
     "tool": "Checkstyle",
     "sensordatatype": "CodeIssue",
     "resource": "E:\\java\\svn\\hidden\\hidden\\hidden\\package.html",
     "owner": "kagawaa@hahah.hahaha",
     "properties": {}
   }
}

trust me, if you put json in inline form you'll get big savings (i guess that's the same for a lot of formats).
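here is a minimal sketch of the inline-vs-indented comparison, using ruby's standard json library on the one record from the example. the constant and method names (RECORD, json_sizes) are made up for illustration, and a single record won't reproduce the 1.64 MB vs 876 KB numbers; it only shows how much of the size is whitespace.

```ruby
require 'json'

# one sensor data record, mirroring the example above
RECORD = {
  'sensordata' => {
    'timestamp' => '2008-06-26T21:11:54.314-10:00',
    'runtime' => '2008-06-26T21:11:54.314-10:00',
    'tool' => 'Checkstyle',
    'sensordatatype' => 'CodeIssue',
    'resource' => 'E:\java\svn\hidden\hidden\hidden\package.html',
    'owner' => 'kagawaa@hahah.hahaha',
    'properties' => {}
  }
}.freeze

# returns the byte size of the indented and the inline (compact) serialization
def json_sizes(data)
  {
    pretty: JSON.pretty_generate(data).bytesize,
    compact: JSON.generate(data).bytesize
  }
end

sizes = json_sizes(RECORD)
saved = 100.0 * (sizes[:pretty] - sizes[:compact]) / sizes[:pretty]
puts "pretty:  #{sizes[:pretty]} bytes"
puts "compact: #{sizes[:compact]} bytes"
puts format('savings: %.1f%%', saved)
```

the same trick (dropping indentation and newlines) works for xml too, so a fair comparison is compact json against compact xml.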

one of the cool things about Ruby on Rails (and this is just an example) is how it handles all the different data-interchange formats: it lets the client decide. here is a snippet from a rails controller:

respond_to do |format|
   # rails runs the block matching the format the client asked for (Accept header or url extension)
   format.html
   format.xml { render :xml => @sensordata.to_xml }
   format.yaml { render :inline => @sensordata.to_yaml }
   format.js { render :text => @sensordata.to_json }
   format.json { render :json => @sensordata.to_json }
end

anyway, just a thought.

Thursday, June 26, 2008

hackystat performance

i recently started a discussion on hackystat-dev about some weirdness i was noticing when sending offline data to the hackystat sensorbase.

here is what i said,

  • I had a lot of offline data
  • I executed an ant sensor
  • the sensor started to send the offline data
  • i decided that i didn't want to wait for the offline data and did a Ctrl+C to kill the ant task.
  • the client seemed to recover fine. i didn't check the task manager or anything.
  • but, i had a remote desktop connection to the sensorbase and noticed that the server continued to receive data for quite a while. i eventually stopped the server too and restarted it.


    the discussion continued, and i got a somewhat strange response from philip that caught me a little off guard.

    > Is it the asynchronous nature of a REST post?
    No, it's the nature of TCP and socket-based communication:

    <http://www.ncsa.uiuc.edu/~vwelch/net_perf/tcp_windows.html>

    Google on "socket buffer size" for more related links.
    Again, this is only my _hypothesis_ as to why you were receiving data after you
    killed the sending process.


    austen and i started to talk about this at work. austen initially agreed with philip. but, i still scratched my head... hm.. doing a search on the "socket buffer size" showed me that

    Typical network latency from Sunnyvale to Reston is about 40ms, and Windows XP has a default TCP buffer size of 17,520 bytes. Therefore, Bob's maximum possible throughput is:

    17520 Bytes / .04 seconds = .44 MBytes/sec = 3.5 Mbits/second

    The default TCP buffer size for Mac OS X is 64K, so with Mac OS X he would have done a bit better, but still nowhere near the 100Mbps that should be possible.

    65536 Bytes / .04 seconds = 1.6 MBytes/sec = 13 Mbits/second
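the arithmetic in that quote is just buffer size divided by round-trip time. here is a small sketch that redoes it, assuming 64K means 65,536 bytes and using the 40 ms latency from the quote (which is their example network, not mine). the method and constant names are made up.

```ruby
# upper bound on tcp throughput when at most one buffer's worth of data
# can be in flight per round trip
def max_throughput_bytes_per_sec(buffer_bytes, rtt_seconds)
  buffer_bytes / rtt_seconds.to_f
end

RTT_SECONDS = 0.04 # 40 ms, from the quote above

BUFFER_SIZES = {
  'Windows XP default' => 17_520,
  '64K buffer' => 64 * 1024
}.freeze

BUFFER_SIZES.each do |label, bytes|
  bps = max_throughput_bytes_per_sec(bytes, RTT_SECONDS)
  puts format('%-20s %.2f MBytes/sec = %.1f Mbits/sec',
              label, bps / 1_000_000, bps * 8 / 1_000_000)
end
```

the point of the formula is that a small buffer caps throughput no matter how fast the link is, because the sender has to wait for acknowledgements before sending more.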



    so... the delay i was seeing couldn't possibly be from 64K buffers, could it? i'm not sure. so i designed a little experiment. here is what i did.

  • i created a build's worth of offline data by giving my sensorshell.properties file a bogus password. NOTE: a build's worth of data is 11.3 MB.

  • i shut down the sensorbase (we are using austen's postgres version)

  • deleted all the logs from the sensorbase and my client

  • brought up a build for a project and executed ant checkstyle (which calls the sensor)

  • i watched the consoles and logs


    here are some interesting results.

    here is my Checkstyle.log file (edited to save horizontal space):
    22:44:57 Hackystat SensorShell Version: 8.1.530
    22:44:57 SensorShell started at: Thu Jun 26 22:44:56 HST 2008
    22:44:57 SensorProperties
    sensorshell.autosend.maxbuffer : 250
    sensorshell.autosend.timeinterval : 1.0
    sensorshell.logging.level : INFO
    sensorshell.multishell.autosend.timeinterval : 0.05
    sensorshell.multishell.batchsize : 499
    sensorshell.multishell.enabled : false
    sensorshell.multishell.maxbuffer : 500
    sensorshell.multishell.numshells : 10
    sensorshell.offline.cache.enabled : true
    sensorshell.offline.recovery.enabled : true
    sensorshell.sensorbase.host : http://blah:9876/sensorbase
    sensorshell.sensorbase.user : kagawaa@hahaha.hahaha
    sensorshell.statechange.interval : 30
    sensorshell.timeout : 10
    sensorshell.timeout.ping : 2
    sensorshell.properties file location: C:\..\sensorshell.properties
    22:44:57 Type 'help' for a list of commands.
    22:44:59 Host: http://naraku:9876/sensorbase/ is available.
    22:44:59 User akagawa@referentia.com is authorized to login at this host.
    22:44:59 Maximum Java heap size (bytes): 66650112
    22:44:59 AutoSend time interval set to 60 seconds
    22:45:00 Pinged http://blah:9876/sensorbase/ in 188 ms. Result is: true
    22:45:00 Checking for offline data to recover.
    22:45:00 Invoking offline recovery on 48 files.
    22:45:59 Timer-based invocation of send().
    22:46:59 Timer-based invocation of send().
    22:47:59 Timer-based invocation of send().
    22:48:59 Timer-based invocation of send().
    22:49:55 #> quit
    22:49:55 #> send
    22:49:55 Pinged http://blah:9876/sensorbase/ in 0 ms. Result is: false
    22:49:55 Server not available. Storing commands offline.
    22:49:55 Stored 4 sensor data instances in:
    C:\...\sensorshell\offline\2008.06.26.22.49.55.426.xml
    22:49:55 Quitting SensorShell started at: Thu Jun 26 22:44:56 HST 2008
    22:49:55 Total sensor data instances sent: 0



    at the end of the sensor execution it says it can't send 4 sensor data instances! what happened to the server!? hm.. what is going on? that suggests to me that sending the offline data made the sensorbase unavailable. so, i looked in the Checkstyle-offline-recovery.log file:

    22:45:00 Maximum Java heap size (bytes): 66650112
    22:45:00 AutoSend disabled.
    22:45:00 Invoking offline recovery on 48 files.
    22:45:00 Recovering offline data from: 2008.06.26.21.12.01.081.xml
    22:45:00 Found 251 instances.
    22:45:00 Invoking send(); buffer size > 250
    22:45:00 #> send
    22:45:01 Pinged http://blah:9876/sensorbase/ in 204 ms. Result is: true
    22:45:01 Attempting to send 251 sensor data instances.
    Available memory (bytes): 63071744
    22:45:12 Error sending data: org.hackystat....SensorBaseClientException:
    1001: Unable to complete the HTTP call due to a communication error
    with the remote server. Read timed out
    22:45:12 org.hackystat.sensorbase.client.SensorBaseClientException: 1001:
    Unable to complete the HTTP call due to a communication error with the
    remote server. Read timed out at org.hackystat.sensorbase.client.
    SensorBaseClient.putSensorDataBatch(SensorBaseClient.java:827)
    at org.hackystat.sensorshell.command.SensorDataCommand.
    send(SensorDataCommand.java:82)
    22:45:12 Exception during send(): org.hackystat.sensorshell.
    SensorShellException: Could not send data: error in SensorBaseClient
    22:45:12 About to send data
    22:45:12 Successfully sent: 0 instances.
    22:45:12 Did not send all instances.
    C:\...\sensorshell\offline\2008.06.26.21.12.01.081.xml not deleted.


    in the Checkstyle-offline-recovery.log file, i can't find one successful send. but, i know i sent data over. i have no idea, but maybe the "Timer-based invocation of send()." was able to send data, and there is just no log of that timer-based send. anyway, so there was a problem with the server. but, the logs on the server show nothing. i just get a whole bunch of these:

    22:55 Put: 2008-06-26T21:11:56 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:56 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:56 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:56 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:56 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:56 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:56 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:56 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:56 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:56 k@gm.com Checkstyle CodeIssue
    22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue


    so i have no idea why the server wasn't responding. at the end of all of that i only see 1,742 entries in the database (there might be a problem with the sensordatatype for coverage). so that's 10 minutes of work for a very small number of data entries. i would guess there were tens of thousands of possible entries in the offline data.
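to put a number on "very small", here is a rough sketch of the rates implied by the figures above: 1,742 entries stored in roughly 10 minutes (22:45 to 22:55), and 11.3 MB of offline data on disk. it assumes 1 MB = 1,000,000 bytes, and the byte rate is only an upper bound since most of the 11.3 MB never made it into the database. the names are made up.

```ruby
ENTRIES_STORED = 1_742        # rows counted in the database
ELAPSED_SECONDS = 10 * 60     # roughly 22:45 to 22:55
OFFLINE_BYTES = 11_300_000    # 11.3 MB, assuming 1 MB = 1,000,000 bytes

# average amount handled per second over the elapsed time
def rate_per_second(amount, seconds)
  amount / seconds.to_f
end

puts format('%.1f entries/sec', rate_per_second(ENTRIES_STORED, ELAPSED_SECONDS))
puts format('at most %.1f KB/sec', rate_per_second(OFFLINE_BYTES, ELAPSED_SECONDS) / 1_000)
```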

    here is another thing. compare when my Checkstyle.log says the sensor process ended with the last entry on the sensorbase:

    last entry from Checkstyle.log
    06/26 22:49:55 Total sensor data instances sent: 0

    last entry on sensorbase
    06/26 22:55 Put: 2008-06-26T21:11:55 k@gm.com Checkstyle CodeIssue


    that's 6 minutes after the sensor ended. so, getting back to those 64K buffer caches... that seems a little strange to me. if that's really what is happening, there must be a lot of caches.
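a quick sanity check on "a lot of caches": how many 64K buffers would it take to hold a whole build's worth of offline data? this is only a sketch. it assumes 1 MB = 1,000,000 bytes and a 65,536-byte buffer, and the method name is made up.

```ruby
# number of fixed-size buffers needed to hold total_bytes of pending data
def buffers_needed(total_bytes, buffer_bytes)
  (total_bytes.to_f / buffer_bytes).ceil
end

OFFLINE_DATA_BYTES = 11_300_000 # 11.3 MB, assuming 1 MB = 1,000,000 bytes
SOCKET_BUFFER_BYTES = 64 * 1024 # 64K

puts "#{buffers_needed(OFFLINE_DATA_BYTES, SOCKET_BUFFER_BYTES)} buffers of 64K"
```

so a single 64K socket buffer can only account for a tiny slice of the data; for buffering alone to explain minutes of trailing puts, something else would have to be queueing the rest.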

    anyway, there are two things happening.

    1) i can send data to the hackystat sensorbase server way way faster than hackystat can consume
    2) because of that backlog on the server, it's not letting me send more data

    that worries me. 11.3 MB of data is really tiny; after all, it's just one full build of our system with sensors turned on. here are some questions:

    1) how fast can hackystat consume data?
    2) what is happening such that data is being processed on the server 6 minutes after the client is finished sending?
    3) what's the current bottleneck?
    4) how many users can send data (and how much data) simultaneously without killing the server?

    hm.. there are a lot of moving parts to this, so this is just the start of looking into it. but, all i know is that 11.3 MB is really small. in my other project we move that amount of data in a handful of seconds, not even close to minutes.

    ps. i had to delete my offline data. :(

    Sunday, June 22, 2008

    pono: do whats right



    an excerpt from maui news, an invitation to do what's right: PONO.

    On June 22, 2004, their 3-year-old son, Pono Viela, died from injuries sustained when the all-terrain vehicle he was riding with his father flipped over.

    In death, the pono campaign was born. Since then, the couple and their daughter, Jrae, and son, Jai, have adopted the “Pono Do What’s Right” motto and performed good deeds throughout the community, such as adopting a portion of the highway near their home for cleanup, staging a golf tournament that has raised tens of thousands of dollars for local nonprofit organizations, running sports programs, accepting invitations to tell their story in the hopes of helping and inspiring others to do good.


    why am i writing about this? well, it is a tragic story but a lot of good has come of the Pono movement. and i'm so proud to say that my cousin dana and friend jen have been heavily involved with pono.

    In printing the T-shirts, Paracuelles went to the same company, Spread Pono Co., that designs and prints T-shirts for the Vielas. In addition, owners Jennifer Tengan and Dana Kagawa print shirts using environmentally friendly water-based ink.

    The reef T-shirts, which are $18 for adults and $16 for children, will be sold at the event and are available at Working Mommies.

    During Reef Night, the Vielas will hold a “Pono Fashion Show” at 7:50 p.m. to unveil their new T-shirt designs. In handing over the Pono T-shirt lines to Tengan and Kagawa recently, Maile Viela said she was hoping to use the talents of the young graphic artists to design shirts attractive to the younger set.

    The Pono T-shirts are $18 for adults and $16 for children. They can be obtained at the Reef Night, at Working Mommies in Wailuku and from the Web site www.spreadpono.com.


    i'm sporting my pono t-shirt today, but i try to live pono every day. i wish i made the trip out to maui today, because for some reason i feel like i need to learn more about Pono Viela, his family, and more about the pono movement.

    Thursday, June 19, 2008

    embrace your imagination

    Somebody much smarter than me once said that pessimists are usually right, optimists are usually wrong, but all the great breakthroughs in history were done by optimists. - Thomas Friedman.


    i just read an awesome blog about The most important competition is the one between you and your own imagination. here are some excerpts:

    In the latest edition, I added a whole section on why liberal arts are more important than ever. It’s not that I don’t think math and science are important. They still are. But more than ever our secret sauce comes from our ability to integrate art, science, music and literature with the hard sciences. That’s what produces an iPod revolution or a Google.


    “One thing we know about creativity,” he says, “is that it typically occurs when people who have mastered two or more quite different fields use the framework in one to think afresh about the other.” --Marc Tucker


    the message that i take away from this blog post is to not give up on my imagination. i just so happened to be thinking about this recently and have been posting random ideas that i have:
  • a firefox extension idea for copying links
  • an idea called moody
  • wordle for my blogs and an idea
  • an idea for twitter
  • an idea for ESPN

    i've always practiced communicating my thoughts. now, i've found myself practicing communicating my imagination. it's somewhat different to formalize imagination.

    i'll make sure that i give my imagination a fighting chance.
    Wednesday, June 18, 2008

    a firefox extension idea for copying links

    so, i did some searching around the internet for some cmmi references tonight and found some interesting ones that i wanted to share with my team. so, i started to write a wiki page with the links (hopefully i'll add some summary too). anyway, it occurs to me that i need to visit the page twice to copy. one time to get the title of the page (i hate links that are just the url) and another time to get the url. so i copy the title and url separately.

    there needs to be a "copy title and url" function. that just basically writes this on a paste:
    <a href="http://www.sei.cmu.edu/cmmi/adoption/books.html">CMMI Books</a>
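the core of that function is tiny. here is a sketch of the string-building part only (not an actual firefox extension), assuming the title and url are already in hand. link_html is a made-up name; ERB::Util.html_escape is from ruby's standard library and keeps characters like & and < in a title from breaking the markup.

```ruby
require 'erb'

# build the html that a "copy title and url" command could put on the clipboard
def link_html(title, url)
  safe_url = ERB::Util.html_escape(url)
  safe_title = ERB::Util.html_escape(title)
  %(<a href="#{safe_url}">#{safe_title}</a>)
end

puts link_html('CMMI Books', 'http://www.sei.cmu.edu/cmmi/adoption/books.html')
```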

    here is a random side comment: i've just realized that one of the cool things in confluence is that it dynamically builds links with a macro. for example, if the link was within confluence you'd just have to do [CMMI Books] and confluence will create the correct link for you. so, for linking to confluence pages you only need to know the name of the page. that's cool! (i told you that was random)

    Monday, June 16, 2008

    an idea called moody

    i was chatting with ryank about a few things and had an idea for an application called moody.

    aaron: its kinda like twitter but its a score posting.
    aaron: you can contribute to the mood of the group.
    aaron: your friends are "grumpy" on a scale from 1 - 10
    aaron: random times during the day you can set your mood.
    ryan: i like the name
    ryan: sounds interesting
    aaron: i think things like myspace has it.
    aaron: but never really aggregates it by groups.
    aaron: haha. i just want it for the engineering group.
    aaron: and have "focused" as an option.
    ryan: that would be cool
    ryan: you know...the more i think about it...the more i like it


    i think the idea is just to be able to get the average mood of a group of people. just to be aware of how a group is feeling.
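the aggregation itself could be as simple as this sketch. everything in it is hypothetical: the post format, the group names, and the scores are made up just to show averaging a 1 - 10 mood score per group.

```ruby
# hypothetical mood posts: [group, person, score on a 1 - 10 scale]
MOOD_POSTS = [
  ['engineering', 'aaron', 7],
  ['engineering', 'ryan', 9],
  ['engineering', 'someone else', 5],
  ['sales', 'another person', 4]
].freeze

# average mood score for each group
def average_mood_by_group(posts)
  posts.group_by { |group, _person, _score| group }
       .transform_values do |rows|
         rows.sum { |_group, _person, score| score } / rows.size.to_f
       end
end

average_mood_by_group(MOOD_POSTS).each do |group, mood|
  puts format('%-12s %.1f', group, mood)
end
```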

    woah... a lot of people post twitter posts that imply their mood. what if the system could figure out your implied mood? or or... here is another random idea. what if twitter could predict what you are going to post? that would be kind of scary.

    Thursday, June 12, 2008

    wordle for my blogs and an idea

    greg wilson's blog post about wordle got me interested enough to try it on the last three months of my blog. so, i copied and pasted all the text from my blog and got this:



    it looks like for some reason i write about students a lot. i hardly see any software terms in there. i guess this isn't much of a technical blog. haha. wait a minute, i don't even see hackystat in there. i better write more about hackystat, hackystat, hackystat. :)

    what does your wordle look like?

    here is an idea
    wordle should make wordles for your twitter feed. here is a wordle from ian's tweets for the past couple weeks:


    (twittering ian's wordle - this was my first version using screen scrape from the twitter website)



    (twittering ian's wordle - this is my second version using rss feeds)

    Wednesday, June 11, 2008

    cmmi ftw

    it is official, i am officially trained in cmmi v1.2 via the official training course, Introduction to CMMI Version 1.2. according to sei, i am one out of about 54k people that have taken the intro to cmmi course. haha. an elite group to say the least! here is a very high level overview of cmmi.

    in God we trust, all others bring data --Deming


    when i did a google image search for cmmi i got some entertaining graphics:

    woah.. that looks really simple. not! and that's level 3. sheesh.


    haha.. its a money maker for sure.




    some dude posted this as a cmmi pic.


    according to sei, 63.8% of the organizations using cmmi are outside of the US. haha. so i guess they are doing it by choice.

    i want one of these!

    process art!


    i don't have all day to teach you all about cmmi. plus i want to keep all that knowledge for myself anyway. haha. go read the wikipedia cmmi page first. then ask me a question.

    and! i am also officially trained in "standard cmmi appraisal method for process improvement (scampi): class b team training".

    i'm making a lot of fun of cmmi, but it's serious stuff. it's very important for us. and it's not that easy. it's time to get really serious and knock this cmmi stuff out of the park.

    Tuesday, June 10, 2008

    lucky we live hawaii

    (right outside my hotel room in kona)

    our buddy Ian always keeps on reminding us that we are lucky to live in hawaii by posting his sunset pics. he's right! we are really lucky.

    there are so many reasons why i love living in hawaii. as i was looking through some pictures, i was reminded of one reason. i really like to play tourist and stay at hawaii's nice hotels. it's kinda like going on vacation without having to leave the island. for some reason it's really refreshing to watch the sunset from my fancy hotel room.

    here are some of our favorite island hotels:
  • halekulani - really nice hotel rooms, great service and lots of very good food.
  • kahala resort (formerly the mandarin) - swim with dolphins, free scuba lessons, and a secluded beach
  • ihilani - huge rooms, huge bathrooms, jacuzzi rooms, friday hula show
  • hilton hawaiian village (alii tower) - free sunset pupus, great service, good views
  • turtle bay - nothing to do except relax

    (some of my hotel pics)

    there are some tips and tricks to doing this. for example, try the turtle bay escape club; it's lower than the kamaaina price. or try the spa room at ihilani; it is awesome (and is not very well known). playing tourist is a lot of fun!

    yup, i agree... Lucky we live Hawaii
    Sunday, June 8, 2008

    an idea for twitter

    here is an idea for twitter. an led sign like the one i found at plasmaled can show your updates at your desk at work. obviously, the sign would be situated so that people walking by can see the messages. the basic idea is that maybe you can twitter about work. for example,
    i'm working on the proposal for the acme project.

    or
    i'm having a bad day. don't bother me.




    its kind of a random novelty item that could be branded as a twitter hardware product. it could make millions.

    UPDATE: here is a similar concept for viewing facebook pictures with a Wi-Fi photo frame.

    it adds new feeds to its 8-inch Wi-Fi frame. Already able to get streams from online photo services such as Flickr or Photobucket, the wireless frame has now added Facebook to its networked family.

    Once hooked up to your Facebook account, it will automatically display photos uploaded to the social network, GeekAlerts says. Which means that you probably want to be careful where you place the frame when mom and dad come over.

    Friday, June 6, 2008

    an idea for ESPN

    i just had an idea for ESPN. i was talking with ryank about his kid's baseball tournament, and an idea about creating a video highlight popped into my head. we all have seen those little league trading cards or fake sports magazines with kids' pictures on them. this highlight video would be exactly like those fake magazines and trading cards, but it would be a Sportscenter highlight.



    so, the idea is that ESPN creates an online widget that allows the user to upload a video segment; for example, something that fits into the Top Plays of the Day. The user can adjust the time frames and line up with the Top Plays of the Day countdown, maybe making their segment the number one play of the day. Then I guess the user can do a voice over to explain the play. maybe you can have stuart scott say "boo ya!".

    this would be an awesome thing for ESPN to provide. if it was user friendly and very flashy, i bet it would get kids even more excited about their sports. i can see baseball, football, basketball kids all over the country making their own highlight videos.

    Monday, June 2, 2008

    changing to something other than windows

    i've been thinking about changing operating systems lately. i've been putting off using a better operating system cause i just don't want to hassle with all the minor "broken" things. i've been really happy with XP for a while. it is a pretty stable operating system and it works with nearly everything. that is the nice thing about windows. buy a new digital camera and it works with XP; i'm not sure you can say the same thing with linux. here is a funny video about the different os choices.

    there are pluses and minuses for all choices. austen would say it's lame that i still use XP for hacking and i definitely agree. but, the simple fact that my work computer is an xp machine and that xp is the operating system installed on all my computers pretty much makes me an xp user.

    so i've been thinking of getting a mac or maybe getting a separate linux machine. i don't know. i'm almost ready to make a change, i wonder what i need to push me over the edge. i guess the bottom line is that i don't care that much; or i would have switched a long time ago.