Monday, September 5, 2011

Functionality Improves

So, I have made progress with www.cubebuilder.org ... It can now display project contents, and I have added icons for adding projects & files, added logic to display the sequence cube, and started the logic to display the BEST protein for each finger.

That said, I really don't have a sense of where this is all going ... while the PHP programming is fun ... the 'End Game' is not clear to me.

- I need to finish ...
--- BEST LOGIC
--- Remove dnaseq.list logic and get it from the raw data
--- make the project directory display pretty
--- build 'add project' functionality
--- build 'add member' functionality

I really don't have time to smoke a cigar and have a brandy while programming for no purpose ... I really hope Prof. Bach & others are able to clearly articulate their end game.

Some Questions:

1. How old is the data I was given?
2. Where did the data come from?
3. There is only finger 2 data, we fabricated fingers 1 & 3 ... at what point will we (you) do something that will generate real current data for all the fingers? Next week? Next month? Next Year?

It just seems like an academic exercise for little gain. What I don't understand is that the data I was given is 20 years old ... and this stuff that I am building is so "101" ... at least at the level I am being presented, it is very basic ... Surely someone has already done this before?

I do think that building a module for Joomla/Community Builder might be the best way to build a collaborative environment, and then just add the research tools needed.

Wednesday, August 31, 2011

Collaboration tools ...

Once I had a basic understanding of how the data is to be stored, and how to organize a tool belt of utilities, the next step was to address Professor Bach's dream of the world of researchers living together in perfect harmony, sharing their work. I was able to quickly prototype a basic framework for adding members and adding projects ... Professor Bach loved what I had done. But honestly it was a "Hello World" of community building. Joomla, WordPress, and Drupal handle this type of community and collaboration much more easily and with a lot less effort.

So, what I am envisioning is using an existing framework, most likely Joomla, and then building custom Joomla components to support the unique needs of the researcher.

But even after determining what a given program might have to do, for me the key will be tying the web community/collaboration front end to a back-end High Performance Computing Cloud Community backbone to do all the heavy lifting. It is too much to expect a single Unix Apache web server to process millions of records and store millions of knowledge cubes. So, I am really looking forward to tying these approaches together.

In the meantime, my work is to design approaches and algorithms which will scale depending upon the amount of data provided by a given researcher.

Expanding upon an idea ...

Not wanting to rest on the laurels of last Wednesday's PHP program, I took a little bit of time to let the ideas brew. I now had a tool which did something: it created a cube of data. In the old days we just called it a report, but the term "Knowledge Cube" seems so sexy and academic. I had been pondering a couple of goals discussed at our first meeting.

Goal 1: The data model needs to be flexible. Prof. Bach did not want to use a "table" because of all the baggage that goes along with a structured data layout. So, I went back to my PICK Basic roots. Pick Basic uses delimiters to define the data structure. We have all used comma and tab delimited files over the years; Pick Basic used char(254), char(253), and char(252) (the attribute, value, and subvalue marks) to allow a multi-dimensional array of data to be stored. At this point I am not using those characters or PICK Basic (but I just might). I am using the concept of delimiters, with the semicolon, the comma, the tilde, etc. allowing multiple instances of data to live within one data type. Think of it this way: if you want to store a customer number, or a Protein/DNA value combination, in a traditional data table, and you don't know how many customers or how many Proteins you have, you are not sure how many columns to create for a given table. The delimiter methodology allows a researcher or a person storing data to capture as much data as they want for a given item, while keeping it in a single row of information. Honestly, this is VERY old school.

Once I was able to determine the format to store the results in, it just becomes a programming exercise to create programs that work with this multi-dimensional data model. At first this sounds like the data can go in any direction, any which way. There will have to be rules, there will have to be standards. These rules and standards will develop as I better understand the business needs of the researcher.
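To make the delimiter idea concrete, here is a minimal Python sketch (not the PHP running on the site). Which delimiter plays which role is my assumption for illustration: semicolon between fields, comma between repeated values, tilde between the parts of one value. The function names and the sample row are made up.

```python
# Hypothetical delimiter roles; the post names the characters but not their levels.
FIELD_SEP = ";"   # separates fields within one row
VALUE_SEP = ","   # separates repeated values within a field
SUB_SEP = "~"     # separates the parts of one value, e.g. DNA~protein


def parse_row(row):
    """Split one delimited row into fields -> values -> sub-values."""
    return [
        [value.split(SUB_SEP) for value in field.split(VALUE_SEP)]
        for field in row.split(FIELD_SEP)
    ]


def format_row(fields):
    """Inverse of parse_row: join the nested lists back into a single row."""
    return FIELD_SEP.join(
        VALUE_SEP.join(SUB_SEP.join(parts) for parts in values)
        for values in fields
    )


# Made-up sample: a label followed by any number of DNA~protein pairs.
sample = "finger2;AAA~P1,AAC~P2"
parsed = parse_row(sample)
```

The second field can hold two pairs or two hundred without any change to the layout, which is the whole point: the row grows sideways inside the field instead of forcing new columns.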

Ahhhh! The crux of the problem ... Academic Systems IT versus Administrative Systems IT. Working at Yale University for the past 15 years, I have experienced the "great divide" between the Academic & Administrative sides of the university. And even in my very first class at the University of Bridgeport, comments made by my professor made it clear to me that this divide was not limited to Yale.

I am hoping that my business problem solving approach helps create & improve processes for Prof. Bach and his team to find some of the hidden answers deep within the human genome.

Thursday, August 25, 2011

Welcome to Research ...

Every time I attempt blogging it seems like a good idea, and fizzles after a week or two ... so here we go again. What will be different this time?

Well, the purpose of this blog is to document the process of a research project I am embarking on. This is a new journey for me. In early August I had emailed my new Marketing professor, Christian Bach, requesting a syllabus and a heads up on the Marketing course I was about to take with him this coming fall. After a couple of emails he invited me to consider being a part of a research project he was engaged in. Wanting to meet my professor before class started, I said sure, we can meet.

On Wed. August 24th I met Professor Bach along with 3 other students, and he went over some of the goals of his research plan. Honestly, it was so over my head, talking about Human Genomes, proteins, DNA sequences, and more, that I felt totally clueless. Also, Prof. Bach has this dream and vision of the world of researchers all living in harmony, sharing work together for the good of mankind. He talked about a "Facebook" like website for researchers ... again, it is great to dream ... and it is great to set such high goals that we never have a chance in hell of coming close.

After a couple of hours of discussion, and academic babble that made my head spin ... I was finally able to ask some simple basic questions so that simple minded folk like me could understand what he really meant. From there, I asked him ... "If I could write a program to do 1 thing, just 1 thing to make something better, easier for you, what would it do?" Prof. Bach pulled out an Excel spreadsheet and showed me 6 tabs of research data, and how he would like to see them combined onto a single page. I said, if I could create a tool that allowed you to choose each high level DNA sequence and display the resulting matrix, would that be a good step? Can I have 2 weeks to do this? ... He said, I have waited 7 years, 2 weeks is fine.

After that session, Prof. Bach & I enjoyed a couple of cold beers and some Mexican food, and it was so great to be stimulated by academic philosophical problem solving, energized by Sam Adams & Blue Moon. I made it home by 11:00 PM that night. Prof. Bach had sent me the spreadsheet. I looked at the data and was able to normalize it into a comma delimited text file. I then wrote a simple PHP program to give him the DNA sequence pull downs and the resulting protein matrix display ... by 1:00 AM an email and a link were in his inbox.
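For the curious, the shape of that first tool can be sketched in a few lines. This is a Python illustration under my own assumptions, not the PHP I sent that night: the column names (sequence, source_tab, protein), the function names, and the sample rows are all made up to show the idea of "pick a sequence, see every matching row from every tab".

```python
import csv
import io

# Made-up stand-in for the normalized comma delimited file; columns are assumptions.
CSV_TEXT = """sequence,source_tab,protein
AAA,tab1,P1
AAA,tab2,P2
AAC,tab1,P3
"""


def load_rows(text):
    """Read the comma delimited text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def pulldown_options(rows):
    """Distinct DNA sequences, in first-seen order, for the pull-down list."""
    return list(dict.fromkeys(row["sequence"] for row in rows))


def matrix_for(rows, sequence):
    """Every (source_tab, protein) pair recorded for the chosen sequence."""
    return [
        (row["source_tab"], row["protein"])
        for row in rows
        if row["sequence"] == sequence
    ]


rows = load_rows(CSV_TEXT)
```

Calling pulldown_options(rows) gives the choices for the pull-down, and matrix_for(rows, "AAA") gives the rows to lay out on the single page.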

The next day Prof. Bach confirmed that I had delivered the first objective. He was pleased.

While I feel very outclassed, and don't even pretend to understand what the human genome is or have a clue about DNA & Proteins ... I will leave that up to really smart guys like Prof. Bach ... I do see that my practical business sense and my ability to ask questions to determine what the customer wants will be an asset to this research project.