Onward – 5TB

So, rather than expounding on history, I’ve decided to dive right into what I’ve been working on recently: Hadoop.

It was a bit of a learning curve to install on my COTS hardware, but in the end I’m happy to have accomplished the install. As we speak, I am initializing Hadoop for use with my Keyword Buttons application.

The initialization has been running for six hours so far, writing out folder after folder of 1GB sections of data. This is the big data area I will be writing to.

These servers are old hardware with 2TB hard drives per Linux box, so allocating large amounts of storage isn’t costing me an arm and a leg. Hadoop is known to handle large files much better than small ones, so even though 1GB isn’t that large by today’s standards, I decided on that size.

I set the replication factor to two, since I want to make the most of my 10TB of available raw storage. With every block stored twice, that basically gives me the equivalent of RAID 1 on 5TB of usable storage.
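As a quick check on that arithmetic, here is a minimal Python sketch. It assumes usable space is simply raw space divided by the replication factor, ignoring filesystem overhead and any space reserved for other uses. The function name is made up for illustration.

```python
def usable_capacity_tb(raw_tb: float, replication: int) -> float:
    """Approximate usable capacity when every block is stored `replication` times.

    Assumption: no allowance is made for filesystem overhead or reserved space.
    """
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_tb / replication


# 10TB of raw disk with a replication factor of two
print(usable_capacity_tb(10, 2))  # 5.0
```

The same function shows the trade-off I was weighing: a replication factor of three on the same disks would leave only about 3.3TB usable.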

I won’t have to use AWS APIs; instead, I can use native Hadoop functions to access the data. The goal is to integrate Keyword Buttons I/O with Hadoop, so as to be able to write “Big Data”.

My goal, then, is to optimize the Hadoop cluster for use with Keyword Buttons, hereafter KWB.

The directory layout I am setting up on the cluster has the form /user/cfleshner/folder/block, where folder is a numbered folder and block is a numbered block. Each folder contains 25 blocks, and each block is a single 1GB file.
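To make the layout concrete, here is a small Python sketch that maps a running block count to a path in that form. The numbering details are my assumptions for illustration only: folders and blocks both start at 0, the block number restarts in each folder, and nothing is zero-padded. The function names are hypothetical too.

```python
from pathlib import PurePosixPath

BASE = PurePosixPath("/user/cfleshner")
BLOCKS_PER_FOLDER = 25  # each folder holds 25 blocks of 1GB each


def block_path(index: int) -> PurePosixPath:
    """Return the folder/block path for the index-th 1GB block (0-based, assumed)."""
    if index < 0:
        raise ValueError("index must be non-negative")
    folder, block = divmod(index, BLOCKS_PER_FOLDER)
    return BASE / str(folder) / str(block)


def folders_needed(total_blocks: int) -> int:
    """Number of folders required to hold total_blocks, rounding up."""
    return -(-total_blocks // BLOCKS_PER_FOLDER)


print(block_path(0))         # /user/cfleshner/0/0
print(block_path(26))        # /user/cfleshner/1/1
print(folders_needed(5000))  # 200
```

Counting 5TB as 5,000 files of 1GB each (decimal units, another assumption), the whole area works out to 200 folders.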

The idea is to use the entire cluster (5TB) as a bucket for the URLs, which will have metadata appended to them. It is cheaper to do it this way than to use AWS or other cloud-based services.

Introduction

My name is Chris L. Fleshner and I live in Omaha, Nebraska. I am 56 years old at the time of this writing. I am “enamored” with technology, and for good reason. It’s cool!

This blog is devoted to explaining the concept of “Keyword Buttons”, a project I have been working on as an employee, a hobbyist, an innovator, a Patent Draftsman, and as the author of an abandoned patent application.

Interestingly, the actual birth of this project occurred to me only in retrospect, when I decided yesterday to write this blog.

It all began back in 1985 or so. Until now, I hadn’t seen the relationship between the work I did in this area while at the Principal Financial Group (formerly The Bankers Life) in Des Moines, Iowa, and what has now become my primary interest. Talk about beating a dead horse!

But the fact is, I’ve enjoyed the ride and want to share my story nevertheless, because the horse isn’t dead yet…

So saunter down and saddle up, as I describe the systems I’ve developed using modern-era tools and my legacy programming experience.

In particular, I’ll describe my experience coding a conventional (non-OOP) web application, written primarily in PHP and running on an in-house Apache web server on a home-based server. I’ll also cover my new Hadoop server cluster, with 10TB available, where I intend to store end-user-initiated URLs harvested from the web (using Keyword Buttons) and encode them with metadata key references, which incidentally are critical to understanding the premise of Keyword Buttons itself.

Ultimately, the harvested data will be useful to those researching topic areas where the end user (the one using the metadata) can make a cognitive connection between the intended scope on the resource layer and the keywords (and associated topic), in order to discover something interesting and useful. More on this later.

The connection between 1985 and the present will become more apparent as I wander down memory lane, talking about my experiences programming in COBOL on MVS/XA and making use of in-house 370 assembly language utilities to allocate huge amounts of RAM (back then around 2GB) in order to “key”, or identify, macro tables. Basically, the aim was to automate a job that was previously done manually. I may even touch on my experience with Atari BASIC, back in the 1970s.

Nearly four decades of technological experience will be discussed in this blog, and I’m truly looking forward to the experience of sharing it with the public.

I enjoy sharing technical information that can help readers of this blog discover and learn more, not only about current open source technologies and other tools, but also about lessons learned from “old school” hardware and software scenarios that are still relevant today.

This blog will also serve as a personal memoir of notable experiences relating to technology over several decades, from when I first touched the keyboard of a TRS-80 in the mall in Sioux City, Iowa, to my first PC running Windows 95, through today, using my bare-metal, non-cloud, 6-node, 12TB cluster running on Ubuntu Server (you can see Hadoop running here).

And perhaps most importantly, this blog will work toward describing the present-day tech I’ve used while developing the Keyword Buttons system layers: things that went well, and things that cost a lot of time without much payoff.

I like to think my situation is unique, because my end-game isn’t defined. There is a Japanese meme for that, but I’ll have to google it and plug it in later here.

By leaving the assignment of defining useful data up to the end user, the libraries of associated content will be unique for each end user (e.g., some may use images for deep learning, some may use PDF files for analyzing articles, etc.).

The end user is the expert on what metadata should be applied and generated within their particular discipline or area of research; it depends on what they determine is adequate to answer their particular questions. That is to say, their input data will depend on what they are trying to determine, and Keyword Buttons can play a role in assembling lists of data that can be used in their processes.

I encourage you to contact me, and ask questions or otherwise engage me and others who participate on this blog, because I am all about sharing knowledge, not hiding or hoarding it.

Thank you for your interest in Keyword Buttons, and please read on as I tell my story.

To get started, please watch this 8-minute video. It is a very simple introduction that demonstrates lists of public content URLs being assembled from a list of keywords. Enjoy!

Chris L. Fleshner, Developer