A list, by KWB standards, is defined as a group of related nouns, pronouns, or items that share a relational connection, as in a “Topic”. Each “member” of the list must also have a clear definition. To explore this idea, I have taken several examples and run them through my script.
Author: cfleshner
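The definition of a list given above (a topic, plus members that each carry a clear definition) could be sketched roughly like this. This is only an illustration and not the script mentioned in the post; the `TopicList` class, its method names, and the sample data are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TopicList:
    """A topic plus its members, each mapped to a definition (hypothetical shape)."""
    topic: str
    members: dict[str, str] = field(default_factory=dict)

    def undefined_members(self) -> list[str]:
        """Return the members whose definition is missing or blank."""
        return [word for word, definition in self.members.items()
                if not definition or not definition.strip()]

    def is_valid(self) -> bool:
        """A list qualifies only if it has members and every member is defined."""
        return bool(self.members) and not self.undefined_members()


# Made-up sample data: "pear" has no definition, so the list does not qualify yet.
fruits = TopicList("Fruit", {"apple": "the round fruit of an apple tree", "pear": ""})
print(fruits.undefined_members())
print(fruits.is_valid())
```

Keeping the check separate from the data makes it easy to see which members still need a definition before a list is accepted.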
Identifying lists using a dictionary may initially work for only a subset of lists. The first step is to identify the subset of lists these ideas will work for. Key to the identification process is the definition of each “component word” and the language used in that definition. A component word is a word…
So while brainstorming on the potential to assemble lists based on keywords, I’ve been able, preliminarily, to at least break down the areas of research that need to be addressed. The “Topic” name, as described by the end user, will need to be accurate when using an approach of duality. That is to say, if…
The theory behind keyword buttons, and a critical aspect of its independent success story, is that we will be able to assemble large lists of keywords that belong to a given topic. As described earlier, this is the main hurdle facing keyword buttons. Several ideas have manifested in my mind since the inception of KWB,…
I’ve contacted a professor in the linguistics department at BYU to throw around a few ideas about how best to integrate a good dictionary into KWB. This has been a thorn in my side in the development of KWB for a long time. Many dictionaries are available out there, but they are either expensive, or…
Well, I’ve had to put searching the Hadoop storage areas (under the guise of Ag) on hold for the last day or two. My website, http://keywordbuttons.com, has been the main task for the last couple of days. I’ve decided I’d better set up a permanent test server, so I’m not bringing the live…
So after a bit of thought and experimentation, I have decided on 100 folders, each containing 25 files. That is 2.5TB (with a replication factor of 2). Each file in the cluster is 1 GiB and is initialized with actual data (specifically, binary zeros from /dev/zero on Linux). The idea is to use The Silver Searcher (Ag), as…
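As a quick sanity check on the sizing above, here is a minimal sketch of the arithmetic. The function name and unit constants are just for illustration. The post does not say whether the 2.5TB figure counts the replicated copies, so the sketch prints both the logical size and the raw size with replication, and either reading can be compared against it.

```python
GIB = 2**30   # bytes in one gibibyte (binary unit)
TB = 10**12   # bytes in one terabyte (decimal unit)


def cluster_footprint(folders, files_per_folder, file_size_bytes, replication):
    """Return (logical_bytes, raw_bytes) for a uniform folder/file layout."""
    logical = folders * files_per_folder * file_size_bytes
    raw = logical * replication  # every block is stored `replication` times
    return logical, raw


# Numbers from the post: 100 folders, 25 files each, 1 GiB per file, replication 2.
logical, raw = cluster_footprint(100, 25, GIB, 2)
print(f"files: {100 * 25}")
print(f"logical size: {logical / TB:.2f} TB")
print(f"raw size with replication: {raw / TB:.2f} TB")
```

Note that GiB and TB are different unit families, which is why 2,500 files of 1 GiB do not come out to exactly 2.5 TB.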
So it took the weekend to initialize the cluster. I ended up having to break the process up into 3 jobs (for reasons I won’t get into right now), after I decided to go from 500 groups of folders to 1000. Here is the top level for the cluster, as it sits today. The structure,…
So Hadoop was a bit of a learning curve to set up. I used the tutorial on linode.com for setting up a 3-node cluster, and it was kind of a pain in the *** to set up. People on the web warned that it might be better just to use a VM on Oracle,…
So, rather than expounding on history, I’ve decided I would dive right into what I’ve been working on recently: Hadoop. It was a bit of a learning curve to install on my COTS hardware, but in the end I’m happy to have accomplished the install. As we speak, I am initializing Hadoop to use…