TQR — Denton’s Three Questions and Four Principles and an Observation of My Own.

Recently I wrote about Denton’s How to Make a Faceted Classification and Put it on the Web. In sec­tion 4 he intro­duces three ques­tions and four prin­ci­ples that pro­vide use­ful guid­ance when imple­ment­ing nav­i­ga­tion sys­tems for infor­ma­tion archi­tec­tures that rely on faceted classification.

The three questions cover the basics that you need to settle before you set out to design your navigation system.

  1. Are you design­ing for free nav­i­ga­tion (aka brows­ing) or nav­i­ga­tion by selec­tion (aka searching)?
  2. What are your facets like?
  3. Will you give the user con­trol over the cita­tion order?

Characterizing your facets in question 2 is the tricky one. How many facets you have, how they relate to each other, how evenly the foci are distributed, etc. will all inform your decisions when designing a navigation system. Unfortunately there isn’t a checklist that matches facet characteristics to the best navigation design. You’ll have to experiment.

When thinking about your design decisions, Denton offers the following four principles:

  1. Do not allow the user to cre­ate a query with no results.
  2. Show the user where they are.
  3. Make it easy to adjust or refine the query.
  4. Use the URL as the nota­tion for the classification.

I can quib­ble with the details. (A null return is use­ful — if you can show the user which parts of the query have results and which don’t.) But the prin­ci­ples are always valid points to consider.
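Here is a rough sketch of how the first principle, and my quibble with it, might look in code. It is not Denton's implementation; the sample items, facet names, and function names are all made up for illustration. The idea is to count results per focus before offering a link, and, when a query does come back empty, to report how many items each part of it matches on its own.

```python
from collections import Counter

# Hypothetical sample data: each item is classified by one focus per facet.
ITEMS = [
    {"material": "cotton", "colour": "blue", "size": "small"},
    {"material": "cotton", "colour": "red", "size": "large"},
    {"material": "wool", "colour": "blue", "size": "large"},
]


def matching(items, selection):
    """Return the items that carry every focus in the selection."""
    return [
        item for item in items
        if all(item.get(facet) == focus for facet, focus in selection.items())
    ]


def next_choices(items, selection):
    """For each facet not yet chosen, count results per focus.

    Only foci with at least one result appear, so the interface can offer
    just those links and never lead the user to an empty page.
    """
    choices = {}
    for item in matching(items, selection):
        for facet, focus in item.items():
            if facet not in selection:
                choices.setdefault(facet, Counter())[focus] += 1
    return choices


def explain_empty(items, selection):
    """For a query with no results, count the items matching each part alone."""
    return {
        facet: len(matching(items, {facet: focus}))
        for facet, focus in selection.items()
    }
```

With this data, asking for red wool returns nothing, but `explain_empty` shows that each half of the query matches one item by itself, which is exactly the sort of hint that makes a null return useful.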

I particularly love his fourth principle: that the URL should be a human-understandable representation of the foci loaded into the facets that got the user to the current location. Technically it’s a royal pain in the ass to create, and some would claim it’s useful only to uber-geeks. But think on it. How often have you gone to a bookmarked page and then chopped off the URL’s tail or altered a word or two to get to some place when you didn’t feel like typing some big ole URL? How often have you been frustrated in the attempt?
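To make the fourth principle concrete, here is a minimal sketch of a URL scheme in which each facet and its chosen focus become a pair of path segments. The facet names, the fixed segment order, and both function names are my own assumptions, not anything Denton prescribes.

```python
from urllib.parse import quote, unquote

# Hypothetical citation order for the facets; it fixes the order of URL segments.
CITATION_ORDER = ["material", "colour", "size"]


def selection_to_path(selection):
    """Encode chosen foci as /facet/focus pairs in citation order."""
    parts = []
    for facet in CITATION_ORDER:
        if facet in selection:
            parts += [quote(facet, safe=""), quote(selection[facet], safe="")]
    return "/" + "/".join(parts)


def path_to_selection(path):
    """Decode a path back into a {facet: focus} selection."""
    segments = [unquote(s) for s in path.strip("/").split("/") if s]
    if len(segments) % 2:
        raise ValueError("path must consist of facet/focus pairs")
    return dict(zip(segments[0::2], segments[1::2]))
```

A selection of blue cotton becomes `/material/cotton/colour/blue`. Chop the tail to `/material/cotton` and the path still decodes to a valid, broader query, which is the behavior the bookmark-hackers among us are hoping for.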

The par­tic­u­lar cul­prit in most cas­es of incom­pre­hen­si­ble URLs is dynam­ic web pages — AJAX, PHP, and all their data­base gen­er­at­ed con­tent friends.

A recent client was dri­ven nuts by the fact that when she viewed her site sta­tis­tics the page names dis­played as:

www.(somesite.com)/index.asp?PageAction=VIEWPROD&ProdID=116.

ProdID116 was utter­ly mean­ing­less to her. As were 95% of the oth­er page names. There was no easy way to force the use of a rea­son­able fac­sim­i­le of the prod­uct name in the URL. Complex kludg­ing ensued; reports were gen­er­at­ed. But not as fre­quent­ly or accu­rate­ly as they should have been. All very unsatisfactory.
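In a system where you do control how URLs are generated, one common approach is to put a "slug" of the product name in front of the ID. The sketch below is only an illustration: the `/products/` path and the product name are invented, and it is not something my client's platform offered.

```python
import re
import unicodedata


def slugify(name):
    """Turn a product name into a lowercase, hyphen-separated URL segment."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of other characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def product_url(product_id, name):
    """Keep the numeric ID for lookups, but lead with a readable name."""
    return f"/products/{slugify(name)}-{product_id}"
```

A statistics report listing `/products/blue-cotton-t-shirt-large-116` tells the site owner what the page is without a lookup table.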

TQR — Sorting Out Card Sorting

Steven Hannah’s Sorting Out Card Sorting isn’t really about card sorting. It’s an example of using a particular methodology to do an academic literature review, followed by a proposal for creating a tool that can be used to augment and extend the knowledge discovered during the review.

But even if the phras­es ground­ed the­o­ry and con­stant com­par­a­tive method make you feel a bit woozy it’s worth hav­ing a look at Chapter 4: Analyzing the Data. Hannah dis­cov­ers twelve char­ac­ter­is­tics that are used to describe card sort exer­cis­es which rough­ly reflect how card sort­ing is done and discussed.

These char­ac­ter­is­tics make a decent basis for a “things you need to con­sid­er while plan­ning your card sort­ing exer­cis­es” check list. I’ve pruned a cou­ple of redun­dant items and rearranged the rest into a use­ful order for planning.

  1. Define the infor­ma­tion domain
  2. Select a tar­get audience
  3. Choose between open and closed sort
  4. Choose between indi­vid­ual or group exercises
  5. Choose a num­ber of cards to be used
  6. Select the objects to be used on the cards
  7. Set a time lim­it for the exercise
  8. Choose analy­sis methods.

Two things emerge as con­sis­tent recommendations:

  • Investigators should provide as little guidance to the test subjects as possible: only what is necessary to get people piling the cards up.
  • Card sorts should be limited to fewer than 100 objects (cards), or whatever will take less than one hour.

The question of how to analyze the results of the user sorts is, well… murky. The usual dichotomy of qualitative vs. quantitative exists here as everywhere. The difficulty is that while designers (and others of the editorial bent; this by JJG is food for thought on the subject) are willing to “eyeball” the conclusions, there are others who want something more rigorous. They justifiably point out that the masses of data brought to light by card sorts over large sets, or the responses of many participants, can yield results that are too large to wrap one’s mind around.

Surely quantitative analysis is a good thing, especially if there is a perceived need for “facts and figures” to justify design decisions. There is, however, as yet little information on the specific types of quantitative analysis that are appropriate and how the results would be used in the design process.
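For what it's worth, one simple quantitative starting point that is widely used in practice is to count how often each pair of cards lands in the same pile. This is not taken from Hannah's paper; the card names and sample sorts below are invented, and this is a sketch rather than a recommended method.

```python
from collections import Counter
from itertools import combinations

# Hypothetical results: each participant's sort is a list of piles of card names.
SORTS = [
    [["apple", "pear"], ["hammer", "saw"]],
    [["apple", "pear", "saw"], ["hammer"]],
    [["apple"], ["pear"], ["hammer", "saw"]],
]


def co_occurrence(sorts):
    """Count, for every pair of cards, how many participants piled them together."""
    counts = Counter()
    for piles in sorts:
        for pile in piles:
            # Sort the pile so that each pair is always keyed in the same order.
            for a, b in combinations(sorted(pile), 2):
                counts[(a, b)] += 1
    return counts


def similarity(sorts):
    """Express each count as the fraction of participants who paired the cards."""
    return {pair: n / len(sorts) for pair, n in co_occurrence(sorts).items()}
```

Pairs that most participants put together are candidates for the same category. How those numbers should feed into the design is still the open question.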

One aspect of card sort­ing that is not con­sid­ered in Hannah’s sur­vey is the ques­tion of com­put­er aid­ed sort­ing exer­cis­es. This soft­ware did­n’t exist when most of the lit­er­a­ture sur­veyed in this paper was pub­lished. These tools exist now and whether or not to use them is an impor­tant test design question.

Oh… about the pre­sen­ta­tion on the web page. I have not been able to make it work. It won’t even attempt to load in Firefox. It tries to load under IE7/Vista but enters a loop and takes up all my band­width. I’d say that I’ll go try it on the lap­top with XP and IE6 but, you know that I won’t; it’s too much trou­ble. Too bad.

More on Denton’s How to Make a Faceted Classification and Put It On the Web

During a recent slog up tread­mill hill I read through Wm. Denton’s How to Make a Faceted Classification and Put It on the Web.

I feel bet­ter now. After my grad school deja vu expe­ri­ence with Simplified Facet Classification I was despair­ing of ever being able to bring facets to the mass­es. Or in my case to intel­li­gent but not LIS qual­i­fied engi­neers and oth­er tech types.

You’ll still need some background in classification to understand the details, but it’s a great overview that is understandable by people with some background in knowledge modeling. Most web and software dev types have, often unknowingly, done a fair amount of informal knowledge modeling.

Denton’s straightforward style makes his discussion clear enough that, barring meltdown over the technicalities of the entity/instance analysis and facet creation in steps 2 and 3, you can hand this essay out as background reading.

Section 1: When to Make a Faceted Classification gives a nice overview of where faceted classification systems fit into the field of classification and organizing schemes in general. Denton provides useful questions to ask when considering faceted classification. It’s refreshing to see a consideration of facets that also discusses when facets are not the best answer to your questions.

In the second section Denton divides the actual work of creating the classification system (facets and foci) into 7 steps, beginning with Domain Collection and ending with Revision, Testing, and Maintenance. (Love to see that word maintenance laid out in black on white!) It’s a simple methodology that will get you through the process and give you a workable system at the end of the day.

What fol­lows are a few notes on his descrip­tion of the process.

Starting with what I believe is the miss­ing first step.

Defining Your Domain

Step 0: Define the Domain

You must have a solid and agreed-on definition of the subject and scope of the domain before you start. We are all aware of the danger of assuming that what’s already in the system is the limit of the domain that needs to be considered, but also be careful that you do not assume that everything dealt with by your website or software should be included in the domain for which you are building the classification.

Building the Faceted Classification

The next five steps make up the build­ing the facets and foci part of the process.

Step 1: Domain Collection.
Step 2: Entity Listing.
Step 3: Facet Creation.

Note that you’ll have to do quite a bit of iter­at­ing over steps 1, 2, and 3. The process of col­lect­ing, ana­lyz­ing, and defin­ing always brings to light bits and pieces that were missed in the first (lat­est) go ’round. It’s just a fact of life, so plan for it.

Denton does not discuss the techniques of analysis that can be used to get from the entity (things) list to the facets (characteristics) list. This sort of analysis is complex and domain dependent and IMHO the most common point of breakdown for many attempts. I’m always on the lookout for material that describes these techniques, as well as “real” world examples.

Once you have a sys­tem of facets and foci you have to decide on how to arrange its pieces:

Step 4: Facet Arrangement

I find Denton’s explanation of this step not entirely clear. You may have to explain that you are now working with both the foci (terms) and facets. The end result of this step will be a list of facets and an arrangement of the foci within each facet. You will definitely have to reiterate that the foci within each facet are arranged in a way that best reflects the subject of the individual facet. Once again: the things inside one facet don’t have to be arranged in the same way as the things inside another facet. (Can you tell that I’ve had trouble with getting this across? Database jockeys are the worst. Facets and foci do not map well onto tables, fields, and joins.)
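When I need to get this point across to coders, a tiny example sometimes helps. The sketch below uses made-up facets for a clothing catalogue; the only thing it is meant to show is that each facet carries its own ordering of foci, and that one facet's ordering has nothing to do with the next one's.

```python
# Hypothetical facets for a clothing catalogue. Each facet keeps its foci in
# the order that suits its own subject.
FACETS = {
    "colour": ["blue", "green", "red"],                   # alphabetical
    "size": ["small", "medium", "large"],                 # by magnitude
    "season": ["spring", "summer", "autumn", "winter"],   # chronological
}


def sort_key(facet):
    """Return a key function that ranks foci by their position within the facet."""
    order = {focus: rank for rank, focus in enumerate(FACETS[facet])}
    return lambda focus: order[focus]
```

Sorting sizes with `sorted(sizes, key=sort_key("size"))` puts small before medium before large, where a plain alphabetical sort would put large first.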

Step 5: Citation Order

Citation order is of less importance in an electronic system than it was in the paper systems in use when faceted classification was first invented. You may, though, find yourself in a situation in which technical or budgetary reasons keep you from taking full advantage of the flexibility of a computer-based system to mix and match your facets. And no matter how flexible your system is, you are going to have to decide on a default display order and behavior, so don’t skip this step entirely, but don’t allow anyone to get hung up on it either.
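A default display order can be as small as one list. In this sketch (the facet names and the function are my own assumptions) the citation order decides which facet is compared first when listing items, and a caller can pass a different order when the system is flexible enough to allow it.

```python
# Hypothetical default citation order, chosen once for the whole system.
DEFAULT_CITATION_ORDER = ["material", "colour", "size"]


def display_order(items, order=None):
    """Sort items facet by facet, following the citation order.

    Items that lack a facet sort ahead of those that have it, because the
    missing focus is treated as an empty string.
    """
    order = order or DEFAULT_CITATION_ORDER
    return sorted(items, key=lambda item: tuple(item.get(f, "") for f in order))
```

Calling `display_order(items)` groups everything by material first, while `display_order(items, ["colour", "material"])` regroups the same items by colour. For simplicity the foci are compared alphabetically here; a real system would more likely use each facet's own arrangement.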

Apply the Faceted Classification 

Step 6: Classification

And now we get to the whole point. Applying the clas­si­fi­ca­tion sys­tem to the stuff. And
it’s about as sim­ple as Denton makes it sound. Sometimes…

Before handing this task off to the nearest convenient, unoccupied, warm body, take a clear-eyed look at how many of your facets include terms (foci) that will require judgment calls to get things labeled properly. Who’s best qualified to make these judgment calls in a way that will serve your -users-?

Checking it twice, Getting it on the road, and Keeping it running

Step 7: Revision, Testing, and Maintenance.

Note that you have been doing iterative testing and revision throughout the creation of the system. If you’ve gotten here without having to rethink or redo any part of your system you are one of: working in a very limited, well-known domain; very lucky; not paying attention.

Actually only revision and testing should be included in step 7. Maintenance should be step 8. Any classification system that doesn’t include maintenance as a separate, ongoing phase is bound to suffer ROT.

The final sec­tion: How to Put the Classification on the Web tack­les the ques­tion of how to use your new faceted clas­si­fi­ca­tion scheme to help your users nav­i­gate the stuff. I’ll be dis­cussing Denton’s help­ful sug­ges­tions in anoth­er essay.

Conclusions:

If I were handing this out ’round the table in a conference room packed with devs and coders I’d leave out the third section titled: “How to store the faceted system in a computer”. The technical Ways, Whys, and Wherefores of storing and accessing metadata such as a faceted classification system go far beyond what is covered by Denton’s couple of pages of X(F)ML and SQL examples. It never pays to drop a shallow solution to a problem into a room of people who are trained to take any problem laid before them and debate the best way to do it.

TQR — Introductory Tutorial on Thesaurus Construction

If thesaurus construction is something that comes up only occasionally in the course of your work, you should bookmark this tutorial created by Dr. Tim Craven of the University of Western Ontario for his LIS students.

Eight sec­tions take you quick­ly through the basic con­cepts and con­sid­er­a­tions for build­ing a the­saurus. It’s a handy refresh­er that is soft­ware agnostic.

In fact the sec­tion head­ings would make a good out­line for a set of ques­tions to ask the soft­ware ven­dors if you are con­sid­er­ing pur­chas­ing a the­saurus man­age­ment system.

Speaking of the­saurus soft­ware, Dr. Craven also has a hand­ful of free­ware pro­grams to assist in index­ing and the­saurus con­struc­tion. I haven’t checked them out yet and so can’t offer an opinion.

TQR- Berry Picking Time (with apologies to both Ms. Bates and Great Big Sea)

Once in a while it is a good and refresh­ing thing to revis­it some of the clas­sics. In this case a paper that I con­sid­er to be a pri­ma­ry lens for look­ing at infor­ma­tion seek­ing behaviours.

Something struck me as I was reread­ing Marcia Bates’ “The Design of Browsing and Berrypicking Techniques for the On-Line Search Interface” (Published in 1989, a time when on-line search­ing was awk­ward, expen­sive and the pre­serve of aca­d­e­mics and sci­en­tists. We can argue whether or not the sit­u­a­tion has actu­al­ly improved on anoth­er day.)

The berrypicking (or evolving search) model that she describes is now a widely used shorthand for a set of user behaviors. Unfortunately, as with many abbreviated terms, we forget the full complexity of the ideas that the shorthand represents.

Five of the six specific information chasing strategies that she describes as being used by academic searchers are used every day by bloggers and blog readers. Blogs have evolved tools for their own versions of:

  • Footnote Chasing: (also known as backward chaining.) No need to write that citation down and go to the library to look up the cited material, just click on the link in the blog post and get an immediate look at it.
  • Citation Chasing: (forward chaining.) Most non-academics don’t ever learn about using a citation index, but it’s one of the best ways to move your search for information forward through time. Now with trackbacks everyone can do citation chasing without even knowing that they are engaging in one of the rituals of graduate school. Also have a look at Technorati’s blog reactions for links to blog posts that refer to another post.
  • Journal Run: Instead of sit­ting on the floor of the peri­od­i­cals stacks run­ning your fin­ger down the table of con­tents of each issue of the Journal of Cat-like Things for the last two years just click on the handy archive links in the left (right) hand nav­i­ga­tion pane of the blog.
  • Author Searching: Most blog writ­ers who pub­lish in more than one place add links to their oth­er blogs or guest writ­ing spots in their “home” blogs.
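The two chaining strategies are really the same link graph read in opposite directions, which a small sketch can show. The post names and links below are invented, and real trackback data would of course come from the blogs themselves rather than from a hard-coded dictionary.

```python
# Hypothetical link graph: each post maps to the posts it links to.
LINKS = {
    "post-a": ["post-b", "post-c"],
    "post-b": ["post-c"],
    "post-c": [],
    "post-d": ["post-a"],
}


def footnote_chase(post):
    """Backward chaining: the posts that this post cites."""
    return sorted(LINKS.get(post, []))


def citation_chase(post):
    """Forward chaining: the posts that cite this post, as a trackback list would show."""
    return sorted(source for source, targets in LINKS.items() if post in targets)
```

Footnote chasing from `post-a` follows its outgoing links to `post-b` and `post-c`; citation chasing from `post-c` inverts the graph and finds `post-a` and `post-b`, the later posts that point back to it.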

The sixth search tech­nique is a lit­tle hard­er to place in the blog world. At least I thought it was, until I spent some time look­ing at a hand­ful of blogs try­ing to find good exam­ples of the first five techniques.

  • Area Scanning: the habit of looking at the adjoining shelves. Once you have found Audubon’s Birds of North America (DDC 598 AUD) you will find Kale’s Florida’s Birds (DDC 598.2975 KAL) as well as Garrido’s Field Guide to the Birds of Cuba (DDC 598.097291 GAR) on nearby shelves. Handy if you’re looking for information on birds you might see in the Florida Keys. The blog equivalent is looking at the blogrolls. Perhaps not as tidy as the library shelf model, but nonetheless titles co-located by being placed on the same list are likely to have useful relationships to one another. (This blog is the sad counterexample; my blogroll is exactly a list of things that are not related to the primary topic of my essays.)

For the next couple of days I’ll be more aware of which search habits I might be dragging from the paper-based past into the digital present, and thinking about whether or not they are still useful and, if useful, whether they are well provided for.