Viewing entries tagged with 'university'

Markandeya Rsi vs. PhD

27 August 2008 | 9 Comments

This PhD degree has been the greatest austerity I've ever undertaken. It was often frustrating, demotivating, felt like it would never end, caused my body to frequently fall ill and resulted in a huge amount of worry and pain.

However, the austerities of this PhD have been child's play compared with what a sage named Markandeya Rsi went through. (His story is told in the 12th Canto of the Srimad Bhagavatam. I'm recalling it in my own words here):

Markandeya was meditating on the Supreme Personality of Godhead in his small hermitage for many years. He was very strictly and sincerely meditating. So much so, in fact, that Indra, the King of Heaven (aka Zeus), became worried that this Markandeya might become eligible to take over his position soon. Indra therefore sent a team of people to break Markandeya's meditation.

He sent Cupid along with the best of the heavenly singers (Gandharvas), the most beautiful of the heavenly exotic dancers (Apsaras), the season of spring, a gentle cool breeze, intoxication personified and greed personified (the mode of passion and the false ego of thinking in terms of "I" and "mine"). A celibate monk's worst nightmare. All these together were to create a situation where Markandeya would be tempted to stop his meditation and enjoy materially.

However, faced with these allurements, Markandeya wasn't even slightly shaken. He remained completely steady and fixed in his worship.

Markandeya Rsi's austerities were so powerful, in fact, that the members of Indra's assault team began to burn up within (similar to what happened when Kapila Muni was attacked by the sons of Sagara, who thought he had stolen a sacrificial horse).

Eventually, while Markandeya was meditating in this way, the Nara-Narayana avatar came and visited him. Markandeya immediately recognized the Supreme Lord and worshiped him with expert poetry.

The sage explained: Krishna is like a spider. He creates everything within the universe like a spider creates his web, and then He retracts it all back within Himself. Through Krishna one can conquer material misery, death and even time itself. Time is so powerful that even Lord Brahma (the oldest and most intelligent person in the universe) fears it, but Krishna's devotee need not fear time. The devotee knows that his self is not the body. The modes of nature generally bind us to the material world, but the devotee knows how to use the mode of goodness as a launch pad to blast himself off on a trajectory back to Godhead. Because of their perverted and sinful activities, materialists cannot understand Krishna. Material philosophers therefore come up with so many different theories, doctrines and religions. These are created to match their particular mix of the modes of nature (sattva-, raja- and tama-guna), but have no real substance.

After hearing this nice prayer by Markandeya, Nara-Narayana offered him any benediction he might desire. The sage answered that just seeing his worshipable Lord was all he desired. He could imagine no greater gift. However, he was curious about the illusory energy (maya). He asked to understand how it could bewilder so many people into thinking material life was the one true reality.

The Lord ruefully promised to fulfill his wish and then disappeared.

Markandeya went on meditating for a few years when suddenly a strong wind started to blow. Soon after, it started raining very heavily. The intense rains caused severe flooding. This hurricane went on continuously for many years. The intense weather eventually caused the entire surface of the earth to become flooded. Practically all species died off in this intense atmosphere. Gigantic sharks roamed the wild waters. The flooding even spread to the higher-dimensional space of the heavenly planets. It was the devastation at the end of the day of Brahma.

Markandeya was swimming and drifting throughout all of this. He lost all sense of orientation, he felt intense hunger and thirst, he got attacked by sharks, he felt extreme pain from various injuries, he was completely exhausted continuously fighting for his life, he frequently fell ill, he felt lamentation, happiness (when he temporarily escaped some danger), fear and misery. This went on for many, many years, all throughout the night of Brahma (4.32 billion years).

After an extremely long time drifting in the waters of devastation, Markandeya spotted a small island with a banyan tree growing on it. In one corner of the tree he saw a young child. As he swam closer to the island he noticed the wonderful beauty of the child. He noted his blackish-blue skin, wonderful jewelry, shark-shaped earrings, auspicious bodily markings and nice cosmetic decorations.

Then, suddenly, the child inhaled and began to suck everything surrounding him into his mouth. Markandeya also got sucked into the mouth of this wondrous child. Within the mouth he saw his old hermitage, the waters of devastation, the heavenly planets, the creation and destruction of the universe, everything, the entire universal manifestation; he even saw time itself, past, present and future, all at once. The child then exhaled and Markandeya found himself spat out back into the waters of devastation.

As he once again began to struggle to keep his head above the waters, he suddenly found himself transported back to his old hermitage, as if nothing had happened. He then realized: "Oh ... so this is the power of the illusory energy!".

And I realize: a PhD is nothing compared to that.

PhD result: not quite there yet

27 March 2008 | 0 Comments

I had my PhD viva (final oral exam) a few weeks ago. After an incredibly grueling 4:40 hours the result is: "major corrections without the need for another examination".

The examiners were happy with my performance in the viva, but they thought that the thesis had some major shortcomings which needed to be corrected before awarding me the title. They estimate that about two months more work is necessary to make the corrections. Then I have to re-submit the thesis, pay a £250 "admin fee" and the corrected thesis gets sent to both examiners for review and approval.

This is somewhat of a disappointment, but it could have been a lot worse. At least I (kind of) passed the exam. Still, the grind goes on...

Michael Uschold on Semantic Technology

18 December 2007 | 221 Comments

I attended a presentation by Michael Uschold of Boeing's Phantom Works. He talked about ontologies and semantic applications and the pressing need for them in today's software industry. I thought it was a great presentation. The following is a summary of his ideas from what I gathered while listening:
Value
Dr. Uschold explained that when one is talking to someone about semantics one needs to sell its value. One should provide answers to the following questions: how will semantics help? Why is it better? What is the cost / benefits? Where will it fit in the architecture?

For example: there was a task at Boeing that required someone to write a report every three months. Writing the report involved the guy formulating a bunch of database queries, loading the results into Excel, messing around with the data a bit to shape it into the required form, and then writing the report. Altogether this was a 20-hour task. Doing the same task with an ontology would be much quicker and produce a more accurate and more complete result. This is because ontologies use the same schema (or language) for everything in the workflow. There is no need to convert between different data representations.

So, the value of ontologies for IT systems is that they allow systems to be more tightly coupled. In a traditional system the semantics are implicit. That is, they are hard wired into the system. You can't see them, you can't change them and you can't maintain them. So, more often than not, the system's requirements are out of sync with the applications'. For example: suppose someone creates a model (in UML) and writes the code according to that model (in Java), then the requirements change and the code is updated to match the new requirements, but no one ever updates the model. Over time the model and the code grow further and further apart until the model is all but useless. With an ontology the model is directly used to drive the system. Any change to the requirements requires a change to the ontology model and that, in turn, results in a change to the system. The result: everything is up-to-date all the time. This is the holy grail of semantic systems: a model driven architecture (remember that buzzword!).

Benefits
The benefit of semantics is that they allow common access to information. Ontologies have unambiguous formal semantics. So, for example: in a semantic data warehouse, the ontology can provide a common schema for querying multiple databases; when doing system integration, the ontology allows for enterprise wide interoperability; and when capturing organizational knowledge, the ontology allows this knowledge to be stored, queried and accessed throughout the organization.

Speaking of querying: semantics enable better search. Semantic search goes a step beyond basic keyword-based search. It allows for detailed and very specific question answering and document retrieval.

Semantics offer many benefits in knowledge management. They allow organizations to retain knowledge (e.g. when people retire), share knowledge and enable communities of practice (by e.g. informing people throughout the organization about who knows what). Semantics enable secure knowledge authoring and storage, since a rich ontology- or rule-based specification can accurately and reliably control everything that anyone is allowed to see and/or change. Semantic knowledge management would be especially useful for compliance with the Sarbanes-Oxley Act (which all large organizations are severely struggling to comply with, because it is so ridiculously complicated).

Semantic technology allows for lean and agile application development. With a database you are stuck with a given schema that was designed according to a specific problem scenario. Want to ask a different question? Then you had better get ready to spend at least two days rewriting all your SQL, or watch your performance go down the drain like nobody's business. The ontology allows for improved reliability, consistency and reusability. People still don't know how to reuse code. An ontology, however, is built for re-use.

So, in short, the benefits of semantic technology are: flexibility, flexibility, flexibility!

Limitations
Ontologies do have some limitations, however. They can't do everything.

For one, scaling is a big issue. Reasoners currently have difficulty providing efficient a-box reasoning (answering questions about a large number of individuals/instances), as well as dealing with very large ontologies. Then there also is not much in the way of commercial application support for ontologies. The triple stores on the market are, for the most part, really, really dumb. They just store triples. If you want any reasoning support at all, you need to do it yourself.

Then there is workflow control. There needs to be more support for collaborative ontology development and change management. Large groups need to be able to concurrently build ontologies.

Another major issue that is limiting the adoption of semantic technology is that it is pretty much impossible for a normal person to understand. Take OWL restrictions, for example (please!). To describe a "big red ball", one needs to write: class: ball, that has an anonymous superclass of which "some values from" are restricted over the property "hasSize" with the filler of the class "Big", and "some values from" are restricted over the property "hasColor" with the filler of the class "Red". How bizarre is that?! The non-logician/non-geek just wanted to describe a ball, not get into the details of hopelessly complicated formal logic (and that was an easy example!). The complicated stuff really needs to happen behind the scenes.
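To make the "big red ball" example a little more concrete, here is a minimal sketch in plain Python. It is not a real OWL API: the SomeValuesFrom class and the render function are hypothetical names made up for illustration, and the printed text only loosely follows OWL's Manchester syntax. The class name "Ball" simply mirrors the wording quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SomeValuesFrom:
    """An existential restriction: 'prop some filler' (hypothetical helper)."""
    prop: str
    filler: str


def render(class_name, restrictions):
    """Render a class and its restrictions as readable text, one per line."""
    lines = [f"Class: {class_name}"]
    for r in restrictions:
        lines.append(f"    SubClassOf: {r.prop} some {r.filler}")
    return "\n".join(lines)


# The two restrictions from the spoken description above.
big_red_ball = render("Ball", [
    SomeValuesFrom("hasSize", "Big"),
    SomeValuesFrom("hasColor", "Red"),
])
print(big_red_ball)
```

Even in this compact form the user has to know what a restriction and a filler are, which is exactly the complaint: a friendlier tool would generate this from something like "big red ball" behind the scenes.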

Finally: we still need code. Ontology models can't yet drive the whole system. They are just a small part of a very big picture.

Questions that need answering
There are a few common questions that people in industry need to have answered before they will adopt semantic technologies. These include: how do I use my ontology in my architecture? How do I integrate this into my Eclipse framework? How does it link into my middleware? Which API(s) should I use? Will I have to roll-my-own all the time, or can I use some kind of IDE for ontologies?

So, what we really need is a book that covers: semantic middleware and semantic programming (i.e. telling the reader: "this is Jena and this is what it does", "this is Jess and this is what it does", etc.). That, coupled with an ontology programming interface that abstracts some of the APIs and programming tasks needed for ontology development, would go a long way towards enabling the adoption of semantic technologies in real-world applications.

PhD Thesis Submitted!!

17 December 2007 | 1 Comment

Yippee! Today I submitted my PhD thesis. All 256 pages of it are now in the graduate office being processed. Copies will eventually be sent to my examiners. So, now the only thing left to do is wait for my final viva.

Paper accepted at WoMo 2007

28 September 2007 | 2 Comments

I just had a paper accepted for publication at the Second International Workshop on Modular Ontologies (WoMo 2007) co-located with the Knowledge Capture conference (K-CAP 2007). My paper is "The State of Multi-User Ontology Engineering".

You can download the paper here, or in the publication section of this website. This will be the last paper I publish for a while. From now on it's exclusively PhD thesis writing for me.

6 vedic atoms = 1 photon

29 June 2007 | 22 Comments

"The division of gross time is calculated as follows: two atoms make one double atom, and three double atoms make one hexatom. This hexatom is visible in the sunshine which enters through the holes of a window screen. One can clearly see that the hexatom goes up towards the sky." (SB3.11.05)

Scientists currently believe that the photon (also known as light) is the transmitter particle (gauge boson) for the electromagnetic force. Photons supposedly have no mass and no electric charge. It is said that Einstein was the first person to theorize that these particles should exist (except he wasn't the first - not by a long shot!).

Photons (obviously) travel at the speed of light. They can be redirected by gravity (not because gravity attracts the photon like e.g. a magnet attracts iron, but because gravity bends the very space through which the photon flies).

Photons are strange because they behave both as waves and as particles at the same time (as demonstrated in the famous double-slit experiment).

Besides photons, which we "see" every day, there are supposedly a few other gauge bosons, or carrier particles for fundamental forces of nature. Specifically, there are supposed to exist W and Z bosons (which supposedly cause the weak nuclear interaction), gluons (which supposedly cause the strong nuclear interaction) and the (totally speculative) gravitons (which supposedly cause gravity - although no one has ever detected a graviton).

Physicists are hard at work trying to figure out how these particles fit together in a grand unification theory. They believe that if they figure this out they will understand everything there is to know about the elegant universe with no need for primitive gods, deities and other "unscientific" stuff like that.

And here we have the Srimad Bhagavatam stating quite plainly and clearly, thousands of years before the advent of modern physics (or more precisely: the sage Maitreya speaking to Vidura sometime around the year 3102 B.C.), that the photon is actually made up of 6 (specifically 3 groups of 2) atomic particles. These Vedic Atoms (parama-anuh) are the true fundamental particles of nature. In different combinations these particles presumably also make up the other gauge bosons.

So, there we have the much vaunted unification theory.

Why do theoretical physicists not take notice?

Update: (disclaimer) My statements above are called into question by some good counter arguments in the comments to this post. This is not to say that the article is incorrect, but I nevertheless advise anyone reading this to read the comments and make up their own mind based upon what they think are the most reasonable assumptions.

Paper accepted at K-CAP 2007

27 June 2007 | 3 Comments

I just had a paper accepted for publication at this year's Knowledge Capture conference (K-CAP 2007). My paper is "A Methodology for Asynchronous Multi-User Editing of Semantic Web Ontologies". It will serve as the basis of my upcoming PhD thesis.

You can download the paper here, or in the publication section of this website.

So, see you in Whistler, Canada in October.

Bogus intelligent design

12 February 2007 | 3 Comments

I attended a talk by a "science communicator" who was visiting my University. He was speaking on intelligent design from a neutral (yeah, right) perspective.

He outlined both the evolution and intelligent design theories. He quoted the anti-evolution argument about the molecular motors that some bacteria use to propel themselves. These little spinning corkscrew propellers consist of over 30 different proteins. Anti-evolutionists have long argued that it would be impossible for these 30 proteins to come together in just the right configuration all at once in one evolutionary step, yet they would have had to in order to form a working and useful motor. However, apparently scientists have now discovered a bacterium that does a similar thing with just 6 proteins. Ha! (although how or why they got from 6 to 30 is not yet known)

He also gave the famous example of the eye, which is way too complicated to have "evolved". However, scientists have now discovered "light sensitive skin". Creatures with such skin obviously gradually evolved into animals with modern eyes. Ha! (although the exact details of how this happened are not yet known)

Another common misconception is that evolution happens by "chance". It is not at all chance. There is no planned outcome. It is not like drawing a specific pair of cards from a deck of cards (which would have a small probability). Much rather, it is like getting any pair of matching cards from a deck (much more likely). Lots of different evolutionary paths will work. Nature just happens to have evolved the way it has. If the Universe's dice had rolled differently then we'd all be completely different. So, the ridiculously low probabilities quoted by some opponents of evolution are inaccurate. The real probabilities are much higher (although the outcome is still pretty unlikely).
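The card analogy can be checked with a little arithmetic. The following is only a sketch under my own assumptions: a standard 52-card deck, two cards drawn, and "matching" taken to mean "same rank". None of these details were spelled out in the talk.

```python
from fractions import Fraction
from math import comb

# Number of different two-card draws from a standard 52-card deck.
two_card_draws = comb(52, 2)

# One specific pair, e.g. exactly the ace of spades plus the ace of hearts.
p_specific_pair = Fraction(1, two_card_draws)

# Any matching pair: 13 ranks, and comb(4, 2) = 6 suit combinations per rank.
p_any_pair = Fraction(13 * comb(4, 2), two_card_draws)

print(p_specific_pair)               # 1/1326
print(p_any_pair)                    # 1/17
print(p_any_pair / p_specific_pair)  # 78
```

Under these assumptions "any pair" is 78 times more likely than one particular pair, yet it still only turns up about once in every 17 draws.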

The final stake in the heart of intelligent design is the motivation of intelligent design advocates. Leaked internal documents reveal that they are all Christians who are trying to use it as an inroad to have their religion taught in public schools. This is against the American Constitution, so it is no wonder that the "evolution is just a theory" stickers on textbooks and other such attempts get struck down by the courts. The judges aren't stupid. They know there is an ulterior motive behind it.

Christians are being trained up in special universities like the elite Patrick Henry College and the Opus Dei society. They are then tasked with infiltrating key positions of power in school boards, etc. to push their (unconstitutional) Christian agendas.

After his "neutral" talk I asked him about Michael Cremo's books. His answer (and I paraphrase):

Oh yeah, he is another one of those religious types. Which organization does he belong to? The Hare Krishnas, right? However, he does come up with a few very uncomfortable facts. So, yes, I recommend everyone at least has a browse through one of his books. But, don't read any of them, because they are - like - "this" thick. But keep an open mind and at least look at some of the controversial archeological findings he presents.

So, in summary (according to this science communicator person), intelligent design is a concocted idea that ultimately aims to have Christian creationism taught in schools. Science (the new God) will very soon discover the exact detailed mechanism of evolution (even if a few minor missing links are still missing at the moment). And the world will continue to ignore the extremely detailed (non-Christian) intelligent design theory offered by the Vedic literature (even if it does make perfect sense and answer many of the open questions).

new Vedicsoc: Fresher's Fair

22 September 2006 | 34 Comments

A new University semester is upon us and Vedicsoc is back in business!

We just had the Fresher's Fair at the University. Up to four days of hordes of students being induced to join every kind of club or society one might imagine. I chose two days in the prime location (UoM Academy) for Vedicsoc's recruitment efforts. Kamren helped me.

We distributed loads of prasadam (Coconut Ice and Chinese Almond cookies), as well as 1000 flyers (and foolish me thought I had printed too many). 166 interested people put their email address down to be put on our mailing list.

On advice from Joy I added a timetable of events to the back of the flyers. A definite schedule of interesting topics should hopefully attract more people. I also set the price at £1 per session, pay-as-you-go. People liked the cheap price for a two-hour-long session, as well as the fact that they didn't have to commit to anything.

The fair itself was pretty intense: loud noise everywhere, wall-to-wall people and discarded flyers all over the place.

My realizations:

  • Asian people are becoming more interested in yoga/meditation. We had quite a few Chinese and Japanese students come by, ask questions and sign up. In previous years there was zero interest from students from those countries.
  • Students are getting older. Excessive sense gratification is prematurely aging young people. I remember when the freshers at University looked like little kids. Now I can hardly tell the difference between someone who is 18 and someone who is 28. All their innocence has been lost long long ago.
  • (As my spiritual master also has said) men are generally spaced-out and women are angry. Indulging the senses destroys a man's intelligence and he becomes a spaced-out zombie. Women hope to get some emotional fulfillment from sense indulgence and are (inevitably) disappointed and angry when it does not result.

I tried to capture some of these ideas with my camera as I was distributing flyers and shouting at people trying to get their attention so they would join Vedicsoc. You can view the result of my photographic endeavors here. I think the pictures nicely illustrate the sad and sorry state of the student community (if I accidentally took a picture of anyone reading this blog entry and you don't want it displayed, please email me and I'll remove it).

Sorry for the low quality of the pictures. It was quite dark in the room and I had to resort to less than ideal ISO settings and shutter speeds.

On another side note: I've been watching the Radiant Vista Daily Critique. It is an excellent daily 5-minute video photo critique by master photographer Craig Tanner. He takes viewer/listener submitted photos and gives some encouraging words, as well as suggestions for improvement. I've learnt a lot about photography from these podcasts. I've implemented some of what I've learnt in this latest series of pictures. Further comments and suggestions are, of course, welcome.

Paper accepted at ER2006

8 July 2006 | 727 Comments

I got a paper accepted at the ER2006 conference! That's the 25th International Conference on Conceptual Modeling to be held in Tucson, Arizona, USA from November 6 - 9, 2006 (ER stands for Entity-Relationship - the age-old method of conceptual modeling in databases).

My paper on "Representing Transitive Propagation in OWL" was accepted in the peer-review process. A total of 158 papers were submitted and only 37 were accepted (23.4% acceptance ratio).

I got high marks for Originality and Presentation, but low marks for Significance (when I say "low", that means a "neutral = 4" rating, rather than an "accept = 6" rating; ratings were out of 7). That is fair enough. This research isn't the main, innovative, ground-breaking thrust of my PhD. It is just something interesting that came up as a side-idea.

The paper is available in the academic section of this website.

WWW2006 day 5: return journey

30 June 2006 | 2 Comments

One "interesting" thing happened to me on my way back from the WWW conference:

I took a late train from Edinburgh to Manchester. I arrived at Manchester at about 11pm on a Friday night. Lots of students were out and about "enjoying" the kali-yuga delights. I was walking down the road from the train station, wondering if I should take a taxi, or walk home when ...

Suddenly, out of nowhere, three Mercedes police SUVs appear. They stop in the middle of the road, blocking all the traffic. Within two seconds several police officers in full body armor pile out of the cars, draw their pistols, move in on a group of students, scream at one student to "drop it!" and have the guy in an arm-lock, pinned to the floor.

It happened so fast I didn't have time to react.

It seems one geeky looking teenage student had been brandishing a gun to get some respect (unusual in the UK, since even the police here do not carry firearms - except, of course, for the Armed Response Units, such as the one that I happened to witness in action). The guy certainly suffered the consequences.

All that made me desire less and less to stay in the UK. Manchester: crime city.

WWW2006 day 5: evaluating websites + free music

30 June 2006 | 1 Comment

There was a talk on "The Web Structure of E-Government - Developing a Methodology for Quantitative Evaluation".

The researchers from University College London (UCL) used several statistical measures for evaluating government websites: worst-case strongly connected components, incoming vs. outgoing links, path length between pages, etc. They compared their statistical measures with results from user evaluations. That is, they got a bunch of users together and measured how long it took them to find stuff on various websites (both with and without using Google).
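To give a feel for what such link-structure measures look like, here is a minimal sketch on a made-up four-page site. The page names and the links dictionary are hypothetical, and this is not the UCL researchers' actual methodology, just the textbook versions of the measures named above.

```python
from collections import deque

# Hypothetical toy site: each page maps to the pages it links to.
links = {
    "home": ["visas", "contact"],
    "visas": ["apply", "home"],
    "apply": ["visas"],
    "contact": [],
}


def out_degree(page):
    """Number of outgoing links on a page."""
    return len(links[page])


def in_degree(page):
    """Number of pages that link to this page."""
    return sum(page in targets for targets in links.values())


def path_length(start, goal):
    """Fewest clicks from start to goal (breadth-first search), or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, clicks = queue.popleft()
        if page == goal:
            return clicks
        for nxt in links[page]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, clicks + 1))
    return None


def strongly_connected(a, b):
    """True if each page can be reached from the other by following links."""
    return path_length(a, b) is not None and path_length(b, a) is not None


print(in_degree("visas"), out_degree("visas"))  # 2 2
print(path_length("home", "apply"))             # 2
print(strongly_connected("home", "apply"))      # True
print(strongly_connected("home", "contact"))    # False
```

The "contact" page here is a dead end: you can reach it, but no link leads back, so it falls outside the strongly connected part of the site.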

They tested the UK, Australia and USA immigration websites. The results:

  • UK is best, both navigating the link structure and searching
  • AU is terrible to navigate, but good to search
  • USA is bad any way you look at it, but at least search will eventually find you what you are looking for.

Automated statistics don't tell you much.

More info at: www.governmentontheweb.org

This was followed by a talk by Ian Pascal Volz from the Johann Wolfgang Goethe University in Germany. He talked about "the Impact of Online Music Services on the Demand for Stars in the Music Industry".

His main (and interesting!) finding is that people tend to buy music they already know and like from online music stores like the iTunes Music Store. Peer-to-peer file sharing networks, on the other hand, tend to get people to try and discover new music. Virtual communities are somewhere in between the two.

People who buy music will not spend any money on something they don't already know and value. Even $1 per song is too high a price for a casual purchase. If you want people to discover your music and you are unknown it must be available for free.

On a related topic: when recording lectures on spiritual subject matter, please, please, please don't try to charge for them. No one will pay. Make them available for free. That way the whole world will benefit.

And so ends the WWW2006 conference. Next stop Banff, Canada for WWW2007.

WWW2006 day 5: semantic wikipedia

28 June 2006 | 0 Comments

A presentation by some researchers from Karlsruhe, Germany was very interesting (well presented, too). They talked about their "semantic wikipedia", an extension to the popular MediaWiki that allows authors to express some semantics, i.e. to get at the hidden data within the articles.

The normal wikipedia only has plain links between articles. Nevertheless, it is the 16th most successful website of all time (according to alexa.com). However, in the semantic version every link has a type. Object properties map concepts to concepts and datatype properties map concepts to data values.

Why do it this way? Answers: adding these annotations is cheap and easy (no new UI), they can be added incrementally and there is no need to create a whole new RDF layer on top of the existing content; the annotations are right there in the wiki text.

This simple addition is enough to allow for powerful queries. You can create pages that automatically pull in all articles of a specific category, with a specific title and between a specific date range, for example. Checking for completeness becomes easier too: you can construct a query that tests if every Country has a Capital. If some countries come up that don't, those can be easily fixed.
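The completeness check is easy to picture once the typed links are thought of as (subject, property, value) triples. The sketch below is plain Python with made-up data; it does not use the semantic wiki's actual query syntax, and the function names are hypothetical.

```python
# Hypothetical typed links extracted from wiki articles.
triples = [
    ("Germany", "is a", "Country"),
    ("Germany", "capital", "Berlin"),
    ("France", "is a", "Country"),
    ("France", "capital", "Paris"),
    ("Canada", "is a", "Country"),  # capital annotation not added yet
]


def instances_of(category):
    """All subjects annotated as belonging to the given category."""
    return {s for s, p, o in triples if p == "is a" and o == category}


def missing(category, prop):
    """Members of the category that have no value for the given property."""
    has_prop = {s for s, p, o in triples if p == prop}
    return sorted(instances_of(category) - has_prop)


print(missing("Country", "capital"))  # ['Canada']
```

Whatever comes back from such a query is the to-do list: articles that still need an annotation.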

The whole thing self-regulates. Each property has its own page in the wiki, so that people can suggest property types and eventually come to a consensus about which properties are the right ones to use.

The wiki can be imported into OWL and vice versa. The template system can also be leveraged to quickly create semantic annotations.

The whole thing is a win-win-quick-quick scenario (bit of an in-joke there).

WWW2006 day 5: Active Navigation

27 June 2006 | 53 Comments

Over lunch I bumped into John Darlington, the former CEO of Active Navigation, a small company (spin-off from Southampton University) that I worked for a while ago. John is now working for Southampton University as a Business Manager and was involved in organizing the WWW conference.

Active Navigation was a very nice place to work. It had the atmosphere of a small start-up without the killer, passionate, burn-out, no-holds-barred pace.

The company creates a server technology that automatically injects hyperlinks into web pages pointing to relevant, related pages on the same website. Website navigation can be improved by using these injected links. If someone, for example, creates a web page containing the word "ontology" and someone else has written a web page that also contains the word "ontology", then the server transforms those words into links to each other's web pages. Someone browsing the website could find the two related pages by clicking on the automatically created link.
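The basic idea can be sketched in a few lines. This is only a toy illustration with made-up pages and a single hard-coded keyword; it says nothing about how Active Navigation's product actually chooses keywords or ranks related pages.

```python
import re

# Hypothetical pages on one website: URL -> plain text content.
pages = {
    "/alice": "My notes on ontology design.",
    "/bob": "Why every ontology needs a glossary.",
    "/carol": "Lunch menu for Friday.",
}
keywords = ["ontology"]


def inject_links(url):
    """Turn the first occurrence of each keyword into a link to another page that also uses it."""
    text = pages[url]
    for word in keywords:
        pattern = rf"\b{re.escape(word)}\b"
        # Other pages mentioning the same keyword, in a stable order.
        others = [u for u in sorted(pages)
                  if u != url and re.search(pattern, pages[u])]
        if others:
            link = f'<a href="{others[0]}">{word}</a>'
            text = re.sub(pattern, link, text, count=1)
    return text


print(inject_links("/alice"))  # My notes on <a href="/bob">ontology</a> design.
print(inject_links("/carol"))  # Lunch menu for Friday.
```

The page authors never wrote those links; they appear only when the page is served, so the related pages stay connected as content is added or removed.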

John called me over: "Julian! Wow, great to see you!"

Turning to Nigel Shadbolt next to him: "Julian here worked for me for a while, then disappeared into the ether, as you do, and now: I'm chairing a session (the one on education), look down and who do I see? Julian, asking a question!"

He suggested I might look into digital media production in New Zealand as a possible career path. Ever since Lord of the Rings that has apparently taken off in a big way down-under.

WWW2006 day 5: ontology + dictionary

26 June 2006 | 1 Comment

Harith Alani presented his position paper on building ontologies from other online ontologies. He explained how building ontologies is difficult, so it is best to reuse existing knowledge bases, or, even better, completely automate ontology construction. The current state of affairs is that there are quite a few ontology editing tools, but little support for reuse. Furthermore, these tools are built for highly trained computer scientists, not the average web-developer.

His idea is to combine three existing research areas:
Ontology libraries (e.g. DAML library, Ontolingua) and ontology search engines (e.g. Swoogle) can be used to locate ontologies on the Internet.
Ontology segmentation techniques (like mine) can be used to cut smaller pieces out of these ontologies.
Ontology mapping techniques can be used to reassemble the pieces into new ontologies.

Result: instant custom ontology. However, to get this working in practice takes quite a bit of doing. He himself admitted that it was quite an ambitious undertaking. Good idea though.

Mustafa Jarrar (from Belgium) and Paolo Bouquet (from Trento, Italy) presented the next two papers. They talked about a very similar topic. Both were advocating linking ontology terms to dictionary / glossary definitions.

It was interesting to observe these two researchers' presentation styles. Paolo was very fast and frantic, very much unlike Mustafa who was very slow and relaxed, even when trying to hurry (Vata vs. Kapha, for those knowledgeable in Ayurveda).

Mustafa told of how he built a complex ontology for some lawyers, but, after he had gone through the trouble of carefully constructing this knowledge base, the lawyers found it to be too complicated to understand and threw everything except the glossary part away. However, they did really like and appreciate having a sensible glossary of all kinds of law-related knowledge.

He defined this "gloss" as:

auxiliary informal (but controlled) account for the common sense perception of humans of the intended meaning of a linguistic term

The glosses should be written as propositions, consistent with the formal definition, focused on the distinguishing characteristics of what is being described, sufficient, clear, use supportive examples and be easy to understand.

Advantages are that these glosses are highly reusable (very important for his lawyer clients) and that they are very easy to agree upon.

So everyone: link your ontology to WordNet (or something better)!

Paolo picked up the issue and talked about his WordNet Description Logic (WDL), an extension to DL that adds lexical senses to the vocabulary of logic. It allows for compound meanings. So, UniversityOfMillan is automatically inferred as University that hasLocation some Millan.

Using this type of dictionary-link makes it possible to check for errors by comparing the glossary definition to the logical semantics. If they don't match, a potential error can be flagged.

His system also allows for bridging and mapping between ontologies. If two ontology concepts refer to the same dictionary definition, then that is a very good indication that they are describing the same sort of thing.
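The mapping heuristic can be sketched in a few lines of Python. This is only an illustration of the "same dictionary definition" idea, not WDL itself: the concept names and the sense identifiers (written in a WordNet-like "word.pos.number" style) are hypothetical.

```python
from collections import defaultdict

def candidate_mappings(ontology_a, ontology_b):
    """Pair up concepts from two ontologies that link to the same dictionary sense.

    Each ontology is a dict mapping a concept name to a sense identifier.
    """
    by_sense = defaultdict(list)
    for concept, sense in ontology_a.items():
        by_sense[sense].append(concept)
    pairs = []
    for concept_b, sense in ontology_b.items():
        for concept_a in by_sense.get(sense, []):
            pairs.append((concept_a, concept_b, sense))
    return sorted(pairs)

# Hypothetical concepts and sense identifiers, invented for this example.
law_ontology = {"Solicitor": "lawyer.n.01", "Court": "court.n.01"}
legal_terms = {"Attorney": "lawyer.n.01", "Tribunal": "court.n.01", "Fee": "fee.n.01"}

mappings = candidate_mappings(law_ontology, legal_terms)
```

Concepts with no shared sense, such as "Fee" here, simply produce no candidate mapping.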

WWW2006 day 5: health care

17 June 2006 | 2 Comments | Tags: , ,

UK National Health Service (NHS): web-enabled primary care is finally coming, but is still super-clunky. And forget technology use in secondary care, it's non-existent. If only there was a central registry of patients' records. That would be really useful both for patients and statistical medical research. It would also be very cost effective.

The NHS is spending £6 billion on modernizing its information technology. Unfortunately, despite being only about one year into the project, they are already £1 billion over budget.

I know from first-hand ontology building experience that the Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT), which is supposed to underlie this whole revamp, is an extremely poorly architected ontology. A disaster just waiting to happen.

USA Health IT: IT in health could prevent some of the 90,000 avoidable annual deaths due to medical errors. Tests often have to be re-done, because it's cheaper to re-test someone than to find the previous lab results. We need to get rid of the medical clipboard!

Knowledge diffusion is super-slow. It takes 17 years (!) for observed medical evidence to be integrated into actual practice. Empower the consumer (while also providing privacy and data protection). Also, empower homeland security to protect us from the evildoers.

Most practices don't have Electronic Health Records (EHR). Those would enable some degree of data exchange between practices, which would benefit a practice's competitors. The patient would be less tied to one doctor. Less tie-in means less profit. So, in the fierce competitive market of for-profit health care, there is little reason to go electronic.

However, SNOMED will help (... or so they say).

WWW2006 day 4: application demos

14 June 2006 | 0 Comments | Tags: , , ,

Now the chance for up and coming semantic web developers to demo their killer applications. The apps that will revolutionize the Internet, on display.

Tim Berners-Lee (who uses a Mac, by the way) showed his Tabulator RDF browser. He gave a brief talk and demo of the app. It gives an "outline" style view of RDF and asynchronously and recursively loads connecting RDF using AJAX technology. It follows the 303 redirects, follows # sub-page links, uses the GRDDL protocol on xHTML and smushes on owl:sameAs and inverse functional properties (the killer feature, apparently).

Some commented to me afterwards that they thought that no one should ever have to see the RDF of a semantic web application, let alone browse it. Oh well.

Then came DBin. Not just a browser, no, a semantic web rich client! It uses so-called brainlets (HTMLS) and a new semantic transport layer (not HTTP) to dynamically query and retrieve RDF using peer-to-peer transfer.

Again, I'm skeptical. It is just (yet another useless) RDF browser that saves bandwidth by sending the data through a peer-to-peer network. But RDF file sizes aren't exactly huge and compression will do far more than peer-to-peer to help with bandwidth. This browser is solving a problem that doesn't exist.

Next up: Rhizome. A Python-based app that allows one to build RDF applications in a wiki style. It uses a Raccoon application server to transform incoming HTTP requests into RDF, evolve them using rules and validate them using Schematron. In short, it is to RDF what Apache Cocoon is to XML. Or, in more understandable terms: you declaratively build your web-site using RDF for everything from the layout to the database.

Pity, of course, that no one uses Cocoon and this Rhizome system looks really complicated, despite being pitched at "non-technical folks".

At this point I left the semantic web demo session. My thinking: these guys are nuts.

I caught the end of an entirely different demo/paper. The Guide-O system by researchers from Stony Brook University in New York.

It uses a shallow (read: simple) ontology to label areas of a web page according to their functional roles. It also creates a hierarchy of elements inside of each area or module. The third component of the system is a Finite State Automaton for moving between functional states of the website.

Putting these three things together allows one to identify common trails of FSA transitions. That is, processes which users tend to perform regularly. Having identified these trails, one can cut out all the modules that do not contribute to the task. All useless clutter is eliminated from each web-page.
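The trail-and-prune idea can be sketched roughly as follows. This is not the Guide-O implementation; the session logs, state names and module names are hypothetical, and "most common trail" is taken to mean the most frequent complete sequence of states, which is a simplifying assumption.

```python
from collections import Counter

# Hypothetical click logs: each session is a sequence of functional states.
sessions = [
    ["search", "results", "item", "checkout"],
    ["search", "results", "item", "checkout"],
    ["search", "results", "reviews"],
]

# Hypothetical modules shown in each state, and the state each one leads to
# (None for modules that do not move the user anywhere).
modules = {
    "search": {"search_box": "results", "banner_ad": None},
    "results": {"result_list": "item", "sidebar": None, "review_link": "reviews"},
    "item": {"buy_button": "checkout", "related_items": None},
}

def most_common_trail(sessions):
    """Return the most frequent complete sequence of states."""
    return list(Counter(tuple(s) for s in sessions).most_common(1)[0][0])

def prune(trail, modules):
    """Keep, per state, only the modules that lead to the next state on the trail."""
    kept = {}
    for state, next_state in zip(trail, trail[1:]):
        kept[state] = [m for m, target in modules[state].items() if target == next_state]
    return kept

trail = most_common_trail(sessions)
kept = prune(trail, modules)
```

Everything not in `kept` (the banner ad, the sidebar, the related items) is the clutter that would be dropped for a user following that trail.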

Result: mobile web surfing could be done twice as fast as before, and blind web surfing (using a screen reader) four times as fast.

Future work: mining for workflows, using web services and analyzing the semantics of web content. Problems: coming up with a standard way to describe the process and concept models. A system for semantic annotation of web content is needed.

I was impressed. It sounds like a really good idea. It takes three relatively simple ideas and combines them into something innovative and powerful. Nice.

WWW2006 day 4: education

13 June 2006 | 3 Comments | Tags: , ,

I attended a session on computing and education.

Tim Pearson said:
Schools in the UK spend just 1% of their budget on training and information technology. Businesses, in comparison, spend 3%.

Schools like the web. It means less Microsoft, less expensive, in-school equipment, easy home access and is known to be modern/cool. Web 2.0 is great for lots of applications, but will never completely replace a rich-client for: hardware access, serious graphical work, immersive virtual reality and complex-process based assessment.

Learning is transforming into something more self-driven, interactive, open-ended and creative. Teachers will spend less time lecturing and more time mediating.

In terms of administration: schools need to seriously look into getting some decent web-based admin, record keeping and curriculum planning applications. Crazy that this kind of stuff is still often done by hand, or in an Excel spreadsheet.

Gordon Thomson from Cisco said:
The IQ is dead as a measure of a good student. Better: passion + curiosity!

Cisco is working on innovative teaching solutions such as Telepresence. Imagine having a 3D image of Bill Clinton projected into the classroom to give a speech on global warming. It's like Star Trek.

The laptop is overrated. The $100 laptop, for example, is seen as the panacea to bridge the digital divide. However, in a few years technology will become so omnipresent that it doesn't matter anymore. What really matters is, first and foremost, that parents are interested in their children's education.

Addressing the challenge of web-based e-assessment, Neil T. Heffernan talked about an online exam system he and his students built. It doesn't just assess students, but also offers hints and advice as students get questions wrong. It can also detect differences in performance over time as students learn. Teachers can use it to monitor their students, see which areas they are struggling with and then invest more time in explaining those in the classroom. Indeed, evaluation showed that student knowledge could be predicted very well.

I thought it was a very interesting and well-designed system. Looked good. It actually made answering math questions on a website kind-of fun.

Finally Elizabeth Brown presented her research on "Reappraising Cognitive Styles in Adaptive Web Applications".

We process information either visually or verbally, globally or sequentially, reflective or impulsive, convergent or divergent, tactile or kinesthetic, field dependent or independent, etc.

Focusing on the visual/verbal issue she used the WHURLE adaptive hypermedia system to present students with a customized revision plan best suited to their individual learning style. However, after extensive analysis, she had to conclude, that the adaptive learning environment made no difference whatsoever to students' performance. It might actually result in less learning, since if a student is only subjected to content that matches his or her individual learning style, then he or she will never learn to adapt to compensate for imperfect information. Students did say they liked the system, however.

WWW2006 day 4: security

12 June 2006 | 0 Comments | Tags: , ,

The day started with Mary Ann Davidson, the chief security officer at Oracle Corporation and former Navy officer, giving a keynote talk on the critical issue of security.

She quoted the head of the department of homeland security in the USA as saying:

"a few lines of code can be more destructive than a nuclear bomb".

Poor security costs between $22 and $60 million per year (National Institute of Standards and Technology). People would never accept it if we built bridges as poorly as we build software. Software developers need to be accredited and licensed professionals like engineers are.

She ended with a quote from Thomas Jefferson (in a letter to George Hammond, 1792):

"A nation, as a society, forms a moral person, and every member of it is personally responsible for his society."

Then Tony Hey (former head of the Computer Science department at Southampton University and now working for Microsoft) talked about e-science, grids and high-performance computing and how Microsoft was getting into it. They would build simple grid services, based upon simple web services protocols. This will result in e-science mash-ups: people combining different services to perform a really useful new task.

His presentation was technically good, but used far too many words on far too many slides. With so much visual clutter, I stopped listening to him half-way through.

WWW2006 day 3: drinks reception

11 June 2006 | 53 Comments | Tags: , , ,

In the evening there was a food and drinks reception at the Edinburgh Castle.

The castle was impressive. Very large and imposing. I could literally feel the history of the place. Many, many wars were fought on its mighty walls. The entire city of Edinburgh has a unique ancient feeling to it. Of course, not everything was awe-inspiring. The dog cemetery, for instance, was laughable (sad, sad, sad).

The reception (price of admission = £50) involved pretty waitresses walking around with trays of expensive wine and hors d'oeuvres for everyone's enjoyment and nourishment. However, there was far too much wine and far too little food. Every time a food tray appeared, the poor waitress was jumped upon by a crowd of hungry researchers and raided for all she (or, more accurately, her food tray) was worth.

The food was completely abominable, too. Various varieties of dead animals. The only vegetarian options I saw were plates of deep-fried mushroom balls. Yum. Needless to say, I didn't eat or drink anything, nor did I have much opportunity to.

As the night wore on the who's who of the World Wide Web became more and more drunk. Give famous and powerful innovators, researchers and academics lots of free alcohol and they turn into "high-class" swaying, stammering simpletons. The British are especially renowned for their joy in and expertise at getting themselves utterly and completely drunk. It is, after all, the supreme form of enjoyment.

It was however a good opportunity to meet and rub shoulders with like-minded people from all over the world. I met lots of folks from my alma mater, Southampton University. However, with 1200 delegates attending, it was a bit too overwhelming. With so many people it is difficult to get to know anyone.

Feel free to browse the pictures of this event, as well as the rest of the conference here.

WWW2006 day 3: my presentation

10 June 2006 | 0 Comments | Tags: , ,

Right after the opening keynote came the ontology research track. This track included my presentation on ontology segmentation.

Peter Patel-Schneider gave the first talk in the session. It was a position paper presenting a new idea. He explained how the open-world semantics of classical logics was better suited to the wide-open world wide web, but the well known datalog (database) paradigm also had some useful attributes. The idea therefore was to merge the paradigms of datalog and classical description logic into an ideal hybrid web ontology language.

Sounded like a good idea to me. Unfortunately, he didn't see himself actually doing any work to realize this idea anytime in the foreseeable future.

Then came my presentation. I wasn't half as nervous as I thought I would be, despite there being over 100 people in the audience.

Unfortunately, I forgot to record the talk. Fortunately, I re-did the talk and you can view and listen to a quicktime movie of it.

I got lots of questions, though mostly of the "please explain this semantic web thing to me, I don't understand anything" variety. Many people were also taking pictures of my presentation slides with their digital cameras throughout my talk. I was one of the few people to use the Apple Keynote software to give my presentation. This software allowed me to produce slides that were vastly better-looking than the usual death-by-powerpoint variety.

I got lots of feedback afterwards. Here are a few things people said to me:

  • I liked and understood, you explained it well.
  • I liked the little story in the middle of your talk.
  • Well done, the story didn't go overboard. It lightened the mood, but wasn't overblown and made a good point.
  • Well done answering questions about what OWL is. I would have just told him to read the paper.
  • I really liked your presentation.
  • That was really interesting; this problem you are working on is part of a larger issue of academics who just work on toy examples and never consider large-scale problems.
  • I actually understood your talk, unlike the other two talks in the session, thanks.
  • I think your algorithm is flawed because of the irregularity in the data on slide number 21! (I proceeded to explain how a depth-first search strategy in my implementation would potentially result in such irregularities and that a breadth-first search would have resulted in a more regular, linear graph) Oh, now I understand; it was a really good talk.

WWW2006 day 3: next wave, semantic web

3 June 2006 | 59 Comments | Tags: , , ,

Wendy Hall started off the day talking about the trials and tribulations of organizing the conference. She had to put up a £0.5 million deposit to secure the conference center three years in advance. She could have kissed her career goodbye, if this conference had not been a success.

Next Charles Hughes, the president of the British Computer Society (BCS), spoke. He gave an utterly boring scripted speech about how computing needs to become a respected profession.

Carole Goble then spoke about the paper review process. The conference was super-competitive. 700 papers were submitted, over 2000 reviews issued, and only 84 papers accepted (11% acceptance ratio).

Thereafter came a panel discussion on the next wave of the web. Important people from research and industry talked about the semantic web. Business wants TCO figures, risk measures, abundance of skilled ontology engineers and stuff like that. Academia underestimated the amount of work necessary (and wants more grant money).

Ontologies can be used today: they are especially useful for unstructured information and to organize already structured information in database tables.

Tim Berners-Lee brushed off Web 2.0 as just hype. That's just AJAX and tagging. Folksonomy is not going to fly in the business world. The real, hard-core Semantic Web is where it's at. What's more: we're already there. We've reached critical mass, but just haven't realized it yet. All we need is for the right search engine to "connect the dots" and boom! Instant semantic web via network-effect (or something like that).

The right user interface is going to be the most difficult part. Browsers will need an "Oh yeah? Why?" button to query the RDF and give a justification for any entailment.

"Don't think of the killer app for the semantic web, think of the semantic web as the killer app for the normal web"

The value of the semantic web will be universal interoperability and findability. We have more information than ever before and are spending longer trying to find stuff. The semantic web will help automate some of the "find stuff". The search engines of today aren't sufficient when searching for information on Exxon Mobil, for example. That will return millions of hits.

Tim: "search engines make their money making order out of chaos, if you give them order, they don't have a business. That's why they are not interested in the semantic web"

Take home message from the panel:

  • "you ain't seen nothing yet"
  • "a lot of education still has to go on. It needs to get simpler for the average business person and there needs to be a lot more investment"
  • "we can already apply the first results in a business context"
  • "it's a great simplifying technology"

My take: they are quite right, we have indeed not seen anything yet ... if nothing else they certainly succeeded in securing the next 5 years of grant money ...

WWW2006 day 2: WOW professional webmaster

2 June 2006 | 6 Comments | Tags: , , ,

The World Organization of Webmasters tutorial session offered a chance to take an exam to become a certified professional webmaster. I thought, "what the heck": the exam normally costs $195 to take and here at the conference they are offering it for free, so I might as well give it a go.

The exam wasn't easy. One needed to answer 70% of the questions correctly to qualify as a professional webmaster. There were some tough questions. A typical question would be something like:

Which of the following is valid XHTML 1.0 / HTML 4.0 (mark all that apply):
a. <img src="image.gif" alt="the image" height=25 width=25 />
b. <strong><a href="link.html">click here</strong></a>
c. <DIV CLASS="style.css">text</DIV>
d. <img src="picture.jpg" alt="my picture" />
e. <hr><a href="page.html">next page</a><hr>
f. (none of the above)

Any guesses?
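One part of the question can be checked mechanically: XHTML has to be well-formed XML. The Python sketch below runs each option through the standard library's XML parser. It is only a well-formedness check under the assumption that each option is a fragment that can be wrapped in a dummy `<root>` element; it is not a full XHTML or HTML 4.0 validator.

```python
import xml.etree.ElementTree as ET

snippets = {
    "a": '<img src="image.gif" alt="the image" height=25 width=25 />',
    "b": '<strong><a href="link.html">click here</strong></a>',
    "c": '<DIV CLASS="style.css">text</DIV>',
    "d": '<img src="picture.jpg" alt="my picture" />',
    "e": '<hr><a href="page.html">next page</a><hr>',
}

def is_well_formed(fragment):
    """Return True if the fragment parses as XML once wrapped in a dummy root."""
    try:
        ET.fromstring("<root>%s</root>" % fragment)
        return True
    except ET.ParseError:
        return False

results = {label: is_well_formed(markup) for label, markup in snippets.items()}
```

Only c and d parse: a has unquoted attribute values, b closes its tags in the wrong order and e leaves its hr elements unclosed. Passing is necessary for XHTML but not sufficient, since XHTML 1.0 also requires lowercase element and attribute names, which rules out c. HTML 4.0 is more forgiving than XML, so failing this check does not settle the HTML half of the question.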

Bill Cullifer was impressed with the exam results. Most people did extremely well. He commented that the individuals present were obviously the top people in the world in the Internet field.

I passed, of course. I'm now a WOW Certified Professional Webmaster.

WWW2006 day 2: Extreme Programming

1 June 2006 | 3 Comments | Tags: , , ,

Along the same lines as Web 2.0 comes eXtreme Programming. This new philosophy of how to program has 12 basic principles:

  • Pair programming: two people to one screen. This is easier than it sounds. Software engineering is a very social activity, so pairing up is only natural. Pairings change naturally over time, sometimes several times a day. This practice helps introduce new people to the team, creates shared knowledge of the codebase and (most important) greatly improves the quality of the resultant code, while only minimally reducing productivity.
  • On-site customer: the customer is present throughout the development process. No huge requirements documents that no one reads. This means that the customer must always be reachable to ask about a design decision. A programmer with a question should not have to wait longer than an hour for an answer.
  • Test-first development: write the tests first and then create the program until all the tests pass.
  • Frequent small releases: most important principle. Release a working product at some small fixed period. A beta every two weeks, for example. The customer always has something tangible to use and give feedback on. No big-bang integration.
  • Simple design through user stories: simple 3x5 cards to capture requirements. These serve as a contract to further discuss the feature with the customer and find out exactly what they want.
  • Common code ownership: anyone on the team can change anything in the codebase (relies and builds upon test-first development and pair programming)
  • Refactoring: if you need to change something, do it!
  • Sustainable pace: work no longer than 40 hours a week. No burn out.
  • Coding standards
  • Continuous integration
  • Planning game
  • System metaphor

WWW2006 day 2: Web 2.0

31 May 2006 | 1 Comments | Tags: , , ,

After the keynote I attended a tutorial on best practices in web development sponsored by Bill Cullifer of the World Organization of Webmasters (WOW).

David Leip from IBM and David Shrimpton from the University of Kent talked about Web 2.0. The Web 2.0 phenomenon is exemplified in the difference between mapquest and google maps, ofoto and flickr, britannica online and wikipedia, personal websites and myspace, stickiness and syndication, etc. The value of a website can no longer be measured by how many people visit it. Instead people can subscribe to feeds off the website and get all the benefits without ever actually visiting the site.

Websphere is IBM's Java Enterprise Application Server. Its biggest competition no longer comes from products like BEA WebLogic, but instead from Amazon. Amazon offers people a virtual e-marketplace that handles all the accounting, advertising, searching, buying, selling and refunds. All you have to do is set up the user account and use their APIs. Very easy and very cheap; very Web 2.0.

Another Web 2.0 phenomenon is the perpetual "beta". A product is never finished, but rather is continuously re-evaluated and refined. Updates can be pushed to all users, since the entire application lives on the web.

New applications create buzz by being genuinely fun to use. Google Maps delights its visitors. The wow-factor makes people stay loyal. However, as soon as things start to go wrong, people will very quickly switch to using another service that works. Word of mouth is the way! Google never advertise; they don't have to.

Web 1.0 was all about commerce, Web 2.0 is all about people (what Web 3.0 will be is still written in the stars). The myriad number of WS* standards may be useful and necessary for the enterprise, but any normal person will be totally bewildered by WS*-standards vertigo. Web 2.0 is about the people taking back the Internet.

In the Web 2.0 world accessibility matters. Don't use red and green together on a web page, some people are color blind. Use xHTML and CSS, some people use screen readers.

AJAX (asynchronous JavaScript and XML) is the new buzzword. It was only coined by Jesse James Garrett on February 18, 2005 and already everyone is talking about it. All there is to it is the realization that you can use the XMLHttpRequest JavaScript object to ask for something from a server. This makes sophisticated Web 2.0 applications possible. For example:

Writely - an online word processor
Kiko - an online calendar
Box - online file storage

Exclusive, hierarchical, fixed taxonomies are out. Flexible, flat, multi-tag, emergent folksonomies are in.

Microformats decree: Humans first, machines second. They are the lower-case semantic web. They use simple semantics, adding to the stuff that's already there, instead of inventing this hugely complicated description logic stuff (that I'm working on). Microformats are cheap, easy and, as long as people agree on them, they can be just as powerful and interoperable as if you had created a full XML-Schema monster. More at microformats.org and programmableweb.com.

WWW2006 day 2: motorola

30 May 2006 | 0 Comments | Tags: , ,

The second day of the WWW2006 conference started with Les Carr saying how super-excited he was about everything in the upcoming conference. Les was one of my former teachers back in Southampton University. He is the one who encouraged me to submit a paper for WWW2006.

Then the first minister of Scotland got on stage and gave a talk, singing the glories of mother Scotland. He talked about how the great country of Scotland, with its devolved parliament and independence from oppressive England was making great strides in the world. No nation is more illustrious!

Wendy Hall and Tim Berners-Lee also said a few words. Tim Berners-Lee is the guy who invented the World Wide Web back in 1990 (yup, the Web is only 16 years old).

Sir David Brown, the chairman of Motorola Ltd. gave a speech. He recalled how he estimated ten years ago that there might be 900,000 mobile phones sold every year. Now there are 900,000 mobile phones being sold every 19 hours. He was 46,000% wrong! But at least he was 46,000% wrong in the right direction.
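The 46,000% figure can be checked with a few lines of Python. The sketch assumes a 365-day year and that the 19-hour sales rate holds all year round; the variable names are made up for the example.

```python
HOURS_PER_YEAR = 365 * 24                      # 8760 hours
estimate_per_year = 900_000                    # his forecast from ten years earlier
# 900,000 phones every 19 hours, scaled up to a full year.
actual_per_year = 900_000 * HOURS_PER_YEAR / 19

ratio = actual_per_year / estimate_per_year    # how many times too low the forecast was
error_percent = (ratio - 1) * 100              # relative error of the forecast
```

That works out to roughly 415 million phones a year, about 461 times the forecast, so the error is indeed close to 46,000%.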

Mobiles are the 4th screen, he said. The computer desktop, the living room, the car and the mobile make up the places where we consume media. The future is personalized content anywhere and anytime. The device formerly known as the mobile phone will be central to this ubiquitous media revolution.

Globalization is good. It's a chance for a positive-sum gain for everyone. Smart countries will use communication technology to combat outsourcing of manufacturing by "insourcing" logistics control. For example, there is no reason that a manufacturing plant in China can't be managed and controlled remotely from the UK.

On to socioeconomics: there will be an estimated 930 million new mobile phones in developing countries by 2008. The proliferation of low-cost mobile devices everywhere will lead to drastically increased economic output from developing nations. Technology innovation will be followed by business innovation, which will be followed by renewed technology innovation, and so on in a spiral of economic growth. More money for everyone! This will create better health, better education, better lifestyle and a better world.

What Sir David does not realize is that with increased economic development there also comes greatly increased suffering, stress, mental illness, pollution and war. As my spiritual master has said: "vaisyas (businessmen) can not be the leaders of any working society, material or spiritual"

WWW2006 day 1: tagging (part 2)

29 May 2006 | 0 Comments | Tags: , ,

Tagging is also being used in the enterprise. IBM has added tagging to its internal contact management system: Fringe Contacts. IBMers are connected by location, projects, position in the organizational hierarchy and now also by the tags they give each other. For example, everyone attending the chi2006 conference might tag themselves, or get tagged with that tag by a co-worker. By collecting all the reverse links one can easily build a list of all attendees, something that would have been otherwise very difficult in such a large organization. No single person has to maintain the list. It is updated organically.

The researchers noticed that the most interesting tags were those that were used by lots of people on a small number of people. These kinds of tags describe special expertise that few people have. They can be used to identify special skills in the company.
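One way to make "lots of taggers, few people tagged" measurable is a simple ratio, sketched below in Python. The scoring formula, the names and the records are my own assumptions for illustration; the talk did not say how IBM actually ranks tags.

```python
from collections import defaultdict

# Hypothetical tagging records: (tagger, tagged_person, tag).
records = [
    ("ann", "dave", "cobol"),
    ("bob", "dave", "cobol"),
    ("cat", "dave", "cobol"),
    ("ann", "bob", "java"),
    ("bob", "cat", "java"),
    ("cat", "ann", "java"),
]

def expertise_scores(records):
    """Score each tag by distinct taggers divided by distinct people tagged."""
    taggers = defaultdict(set)
    targets = defaultdict(set)
    for tagger, person, tag in records:
        taggers[tag].add(tagger)
        targets[tag].add(person)
    return {tag: len(taggers[tag]) / len(targets[tag]) for tag in taggers}

scores = expertise_scores(records)
```

Here "cobol" scores higher than "java": three people agree that one colleague knows it, whereas "java" is spread evenly across the team.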

Avaya labs has a similar system. They used to use a system of broad categories (e.g. tech, development, marketing, etc.) and skills. Every employee was tasked with keeping their own user profile up to date. However, inevitably, people got lazy, forgot to update their profiles and the system became useless.

Tagging collects dynamic user categories by the social relationships that already exist in the company. Changes in people's interests and people learning new skills are reflected in the collective tag cloud.

The talk by the lady from Avaya was somewhat difficult to understand. Loads of text on each slide and a virus scanner constantly coming up during the presentation, blocking the view, all made it very difficult to follow what she was saying. The slides might as well have not been there. Lesson for her to learn: less is more.

Mitre corporation created a system called Onomi. This enables social bookmarking, networks of expertise and information sharing. It integrates with del.icio.us, LDAP, email, RSS, SOAP and intranet URIs. They now use it as a replacement for email when telling people about something interesting. 18% of the workforce are using it. Most were attracted by a banner ad on the Intranet, as well as by selective announcements to specific user groups.

Yahoo has developed an AJAX tagging engine that suggests tags. This reduces the overlap between tags. If you tag something with one tag, all related tags will be pushed way down on the selection list. It also helps eliminate tag spam. If you use good tags (those used by many other people) those tags get a higher value (in a mutual reinforcement HITS algorithm style). It also rewards original tags. People that introduce tags that later become popular are awarded a higher "importance" score. A further advantage is that users don't need to come up with their own tags.

Another presentation by Yahoo research was on combining ontology and flickr tags. Tags are like a dynamic, shifting namespace, very different from a static controlled vocabulary. The lack of structure makes it difficult to hunt and search for content, but lends itself well to random browsing and accidental discovery.

Introducing simple subsumption between tags helps highlight that London is in the UK, for example. People will put in hypernyms in the middle strip of an ontology, e.g. golden retriever and dog. But the high-level hypernyms are too obvious, so people forget to add them (e.g. London and UK). Luckily, these kinds of high-level relations are well defined in ontologies. A combination of upper-level ontologies with low-level tags seems to be a promising area of research.
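A minimal Python sketch of that combination: user tags are expanded with the broader terms they imply, taken from an upper-level hierarchy. The `broader` table is a hypothetical stand-in for a real ontology, and `expand_tags` is a name invented for this example.

```python
# Hypothetical fragment of an upper-level ontology: term -> broader term.
broader = {
    "london": "uk",
    "uk": "europe",
    "golden retriever": "dog",
    "dog": "animal",
}

def expand_tags(tags, broader):
    """Add every ancestor of each tag, following broader-term links upward."""
    expanded = set(tags)
    for tag in tags:
        current = tag
        # Stop at the top of the hierarchy or at a term that is already present.
        while current in broader and broader[current] not in expanded:
            current = broader[current]
            expanded.add(current)
    return expanded

photo_tags = {"london", "golden retriever"}
expanded = expand_tags(photo_tags, broader)
```

A photo tagged only "london" now also turns up in a search for "uk", without anyone having had to type the obvious.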

Some people from the steve.museum tagging project gave a talk on how the professional museum curators were very good at describing some things about the museum exhibits and terrible at others. The difference between the professional and layman taggers was staggering.

WWW2006 day 1: tagging (part 1)

28 May 2006 | 0 Comments | Tags: , ,

I attended the World Wide Web 2006 conference in Edinburgh, Scotland last week. It was really interesting. Lots of knowledge on the future of the Internet. Here is what I learnt:
The first day I went to a workshop on tagging organized by Yahoo and RawSugar.

Tagging is the act of annotating something with a keyword. On the Internet anyone can tag. It puts the user in control. Tagging becomes useful when it happens on a large scale. Tags can be aggregated and organized into sets (like in flickr, YouTube and technorati). A good tag set will cover as many facets as possible, e.g. music, artists, song, band, etc. People don't think "definition" when they tag. A tag can express an emotion, an insight, a gut reaction, anything. People are willingly telling us how they feel about something. That's part of the power. It's metadata for the masses.

Tagging works because it does not involve the higher brain functions of conscious sorting. It does not force people to make a choice (does skiing belong in the "recreation" or "sport" category?); things can have any number of tags. This kind of free, loose association is cognitively easy and takes less time. However, categories are arguably more memorable than tags, because you have had to make more of a mental effort to add the category.

Tags can also count as opinion votes. Multiple instances of a tag are collected in bags of tags and determine how interesting a webpage, piece of music, photo, or any other taggable resource is (like in lastFM, My Web and delicious).
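As a rough illustration of "bags of tags as votes", here is a short Python sketch. The function names `tag_bags` and `rank_by_votes` and the sample data are invented for this example, and real services surely weigh votes in more sophisticated ways. Each tagging event is one vote; a bag keeps duplicates, so the same tag applied by five people counts five times.

```python
from collections import Counter


def tag_bags(events):
    """Group (resource, tag) events into one bag (multiset) per resource."""
    bags = {}
    for resource, tag in events:
        bags.setdefault(resource, Counter())[tag] += 1
    return bags


def rank_by_votes(bags):
    """Order resources by total number of tag votes, most-tagged first."""
    return sorted(bags, key=lambda r: (-sum(bags[r].values()), r))


# Made-up tagging events: (resource, tag)
events = [
    ("photo-1", "castle"), ("photo-1", "castle"), ("photo-1", "edinburgh"),
    ("photo-2", "rain"),
    ("song-7", "mellow"), ("song-7", "mellow"),
]
bags = tag_bags(events)
ranking = rank_by_votes(bags)
```

Here `bags["photo-1"]["castle"]` is 2, and `ranking` puts the three-vote photo ahead of the two-vote song and the one-vote photo.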

Tagging gives a sense of community. Like playing a massively multiplayer online role-playing game such as World of Warcraft, it gives a sense of being "alone together". As described in the book The Wisdom of Crowds, this leads to more cognitive diversity, less group-think, reduces conformity, reduces the correlated effects of individual mistakes, encourages new viewpoints, leads to less herd behavior and encourages participation.

Benefits of tags are:

  • better search
  • less spam / ability to identify genuine content
  • ability to identify trends and trend setters
  • a metric of trust
  • ability to measure how much attention a resource is getting
  • helps filter by interest (really works!)

Tagging is however limited in that people very rarely tag other people's stuff. Most tags are added by the content author. Tags are also often not very prominent, nor identified and collated in one's account.

Tags also lack structure and semantics. They exist in a large cloud, not an ordered hierarchy. Synonyms and polysemy can lead to a vocabulary explosion.

Search is a pull mechanism. Search engines need to go out and crawl the web to index all the content. This can take days. Tagging is push. Blogs notify the search engines when there is something new to be had. Readers can be notified of new content the very second it appears.

It is difficult to add tagging to an existing system. Amazon tried this and failed. There has to be a clear role for tags. They have to provide some tangible benefit. The best tagging systems highlight unique contributions, give users control, allow for smaller tag-related sub-groups and allow for personalization.

Tagging can be described as going for a hike in the woods, or picking berries, while categorization is more like driving a car, or riding a rollercoaster.

Jeremy's leaving do restaurant outing

21 May 2006 | 1 Comments | Tags:

Dr. Dr. Jeremy Rogers, a long-time researcher at the University, recently took up a new job in industry working for a large consultancy company. To commemorate the occasion my research group went out to dinner at an Italian restaurant called the Olive Press. I went along for this important social occasion.

However, due to my restrictive diet the only things I could eat were a red pepper soup and some baked potatoes. It took quite some time negotiating with the waitress to find these few items on the menu that were edible for me. Most of the so-called vegetarian options had eggs in them. What to speak of all the wheat and cheese in practically everything. I guess that is the nature of the material (Mediterranean) world.

I took many pictures. View them here.

re-Slaughter interview

26 March 2006 | 2 Comments | Tags:

I have to give a PhD progress report presentation every year. My end-of-second-year interview was last week. It went quite badly. I was allowed to continue (if only because it is very rare that someone is thrown out after their second year), but the panel did not think I would be able to succeed in an actual PhD viva examination.

My presentation was fine; however, I didn't handle the questions very well. Example:

Professor: why didn't you address transactions in your system?
Me: [look of puzzlement]
Me: why would you need transactions?
Prof: you are doing a database-like locking system, all these kinds of systems have transactions.
Me: what do you mean by transaction exactly?
Prof: two-phase commit, that kind of thing. Surely you know about it?!
Me: transactions aren't relevant in this case.
Prof: no, no, I think they are.
Me: my locking does not require transactions.
Prof: all these kinds of systems use transactions, you should have addressed them!
Me: okay, I'll look into transactions, but I still don't think they are relevant in this case
Prof: ah ha, you haven't read the literature. You are far too focused on your particular subject area. A good PhD student learns to not solve just one problem, but abstracts away and finds the general scientific contribution. A second year PhD student should have a firm grasp of all the relevant literature; it is worrying that you don't even understand what a transaction is. Furthermore …

[50 minutes later]

Other professor: we've been going for some time already. Maybe we could wrap this up.
Prof: okay, well, good luck with your final year.

What went wrong?

  1. I wasn't confident enough in my presentation. One professor commented that my talk was very timid. Indeed, my personality is not very brash or aggressive. I need to be far more assertive.
  2. I'm not very good at thinking on my feet. I was under pressure, in a stuffy lecture room with four professors just waiting to jump on top of me. My brain could not think very clearly.
  3. I made my presentation too simple. The audience thought they understood exactly what I was doing, when the reality was somewhat more complicated. I should have bewildered them with something so complicated that they would have no hope of understanding any of it and therefore think it was some great research. That way no one can ask any difficult questions.
  4. I admitted I might be wrong. Someone told me afterwards to never do that: I must never admit that I'm wrong, even when I know that I most certainly am. The whole point is to defend my work, nothing else. Never surrender!

Personally, I find it incredibly difficult to cling to a bad idea, just for the sake of it. I mean: it's a bad idea, why on earth should I continue to entertain it? Is this how academics should be trained? To be stubborn and uncompromising? No wonder the world is in such a bad state.

When I was first applying for this PhD research one professor told me: "I don't think you have what it takes to swim with the sharks." I didn't understand what she meant at the time. Now I know … and, quite frankly, I'm not sure …