Saturday, May 31, 2008

Resurfacing for good later in June! Enjoy summer, everyone!

May has been a really long month.
Today, the 31st, I'm finally wrapping up business and moving on with my life.
Apologies for leaving this blog an orphan for so long.
I'll be back soon with daily bites.

Saturday, May 3, 2008

resurfacing...

Coming up.......

2008 North American Computational Linguistics Olympiad

What is the Computational Linguistics Olympiad?

The North American Computational Linguistics Olympiad (NACLO) is modeled after similar Linguistics Olympiads held in Eastern Europe since 1965. In these events, hundreds of high school age students have participated, challenged by interesting linguistic problems from dozens of the world's languages. In solving the problems, students learn about the richness, diversity and systematicity of language, while exercising natural logic and reasoning skills. No prior knowledge of particular languages or of linguistics is necessary, but the competitions have proven very successful in attracting top students to study and choose careers in fields of linguistics, computational linguistics and language technologies.

Professional linguists and other specialists in natural language processing technologies cooperate to create stimulating and engaging problems that represent cutting edge theoretical and practical issues in their fields. This is truly an opportunity for young people to experience a taste of what natural language processing in the 21st century is all about.


For details and past CL Olympiads visit: NACLO 2008

Friday, March 14, 2008

The unbearable lightness of being

See the movie if you can. Better yet read the book (by Kundera).
A bit post-modernistic by some standards, but I find it agrees with me these days.

Tuesday, March 4, 2008

semantic computing and "vaguely-formulated human intentions"

The field of Semantic Computing (SC) brings together those disciplines concerned with connecting the (often vaguely-formulated) intentions of humans with computational content. This connection can go both ways: retrieving, using and manipulating existing content according to user's goals ("do what the user means"); and creating, rearranging, and managing content that matches the author's intentions ("do what the author means").


There is no such thing as "vaguely-formulated intentions of humans". Humans have intentions. Only the person who expressed them knows how vaguely (or not) they were "formulated". How? Usually vague intentions bring about vague or undesired (unintended) results. It's all in the eye (and mind) of the beholder.

What's pertinent for CL and Semantic Computing here is the fact that intentions are hard to detect. We will never be 100% sure that we matched the intention of the author, nor should we even try. Regardless of what the author "meant" (which can be very imprecise and can be gauged only by the author's own judgement), a document bears evidence of such intentions. If we focus on the intentions evidently expressed in the document, we can happily dispense with the "intentions of humans".

Friday, February 29, 2008

Keeping subjectivity out of CL...

Reading the announcement about the COLING workshop on "human judgements in Computational Linguistics":


Human judgements play a key role in the development and the assessment of linguistic resources and methods in Computational Linguistics. [...]
We invite papers about experiments that collect human judgements for Computational Linguistic purposes, with a particular focus on linguistic tasks that are controversial from a theoretical point of view (e.g., some coding tasks having to do with semantics or pragmatics). Such experimental tasks are usually difficult to design and interpret, and they typically result in mediocre inter-rater reliability.


So let me think. "Coding tasks having to do with semantics or pragmatics" "typically result in mediocre inter-rater reliability". Seriously? So are we back to the 50s and Chomsky's concept of "ungrammaticality"? The dominance of syntax and the marginalization of semantics and pragmatics as "subjective"?
Now that we have finally made it through enough of Chomsky 50+ years later, now that CL is finally breaking free of attempts to formalize semantics, now that we have finally figured out how to relate language and information theory, we willingly take a turn back and look at "human judgements"? Why? Language is definitely not created in a vacuum. Virtually every level of natural language (and hence also of CL) is potentially subjective in that it inevitably reflects the 'theory' of the linguist who looks at it. There is no way around this. Claiming that "some coding" is subjective implies that some other "coding" is not. Well, the point is that if it is not, then it has nothing to do with *natural* language.
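
A side note for readers unfamiliar with the term: "inter-rater reliability" is usually reported as a chance-corrected agreement score. The announcement doesn't say which statistic it has in mind, so the sketch below assumes Cohen's kappa for two annotators, and the function name and the toy labels are made up for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two annotators labelling the same items.

    Illustrative helper, not taken from the workshop announcement.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")

    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: probability that both raters pick the same label
    # if each labels at random according to their own label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined here, so treat it as perfect agreement.
        return 1.0

    return (observed - expected) / (1 - expected)


# Toy data: two annotators judging four sentences.
annotator_1 = ["yes", "yes", "no", "no"]
annotator_2 = ["yes", "no", "no", "no"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Kappa is 1 for perfect agreement and around 0 when the annotators agree no more often than chance would predict, which is roughly what "mediocre" reliability is measured against.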

Mental exercise of the day

Focus on the negative and you will be immediately inundated by an avalanche of negative experience.

Focus on the positive, and more positive will turn up out of the blue...

On idiotic management...

Managing people is hard enough. Managing smart people is definitely harder.
Managing smart people who constantly blabber about new technology must scare the hell out of most managers.

What takes the cake is:
Managing people and technology and making decisions without listening to your experts.
Managing people and technology and being too scared to make any decisions.

Thursday, February 28, 2008

words....

Interesting neologisms of the day (only read with a sense of humor):

celebritology, noun:
1. the study of the lives of celebrities
2. the endless gossip about Britney's life
3. the main subject of attention of People magazine


chatological, adj. (as in "chatological humor", reminiscent of "eschatological"):
1. the system or theory concerning online chats, online chat rooms, and any other online life species
2. the branch of logic dealing with the same...


Interesting syntactic phenomenon of the day:

Clapton Invited to Play North Korea*

Lucky North Korea will be played by Clapton...

* to confirm the meaning of this schema look at the article

Monday, February 25, 2008