> Jeremy Clear wrote:
> >... That's the crucial thing -- you spend no significant
> >time agonizing over the task; you just quickly pick some concordance
> >lines and send them in. Sure, not everyone will agree 100% that the
> >lines you've picked exactly match the sense I posted (first because
> >the sense I posted was just an arbitrary definition taken from one
> >dictionary which is clearly inadequate to define and delimit precisely
> >a semantic range; and second, because no-one is going to validate or
> Philip Resnik wrote:
> >I agree -- especially since tolerance of noise is necessary even when
> >working with purportedly "quality controlled" data. And one can
> >always post-process to clean things up if quality becomes an issue
> I don't mean to put a damper on this idea, but we should expect that
> the agreement rate will be far from 100%. Also, the tolerance of noise
> will depend on the amount of noise. I did a comparison between the
> tagging of the Brown files in Semcor and the tagging done by DSO.
> I found that the agreement rate was 56%. This is exactly the rate of
> agreement we would find by chance. So the amount of post-processing
> could be quite a bit of work!
Suppose each tagging has 6 sense tags to choose from for the same word
in a sentence, and assume (although this is unlikely) that both use the
same set of sense tags. If the two taggings are made independently and
each tag is equally likely, the likelihood that they agree by chance is
6 x 1/6 x 1/6 = 1/6, or about 17%. So the above seems to be true only if
there are about 2 sense tags for the word: 2 x 1/2 x 1/2 = 1/2, or 50%.
Is this correct?
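
To make the arithmetic concrete, here is a minimal sketch of the chance
agreement calculation, assuming two independent taggers that share one
set of sense tags. The function names are hypothetical and used only
for illustration. The uniform case reproduces the figures above; the
general case sums the product of the two taggers' probabilities for
each sense.

```python
from fractions import Fraction


def chance_agreement(dist_a, dist_b):
    """Probability that two independent taggers pick the same sense.

    dist_a and dist_b are per-sense probabilities over the same,
    identically ordered set of sense tags (an assumption of this sketch).
    """
    if len(dist_a) != len(dist_b):
        raise ValueError("both taggers must use the same set of sense tags")
    return sum(pa * pb for pa, pb in zip(dist_a, dist_b))


def uniform_chance_agreement(num_senses):
    """Chance agreement when every sense is equally likely for both taggers."""
    uniform = [Fraction(1, num_senses)] * num_senses
    return chance_agreement(uniform, uniform)


if __name__ == "__main__":
    for k in (2, 6):
        p = uniform_chance_agreement(k)
        print(f"{k} senses: chance agreement = {p} ({float(p):.0%})")
```

Under the uniform assumption the result is always 1/k for k senses, so a
chance rate near 56% would need words with roughly 2 senses each, or a
sense distribution that is far from uniform.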
For information, we did some work on measuring the agreement of sense
tagging between human taggers, which is about 80% for both recall and
precision (or 0.8 x 0.8 = 0.64, roughly comparable to the 0.56 above).
However, this is for Chinese over a small