Saturday, April 29, 2006

TREC project

Introduction
TREC stands for Text REtrieval Conference. It is a series of retrieval evaluation experiments. Lancaster mentions that probably the first evaluation study in information retrieval was conducted in 1953. More recent evaluation studies have been discussed in the Annual Review of Information Science and Technology. Among the recent retrieval evaluation experiments are those known as the TREC experiments.
Text Retrieval Conferences
Researchers in information retrieval have concentrated their research on small collections, each of the order of a few thousand documents. The major problem for researchers was getting a text collection large enough to match real-life situations, together with an infrastructure adequate for conducting tests on it. In 1991, in order to alleviate this difficulty, the US Defense Advanced Research Projects Agency (DARPA) decided to fund the TREC experiments, run by the National Institute of Standards and Technology (NIST), in order to enable information retrieval research to scale up from small collections of data to large experiments.
Smeaton and Harman mention that the goals of the TREC experiments have been to:
1) Increase research in information retrieval on large-scale test collections
2) Increase communication among academia, industry and government through an open forum
3) Increase technology transfer between research and products
4) Provide a state-of-the-art showcase of retrieval methods for TREC sponsors
5) Improve evaluation techniques
Over a million documents have been used in TREC I, drawn mainly from newspapers, newswires and selected journals. The documents vary in size: while most of them range between 300 and 400 terms, some run to several hundred pages. All documents are uniformly formatted in Standard Generalized Markup Language (SGML) and distributed on CD-ROMs.
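To give a feel for what uniformly formatted SGML documents look like in practice, here is a minimal sketch in Python. It assumes each document is wrapped in DOC tags with a DOCNO identifier and a TEXT body; those tag names, the sample document numbers and the function name parse_docs are assumptions made for illustration and do not come from this post. A real collection may use additional or different tags.

```python
import re

# Assumed tag layout: <DOC> ... <DOCNO>id</DOCNO> ... <TEXT>body</TEXT> ... </DOC>
DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
DOCNO_RE = re.compile(r"<DOCNO>(.*?)</DOCNO>", re.DOTALL)
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)


def parse_docs(sgml):
    """Return a dict mapping document number to document text."""
    docs = {}
    for body in DOC_RE.findall(sgml):
        docno = DOCNO_RE.search(body)
        text = TEXT_RE.search(body)
        if docno is None:
            continue  # skip documents without an identifier
        docs[docno.group(1).strip()] = text.group(1).strip() if text else ""
    return docs


# Hypothetical two-document sample, not taken from any real collection.
sample = """<DOC>
<DOCNO> EX-0001 </DOCNO>
<TEXT>
First example document.
</TEXT>
</DOC>
<DOC>
<DOCNO> EX-0002 </DOCNO>
<TEXT>
Second example document.
</TEXT>
</DOC>"""

docs = parse_docs(sample)
# Document length in terms, using simple whitespace splitting.
lengths = {docno: len(text.split()) for docno, text in docs.items()}
```

Because every document follows the same markup, one small parser like this can read the whole collection, which is the practical benefit of the uniform formatting.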
TREC has two sets of activities:
The main activity (core in TREC jargon)
Subsidiary activities (tracks in TREC jargon)
The core has two types of tasks:
Ad-hoc (corresponds to retrospective retrieval)
Routing (corresponds to the selective dissemination of information)
Relevance judgements for the old collections and the old topics are made available in each subsequent TREC experiment. The routing task involves using some of the old topics on the new collections; the relevance information from the old collections may be used to help formulate the query or the profile.
Output lists from each participating research team are sent to NIST, where they are merged for evaluation. For each topic, the hundred top-ranking documents from all the participating teams are merged into a single set, which is then given to the assessor for relevance evaluation. This method is called pooling.
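The pooling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (a dict of teams, each mapping a topic to a ranked list of document ids), the function name build_pools and the sample ids are all hypothetical, and the depth is set to 2 in the usage example so the result is easy to follow (the description above uses the top hundred).

```python
from collections import defaultdict


def build_pools(runs, depth=100):
    """Merge the top `depth` documents of every team into one set per topic.

    runs: dict mapping team name -> dict mapping topic id -> list of
          document ids ranked best first (an assumed layout).
    """
    pools = defaultdict(set)
    for team_run in runs.values():
        for topic, ranked_docs in team_run.items():
            # Only the top-ranking documents of each team enter the pool;
            # using a set removes documents returned by several teams.
            pools[topic].update(ranked_docs[:depth])
    return dict(pools)


# Hypothetical output lists from two teams for two topics.
runs = {
    "team_a": {"t1": ["d3", "d1", "d7"], "t2": ["d2", "d9"]},
    "team_b": {"t1": ["d1", "d4", "d3"], "t2": ["d9", "d5"]},
}

pools = build_pools(runs, depth=2)
```

With a depth of 2, topic t1 ends up with the pool d1, d3 and d4: document d7 is never shown to the assessor because no team ranked it in its top two. The assessor judges each pooled document once, however many teams retrieved it.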
The first TREC was started in 1992. From 1992 to 2005, 14 conferences were held, in November of each year except for TREC-2. These conferences are co-sponsored by DARPA and NIST.

The 15th TREC conference will start in November of this year.
Benefits of TREC
TREC has given researchers the opportunity to evaluate a range of retrieval techniques, including:
Boolean retrieval
Passage or paragraph retrieval
Combining the results of more than one search
Retrieval based on prior relevance assessment
Query expansion and query reduction
String and concept-based searching
Dictionary-based stemming, and so on
