5 Successful Text Mining solutions
Text Mining solutions process text to enable better access, to extract well-defined
results and to reduce the content to its relevant parts. The main benefit to users is,
in the end, a reduced amount of reading. It remains unresolved which existing or
future solution will prove best. Several parameters are relevant to the design of
Text Mining solutions; they either support improvements or, if neglected, hinder
usability: the types of data searched for in the literature, the types of documents
available, the different ways of post-processing the data, interface design, linking
with other resources, etc. At the same time, every successful Text Mining solution
incorporates design principles that help to explain how terminological resources,
user profiles and user expectations fit together.
The third day therefore covered talks presenting the ingredients and pitfalls of successful
Text Mining systems. Judith Blake (Jackson Lab) explained in detail the opportunities for
integrating Text Mining into everyday curation work, using experience from the Mouse
Genome Database as an example, including relevance classification, topic-based
routing, gene name tagging and information extraction.
Anna Divoli (University of Chicago, U.S.A.) presented results from two user surveys
that were conducted in conjunction with the BioText project to explore the priorities
in the design of user interfaces for biological users. There was general agreement
that it is important to keep end users involved in the development phase. HM
Müller (Caltech, California, U.S.A.) presented the design principles of TextPresso,
which is being used by at least 20 curation teams around the world. Jörg Hakenberg
and Martin Krallinger (CNIO, Madrid, Spain) reported on the development of a
meta-service for Text Mining tools that emerged from the second BioCreative competition;
it was acknowledged as potentially having a high impact on the field by giving
access to advanced Text Mining solutions. Services were also the focus of the
presentation by Dietrich Rebholz-Schuhmann, who highlighted a suite of Text Mining tools
hosted at the European Bioinformatics Institute. Commercial tools were presented by
Michael Schröder (GoPubMed, University of Dresden, D) and David Milward (Linguamatics,
Cambridge, U.K.). An example of a highly innovative application of Text
Mining was shown by Nigel Collier (University of Tokyo, Jp): the BioCaster system
gathers news reports and analyses them for indications of disease outbreaks, thus
building an early warning or “rumor surveillance” system.
6 Ongoing work in the development of phenotype resources
A topic that emerged in the course of the seminar was the growing demand for, and
importance of, managing, representing and integrating conceptual representations of
phenotypes. As an immediate response, the experts on this topic who were present reported
on ongoing work and progress in this domain. Judith Blake (Jackson Laboratory, Maine,
U.S.A.) presented ongoing work on the design and development of the Mammalian Phenotype
Ontology at the Mouse Informatics Centre. This ontology was used, among many other
textual resources, by Ulf Leser and colleagues to predict protein
functions through the association of concept profiles composed of phenotypic features.
Suzanna Lewis (Berkeley Drosophila Genome Project, U.S.A.) reported on the
development of phenote.org, a novel resource for describing phenotype data in a
very generic data format. The format reduces all representations to tuples
formed by an ontological concept and a qualifier from a dedicated qualifier ontology, an
approach that neatly leverages existing ontologies for a new purpose. Finally,
Robert Höhndorf (MPI, Leipzig, D) showed the intricate logical consequences of representing
“phenotypes” as deviations from a wild type, which calls for the use of nonmonotonic
or default logics.
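
To make the two ideas in this section more concrete, the following is a minimal sketch,
not the phenote.org format or any reasoner presented at the seminar. It assumes a phenotype
statement is simply a pair of an ontology concept and a qualifier, and it imitates
default-with-exception behaviour by letting observed statements override wild-type
defaults. All identifiers (`EX:eye`, `QUAL:red`, `WILD_TYPE_DEFAULTS`,
`effective_phenotype`) are hypothetical and chosen only for illustration.

```python
from typing import Iterable, NamedTuple, Set


class PhenotypeStatement(NamedTuple):
    concept: str    # identifier of an ontology concept (hypothetical ID)
    qualifier: str  # term from a qualifier ontology (hypothetical ID)


# Hypothetical wild-type defaults: the qualifier assumed for a concept
# unless an observation says otherwise.
WILD_TYPE_DEFAULTS = {
    "EX:eye": "QUAL:red",
    "EX:wing": "QUAL:normal_length",
}


def effective_phenotype(observed: Iterable[PhenotypeStatement]) -> Set[PhenotypeStatement]:
    """Combine wild-type defaults with observed deviations.

    An observed statement for a concept replaces the default for that
    concept, so adding an observation can retract an earlier conclusion.
    """
    result = dict(WILD_TYPE_DEFAULTS)
    for stmt in observed:
        result[stmt.concept] = stmt.qualifier
    return {PhenotypeStatement(c, q) for c, q in result.items()}


if __name__ == "__main__":
    mutant = [PhenotypeStatement("EX:eye", "QUAL:white")]
    for statement in sorted(effective_phenotype(mutant)):
        print(statement)
```

The point of the sketch is the nonmonotonic flavour: with no observations the eye is
concluded to be red, and adding a single observed tuple withdraws that conclusion. A
real treatment would rely on a proper default or nonmonotonic logic rather than a
dictionary override.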