|
|
| |
October, 15
14.00 (Hall A) Conference Opening
|
14.30 – 15.00.
P. Mehra,
I. Belousov. HP Labs in Russia: Overview of Research and Collaboration Agenda (30 m.). Information Management, Digital Archives & Digital Libraries activity of HP Labs |
15.15 (Hall A) RCDL Tutorials.
|
15.15 – 17.15.
Th. Risse. Approaches for large scale digital library infrastructures (2 h. //
Vol. 1, p. ). Abstract. Current plans for next generation DL architectures aim for a
transition from the DL as an integrated, centrally controlled system to
a large scale federation of DL services and information collections. The
transition is driven by DL "market" needs and inspired by new technology
trends that promise to solve at least part of these market needs.
With the uptake of DLs in a wider community there is a need for better
and adaptive tailoring of the content and service offer of a DL to the
needs of the respective community as well as to the current service and
content offer. Furthermore, there is a need for more systematic
exploitation of existing resources like information collections,
metadata collections, and services for making DLs more cost-effective as
well as a need for opening up DL technology to a wider community.
New technologies and paradigms like Peer-to-Peer networking and
Service-oriented Architectures (SOA) suggest digital libraries that
operate on more demand-oriented and flexible distributed or
decentralized infrastructures.
The tutorial aims to introduce the audience to various central aspects
of bringing digital libraries to large scale infrastructures by
discussing core ideas and related architectural options. Furthermore, it
introduces the underlying technologies as a foundation for the
understanding of the concrete solutions. The main part of the tutorial
revolves around the following selected DL topics:
- Content and Metadata Management
- Navigation through the information space
For each of the topics the key challenges are discussed together with
possible solutions for the challenges and the lessons learned in
implementing these solutions. The solutions are illustrated with
concrete examples and small system demos from the BRICKS project. |
17.30 (Hall A) RCDL Tutorials.
|
|
October, 16
9.30 (Hall A) Scientific Digital Libraries. Scientometrics and Evaluation of Innovation Potential
|
9.30 – 10.00.
I. Zatsman,
S. Shubnikov. Processing Principles of Information Resources for Evaluation of Innovative Potential of Science Fields (30 m. //
Vol. 1, p. 35-44). Abstract. The paper addresses the problem of evaluating the innovative
potential of science fields using information from the patent electronic
libraries to which Rospatent provides access. The structure and
content of patent documents are analyzed. It is shown that modern
studies of S&T interaction (which science fields are most cited in
patents?) are based on processing patent documents that contain
references to scientific papers. Studies of S&T interaction clarify a
number of requirements for the methodology of processing patent
documents, and also for the degree of elaboration of schemas for patent
electronic libraries. One of these requirements is additional
structuring of references to scientific papers in patent documents.
10.00 – 10.30.
M. Kogalovsky,
S. Parinov. Socionet Information Resources, Scientometrics and Metadata Quality Indicators (30 m. //
Vol. 1, p. 45-54). Abstract. The Socionet system is the first Russian professional social network
for education and research. It is also a platform for building an
e-Social Sciences infrastructure at the national level. Socionet
information resources are integrated into the national Common Scientific
Information Space of the Russian Academy of Sciences and are linked to
the international e-Science online infrastructure. The Socionet system
uses some ideas, methods and tools of the RePEc (Research Papers in
Economics) project and the OAI (Open Archives Initiative). An added
value of Socionet is its advanced architecture of relations among
different types of information objects, which supports better
navigation, research performance indexes, scientometric analysis and
metadata quality indicators. The article describes general features of
Socionet and its methods of integrating information resources. It also
discusses possible implementations of scientometrics based on Socionet
statistics and the problem of measuring metadata quality. |
11.00 (Hall C) Posters with Coffee
|
S. Tarasov. The automated system of construction the thesaurus (poster //
Vol. 2, p. 63-66). Abstract. The article considers practical aspects of creating an automated
system for thesaurus construction. Major problems in the organization of
modern thesauri and methods of their construction are identified. The
problem of automatically generating thesauri from linguistic sources of
different types is examined: results of text corpus analysis,
definitions from explanatory dictionaries, data from associative
dictionaries, etc. The proposed architecture of an automated system for
generating thesauri for any subject domain, based on linguistic sources
and with the participation of independent assessors, is described.
E. Rabchevsky. Automatic construction of ontologies (poster //
Vol. 2, p. 37-40). Abstract. This paper is devoted to automating the ontology construction process
based on linguistic analysis of the knowledge contained in Web
resources. The author proposes a method for the automatic construction
of formal semantic models and outlines a plan for binding the formal
models to a particular domain.
D. Kulikov,
L. Sukina,
S. Nikolaev. E-catalogue of the Pereslavl University Library (poster //
Vol. 2, p. 20-22). Abstract. The article provides information about the Pereslavl University Library
and discusses the purposes of, and the need for, the library's
e-catalogue. It describes requirements for the structure,
functionality and interface of the catalogue. The article draws
attention to the peculiarities, advantages and mission of the system
within the university library. |
11.25 (Hall A) Scientific Digital Libraries. Digital Libraries for Earth Sciences
|
11.25 (Top floor) Information Retrieval. News Summarization
|
|
11.25 – 11.55.
N. Abramova,
V. Abramov. Automatic compilation of news stories reviews (30 m. //
Vol. 1, p. 131-141). Abstract. This work deals with one of the topical problems of automatic
summarization: multi-document summarization of news stories. Abroad,
this line of research is well developed; in Russia, however, no
attention is paid to this subject area. The authors propose a method for
compiling reviews of news stories, on the basis of which a summarization
system has been developed. We present sample summaries and describe
experiments on summarization evaluation. The experiments showed that, on
average (with 80% coverage across the three document collections
provided for the research), the survey summaries reflect the content of
the original texts.
11.55 – 12.25.
P. Braslavski,
V. Gustelev. News Summarization System Based On Machine Learning Approach (30 m. //
Vol. 1, p. 142-147). Abstract. The paper describes an experimental automatic summarization system for
news stories based on a machine learning approach. As the main dataset
we use a corpus of 1183 news stories from Gazeta.ru, a popular Russian
online news service. The news stories have highlighted sentences that
are used as the summary. To build the classifier we use LibSVM, an
implementation of support vector machines, with a set of easily
computable features. Additionally, we performed an evaluation on a
smaller, manually tagged Kommersant corpus. The evaluation shows
acceptable quality of the results. |
12.35 (Hall A) Scientific Digital Libraries. Technologies for Social-Economic Monitoring
|
12.35 (Top floor) Information Retrieval. Document Stream Analysis
|
|
|
14.15 (Hall A) Scientific Digital Libraries. Historical Digital Libraries
|
14.15 (Top floor) Information Retrieval. Clustering and Near-Duplicate Detection
|
14.15 – 14.45.
V. Barakhnin,
A. Fedotov. Methodological approach for developing informational-reference systems on history of science (30 m. //
Vol. 1, p. 84-88). Abstract. This article describes a methodology for developing
information-reference systems on the history of science. The main
principles of this methodology are the following:
- Information is grouped around persons, with detailed biographical data
classified chronologically, geographically, etc.
- The bibliography of a person includes publications by the scientist as
well as publications about the scientist.
- The connection between a researcher's scientific activity and the
formal description of the subject domain in which the researcher worked
is clearly shown.
The description includes the information model of the reference
book, implementation features of the information system's subsystems,
and the main types of user information requests required for
full-featured work with the system.
14.45 – 15.15.
A. Marchuk,
P. Marchuk. Digital archives integration platform (30 m. //
Vol. 1, p. 89-94). Abstract. The article examines the problem of integrating factographic
information systems. Integration means both the union of information
resources and the migration of information systems to a uniform
solution that preserves the functionality and interfaces of each
specific system.
The proposed approach was implemented in the project "Digital
photo-archive of SB RAS".
15.15 – 15.35.
Yu. Leonova,
A. Fedotov. Information model of the account of the temporary factor in information-reference systems (20 m. //
Vol. 1, p. 95-102). Abstract. The main subject of this article is an information model for taking
the temporal factor into account in information-reference systems
(IRS). An IRS must be able to execute a query as of some moment in the
past, that is, to produce a snapshot of the facts valid on an arbitrary
date. It is proposed to account for the temporal factor on the basis of
two dependencies:
- document versions, associated with changes of document attributes over the chosen time interval;
- parent-descendant relations between new and old objects.
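The two dependencies listed in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the class `VersionedDocument`, its fields and the use of ISO date strings are all assumptions made for illustration. Each object keeps dated attribute versions, and a query for a date earlier than its first version falls back to the older object it descends from.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VersionedDocument:
    """Hypothetical object with dated attribute versions and a link to its predecessor."""
    valid_from: list = field(default_factory=list)   # ISO dates, kept sorted
    snapshots: list = field(default_factory=list)    # attribute dicts, parallel to valid_from
    predecessor: Optional["VersionedDocument"] = None  # parent-descendant relation

    def add_version(self, date, attrs):
        # Insert the new version at the position that keeps the dates sorted.
        i = bisect_right(self.valid_from, date)
        self.valid_from.insert(i, date)
        self.snapshots.insert(i, dict(attrs))

    def as_of(self, date):
        # Latest version whose start date is not after the requested date.
        i = bisect_right(self.valid_from, date)
        if i:
            return self.snapshots[i - 1]
        # The object did not exist yet: consult the older object, if any.
        if self.predecessor is not None:
            return self.predecessor.as_of(date)
        return None
```

A query such as `doc.as_of("1995-05-05")` then returns the attribute values that were valid on that date, or `None` if neither the object nor any predecessor existed yet.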
|
|
15.35 (Hall C) Posters with Coffee
|
N. Luneva. Multilingual linguistic knowledge base: architecture and metadata (poster //
Vol. 2, p. 67-70). Abstract. The paper describes some principal architectural decisions and types of
metadata in the multilingual linguistic knowledge base founded on the
new linguistic resource. The linguistic knowledge base is aimed at
debugging semantic-syntactical representations in language processors of
machine translation and text knowledge processing systems. The new
knowledge base is being designed as a major test bed for the research
community in the field of computational linguistics and intellectual
technologies as well as for educational purposes, for comparative
analysis of language structures and creating language training
environments. The knowledge base features a multilingual translation
memory component.
S. Volkov. From digital library to an information system "The complete M.V. Lomonosov" (poster //
Vol. 2, p. 18-19). Abstract. The report is devoted to the creation of the philological digital
library "M.V. Lomonosov". The library has scientific, cultural and
educational goals. The need for such a library is substantiated, and
requirements for the preparation of material and the system of its
organization are described. The paper presents the main principles
behind the formation of the philological digital library "M.V.
Lomonosov" and gives a brief account of its search mechanism. A model of
the information system "The complete M.V. Lomonosov" is proposed.
N. Markova,
O. Obuhova,
I. Soloviev,
A. Chochia. Web-technology of dynamic classification in quasi-homogeneous digital collections (poster //
Vol. 2, p. 29-32). Abstract. Merging the benefits of attributive search and navigation would allow
the user to navigate a digital collection by progressively selecting
desired facet values of information objects.
The paper presents facet navigation based on dynamic classification:
fast and easy drill-down in collections by selecting attributes. The
formal representations of "facet formulas", "facet table" and
"facet request" are proposed. Several design decisions that can
improve the efficiency of facet navigation are discussed. A sample
snapshot of the visual interface illustrates the main ideas of the paper. |
16.00 (Hall A) Scientific Digital Libraries. Virtual Observatories
|
16.00 (Top floor) Information Retrieval. Dictionaries and Thesauri for Digital Libraries
|
16.00 – 16.30.
A. Avramenko. Toward a Consensual Virtual and Real Time Clock in the Collection of Pulsar Timing Data Sets (30 m. //
Vol. 1, p. 103-111). Abstract. The problem of correspondence between observed values and their
formal representations is considered. The structure and components of a
collection suited to that correspondence are determined. By integrating
observed and modeled data, the pulsar timing data sets are transformed
to a parametric form defined by the observed rotation parameters of the
pulsar. Methods for collating virtual and real features of the data sets
in the time domain are defined. Examples of applying the collection to
specific problems are considered.
17.00 – 17.30.
O. Zhelenkova,
A. Kopylov,
V. Chernenkov. Application of IVOA software tools for radio sources investigation. (30 m. //
Vol. 1, p. 123-130). Abstract. The interrelationship between objects of astronomical catalogs in different ranges of the electromagnetic spectrum, and their association with a real astrophysical source, is of obvious scientific interest. The astronomical community actively uses the Internet to access scientific information, but the heterogeneity of the data and their constantly growing volume pose a certain difficulty. Gathering information about even one celestial object is time-consuming because of the large number of resources, data access methods, formats of the obtained results and input formats of the applications used for further analysis. The International Virtual Observatory Alliance (IVOA) coordinates the community's work on creating the architecture of information interaction, standards, format specifications, data models and services that increase the efficiency of work with the data. Within this framework, systems are being created that enable distributed computing and data access. We decided to analyze the use of existing software tools for investigating a list of radio sources. |
16.30 – 16.50.
N. Buzikashvili. Dmitry Samoilov's Iskalka (20 m. //
Vol. 2, p. 41-48). Abstract. The paper describes elegant, highly effective and easy-to-use text
alignment and information retrieval solutions for machine-aided
translation, developed entirely by Dmitry Samoilov (1958-2005). |
October, 17
9.30 (Hall A) Ontologies, Data Representation. Access Techniques to Digital Collections
|
9.30 – 10.00.
E. Myasnikov. Digital image collection navigation based on automatic classification methods (30 m. //
Vol. 1, p. 185-194). Abstract. The central question in constructing an image navigation system is how
to build the projection of the collection into the two-dimensional
navigation space. The core of the proposed method is to build the
projection in two steps. In the first step, a hierarchical system of
clusters is constructed. In the second step, the initial space of image
descriptions is projected into the two-dimensional navigation space.
A survey of methods used for navigation system construction is given.
The results of experimental analysis are presented. The proposed method
is compared to known methods. The results suggest that the developed
method can be applied successfully.
10.00 – 10.20.
M. Prokhorov,
O. Bartunov. Navigation into Full-text Data Bases and Portals (20 m. //
Vol. 2, p. 71-80). Abstract. The large amount of information in present-day portals and information
systems requires effective navigation methods. Navigation tools are
present even in "classical" paper publications (naturally, in
"classical" form). However, modern electronic navigation mechanisms
are considerably more varied and effective.
10.20 – 10.50.
I. Markov,
N. Vassilieva,
A. Yaremchuk. Image retrieval. Optimal weights for color and texture fusion based on query object. (30 m. //
Vol. 1, p. 195-200). Abstract. It is common in CBIR to process different image features
independently to estimate image similarity. Color and texture are common
features used for searching in natural images. This paper proposes the
hypothesis that optimal weights for fusing color-based and texture-based
estimates can be identified from the features of the query image. A
linear combination of color and texture metrics is considered as a mixed
metric. Clusters of images with common features, and optimal weights for
them, are presented based on experimental results. The results of the
paper can be used to determine the best weights for a particular query
and thus improve image retrieval. |
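The linear combination of color and texture metrics mentioned in this abstract can be sketched as follows. This is an illustrative assumption, not the authors' implementation: the function names are hypothetical, both inputs are assumed to be distances already normalized to a comparable scale, and the two weights are assumed to sum to 1.

```python
def mixed_distance(color_dist, texture_dist, w_color):
    """Linear combination of two distances; w_color is the weight of the color term."""
    if not 0.0 <= w_color <= 1.0:
        raise ValueError("w_color must lie in [0, 1]")
    return w_color * color_dist + (1.0 - w_color) * texture_dist


def rank(candidates, w_color):
    """Order image ids by mixed distance to the query, closest first.

    candidates maps an image id to a (color_dist, texture_dist) pair.
    """
    return sorted(candidates, key=lambda k: mixed_distance(*candidates[k], w_color))
```

Choosing `w_color` per query, for example from the cluster the query image falls into, changes the ranking: a weight of 1.0 ranks by color alone and a weight of 0.0 by texture alone.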
11.05 (Hall A) Ontologies, Data Representation. Ontologies in Digital Libraries
|
|
12.05 – 12.35.
Yu. Zagorulko,
O. Borovikova,
G. Zagorulko. Ontology-Based Approach to Getting Content-Based Access to Humanitarian Information Resources (30 m. //
Vol. 1, p. 217-224). Abstract. The paper presents an approach to providing content-based access to
humanities information resources using an ontology.
The ontology constitutes the information basis of an Internet knowledge
portal that must provide both integration and systematization of
scientific knowledge in the humanities and of information resources
relevant to the portal's subject domain, and content-based access to
them from any point on the Internet.
The ontology is used for automatic generation of the schema of the
portal's internal database and of the forms for filling this database,
for formulating user queries in terms of the portal's subject domain,
and for navigation through the portal's information space.
Structuring the portal ontology into domain-independent and subject
domain ontologies makes the knowledge portal easily adjustable to any
area of knowledge. |
|
13.35 (Hall A) Ontologies, Data Representation. Concepts Descriptions and Refinements in Ontologies
|
|
13.35 – 14.05.
N. Skvortsov. Application of concept refinement in salvation of ontology manipulation tasks (30 m. //
Vol. 1, p. 225-229). Abstract. The paper continues a line of research on applying type refinement to heterogeneous ontological descriptions of a subject area. The most typical tasks in manipulating ontological specifications are considered: verification of ontological definitions for internal consistency, mapping and integration of ontological contexts, ontology development, information contextualization and personalization, conceptual model development on the basis of an ontology, and querying and messaging in terms of an ontology. The aim of the paper is to show that the refinement relation can be applied to tasks that are common in ontology modeling.
14.05 – 14.25.
N. Loukachevitch. Description of Role Concepts in Linguistic and Ontological Resources (20 m. //
Vol. 2, p. 81-89). Abstract. In the paper we consider ontological characteristics of role concepts
and show how they differ from type concepts. We argue that the
difference between types and roles must be taken into account in
ontological and linguistic resources intended for automatic text
processing if an automatic inference procedure is planned.
We show that information about concepts and entities obtained from text
definitions often leads to incorrect descriptions of type-role
relations, and that special effort is needed to describe this
information in a form appropriate for logical inference.
We discuss possible means of describing the type-role relations
used in the Thesaurus of Russian Language RuThes. |
|
14.25 (Hall A) Ontologies, Data Representation. Manuscripts Digital Libraries
|
|
14.25 – 14.55.
A. Varfolomeyev,
I. Kravtsov,
V. Filatov. SVG-visualization for digital libraries of hand-written documents (30 m. //
Vol. 1, p. 230-235). Abstract. Our article covers various aspects of using Scalable Vector Graphics
(SVG) technologies in digital libraries of handwritten historical
documents.
The full-text nature of the source information and the natural
hierarchical structure of the documents make XML technology the basis of
choice for storing the texts. But if we use XML to mark up logical
elements in the texts of our library, it is logical to apply the same
technology to other purposes, for example, for building queries to the
document collection or for visualizing information at different stages
of our work with the texts.
In this article, attention is focused on four uses of SVG. We discuss
establishing links between XML markup and images of the source
documents. We describe the definition of special vector fonts for
adequately representing the original texts. We demonstrate different
forms of visualization of the results of analytical queries to the
document collection. We also propose using an SVG-based editor for
creating graph models.
The approach considered is used in practice in developing a special
toolkit for the information system "Istochnik" ("Source"), intended for
a networked community of researchers.
14.55 – 15.15.
V.S. Yuzhikov. Segmentation of the image of ancient manuscript page (20 m. //
Vol. 1, p. 236-240). Abstract. The article describes an algorithm for segmenting the image of a text
page. The segmentation problem consists in assigning each element of the
page to one of two classes: text or figure. The algorithm begins by
splitting the image into small areas. The following criteria are used to
classify each area:
- The share of black pixels in the whole area.
- The spread of element widths within the area.
- The presence of alternating text lines and line spacing.
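The three criteria above can be sketched as simple features computed on a small binarized area. This is an illustration under stated assumptions rather than the algorithm from the paper: an area is taken to be a list of rows with 1 for black and 0 for white pixels, "elements" are approximated by horizontal runs of black pixels, and all function names are hypothetical.

```python
from statistics import pstdev


def black_share(area):
    """Fraction of black pixels in the whole area."""
    total = sum(len(row) for row in area)
    return sum(sum(row) for row in area) / total


def run_widths(area):
    """Widths of horizontal runs of black pixels, collected row by row."""
    widths = []
    for row in area:
        run = 0
        for px in row:
            if px:
                run += 1
            elif run:
                widths.append(run)
                run = 0
        if run:  # a run that reaches the right edge
            widths.append(run)
    return widths


def width_spread(area):
    """Population standard deviation of run widths (0.0 if fewer than two runs)."""
    widths = run_widths(area)
    return pstdev(widths) if len(widths) > 1 else 0.0


def line_alternations(area):
    """Number of switches between rows that contain ink and blank rows."""
    inked = [any(row) for row in area]
    return sum(1 for a, b in zip(inked, inked[1:]) if a != b)
```

A classifier would then compare these values against thresholds or feed them to a learned model; the abstract does not say which, so that step is left out here.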
|
|
15.30 (Hall A) Ontologies, Data Representation. Tools for Digital Libraries
|
|
15.30 – 16.00.
K. Kudim,
G. Proskudina,
V. Reznichenko. Comparison of repository systems EPrints 3.0 and DSpace 1.4.1 (30 m. //
Vol. 1, p. 241-252). Abstract. The work considers the basic facilities and
features of DSpace and EPrints, the most popular open
source systems for building scientific digital
libraries. It also describes experience in creating
multilingual digital libraries on the basis of these
systems. A comparative analysis of DSpace 1.4.1 and
EPrints 3.0 is presented. Special attention is given to
problems of localization, compatibility with external
formats, and usability.
16.00 – 16.20.
O. Bartunov,
T. Sigaev. Specialized data types for digital libraries (20 m. //
Vol. 2, p. 90-96). Abstract. Complex modern information systems require specialized data types
optimized for fast access and information retrieval tasks. Rapid
changes in patterns of access to information require an extensible
database engine that allows experts in the data domain to develop custom
data types optimized for that domain. We describe several data types
developed for the open-source ORDBMS PostgreSQL, which facilitate
operations with sets, hierarchical data, semistructured data and
full-text search. We also describe the PostgreSQL infrastructure for
developing extensions.
16.20 – 16.40.
C. Becker,
S. Strodl,
R. Neumayer,
A. Rauber,
E. Nicchiarelli,
M. Kaiser. Long-Term Preservation of Electronic Theses and Dissertations: A Case Study in Preservation (20 m. //
Vol. 2, p. 97-103). Abstract. An increasing number of institutions throughout the world face legal
obligations to collect and preserve digital objects over years. A range
of tools exist today to support the variety of preservation strategies
such as migration or emulation. Yet, different preservation requirements
across institutions and settings make the decision on which solution to
implement very difficult. The Austrian National Library will have to
preserve electronic theses and dissertations provided in PDF. It is thus
investigating potential preservation solutions. The preservation
planning approach taken in the PLANETS project is used to evaluate
various alternatives with respect to specific requirements. It provides
an approach to make informed and accountable decisions on which solution
to implement in order to preserve digital objects for a given purpose.
We analyse the performance of various preservation strategies with
respect to the specified requirements for the preservation of master's
theses and dissertations and present the results. |
|
17.00 (Hall C) Excursion
|
17.00 (Hall B) RCDL Steering Committee Meeting - invitation only
|
October, 18
9.30 (Hall A) Integration problems. Technologies for Information Resources Integration
|
|
9.30 – 10.00.
D. Briukhov,
L. Kalinichenko,
D. Martynov. Source Registration and Query Rewriting Applying LAV/GLAV Techniques in a Typed Subject Mediator (30 m. //
Vol. 1, p. 253-262). Abstract. New methods and tools for application development in collaborative
scientific enterprises (like Virtual Observatories (VO)) over multiple
distributed sources of data and programs are required. In this paper we
focus on results of research and experimental work oriented toward
problem-driven subject mediation emphasizing aspects of LAV/GLAV
information sources integration in the mediator. The approach considered
has the following distinguishing features: typed, object canonical model
is used instead of usually applied relational one; a technique of
refining mapping of source information models into extensible canonical
one is provided; registration in a mediator of a relevant source is done
so that a mediator type should be provably refined by a relevant source
type or by a composition of such types (the conflict resolving
functions are to be specified, if required); rewriting of non-recursive
logical programs containing strongly typed rules is applied. These
features provide methodological context for the current paper that is
focused on description of the role the LAV/GLAV approach plays in the
mediator. Using an astronomical example taken from the Russian VO context,
we show the technique of information source registration at the mediator
and query rewriting technique in a typed specification environment
applying LAV/GLAV approach. |
|
11.15 (Hall A) Integration problems. Heterogeneous Collections Integration
|
|
11.35 – 11.55.
O. Klimenko,
V. Philippov,
M. Philippova. Digital Library of Mathematical Resources MathTree (20 m. //
Vol. 2, p. 118-121). Abstract. The MathTree library is a set of integrated links to Internet
resources. The links are stored with some metadata in a tree catalogue
where different branches correspond to various aspects of mathematics.
Moving along the branches allows one to get information related to a
specific aspect of mathematics: research laboratories, scientific
schools, departments, specialists in the given field, theses and other
digital resources, links to journals where articles on the subject are
published, and conferences on related topics.
12.15 – 12.35.
S. Chernov,
E. Minack,
P. Serdyukov. Converting Desktop into a Personal Activity Dataset (20 m. //
Vol. 1, p. 280-283). Abstract. The current experiments on personalization in information retrieval are
limited to the available collections of the real world data. While a
number of publications exploited user interaction with Desktop, often
these experiments are neither repeatable nor comparable. In this paper
we elaborate on the need for logging the Desktop activity data and
creating a common collection for Desktop search evaluation. We describe
the design of such a dataset and necessary logging tools. We also
outline the current state of our Personal Activity Track initiative
towards creation of the Desktop search dataset. While this effort is
currently targeting English-speaking users, it is also applicable to
Russian and other languages. |
|
13.35 (Hall A) Integration problems. Access Control and Exceptions
|
|
13.35 – 14.05.
A. Berztiss,
B. Thalheim. Exceptions in Information Systems (30 m. //
Vol. 1, p. 284-295). Abstract. The concept of exception has been defined in diverse
ways. We relate exceptions to computational transactions
and to control constructs. Our view of a transaction is
very broad, and we consider transactional exceptions to
be instances of undefined function values. By giving
different interpretations to "undefined" we arrive at
a classification of transactional exceptions. Our
primary interest is in information systems, i.e., in database
transactions, and in processes that consist of such transactions.
In the database context we show that liberal treatment of
exceptions is simpler than total quality management for
consistency based on a set of constraints.
We refer to control operations that link transactions into
processes as actions. Actions tend to be time-related, and time
Petri nets provide actions with semantics. The time Petri
net representation indicates where exceptions can arise.
We also consider high-level monitors for the detection
of exceptions. Although our emphasis is on detection of
exceptions, their handling is also discussed.
14.05 – 14.35.
O. Zhizhimov,
A. Fedotov. Models of management of access to the distributed information resources (30 m. //
Vol. 1, p. 296-299). Abstract. Based on an analysis of typical operating scenarios of information
servers (WWW, FTP, Z39.50, etc.), the problems that must be solved when
organizing a system for controlling access to distributed information
resources are formulated. The capabilities of LDAP technology, as the
most suitable basis for building such a system, are considered. Within
this technology, three access management models are discussed, which
differ in the degree of integration with the functions of the
information servers. |
|
14.55 (Hall C) Closing of the Conference RCDL2007
|
|
|