Wednesday, January 11, 2006

Duplications in ARTstor and Image Quality

* Clustering images to alleviate duplication *

Like the traditional slide library, the ARTstor Digital Library has
more than its share of redundant images. Some are literal duplicates:
digital images made from the same photographic source. Others are
merely functionally redundant: multiple views of the same object that
seem to contribute nothing extra to teaching or research. Why does
ARTstor have so many duplicative images? There are two primary
reasons. First, some of ARTstor's source collections themselves
contain these redundancies. Second, because we are constantly
adding collections, many of the new images represent works of art that
are already in the ARTstor Digital Library. Often, this multiplicity
increases the richness with which ARTstor documents these works;
sometimes, however, it simply leads to more redundancy.
Understandably, while some users welcome (or at least willingly
tolerate) this variety, others find it distracting.

To enhance our users' experience while working with the
ARTstor Digital Library, ARTstor staff have been working behind the
scenes to cluster like images and to reduce this kind of
duplication. We have begun to identify redundant images, both literal
duplicates and "functionally redundant" images. Initially, we are
focusing our efforts on a core component of the Charter Collection:
those key works of art that are most frequently sought out and
consulted by ARTstor users. By concentrating on de-duplicating those
images that are most often searched, viewed, and saved into image
groups, we hope to greatly improve the experience of a majority of our
users in the very near term. And because much ARTstor use to date has
revolved around teaching, our early efforts at de-duplication will
likely have the greatest impact on "canonic" works of world art. But
we expect to expand our effort over time in order to embrace less
frequently consulted images as well, with the understanding that such
duplication is much less common outside core areas of art history.

In listening to our users, we have concluded that we should not
completely remove such duplicative images from ARTstor. Rather, we are
clustering these images so that when users perform searches in
ARTstor, they will not be confronted with myriad versions of the same
image. Increasingly, they will see a single image of a given work of
art, with additional images clustered behind that main image. These
clustered images are ones that we believe are duplicative in some
meaningful sense. An icon beneath the thumbnail will signal the
availability of such supplementary, "clustered" images.

This approach should, over time, begin to address the dissonance some
users feel when they encounter multiple versions of the same image.
This strategy also preserves the user's ability to select the image
that best meets his or her immediate need as teacher or scholar,
whether to illustrate a particular point, or to give a sense of how
one image more faithfully represents the original object than another.


----------------------------------------------------------------------


* Improving image quality *

In our continuing effort to develop the collections in the ARTstor
Digital Library, we are often, and increasingly, able to provide
users with truly superior digital images. Sometimes these images
represent new high-resolution digital photography from the original
object, whether in a museum or in the Gobi Desert. In other cases,
they are images scanned from large-format photographs of such objects.
In order to highlight and make the most of such superlative images,
our effort to cluster duplicative images has taken on an additional
dimension. In addition to associating affiliated images, we are also
actively drawing the user's attention to the best image that ARTstor
has to offer for a given work of art. As indicated above, we are often
hesitant to make such judgement calls ourselves. But, when we have
access to an image that seems, based on objective criteria, very
likely to be superior and of greatest interest to our users, we are
assigning this image priority in our clustering efforts.

As a result, you will typically find that a cluster of duplicative
images has been appended to an image that was either made via direct
digital capture from the original object (increasingly, but not
always, an image contributed by the museum that owns that object) or
scanned from a large-format photograph of that object (often
contributed to ARTstor via collections such as the Carnegie Arts of
the United States or collaborations with organizations such as Scala
Archives, which create and assemble high-quality photographic archives
documenting museum collections, as well as architectural monuments and
sites).
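
For readers who think in code, the prioritization described above can
be sketched roughly as follows. This is an illustration only, not a
description of ARTstor's actual system: the field names (image_id,
work_id, capture), the category labels, and the cluster_images helper
are all assumptions made for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical capture categories, ranked from most to least preferred.
# The ranking mirrors the text (direct digital capture first, then scans
# of large-format photographs); the labels are invented for this sketch.
CAPTURE_PRIORITY = {"direct_digital": 0, "large_format_scan": 1, "other": 2}


@dataclass(frozen=True)
class Image:
    image_id: str
    work_id: str   # assumed identifier of the work of art depicted
    capture: str   # expected to be one of the CAPTURE_PRIORITY keys


def cluster_images(images):
    """Group images by work; return {work_id: (lead_image, clustered_images)}."""
    by_work = defaultdict(list)
    for image in images:
        by_work[image.work_id].append(image)

    clusters = {}
    for work_id, group in by_work.items():
        # Unknown capture labels rank last. sorted() is stable, so images
        # of equal priority keep their original order.
        ranked = sorted(
            group,
            key=lambda im: CAPTURE_PRIORITY.get(im.capture, len(CAPTURE_PRIORITY)),
        )
        clusters[work_id] = (ranked[0], ranked[1:])
    return clusters


# Example: three images of one work, one image of another.
images = [
    Image("a1", "work-1", "other"),
    Image("a2", "work-1", "direct_digital"),
    Image("a3", "work-1", "large_format_scan"),
    Image("b1", "work-2", "other"),
]
clusters = cluster_images(images)
lead, clustered = clusters["work-1"]
```

Note that nothing is discarded in this sketch: every image remains
reachable behind the lead image, and a work with no preferred capture
type simply leads with whatever image is available.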

In some cases, such an objectively superior image will not yet be
available to us for a key work of art that has been identified as a
priority for de-duplication due to frequency of use. Despite the
temporary absence of a superior image, we feel that it is essential to
address the redundancy of these key monuments. For this reason,
ARTstor users should also anticipate encountering image "clusters" in
which the preferred image may not be a high-resolution image. In such
instances, we will continue our ongoing effort to provide superior
images, guided as always by the needs of ARTstor users. So please
continue to let us know how we can work to address your needs!