Tuesday, April 21, 2009

I ran DROID on a few files on my computer to see what it would make of them. I was not aware of DROID before it came up in class, and I imagined it would surface information about my files that I would have had a hard time finding or accessing on my own. After running the software, I think DROID would be helpful with unknown formats, but the information it displayed was nothing I could not have researched myself. That research would have taken me much longer, though, so I do see how DROID is useful for large quantities of files, especially files in unknown formats.

What troubled me most was the missing information: fields such as "Technical Environment", "Supported Until", and "Format Risk" came back empty. In a project I am working on for another class, this is exactly the information we are desperately trying to find for some files with the extensions .krz and .arr, but I am not sure DROID would recognize these formats, since many programs do not. I wish I had access to those files outside of class to see what DROID would make of them.

My favorite piece of information that DROID delivered is the "Description", which read like a mini history of the format. DROID was very easy to use, but it will probably need a few more years of work before it works consistently and delivers full, informative profiles for even the most diverse file format collections.

Here is a sample of some of the files that I had on my computer. The first is a picture and the second is an xml file that I had to work with for another class:

Name: JPEG File Interchange Format
Version: 1.01
Other names: JFIF (1.01)
Identifiers: PUID: fmt/43
MIME: image/jpeg
Apple Uniform Type Identifier: public.jpegFamily
Classification: Image (Raster)
Disclosure: Full
Description: The JPEG File Interchange Format (JFIF) is a file format for storing JPEG-compressed raster images. It was developed by the Independent JPEG Group and C-Cube Microsystems, in the absence of any such format being defined in the JPEG standard, and rapidly became a de facto standard; this is what is commonly referred to as the JPEG file format. A JFIF file comprises a JPEG data stream together with a JFIF marker. It begins with a Start of Image (SOI) marker, immediately followed by a JFIF Application (APP0). This is followed by the JPEG image data, which is terminated by an End of Image (EOI) marker. JFIF supports up to 24-bit colour and uses lossy compression (based on the Discrete Cosine Transform algorithm). Other types of compression are available through JPEG extensions, including progressive image buildup, arithmetic encoding, variable quantization, selective refinement, image tiling, and lossless compression, but these may not be supported by all JFIF readers and writers.
Orientation: Binary
Byte order: Big-endian (Motorola)
Related file formats: Has priority over Raw JPEG Stream; Is previous version of JPEG File Interchange Format (1.02); Is subsequent version of JPEG File Interchange Format (1.00)
Technical Environment:
Released:
Supported until:
Format Risk:
Developed by: C-Cube Microsystems, Independent JPEG Group
Supported by: None.
Source: Digital Preservation Department / The National Archives
Source date: 11 Mar 2005
Source description:
Last updated: 02 Aug 2005
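
The description above says a JFIF file begins with a Start of Image (SOI) marker immediately followed by a JFIF APP0 segment. To get a feel for what identifying a format from its bytes might look like, here is a minimal Python sketch that checks just that header. The function name `sniff_jfif` and the sample bytes are my own illustration, and this is not how DROID itself is implemented; it only checks the few bytes the description mentions.

```python
import struct

def sniff_jfif(data: bytes):
    """Return the JFIF version as (major, minor), or None if the header does not match."""
    # SOI marker (FF D8) immediately followed by an APP0 marker (FF E0).
    if len(data) < 13 or data[0:4] != b"\xff\xd8\xff\xe0":
        return None
    # Bytes 4-5 hold the APP0 segment length, stored big-endian.
    (length,) = struct.unpack(">H", data[4:6])
    # Bytes 6-10 hold the identifier "JFIF" followed by a zero byte.
    if length < 16 or data[6:11] != b"JFIF\x00":
        return None
    # Bytes 11-12 hold the major and minor version numbers.
    return data[11], data[12]

# Hypothetical sample header for a version 1.01 file (not taken from a real image).
header = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01" + bytes(9)
print(sniff_jfif(header))  # prints (1, 1)
```

A result of `(1, 1)` corresponds to the "Version: 1.01" that DROID reported for my picture.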

Name: Extensible Markup Language
Version: 1.0
Other names: XML (1.0)
Identifiers: PUID: fmt/101
Apple Uniform Type Identifier: public.xml
MIME: text/xml
Family:
Classification: Text (Mark-up)
Disclosure: Full
Description: The Extensible Markup Language (XML) is a general purpose markup language for creating other, special purpose, markup languages, and is a simplified subset of SGML. The structure and grammar of an XML document can be defined using a markup declaration, such as a Document Type Definition (DTD) or XML schema. A XML document consists of nested elements, each of which may have attributes and content. It typically begins with an XML declaration, defining the XML version and character set used. This may be followed by a Document Type declaration, containing or pointing to a markup declaration for the class of document. An XML document is said to be well-formed if it conforms to the XML specification; it is said to be valid if it additionally complies with a defined markup declaration. The formatting and transformation of XML documents can be controlled using the Extensible Stylesheet Language (XSL).
Orientation: Text
Byte order:
Related file formats: Has lower priority than Scalable Vector Graphics (1.0); Has lower priority than Scalable Vector Graphics (1.1); Has lower priority than DROID File Collection File Format (1.0); Has priority over Hypertext Markup Language (2.0); Has priority over Hypertext Markup Language (3.2); Has priority over Hypertext Markup Language (4.0); Has priority over Hypertext Markup Language (4.01); Has priority over Extensible Hypertext Markup Language (1.0); Has priority over Extensible Hypertext Markup Language (1.1); Has priority over Hypertext Markup Language
Technical Environment:
Released: 04 Feb 2004
Supported until:
Format Risk:
Developed by: World Wide Web Consortium
Supported by: None.
Source: Digital Preservation Department / The National Archives
Source date: 11 Mar 2005
Source description:
Last updated: 02 Aug 2005
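
The XML description distinguishes a well-formed document (one that conforms to the XML specification) from a valid one (one that also complies with a DTD or schema). As a rough sketch of the first check only, Python's standard library parser can report whether a document is well-formed. The helper name `is_well_formed` and the sample strings are my own, and the standard library parser does not validate against a DTD or schema, so validity is not tested here.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical sample documents.
good = '<?xml version="1.0"?><note><to>class</to></note>'
bad = '<?xml version="1.0"?><note><to>class</note>'  # <to> is never closed

print(is_well_formed(good))  # prints True
print(is_well_formed(bad))   # prints False
```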

2 comments:

  1. Nice posting. The value of DROID is that it correctly identifies the format. That is the main value from a preservation perspective. It is true that you can find info about the file in various ways/sources, but the main point is to be a file identifier.

  2. Once you have identified the file, including year/version, inferring the technical environment is not that hard for most common file formats.
