Archives Outside

For people who love, use and manage archives

It’s official, we’re a part of The Commons on Flickr

It’s been some time since we first applied to be a part of The Commons on Flickr and we are happy to announce that we are now a participating institution.

From The Commons page:

Cultural institutions on The Commons share photos from their collections that have ‘no known copyright restrictions’. You can see more about this copyright statement on Flickr.

We’ve been on Flickr since June 2008 and have uploaded over 2100 photos. Our Flickr friends are often busy commenting on and tagging our photos, sharing their considerable knowledge, and researching facts and figures. Our blog series Moments in Time is also shared on our Flickr stream.

By joining The Commons our collection becomes more visible. We are looking forward to welcoming new Flickrites to our photostream who enjoy browsing old photos and may also have historical information and knowledge about our photos to share!


Crowdsourcing Christmas! #Christmasinaalborg (#Juleniaalborg) – Aalborg City Archives on Instagram

Bente Jensen, archivist Aalborg City Archives

“Christmas 2012 will soon be history.” This was the slogan of Aalborg City Archives’* Christmas project last year, which used social media such as Instagram, Facebook and Twitter. For the last couple of years the City Archives have celebrated Christmas with calendars of historical films and photos on Facebook, our website and Flickr. This year, we added an accession of Christmas photos through social media. Why?

The Christmas Market in Aalborg (photo Anders_Hammer)

First, because the City Archives lack modern Christmas photographs in their holdings. We hold many photographs from the 1900s but lack contemporary documentation of Christmas. At the same time, Christmas is a good opportunity because everybody in Denmark associates something with the season.

Secondly, because the archives wanted to test a new accession method and form of user involvement for use in future projects in 2013; #juleniaalborg is a preliminary project.

Thirdly, because we wanted to test whether people wanted to join in and, if they did, who would.

Fourthly, because from a historical point of view it is interesting to see which motifs people associate with #Christmasinaalborg 2012.

Read the full blog post


Happy holidays (who’s up for some photo hunting during the break?)

The end of the year calls and we’ll be having a break here at Archives Outside… but, never fear! It does not mean an end to all your fun. We wish you all the best and thank you for all the brilliant knowledge, facts and fun you’ve contributed in the last year.

A tinted Season’s Greetings from Archives Outside! Digital ID 16410_a111_1[1A]_000073A_p1

A virtual holiday tour
(goes nicely with those virtual boxes of chocolates)

Although there should never be anything virtual about chocolates…

We recently came across some wonderful travel brochures hidden in the deep, dark depths of our catalogue and managed to curate some digital galleries for your viewing pleasure (more to come in the New Year).

While we were browsing this holiday brochure you immediately sprang to mind, because some of its photos may look familiar to you.

Can you spot which ones are on Flickr and in Photo Investigator? Happy hunting and see you next year!

Snapshot of the Holidays in NSW Travel Brochure

Most viewed 10 posts/pages for 2012

1. Did you watch Underbelly last night? Check out some real life mugshots of the razor gangs (still)

2. Useful tips for reading handwritten documents

3. Conservation Tip 5: Removing mould from records and archives (it was definitely a wet year)

4. Australian soldiers in black and white

5. Digitising your collection – Part 1: Project Planning

6. Conservation Tip 3: Removing blood from documents

7. What are your tips for dating photos?

8. Regional Archives Centres

9. Digitising your collection (PDF download)

10. Social media strategies for archives – what we learned

Congratulations to all of our photo sleuths this year who have solved many a mystery! Here are a couple of cases in the “unsolved” category from this year’s Moments in Time series.

Harnessing the power of public participation

The U.S. National Archives and Records Administration (NARA) has approximately 130 million items online and an estimated 110 billion pages of paper in their collection in total. In this video they discuss the way they are creatively exploring the use of public participation to help them meet the challenges of managing and making accessible such a large collection online.

The Citizen Archivist Dashboard is NARA’s new crowdsourcing tool for tagging, transcription, digitization of records, and more. Meredith Stewart demonstrates the various collaboration tools in the dashboard and discusses how the dashboard fits into the National Archives’ online strategy as part of Social Media Week DC.

Feeling inspired? Why not check out the Citizen Archivist Dashboard and start exploring.

Digitising your collection – PDF versions now available for download

We’ve had several requests to convert our five-part digitisation series into a downloadable format. Thanks to everyone who commented, tweeted (loved the Spanish versions of the post titles) and shared the posts.

There is a new page where you can download PDF versions individually or grab the zip file of all five parts. The new page links from the main Resources page.

Good luck with your digitisation!


Digitising your collection – Part 4: Scanning and handling tips

So far in this digitisation blog series we’ve covered program planning, the golden rule of digitisation, the heady world of techs and specs and now we come to practical tips for image capture.

For general handling of archives see our post Moving and handling – the Basics. Most of the tips below are from our reading room posters that were designed for researchers wishing to scan or photograph archives.

See also the care and handling guidelines from the National Library of Australia, which include information on glass plate negatives and transparencies.

What can be scanned?

Items smaller than the bed of the scanner:

  • flat cards and single, loose pages
  • photographs
  • glass plate negatives
  • transparencies.

What should be photographed?

  • documents that require the removal of pins or other fasteners
  • documents that would need to be bent or folded in any way on a scanner
  • documents that retain a strong “fold” memory and will not sit flat easily
  • anything that is larger than the glass on your scanner.

General tips for scanning or photographing

  • use soft leather weights to hold documents in place
  • bleed-through from the reverse page can be reduced by placing a sheet of black card between the pages
  • thin documents may benefit from a sheet of white paper placed behind the page.

Using a flat-bed scanner

  • ensure the scanner is calibrated correctly (some equipment includes colour charts, or you can find information about colour management online)
  • is the glass plate clean? Some archives leave dust behind and the screen may need a clean with a soft lens cloth or blower brush between each scan
  • do not place pressure on the scanner lid to keep a document flat – ensure there is a gap by placing your fingers under the scanner cover.

Scanner lid - ensure no pressure is applied

Handling archives for a photography session

Single Pages

  • if the document has been folded place leather weights mainly where the heavy folds will not lie flat (Tip: small undulations will not affect the copy quality)
  • ensure the weights do not obscure text or other information
  • take care not to place weights over damaged areas of the document.

Weights helping to flatten file

Bundled files

– fastened with pins, staples, split pins, thread and plastic ring binders

When fastened at the corner:

  • use weights to position the document while a page is opened
  • maintain a soft curve in the page as you open the document – this will prevent hard creases and tears forming around the folds and indentations
  • use a book pillow to maintain a soft curve where the page does not naturally sit this way.

Using weights with stapled file

When fastened along the edge:

  • use support boards to build a level that matches (or is similar to) the document stack
  • use a soft leather weight to hold the page open on the supporting board stack while you photograph from the document stack
  • maintain a soft curve in the open page to prevent creases and tears.

Stack of documents with a support board and long weight

Volumes/ledgers etc

These objects require special handling to prevent damage and to provide the best quality copy image.

  • place the volume with the spine facing you
  • position pillows (our pillows are filled with beanbag beans) against the spine and open the front cover. The spine should sit easily and with no strain on the sewing
  • increase or decrease the number of pillows to provide the best support
  • open the book in small sections to get to the page you wish to copy
  • use soft leather weights to hold pages open

Large volume supported with pillow

Photographic prints

Photographic materials contain silver compounds and sensitive dyes that are very susceptible to damage from the oils and acids in our skin.

  • always wear plastic gloves (plastic provides a complete barrier between the archives and your skin while allowing for good dexterity and handling feel)
  • do not bend or crease the photo – this will crack the emulsion
  • if a photo is fragile or damaged use a camera rather than a scanner

Maps and plans

Maps and plans can be large and unwieldy, and come on varied supports, including paper and plastic.

  • use soft weights to hold down plans that have been rolled
  • hold from two strong points and carry plans in a u-shape to prevent creases

Carrying maps

Tips for creating your master photo files

Capture the whole image

Capture the edges of the photograph (where possible) to show that the image has not been cropped in any way. The original photos won’t necessarily have square edges so this technique will also ensure no information is left out.

Framing the picture

Frames, mounts, backings

Some photos in our collection have decorative supports (see photo of the doctor below) and some are housed in – or have been glued into – photo albums (see the album below).

Will you include these ‘extras’ in the digitised version? At State Records we do; it ensures the item has been captured in its entirety. Does the backing also need scanning? Check for information that may be relevant to the archive and scan if necessary.

Framing of pictures and text on reverse of image. Dr Lawrence William Cock, dated March 1903 Digital ID 9873_a025_a025000097

In a recently digitised photographic series at State Records the photos were stored in albums. We scanned the album page in full and then the images separately.

Showing full page scan of album plus individual photo from the album page

In the next – and final – post we look at quality control, metadata and access.

Digitising your collection – Part 3: Technical specifications

You now know all about the Golden Rule of Digitisation and your plan is starting to come together. In this post we are talking techs and specs such as: image capture; technical definitions; standards; and storage.

This is the third post in a series about starting a digitisation program. The series covers: project planning; technical specifications; handling the archives; scanning tips; file storage; and access.

I’d like to thank our photographer, Tara Majoor, for her time, knowledge and contribution to this post.

Warning: we tried to keep this as basic as possible and link out to more in-depth information but you might want to grab a coffee for this one. Alternatively, if you need some bedtime reading…

Image capture – techs and specs

In our last post you learnt the Golden Rule of Digitisation and the importance of creating a master file (from which derivative files are made). As you’ll recall, master files are the original files created during the image capture process: the aim of a master file is to be of a high enough quality to meet your organisation’s access and/or preservation needs, both now and in the future.

In order to meet your digitisation goals you need to make some basic decisions relating to image specifications before you begin capturing images. And, more than likely, because of the differences in the original formats (including fragile records, large maps etc) you will need a set of specifications.

It is the unique characteristic of each archive that will often necessitate different approaches to image capture.

For example:

  • photographs and detailed images require a much greater resolution than text-based documents

The main goal when defining your technical specifications is to create the best digital image possible, given the resources available. A basic understanding of the core imaging principles/concepts will assist in this all important decision-making process.

Resolution, bit-depth (colour depth) and colour management make up the core of a digital image. These core ingredients can contain variable amounts of data depending on your selected input parameters – specifications. You should also take time to consider an archival file type for your master files, and determine what compression (if any) you wish to use.

Tech talk – some helpful definitions

Bit depth, colour management, resolution, compression: what the heck is it all about? Please allow us to shed some light on the situation (thanks Tara).

Image resolution

A digital image is a structured matrix (or grid) of tiny squares known as pixels (picture elements). Each of these pixels has an assigned tonal value and, when viewed in combination with surrounding pixels, forms the illusion of a continuous tone image.

Image resolution is simply a measurement of the density (or number) of pixels within the digital image. It describes the amount of detail encoded within a digital image. In the scanning world, resolution is a representation of the number of samples taken from the analogue original (photograph, document etc). In general, a greater number of samples (or higher resolution) should result in a more representative digital surrogate.

Resolution can be measured using two methods. In most software programs these are referred to as pixel dimensions and document size/pixels per inch.

Showing image size properties window

Pixel dimensions (also known as pixel array) – makes reference to the number of pixels in the matrix arrangement (array) horizontally and vertically.

For example:

  • 1024 x 768 pixels, or width=1024 and height=768

Document size/pixels per inch – resolution is most commonly expressed in pixels per inch (ppi), which measures the number of pixels along each linear inch of the image.

For example:

  • a 1 inch x 1 inch image @ 300ppi = 300 x 300 pixels

Pixels per inch (ppi) is a variable measurement and is dependent on knowing the overall size of the image; without this scale (or magnification ratio) the measurement loses context.

[You might be familiar with the term dots per inch (dpi) and while the two terms are often interchangeable dpi refers to printed resolution whereas ppi refers to the pixels within the digital image file].
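The relationship between document size, ppi and pixel dimensions can be sketched in a few lines of Python. This is only an illustration; the function names are made up for this example and do not come from any scanning software.

```python
def pixel_dimensions(width_in, height_in, ppi):
    """Return (width_px, height_px) for a physical size in inches at a given ppi."""
    return round(width_in * ppi), round(height_in * ppi)


def ppi_from_pixels(width_px, width_in):
    """Return the ppi implied by a pixel width and a physical width in inches."""
    return width_px / width_in


# The example from the text: a 1 inch x 1 inch image at 300ppi.
print(pixel_dimensions(1, 1, 300))  # (300, 300)

# The same 1024 pixels spread over 4 inches is a much lower resolution.
print(ppi_from_pixels(1024, 4))  # 256.0
```

The second call shows why ppi loses context without a physical size: 1024 pixels is only "256ppi" once you know the image is 4 inches wide.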

Example of image resolution

Here is a plan from our collection (University Hotel, Parramatta Road, Glebe 1890). Take note of the horse bottom right.

Below is a close-up of the horse and shows three derivatives from the one master file. The higher the resolution, the greater the (uncompressed) file size – from 300ppi for printing down to 75ppi for web delivery.

Showing three versions of image resolution

So, should I be scanning at the highest resolution possible?

A common misconception is that scanning at the highest resolution available will always produce the best quality images. Whilst it is true that the amount of detail captured within an image is controlled through resolution there are some factors to be wary of such as interpolated resolution (see below).

And of course, the higher the resolution at which you scan the bigger the file size and this will impact on your storage options (we’ll get to that later).

Optical Resolution vs Interpolated Resolution

  • Optical Resolution describes the maximum sampling rate possible from a given scanning device
  • Interpolated Resolution is additional ‘resolution’ or data made up (an educated guess) by the software program

Interpolation is not desirable, especially for digitisation practices as it can degrade image quality.

Tip: Take note of your scanner’s optical resolution, and only scan up to the optical limit.

Which leads us to another question…

How do I find out my optical resolution?

Consult your scanner’s manual (search online if you don’t have one). To make life extra confusing optical resolution can be expressed in either pixels per inch (where scale = 1:1) or pixel dimensions. When presented in pixel dimensions the smallest value represents ppi at a 1:1 ratio.

For example:

  • an optical resolution of 600 x 1200px is equivalent to 600ppi at a 1:1 scale
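If you like to see a rule written down, here is a small Python sketch of the tip above: take the smaller figure from the scanner's stated optical resolution and never scan above it. The function names and the idea of passing the spec as a text string are assumptions made for this example.

```python
def optical_ppi(spec):
    """Parse a spec such as '600 x 1200px' and return the smaller value,
    which the text treats as the ppi at a 1:1 scale."""
    cleaned = spec.lower().replace("px", "").replace("ppi", "")
    values = [int(part) for part in cleaned.split("x")]
    return min(values)


def safe_scan_ppi(requested_ppi, spec):
    """Cap the requested ppi at the optical limit to avoid interpolation."""
    return min(requested_ppi, optical_ppi(spec))


print(optical_ppi("600 x 1200px"))          # 600
print(safe_scan_ppi(2400, "600 x 1200px"))  # 600
print(safe_scan_ppi(300, "600 x 1200px"))   # 300
```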

Bit depth (tonal or colour depth)

This is the measurement of the number of bits – or binary digits – devoted to storing the colour information about each pixel. The number of bits available determines the maximum possible range of colours and luminosity values (or grey shades) that can be represented within an image’s colour space or palette.

The formula for calculating the number of tones from the bit depth is: 2^(number of bits) = number of grey shades. So, for instance, in a one-bit image, each pixel is stored as a single bit (0 or 1), meaning there are only two values available (black [0] or white [1]).

In the image below you can see:

  • 1 bit = 2^1 = 2 grey shades (black 0 or white 1)
  • 8 bit = 2^8 = 256 grey shades

1-bit vs 8-bit

So how do we get the colour?

A 24-bit colour image comprises 8 bits of information for each of the red, green and blue (RGB) channels; so for each pixel there are 256 possible levels of red, 256 levels of green and 256 levels of blue:

  • 8 x 3 (RGB) = 24-bits

The palette of colours increases to:

  • 256 x 256 x 256 = 16.7 million colours
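The arithmetic above is easy to check for yourself. The short Python sketch below applies the 2^(number of bits) formula; the function names are ours, chosen only for this illustration.

```python
def tones(bits):
    """Number of distinct values a single channel can hold at a given bit depth."""
    return 2 ** bits


def rgb_colours(bits_per_channel):
    """Number of colours when red, green and blue each use the same bit depth."""
    return tones(bits_per_channel) ** 3


print(tones(1))         # 2 (black or white)
print(tones(8))         # 256 grey shades
print(rgb_colours(8))   # 16777216, i.e. about 16.7 million colours (24-bit)
print(rgb_colours(16))  # 281474976710656, i.e. about 281 trillion (48-bit)
```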

Down sampling: some scanners may present options such as 48-24(bit) or 36-24(bit). The higher figure is the depth at which the scanner samples the raw data; the software then converts this value into a lower bit-depth (the lower figure) which becomes the final bit-depth of the exported image.
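To give a feel for what a 48-to-24 conversion involves, here is a minimal Python sketch that reduces each 16-bit channel value to 8 bits by keeping only its most significant bits. This is one simple approach chosen for illustration; scanner software may well use a different method, and the function names are hypothetical.

```python
def to_8_bit(value_16):
    """Reduce one 16-bit channel value (0-65535) to 8 bits (0-255)
    by discarding the lower 8 bits."""
    if not 0 <= value_16 <= 65535:
        raise ValueError("expected a 16-bit value between 0 and 65535")
    return value_16 >> 8


def downsample_pixel(rgb48):
    """Convert one 48-bit (red, green, blue) pixel to 24-bit."""
    return tuple(to_8_bit(channel) for channel in rgb48)


print(downsample_pixel((65535, 32768, 256)))  # (255, 128, 1)
```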

Some common bit-depths

Depth | No. of tones | Description
1-bit bi-tonal | 2 | Monochrome – contains only black (0) and white (1) pixels. Useful when digitising clear printed/typed text documents/publications.
8-bit greyscale | 256 | Provides the number of tones required for continuous tone greyscale: black and white plus a large range of intermediate greys.
16-bit greyscale | 65,536 | 16-bit greyscale uses an extended colour space, creating a much larger file (double 8-bit), and requiring storage in formats that explicitly support this colour depth (TIF).
8-bit colour* (VGA) | 256 | This colour mode was used heavily in early digital graphics, and is still sometimes used by web designers. This depth is NOT suitable for digitisation as it does not create true-tone images.
24-bit colour | 16.7 million | 24-bit colour is the current standard, supported by a wide range of file formats and applications. It comprises 8 bits of information for each of the red, green and blue (RGB) values.
48-bit colour | 281 trillion | 48-bit colour (16 bits per RGB channel) uses an extended colour space (trillions of colours), creating a much larger file size (double 24-bit), and requiring storage in formats that explicitly support this colour depth (TIF). Whilst images can be scanned and stored at this high colour depth, at present affordable monitors and printers are not available to display or reproduce images with such high quality.

Resolution, bit-depth, file size guide

This table is from the State Records NSW Digitisation Guideline and shows the impact of resolution and bit-depth on file size (in megabytes).

Colour depth | Res (ppi) | Total bits | Uncompressed file size
1 bit bi-tonal | 300 | 8,700,867 | 1.04MB
1 bit bi-tonal | 600 | 34,803,468 | 4.15MB
8 bit grey or colour | 300 | 69,606,936 | 8.30MB
8 bit grey or colour | 600 | 278,427,744 | 34.00MB
24 bit colour | 300 | 208,820,808 | 24.89MB
24 bit colour | 600 | 835,283,232 | 101.96MB
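If you want to estimate file sizes for your own material, the calculation can be sketched as below. The "total bits" figures in the table match an A4 page (8.27 x 11.69 inches), so the sketch assumes that page size; this is our inference rather than something the guideline states here, and the function names are made up for the example.

```python
# Assumption: the table's figures correspond to an A4 page measured in inches.
A4_INCHES = (8.27, 11.69)


def total_bits(width_in, height_in, ppi, bit_depth):
    """Pixels across x pixels down x bits stored per pixel."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bit_depth


def megabytes(bits):
    """Convert bits to megabytes, taking 1MB as 1024 x 1024 bytes."""
    return bits / 8 / 1024 / 1024


bits = total_bits(*A4_INCHES, ppi=300, bit_depth=24)
print(bits)                        # 208820808
print(round(megabytes(bits), 2))   # 24.89
```

Results can differ slightly from published tables depending on whether a megabyte is counted in steps of 1,000 or 1,024.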

Colour management

We won’t go too in depth on this as the use of colour management is not mandatory, but it does provide the opportunity to create images that have more accurate colour.

However, for the more experienced digitisation readers …

Colour management outlines the colour capabilities of hardware devices – cameras, scanners, monitors and printers – by creating a translation (profile) that controls how the colour is displayed (or printed) by those devices.

Colour profiles ensure the quality of reproduced colour across many output devices. The minimum requirement for most projects should be an input profile outlining the colour space of the device that was used to digitise the document (most devices will default to sRGB).

Printing is a common scenario where the need for a colour profile is emphasised. Whilst printing may not be the main objective of your digitisation project, the prospective requirements should be taken into account.

Calibration will also help achieve accurate and reliable colour. Calibration refers to the process of stabilising the imaging equipment to provide a consistent colour representation.

Phew! Still with us? We’re going to power on through to file types and file compression.

File types

Tip: Be wary of proprietary file types – eg: PSD files are Photoshop files. Without the Photoshop program the files are inaccessible.

TIFF (TIF) – Tagged Image File Format

This is currently the preferred archival format for storage of images. It is the most common uncompressed image file type and retains all of the image information. It also offers lossless compression options (see below under File Compression). Most software programs use this format and it is available for both Macintosh and Windows.

JPG (JPEG) – Joint Photographic Experts Group

This format is highly compressed and removes “unnecessary” image information. Most software programs use this format and it is available for both Macintosh and Windows.

JPEG 2000

A compression standard enabling both lossless and lossy storage. The compression methods are different from the ones in standard JPEG and improve quality and compression ratios. However it requires more computational power (or to be more technical, grunt) to process.

Format Bit depth Compression
  • RGB – 24/48 bits
  • Grayscale – 8/16 bits
  • Indexed colour – 1 to 8 bits
No compression or lossless (LZW)
  • RGB – 24/48 bits
  • Grayscale – 8/16 bits
  • Indexed colour – 1 to 8 bits
Lossless (ZIP)
JPEG2000 (JP2)
  • RGB – 24/48 bits
  • Grayscale – 8/16 bits
Lossless or Lossy
  • RGB – 24 bits
  • Grayscale – 8 bits
  • RGB – 24/48 bits
  • Grayscale – 8/16 bits
  • Indexed colour – 1 to 8 bits
No compression

File compression

Compression shrinks the digital images for storage. There are two ways to compress:

1. Lossless eg: TIFF – keeps all data by encoding the image files. It can reduce the file size by 40-60% without scarifying (boo!) any pixel information.

The encoding stores adjacent pixels with the same colour value as a single value and the data records how many pixels have been compressed together. This way of compressing files is highly desirable when no resources for storing un-compressed files are available.
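The idea described above is commonly known as run-length encoding. The Python sketch below is a toy illustration of the principle on a single row of pixel values; it is not the exact scheme used inside TIFF files (the LZW option, for instance, works differently), and the function names are ours.

```python
from itertools import groupby


def rle_encode(pixels):
    """Store each run of identical neighbouring values as (value, run length)."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(runs):
    """Rebuild the original row of pixels from the (value, run length) pairs."""
    return [value for value, count in runs for _ in range(count)]


row = [255, 255, 255, 0, 0, 255]
encoded = rle_encode(row)
print(encoded)                     # [(255, 3), (0, 2), (255, 1)]
print(rle_decode(encoded) == row)  # True: nothing was lost
```

Decoding returns exactly the original row, which is what makes the compression lossless.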

We currently store our master files as un-compressed TIFF.

2. Lossy eg: JPEG/JPG – this way of compression permanently removes “un-important data” (subtle colour/tonal information that is hard to distinguish with the human eye) aiming to strike a balance between acceptable loss of detail and bandwidth.

Lossy compression is not recommended for master images, as it scarifies (boo! x2) pixel information. It is, however, very useful for managing the bandwidth of derivative images – particularly those used for online access.

We use compressed JPEG/PNG images on our website.

While lossless compression is preferable you can see in the image below that lossy compression doesn’t always show a loss of detail. It depends on the amount of compression that is applied which in turn depends on the image content and resolution.

Lossy compression showing quality loss with a heavily compressed file

The more compression applied the more visible the result. With lossy compression you can reduce an image to between 1/10 and 1/20 of its original size without perceived loss.

Tip: Lossy compression is irreversible. Each time a jpeg file is saved – even after minor edits – it will lose quality.

File storage – Digital Asset Management

While storage costs decrease as technological capabilities increase, the size and number of individual digital files will have an impact on your resources. Determining an adequate storage capacity for the amount of data your digitisation program will potentially generate is an important part of your plan.

A helpful storage calculation

To estimate the size of storage required for the digital images, a small organisation may have a calculation like this:

[Average file size = 20MB] x [# Digitised files/day = 100] x [Workdays/year = 260] = a storage requirement of 520GB/year (or 1.56TB over 3 years).

A larger organisation could require a storage capacity of up to 10-15TB per year (increasing each year). This calculation is from the State Records NSW Digitisation Guideline.
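To try the calculation with your own numbers, a short Python sketch along these lines may help. It multiplies the three figures together and counts 1,000MB to the gigabyte, which is what the 520GB example implies; the function name is made up for this illustration.

```python
def yearly_storage_gb(avg_file_mb, files_per_day, workdays_per_year):
    """Estimate a year's storage in GB (1GB counted as 1000MB)."""
    return avg_file_mb * files_per_day * workdays_per_year / 1000


per_year = yearly_storage_gb(avg_file_mb=20, files_per_day=100, workdays_per_year=260)
print(per_year)             # 520.0 GB per year
print(per_year * 3 / 1000)  # 1.56 TB over 3 years
```

Remember that back-ups will multiply whatever figure you arrive at.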

Factors to consider for storage

Security – can the files be tampered with/can an unauthorised user gain access?

Accessibility – are the files easy to retrieve by an authorised user? Is there a record of where items are stored? This could include sensible naming conventions for the digital files; organised folders/labels; keywords (metadata). Will they remain accessible long-term as storage systems change or update?

An example of a naming convention for a series of files:

series number + job number + photo/file in sequence = 17420_a012_00004.jpg
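A convention like this is easy to apply consistently if file names are generated rather than typed. The Python sketch below reproduces the example above; the function name is hypothetical, and the five-digit zero padding of the sequence number is inferred from the example rather than a stated rule.

```python
def master_file_name(series_number, job_number, sequence, extension="jpg"):
    """Build a name from series number, job number and position in the sequence.

    Assumption: the sequence number is zero-padded to five digits,
    as in the example 17420_a012_00004.jpg.
    """
    return f"{series_number}_{job_number}_{sequence:05d}.{extension}"


print(master_file_name(17420, "a012", 4))  # 17420_a012_00004.jpg
```

Zero padding keeps files in the right order when they are sorted by name.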

At State Records our master files are stored on a dedicated server. Access is limited to authorised staff only, lessening the chance of lost or tampered data.

Image files for ‘use’ (web delivery, staff requests, copy orders etc) are stored on a separate server. A greater number of authorised staff have access to these files.

Media – will you store images on a hard-drive; CD/DVD; USB stick/memory card? There’s no perfect medium – each has a limited lifespan.

Back-ups – any of the above media could malfunction – have you made a back-up? Do you regularly update your back-up or check its functionality?

Recognised guidelines for capturing digital images

As we’ve discussed above, resolution, colour-depth, file type, compression and storage need to be considered in your plan.

Remember: these parameters often depend on the format of the original item.

Whilst there is currently no universal standard for digitisation specifications, a number of organisations have published recognised guidelines for capturing digital images – we have included them here for your reference.

Every organisation will have differing requirements/capabilities depending on the nature of their collection and the digitisation resources available to them.

If you’re still reading give yourself an almighty pat on the back! In the next post we provide some tips on handling and scanning archives.

Digitising your collection – Part 2: The Golden Rule of Digitisation

So you’ve started to lay out your digitisation plan and have made the decision to scan in-house, outsource the work or split between the two.

This is the second post in a series on starting a digitisation program. The series covers: project planning; technical specifications; handling the archives; scanning tips; file storage; and metadata and access.

The golden rule

‘Capture once, use many times’

By following this philosophy we digitise without any one particular output in mind.

Capture once, use many times

Avoid the trap of creating a digital image to meet an immediate need. You may find later on that another digital image (with a different file format requirement) of the same archive is requested. This means you will have to access that archive a second time, resulting in further moving and handling and potential damage.

Always create a high-resolution master file

…regardless of the original purpose. Many derivatives can be created from the one master file to meet many different needs in the future.

Future uses have not yet been thought of

Needs change over time, as does the digital life of an archive. Our archives often make the must-digitise list for a Digital Gallery on our website. A low res jpeg is suitable for web access but a master file is still digitised and a low res derivative created from it. If a web visitor likes a gallery image and submits a copy order request then a high-quality derivative of the master file can be generated without having to access the original item.

Example of the ‘capture once’ philosophy

A while ago we digitised some railway posters and brochures for an exhibition installation at the Western Sydney Records Centre…you remember, the one where our boss woke up at 3am? The documents were digitised as high res (master) TIFFs.

One derivative was generated as a print-quality file to be displayed as a poster in an exhibition case here:

Photo of exhibition display

See the poster front and centre?

And one derivative was created to become the whopping, great window transparency here at the front doors:

Window poster of the same image – capture once, use many times

Even if we think an image is only to be used as a low resolution JPEG for web delivery we still create a high resolution master TIFF. If someone places a reading room request for a high quality image – or our boss has another 3am moment – we can provide it without disturbing the original archive.

Keep your program cost-effective

For a digitisation program to be cost effective and achieve its access and preservation goals the image file needs to be created with flexibility in mind. Maximise the preservation/access benefits and avoid unnecessary handling of the original records.

And remember the Golden Rule…

Next week we get into the nitty gritty of technical specifications (without giving you a headache).