Rival Records Management Models in an Era of Partial Automation

Info: 11418 words (46 pages) Dissertation
Published: 28th Jan 2022


Authors: James Lappin¹, Tom Jackson¹, Graham Matthews¹, Clare Ravenwood¹

¹ Centre for Information Management, Loughborough University, Loughborough, UK

Abstract

Two rival records management models emerged during the 1990s. Duranti’s model involved moving records out of business applications into a repository which has a structure/schema optimised for recordkeeping. Bearman’s model involved intervening in business applications to ensure that their functionality and structure/schema are optimised for recordkeeping. In 2013 the US National Archives and Records Administration began asking federal agencies to schedule important email accounts for permanent preservation. This approach cannot be mapped to either Duranti’s or Bearman’s model. A third records management model has therefore emerged, a model in which records are managed in place within business applications even where those applications have a sub-optimal structure/schema. This model can also be seen in the records retention features of the Microsoft 365 cloud suite. This paper asks whether there are any circumstances in which the in-place model could be preferable to Duranti and Bearman’s models. It explores the question by examining the evolution of archival theory on the organisation of records. The main perspectives deployed are those of realism and of records continuum theory. The paper characterises the first two decades of this century as an era of partial automation, during which organisations have had a general capability to automate the assignment of business correspondence to a sub-optimal structure/schema (that of their email system and/or other messaging system) but not to an optimal structure/schema. In such an era any insistence on optimising the structure/schema within which correspondence is managed may paradoxically result in a reduction in recordkeeping efficiency and reliability.

Keywords Archival theory · Records management · Records continuum theory

Background

Three rival models have emerged as to how organisations should manage their electronic records. Two of the models insist that records be managed within applications that have been designed with recordkeeping requirements in mind, and which assign records to the business activity from which they arose. The third model permits records to be managed within applications that are not designed with recordkeeping requirements in mind and which do not assign records to the business activity from which they arose.

The three models are:

the separate records repository model in which an organisation has a records repository that sits separately from the applications used to communicate messages and documents, and which has a structure and metadata schema that is optimised for recordkeeping. Such a model offers the prospect of bringing together in one place all the records of a given business activity, even where those records were created or received in different business applications. The foundations of this model were laid in the ‘Protection of the Integrity of Electronic Records Project’ led by Luciana Duranti at the University of British Columbia (UBC) (Duranti and MacNeil 1996). Duranti’s project led directly to the US Department of Defense statement of requirements for electronic records systems (DOD 5015.2-STD 1997) that exerted a strong influence over US government procurement (and by extension over vendors in the document and records management space) in the decade after 1997;
the intervention in business applications model in which an originating organisation builds recordkeeping functionality into each application that they use to conduct their business. Records are captured (and assigned to the business activity from which they arose) within those native business applications. The roots of this model can be traced to the ‘Functional Requirements for Evidence in Recordkeeping Project’ (University of Pittsburgh 1996) whose leading light was David Bearman. Bearman’s project did not have the impact on US government practice that Duranti’s project enjoyed, but Bearman did have a strong influence on Australian records management thought. Bearman’s thinking would later be reflected in two standards that both attempted to define a minimum set of functional requirements for a business application to be able to manage its own records (MoReq 2010; ISO 16175–3 2010);
the in-place model in which an organisation manages records within the applications that they were created in, without intervening to add recordkeeping functionality to those applications, or to adapt the structure/schema of those applications; and regardless of whether or not records within those applications are assigned to the business activity from which they arose. The first major policy example of this approach came with the announcement in 2013 by the US National Archives and Records Administration (NARA) of a new ‘Capstone’ policy approach to email, in which federal government agencies were asked to schedule the email accounts of important officials for permanent preservation (NARA 2013). Towards the end of the second decade of this century, tech giants Google and Microsoft used an in-place philosophy in their cloud suites. Microsoft’s cloud suite (Office 365, later renamed Microsoft 365) came equipped with a compliance centre through which a retention policy could be applied to any of the main types of aggregation (collaboration sites, email accounts, chat accounts, etc.) in any of the main applications (SharePoint, Exchange, MS Teams, etc.) within the suite (Microsoft 2020).

The in-place model is different in nature to the other two models. Duranti and Bearman’s models were aspirational and sought to achieve a form of perfection in the way records are managed, whereas the in-place model is resigned to the necessity of managing records in applications that are not designed with recordkeeping in mind. Duranti and Bearman elaborated their models in detail, whereas the in-place model awaits a precise definition and has emerged through the disparate actions and statements of various decision makers, practitioners, and vendors. Duranti and Bearman’s models have clear roots in archival theory, whereas the relationship between the in-place model and recordkeeping theory has yet to be established.

Purpose and structure

This paper aims to develop a precise definition of the in-place model and to place it in relation to the pre-existing electronic records management models developed by Bearman and Duranti. It also seeks to identify whether or not there is a basis within archival theory for the in-place model.

The paper also aims to establish whether or not the in-place model could ever be preferable to the models of Duranti and Bearman. It uses a review of archival theory in relation to the organisation of records, and a review of the evolution of records management practice, to make this adjudication.

Part one of this paper reviews the writings of six key archival theorists (Jenkinson, Schellenberg, Scott, Upward, Duranti, and Bearman) to compare their viewpoints on the organisation of records. On the basis of this review an explanation is proposed as to the nature of the circumstances in which a model that seeks to manage records within sub-optimal structures/schemas would be preferable to models that seek to optimise the structure/schema(s) within which records are held.

Part two of this paper examines the evolution of recordkeeping practice with regard to the structure/schema of record systems, in order to locate the emergence of the three rival records management models within their historical context.

The conclusion of this paper uses the foregoing analysis to characterise the circumstances in which early twenty-first century organisations are managing their records; and to advance an explanation as to which of the three rival records management models (or which combination of the models) is likely to be most appropriate for those circumstances.

Perspectives deployed

Logic of enquiry

This study uses a realist logic of enquiry.

From a realist perspective the effectiveness of any given policy, programme, or system in the social world is to a great extent dependent upon the context in which that policy/programme/system is deployed (Pawson and Tilley 1997; Pawson et al. 2005, p. 23). In adjudicating between the three rival records management models, this paper is not attempting to single out any one of the three models as being ‘better’ than the other two. Rather, it attempts to identify the types of circumstances in which each model is likely to lead to more reliable and efficient recordkeeping than the other two models.

Realists see social systems as consisting of a relatively complex implementation chain. Every point in the implementation chain is a potential point of failure for the system (Pawson et al. 2005, pp. 22–23). From this perspective the assignment of records to a structure/schema is one point in the implementation chain of a records system and needs to work in conjunction with the other key records system processes in order for the records system as a whole to be effective.

Upward’s conceptualisation of the four dimensions of the records continuum (creation, capture, organisation, and pluralisation) (Upward 1996, p. 278) is used in this paper as the best approximation within archival theory to the implementation chain of a records system. Records continuum theory does not prescribe any particular model for managing electronic records. Each of the three models under consideration in this paper could be articulated in records continuum terms. This neutrality enables the theory to be used as a diagnostic tool to identify the strengths and weaknesses of different models.

Definition of a records system used in this paper

For the purposes of this paper, an originating organisation’s records system is defined in the broadest possible terms as being the sum total of all the applications, repositories, structures/schemas, processes, policies, and rules that the organisation uses to capture, organise, and manage all the information that it creates and receives in the course of conducting its business.

The three models that are the subject of this paper can be seen as three alternative strategies that an originating organisation might deploy as guiding principles for the management of records within its records system. An organisation might seek to:

move all the content it wishes to treat as records into a separate records repository with a structure/schema that unites records arising from the same business activity;
intervene within business applications to ensure that each application has the necessary recordkeeping functionality and assigns records to the business activity from which they arose; or
manage records within native business applications without intervening in those applications and without insisting that records are assigned to the business activity from which they arose.

Part 1: the evolution of archival thinking on the structure/schema of records systems

The question of how records are best organised has been an ever-present thread in archival and records management thought over the course of the past century. The key fault line concerns the question of whether or not there exists an ideal structure/schema for a records system.

The term ‘structure/schema’ is used in this paper to cover the contextual information about records provided by both:

the metadata fields by which a record is described; and
the aggregations and classification structures by which records are grouped.

The term provides a degree of continuity between the paper age (when the physical aggregation and arrangement of records was of vital importance) and the digital age (when metadata fields can be as important as structures and aggregations).

Archival thought before the digital age

Jenkinson and the theory of original order

Writing immediately after the First World War, Jenkinson did not seek to identify an ideal order for a records system. He noted the practice of UK government departments of bringing together the records of one particular piece of business within one container (Jenkinson 1922, p. 89). He did not, however, place any special value on the organisation of records in this way.

Jenkinson’s overriding interest was that records serve as a reliable record of the work of an administration. For him there was no method of organising records that was a priori better than any other method. The key priority for Jenkinson was that an archive keeps records in the ‘original order’ in which they were kept by the original administration whose work they document (Jenkinson 1922, pp. 82, 87).

Schellenberg’s advocacy of functional classification

Jenkinson’s manual of archives administration contained no convincing approach to managing the retention and disposal of records. Further increases in the size of organisations and the volume of records during and after the Second World War made this matter ever more pressing. In 1956 Schellenberg wrote the founding text of records management in which he argued that:

Public records as a rule, should be classified in relation to function. They are the result of function, they are used in relation to function, they should, therefore, be classified according to function (Schellenberg 1956, p. 63).

Functional classification was the optimal method of organising records because it:

provides the basis for preserving or destroying records selectively after they have served the purposes of current business (Schellenberg 1956, p. 52).

Schellenberg outlined a scientific approach to managing records through their lifecycle on the basis of records retention rules specific to each different business function/activity (Schellenberg 1956, pp. 94–110).

Where an organisation is able to apply a functional classification to the records that action officers use as their main source of reference, there is no conflict between Schellenberg’s functional classification and Jenkinson’s concept of the original order.

The Australian series system

Scott, writing a decade after Schellenberg and working at the Commonwealth Archives Office of Australia, developed a new way of thinking about the organisation and description of records which came to be known as the Australian series system. Scott addressed the recordkeeping challenges posed by the dynamic nature of a modern administration with frequent changes to organisational structure and frequent re-allocation of functions from one agency to another (Scott 1966). The series system approach was to separate the description of records from the description of the context within which those records arose. The two key aspects of that context were:

the people (agencies, teams, individuals) that created/received the records;
the business (functions and activities) that the records arose from.

Cunningham summarised the series system insight thus—‘Archives are created when people or organisations perform functions and activities’ (Cunningham et al. 2013, p. 126). This triangle between people, business activities, and records would provide the basis on which the Australian recordkeeping metadata standard was built (McKemmish et al. 2006, p. 12).

In effect Scott had abandoned Schellenberg’s search for the perfect structure for an originating organisation’s record system. Even if the organisation found such a structure, the dynamic nature of government meant that the structure was likely to undergo change. From a series system perspective, it is not essential that an agency’s records are organised by business activity. What is essential is to maintain the triangular linkage between people, business activities, and records. Records in a structure/schema based on people can be mapped to the business activities that those people carried out (and vice versa).
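The triangular linkage can be illustrated with a minimal sketch. It assumes a hypothetical people-based structure (records filed under their creator) and a separately maintained description of which activities each person carried out. The names `Record`, `activities_by_person`, `candidate_activities`, and `records_for_activity` are illustrative inventions and do not come from Scott or from any metadata standard. Note that where a person carries out several activities, the mapping yields only candidate activities for each record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str
    creator: str  # the person (or agency) that created/received the record


# Hypothetical context description, held separately from the records:
# the business activities that each person carried out.
activities_by_person = {
    "j.smith": {"grant assessment"},
    "a.jones": {"grant assessment", "building maintenance"},
}

# Records held in a people-based structure (one entry per creator).
records = [
    Record("R1", "j.smith"),
    Record("R2", "a.jones"),
    Record("R3", "a.jones"),
]


def candidate_activities(record):
    """Map a record to the activities it may document, via its creator."""
    return activities_by_person.get(record.creator, set())


def records_for_activity(activity):
    """Reverse mapping: records whose creator carried out the activity."""
    return [r.record_id for r in records if activity in candidate_activities(r)]


print(records_for_activity("building maintenance"))
```

The records are never re-filed by activity in this sketch. The link to business activity is recovered from the context description, which is the point the series system makes.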

Archival thought at the dawn of the digital age

Records continuum theory

The digital revolution of the 1990s led to a new focus on the concept of a records system and its qualities. Upward worked at Monash University to develop a new theory of how records moved through time and space. This was a conscious attempt to replace Schellenberg’s record lifecycle with a records continuum in which records were created into a world with pre-existing social and technical structures and systems.

Upward saw records continuum theory as a development of the Australian series system. His diagram of the continuum had four axes radiating at right angles to each other from an event in space–time. One axis (the recordkeeping axis) depicted the records arising from an event or set of events. This axis was separate from both:

the axis depicting the purposes/functions/activities/acts from which the event arose (the transactional axis);
the axis depicting the institutions, organisations, units, and actors mandated to fulfil/carry out those functions/activities/acts (the identity axis) (Upward 1996, p. 278).

In effect these three axes reproduce the records triangle of the Australian series system tradition. The structure within which records are aggregated is influenced, but not determined, by the structure of the functions/activities/acts that they document and the structure of the organisations/units/actors that carry out those functions. Figure 1 shows the records triangle superimposed onto the relevant parts of Upward’s diagram of the records continuum.

Upward identified four dimensions of the records continuum, in the shape of four key processes with which records are managed through space and time. He depicted these four dimensions on his records continuum diagram as four concentric circles radiating outwards from an event:

the creation of documents;
the capture of records;
the organisation of records as organisational memory;
the pluralisation of records as collective memory (Upward 1996, p. 278).

The advantage offered by Upward’s conceptualisation is that it makes explicit the relationship between the organisation of records and the capture of records; and between the organisation of records and the management and use of records. The assignment of records to a structure/schema is the third of four key recordkeeping processes. It is dependent on records having been created and captured into a record system. It is itself a pre-condition for the application of the records retention arrangements that enable an originating organisation to manage and pluralise records over time and space, and to contribute records to the collective memory of society.

ISO 15489–1: 2001 and the definition of a reliable records system

Upward resisted turning his records continuum theory into a recipe for a record system standard or specification to guide practitioners. It was left to the wider Australian recordkeeping community to produce such a standard, which began as an Australian national standard (AS4390: 1996) and was then taken up and revised into an international standard (ISO 15489–1: 2001). However, ISO 15489–1: 2001 took a much less nuanced position than Upward had on the question of the organisation of records within record systems.

ISO 15489–1: 2001 contained perhaps the most concise and influential statement of what constitutes a perfect records system. The standard stated that a reliable records system must:

routinely capture all records within the scope of the business activities it covers;
organise records to reflect the business processes of the records’ creator;
protect the records from unauthorised alteration or disposition;
routinely function as the primary source of information about actions that are documented in the records;
provide ready access to all relevant records and related metadata (ISO 15489–1: 2001, s. 8.2.2).

Despite the genesis of ISO 15489–1: 2001 in Australia, this was a noticeable departure from the Australian series system, and from Upward’s continuum theory, in that it explicitly stated that records should be organised to reflect business processes.[1]

From a realist perspective, a weakness of the ISO 15489–1: 2001 conceptualisation of a reliable record system is that it does not discuss the play-offs (that is, the trade-offs) between these criteria. It is written in a way that implies that it should always be possible for an originating organisation to configure its records system to meet all five of these criteria. We might, however, wonder what would happen if an organisation found itself in a position where it was not able to routinely capture all records into a structure/schema that organised records to reflect business processes.

The electronic records management system specifications of Duranti and Bearman

Whilst the Australian recordkeeping community was drafting the national standard that would later become ISO 15489–1: 2001, two rival projects, led respectively by Luciana Duranti at UBC and David Bearman at the University of Pittsburgh, were working to produce specifications of the functionality required from an electronic records management system. Like ISO 15489–1: 2001, both projects required such systems to link records to the business activity that they arose from.

Duranti’s UBC project found that an essential component of a record was the ‘archival bond’ which ‘refers to the link that every record has with the previous and subsequent one in the conceptual net of relationships among the records produced in the course of the same activity’ (Duranti and MacNeil 1996, p. 53). Duranti’s research led directly to the DoD 5015.2 standard for electronic record systems (DoD 1997).

Bearman’s Pittsburgh project stated that a records system should link each record to:

a single logical business activity (and the other transactions that also form part of that activity) (University of Pittsburgh 1996, requirement 7c1);
the business function to which the transaction belongs and the business rules that apply to that function (University of Pittsburgh 1996, requirement 7c3).

This unanimity among the major records management and electronic records management system standards and specifications of the 1990s gave the outward impression that it could be regarded as an axiom of archival science that records systems should assign records to the business activity/transaction from which they arose. This impression is not, however, representative of the broad stream of archival theory, given that neither Jenkinson, Scott, nor Upward made any such stipulation.

Initial conclusions from the review of archival thought on the organisation of records

The lesson we learn from reconciling Jenkinson’s advocacy of the principle of original order with Schellenberg’s advocacy of functional classification, and from reconciling Upward’s four dimensions of the records continuum with the ISO 15489–1: 2001 criteria for a reliable records system, is that a record system needs to be judged on two key logics:

capture: the logic that determines what is captured into a system and when and how it is captured;
organisation: the logic that determines the structure/schema of the record system.

These two logics have to be considered together because they are interdependent. We can use these two logics to identify the circumstances in which one or other of the three available records management models is preferable:

In circumstances where it is possible to optimise the logic of organisation of the system without jeopardising the logic of capture, then there is no necessity or justification to adopt the in-place model. The originating organisation should try to optimise the structure/schemas in which records are kept by adopting either Duranti’s separate records repository model, or Bearman’s intervention in business applications model, or a combination of the two.
In circumstances where it is not possible to optimise the logic of organisation of the record system without jeopardising the logic of capture into the record system, then an originating organisation faces play-offs between the capture and organisation of its records. In such circumstances if the organisation wishes to maximise the reliability of its record system, it is best advised to adopt the in-place model where records are managed within the structure/schema of the native application, even when it is sub-optimal.
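The two cases above amount to a simple decision rule, which can be written out as a short sketch. The function name and its single boolean parameter are hypothetical simplifications introduced here for illustration. In practice, judging whether capture would be jeopardised is an empirical question about a particular organisation rather than a yes/no input.

```python
def recommended_models(can_optimise_without_harming_capture):
    """Return the model(s) that the two cases above point towards."""
    if can_optimise_without_harming_capture:
        # Optimising the logic of organisation does not jeopardise the logic
        # of capture: either optimising model, or a combination, is justified.
        return [
            "separate records repository",
            "intervention in business applications",
        ]
    # Optimisation would jeopardise capture: manage records in place, within
    # the structure/schema of the native application.
    return ["in-place"]


print(recommended_models(True))
print(recommended_models(False))
```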

Part 2: the evolution of recordkeeping practice with regard to the structure/schema of record systems

Recordkeeping practice prior to the digital age

The use of registries to optimise both the capture and organisation of records

Jenkinson argued that the bulk of records consisted of correspondence: documents that moved from one person to another. Correspondence could be divided into three types: incoming letters, outgoing letters, and letters that circulated internally (Jenkinson 1922, p. 138).

Jenkinson’s Manual of archives administration described a mechanism for ensuring that business correspondence could be routinely and comprehensively captured as records. Organisations were able to interpose control points to intercept correspondence on its journey from sender to recipient (Jenkinson 1922, p. 143). These control points were called ‘registries’ in the British and Australian governmental systems (the nearest equivalent in the US system was the ‘file room’). They were staffed by registry clerks who filed incoming correspondence and copies of outgoing correspondence. The registry system gave organisations control of both the capture of records and the organisation of records. The fact that incoming correspondence was filed before it even reached the action officer meant that the action officer had little option but to use the registered files as their main source of reference on their work.

With such a set-up there is no play-off necessary between the imperative to routinely and comprehensively capture records and the imperative to organise them to reflect business processes. Correspondence can be comprehensively and routinely captured into an order that reflects business processes, which serves as the main source of reference on those processes for individuals carrying out the work and for the organisation as a whole.

Limits to the coverage of any one structure/schema within a record system

Jenkinson admitted that there was one type of record that could not be captured via a registry system, namely records such as index books, registers, ledgers, inventories, and minute books which did not take the form of correspondence and hence did not need to move from person to person (Jenkinson 1922, p. 150). These records were the precursors of what we would call ‘structured data’ in the digital age. They are books with a data structure, into which data entries are made. Any record system whose form of capture is based on the interception of documents during movement would not be able to capture these books as records because the books do not need to move: there is no sender or recipient for them to move between. Jenkinson argued that since correspondence in all of its forms accounted for the vast majority of an originating organisation’s records, it was possible to treat such books as exceptions that would sit outside of the registered records of the organisation (Jenkinson 1922, pp. 156–157).

The existence of pockets of structured data within organisations places a question mark over whether it is possible for an organisation to establish one order and one logic of capture for its entire record system. It therefore places limits on the coverage of the separate records repository model.

Entries in indexes, ledgers, and (in the digital age) databases are fitted to the structure of that index, ledger, or database and are not necessarily portable to any other structure/schema. One option is to treat each database as one item. However, this would not work if the scope of the index book/ledger/database is broader than the scope of any single node/aggregation within the structure/schema into which the organisation wishes to integrate all its records.

The arrival of the digital age

The coming of line of business systems

Bearman describes how the first application of computers in organisations was for the more transactional and routine areas of organisational life (Bearman 1994, p. 30). These were systems (accounting systems, financial systems, stock control systems, case management systems, etc.) that brought efficiencies to one area of the business and that we might call line of business systems. These systems might be regarded as expansions of the indexes/ledgers/inventories of the paper age and like them could be treated as exceptions to both the capture mechanism and the structure/schema of the main record system of an organisation that deployed registries or file rooms to manage its records.

Line of business systems opened up bigger gaps in the coverage of the structure/schema of a corporate-wide registry system than did indexes/ledgers/inventories:

a line of business system can be made accessible to anyone on the network on which the line of business system sits, whereas hard copy indexes, inventories, and ledgers could only be viewed in one place;
line of business systems could be developed that hold documents as well as structured data.

We can hypothesise that it is perfectly possible for the introduction of a line of business system to result in no reduction in the reliability or efficiency of a record system. The fact that a line of business system is confined to one area of work activity, and the fact that the areas chosen for line of business systems tended to be those with the most predictable work routines, meant that:

it was possible to design workflows to ensure that documents and communications relating to that activity were routed through the line of business system in question;
it was possible to equip the line of business system with a structure/schema that was specifically tailored to the activity in question and which should be at least as good as (if not better than) the structure/schema of the general corporate records system.

However, there are limits to the range of activities for which it is worthwhile to deploy line of business systems. Analysing a work process and designing a system dedicated to it can deliver efficiencies if that work process is repeated in the same way many times. Once an organisation has deployed line of business systems for its most transactional/highest volume activities, the law of diminishing returns means that the remainder of activities may be insufficiently standardisable and/or of insufficient volume/frequency to justify the creation of line of business systems.

The coming of email

In 1973 the first email protocol was issued (RFC 561:1973). It defined a format that would enable one Internet-connected computer to send a message to another. The format took off in the late 1980s when proprietary systems came onto the market that enabled organisations to allocate email addresses to individual staff and to file messages sent to and received from those addresses into an email account corresponding to the address. By 1996, for the first time, more items of correspondence were sent by email than through postal systems (Stephens 2007, p. 125).

The email format was a data structure. The email system of one organisation could instantly file any email received from another email system, because of the common data structure defined by the protocol.
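The sense in which an email is a data structure can be shown with a minimal sketch using Python's standard `email` package. The message below is an invented example with modern-style header fields. It is not a reconstruction of the 1973 format, and the addresses and the `accounts` dictionary are hypothetical.

```python
from email import message_from_string

# An invented message: named header fields, a blank line, then the body.
raw = (
    "From: sender@example.org\n"
    "To: recipient@example.com\n"
    "Subject: Contract renewal\n"
    "Date: Mon, 05 Feb 1996 10:15:00 +0000\n"
    "\n"
    "Please find the renewal terms attached.\n"
)

msg = message_from_string(raw)

# Because the fields are defined by a shared format, any receiving system
# can read them without human interpretation.
fields = {name: msg[name] for name in ("From", "To", "Subject", "Date")}
body = msg.get_payload()

# Automatic filing: the recipient address alone determines the account.
accounts = {}
accounts.setdefault(msg["To"], []).append(msg)

print(fields["Subject"], "filed under", list(accounts))
```

No registry clerk is needed at any step in this sketch. The structure that allows automatic filing is carried by the message itself.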

The coming of email challenged the logic of capture of registry systems. Prior to the coming of email, items of correspondence had behaved like unstructured data. They were stand-alone documents that could be integrated by both the sender and recipient organisation into their own separate structures. In contrast, an email behaved more like an entry into an index, register, or database. It was already integrated into a structure from the moment of its creation. The email system of the recipient filed correspondence into the same structure/schema as did the email system of the sender.

Registry systems had fitted into the gap in time and space between a sender sending an item of correspondence and the recipient receiving it. Email systems collapsed that gap. By instantaneously and automatically filing emails into the structure/schema of the email system, the introduction of email rendered impossible any logic of capture based on the manual interception of items whilst in transit between sender and recipient.

An organisation deploying a corporate email system might choose to reject the structure/schema of the email system as a vehicle for the application of records retention rules. Instead, they might seek to apply retention rules through a structure/schema based on business activity (such as the structure of the registered filing system or its electronic successor). However, in so doing they risk a reduction in the reliability of their record system. Reliability could only be maintained if they were able to deploy both:

a routine way of filtering business correspondence from non-business correspondence; and
a routine way of ensuring that business correspondence is assigned to the appropriate place within the business activity-based structure/schema.

If both these conditions were met, an originating organisation would be able to disregard the structure/schema of email accounts as being irrelevant to recordkeeping. It would make an email account a purely temporary holding place for incoming correspondence, just as a pigeon hole in a post room was a purely temporary resting place for correspondence in the paper age.

However, if an organisation that uses email as its main channel of correspondence was not able to establish routine filtration and assignment mechanisms, then the email account would be the only place where an action officer could get a comprehensive view of their business correspondence. Viewed in Jenkinsonian terms, email accounts would therefore become the original order of correspondence within an organisation.

David Bearman and the intervention in business applications model

David Bearman wrote what was perhaps the first major article from an archival theorist dedicated to email (Bearman 1994).

Bearman argued that the fact that most individuals carry out a variety of business activities meant that their email account would be unmanageable because the business rules that an organisation wishes to apply to records (and in particular records retention rules) all stem from business activities:

Electronic mail is a utility. As such it carries undifferentiated types of record for which we have very different business requirements. Since our reasons for keeping records have to do with business requirements for records for ongoing activity or long-term accountability, the fact that we don’t know what electronic mail contains, or more accurately what business transaction it carries out, means we don’t know how it needs to be managed. (Bearman 1994, p. 42).

Bearman’s article examines various options for ensuring that emails are routinely linked to the business activities from which they arose. He argues that in most corporate cultures a policy statement would be insufficient to ensure compliance, and that therefore the linking of emails to business activity would have to be built into the process of sending an email (Bearman 1994, pp. 33–36).

He outlined three alternative ways that this could be done:

by building a separate line of business system for each business area through which staff could send emails;
by deploying templates specific to business processes within the general email system and requiring staff to pick the relevant template when sending an email;
by interposing an extra step into the process of sending an email that requires the sender to assign the email to the business process from which it arose (Bearman 1994, p. 42).

Bearman’s approach to email was not widely adopted, and this is usually ascribed to it being too sophisticated technologically for organisations in the 1990s. Prom argued that ‘the software available at the time he wrote was not fit for the task of realizing his vision’ (Prom 2011, p. 6). However, it is also to be noted that the more technologically experienced organisations of the 2010s do not generally seem to have adopted Bearman’s approach to email either. We can therefore surmise that the non-adoption of Bearman’s proposed approach to email was not solely for reasons of technological complexity. The law of diminishing returns means that an organisation is unlikely to find it worthwhile to develop a line of business system for each activity, nor would they find it worthwhile to develop an email template for each business activity.

This leaves us with the option of interposing a corporate classification of business processes as a step in the process of sending an email. Let us suppose that:

there is a small percentage of organisations who have (or are willing to develop) a corporate classification of their business activities and are willing to interpose this as a step in the process of sending an email;
there is a majority of organisations which do not wish to interpose such a classification as a step in sending an email.

We are likely to find that those organisations that interpose that step would be spending longer to send each email than the organisations who do not interpose this step. This may not matter if the general volume of correspondence stays low. However, the greater the rise in the volume of email the greater is the extra time spent sending correspondence in an organisation that does impose the step compared to the time spent in one that does not. This would in effect put organisations who imposed this step at a competitive disadvantage when compared to organisations that did not.
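The scaling argument can be made concrete with a small worked calculation. All figures below (seconds per classification, working days per year, headcount) are hypothetical assumptions chosen for illustration and are not drawn from the sources cited in this paper. The sketch shows only that the extra time grows in direct proportion to email volume.

```python
def extra_hours_per_year(emails_per_day, seconds_per_classification, working_days=220):
    """Extra time one person spends classifying sent email, in hours per year.

    All parameters are illustrative assumptions, not measured values.
    """
    return emails_per_day * seconds_per_classification * working_days / 3600


# Hypothetical organisation: 1,000 staff, 10 seconds to pick a classification.
STAFF = 1000
SECONDS_PER_CLASSIFICATION = 10

for emails_per_day in (5, 20, 40):
    per_person = extra_hours_per_year(emails_per_day, SECONDS_PER_CLASSIFICATION)
    print(emails_per_day, round(per_person, 1), round(per_person * STAFF))
```

An organisation that does not impose the step incurs none of this cost, so the gap between the two groups of organisations widens as volume rises.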

If the above critique of Bearman’s proposals is valid, the implications are that it would be inefficient for organisations to adopt, on a corporate-wide basis, any routine method of assigning business email correspondence to any other structure/schema than that of their corporate email system. This is because the first two decades of the twenty-first century have been an era of partial automation. Email systems had given organisations the ability to automate the assignment of correspondence to a structure/schema not of their choosing (that of the email system), but organisations did not have the ability to automate the assignment of correspondence to an optimal structure/schema (see for example the UK government’s assessment of its automated capability in Cabinet Office 2017, p. 15).

It did not prove cost-efficient to intervene to customise email systems on a corporate-wide scale to require end users to assign emails to the business process/activity/transaction from which they arose. This left the records management profession with two options. They could either:

abandon the wish to organise correspondence in a way that reflects business processes; or
set up systems that could capture correspondence into a structure that reflects business processes via a mechanism that was not routine, and hence not comprehensive, and therefore not reliable.

Each of the two options above breaks at least one of the five criteria for a reliable records system set out in the first international records management standard (ISO 15489: 2001).

Luciana Duranti and the separate records repository model

For the first two decades of the digital age the choice made by the recordkeeping profession was to continue to try to organise business correspondence in a way that reflects business process. The theory behind the model adopted was provided by Duranti’s UBC project.

The UBC project argued that an essential component of a record was the ‘archival bond’ between the record and other records arising from the same business activity (Duranti and MacNeil 1996, p. 53). The UBC conception was for a record system that sits separately from the business applications of the organisation and has the capability to take any form of record from any of those applications. The UBC team worked with the US Department of Defense to draw up a standard for electronic records management systems based on the UBC model (DoD 5015.2:1997). The standard was endorsed by the US National Archives and Records Administration for use in Federal Government (NARA 1998).

Duranti’s model was technically more straightforward to implement than Bearman’s. A DoD 5015.2 compliant system could be set up separately from other business applications and staff could be asked to select and move particularly important items into it. The model could therefore be implemented without any form of customisation to any business applications, including email systems. The model did not require any intervention in communication processes—individuals could send and receive email without the intervention of any records management controls.

The weakness of the UBC/DoD 5015.2 model was that the mechanism for capture of correspondence into the structure/schema of the record system was considerably weaker than any of the mechanisms proposed by Bearman. In the UBC/DoD model it is left to the discretion of individual action officers as to when and whether they declare any particular item of business correspondence into a DoD 5015.2 compliant system.

Assessment of Bearman and Duranti’s approaches to email

We can hypothesise that if the volume and velocity of business email exchanged had not exceeded the volume and velocity of business correspondence prior to email, then both Duranti and Bearman’s approaches to email could have worked. In circumstances where individuals are sending only a relatively small number of business emails per day, neither the constraint (under Bearman’s intervention in business applications model) that each email must be linked to a business activity at the point of sending, nor the responsibility (under Duranti’s separate records repository model) to declare business emails to a separate corporate records system, would have seemed an excessive burden.

In fact, email correspondence volumes did grow exponentially throughout the last decade of the twentieth century and the first decade of the twenty-first century. For example, Baron (2018) gives a comparison of the numbers of emails generated by the successive US presidencies of Clinton, Bush, and Obama. William Clinton’s White House (1993–2001) sent/received 32 million emails. George W. Bush’s White House (2001–2009) sent/received more than 200 million emails. Barack Obama’s White House (2009–2017) sent/received more than 300 million emails. The steepness of this rise in volume meant that, in the absence of an automated capability to assign items to the relevant business activity, neither Bearman nor Duranti’s approach was a viable approach to email.

The in‑place records management model

Jason Baron and the Capstone approach to email

A potential way out of this impasse was offered by Jason Baron in the second decade of the twenty-first century. Baron was Director for Litigation at the US National Archives and Records Administration (NARA) between 2000 and 2013 and his thinking laid the intellectual groundwork behind NARA’s new Capstone policy towards email (NARA 2013).

Baron came to the view that the volume and velocity of business communications in the digital age meant that the curation of records into an ideal structure/schema was beyond the capability of human beings, and that asking federal officials to move important emails to a DoD 5015.2 compliant system was unfeasible (Baron and Attfield, 2012, p. 583).

Baron had rejected the Duranti/DoD 5015.2 model, but crucially he did not switch to a Bearman inspired model of intervening to customise email applications. Instead he proposed a new approach that could not be mapped to either Duranti’s or Bearman’s models. This new approach to email involved accepting the structure/schema of email systems.

Baron argued that a better way to ensure the preservation of important correspondence necessary for ongoing historical accountability was to identify those post holders within a federal agency that exercised key responsibilities and to select their email accounts for permanent preservation. Baron suggested that the account holders concerned be given a window during which they could identify personal and trivial emails in order that those emails could be excluded from preservation (Baron and Attfield, 2012, p. 587). Baron had, in effect, dropped the insistence that individual business emails be linked to the specific business activity that they arose from. Instead an email account is treated as an aggregation, and a retention/disposition rule is allocated to those aggregations on the basis of the role of the individual email account holder.
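The logic of allocating a retention rule to a whole account on the basis of the account holder's role can be sketched as follows. This is a minimal illustration under stated assumptions and is not NARA's implementation: the role names in `KEY_ROLES`, the `TEMPORARY` label for other accounts, and the message identifiers are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical disposition labels; not taken from NARA guidance.
PERMANENT = "permanent"
TEMPORARY = "temporary"

# Assumed list of post holders exercising key responsibilities.
KEY_ROLES = {"agency head", "deputy head"}


@dataclass(frozen=True)
class EmailAccount:
    address: str
    role: str


def disposition(account):
    """Allocate a rule to the whole account from the holder's role alone."""
    return PERMANENT if account.role in KEY_ROLES else TEMPORARY


def items_for_permanent_preservation(account, message_ids, excluded_ids):
    """Message ids to preserve, after the account holder has flagged
    personal and trivial items for exclusion during the review window."""
    if disposition(account) != PERMANENT:
        return []
    return [m for m in message_ids if m not in excluded_ids]


head = EmailAccount("head@agency.example", "agency head")
print(disposition(head))
print(items_for_permanent_preservation(head, ["m1", "m2", "m3"], {"m2"}))
```

No step in the sketch inspects the business activity behind an individual message; the only per-message decision is the account holder's exclusion of personal and trivial items.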

The fact that NARA made a dramatic switch in policy towards email without any corresponding attempt to seek a justification within archival theory left room for doubt as to whether Capstone could be considered as a legitimate recordkeeping approach. The preceding analysis of archival theory suggests two ways in which the Capstone approach is compatible with archival theory. The approach is compatible with the Australian series system insight that people can be mapped to functions, and that therefore the email accounts of individuals can be mapped to rules applicable to the business activities/functions that those individuals carried out. It is also compatible with Jenkinson’s theory of original order, if it is accepted that when an organisation fails to establish a routine way of capturing business emails into a separate structure/schema, then by definition email accounts are the original order of that business correspondence.

This still leaves us with the problem, however, that the Capstone approach is incompatible with both of the electronic records management models that were established in the 1990s, namely those of Duranti and Bearman.

If we accept the premise that record systems should always be organised to reflect business processes (as much of the profession in the 1990s did), then there are only two conceivable electronic record management models:

a model that seeks to deploy a dedicated record repository that is organised to reflect business activities; and
a model that seeks to ensure that every business application links records to the business activities they arose from.

By denying both the possibility and necessity to link business emails to the activity from which they arose, Baron and the Capstone policy open up space for a third records management model, a model in which no ideal is proposed as to how an application must organise records. Baron himself made no attempt to articulate such a model. It is, however, a relatively straightforward matter to create a general records management model out of Baron’s insights, and the most apt term for such a model is the ‘in-place model’.

Defining the in‑place records management model

The phrase ‘in-place records management’ has been in fairly common usage over the past decade, being loosely used to denote any approach, product or piece of functionality that seeks to apply records management rules to content without first moving that content into a separate records repository.

In order to create a third records management model out of Baron’s insights we need to be more precise about how we are using the term ‘in-place records management’.

One possible definition of the in-place model would define it as the drive to ensure that native business applications have the necessary functionality and the necessary structure/schema to manage their own records. Defined in this way, the in-place records management model would be indistinguishable from Bearman’s model and would exclude the Capstone approach. This would leave unresolved the fact that one of the largest archival institutions in the world has been adopting an approach to the dominant correspondence channel of the age (email) that does not fit with either of the existing electronic records management models. More importantly, it would leave the profession with no records management model for those situations where it is neither feasible to move all significant business correspondence into an application optimised for recordkeeping nor feasible to customise or configure messaging systems with a structure/schema optimised for recordkeeping. If the foregoing analysis is correct, this is precisely the situation organisations have found themselves in since the mass adoption of email in the mid-1990s.

This paper therefore proposes a definition of in-place records management as being a model in which records are managed:

in-place within the native application (even where the functionality of the application is sub-optimal for recordkeeping) in circumstances where it is not feasible to consistently move all content needed as records into an application that is optimised for recordkeeping;
in-place within the structure/schema of a native application (even where that structure/schema is sub-optimal for recordkeeping) in circumstances where it is not feasible or cost-effective to customise, enhance, or reconfigure the structure/schema of the native application to optimise it for recordkeeping.

This model complements but does not replace Bearman’s and Duranti’s models. It provides a justification, consistent with archival theory, for an organisation to manage emails within email accounts (and instant messages within instant messaging accounts) in circumstances where they lack an automated capability to reassign such messages to a structure/schema better optimised for recordkeeping.

Conclusion

The era of partial automation

The mass adoption of email in the mid-1990s ushered in a new era of recordkeeping. Prior to the adoption of email, correspondence was filed manually into a structure/schema of an organisation’s own choice. After the coming of email, correspondence was filed automatically into a structure/schema that was determined not in the originating organisation but by a globally agreed email protocol, and the proprietary email systems based on that protocol. This is, in effect, an era of partial automation.

In order to determine which of the three records management models is the most appropriate to this era, we need first to establish whether or not it is possible, in the era of partial automation, to optimise the efficiency of a corporate-wide records system. The in-place model involves acceptance of sub-optimal structure/schemas, so in circumstances where it is possible to optimise the efficiency of a structure/schema of a corporate records system (and/or of each native business application) we should reject that model.

The efficiency of a structure/schema has two aspects:

the efficiency of the process used to assign records to that structure/schema;
the efficiency with which retention and access rules can be applied to records through that structure/schema.

We can make the following two premises:

automated processes are in general more efficient than manual processes;
the most efficient structure/schema for recordkeeping purposes is one that organises records to reflect business processes.

If the two premises above are accepted, then we are forced into the conclusion that during the era of partial automation, it is not possible to optimise the efficiency of an organisation’s record system.

Organisations have the option of either:

accepting the structure/schema into which correspondence is automatically filed by messaging systems such as email systems (even though it is sub-optimal when it comes to the application of retention and access rules) or
deploying manual processes to assign business correspondence to a better structure/schema (even though the manual process is less efficient than the automatic filing of email systems).

The uneasy co‑existence of all three records management models

In this era of partial automation there is need for all three of the main records management models:

in an era where messages are automatically assigned to a sub-optimal structure, the in-place model appears to be the optimal approach to apply to correspondence in messaging systems, including email. This is because for many or most business activities there will be no cost-effective way of routinely assigning messages to any other structure than that of the messaging system itself;
Duranti’s separate records repository model is suitable for managing documents because, unlike messages, documents are created outside of any structure and therefore there is no loss of efficiency in asking humans to assign them to an optimal structure. It is possible to deploy a structure/schema for a corporate document management system that in theory could embrace all documents and correspondence of all activities. However, the logic of capture into the corporate document system will be extremely weak for any messages sent through generic messaging systems such as email;
for a high volume, highly transactional line of business it should always be possible to improve on both the logic of capture of the corporate document management system and on the structure/schema of a corporate messaging system such as email. This can be done by creating a line of business system supplemented by workflows/forms/automation to channel messages as well as documents through the system (in much the same way that Bearman recommended). For these line of business systems Bearman’s intervention in business applications model is appropriate. There seems little benefit in moving documentation from an application that is optimised for the activity in question to a corporate document management system with a generic structure/schema.

What this in effect means is that organisations in this era of partial automation are likely to deploy all three of the main extant records management models:

Duranti’s separate records repository model is suitable for managing documents from activities for which the construction of a line of business system is not viable;
an in-place model is suitable for managing the correspondence/messages arising from activities for which the construction of a line of business system is not viable;
Bearman’s intervention in business applications model is suitable for managing documents, correspondence, and data arising from business activities whose volume and/or predictability makes the construction of a line of business system cost-efficient.
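The allocation of models set out in the list above can be summarised as a simple decision rule. The sketch below is illustrative only: the function name, the `content_type` labels, and the `lob_system_viable` flag are hypothetical, and the paper does not specify a model for data arising from activities with no line of business system, so that case is deliberately left unresolved.

```python
from enum import Enum


class Model(Enum):
    SEPARATE_REPOSITORY = "Duranti: separate records repository"
    IN_PLACE = "in-place"
    INTERVENTION = "Bearman: intervention in business applications"


def choose_model(content_type, lob_system_viable):
    """Pick a records management model for one kind of content.

    content_type: "document", "message", or "data" (hypothetical labels).
    lob_system_viable: True where the activity's volume and/or predictability
    makes a line of business system cost-efficient.
    """
    if lob_system_viable:
        # Documents, correspondence, and data all go through the dedicated system.
        return Model.INTERVENTION
    if content_type == "message":
        return Model.IN_PLACE
    if content_type == "document":
        return Model.SEPARATE_REPOSITORY
    # Not addressed in the list above, so the sketch refuses to guess.
    raise ValueError(f"no model specified for {content_type!r} without a line of business system")


for content_type, viable in [("document", False), ("message", False), ("message", True)]:
    print(content_type, viable, choose_model(content_type, viable).value)
```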

The models exist in a state of uneasy co-existence, because the logic and ideas behind each model undermine the logic and ideas behind the other two models. However, there is a need for all three models because in an era of partial automation none of the three models on their own can optimise both the logic of capture and the logic of organisation of all the correspondence, documentation, and data arising from an organisation’s business activities.

Implications of the findings of this paper

The findings of this paper in no way invalidate past, present, and future efforts to optimise the design of record systems. They do, however, indicate that there is a specific set of circumstances in which managing business correspondence within applications which are not designed with recordkeeping in mind is the best available option. This set of circumstances occurs when there is an imbalance between:

the relative efficiency of the automated assignment of correspondence to a structure/schema that is not optimised for recordkeeping (for example one where correspondence is aggregated by individual sender/recipient);
the relative inefficiency of the manual assignment of correspondence to a structure/schema that is optimised for recordkeeping (for example one where correspondence is assigned to the business activity it arose from).

The imbalance will disappear if and when the capability is acquired to automatically and reliably assign electronic correspondence to a structure/schema that is designed with recordkeeping in mind.

Funding: This paper comes from a doctoral research project funded by The National Archives (UK) (TNA). The opinions expressed in this paper are those of the authors, and do not necessarily represent the views of TNA.

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Originally published at: Archival Science (2021) https://doi.org/10.1007/s10502-020-09354-9

References

AS4390 (1996) Australian standard: records management

Baron J (2018) The future of email archiving: four propositions. Digital preservation coalition briefing day, https://www.dpconline.org/docs/miscellaneous/events/2018-events/1768-dpc-email-ii baron/file. Accessed on 31 July 2020

Baron J, Attfield S (2012) Where light in darkness lies: preservation, access and sensemaking strategies for the modern digital archive. In: Duranti L and Shaffer E (Eds) The memory of the world in the digital age: digitization and preservation, 26–28 Sep 2012, Conference Proceedings. UNESCO, 2013. pp 580–595

Bearman D (1994) Managing electronic mail. Arch Manuscr 22(1):28–50

Cabinet Office (2017) Better information for better government. www.gov.uk/government/publications/better-information-for-better-government. Accessed 30 August 2020

Cunningham A, Miller L, Reed B (2013) PJ Scott and the Australian ‘series’ system: its origins, features, rationale, impact and continuing relevance. Comma Issue 1:2013. https://doi.org/10.3828/comma.2013.1.13

DLM Forum Foundation (2011) MoReq2010: Modular requirements for records systems — volume 1: core services & plug-in modules. https://www.moreq.info/files/moreq2010_vol1_v1_1_en.pdf . Accessed 09 November 2020

Duranti L, MacNeil H (1996) The protection of the integrity of electronic records: an overview of the UBC-MAS Research Project. Archivaria 42:46–67

DOD 5015.2-STD (1997) Design criteria standard for electronic records management software applications. US Department of Defense

ISO 15489:1–2001 Information and documentation — records management — part 1: concepts and principles

ISO 15489–1:2016 Information and documentation — records management — part 1: concepts and principles

ISO 16175–3:2010 Information and documentation — Principles and functional requirements for records in electronic office environments — part 3: guidelines and functional requirements for records in business systems

Jenkinson H (1922) A manual of archives administration. Clarendon Press, Oxford

McKemmish S, Acland G, Ward N, Reed B (2006) Describing records in context in the continuum: the Australian recordkeeping metadata schema. Archivaria 48(September):3–37

Microsoft (2020) Create and configure retention policies. In: Managing information governance. https://docs.microsoft.com/en-us/microsoft-365/compliance/create-retention-policies?view=o365-worldwide. Accessed 30 Aug 2020

NARA (1998) Baseline requirements for automated record keeping. https://www.archives.gov/records-mgmt/policy/automated-recordkeeping-requirements.html. Accessed 5 Sept 2020

NARA (2013) Bulletin 2013–02 - Guidance on a new approach to managing email records. https://www.archives.gov/records-mgmt/bulletins/2013/2013-02.html. Accessed 27 July 2020

Pawson R, Tilley N (1997) Realistic evaluation. Sage, London

Pawson R, Greenhalgh T, Harvey G, Walshe K (2005) Realist review - a new method of systematic review designed for complex policy interventions. J Health Serv Res Policy 10(Suppl 1):21–34. https://doi.org/10.1258/1355819054308530

Prom C (2011) Preserving email, DPC technology watch report (1st edn.). Digital preservation coalition. https://doi.org/10.7207/twr11-01

RFC 561 (1973) Standardizing network mail headers. https://doi.org/10.17487/RFC0561

Schellenberg TR (1956/1998) Modern archives: principles and techniques (Repr. ed.). SAA, Chicago (Original work published 1956)

Stephens D (2007) Records management: making the transition from paper to electronic records. ARMA International, Lenexa, KS

University of Pittsburgh (1996) Functional requirements for evidence in recordkeeping. https://www.archimuse.com/papers/nhprc/prog1.html. Accessed 12 June 2020

Upward F (1996) Structuring the records continuum - part one: postcustodial principles and properties. Arch Manuscr 24:268–285

James Lappin is investigating archival policy towards email in a doctoral research project based in Loughborough University’s Centre for Information Management, School of Business and Economics, and co-supervised by the UK National Archives. He has over 20 years of experience in the records management field as a practitioner, consultant, researcher and trainer, working for public sector, private sector and international organisations in the UK and Europe.

Tom Jackson is Professor of Information and Knowledge Management in Loughborough University’s Centre for Information Management where he has conducted innovative research into information overload, the impact of digital sensors on daily life, and the cost of email communications.

Graham Matthews is Emeritus Professor of Information Management in Loughborough University’s Centre for Information Management. His research interests include emergency planning in the cultural heritage sector and digital resilience.

Clare Ravenwood is an Associate Lecturer in Information Management, at the Centre for Information Management (CIM) at Loughborough University. Her PhD thesis investigated selection for digital preservation in libraries, archives, and museums. Her main research interests are digital preservation, selection and censorship, and archives and community engagement in business.

[1] In 2016 a revised edition of ISO 15489 was issued in which the list of criteria for a reliable records system was expanded (ISO 15489–1: 2016, s. 5.3.2.1). The requirement that a records system must routinely capture records within its scope was retained, but the requirement that a system be organised to reflect business processes was dropped.