Project: Detailed tracking of rail passenger journeys

Reference: STP 7/2/1

Last update: 03/09/2009 12:08:21


The overall aim is to facilitate effective and efficient planning and operation of the rail network, by enabling the use of a range of datasets to capture a more detailed and responsive representation of rail travel in Great Britain.

Specific objectives are:

. Identification of initiatives which would allow the efficient capture of existing datasets, and encourage the continuing creation of new datasets where useful.
. Development of a methodology for processing the available datasets into a single database of rail journeys in Great Britain.
. Development of a design for a central electronic store of the information, accessible as appropriate for use within government and the rail industry.
. Dissemination of the findings to key industry stakeholders in government and commercial organisations.


Efficient and effective investment in transport systems is dependent on a valid understanding of passenger journey requirements and behaviours, derived from observation of historic journey choices and volumes. Knowledge of historic journeys on the rail network is currently derived from ticket sales and customer surveys. These datasets are often incomplete, fragmented, expensive to update or have a limited shelf-life. The creation of a more detailed, responsive and accessible dataset will enable transparent understanding and tracking of journeys, ultimately facilitating improvements in:

. Investment - guiding investment; identifying inefficiencies; measuring value,
. Planning - maximising passenger benefits; optimising resource use; tracking impact of changes,
. Operations - reducing delay impacts; improving safety.

This project will investigate means of producing a detailed and continuously updated database of rail travel within the UK. It will combine extensive rail experience and expertise with recent e-Science Grid middleware developments, enabling the integration and analysis of distributed and heterogeneous data sources. It will use a combination of traditional sources with 'new' datasets extracted from train-based load measurement and ticket gates. Activities will include:

. a review of availability of data from manual and electronic sources, including ease of access;
. investigation of the statistical validity and accuracy of source data;
. development of mathematically robust techniques for deriving passenger volumes;
. design of an information store to enable the integration and analysis of data sources, providing appropriate routes for access by users;
. dissemination of results through written reports, presentations and meetings with potential users and industry stakeholders.

The project outputs will include a proof of concept, including a data processing methodology and a prototype database. Cross-benefits in other modes, notably the bus industry, will also be investigated.


AEA Technology Rail (London)
Central House, Upper Woburn Place, LONDON, WC1H 0JN

Contract details

Cost to the Department: £98,000.00

Actual start date: 01 October 2005

Actual completion date: 01 December 2006

Summary of results

  1. The rail industry collects a lot of passenger journey data. Some data sources include details of passenger journeys, whereas others contain data from which information about passenger journeys can be derived, such as ticket sales and train loads. The data sources vary significantly in their richness, quality, accuracy and coverage.

    Passenger journey data is needed and used by many parties across all the organisations that make up the rail industry. The uses range from understanding the rail market and allocating capacity to operations, monitoring and regulation. Accurate and timely access to the highest quality information on which decisions are made will increase the efficiency of the investment in the UK rail industry, the meeting of performance targets and the maximisation of profits for the commercial organisations.

    The current approach to the collection and use of passenger journey information depends on the organisation that collects the data, the uses to which it is put, and the value that the users perceive they can generate from the information. Typically, government organisations collect information on passenger journeys to inform their strategic planning, and they are likely to make this information available to a wide range of organisations: both government and commercial. Train Operating Companies will collect information in order to better manage their franchise, to aid them in meeting their franchise obligations and to maximise their profits, and may not necessarily share such commercially sensitive information outside of their business. Some passenger journey data sources are snapshots, and some are continuously collected.

    The aim of this research project was to investigate the possibility of creating a detailed and responsive source of information on rail passenger journeys in the UK. It is recognised that a more valuable source of information made available to all parties within the industry would increase the effectiveness of the investment made into the railways and improve the service offering to passengers.

    Our research shows how rail data sources can be combined in a systematic way, using the latest e-Science technologies, in order to maximise the useful information derived, and that this combination can make a vital contribution to tackling a range of standard rail industry questions and studies with increased speed and accuracy.

    The outputs resulting from this project can be separated into the following elements:

    - Comprehensive and consistent documentation, including an assessment of accuracy, of all the major sources of passenger rail journey data that currently exist within the rail industry.

    - An overview of how rail passenger journey information is used within the rail industry covering all organisations within the industry and uses from strategic planning to operations. A range of Use Cases have been developed to demonstrate how the journey data is currently used, and to show how a more comprehensive source of journey data could provide more value to the users.

    - Demonstration of how journey data may be combined to create a single source of passenger journey information. This concept has been developed for two particular examples:

    + Ticket sales (from Lennon) and a passenger survey (the London Area Travel Survey).
    + The national rail timetable and on-train automatic passenger counts.

    - Development of a design for a Passenger journey Information System that incorporates:

    + Data layer: ability to draw large data sets from different sources into the system.
    + Middleware layer: data analysis tools for manipulating the data.
    + User interface: presentation tools to allow users to extract value from the journey data.

    - Development of the concept of a 'workflow' which captures methodologies generated by railway data domain experts within the Passenger journey Information System. Methodologies are required for: data cleaning and pre-processing, combining data sources, and performing analysis on the journey data.

    - Development of an illustrative prototype of the Passenger journey Information System in the InforSense system. This prototype demonstrates:

    + Incorporation of an example combination of rail data sources into a system, in order to show how the wide range of different data sources may be combined systematically.
    + Generation of workflows to combine journey data sets, and to generate answers to specific User Queries.
    + User interface from which graphical presentations of the answers to User Queries can be obtained.

    - Confirmation that a wide range of rail industry questions or studies can be improved through better combination of the passenger data sources.
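
The 'workflow' concept listed among the outputs (data cleaning and pre-processing, combining data sources, then analysis) can be illustrated with a minimal sketch. This is not the InforSense prototype or the project's methodology: the station codes, journey counts, survey factor and all function names below are hypothetical, invented only to show how three such steps might be chained.

```python
from functools import partial
from typing import Callable, Dict, List, Tuple

Flow = Tuple[str, str]  # (origin, destination); station codes here are made up


def clean_sales(rows: List[dict]) -> Dict[Flow, float]:
    """Cleaning step: drop incomplete or non-positive rows, total journeys per flow."""
    totals: Dict[Flow, float] = {}
    for row in rows:
        origin = row.get("origin")
        dest = row.get("destination")
        n = row.get("journeys", 0)
        if not origin or not dest or n <= 0:
            continue
        totals[(origin, dest)] = totals.get((origin, dest), 0) + n
    return totals


def apply_survey_factors(sales: Dict[Flow, float],
                         factors: Dict[Flow, float]) -> Dict[Flow, float]:
    """Combination step: scale ticket-based counts by a survey-derived factor.

    Flows with no survey factor are left unchanged (factor 1.0). This simple
    scaling is an assumption made for the example only.
    """
    return {flow: n * factors.get(flow, 1.0) for flow, n in sales.items()}


def journeys_by_origin(journeys: Dict[Flow, float]) -> Dict[str, float]:
    """Analysis step: total estimated journeys starting at each station."""
    out: Dict[str, float] = {}
    for (origin, _dest), n in journeys.items():
        out[origin] = out.get(origin, 0.0) + n
    return out


def run_workflow(data, steps: List[Callable]):
    """Run each step in order, feeding the output of one into the next."""
    for step in steps:
        data = step(data)
    return data


# Invented sample data: two rows are deliberately invalid.
raw_sales = [
    {"origin": "A", "destination": "B", "journeys": 100},
    {"origin": "A", "destination": "C", "journeys": 50},
    {"origin": "", "destination": "B", "journeys": 10},   # missing origin
    {"origin": "B", "destination": "C", "journeys": -5},  # invalid count
    {"origin": "A", "destination": "B", "journeys": 20},
]
survey_factors = {("A", "B"): 1.5}  # hypothetical uplift for one flow

workflow = [
    clean_sales,
    partial(apply_survey_factors, factors=survey_factors),
    journeys_by_origin,
]
result = run_workflow(raw_sales, workflow)
print(result)
```

Keeping each step as a separate function is one way a specialist's method could be captured once and reused by other analysts; a real system would draw its inputs from the data layer rather than from in-line sample records.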

    The Passenger journey Information System concept that we have developed can generate value to the industry via:

    - Ease of access to journey data from a single system;

    - Ability to easily extract value from the journey data via the use of workflows and a user interface appropriate for an expert analyst user or a manager user;

    - Capture of domain knowledge from railway specialists;

    - Access to established methodologies for data combination and data analysis;

    - Consistent approach can be adopted, reducing duplication of effort and thus increasing efficiency across the industry; and

    - Promotion of a standard approach to certain data analyses creating a common language, and increasing understanding across the different parties in the industry.