IID Protocol

Jefferson Smith edited this page Jun 2, 2021 · 10 revisions

Syntax

After some initial experimentation with other schemes, INTERACT_IDs (IIDs) were standardized as 10-digit integers, broken down as follows, from left to right:

  • 2 digits identifying the city in which the user first participated
  • 2 digits identifying the wave number in which the user first participated
  • 6 digits holding an incremental counter of participants in that city and wave
Participating study cities are numbered sequentially, with:
  • 01 = Victoria, BC
  • 02 = Vancouver, BC
  • 03 = Saskatoon, SK
  • 04 = Montreal, QC
Waves are numbered sequentially for each city, starting at 01. The one exception is a preliminary proof-of-concept study conducted in Saskatoon, which has been designated Wave 00.
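
The layout described above can be unpacked with integer arithmetic. The following is a minimal sketch, not part of any official INTERACT tooling: the names `parse_iid`, `IIDParts`, and `CITY_NAMES` are hypothetical, and the IID values in the usage note are made up for illustration.

```python
from typing import NamedTuple, Union

# City codes as listed in this section (names only, for lookup).
CITY_NAMES = {1: "Victoria", 2: "Vancouver", 3: "Saskatoon", 4: "Montreal"}


class IIDParts(NamedTuple):
    """Hypothetical container for the three fields of an IID."""
    city_id: int   # first 2 digits
    wave: int      # next 2 digits
    counter: int   # last 6 digits


def parse_iid(iid: Union[int, str]) -> IIDParts:
    """Split an IID (integer or digit string) into city, wave, and counter."""
    value = int(iid)  # int() accepts digit strings with leading zeros
    if not 0 <= value < 10**10:
        raise ValueError(f"IID must fit in 10 digits, got {iid!r}")
    city_id, rest = divmod(value, 10**8)   # strip the 8 rightmost digits
    wave, counter = divmod(rest, 10**6)    # strip the 6 rightmost digits
    return IIDParts(city_id, wave, counter)
```

For instance, `parse_iid("0302000123")` yields city 3 (Saskatoon), wave 2, counter 123. Because the split is done on the integer value, it gives the same result whether or not the leading zero is present in the input.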

Uniqueness

The original intent was for each participant to have exactly one IID, which would follow them around through subsequent studies and even different cities, if they relocated. In practice, however, this has not held true. A number of factors make it possible for a single participant to be assigned multiple IID values. For example, if a user registers for two different studies using a different email address for each, it is likely that they will be issued distinct IIDs in each case.

It is also possible, although rare, that a single user might register multiple times for a single study and end up being assigned multiple IIDs. In each case where we have discovered this, only one of the IIDs actually contributed any data, so that was the one included in the dataset.

Generally speaking, researchers should assume that within a given study, each IID corresponds to a single, unique participant. For those wishing to compare individual users across multiple studies, further steps may be required to detect IID "aliases".

Representation

Given that each IID begins with a city_id, and that city_ids are defined to be 2-digit integers, most IID values assigned to date begin with a 0. For performance reasons, however, IIDs are represented in the PostgreSQL DB as integers (bigint). This means that they will often be displayed in a 9-digit form, with the leading 0 omitted. Dataset users are advised to take care when comparing IID values, especially when the values are taken from disparate sources. If comparisons are being made with IIDs in string format, be sure all IIDs are in their 10-digit form, with leading zeros restored. Because string comparisons are sensitive to a missing leading zero, it is usually safer to conduct all IID comparisons as integers.
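
As an illustration of the advice above, here is a small sketch of normalizing IIDs before comparing them. The helper names `iid_to_int` and `iid_to_str` are hypothetical, and the sample value is invented for the example.

```python
from typing import Union


def iid_to_int(iid: Union[int, str]) -> int:
    """Convert an IID in either form to an integer (the safer form for comparison)."""
    return int(str(iid).strip())


def iid_to_str(iid: Union[int, str]) -> str:
    """Render an IID in its canonical 10-digit form, restoring leading zeros."""
    return f"{iid_to_int(iid):010d}"


# A 9-digit value from the database and a 10-digit string from another source
# refer to the same participant once both are normalized.
from_db = 302000123
from_csv = "0302000123"
same_as_int = iid_to_int(from_db) == iid_to_int(from_csv)
same_as_str = iid_to_str(from_db) == iid_to_str(from_csv)
```

A naive string comparison such as `str(from_db) == from_csv` would report a mismatch here, which is the pitfall the paragraph above warns about.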

Assignment

IIDs are generated automatically by Treksoft when a participant registers for a study. The IID is visible to coordinators in the participant management portal but cannot be edited.
