Data Management - Overview of Data Files

Is current?: Yes

This page gives a general overview of Project Implicit data files downloaded from the RDE - the contents of the files, how they are organized, and how they are related to one another.

For a guide to basic practical issues with cleaning and combining data that are (perhaps) unique to Project Implicit data, see Data Management - Practical Issues with Cleaning/Combining Project Implicit Data.

For detailed instructions on data cleaning/combining, see Data Management - Detailed Steps for Cleaning/Combining Project Implicit Data.

Project Implicit Data Files

Organization of Data Files from Different Sites

  1. Demo site, mental health site, featured tasks, and private clients all have 4 data files: sessions, sessionTasks, explicit, and iat.
    1. All files are organized by session_id
  2. Research site data have 5 data files: sessions, sessionTasks, explicit, iat, and demographics
    1. Only the demographics file is organized by user_id; it has no session_id information because demographics are collected before the study (at registration)
    2. Sessions and sessionTasks files have both user_id and session_id, so they can be linked with the demographics file using user_id
    3. Explicit and iat files are organized by session_id because all data are collected within an individual study session
    4. One user_id may have more than one session_id within the same study if:
      1. The study had no restrictions and participants were allowed to complete the same study more than once
      2. The study had follow-up sessions
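The linkage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are made up, the characteristic name `birthyear` and the study name `demo.study` are hypothetical, and real exports would be read from files rather than typed inline.

```python
from collections import defaultdict

# Hypothetical rows, shaped like the research-site files described above.
sessions = [
    {"session_id": 101, "user_id": 1, "study_name": "demo.study"},
    {"session_id": 102, "user_id": 1, "study_name": "demo.study"},
    {"session_id": 103, "user_id": 2, "study_name": "demo.study"},
]
demographics = [
    {"user_id": 1, "characteristic": "birthyear", "value": "1990"},
    {"user_id": 2, "characteristic": "birthyear", "value": "1985"},
]
explicit = [
    {"session_id": 101, "question_name": "q1", "question_response": "4"},
    {"session_id": 103, "question_name": "q1", "question_response": "2"},
]

# session_id -> user_id lookup built from the sessions file.
user_of_session = {row["session_id"]: row["user_id"] for row in sessions}

# user_id -> all of that user's session_ids (more than one is possible).
sessions_of_user = defaultdict(list)
for row in sessions:
    sessions_of_user[row["user_id"]].append(row["session_id"])

# user_id -> {characteristic: value} from the demographics file.
demo_of_user = defaultdict(dict)
for row in demographics:
    demo_of_user[row["user_id"]][row["characteristic"]] = row["value"]

# Attach user_id and demographics to each explicit row via session_id.
linked = []
for row in explicit:
    user_id = user_of_session[row["session_id"]]
    linked.append({**row, "user_id": user_id, **demo_of_user[user_id]})
```

The sessions file acts as the bridge: explicit and iat rows reach user_id through session_id, and demographics are then attached through user_id.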

Types of Data Files

Study session files - fixed from the user’s point of view; changes require back-end programming

  1. Sessions file contains date and time information, as well as user information like previous sessions and browser
    1. Wide format organized by both user_id and session_id
    2. The researcher defines what counts as ‘completed’

| variable name | variable type | variable meaning |
| --- | --- | --- |
| session_id | numeric | session ID (one user can have many sessions) |
| user_id | numeric | user ID (one per user) |
| study_name | alpha | name of study (from expt file) |
| session_date | date/time | date of session |
| session_status | alpha | Defaults: C = completed session, null = incomplete session. Other codes: T = tilt (misbehavior), P = paused study, W = withdrawn study, E = expired study, I = incomplete study implementing continuation, D = researcher-defined study completion |
| creation_date | date/time | date the session was created (default is same as session_date) |
| last_update_date | date/time | date and time of creation of last task |
| previous_session_id | numeric | session ID of the research or demo study taken previously, if applicable (same or different study); recorded only for studies taken in the same browser within 15 minutes of starting the new study |
| previous_session_schema | alpha | r = research site, private link, contract study; s = anything else (demo, featured task, mental health, international); null = none |
| referrer | alpha | url of site that referred user to Project Implicit |
| study_url | alpha | url of current study |
| user_agent | alpha | user’s browser |
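As a rough illustration of working with session_status, the sketch below maps the codes listed in the table to labels and filters sessions by a researcher-chosen set of "done" codes. The function names are hypothetical, and treating an empty string or the text 'null' as null is an assumption about how an export might represent missing values.

```python
# Status codes as listed in the session_status row; None stands for null.
STATUS_LABELS = {
    "C": "completed",
    None: "incomplete",
    "T": "tilt (misbehavior)",
    "P": "paused",
    "W": "withdrawn",
    "E": "expired",
    "I": "incomplete, continuation enabled",
    "D": "researcher-defined completion",
}

def describe_status(code):
    """Map a raw session_status value to a readable label."""
    # Assumption: exports may write null as an empty string or the text 'null'.
    if code in ("", "null"):
        code = None
    return STATUS_LABELS.get(code, "unknown code: %r" % (code,))

def completed_sessions(rows, done_codes=("C",)):
    """Keep rows whose session_status is one of done_codes."""
    return [row for row in rows if row.get("session_status") in done_codes]

# Made-up rows for illustration only.
rows = [
    {"session_id": 101, "session_status": "C"},
    {"session_id": 102, "session_status": ""},
    {"session_id": 103, "session_status": "D"},
]
kept = completed_sessions(rows, done_codes=("C", "D"))
```

Passing `done_codes` explicitly reflects the point that what counts as completed is up to the researcher.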

 

  2. Demographics file contains demographics from user registration
    1. Tall format organized by user_id

| variable name | variable type | variable meaning |
| --- | --- | --- |
| characteristic | alpha | demographics questions |
| value | alpha | item response |
| user_id | numeric | user ID (one per user) |
| study_name | alpha | name of study (from expt file) |
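Because the demographics file is tall (one row per user per characteristic), it usually needs to be reshaped to one row per user before merging. A minimal sketch follows, assuming made-up characteristic names (`birthyear`, `country`) and a hypothetical helper name:

```python
from collections import defaultdict

def demographics_to_wide(rows):
    """Turn tall (user_id, characteristic, value) rows into one dict per user."""
    wide = defaultdict(dict)
    for row in rows:
        # If a characteristic repeats for a user, the later row wins.
        wide[row["user_id"]][row["characteristic"]] = row["value"]
    # One record per user_id, with characteristics as columns.
    return [{"user_id": uid, **fields} for uid, fields in sorted(wide.items())]

# Made-up rows for illustration only.
tall = [
    {"user_id": 2, "characteristic": "birthyear", "value": "1985"},
    {"user_id": 1, "characteristic": "birthyear", "value": "1990"},
    {"user_id": 1, "characteristic": "country", "value": "CA"},
]
wide_rows = demographics_to_wide(tall)
```

Users who skipped a question simply lack that key in their record, so downstream code should use `.get()` rather than direct indexing.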

 

  3. SessionTasks file contains information about the assignment and order of tasks for coding experimental manipulations and counterbalances
    1. Tall format organized by session_id and user_id

| variable name | variable type | variable meaning |
| --- | --- | --- |
| session_id | numeric | session ID (one user can have many sessions) |
| task_number | numeric | order of task in full study (starts with 0) |
| task_id | alpha | name of task (from expt file) |
| task_url | alpha | url of task |
| user_agent | alpha | user’s browser |
| study_url | alpha | study url |
| task_status | alpha | presently junk variable; always ‘null’ |
| task_sequence | alpha | presently junk variable; always ‘null’ |
| task_creation_date | date/time | date and time (to the second) of creation of each task |
| user_id | numeric | user ID (one per user) |
| study_name | alpha | name of study (from expt file) |
| session_date | date/time | date and time of session |
| session_status | alpha | C = completed session; null = incomplete session |
| session_creation_date | date/time | date the session was created (default is same as session_date) |
| session_created_by | alpha | presently junk variable; always ‘YUIAT_RESEARCH_USER’ |
| session_last_update_date | date/time | date and time of creation of last task |
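One way to code a counterbalance from the sessionTasks file is to rebuild each session's task order from task_number and task_id. The sketch below is illustrative only: the task names (`consent`, `iat`, `survey`) and both function names are hypothetical, and a real study would use the task_id values from its own expt file.

```python
from collections import defaultdict

def task_order_by_session(rows):
    """Return {session_id: [task_id, ...]} sorted by task_number."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["session_id"]].append((int(row["task_number"]), row["task_id"]))
    return {sid: [task for _, task in sorted(pairs)] for sid, pairs in grouped.items()}

def iat_first(order, iat_task="iat", survey_task="survey"):
    """Hypothetical counterbalance code: True if the IAT preceded the survey."""
    # Raises ValueError if either task is missing from the session.
    return order.index(iat_task) < order.index(survey_task)

# Made-up rows for illustration only; task_number starts at 0.
rows = [
    {"session_id": 101, "task_number": "2", "task_id": "survey"},
    {"session_id": 101, "task_number": "0", "task_id": "consent"},
    {"session_id": 101, "task_number": "1", "task_id": "iat"},
    {"session_id": 102, "task_number": "0", "task_id": "consent"},
    {"session_id": 102, "task_number": "1", "task_id": "survey"},
    {"session_id": 102, "task_number": "2", "task_id": "iat"},
]
orders = task_order_by_session(rows)
```

Incomplete sessions will have shorter task lists, so a check like `iat_first` should only be applied to sessions known to contain both tasks.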

Study data storage files – user defined content

  1. Explicit file contains all explicit data and generic IAT feedback; it can be adapted to record almost any kind of data that requires a response
    1. Tall format organized by session_id
    2. General format – can be used for almost anything

| variable name | variable type | variable meaning |
| --- | --- | --- |
| task_number | numeric | order of questionnaire in full study |
| question_number | numeric | order of question within questionnaire (starts with 0) |
| questionnaire_name | alpha | name of questionnaire (from expt file) |
| question_name | alpha | name of question (from questionnaire file) |
| question_response | alpha | response to question (takes many formats) |
| attempt | numeric | default is 1; can use for participant feedback |
| study_name | alpha | name of study (from expt file) |
| session_id | numeric | session ID (one user can have many sessions) |
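The tall explicit file is typically reshaped to one row per session before analysis. The following is a sketch under assumptions, not a prescribed method: the column naming scheme (`questionnaire.question`), the questionnaire name `demo`, and the rule of keeping the highest attempt when a question appears more than once are all choices made for illustration.

```python
def explicit_to_wide(rows):
    """One dict per session_id; columns are named 'questionnaire.question'."""
    best = {}  # (session_id, column) -> (attempt, response)
    for row in rows:
        column = "%s.%s" % (row["questionnaire_name"], row["question_name"])
        key = (row["session_id"], column)
        attempt = int(row["attempt"])
        # Assumption: when a question repeats, keep the highest attempt.
        if key not in best or attempt >= best[key][0]:
            best[key] = (attempt, row["question_response"])
    wide = {}
    for (session_id, column), (_, response) in best.items():
        wide.setdefault(session_id, {"session_id": session_id})[column] = response
    return [wide[sid] for sid in sorted(wide)]

# Made-up rows for illustration only.
rows = [
    {"session_id": 101, "questionnaire_name": "demo", "question_name": "q1",
     "question_response": "3", "attempt": "1"},
    {"session_id": 101, "questionnaire_name": "demo", "question_name": "q1",
     "question_response": "5", "attempt": "2"},
    {"session_id": 101, "questionnaire_name": "demo", "question_name": "q2",
     "question_response": "1", "attempt": "1"},
    {"session_id": 102, "questionnaire_name": "demo", "question_name": "q1",
     "question_response": "2", "attempt": "1"},
]
wide_rows = explicit_to_wide(rows)
```

Responses stay as strings here because question_response takes many formats; converting to numbers is best done per question.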

 

  2. IAT file contains data from the IAT or any implicit measure, and can be adapted for many procedures
    1. Tall format organized by session_id
    2. Name is too confining; the file can be used for any behavioral task, including but not limited to implicit tasks, and easily records response latencies

| variable name | variable type | variable meaning |
| --- | --- | --- |
| block_number | numeric | order of blocks (starts with 0) |
| block_name | alpha | name of block (from iat xml file) |
| block_trial_count | numeric | number of trials in that block |
| block_pairing_definition | alpha | concepts and/or attributes in that block |
| study_name | alpha | name of study (from .expt file) |
| task_number | numeric | order of implicit task in full study |
| task_name | alpha | name of task (from iat xml file) |
| trial_number | numeric | order of trials within that block (starts with 0) |
| trial_name | alpha | stimulus (word or image name) for that trial |
| trial_response | alpha | category response that advanced that trial (correct response if correct response is required to advance) |
| trial_latency | numeric | reaction time for that response (in milliseconds) |
| trial_error | numeric | 0 = correct response; 1 = error |
| session_id | numeric | session ID (one user can have many sessions) |
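Since the iat file has one row per trial, a common first step is to summarize trials within each session and block. The sketch below computes only a mean latency and an error rate per block; it is not an IAT scoring algorithm, and the function name and sample rows are hypothetical.

```python
from collections import defaultdict

def block_summaries(rows):
    """Mean latency (ms) and error rate per (session_id, block_number)."""
    grouped = defaultdict(list)
    for row in rows:
        key = (row["session_id"], int(row["block_number"]))
        grouped[key].append((float(row["trial_latency"]), int(row["trial_error"])))
    summary = {}
    for key, trials in grouped.items():
        n = len(trials)
        summary[key] = {
            "n_trials": n,
            "mean_latency": sum(lat for lat, _ in trials) / n,
            "error_rate": sum(err for _, err in trials) / n,
        }
    return summary

# Made-up rows for illustration only; trial_error is 0 (correct) or 1 (error).
rows = [
    {"session_id": 101, "block_number": "0", "trial_latency": "600", "trial_error": "0"},
    {"session_id": 101, "block_number": "0", "trial_latency": "800", "trial_error": "1"},
    {"session_id": 101, "block_number": "1", "trial_latency": "500", "trial_error": "0"},
]
summary = block_summaries(rows)
```

Comparing `n_trials` against block_trial_count is a simple way to spot blocks that a participant did not finish.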