Data Exports
Zooniverse projects provide a large amount of data to research teams. These data can be exported from the Data Export tab on a project's Lab page.
Classification export
This csv file has one row for every classification submitted for a project. This file has the following columns:
classification_id: A unique ID number assigned to each classification
user_name: The name of the user that submitted the classification. Non-logged-in users are assigned a unique name based on (a hashed version of) their IP address.
user_id: User ID number is provided for logged-in users
user_ip: A hashed version of the user's IP address (original IP addresses are not provided for privacy reasons)
workflow_id: The ID number for the workflow the classification was made on
workflow_name: The name of the workflow
workflow_version: The major and minor workflow version for the classification
created_at: The UTC timestamp for the classification
gold_standard: Identifies if the classification was made on a gold standard subject
expert: Identifies if the classification was made in "expert" mode
metadata: A JSON blob containing additional metadata about the classification (e.g. browser size, browser user agent, classification duration, etc.)
annotations: A JSON blob with the annotations made for each task in the workflow. The exact shape of this blob is dependent on the shape of the workflow.
subject_data: A JSON blob with the metadata associated with the subject that was classified. The exact shape of this blob is dependent on the metadata uploaded to each subject
subject_ids: The ID number for the subject classified
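Because metadata, annotations, and subject_data are stored as JSON text inside csv cells, they usually need decoding before analysis. The following is a minimal sketch using only the Python standard library. The function name read_classifications and the sample row are hypothetical and made up for illustration; real exports contain all of the columns listed above, and the JSON shapes depend on the project.

```python
import csv
import io
import json

# Columns that the classification export documents as JSON blobs.
JSON_COLUMNS = ("metadata", "annotations", "subject_data")


def read_classifications(fileobj):
    """Yield one dict per classification row, with the JSON columns decoded."""
    for row in csv.DictReader(fileobj):
        for column in JSON_COLUMNS:
            row[column] = json.loads(row[column])
        yield row


# Made-up sample data (assumption): a header and one row, written with the
# csv module so that the embedded JSON is quoted correctly.
sample = io.StringIO()
writer = csv.writer(sample)
writer.writerow(["classification_id", "workflow_id", "metadata",
                 "annotations", "subject_data", "subject_ids"])
writer.writerow(["1", "10",
                 json.dumps({"user_agent": "example"}),
                 json.dumps([{"task": "T0", "value": 0}]),
                 json.dumps({"filename": "a.jpg"}),
                 "100"])
sample.seek(0)

rows = list(read_classifications(sample))
print(rows[0]["annotations"])
```

For a downloaded export, the same function can be given a file opened with open(path, newline="", encoding="utf-8"), where path is wherever the export was saved.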
Subject export
This csv file has one row for every subject uploaded to a project. This file has the following columns:
subject_id: A unique ID number assigned to each subject as they are uploaded
project_id: The ID number for the project
workflow_id: The workflow ID the subject is associated with
subject_set_id: The ID of the subject set the subject is connected to
metadata: A JSON blob with the subject's metadata
locations: A JSON blob with the URL to each frame of the subject
classifications_count: How many users have classified the subject
retired_at: If the subject is retired this is the UTC timestamp for when it was retired
retirement_reason: The reason why it was retired
created_at: The UTC timestamp for the creation of the subject
updated_at: The UTC timestamp for the latest update to a subject
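A common first step with the subject export is to decode the JSON columns and separate retired subjects from active ones. The sketch below assumes that an empty retired_at cell means the subject has not been retired; that convention is an assumption, not something stated above. The function name read_subjects and the sample rows are hypothetical.

```python
import csv
import io
import json


def read_subjects(fileobj):
    """Yield subject rows with JSON columns decoded and a 'retired' flag added."""
    for row in csv.DictReader(fileobj):
        row["metadata"] = json.loads(row["metadata"])
        row["locations"] = json.loads(row["locations"])
        row["classifications_count"] = int(row["classifications_count"])
        # Assumption: an empty retired_at cell means the subject is still active.
        row["retired"] = bool(row["retired_at"].strip())
        yield row


# Made-up sample data (assumption): two subjects, one retired and one active.
sample = io.StringIO()
writer = csv.writer(sample)
writer.writerow(["subject_id", "metadata", "locations",
                 "classifications_count", "retired_at", "retirement_reason"])
writer.writerow(["100", json.dumps({"filename": "a.jpg"}),
                 json.dumps({"0": "https://example.org/a.jpg"}),
                 "15", "2020-01-01 00:00:00 UTC", "classification_count"])
writer.writerow(["101", json.dumps({"filename": "b.jpg"}),
                 json.dumps({"0": "https://example.org/b.jpg"}),
                 "3", "", ""])
sample.seek(0)

subjects = list(read_subjects(sample))
retired_ids = [s["subject_id"] for s in subjects if s["retired"]]
print(retired_ids)
```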
Workflows export
This csv file has the information for every major version of a workflow. This file has the following columns:
workflow_id: The ID number for the workflow
display_name: The display name for the workflow
version: The major version number
active: true if the workflow is active
classifications_count: How many classifications have been made on the workflow
pairwise: true if selection behavior is set to compare subjects against each other (not typically used)
grouped: true if selection behavior is set to select subjects by set (not typically used)
prioritized: true if selection behavior shows subjects in a given order (not typically used)
primary_language: The language code for the workflow
first_task: The task key for the first task
tutorial_subject_id: A default subject linked to the tutorial (not typically used)
retired_set_member_subjects_count: The number of retired subjects from the workflow
tasks: A JSON blob showing the full workflow structure
retirement: The retirement rules for the workflow
aggregation: Information passed to downstream aggregation services (deprecated)
strings: A JSON blob containing all the text associated with the workflow
minor_version: The minor version number
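Since the workflows export can contain several rows for the same workflow_id (one per major version), it is often useful to keep only the most recent row for each workflow. The sketch below assumes that version and minor_version hold integers and that the tasks cell is valid JSON; the function name latest_workflow_versions and the sample rows are hypothetical.

```python
import csv
import io
import json


def latest_workflow_versions(fileobj):
    """Return {workflow_id: row} keeping the highest (version, minor_version)."""
    latest = {}
    for row in csv.DictReader(fileobj):
        # Assumption: both version columns contain plain integers.
        key = (int(row["version"]), int(row["minor_version"]))
        current = latest.get(row["workflow_id"])
        if current is None or key > current[0]:
            row["tasks"] = json.loads(row["tasks"])
            latest[row["workflow_id"]] = (key, row)
    return {workflow_id: row for workflow_id, (_, row) in latest.items()}


# Made-up sample data (assumption): workflow 10 has two major versions.
sample = io.StringIO()
writer = csv.writer(sample)
writer.writerow(["workflow_id", "display_name", "version",
                 "first_task", "tasks", "minor_version"])
writer.writerow(["10", "Count animals", "1", "T0",
                 json.dumps({"T0": {"type": "single"}}), "2"])
writer.writerow(["10", "Count animals", "2", "T0",
                 json.dumps({"T0": {"type": "multiple"}}), "5"])
writer.writerow(["11", "Mark features", "1", "T0",
                 json.dumps({"T0": {"type": "drawing"}}), "1"])
sample.seek(0)

workflows = latest_workflow_versions(sample)
print(workflows["10"]["version"], workflows["10"]["tasks"])
```

Comparing the (version, minor_version) pair as a tuple of integers avoids the pitfall of string comparison, where "10" would sort before "9".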