OneRoster CSV Provider Template
  • 15 Aug 2022
  • 15 Minutes to read
  • Dark
    Light

OneRoster CSV Provider Template

  • Dark
    Light

Article Summary

CSV Formatting for OneRoster Providers

Note

A direct Excel template is available below.

The file format for each of the data files is a Comma Separated Values (CSV) format as specified in RFC 4180 with the extra restriction that carriage returns are not permitted within a field. Fields containing commas and double-quotes must be enclosed in double-quotes. If double-quotes are used to enclose a field, then a double-quote appearing inside the field must be escaped by preceding it with another double-quote.
The CSV files must be UTF-8 encoded RFC3629. Importing processors must tolerate BOM (Byte Order Mark) prefixes and ignore them. In a UTF-8 encoded file with a BOM, the BOM will appear as the 3-byte sequence 'EF BB BF' at the beginning of the file. If present, the CSV header will begin at the 4th byte of the file; if not present, the CSV header will begin at the 1st byte of the file.

Other key points are:

  • All header names and content values within the CSV file are case-sensitive. Incorrect case will result in conformance failure and should be logged as a detected error;
  • The Header row for each file is required;
  • The Header fields for a file must be supplied in the same order as defined in the tables below. If metadata extension fields are used, then they must be added to the right of the defined data fields in the specification, as the last set of columns. This ensures that the sequence of the defined data fields remains fixed irrespective of the presence or not of metadata fields;
  • Header field names must be uniquely named within a file;
  • Files that contain no data rows are NOT permitted;
  • For ALL fields that are optional and which have a defined enumeration token set, then the value of NULL/blank (i.e. a blank value "") is permitted to denote that NO data is supplied;
  • The support of ALL string-based data-types requires that a maximum length of at least 255 must be supported by implementations. If string lengths of greater than 255 are used then system may lose some part of the string without failing conformance.

When mapping from the data model OneRoster, 17a to the equivalent column headers the following special rules have been applied:

  • When referencing an object in the data model (using the data-type GUIDRef) the name has been extended using SourcedId e.g. 'org' to 'orgSourcedId';
  • When referring to a set of objects in the data model maps to a comma separated list, enclosed in quotation marks, under the corresponding column header (as per RFC 4180);
  • The references to resources in the Class and Course data models have NOT been mapped.

CSV import/export is envisioned to work in one of two ways, bulk and delta:

  • In BULK processing, the CSV files MUST NOT contain values in the "dateLastModified" and "status" fields. That is to say, for each row, there will be no data for either of these fields. Importing Systems MUST view this incoming data as the reference version of all data. That is to say, if records that were previously imported do not appear in this bulk csv file, then the importing system should internally mark those records as 'tobedeleted' with the 'dateLastModified' value set to the time of the bulk files import processing. If said records subsequently appear in a future bulk file, then those records should be updated and made active. When a set of Bulk files are created they must be semantically complete i.e. every object referenced by another object MUST also be defined in the corresponding source CSV file and included with the manifest.csv file;
  • In DELTA processing, the incoming CSV file MUST contain values in the "dateLastModified" and "status" fields. That is to say, each row in the CSV file should include data describing that (the row) data was last modified and its status. Importing systems must view this incoming delta file as changes to the reference data, and should make the appropriate changes, deletions and insertions of the new data.
    • Field Header - the name for the Column Header;
    • Required - denotes if data must be present. The meanings for this field are:
      • Yes for fields which are always required
      • Yes for Delta for fields which must be populated for Delta processing, and which must be blank for Bulk processing
      • No for fields which do not require values;
      • Format - the data-type for the data.
        • GUID denotes some form of globally unique identifier (this must be less than 256 characters). This is not restricted to the 128 bit UUID format
        • GUID Reference denotes the GUID of an object defined in some other CSV data file
        • List of GUID Reference denotes a comma-delimited list of GUID References of objects defined in another CSV data file
        • ID denotes an identifier defined outside of the OneRoster specification, e.g. identifiers generated by vendors for their resources in the Resources dataset
        • String denotes a sequence of characters that should follow the description. Generally this is aimed to be human-readable (e.g., "Science Lesson")
        • List of Strings denotes a sequence of Strings. The individual strings within the list must not contain commas, and the list is a comma separated list (e.g., "1,2,3"). If a List of Strings contains more than one element, then it must be double-quote encapsulated
        • Enumeration denotes a fixed set of values that will be referred to in the description. In the case of fields which are not required, an empty field denotes no value
        • Enumeration List denotes a list of Enumerations. The list is comma separated (e.g., "teacher,student,aide").
        • Float denotes a floating point number
        • DateTime denotes a timestamp format. DateTimes MUST be expressed in W3C profile of ISO 8601 and MUST contain the UTC timezone and MUST have a resolution of milliseconds (e.g., "2012-04-23T18:25:43.511Z").
        • Date denotes a date format. Dates MUST be expressed using ISO 8601 format (http://tools.ietf.org/html/rfc3339), more commonly formatted as "YYYY-MM-DD" e.g., "2002-04-23"
        • Year denotes a date format of year only. Years MUST be expressed using ISO 8601 format (http://tools.ietf.org/html/rfc3339), more commonly formatted as "YYYY" e.g., "2002"
      • Example/Description - an example of the expected data, and a brief statement of the nature of the content.

A number of important changes have been made from v1.0, namely:

  • The 'dateLastModified' in v1.1 had been changed from the v1.0 the resolution 'YYYY-MM-DD' to 'YYYY-MM-DDTHH:MM:SS.sssZ'. For backwards compatibility, a v1.0 format of 'YYYY-MM-DD' should be transformed to 'YYYY-MM-DDT23:59:59.999Z' in a v1.1 context;
  • In the 'status' field the enumeration value of 'inactive' has been removed. For compatibility with v1.0, a value of 'inactive' should be interpreted as 'tobedeleted';
  • When delta-processing information is being exchanged and records are marked 'tobedeleted' then all fields that have been identified as mandatory (with the exception of the object 'sourcedId') should be considered optional.

For CSV exchange, the data model can only be extended by adding new columns to the end of the defined set of columns. A suitable naming convention should be used for the new header names. An example is: metadata.[orglabel].[namepart]
Where:
metadata is the initial set of characters for all extensions;
[orglabel] is the label used to identify the organization that has created the extension. In the case of an extension from IMS Global this field will be empty and one of the delimiters removed;
[namepart] the unique header name part allocated by the organization for the data model.
Examples are: 'metadata.hmh.homeemail', 'metadata.pearson.namesuffix', etc.

academicSessions.csv Data Headers

A quick explanation of the different columns expected in the academicSessions.csv tab.

Column Field HeaderRequiredFormatDescription
sourcedIdYesGUIDSourcedId of this academicSession
statusYes for DeltaEnumerationSee section 4.13.8 (OneRoster, 20a) for the enumeration list. This must NOT be used for Bulk mode.
dateLastModifiedYes for DeltaDateTimeThe date that this record was last modified. This must NOT be used for Bulk mode.
titleYesStringName or title of the academic Session
startDateYesDateInclusive start date for the academic session
endDateYesDateExclusive end date for the academic session
parentSourcedIdNoGUID ReferenceSourcedId of the parent of this academic session
schoolYearYesYearThe school year for which the academic session contributes. This year should be that in which the school year ends. (Format is: YYYY)

classes.csv Data Headers

A quick explanation of the different columns expected in the classes.csv tab.

Column Field HeaderRequiredFormatDescription
sourcedIdYesGUIDUnique ID for the class. SourcedId is used in other files and must be unique across classes.
statusYes for DeltaEnumerationSee section 4.13.8 (OneRoster, 20a) for the enumeration list. This must NOT be used for Bulk mode.
dateLastModifiedYes for DeltaDateTimeThe date that this record was last modified. **This must NOT be used for Bulk mode. **
titleYesStringName of this class.
gradesNoList of StringsGrade(s) for which the class is attended. The permitted vocabulary is from CEDS (Version 5) and the Entry Grade Level element.
courseSourcedIdYesGUID ReferenceSourcedId of the course of which this class is an instance.
classCodeNoStringHuman readable code used to help identify this class.
classTypeYesEnumerationSee section 4.13.1 (OneRoster, 20a) for the enumeration list.
locationNoStringHuman readable description of where the class is physically located.
schoolSourcedIdYesGUID ReferenceSourcedId of the Org that teaches this class of OrgType "school"
termSourcedIdsYesList of GUID ReferenceSourcedId of the terms (the academicSessions) in which the class is taught.
subjectsNoList of StringsSubject name(s) in human readable form. If the 'subjectCodes' attribute is present, then the subjects and subjectCodes list must have the same length and have order significance. The vocabulary is from SCED (School Codes for the Exchange of Data) (Version 4) for the "Course Title" field: http://nces.ed.gov/forum/SCED.asp. If the value of the Course Title contains commas, then those commas must be removed. For example, the "Course Title" for "SCED Course Code" "03210" is "Science, Technology, and Society", this must be represented as "Science Technology and Society".
subjectCodesNoList of StringsSubject code(s) in machine readable form. If more than one subject code is needed, use double quotes and separate with commas (per RFC 4180). If the 'subjects' attribute is present, the two lists must have the same length and have order significance. For deployments in the USA, this vocabulary should be a School Courses for the Exchange of Data (SCED) code: [https://nces.ed.gov/scedfinder] (https://nces.ed.gov/scedfinder){target="_blank"}.
periodsNoList of StringsThe time slots in the day that the class will be given. If more than one period is needed, use double quotes and separate with commas (per RFC 4180). Examples: 1; "1,3,5".

courses.csv Headers

A quick explanation of the different columns expected in the courses.csv tab.

Column Field HeaderRequiredFormatDescription
sourcedIdYesGUIDUnique ID for the course
statusYes for DeltaEnumerationSee section 4.13.8 (OneRoster, 20a) for the enumeration list. This must NOT be used for Bulk mode.
dateLastModifiedYes for DeltaDateTimeThe date that this record was last modified. This must NOT be used for Bulk mode.
schoolYearSourcedIdNoGUID ReferenceSourcedId of an AcademicSession with type of "schoolYear"
titleYesStringName of the course
courseCodeNoStringHuman readable course code
gradesNoList of StringsGrade(s) for which the class is attended. The permitted vocabulary is from CEDS (Version 5) for the Entry Grade Level Element https://ceds.ed.gov/elements.aspx
orgSourcedIdYesGUID ReferenceSourcedId of an org to which this course belongs
subjectsNoList of StringsSubject name(s) in human readable form. If the 'subjectCodes' attribute is present, then the subjects and subjectCodes list must have the same length and have order significance. The vocabulary is from SCED (School Codes for the Exchange of Data) (Version 4) for the "Course Title" field: https://nces.ed.gov/scedfinder. If the value of the "Course Title" contains commas, then those commas must be removed. For example, the "Course Title" for "SCED Course Code" "03210" is "Science, Technology, and Society", this must be represented as "Science Technology and Society".
subjectCodesNoStringSubject code(s) in machine readable form. If the 'subjects' attribute is present, then the subjects and subjectCodes lists must have the same length and have order significance. For deployments in the USA, this vocabulary should be a School Courses for the Exchange of Data (SCED) code: https://nces.ed.gov/scedfinder

demographics.csv Headers

Demographics are optional, so data providers do not necessarily need to provide this information. If they do, this is the format to use.
The dependencies of this file on other files when supporting bulk processing are:

  • This requires a reference to the associated user (user) using the 'sourcedId' attribute. This produces a dependency on the users.csv. This creates a corresponding dependency on the orgs.csv file.
Column Field HeaderRequiredFormatDescription
sourcedIdYesGUID ReferenceSourcedId of the User to which the demographics refer. Typically this will be a student. Each user can only have one demographics record.
statusYes for DeltaEnumerationSee section 4.13.8 (OneRoster, 20a) for the enumeration list. This must NOT be used for Bulk mode.
dateLastModifiedYes for DeltaDateTimeThe date that this record was last modified. This must NOT be used for Bulk mode.
birthDateNoDateUser's date of birth
sexNoEnumerationSee section 4.13.8 (OneRoster, 20a) for the enumeration list.
americanIndianOrAlaskaNativeNoEnumerationPermitted values: { "true" or "false" }
asianNoEnumerationPermitted values: { "true" or "false" }
blackOrAfricanAmericanNoEnumerationPermitted values: { "true" or "false" }
nativeHawaiianOrOtherPacificIslanderNoEnumerationPermitted values: { "true" or "false" }
whiteNoEnumerationPermitted values: { "true" or "false" }
demographicRaceTwoOrMoreRacesNoEnumerationPermitted values: { "true" or "false" }
countryOfBirthCodeNoStringCountry where the user was born. The permitted vocabulary is from CEDS (Version 5) for the "Country of Birth" element: https://ceds.ed.gov/elements.aspx
stateOfBirthAbbreviationNoStringState where the user was born. The permitted vocabulary is from CEDS (Version 5) for the "State of Birth" element: https://ceds.ed.gov/elements.aspx
cityOfBirthNoStringCity where the user was born.
publicSchoolResidenceStatusNoStringAn indication of the location of the user's legal residence relative to (within or outside) the boundaries of the public school attended and its administrative unit. The permitted vocabulary is from CEDS (Version 5) for the "Public School Residence Status" element https://ceds.ed.gov/elements.aspx

enrollments.csv Headers

A quick explanation of the different columns expected in the enrollments.csv tab.

Column Field HeaderRequiredFormatDescription
sourcedIdYesGUIDUnique identifier of this enrollment.
statusYes for DeltaEnumerationSee section 4.13.8 (OneRoster, 20a) for the enumeration list. This must NOT be used for Bulk mode.
dateLastModifiedYes for DeltaDateTimeThe date that this record was last modified. This must NOT be used for Bulk mode.
classSourcedIdYesGUID ReferenceSourcedId of the Class.
schoolSourcedIdYesGUID ReferenceSourcedId of an Org with type 'school'.
userSourcedIdYesGUID ReferenceSourcedId of the User.
primaryNoEnumerationPermitted values: { "true" or "false" }
beginDateNoDateThe start date for the enrollment. This date must align with the associated academic session (term) identified for this class.
endDateNoDateThe end date for the enrollment (exclusive). This date must align with the associated academic session (term) identified for this class.

orgs.csv Headers

A quick explanation of the different columns expected in the orgs.csv tab.

Column Field HeaderRequiredFormatDescription
sourcedIdYesGUIDUnique ID for the organization. SourcedId is used in other files and must be unique across all organizations.
statusYes for DeltaEnumerationSee section 4.13.8 (OneRoster, 20a) for the enumeration list. This must NOT be used for Bulk mode.
dateLastModifiedYes for DeltaDateTimeThe date that this record was last modified. This must NOT be used for Bulk mode.
nameYesStringName of the organization.
typeYesEnumerationSee section 4.13.8 (OneRoster, 20a) for the enumeration list.
identifierNoStringNCES ID (National Center for Education Statistics) for the school district.
parentSourcedIDNoGUID ReferencesourcedId of an Org representing the parent organization

users.csv Headers

A quick explanation of the different columns expected in the users.csv tab.

Column Field HeaderRequiredFormatDescription
sourcedIdYesGUIDUnique ID for the user. SourcedId is used in other files and must be unique across all users.
statusYes for DeltaEnumerationSee section 4.13.8 (OneRoster, 20a) for the enumeration list. This must NOT be used for Bulk mode.
dateLastModifiedYes for DeltaDateTimeThe date that this record was last modified. This must NOT be used for Bulk mode.
enabledUserYesEnumerationPermitted values: { "true" or "false" }. "false" denotes that the user is an active record but system access is curtailed according to the local administration rules.
orgSourcedIdsYesList of GUID ReferencesSourcedIds of the Orgs to which this user belongs. Note that in most cases, it is expected that users will belong to a single school.
roleYesEnumerationSee section 4.13.5 (OneRoster, 20a) for the enumeration list.
usernameYesStringUser name
userIdsNoList of StringsExternal machine-readable ID (e.g. LDAP id, LTI id) for this user. The ID must be accompanied by a type to indicate the nature of the Identifier. The Type and ID values are enclosed in '{}' with a colon used to separate the values. If more than one userId is needed, use double quotes, and separate with commas (per RFC 4180). Examples: {LDAP:Id}"{LDAP:Id},{LTI:Id},{Fed:Id}"
givenNameYesStringUser's first name
familyNameYesStringUser's surname
middleNameNoStringUser's middle name(s). If more than one, they are separated by space.
identifierNoStringIdentifier for the user with a human readable meaning.
emailNoStringEmail address for the user.
smsNoStringSMS address for the user.
phoneNoStringPhone number for the user.
agentSourcedIdsNoList of GUID ReferencesSourcedIds of the users to which the user has a relationship. If multiple IDs are required, then use double quotes and separate with commas. Note: In most cases, this will be for indicating parental relationships.
gradesNoStringGrade(s) for which a user with the role "student" is enrolled. The permitted vocabulary is from CEDS (Version 5) for the "Entry Grade Level" element: https://ceds.ed.gov/elements.aspx.
passwordNoStringThe password for the user. This may or may not be an encrypted string. If encrypted, the systems processing must be aware of the encryption method.

Template Download

Download an Excel Template to begin your OneRoster journey in RapidIdentity Studio.

Studio CSV Provider File Template.xlsx

Note

Each worksheet will need to be saved as a .csv file for the OneRoster standard functionality.


Was this article helpful?