Subtypes and domains are geodatabase features worth knowing about, especially when it comes to data validation and the design of editing templates.
Knowing what they are and when to use them can mean the difference between a consistent, semi-automatic field data collection experience and database anarchy. We recently looked more closely at these two features when reviewing templates for collecting weed and other pest observations for a local council and for recording tree records for an arborist. Faced with hundreds of different weed species, we needed some way to shorten the list of possible values.
We had an inkling that subtypes and domains were the way to go but needed to apply a little lateral thinking. Every Esri example that appears in the help documentation seems to gravitate to either streets or water infrastructure (hence the lateral thinking). The story seems to go something like this:
“For example, the following codes in a subtype named RoadClass could represent valid classes in a feature class for streets:
0 – Local Streets
1 – Secondary Streets
2 – Main Streets”
Well, that’s great if you have three different classes of road, but how would that work when you have hundreds of classified values, such as weed and tree species? Fortunately, the council was already collecting a field called Weed Form, which indicated the growth form of the weed: shrub, vine, tree, grass or aquatic. These could then effectively serve as feature class subtypes.
We read through the Esri desktop help and various tutorials, which showed some great screenshots of creating subtypes and assigning default values and domains for each subtype. But it was only when we found a short and sweet discussion topic on one of the GIS chat sites that the penny really dropped: using subtypes means that you can use different domains for the same attribute field.
This little nugget is not readily apparent in the Esri help, or is perhaps not of great significance in building sewer networks, but it’s absolute gold when it comes to species lists.
This meant that for each weed form we could have a different species list or domain, e.g. AquaticSppDomain, which was significantly shorter than the original single domain holding every species. We could also arrange domains alphabetically or in whichever order makes the most sense to the field teams.
When we applied the subtypes, set up an edit template, assigned the default values and made use of our new domains, the result was a much more useful configuration of Collector.
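The idea can be sketched in plain Python, independent of any ArcGIS API. The integer subtype codes, the domain names other than AquaticSppDomain, and the handful of species listed are assumptions made up for illustration; they are not the council's actual lists.

```python
# Subtype codes for the Weed Form field (integer codes assumed for illustration).
WEED_FORM_SUBTYPES = {1: "Shrub", 2: "Vine", 3: "Aquatic"}

# One coded value domain per weed form (code -> description).
# All of them are attached to the SAME attribute field: Species.
SPECIES_DOMAINS = {
    "ShrubSppDomain": {"Lantana camara": "Lantana"},
    "VineSppDomain": {"Anredera cordifolia": "Madeira vine"},
    "AquaticSppDomain": {
        "Salvinia molesta": "Salvinia",
        "Eichhornia crassipes": "Water hyacinth",
    },
}

# The subtype decides which domain applies to the Species field.
DOMAIN_FOR_SUBTYPE = {1: "ShrubSppDomain", 2: "VineSppDomain", 3: "AquaticSppDomain"}


def species_choices(weed_form_code):
    """Return the short pick list a field worker would see for one weed form."""
    domain = SPECIES_DOMAINS[DOMAIN_FOR_SUBTYPE[weed_form_code]]
    return sorted(domain)  # alphabetical by code


def is_valid(weed_form_code, species_code):
    """A species is only valid if it belongs to the domain of the chosen subtype."""
    return species_code in species_choices(weed_form_code)


print(species_choices(3))
print(is_valid(1, "Salvinia molesta"))
```

Picking the weed form first narrows the species list to the matching domain, which is what keeps the pick list short in the field.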
A sample Genus-Species data model
To demonstrate and to get you started, the following steps describe setting up a data model for tree inspections. The model was used with ArcGIS for Windows Mobile but works just as well with Collector.
In this case, there are Genus and Species fields and we make use of subtypes and domains to deliver the required user experience.
The sample data model is attached as an Excel Tree Data Model. The domain data can be imported to your geodatabase using the Excel to Table tool followed by the Table to Domain tool, which creates the domains for the different Species. You can also manage domains manually via the workspace (geodatabase) Properties in ArcCatalog.
The first worksheet (tab), called Genus, has two columns (Code and Description). The Code column holds the subtype code, which is an integer value. Note that we have arranged the descriptions alphabetically against the code sequence, as this determines the order of the list of tree types and subsequently of the Table of Contents or Legend. You could use any order that makes sense to your field team and improves efficiency.
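If you prefer to script the import, the two tools can be chained per worksheet. The following is a rough sketch only: it assumes an ArcGIS Python environment where arcpy.conversion.ExcelToTable and arcpy.management.TableToDomain are available, and the workbook name, geodatabase path, sheet names and the "_lookup" table suffix are placeholders rather than the names used in the attached sample. The planning step is kept separate from the execution step so that the plan can be inspected without ArcGIS installed.

```python
def plan_domain_import(workbook, geodatabase, sheet_names):
    """List the geoprocessing calls needed to turn each worksheet into a domain."""
    steps = []
    for sheet in sheet_names:
        # Intermediate lookup table inside the geodatabase (hypothetical naming).
        table = f"{geodatabase}/{sheet}_lookup"
        # Step 1: worksheet -> geodatabase table.
        steps.append(("ExcelToTable", (workbook, table, sheet)))
        # Step 2: table -> coded value domain named after the worksheet.
        steps.append(("TableToDomain", (table, "Code", "Description", geodatabase, sheet)))
    return steps


def run_plan(steps):
    """Execute the plan; this only works where arcpy is installed."""
    import arcpy

    tools = {
        "ExcelToTable": arcpy.conversion.ExcelToTable,
        "TableToDomain": arcpy.management.TableToDomain,
    }
    for tool_name, args in steps:
        tools[tool_name](*args)


# Placeholder inputs; substitute your own workbook, geodatabase and tabs.
steps = plan_domain_import("TreeDataModel.xlsx", "Trees.gdb", ["Citrus", "Ficus"])
for tool_name, args in steps:
    print(tool_name, args)
```

Worksheet names containing spaces or other special characters would need cleaning before they are used as table names.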
A subtype is a subset of the features in a feature class, or of the objects in a table, that share the same attributes. Subtypes are used as a method to categorise your data. The only gotcha is that a subtype field is required, and it must be of an integer type (short or long). For more details on creating subtypes, read through the details here and remember that lateral thinking!
Tree Genus subtypes:
- 0 – None
- 1 – Acacia
- 2 – Annona
- 3 – Citrus
- 4 – Eucalyptus
- 5 – Ficus
- 6 – Pinus
- 7 – Ulmus
The subtype values and descriptions will need to be typed in via the feature class properties (Subtypes tab). For this exercise, you could just copy the values from the spreadsheet Genus tab or the list above. You may also want to define default values for other attributes in your feature class for each subtype (for instance the most common species within each Genus, or even a particular default height value for the species).
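As a rough illustration of how subtype defaults behave, here is a small plain-Python sketch. The Genus codes come from the list above; the default species chosen for Citrus and Ficus are hypothetical examples, not values from the sample data model.

```python
# Genus subtypes, exactly as listed above (code -> description).
GENUS_SUBTYPES = {
    0: "None", 1: "Acacia", 2: "Annona", 3: "Citrus",
    4: "Eucalyptus", 5: "Ficus", 6: "Pinus", 7: "Ulmus",
}

# Hypothetical per-subtype defaults for other fields in the feature class.
SUBTYPE_DEFAULTS = {
    3: {"Species": "Citrus limon"},
    5: {"Species": "Ficus carica"},
}


def new_tree(genus_code):
    """Create a new tree record pre-filled with the defaults of its subtype."""
    if genus_code not in GENUS_SUBTYPES:
        raise ValueError(f"Unknown Genus subtype code: {genus_code}")
    record = {"Genus": genus_code, "Species": None}
    record.update(SUBTYPE_DEFAULTS.get(genus_code, {}))
    return record


def is_alphabetical(subtypes):
    """Check that descriptions are alphabetical against the code sequence (code 0 excluded)."""
    names = [subtypes[code] for code in sorted(subtypes) if code != 0]
    return names == sorted(names)


print(new_tree(3))
print(is_alphabetical(GENUS_SUBTYPES))
```

A field worker who picks Citrus starts with the most likely species already filled in and only changes it when needed.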
The remaining worksheets (tabs) are Species names which are our Domains:
Tree Species domains:
For each species worksheet, there are again two columns (Code and Description). ArcGIS makes use of two domain types: range and coded value. Range domains apply to numeric and date fields and specify a minimum and a maximum allowed value. Coded value domains specify a set of valid values; the codes are typically of an integer or text type.
The domain Description column has been populated with the common name of the species, which can be useful for field teams who are not familiar with tree genera. If you wanted to list the tree species using the scientific taxonomy, you could replace the common name with the scientific name in the description, such that the Code and Description columns have the same values. You could also change the Code column to use integers instead of text, just as we did for the Tree subtypes above. However, for this sample, the domain code uses a Text type.
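The difference between the two domain types can be sketched as two small validators. This is a conceptual illustration in plain Python, not how ArcGIS implements domains, and the 0 to 100 metre height range is an assumed example value.

```python
from dataclasses import dataclass


@dataclass
class RangeDomain:
    """Valid values are anything between a minimum and a maximum (inclusive)."""
    minimum: float
    maximum: float

    def allows(self, value):
        return self.minimum <= value <= self.maximum


@dataclass
class CodedValueDomain:
    """Valid values are a fixed set of codes, each with a description."""
    codes: dict

    def allows(self, value):
        return value in self.codes


# Assumed example: tree height in metres must fall between 0 and 100.
tree_height_m = RangeDomain(0.0, 100.0)

# Text-coded domain: scientific name as the code, common name as the description.
citrus_species = CodedValueDomain({
    "Citrus aurantifolia": "Lime",
    "Citrus limon": "Lemon",
    "Citrus medica": "Citron",
    "Citrus reticulata": "Mandarin",
})

print(tree_height_m.allows(12.5))
print(citrus_species.allows("Citrus limon"))
```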
The following domain listing is for the Citrus species:
Citrus Species domains:
- Citrus aurantifolia – Lime
- Citrus limon – Lemon
- Citrus medica – Citron
- Citrus reticulata – Mandarin
You will notice that we put the Genus name as a prefix in the Code. This was done for recognition and readability, considering the Tree taxonomy (classification) has hundreds of names. Of course, you could use any value you like. More information about working with domains can be found here.
A note on Domains and workspaces
Also remember that domains are workspace-related, meaning that their namespace applies across the whole workspace (in this case the geodatabase), where they are visible to all feature classes. This is different to subtypes, which only apply to the feature class they are assigned to.
This means that when you have many (hundreds of) domains, you need to use a naming convention so that they do not clash. In this case, we use the Species name as the Domain name, since they are unique.
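Before loading a large batch of domains, a quick check for duplicate names can save some pain. The sketch below is plain Python with hypothetical domain names; treating names as case-insensitive is a cautious assumption on our part, not documented geodatabase behaviour.

```python
from collections import Counter


def find_clashes(domain_names):
    """Return the domain names that occur more than once, ignoring case."""
    counts = Counter(name.lower() for name in domain_names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical names: a tree model and a weed model sharing one geodatabase.
tree_domains = ["CitrusSpecies", "FicusSpecies", "Condition"]
weed_domains = ["AquaticSppDomain", "VineSppDomain", "condition"]

print(find_clashes(tree_domains + weed_domains))
```

Here the generic name Condition turns up in both models, which is exactly the kind of clash a naming convention is meant to prevent.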
Anyway, feel free to use this sample data model or develop your own data models where you may need one subtype with multiple domains!
Making use of the Attribute Assistant also helps minimise how much data must be collected in the field, by replacing the collection of some attribute fields with a simple desktop workflow, but that is a topic for another day…
Domains and Subtypes are really powerful components of your data model and hopefully you will start to see value in using them.
Happy data collecting,
Len and Geoffrey