Two Guys and a Toolkit - Week 4: Grouping

Oct 8, 2015





Welcome back for part four of our series dedicated to building a simple pipeline using Toolkit.

Last week, we talked a lot about publishing and how various types of data flow down the pipe in our simple pipeline. This laid the foundation for future discussions about new features that do not come out of the box with Toolkit. We will revisit the topic of publishing in future posts when we talk about larger potential features. We also plan to devote some time to discussing the more theoretical and philosophical aspects of publishing and digital asset management. As we move into that realm it will be important for us to have an open dialog with all of you. These topics will be less about how Josh and I did something and more about what could be done, and which approaches to asset management and publishing work best in which situations. It would be great if everyone could start thinking about this now. If you have thoughts on how we should approach this or what points we should be sure to hit, please let us know!

This week, I’ve been working on a grouping mechanism for published files and have a working proof of concept implementation that I will share. We’ll discuss the bits and pieces, but also the potential uses for such a feature.

Below are a few of the pages that we found useful for this week’s post:

All About Fields
Query Fields
Publishing and Sharing your Work
Load Published Files
App and Engine Config Reference: Hooks

Grouping Published Files

Put simply, a group of published files is a collection of multiple PublishedFile entities represented by a single item in Shotgun. From a user’s perspective, using the loader app to reference or import this group would result in ALL of the group’s contents being referenced or imported.

The grouping of published files is not a concept that is native to Shotgun, but it is something that can be added. It requires a number of small changes, and a truly-flexible implementation would take a good bit of thought and additional development beyond what I’ve put into the proof of concept that we will be discussing.

Why Group Publishes?

We should talk about why we would want to do this. What are the advantages of being able to group published files together?

The problems we can solve with this depend on which area of the pipeline we apply it to. Below are a few example use cases, but I’m sure all of you can come up with many more. If you’re using something similar in your studio, or even if you have an idea of how it could be used that I’ve not covered, let us know!

Rendered Elements:

Publishing rendered elements from lighting to be used by a compositor can often produce a large number of published files. Grouping these published files before they flow down the pipe can allow the tools to logically structure these elements in a way that informs other code, like the routine that imports a published image sequence into Nuke, on how these elements fit together. Add to that the ability to store some sort of metadata file (or even a pre-built Nuke script?) as part of the group and it’s easy to see how quite a bit of information about the collection of elements can be gathered and sent along the way.

Another advantage to this sort of organization of published files is that we have a point in time when we know that a set of files is intended to be used together. We can group those compatible published elements so that when compositors load them into their session they know that they are getting everything, and that each element should be compatible with all of the others.

Another possibility would be to group in the camera(s) used to render the elements, along with any Alembic caches. This would help the compositor get everything they need into their Nuke script required for 3D compositing, and guarantee that they are using the same camera and geometry caches that the lighter or FX artist used to produce the rendered elements.

Look Development:

An Asset is made up of a number of components by the time it is ready for use in a shot. This typically includes a model, texture maps, shaders, and a rig. These components come together to make the logical whole of the Asset, and a change to one component often requires updates to one or more of the others. For example, if a model change also requires the UV layout to change, then the texture maps produced for the previous version of the model might need to be tweaked before they can be used on the new model. Similarly, topology changes might require rigging to adapt their work to the new model.

Given that at some point along the way we know what texture maps and rig pair properly with a specific version of the model, we could group those published files together. We would have a nice package of files that we know are meant to be used together.

Geometry Caches:

We discussed last week publishing multiple Alembic caches out of a scene file rather than a single cache containing the entire scene. One situation where that can be used to our advantage is when multiple resolutions of a mesh are exported and published from an Asset’s rig. It’s fairly common to build more than one resolution of geometry into a rig, as this allows background characters (or, even more importantly, crowds) to be made up of lower-density geometry than foreground, hero characters. It is also typical for these different geometry resolutions to be incorporated into a single rig, as it allows a rigging team to develop and maintain one rig rather than duplicating effort across multiple rigs for the same character.

This setup lends itself well to exporting an Alembic cache per resolution of the character. In this way, in Maya we would end up with a cache reference per resolution of Asset, and we could provide a tool that unloads/reloads those references when the user requests a specific resolution of the Asset to be used.

As for how grouping comes into play, the idea would be to bundle up all of the cache resolutions that were published for an Asset and provide a single group that has linked to it each of those caches as children. When a user loads the group into their scene, that group flattens out to a component list of published files and each one is referenced into the scene accordingly.
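The flattening step described above can be sketched in a few lines of Python. This is only an illustration, not the code from my proof of concept: it assumes each publish is a plain dictionary with an id, and that a group carries its children as nested dictionaries under a hypothetical sg_children key. Real Shotgun query results would only contain entity links, so a production version would need to look up each child as it goes.

```python
def flatten_group(publish, children_field="sg_children"):
    """Return the non-group publishes reachable from `publish`, depth first.

    `children_field` is a hypothetical field name used for illustration.
    """
    flat = []
    seen = set()

    def visit(item):
        # Skip anything already visited, so a publish that is linked twice
        # (or a cyclic link) is only loaded once.
        if item["id"] in seen:
            return
        seen.add(item["id"])

        children = item.get(children_field) or []
        if not children:
            # No children: this is a regular publish to be loaded.
            flat.append(item)
            return

        for child in children:
            visit(child)

    visit(publish)
    return flat
```

A loader action could then iterate over the returned list and reference each publish exactly as it would for a single, ungrouped publish.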


The basic implementation of groups as I have built them is simple. I added a children field to the PublishedFile entity in Shotgun. This field is configured to take a list of other PublishedFile entities, which are then considered to be its children. It’s simple and flexible, and creating new fields is a piece of cake.
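To give a rough idea of what populating that field looks like, here is a minimal sketch. It assumes sg is a Shotgun Python API connection (anything with the API's update method will do), and that the custom field ended up with the internal name sg_children, which is an assumption on my part. Multi-entity fields take a list of entity link dictionaries, each with a type and an id.

```python
def entity_links(publishes):
    """Reduce publish dictionaries to the type/id links Shotgun expects."""
    return [{"type": "PublishedFile", "id": p["id"]} for p in publishes]


def link_children(sg, group_id, child_publishes, children_field="sg_children"):
    """Set the children of the group publish with id `group_id`.

    `sg` is assumed to be a Shotgun API connection. `children_field` is a
    hypothetical field name; use whatever the custom field is called on
    your site.
    """
    data = {children_field: entity_links(child_publishes)}
    return sg.update("PublishedFile", group_id, data)
```

Taking the connection as an argument keeps the function easy to test with a stand-in object, without needing a live Shotgun site.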

This led to some difficulty, however. I wanted to add a couple more fields related to child entities and figured I would be able to make that happen with query fields. I was mostly right about that, and was able to get one of the two working after some frustration. I wanted to create is_group and child_count fields. The former comes in handy if you want a quick yes/no answer on whether something is a group, and the latter is more useful in the Shotgun web interface, as it’s a visual indicator of how many children a group has. I wanted is_group to end up as a boolean field, but I was not able to figure out how to make that work as a query field. I’m not saying it isn’t possible, only that I got frustrated and moved on before figuring it out. As for child_count, I did get it to work, as you can see:

You can also see that it took me 36 publishes to get one that was completely correct.

I got it to work, but I honestly don’t know why or how. Below is what I did to make it work, but I couldn’t for the life of me describe why that gives me the correct behavior. I just tried stuff until it did what I wanted it to do, but the words in that query field configuration dialog make little or no sense to me. I’m sure there’s a logical structure there, but to me it is nearly inscrutable.

I don't know what this means.

Initially, I purposely did not speak with an expert about the hows and whys related to query fields. I figured I would take a crack at it the way that I normally would have when working for a studio and see how it went. Randomly flailing about until it works is a time-honored tradition of mine, but it’s obviously not the ideal way to have to learn something.

Since then, I’ve had the opportunity to take a second look at this, and also got some feedback from some of the team. The general consensus seems to be that it would be best to not use query fields for this sort of thing at all. Instead, it would be better to make them normal fields and have the publish routine populate them at the time the group is created. This is simple and also bypasses the limitations of query fields: you can’t filter or sort on them, and they’re not accessible via the Python API.
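If the two fields become normal fields, the publish routine only needs to compute their values when it creates the group. A minimal sketch of that calculation follows. The field names sg_is_group and sg_child_count are assumptions for illustration (a checkbox field and a number field, respectively), not names from my configuration.

```python
def derived_group_fields(child_publishes):
    """Build the values for the (hypothetical) is_group and child_count fields.

    The returned dictionary is meant to be merged into whatever data the
    publish routine already sends to Shotgun when it creates the group.
    """
    count = len(child_publishes)
    return {
        "sg_is_group": count > 0,   # assumed checkbox (boolean) field
        "sg_child_count": count,    # assumed number field
    }
```

Because these are ordinary fields, they can be filtered and sorted on, and they come back from Python API queries like any other field.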

Code and Configuration:

From a code and configuration standpoint, there were a few hoops to jump through. My first thought was to make use of the publish app’s post_publish hook to build the groups. It looked great, because it is already handed a list of secondary publish tasks, which is exactly the list of things that I want to group. There were problems, though, as the secondary tasks did not come with the accompanying published file records from Shotgun, and I didn’t want to have to go to the database and look that stuff up again when I knew that the data was already available in the secondary_publish hook. What I ended up doing was taking the data returned from the publish routine and shoving it into the secondary publish item. Since that item is a reference to the same dictionary that is passed to the post_publish hook, I was able to save myself a trip to the database. You can see that here and how I extracted and used it here.
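The trick relies on nothing more than Python passing dictionaries by reference. The sketch below strips away the real hook classes and signatures to show just that idea. The function names, the register callable, and the sg_publish_data key are all made up for illustration; the only structure taken from the app is that each task is a dictionary holding an item dictionary.

```python
def secondary_publish(tasks, register):
    """Stand-in for the secondary publish hook.

    `register` is a hypothetical callable that publishes one task and
    returns the resulting PublishedFile dictionary.
    """
    for task in tasks:
        sg_publish = register(task)
        # Stash the result on the item. The same dictionary object is seen
        # later by the post-publish step, so no second lookup is needed.
        task["item"]["sg_publish_data"] = sg_publish


def post_publish(secondary_tasks):
    """Stand-in for the post-publish hook: collect the stashed publishes."""
    return [
        task["item"]["sg_publish_data"]
        for task in secondary_tasks
        if "sg_publish_data" in task["item"]
    ]
```

The membership check in post_publish keeps the sketch tolerant of tasks that were skipped or failed before anything was stashed on them.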

There’s another problem, as well, which is that in the official release of the publish app, the post_publish hook does not receive the same arguments as the other publish hooks.

This is most unfortunate.

Since I was planning to publish something from this hook I needed more than I was given, which exposes a bit of an unfortunate circumstance. Hooks are intended to allow for customization without the need to take ownership of an entire app. This is fairly successful, but with this I ran into something that required that I take control of the app itself, because I needed to change what a hook receives.

What I’ve done is fork tk-multi-publish, which can be found here. The changes are minor, and all of it is simply to provide more bits of data to the post-publish hook. The specific commit for these changes can be found here.

A more flexible solution might be to use the parent application object to store a dictionary of data that can be shared between hooks. That would allow one hook to store something away that another executed later could make use of. We will be discussing this as a team very soon, as it’s a situation that comes up often and it would be good to have a general-purpose solution for it.
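As a thought experiment, that shared dictionary could look something like the sketch below. None of this exists in Toolkit as written here: FakeApp, shared_hook_data, and both hook classes are hypothetical stand-ins. The only thing borrowed from Toolkit is the convention that a hook can reach the app that owns it through its parent attribute.

```python
class FakeApp(object):
    """Hypothetical stand-in for the publish app that owns the hooks."""

    def __init__(self):
        # Scratch space that lives as long as the app instance does.
        self.shared_hook_data = {}


class SecondaryPublishHook(object):
    def __init__(self, parent):
        self.parent = parent

    def execute(self, publishes):
        # Store a copy of the results for any hook that runs later.
        self.parent.shared_hook_data["secondary_publishes"] = list(publishes)


class PostPublishHook(object):
    def __init__(self, parent):
        self.parent = parent

    def execute(self):
        # Read what the earlier hook left behind, tolerating its absence.
        stored = self.parent.shared_hook_data.get("secondary_publishes", [])
        return [publish["id"] for publish in stored]
```

With something like this in place, neither hook's arguments would need to change, which is what forced the fork in the first place.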

As for configuration changes, I had to update several small things. You’ll notice in this commit in my forked publish app that there are two new keys added to the app’s info.yml file: one specifies whether to group a secondary output type’s publishes, and the other specifies the name of that group should one be created. The rest of the configuration changes can be found here and are very straightforward.

I know that I could have avoided altering the app itself by performing the grouping operation in the secondary publish hook, because it would have had access to all of the data that I needed already. Had I been doing this as a TD at a studio that’s exactly what I would have done, but in this case it seemed like a good example of the limitations of how publish hooks work.

Manifests and Metadata:

You might have noticed that the screenshot earlier in this post showing my group in the Shotgun web interface lists the path as a JSON file.

If you did then you also probably noticed the template that was added to templates.yml. In my hacked-together implementation, all I’ve done is shove the children of the group into a JSON file and used that as the path for my group.
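A manifest along those lines takes very little code. The sketch below is not the exact layout my implementation writes. It assumes each child is a dictionary with an id and a plain string path, and it invents a top-level children key for the file.

```python
import json


def write_group_manifest(manifest_path, child_publishes):
    """Write a JSON manifest listing a group's children and return it.

    The layout (a top-level "children" list of id/path pairs) is an
    assumption made for this example.
    """
    manifest = {
        "children": [
            {"id": publish["id"], "path": publish["path"]}
            for publish in child_publishes
        ]
    }
    with open(manifest_path, "w") as handle:
        json.dump(manifest, handle, indent=4, sort_keys=True)
    return manifest


def read_group_manifest(manifest_path):
    """Return the list of children recorded in a manifest file."""
    with open(manifest_path) as handle:
        return json.load(handle)["children"]
```

Recording the paths in the file means a tool could resolve the group's contents without querying Shotgun at all, at the cost of the manifest going stale if a child is ever moved.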

This could just as easily contain nothing or anything deemed useful. Josh also had the idea for groups of rendered elements that the path for the group could be a Nuke script. It might be really cool for, as an example, an FX artist pumping out complex elements from Houdini to do their slap comp of everything the way that it should be put together, and then publish a group of elements with the group itself containing the node network that properly pieces the elements together. This would allow for custom comp setups curated by the artist handing off the elements. The possibilities are vast, so use your imagination and then tell us about it!


That’s it for this week. I hope that we’ve given everyone some things to consider and talk about. As always, we would love to hear from anyone that has comments, whether they be public or private. In fact, a portion of next week’s post will be directly related to comments we’ve received, as Josh will be going into publishing rendered elements from Maya for use in Nuke. In addition, he will be outlining how we’ve been managing our custom code via the tk-framework-simple framework. We will also update everyone about some small tweaks to Toolkit that we will have made (or will be making) that have come out of this project. Our hope is to continue to find things that we can immediately put into use that will make life easier for everyone using Toolkit.

Next week marks the halfway point for this blog series. Our intention is to dive into larger, more discussion-heavy topics as we progress. Some of those potential topics I mentioned at the end of Week 3, but we are always open to ideas, so feel free to let us know things you would like to see in the future!

About Jeff & Josh

Jeff was a Pipeline TD, Lead Pipeline TD, and Pipeline Supervisor at Rhythm and Hues Studios over a span of 9 years. After that, he spent 2+ years at Blur Studio as Pipeline Supervisor. Between R&H and Blur, he has experienced studios large and small, and a wide variety of both proprietary and third-party software. He also really enjoys writing about himself in the third person.

Josh followed the same career path as Jeff in the Pipeline department at R&H, going back to 2003. In 2010 he migrated to the Software group at R&H and helped develop the studio’s proprietary toolset. In 2014 he took a job as Senior Pipeline Engineer in the Digital Production Arts MFA program at Clemson University where he worked with students to develop an open source production pipeline framework.

Jeff & Josh joined the Toolkit team in August of 2015.