After my last post, I thought a bit more about Release Tools, and decided that requiring a tag was definitely the right way to go, and over a couple of days last week I implemented it.

I also decided that it was a big enough change that I may as well call this Release Tools 4.0, and take the opportunity to clean up and remove some legacy code.

Tagging

Managing release tags in rt is now a separate step.

Running rt tag will examine the existing tags, and the HEAD commit.

If there’s a release tag already at HEAD, it will exit with an error.

If not, it will figure out the latest version number and build number in use, by examining the existing tags.

It will then make a new tag using the same version, and with an incremented build.

This tag will be in the form: vX.Y.Z-NNNN where X.Y.Z is the semantic version, and NNNN is the build number.
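
As an illustration (the version and build numbers here are made up):

```sh
# Hypothetical example: the highest existing tag is v1.2.3-41 and HEAD is untagged.
rt tag
# rt creates a new tag at HEAD with the same version and the build incremented:
#   v1.2.3-42
```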

Note that this does not include the -platform component that rt previously used.

From 4.0 forwards, we assume that there will be just one version tag for any given commit, and we will use that tag for all platforms.

We do however still support the old format when scanning previous tags – so rt will pick up any legacy tags and correctly calculate the new build number.
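
For example, given a mix of old-style and new-style tags (the names and the platform suffix here are purely illustrative):

```sh
# Both tag formats are scanned when calculating the next build number.
git tag --list 'v*'
#   v1.2.2-40          (new single-tag format)
#   v1.2.3-41-macOS    (legacy per-platform format, still recognised)
# The next run of rt tag would create v1.2.3-42.
```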

Archiving And Submitting

With 4.0, if you run rt archive or rt submit, the first thing that rt does is examine the HEAD commit, looking for a version tag in the format that rt tag creates.

If no tag is found, it will exit with an error. This ensures that any build you submit from a given commit will be tagged correctly, and that if you submit multiple platforms from the same commit, they are guaranteed to have the same build number.
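
In practice, the release flow becomes something like this (a sketch; I’ve omitted any other arguments you might pass):

```sh
rt tag          # creates e.g. v1.2.3-42 at HEAD; errors if a release tag is already there
rt archive      # refuses to run unless a version tag exists at HEAD
rt submit       # every platform submitted from this commit shares build 42
```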

Continuous Build Injection

If your project is set up to run the rt update-build command every time you build, it would clearly be unworkable for it to require a version tag to exist at the HEAD commit.

If there is a tag at HEAD, rt update-build will use it.

Otherwise, it scans backwards to find the highest previous build tag, and then adds one to the build number.
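
As a made-up example of the fallback:

```sh
# No version tag at HEAD; the highest build found in earlier tags is 41 (e.g. v1.2.3-41).
rt update-build
# rt scans backwards, finds build 41, and injects 42 as the current build number.
```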

In general I would discourage using rt for debug builds, since running a script for every build will slow Xcode down a bit, and having the build number available is of limited value.

However, the rt update-build command may still be useful if you have a custom build pipeline but want to use rt to calculate and/or inject build information.
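
So a custom pipeline might do something along these lines (the xcodebuild invocation is just a placeholder, not something rt dictates):

```sh
# Hypothetical CI step: calculate and inject the build information, then run the build.
rt update-build
xcodebuild -scheme MyApp -configuration Release archive
```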

Build Variables

Version 4.0 of rt also changes the names of the variables that are injected, and adds a new variable.

We now set three variables in the .h, .xcconfig and .plist files.

These three variables are also passed to xcodebuild on the command line.

If you were using rt already, you will need to adjust your project accordingly.

The rationale for the change was that it was probably better to use our own variables, rather than CURRENT_PROJECT_VERSION, which already has a meaning within Xcode.

That said, you will probably want to set CURRENT_PROJECT_VERSION to the value of RT_BUILD. You can do this in the build settings for the project or for each target, or by setting up an .xcconfig file which includes the line CURRENT_PROJECT_VERSION = $(RT_BUILD).
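
For example, such a file might contain nothing more than that line (the file name here is just an illustration):

```sh
# Hypothetical file name; the only line that matters is the one between the markers.
cat > Version.xcconfig <<'EOF'
CURRENT_PROJECT_VERSION = $(RT_BUILD)
EOF
```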

The semantic version that is injected is taken from the build tag that rt tag creates. By default it’s the same version that the previous tag had, but you can change it explicitly by doing rt tag --explicit-version <x.y.z>.

This allows you to make the git version tag the single-source-of-truth for all version information, if you so desire, and to inject the semantic version into all other settings and plists.
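
For example (the version number is illustrative):

```sh
# Bump the semantic version explicitly when tagging;
# the build number is still worked out automatically from the existing tags.
rt tag --explicit-version 2.0.0
```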

Legacy Cleanup

In rt 4.0, two commands have been removed.

The rt install command used to exist as a quick way of linking rt into your path. In general I think it’s better that you install rt with a tool such as Mint. However, some people will want to build from source, or do something else, and each installation method is likely to involve a different way to update $PATH. On the whole, this feels like something that is out of scope for rt itself.

In the early days of rt, the design encouraged you to run it via some standard shell scripts, and also to include some standard .xcconfig files in every target. The rt bootstrap command existed to help you copy (or update) these scripts and config files into your project. This way of working with rt is obsolete, and so the command was a bit of an unnecessary legacy.

For the sake of simplicity, these two commands are no longer supported.

The fallback mechanism for calculating a build number by counting git commits has also been removed.

Development Notes

Doing these updates was another opportunity to experiment with AI code generation, so that’s what I did.

Rather than making the code changes myself, I largely tried to instruct Copilot to do them for me.

This was an interesting process.

The codebase is fairly mature, and a lot of what I was asking for involved refactoring existing code to add a little bit of new functionality. The results were mixed, and once again, it felt quite akin to working with an enthusiastic and willing colleague who had a bit less experience.

The first challenge was to express clearly what exactly it was I wanted. Much like test-driven development, I quite enjoy the fact that this forces you, early on, to work out what you actually do want. Also much like test-driven development, I didn’t always do a good job of working it out to begin with. Luckily, AI coding agents have infinite patience, and probably don’t bitch about you behind your back whilst having lunch with the other juniors¹.

In general the code that came back did work, and performed as intended. It was, however, often repetitious. The AI rarely showed insight about the codebase as a whole, and it had to be explicitly prompted to clean up duplication and to create appropriate abstractions, methods, or constants in order to generalize. Again, this is quite like working with someone less experienced. They focus on the problem you’ve set them, and they stop when they think they’ve got it working. They don’t see the larger picture, and don’t consider the consistency or maintainability of the codebase.

Part of this I think may be down to the fact that I had not set any general instructions in the project. I have now added a copilot-instructions.md file, and am experimenting with trying to set some standards in it.

One thing that I did like, and do think is useful, is that the AI generally created accompanying tests as a way of verifying each change I’d asked it to make. This is something that I aspire to, but don’t always manage, and having another “person” encourage me to work that way is excellent.

It also proved to be the right approach, in that it unearthed some problems: not just in the new code, but in some of the dependencies that I was using. These were subtle concurrency issues which mostly manifested when performing tests in parallel, and they may not have impacted actual normal usage of the tool², but that makes it all the more valuable that I was forced to confront them.

At the end of this process I did find myself wondering whether I could have achieved what I wanted faster if I’d just done it myself. I think that perhaps I could, and the code would have been tighter, for a narrow definition of “what I wanted”.

However, having an essentially tireless worker, who pro-actively volunteered to make tests, encouraged me overall to do a better job. Whilst getting Copilot to fix the duplication that it had created, I was also motivated to ask it to take a number of other refactoring steps aimed at improving consistency and making the code that I had previously written more idiomatic. Whilst tracking down the concurrency issues in the dependency I was using to run subprocesses (another of my projects, called Runner), I think I improved my general understanding of the problem area, and along the way fixed some bugs I probably didn’t know I had in other projects using the same package.

Overall, I continue to be cautiously positive about using Copilot. I think that the problem domain of programming is a good one, especially when the output from the AI tool is small enough that it can be reviewed by a human expert who can provide feedback.

Asymmetry

I was talking to a friend over the weekend about a situation where they were using AI to review large documents. This is a very different scenario, and the balance is completely reversed.

In my case, the AI tool is taking a vast amount of external knowledge and applying it to produce a small amount of new content, which is easily reviewed.

I don’t need to know where the ideas behind the solution came from; I just need to be able to read the new code and evaluate whether it works, and whether it fits in with the existing codebase.

This is quite asymmetrical, but in a direction that is positive.

In my friend’s case, the output from the AI was going to be a report, but the accuracy of that report could only be evaluated by essentially doing the same job that the AI had been asked to do.

From my own experience I’ve seen the AI make more than enough mistakes, and disappear down enough rabbit holes, that I would never trust its answers without being able to at least scan the output to verify it.

I wouldn’t want to use it to analyze or summarize a large body of information unless I had a pretty reliable way to validate the results.

Future

I continue to use rt for my own projects, and so I expect that it will evolve further over time.

One item on my to-do list is to try to add support for installing it via Homebrew.

That’s not something I really need myself right now though, and so it may take me a while. Pull requests gratefully received.

  1. This may be a wildly optimistic assumption. 

  2. Most of the time. Maybe…