I recently finished a fairly long-running contract with Formation Games, working on Club.
It was a fun project and a really great team who I loved working with. The one thing I perhaps didn’t love was that I was mostly using Flutter & Dart, with some Go thrown in. I’m not a Flutter hater, but I couldn’t honestly say that I loved it. Nor am I a Go hater, but I really did not get on well with the idioms that seem to be prevalent. Maybe I’m just too old and set in my ways :)
At the moment I’m taking a bit of time to recharge my batteries, and to think about what I want to do next.
Part of that has been trying to figure out where exactly my perfect project is located. Not geographically, but in terms of tools used, industry served, and so on.
I haven’t reached any solid conclusions, but I drew a big diagram on my whiteboard, with some headings circled.
Swift inevitably ended up floating in the middle, with the other headings orbiting.
Then I positioned some possible projects or products between the headings. As an overview, it all looks quite plausible!
These were the main headings:
This is all just food for thought at the moment, but I can see a hazy picture coming into focus which makes some sort of sense… maybe…
The furore surrounding Github Copilot is interesting.
I’m no lawyer (nor do I play one on TV), but my feeling is that it may expose a flaw in the FLOSS community’s ideas about ownership of code.
If so, this is a good thing. The flaw (if it exists) has not been created by Copilot. It was already there, it just hadn’t come to light.
Anyone who’s been coding for a while will have come across the situation where you’ve found some code with a license you can’t use, you’ve used the act of reading (or maybe even debugging) the code to teach yourself the solution to the underlying problem, and then you’ve written new code.
Then maybe you’ve felt uneasy and wondered if you’ve broken the rules.
Maybe all you did was cynically copy & paste and change a few variable names - in which case you probably did break the rules. Maybe though you genuinely rewrote it all from scratch. Maybe after rewriting you pretty much ended up with the same code because - well - that’s the best expression of the underlying solution to the problem you’re trying to solve.
For any such situation, there’s going to be a blurry line. What did I copy here, and what did I create myself? The implementation? The algorithm? The expression of the algorithm in the context of the particular language I’m using? The implementation in the context of the problem I’m applying it to?
Furthermore, how is this process essentially different from the one undertaken by the author of the GPL’d code?
Can I be sure in any way that they themselves weren’t just re-expressing something that has prior art?
To look at it another way:
For any sufficiently small fragment of code, there’s likely to be a canonical way to express it. Taken to an extreme, a single line may well be infinitely rewritable, but one formulation is probably clearer, more compact, or better meets your particular criteria than any other.
In most cases it would be self-evidently ridiculous to assert that the GPL license applied to a body of code actually applies to each line in isolation.
If the line includes variable names, function names, comments, or other incidental metadata, it could be argued that they are not directly related to the pure meaning of that line. They do have meaning and value, but probably only in the context in which the line exists.
These names can be replaced, rewritten, or even randomised; this may obfuscate the meaning of the code, but it doesn’t stop it working.
Once you get to a small enough granularity, the same line of code almost definitely exists in countless other programs, both open and closed source, GPL’d or liberally licensed. The names might be different, but the meaning of the code is the same. The machine code instructions emitted by the compiler will probably be the same.
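To make that concrete, here is a tiny illustration of my own (both functions are invented for this post, not taken from any licensed code). The second is the first with every name replaced by something meaningless; the logic is untouched.

```swift
// A small fragment with descriptive names.
func clamp(_ value: Int, lower: Int, upper: Int) -> Int {
    return min(max(value, lower), upper)
}

// The same fragment with the names stripped of meaning.
// Only the incidental metadata has changed, not the logic.
func f(_ a: Int, b: Int, c: Int) -> Int {
    return min(max(a, b), c)
}
```

For any input the two return the same result, which is the sense in which the names are separable from the meaning of the line.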
So what exactly are we arguing about here?
If something like Copilot is taking chunks of GPL’d source and pasting them into someone else’s program, how many contiguous lines does it have to paste before there’s a problem? Is there an arbitrary N number of lines that’s ok, where N + 1 is not ok?
If Copilot applied some natural language processing to infer the context that the lines are pasted into, and then automatically renamed the variables (or even rewrote comments) to use words appropriate to the new context, would that now be ok? Same code - different names?
If it randomised the names and stripped all comments, would that be ok?
Maybe we arrive at some formulation that states that a line is ok, but a whole function is not. Are we then allowed to apply the copying process to each line in turn? Refactor the function into multiple smaller functions and use them?
This all feels very wooly to me. The code represents a muddle of knowledge, experience, style, and algorithm.
Any assertion of the right to control how each of these things is used in isolation, or even recombined into a larger whole, feels like over-reach.
Worse, it may well be politically dangerous. If an entity can assert their right to apply copyleft to small fragments of code, doesn’t that logically mean that they are claiming ownership of the underlying meaning of those fragments?
Doesn’t that put us into territory where another entity can assert ownership of the underlying meaning of other fragments and choose to patent them or in other ways suppress their use by others? Isn’t that sort of what the Free/Libre side of the community is trying to avoid in the first place?
I’m not claiming any great insights here, and certainly not offering any solutions.
It just seems to me that the problem is a lot knottier than some people are making out.
It’s not self-evidently the case that what Github Copilot is doing is breaking the rules, any more than it is clear that what happens when I read someone’s GPL’d code and learn something from it is following the rules…
For the last few years, the default setting for all of the Swift code I write has been open source.
As a result, I’ve accumulated a vast number of Github repositories and Swift Package Manager packages.
However, I’ve been really bad at telling people that they exist!
This post is an attempt to start to fix that, by talking about one small package I’ve recently created: Matchable.
First though, some disclaimers:
Part of the barrier to telling people about things I’ve done is the sheer time it takes to write even quite a simple post like this one.
So my first disclaimer is just to say that this post is mostly a re-hash of the README file from the Github Repository. Nothing wrong with that I think, but just to be clear…
My second disclaimer is that this is work-in-progress code from the real world.
I’ve encountered a few people who subscribe to a fundamentalist view of open source code: that it’s useless unless it is fully polished, fully tested, 100% supported and actively maintained.
I understand this point of view; we’ve all encountered code that makes great claims and turns out to be broken or mostly unfinished.
Respectfully though, those people are wrong.
Imperfect open-source code can be frustrating. However, it can also be a helpful foundation for someone else to build on, a good example of the pros and cons of a particular technique, or a useful supplier of that one crucial line you have been searching the internet for.
Aiming for perfection is setting the barrier way too high. I am as insecure as the next person when it comes to showing my workings in public. I’ve been a professional programmer for more than three decades, but I still suffer from impostor syndrome.
It’s tempting to hide away, but I’m trying to fight the urge, and I’d like to contribute in some small way to an environment where we aren’t scared to risk being wrong.
I offer up all of my open-source code in this spirit. It’s not perfect, because I am busy, and because I am still writing it. I find this code useful, and I hope someone else might. If you do find that it is fundamentally broken, please tell me why. That way I learn something.
That said…
The Matchable protocol defines a way to compare two objects, structures or values for equality.
Unlike the Equatable protocol, Matchable works by throwing an error when it encounters a mismatch.
You can view this as an assertion of equality. For this reason, the primary method is named assertMatches.
This makes for compact code since you don’t need to write explicit return statements for every failed comparison.
It also allows the protocol to handle compound structures intelligently.
If a matching check of a structure fails on one of its members, the matchable code will wrap up the error thrown by the member, and throw another error from the structure.
Any catching code can dig down into these compound errors to cleanly report exactly where the mismatch occurred.
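To illustrate the general idea (and only the idea), here is a minimal self-contained sketch of throwing-on-mismatch with wrapped errors. Every name in it (SketchMatchable, SketchMismatch, sketchAssertMatches, Point, describeMismatch) is made up for this example; it is not the package's actual source, and the real error types may well be shaped differently.

```swift
// Hypothetical names throughout: a sketch of the technique, not the package's code.
struct SketchMismatch: Error, CustomStringConvertible {
    let message: String
    let underlying: Error?   // the inner failure, if this error wraps one

    var description: String {
        guard let underlying = underlying else { return message }
        return "\(message) <- \(underlying)"
    }
}

protocol SketchMatchable {
    func sketchAssertMatches(_ other: Self) throws
}

extension Int: SketchMatchable {
    func sketchAssertMatches(_ other: Int) throws {
        if self != other {
            throw SketchMismatch(message: "\(self) != \(other)", underlying: nil)
        }
    }
}

struct Point: SketchMatchable {
    var x: Int
    var y: Int

    func sketchAssertMatches(_ other: Point) throws {
        do {
            // The first failing member throws, skipping the remaining checks.
            try x.sketchAssertMatches(other.x)
            try y.sketchAssertMatches(other.y)
        } catch {
            // Wrap the member's error in one that reports the whole structure.
            throw SketchMismatch(message: "Point failed to match", underlying: error)
        }
    }
}

// Returns nil on a match, or a description of the chain of failures.
func describeMismatch(_ a: Point, _ b: Point) -> String? {
    do {
        try a.sketchAssertMatches(b)
        return nil
    } catch {
        return "\(error)"
    }
}
```

Catching code can follow the underlying chain from the outer error down to the member that actually differed, which is what makes it possible to report the exact point of divergence.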
You can check that two values match with:
try x.assertMatches(y)
A sequence of checks can easily be performed – the first failure will throw, causing the remaining checks to be skipped:
try int1.assertMatches(int2)
try double1.assertMatches(double2)
try string1.assertMatches(string2)
A type can implement matching by conforming to the Matchable protocol, and defining the assertMatches method. Inside this method it can perform the necessary checks.
If it finds a failure, it can throw a MatchFailedError to report the mismatch.
Implementations of assertMatches are provided for most of the primitive types, and a few Foundation types (I’ve just done the ones I needed for now - pull requests gratefully received…).
Although you can match primitive value types, the protocol comes into its own when performing memberwise matching of compound types (structs, objects, etc).
In this case a type can conform to the MatchableCompound protocol, and define the assertContentMatches method.
This works the same way as the basic assertMatches method, except that if a check throws an error inside this method, the error will be wrapped in an outer error reporting that the whole structure failed to match.
As a convenience, we also define a form of assertMatches which takes a key path or list of key paths, and calls assertMatches on each path of two objects in turn.
This helps to keep boilerplate code to a minimum.
Here’s an example combining key paths and the MatchableCompound protocol. This tests the matchability of a structure that has 13 properties, and manages to do it with very little code.
extension Task: MatchableCompound {
    public func assertContentMatches(_ other: Task, in context: MatchableContext) throws {
        try assertMatches([\.state], of: other)
        try assertMatches([\.name, \.icon, \.details], of: other)
        try assertMatches([\.started], of: other)
        try assertMatches([\.hasDescription, \.hasDuration, \.isScheduled], of: other)
        try assertMatches([\.duration], of: other)
        try assertMatches([\.scheduledHour, \.scheduledMinute], of: other)
        try assertMatches([\.streaks], of: other)
        try assertMatches([\.restDays], of: other)
    }
}
Note that currently if you pass a list of keys, they all have to resolve to members of the same type. Unfortunately this somewhat reduces the helpfulness of this method.
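The same-type restriction is what you would expect from a helper that takes a plain array of key paths, since every element of the array has to share one value type. Here is a rough sketch of such a helper; sketchAssertMatches, KeyPathMismatch, Job and failingIndex are hypothetical names invented for this example, and the package's real method may be written quite differently.

```swift
// Hypothetical helper, written only to illustrate the same-type restriction.
struct KeyPathMismatch: Error {
    let index: Int   // position of the first failing key path in the list
}

func sketchAssertMatches<Root, Value: Equatable>(
    _ paths: [KeyPath<Root, Value>], of a: Root, and b: Root
) throws {
    for (index, path) in paths.enumerated() {
        if a[keyPath: path] != b[keyPath: path] {
            throw KeyPathMismatch(index: index)
        }
    }
}

struct Job {
    var name: String
    var icon: String
    var duration: Int
}

let jobA = Job(name: "Water plants", icon: "leaf", duration: 10)
let jobB = Job(name: "Water plants", icon: "drop", duration: 10)

// Runs a check and reports the index of the failing key path, or nil on a match.
func failingIndex(_ check: () throws -> Void) -> Int? {
    do {
        try check()
        return nil
    } catch let error as KeyPathMismatch {
        return error.index
    } catch {
        return nil
    }
}

// \Job.name and \Job.icon both have Value == String, so they can share a list.
let stringResult = failingIndex {
    try sketchAssertMatches([\Job.name, \Job.icon], of: jobA, and: jobB)
}

// \Job.duration has Value == Int, so it needs a call of its own.
let intResult = failingIndex {
    try sketchAssertMatches([\Job.duration], of: jobA, and: jobB)
}
```

With this shape, mixing \Job.name and \Job.duration in one array would not type-check against [KeyPath<Root, Value>], which is why the members end up grouped by type.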
The original motivating use-case for this protocol was unit testing, where it’s often necessary to compare two instances of something, and useful to be able to identify the exact point of divergence.
Whilst I still see this as the primary use for the protocol, I have split it out into a standalone package as it may be helpful in other places.
The fact that Matchable is different from Equatable is an advantage for unit testing, as it allows both to co-exist.
In your code, you might define Equatable to only check part of a structure (a unique identifier, for example).
This is good for efficiency in production code, but no use for test code where you really do want to know if all members are equal.
In this situation you can define a thorough check with Matchable, and use that for unit testing, without interfering with the efficient implementation of Equatable.
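A small sketch of that situation, using a made-up Account type. The thorough check here is a hand-written stand-in (sketchAssertMatches, AccountMismatch and mismatchedMember are hypothetical names), not the package's protocol, but it shows how the two notions of sameness can live side by side.

```swift
struct Account: Equatable {
    let id: Int
    var displayName: String
    var balance: Int

    // Cheap identity check for production code: only the identifier counts.
    static func == (lhs: Account, rhs: Account) -> Bool {
        return lhs.id == rhs.id
    }
}

// Hypothetical error type naming the member that differed.
struct AccountMismatch: Error {
    let member: String
}

extension Account {
    // Hypothetical thorough check for tests: every member must agree.
    func sketchAssertMatches(_ other: Account) throws {
        if id != other.id { throw AccountMismatch(member: "id") }
        if displayName != other.displayName { throw AccountMismatch(member: "displayName") }
        if balance != other.balance { throw AccountMismatch(member: "balance") }
    }
}

// Returns the name of the first differing member, or nil if everything matches.
func mismatchedMember(_ a: Account, _ b: Account) -> String? {
    do {
        try a.sketchAssertMatches(b)
        return nil
    } catch let error as AccountMismatch {
        return error.member
    } catch {
        return "unknown"
    }
}

let saved = Account(id: 7, displayName: "Sam", balance: 100)
let reloaded = Account(id: 7, displayName: "Sam", balance: 90)
```

Here saved == reloaded is true, because only the identifier is compared, while the thorough check reports that balance differs.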
Initially this protocol was defined as part of my XCTestExtensions package.
That package includes some additions to XCTAssert which use Matchable to let you perform matching checks:
XCTAssert(savedModel, matches: reloadedModel)
This assert method catches any errors and presents them in a nice way by calling XCTFail, identifying the exact point of failure.
Because of the way the match-failure errors are wrapped for compound structures, the method can call XCTFail at all levels of the failure, which results in Xcode showing an error marker at all levels.
This can be helpful when tracking down a mismatch in a deeply nested structure.
This is an early implementation, based on code pulled from elsewhere.
The API probably needs tweaking, and the methods definitely need documenting.
I also intend to explore the idea of using Swift’s introspection to automatically generate assertMatches for structures/classes.
In theory this should work well, but it’s possible that it will hit wrinkles.
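As a very rough idea of what that might look like, here is a sketch using the standard library's Mirror. It is purely an assumption about one possible approach, not something the package does today; sketchAssertReflectedMatch, ReflectedMismatch, Settings and reflectedMismatchLabel are all hypothetical names, and comparing values by their textual description is a deliberately crude shortcut.

```swift
// Hypothetical error naming the stored property that differed.
struct ReflectedMismatch: Error {
    let label: String
}

// Walks the stored properties of two values of the same type using Mirror.
func sketchAssertReflectedMatch<T>(_ a: T, _ b: T) throws {
    let left = Mirror(reflecting: a).children
    let right = Mirror(reflecting: b).children
    for (l, r) in zip(left, right) {
        // Crude comparison: Any is not Equatable, so compare textual descriptions.
        if String(describing: l.value) != String(describing: r.value) {
            throw ReflectedMismatch(label: l.label ?? "?")
        }
    }
}

struct Settings {
    var volume: Int
    var muted: Bool
}

// Returns the label of the first differing property, or nil on a match.
func reflectedMismatchLabel<T>(_ a: T, _ b: T) -> String? {
    do {
        try sketchAssertReflectedMatch(a, b)
        return nil
    } catch let error as ReflectedMismatch {
        return error.label
    } catch {
        return "unknown"
    }
}
```

One of the likely wrinkles is already visible here: Mirror hands back values as Any, so a real implementation would need a better way to compare children than stringifying them, and it would need to cope with enums and collections whose children differ in number.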
All feedback, suggestions, pull requests and bug reports gratefully received!
I can only explain it as lock-down madness, but a couple of weeks ago I decided to have a little play around with Vapor.
What I wanted to do, initially, was just make a simple website that did user authentication. You could register, login, and logout. If you were logged in, it knew who you were. If you were logged out, there were things you couldn’t see.
Now I’m no web developer. Admittedly I did write a WYSIWYG html editor in Hypercard, in about 1994, but I’m no web developer.
Ok, I might have also written a complete CMS using Hypercard as a CGI engine for MacHTTP, also around that kind of time, but honestly, I’m no web developer.
If really pressed, I might admit to having had a job creating the first interactive shopping basket for Robert Fripp’s DGM record label’s website in about 1998[1] - a job which I had to learn Perl for[2] - but if that goes to prove anything, it is that I really am not a web developer.
Still, how hard could it be, right?
[1] I was working at Abbey Road at the time. Yes, that Abbey Road.
[2] I still feel dirty.
I have been accused (by myself, mostly), of being a bit too much of a purist sometimes. It’s true that I do like things to have an intellectual rigour to them, but it’s mostly about being honest and clear with ourselves about what we’re doing and why. I welcome the application of common sense, and I’m fine with taking shortcuts as long as they’re consciously chosen for a good reason.
I’d like to think that I’m a pragmatist…